
Aws - Sa Notes


AMAZON WEB SERVICES

AWS – Solutions Architect Associate Exam Notes


Table of Contents
AWS Global Infrastructure
Identity Access Management
S3 Storage
    Different S3 Storage Classes
    CloudFront
    Storage Gateway
Elastic Compute Cloud
    Volumes & Snapshots
    CloudWatch
    Elastic File System
AWS Databases
    Redshift
    Aurora
    ElastiCache
Amazon Route 53
Amazon VPC
    Network ACL
HA Architecture
    Load Balancer
AWS Applications
    SQS
    SWF
    SNS
    Elastic Transcoder
    API Gateway
    Kinesis
    Web Identity Federation
Serverless
    Lambda

AWS Global Infrastructure

As of October 2019, AWS has a total of:

• 22 Regions (3 coming soon)
• 69 Availability Zones (15 coming soon)

Region
A Region is a geographical area. Each Region consists of 2 (or more) Availability Zones.

Availability Zone
An Availability Zone is one or more discrete data centers, each with redundant power,
networking and connectivity, housed in separate facilities. Because these data centers are
close together, they are counted as 1 Availability Zone.

Edge Locations
Edge locations are AWS endpoints used for caching content, typically for CloudFront,
Amazon’s Content Delivery Network (CDN). They are separate from Regions and Availability
Zones.
There are many more Edge Locations than Regions. Currently there are over 150 Edge
Locations.

IDENTITY ACCESS MANAGEMENT (IAM)

IAM allows you to manage users and their level of access to the AWS console. It is important to
understand IAM and how it works, both for the exam and for administering a company’s AWS
account in real life.

IAM offers the following features:

• Centralized control of your AWS account
• Shared access to your AWS account
• Granular permissions
• Identity Federation (including Active Directory, Facebook, LinkedIn etc.)
o E.g. users can log in to the AWS account with their Active Directory credentials, or
use their Facebook or LinkedIn account to log in to features in AWS.
• Multifactor Authentication
• Temporary access for users, devices and services where necessary
• Lets you set up your own password rotation policy
• Integrates with many different AWS services
• Supports PCI DSS compliance

Key Terminology for IAM


Users:
End users such as people, employees of an organization etc.

Groups:
A collection of users. Each user in the group inherits the permissions of the group.

Policies:
Policies are made up of documents called policy documents. These documents are written in
JSON and define what a User, Group or Role is permitted to do.

Roles:
You create roles and then assign them to AWS resources.
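
To make the idea of a JSON policy document concrete, here is a minimal sketch in Python that builds one granting read-only access to a single S3 bucket. The bucket name and the helper function name are hypothetical and chosen only for illustration; the overall shape (Version, Statement, Effect, Action, Resource) follows the standard IAM policy layout.

```python
import json


def make_read_only_s3_policy(bucket_name: str) -> dict:
    """Build a policy document allowing read-only access to one bucket.

    The bucket name is a placeholder supplied by the caller.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # ListBucket applies to the bucket, GetObject to the objects in it.
                "Action": ["s3:GetObject", "s3:ListBucket"],
                "Resource": [
                    f"arn:aws:s3:::{bucket_name}",
                    f"arn:aws:s3:::{bucket_name}/*",
                ],
            }
        ],
    }


# Hypothetical bucket name, used only as an example.
policy = make_read_only_s3_policy("example-notes-bucket")
print(json.dumps(policy, indent=2))
```

A document like this could be attached to a User, Group or Role; anything not explicitly allowed stays denied.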

Key points:
• IAM is universal. It does not apply to Regions at this time.
• The “root account” is simply the account created when you first set up your AWS account. It
has complete Admin access.
• New users have no permissions when first created.
• New users are assigned an Access Key ID and Secret Access Key when first created.
• These are not the same as a password. You cannot use the Access Key ID and Secret
Access Key to log in to the console. You can, however, use them to access AWS via the APIs
and the command line.
• You only get to view these once. If you lose them, you have to regenerate them, so
save them in a secure location.
• Always set up Multifactor Authentication on your root account.
• You can create and customize your own password rotation policy.

SIMPLE STORAGE SERVICE (S3)

S3 Storage

S3 stands for Simple Storage Service. It provides developers and IT teams with secure, durable,
highly-scalable object storage. Amazon S3 is easy to use, with a simple web services interface to
store and retrieve any amount of data from anywhere on the web.
S3 is a safe place to store your files. It is object-based storage, and the data is spread across
multiple devices and facilities.
Being object-based, S3 lets you upload files. Files can be from 0 bytes to 5 TB in size. Storage is
unlimited, and files are stored in buckets. A bucket is basically like a folder.
S3 uses a universal namespace, i.e. bucket names must be unique globally. A typical bucket URL
looks like https://s3-eu-west-1.amazonaws.com/grbbucket
When you upload a file to S3, you receive an HTTP 200 code if the upload was successful.
Because it is object-based, S3 is not suitable for installing an OS or a DB (for that you need
block-based storage).

S3 is object based. Think of the objects just as files. Objects consist of the following:

• Key (this is simply the name of the object)
• Value (this is simply the data, made up of a sequence of bytes)
• Version ID (important for versioning)
• Metadata (data about the data you are storing)
• Subresources such as:
o Access Control Lists
o Torrents

How does data consistency work for S3?

• Read-after-write consistency for PUTS of new objects.
o If you write a new file and read it immediately afterwards, you will be able to
view that data.
• Eventual consistency for overwrite PUTS and DELETES (can take some time to
propagate).
o If you update an existing file or delete a file and read it immediately, you may
get the older version, or you may not. Basically, changes to objects can take a
little bit of time to propagate.

(This was the model when these notes were written. Since December 2020, S3 provides strong
read-after-write consistency for all PUTS and DELETES.)

S3 has the following guarantees from Amazon:

• Built for 99.99% availability for the S3 platform.
• Amazon guarantees 99.9% availability (the SLA).
• Designed for 99.999999999% (11 x 9s) durability for S3 information.

S3 has the following features:

• Tiered storage
• Lifecycle Management
• Versioning
• Encryption
• MFA Delete
• Securing your data using Access Control Lists & Bucket Policies

Different S3 Storage Classes

• S3 Standard
99.99% availability, 99.999999999% durability.
Stored redundantly across multiple devices in multiple facilities.
Designed to sustain the loss of 2 facilities concurrently.

• S3 Standard-IA (Infrequent Access)
For data that is accessed less frequently, but requires rapid access when needed.
Lower storage fee than S3 Standard, but you are charged a retrieval fee.

• S3 One Zone-IA
For when you want a lower-cost option for infrequently accessed data
but do not require the resilience of multiple Availability Zones.

• S3 Intelligent-Tiering
Designed to optimize costs by automatically moving data to the most cost-effective
access tier, without performance impact or operational overhead.

Data archival storage classes:

• S3 Glacier
S3 Glacier is a secure, durable, and low-cost storage class for data archiving. You can
reliably store any amount of data at costs that are competitive with or cheaper than
on-premises solutions. Retrieval times are configurable from minutes to hours.

• S3 Glacier Deep Archive
S3 Glacier Deep Archive is Amazon S3’s lowest-cost storage class, for data where a retrieval
time of 12 hours is acceptable.

Different charging methods in S3 storage:

• Based on storage
The more storage you use, the higher your bill.
• Based on requests
The more requests you make to objects, the more you pay.
• Storage management pricing
Based on the different storage tiers used.
• Data transfer pricing
• Transfer Acceleration
Amazon S3 Transfer Acceleration enables fast, easy, and secure transfers of files over
long distances between your end users and an S3 bucket.
Transfer Acceleration takes advantage of Amazon CloudFront’s globally distributed edge
locations. As the data arrives at an edge location, it is routed to Amazon S3 over an
optimized network path (the Amazon backbone network).
• Cross Region Replication pricing
Applies if you need to replicate between buckets in 2 different Regions, mainly for DR or
high availability.

Bucket Security & Encryption


By default, all newly created buckets are PRIVATE. You can set up access control to your buckets
using:

• Bucket Policies – bucket level
• Access Control Lists – can go down to individual objects

S3 buckets can be configured to create access logs, which log all requests made to the S3
bucket. These logs can be sent to another bucket, even a bucket in another account.
Encryption in transit is achieved with SSL or TLS.
Encryption at rest for S3 buckets (server side) is achieved with:

• S3 Managed Keys – SSE-S3 (Server Side Encryption – S3)
• AWS Key Management Service, Managed Keys – SSE-KMS, where the keys are managed
through AWS KMS
• Server Side Encryption with Customer Provided Keys – SSE-C
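
As a rough illustration of how the three server-side encryption options differ on the wire, the sketch below maps each option to the request headers S3 expects on an upload. The function name and the sample key are hypothetical; the header names are the standard `x-amz-server-side-encryption` family.

```python
import base64
import hashlib
from typing import Dict, Optional


def sse_headers(mode: str,
                kms_key_id: Optional[str] = None,
                customer_key: Optional[bytes] = None) -> Dict[str, str]:
    """Return the upload headers for SSE-S3, SSE-KMS or SSE-C."""
    if mode == "SSE-S3":
        # S3 creates and manages the keys.
        return {"x-amz-server-side-encryption": "AES256"}
    if mode == "SSE-KMS":
        headers = {"x-amz-server-side-encryption": "aws:kms"}
        if kms_key_id:
            headers["x-amz-server-side-encryption-aws-kms-key-id"] = kms_key_id
        return headers
    if mode == "SSE-C":
        # The customer supplies a 256-bit key with every request.
        if customer_key is None or len(customer_key) != 32:
            raise ValueError("SSE-C needs a 256-bit (32-byte) key")
        key_md5 = hashlib.md5(customer_key).digest()
        return {
            "x-amz-server-side-encryption-customer-algorithm": "AES256",
            "x-amz-server-side-encryption-customer-key":
                base64.b64encode(customer_key).decode("ascii"),
            "x-amz-server-side-encryption-customer-key-MD5":
                base64.b64encode(key_md5).decode("ascii"),
        }
    raise ValueError(f"unknown encryption mode: {mode}")


print(sse_headers("SSE-S3"))
print(sse_headers("SSE-KMS"))
```

With SSE-C the key travels with each request and S3 does not store it, which is why losing the key means losing access to the object.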

S3 Versioning
• Stores all versions of an object (including all writes and even if you delete an object)
• Great Backup tool
• Once enabled, Versioning cannot be disabled, only suspended.
• Integrates with Lifecycle rules.
• Versioning’s MFA Delete capability, which uses multi-factor authentication, can be used
to provide an additional layer of security.

S3 Lifecycle Management
• Automates moving your objects between the different storage tiers.
• Can be used in conjunction with versioning.
• Can be applied to current versions and previous versions.
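
A lifecycle rule can be pictured as a small configuration document. The sketch below builds one in Python that moves current versions to Standard-IA and then Glacier, and previous (noncurrent) versions to Glacier. The rule ID, prefix and day counts are made-up values; the dictionary layout is intended to match the shape S3 uses for lifecycle configurations, so treat it as an illustration and check it against the API reference before use.

```python
import json


def lifecycle_rule(rule_id: str, prefix: str, to_ia_days: int,
                   to_glacier_days: int, noncurrent_to_glacier_days: int) -> dict:
    """Build one lifecycle rule covering current and previous versions."""
    if to_glacier_days <= to_ia_days:
        raise ValueError("Glacier transition must come after the IA transition")
    return {
        "ID": rule_id,
        "Filter": {"Prefix": prefix},
        "Status": "Enabled",
        # Applies to the current version of each object.
        "Transitions": [
            {"Days": to_ia_days, "StorageClass": "STANDARD_IA"},
            {"Days": to_glacier_days, "StorageClass": "GLACIER"},
        ],
        # Applies to previous versions when versioning is enabled.
        "NoncurrentVersionTransitions": [
            {"NoncurrentDays": noncurrent_to_glacier_days,
             "StorageClass": "GLACIER"},
        ],
    }


# Hypothetical values, for illustration only.
config = {"Rules": [lifecycle_rule("archive-logs", "logs/", 30, 90, 30)]}
print(json.dumps(config, indent=2))
```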

S3 Cross Region Replication


• Versioning must be enabled on both the source and destination buckets.
• The source and destination Regions must be different.
• Files already in an existing bucket are not replicated automatically.
• All subsequently updated files are replicated automatically.
• Delete markers are not replicated.
• Deleting individual versions or delete markers is not replicated.

S3 Transfer Acceleration
S3 Transfer Acceleration uses the CloudFront Edge Network to accelerate your uploads to S3.
Instead of uploading directly to your S3 bucket, you upload to an edge location, which then
transfers the file to S3. You get a distinct URL to upload to, for example:
acloudguru.s3-accelerate.amazonaws.com
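
The accelerate URL follows a simple pattern: the bucket name followed by `.s3-accelerate.amazonaws.com`. A minimal sketch of building it, assuming a hypothetical helper name; the check on the bucket name reflects the rule that Transfer Acceleration needs a DNS-compliant bucket name without periods.

```python
import re


def accelerate_endpoint(bucket_name: str) -> str:
    """Return the Transfer Acceleration endpoint for a bucket."""
    # 3-63 characters: lowercase letters, digits and hyphens, no periods.
    if not re.fullmatch(r"[a-z0-9][a-z0-9-]{1,61}[a-z0-9]", bucket_name):
        raise ValueError(f"bucket name not usable with acceleration: {bucket_name!r}")
    return f"https://{bucket_name}.s3-accelerate.amazonaws.com"


print(accelerate_endpoint("acloudguru"))
```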

CLOUDFRONT

A content delivery network (CDN) is a system of distributed servers (network) that deliver
webpages and other web content to a user based on the geographic locations of the user, the
origin of the webpage, and a content delivery server.

CloudFront can be used to deliver your entire website, including dynamic, static, streaming, and
interactive content using a global network of edge locations. Requests for your content are
automatically routed to the nearest edge location, so content is delivered with the best possible
performance.

There are 2 different types of distribution:

• Web Distribution – typically used for websites.
• RTMP – used for media streaming.

Terminologies
Edge Location: This is the location where content will be cached. This is separate to an AWS
Region/AZ. Edge Locations are not just read-only; you can write to them too (i.e. put an
object on to them). Objects are cached for the life of the TTL (Time To Live). You can clear
cached objects, but you will be charged.

Origin: This is the origin of all the files that the CDN will distribute. This can be an S3
bucket, an EC2 instance, an Elastic Load Balancer, or Route53.

Distribution: This is the name given to the CDN, which consists of a collection of Edge
Locations.

To see how CloudFront works, consider a web server hosted in London with users all around the
world, and edge locations all over the world as well. The first user sends a request to the
nearest edge location. If the edge location does not have the content, it downloads the content
from the origin and caches it for the duration of the time to live (TTL). For instance, if the TTL
is 48 hours and another user requests that content within that time, they get it much more
quickly from the edge location.
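
The cache-miss and cache-hit behaviour described above can be sketched as a tiny simulation. This is only a toy model of one edge location, not how CloudFront is implemented; the class name, the origin function and the timestamps are all made up for the example.

```python
from typing import Callable, Dict, Tuple


class EdgeCache:
    """Toy model of one edge location with a fixed TTL."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes], ttl_seconds: float):
        self._fetch = fetch_from_origin
        self._ttl = ttl_seconds
        self._store: Dict[str, Tuple[bytes, float]] = {}  # path -> (body, expiry)
        self.origin_fetches = 0

    def get(self, path: str, now: float) -> bytes:
        cached = self._store.get(path)
        if cached is not None and now < cached[1]:
            return cached[0]  # cache hit: served from the edge
        # Cache miss or expired entry: go back to the origin and cache again.
        body = self._fetch(path)
        self.origin_fetches += 1
        self._store[path] = (body, now + self._ttl)
        return body


# Origin stands in for the London web server; TTL is 48 hours.
edge = EdgeCache(lambda path: f"content of {path}".encode(), ttl_seconds=48 * 3600)
edge.get("/index.html", now=0)          # first user: fetched from origin
edge.get("/index.html", now=3600)       # second user, 1 hour later: cache hit
edge.get("/index.html", now=49 * 3600)  # after the TTL: fetched from origin again
print(edge.origin_fetches)  # 2
```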

SNOWBALL

Snowball is a petabyte-scale data transport solution that uses secure appliances to transfer
large amounts of data into and out of AWS. Using Snowball addresses common challenges
with large-scale data transfers, including high network costs, long transfer times, and security
concerns. Transferring data with Snowball is simple, fast, secure, and can cost as little as
one-fifth as much as high-speed internet.
Snowball comes in either a 50TB or 80TB size. Snowball uses multiple layers of security
designed to protect your data including tamper-resistant enclosures, 256-bit encryption, and an
industry-standard Trusted Platform Module (TPM) designed to ensure both security and full
chain-of-custody of your data. Once the data transfer job has been processed and verified, AWS
performs a software erasure of the Snowball appliance.

AWS Snowball Edge is a 100TB data transfer device with on-board storage and compute
capabilities. You can use Snowball Edge to move large amounts of data into and out of AWS, as
a temporary storage tier for large local datasets, or to support local workloads in remote or
offline locations.

Snowball Edge connects to your existing applications and infrastructure using standard storage
interfaces, streamlining the data transfer process and minimizing setup and integration.
Snowball Edge can cluster together to form a local storage tier and process your data on-
premises, helping ensure your applications continue to run even when they are not able to
access the cloud.
AWS Snowmobile is an Exabyte-scale data transfer service used to move extremely large
amounts of data to AWS. You can transfer up to 100PB per Snowmobile, a 45-foot long
ruggedized shipping container, pulled by a semi-trailer truck. Snowmobile makes it easy to
move massive volumes of data to the cloud, including video libraries, image repositories, or
even a complete data center migration. Transferring data with Snowmobile is secure, fast and
cost effective.

STORAGE GATEWAY

AWS Storage Gateway is a service that connects an on-premises software appliance with cloud-
based storage to provide seamless and secure integration between an organization’s on-
premises IT environment and AWS’s storage infrastructure. The service enables you to securely
store data to the AWS Cloud for scalable and cost-effective storage.
AWS Storage Gateway’s software appliance is available for download as a virtual machine (VM)
image that you install on a host in your datacenter. Storage Gateway supports either VMware
ESXi or Microsoft Hyper-V. Once you’ve installed your gateway and associated it with your AWS
account through the activation process, you can use the AWS Management Console to create
the storage gateway option that is right for you.

Three Different Types of Storage Gateway;

1. File Gateway (NFS & SMB)

Files are stored as objects in your S3 buckets, accessed through a Network File System
(NFS) mount point. Ownership, permissions, and timestamps are durably stored in S3 in
the user-metadata of the object associated with the file. Once objects are transferred to
S3, they can be managed as native S3 objects, and bucket policies such as versioning,
lifecycle management, and cross-region replication apply directly to objects stored in
your bucket.

2. Volume Gateway (iSCSI)

The volume interface presents your applications with disk volumes using the iSCSI block
protocol. Data written to these volumes can be asynchronously backed up as point-in-
time snapshots of your volumes, and stored in the cloud as Amazon EBS snapshots.
Snapshots are incremental backups that capture only changed blocks. All snapshot
storage is also compressed to minimize your storage charges.

- Stored Volumes

Stored volumes let you store your primary data locally, while asynchronously
backing up that data to AWS. Stored volumes provide your on-premises applications
with low-latency access to their entire datasets, while providing durable, off-site
backups. You can create storage volumes and mount them as iSCSI devices from
your on-premises application servers.
Data written to your stored volumes is stored on your on-premises storage
hardware. This data is asynchronously backed up to Amazon Simple Storage Service
(Amazon S3) in the form of Amazon Elastic Block Store (Amazon EBS) snapshots.
Stored volumes can be 1 GB – 16 TB in size.

- Cached Volumes

Cached volumes let you use Amazon Simple Storage Service (Amazon S3) as your
primary data storage while retaining frequently accessed data locally in your storage
gateway. Cached volumes minimize the need to scale your on-premises storage
infrastructure, while still providing your applications with low-latency access to their
frequently accessed data. You can create storage volumes up to 32 TiB in size and
attach to them as iSCSI devices from your on-premises application servers. Your
gateway stores data that you write to these volumes in Amazon S3 and retains
recently read data in your on-premises storage gateway’s cache and upload buffer
storage. Cached volumes can be 1 GB – 32 TB in size.

3. Tape Gateway (VTL)

Tape Gateway offers a durable, cost-effective solution to archive your data in the AWS
Cloud. The VTL interface it provides lets you leverage your existing tape-based backup
application infrastructure to store data on virtual tape cartridges that you create on
your tape gateway. Each tape gateway is preconfigured with a media changer and tape
drives, which are available to your existing client backup applications as iSCSI devices.
You add tape cartridges as you need to archive your data. Supported by NetBackup,
Backup Exec, Veeam etc.

ELASTIC COMPUTE CLOUD (EC2)

Amazon Elastic Compute Cloud (Amazon EC2) is a web service that provides resizable compute
capacity in the cloud. Amazon EC2 reduces the time required to obtain and boot new server
instances to minutes, allowing you to quickly scale capacity, both up and down, as your
computing requirements change.

EC2 Pricing Models

1. On Demand
Allows you to pay a fixed rate by the hour (or by the second) with no commitment.
It is useful for:
o Users that want the low cost and flexibility of Amazon EC2 without any up-front
payment or long-term commitment.
o Applications with short-term, spiky, or unpredictable workloads that cannot be
interrupted.
o Applications being developed or tested on Amazon EC2 for the first time.

2. Reserved
Provides you with a capacity reservation and offers a significant discount on the hourly
charge for an instance. Contract terms are 1 year or 3 years.
It is useful for:
o Applications with steady-state or predictable usage.
o Applications that require reserved capacity.
o Users able to make upfront payments to reduce their total computing costs even
further.

Reserved pricing types:
a. Standard Reserved Instances: These offer up to 75% off On-Demand instances.
The more you pay up front and the longer the contract, the greater the discount.
b. Convertible Reserved Instances: These offer up to 54% off On-Demand, with the capability
to change the attributes of the RI as long as the exchange results in the creation
of Reserved Instances of equal or greater value.
c. Scheduled Reserved Instances: These are available to launch within the time
windows you reserve. This option allows you to match your capacity reservation
to a predictable recurring schedule that only requires a fraction of a day, a week,
or a month.

3. Spot Pricing
Enables you to bid whatever price you want for instance capacity, providing for even
greater savings if your applications have flexible start and end times.
If the Spot instance is terminated by Amazon EC2, you will not be charged for a partial
hour of usage. However, if you terminate the instance yourself, you will be charged for
any hour in which the instance ran.
It is useful for:
o Applications that have flexible start and end times.
o Applications that are only feasible at very low compute prices.
o Users with urgent computing needs for large amounts of additional capacity.

4. Dedicated Hosts
A physical EC2 server dedicated for your use. Dedicated Hosts can help you reduce costs
by allowing you to use your existing server-bound software licenses.
It is useful for:
o Regulatory requirements that may not support multi-tenant
virtualization.
o Licensing which does not support multi-tenancy or cloud deployments.
Dedicated Hosts:
o Can be purchased On-Demand (hourly).
o Can be purchased as a Reservation for up to 70% off the On-Demand price.
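
To get a feel for how the pricing models compare, the sketch below applies the "up to" discounts quoted above to a made-up On-Demand rate, and models the Spot partial-hour rule as stated in these notes. The hourly rate, the 730-hour month and all names are assumptions for illustration; real prices vary by instance type, Region and term.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class PricingOption:
    name: str
    max_discount: float  # best-case fraction off On-Demand, from the notes above


OPTIONS = [
    PricingOption("On-Demand", 0.0),
    PricingOption("Standard Reserved", 0.75),
    PricingOption("Convertible Reserved", 0.54),
]


def best_case_monthly_cost(on_demand_hourly: float, option: PricingOption,
                           hours: int = 730) -> float:
    """Monthly cost if the full 'up to' discount applied (a best case)."""
    return round(on_demand_hourly * (1 - option.max_discount) * hours, 2)


def spot_billable_hours(hours_run: float, terminated_by_aws: bool) -> int:
    """Spot rule as described in these notes: a partial hour is free only
    when Amazon EC2 terminates the instance."""
    return math.floor(hours_run) if terminated_by_aws else math.ceil(hours_run)


HYPOTHETICAL_RATE = 0.10  # made-up On-Demand price in USD per hour
for option in OPTIONS:
    print(f"{option.name}: {best_case_monthly_cost(HYPOTHETICAL_RATE, option)}")
```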

EC2 Instance Types – Mnemonic

F   For FPGA
I   For IOPS
G   Graphics
H   High Disk Throughput
T   Cheap general purpose (think T2 Micro)
D   For Density
R   For RAM
M   Main choice for general purpose apps
C   For Compute
P   Graphics (think Pics)
X   Extreme Memory
Z   Extreme Memory AND CPU
A   Arm-based workloads
U   Bare Metal

Some Additional Points;

• When creating an EC2 instance, Termination Protection is turned off by default; you must
turn it on.
• On an EBS-backed instance, the default action is for the root EBS volume to be deleted
when the instance is terminated.
• EBS root volumes of your DEFAULT AMIs cannot be encrypted. You can use a third
party tool (such as BitLocker) to encrypt the root volume, or encryption can be done when
creating AMIs in the AWS console or using the API.
• Additional volumes can be encrypted.

Security Groups

With security groups, you can enable and disable ports (both inbound and outbound) for each
instance. Changes to security groups take effect immediately.
All inbound traffic is blocked by default.
All outbound traffic is allowed.

You can have any number of EC2 instances within a security group.
You can have multiple security groups attached to EC2 instances.
Security groups are STATEFUL: if you create an inbound rule allowing traffic in, that
traffic is automatically allowed back out again.

You cannot block specific IP addresses using security groups; use Network Access Control
Lists instead.
You can specify allow rules, but not deny rules.
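
The "allow rules only, everything else blocked" behaviour for inbound traffic can be modelled in a few lines. This is a simplified sketch, not the real evaluation engine; the class names and the example rules and addresses are hypothetical.

```python
import ipaddress
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class AllowRule:
    protocol: str
    port: int
    cidr: str  # source range that is allowed in


@dataclass
class SecurityGroup:
    # Only allow rules exist; there is no way to express a deny.
    inbound: List[AllowRule] = field(default_factory=list)

    def allows_inbound(self, protocol: str, port: int, source_ip: str) -> bool:
        source = ipaddress.ip_address(source_ip)
        # Traffic is blocked unless at least one allow rule matches.
        return any(
            rule.protocol == protocol
            and rule.port == port
            and source in ipaddress.ip_network(rule.cidr)
            for rule in self.inbound
        )


web = SecurityGroup(inbound=[
    AllowRule("tcp", 443, "0.0.0.0/0"),      # HTTPS from anywhere
    AllowRule("tcp", 22, "203.0.113.0/24"),  # SSH from one office range
])
print(web.allows_inbound("tcp", 443, "198.51.100.7"))  # True
print(web.allows_inbound("tcp", 22, "198.51.100.7"))   # False
```

Because only allow rules can be written, blocking a single address inside an allowed range is not possible here, which is why Network ACLs are needed for that.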

ELASTIC BLOCK STORE

Amazon Elastic Block Store (EBS) provides persistent block storage volumes for use with
Amazon EC2 instances in the AWS Cloud. Each Amazon EBS volume is automatically replicated
within its Availability Zone to protect you from component failure, offering high availability and
durability.

5 different types of EBS storage:

1. General Purpose SSD
2. Provisioned IOPS SSD
3. Throughput Optimized HDD
4. Cold HDD
5. EBS Magnetic

Volumes & Snapshots

• Volumes exist on EBS. Think of EBS as a virtual hard disk.
• Snapshots exist on S3. Think of snapshots as a photograph of the disk.
• Snapshots are point-in-time copies of volumes.
• Snapshots are incremental: only the blocks that have changed since
your last snapshot are moved to S3.
• If this is your first snapshot, it may take some time to create.
• To create a snapshot of an Amazon EBS volume that serves as a root device, you should
stop the instance before taking the snapshot.
• However, you can also take a snapshot while the instance is running.
• You can create AMIs from both volumes and snapshots.
• You can change EBS volumes on the fly, including the size and storage
type.
• Volumes will ALWAYS be in the same Availability Zone as the EC2 instance.
• To move an EC2 volume from one AZ to another, take a snapshot of it, create an AMI
from the snapshot and then use the AMI to launch the EC2 instance in a new AZ.
• To move an EC2 volume from one Region to another, take a snapshot of it, create an
AMI from the snapshot and then copy the AMI from one Region to the other. Then use
the copied AMI to launch the new EC2 instance in the new Region.
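
The idea that snapshots are incremental can be illustrated with a toy model in which a volume is a mapping from block number to block contents. This is a conceptual sketch only; the function name and block data are invented, and real EBS snapshots work on fixed-size blocks managed by AWS.

```python
from typing import Dict


def incremental_snapshot(previous_blocks: Dict[int, bytes],
                         current_blocks: Dict[int, bytes]) -> Dict[int, bytes]:
    """Return only the blocks that are new or changed since the last snapshot."""
    return {
        index: data
        for index, data in current_blocks.items()
        if previous_blocks.get(index) != data
    }


volume_v1 = {0: b"boot", 1: b"data-a", 2: b"data-b"}
first = incremental_snapshot({}, volume_v1)  # first snapshot copies every block

volume_v2 = {0: b"boot", 1: b"data-a2", 2: b"data-b", 3: b"new"}
second = incremental_snapshot(volume_v1, volume_v2)

print(sorted(first))   # [0, 1, 2]
print(sorted(second))  # [1, 3]
```

This is also why the first snapshot takes the longest: there is no earlier snapshot to compare against, so every block is copied.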

Amazon Machine Image (AMI) Types (EBS & Instance Store)

You can select your AMI based on:

• Region (see Regions and Availability Zones)
• Operating system
• Architecture (32-bit or 64-bit)
• Launch permissions
• Storage for the root device (root device volume)
o Instance Store (EPHEMERAL STORAGE)
o EBS-backed volumes

EBS vs Instance Store Volumes

All AMIs are categorized as either backed by Amazon EBS or backed by instance store.
For EBS volumes: The root device for an instance launched from the AMI is an Amazon EBS
volume created from an Amazon EBS snapshot. EBS-backed instances can be stopped. You will
not lose the data on the instance if it is stopped.
For instance store volumes: The root device for an instance launched from the AMI is an
instance store volume created from a template stored in Amazon S3. Instance store-backed
instances cannot be stopped. If the underlying host fails, you will lose your data.
If you reboot either type, you will not lose your data. By default, both ROOT volumes will be
deleted on termination. However, with EBS volumes, you can tell AWS to keep the root device
volume.

Encrypted Root Device Volumes & Snapshots

• Snapshots of encrypted volumes are encrypted automatically.
• Volumes restored from encrypted snapshots are encrypted automatically.
• You can share snapshots, but only if they are unencrypted.
• These snapshots can be shared with other AWS accounts or made public.
• You can now encrypt root device volumes upon creation of the EC2 instance.
• If the root device volume is not encrypted, you can encrypt it as follows:
o Create a snapshot of the unencrypted root device volume
o Create a copy of the snapshot and select the encrypt option
o Create an AMI from the encrypted snapshot
o Use that AMI to launch new encrypted instances
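
The four steps above map roughly onto four AWS CLI calls. The sketch below only builds the command lines as argument lists (it does not run them). The IDs in angle brackets are placeholders that each step would produce in practice, and the AMI name, device name `/dev/xvda` and instance type are assumptions for illustration; a real `register-image` call usually needs further options such as the virtualization type.

```python
import shlex
from typing import List


def encrypt_root_volume_commands(volume_id: str, region: str,
                                 snapshot_id: str = "<snapshot-id>",
                                 encrypted_snapshot_id: str = "<encrypted-snapshot-id>",
                                 image_id: str = "<image-id>") -> List[List[str]]:
    """Build the CLI calls for: snapshot, encrypted copy, AMI, launch."""
    return [
        # 1. Snapshot the unencrypted root volume.
        ["aws", "ec2", "create-snapshot", "--region", region,
         "--volume-id", volume_id],
        # 2. Copy the snapshot with encryption turned on.
        ["aws", "ec2", "copy-snapshot", "--region", region,
         "--source-region", region,
         "--source-snapshot-id", snapshot_id, "--encrypted"],
        # 3. Register an AMI whose root device is the encrypted snapshot.
        ["aws", "ec2", "register-image", "--region", region,
         "--name", "encrypted-root-ami",
         "--root-device-name", "/dev/xvda",
         "--block-device-mappings",
         f"DeviceName=/dev/xvda,Ebs={{SnapshotId={encrypted_snapshot_id}}}"],
        # 4. Launch a new instance from that AMI.
        ["aws", "ec2", "run-instances", "--region", region,
         "--image-id", image_id, "--instance-type", "t2.micro"],
    ]


# Hypothetical volume ID and Region, used only as an example.
commands = encrypt_root_volume_commands("vol-0123456789abcdef0", "eu-west-1")
for command in commands:
    print(" ".join(shlex.quote(part) for part in command))
```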

CLOUDWATCH

Amazon CloudWatch is a monitoring service for your AWS resources, as well as the
applications that you run on AWS. CloudWatch basically monitors performance.
It can monitor things like:

• Compute
o EC2 Instances
o Auto Scaling Groups
o Elastic Load Balancers
o Route53 Health Checks
• Storage & Content Delivery
o EBS Volumes
o Storage Gateways
o CloudFront

Host-level metrics consist of:

• CPU
• Network
• Disk
• Status Check

CloudWatch with EC2 monitors events every 5 minutes by default. You can have 1-minute
intervals by turning on detailed monitoring.

You can create CloudWatch alarms which trigger notifications.

CloudWatch Events helps you to respond to state changes in your AWS resources, and
CloudWatch Logs helps you to aggregate, monitor, and store logs.

AWS CloudTrail

AWS CloudTrail increases visibility into your user and resource activity by recording AWS
Management Console actions and API calls. You can identify which users and accounts called
AWS, the source IP address from which the calls were made, and when the calls occurred.

CloudTrail vs CloudWatch

• CloudWatch monitors performance.
• CloudTrail monitors API calls in the AWS platform.
• CloudWatch is all about performance; CloudTrail is all about auditing.

AWS Command Line

You can interact with AWS from anywhere in the world just by using the command line interface
(CLI).
You will need to set up access in IAM.
To use the AWS command line, log in to an EC2 instance (or any machine with the AWS CLI
installed) and prefix commands with “aws”.
For example: aws s3 ls
This command lists all the S3 buckets.

IAM ROLES with EC2

• Roles are more secure than storing your access key and secret access key on individual
EC2 instances.
• Roles are easier to manage.
• Roles can be assigned to an EC2 instance after it is created, using either the console or the
command line.
• Roles are universal – you can use them in any region.

Instance Metadata

Metadata is used to get information about an instance (such as its public IP).

To get the metadata, use the command:
curl http://169.254.169.254/latest/meta-data/
To get the user data, use the command:
curl http://169.254.169.254/latest/user-data
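
The same lookup can be done from code. The sketch below uses only the Python standard library and the token-based flow (IMDSv2): it first requests a session token, then sends that token with the metadata request. The function names and the 2-second timeout are assumptions; `fetch_metadata` only works when run on an EC2 instance, so the example at the bottom just prints the URL it would query.

```python
import urllib.request

IMDS_BASE = "http://169.254.169.254/latest"


def metadata_url(item: str = "") -> str:
    """Build the URL for one metadata item, e.g. 'public-ipv4'."""
    return f"{IMDS_BASE}/meta-data/{item}"


def fetch_metadata(item: str = "", timeout: float = 2.0) -> str:
    """Fetch a metadata item from inside an EC2 instance."""
    # Step 1: ask for a session token (valid here for 6 hours).
    token_request = urllib.request.Request(
        f"{IMDS_BASE}/api/token",
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": "21600"},
    )
    with urllib.request.urlopen(token_request, timeout=timeout) as response:
        token = response.read().decode()

    # Step 2: present the token when reading the metadata.
    request = urllib.request.Request(
        metadata_url(item),
        headers={"X-aws-ec2-metadata-token": token},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode()


# On an EC2 instance, fetch_metadata("public-ipv4") would return the public IP.
print(metadata_url("public-ipv4"))
```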

Elastic File System (EFS)

Amazon Elastic File System (Amazon EFS) is a file storage service for Amazon Elastic Compute Cloud
(Amazon EC2) instances. Amazon EFS is easy to use and provides a simple interface that allows you to
create and configure file systems quickly and easily. With Amazon EFS, storage capacity is elastic,
growing and shrinking automatically as you add and remove files, so your applications have the storage
they need, when they need it.

• It supports the Network File System Version 4 (NFSv4) protocol


• You only pay for the storage you use (no pre-provisioning required.)
• Can scale up to the petabytes
• Can support thousands of concurrent NFS connections
• Data is stored across multiple AZ’s within a region
• Read After Write Consistency.

EC2 Placement Groups


It is a way of placing your EC2 instances.

• You can’t merge placement groups


• You can’t move an existing instance into a placement group. You can create an AMI from your
existing instance, and then launch a new instance from the AMI into a placement group.

There are 3 different types of placement groups namely;

• Clustered Placement Group


A cluster placement group is a grouping of instances within a single Availability Zone. Cluster
placement groups are recommended for applications that need low network latency, high network
throughput, or both. AWS recommends homogeneous instances within a cluster placement
group.
Only certain instance types can be launched into a cluster placement group.
It can’t span multiple Availability Zones.

• Spread Placement Group


A spread placement group is a group of instances that are each placed on distinct underlying
hardware.
Spread placement groups are recommended for applications that have a small number of critical
instances that should be kept separate from each other.

• Partitioned Placement Group
When using partition placement groups, Amazon EC2 divides each group into logical segments
called partitions. Amazon EC2 ensures that each partition within a placement group has its own
set of racks. Each rack has its own network and power source. No two partitions within a
placement group share the same racks, allowing you to isolate the impact of hardware failure
within your application. Partition placement groups suit large distributed workloads that run
across multiple EC2 instances, such as HDFS, HBase, and Cassandra.

AWS DATABASES

Relational Databases
Relational databases are what most of us are used to. They have been around since the 70’s.
Think of a traditional spreadsheet;

• Database
• Tables
• Row
• Fields


There are different relational databases on AWS namely;

• MS SQL Server
• Oracle
• MySQL Server
• PostgreSQL
• Aurora
• MariaDB

Relational Database Services (RDS) Features;
RDS has two key features;

• Multi-AZ – For Disaster Recovery


• Read Replicas – For Performance.

Non-Relational Databases are as follows;


• Collection = Table
• Document = Row
• Key Value Pairs = Fields

Example of a non relational database;


{
  "_id" : "51262c865caasdsadfbe0545435",
  "firstname" : "John",
  "surname" : "Smith",
  "Age" : "23",
  "address" : [
    {"street" : "21 Jump Street",
     "suburb" : "Richmond"}
  ]
}

Relational Database Vs Non-Relational Database


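In a non-relational database, each document can have a different number of fields, and you can
add fields to one document without changing the others. In a relational database, every row in a
table has to follow the same column structure, so the data must stay consistent.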

DynamoDB (NoSQL)

It is Amazon’s NoSQL solution. Amazon DynamoDB is a fast and flexible NoSQL database service
for all applications that need consistent, single-digit millisecond latency at any scale. It is a fully
managed database and supports both document and key-value data models. Its flexible data
model and reliable performance make it a great fit for mobile, web, gaming, ad-tech, IoT, and
many other applications.

The basics of DynamoDB are as follows;

• Stored on SSD storage.
• Spread across 3 geographically distinct data centers.
• Eventually Consistent Reads (default)
Consistency across all copies of data is usually reached within a second. Repeating a
read after a short time should return the updated data. (Best read performance)
• Strongly Consistent Reads
A strongly consistent read returns a result that reflects all writes that received a
successful response prior to the read.
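
The difference between the two read types can be illustrated with a toy model. This is only a sketch, assuming one leader copy and lagging replicas; the ToyTable class and its methods are hypothetical and say nothing about how DynamoDB is built internally. The consistent_read flag is named after the idea of asking for a strongly consistent read.

```python
class ToyTable:
    """Toy model: one leader copy plus replicas that lag until replicate() runs."""

    def __init__(self, replica_count: int = 2):
        self.leader = {}
        self.replicas = [{} for _ in range(replica_count)]

    def put(self, key, value):
        # A write is acknowledged once the leader copy has it.
        self.leader[key] = value

    def replicate(self):
        # Stands in for background propagation ("usually within a second").
        for replica in self.replicas:
            replica.update(self.leader)

    def get(self, key, consistent_read: bool = False):
        # Strongly consistent reads use the leader; eventual reads use a replica.
        source = self.leader if consistent_read else self.replicas[0]
        return source.get(key)


table = ToyTable()
table.put("order-1", "shipped")
stale = table.get("order-1")                         # replica not updated yet
strong = table.get("order-1", consistent_read=True)  # sees the acknowledged write
table.replicate()
fresh = table.get("order-1")                         # repeated read after propagation
```

In this model the first eventual read misses the write, while the strongly consistent read and the repeated read both return it.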

Data Warehousing
Used for business intelligence, with tools like Cognos, Jaspersoft, SQL Server Reporting Services,
Oracle Hyperion, and SAP NetWeaver.
Data warehousing databases use a different type of architecture, both from a database
perspective and at the infrastructure layer.
Amazon’s data warehouse solution is called Redshift. (Mainly for OLAP)
Used to pull in very large and complex data sets. Usually used by management to run queries on
data (such as current performance vs targets).

OLTP vs OLAP
Online Transaction Processing (OLTP) differs from Online Analytics Processing (OLAP) in
terms of the types of queries you will run.
OLTP Example:
Order number 212002
Pulls up a row of data such as Name, Date, Address to Deliver to, Delivery Status etc.

OLAP Transaction Example:
Net profit for EMEA and Pacific for the Digital Radio product. Pulls in large numbers of records:
Sum of Radios Sold in EMEA
Sum of Radios Sold in Pacific
Unit Cost of Radio in each region
Sales price of each radio
Sales price – unit cost
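
The two query styles can be compared side by side with an in-memory SQLite database. This is a small sketch with made-up data; the table name, column names, customers, and prices are all assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_no INTEGER PRIMARY KEY, customer TEXT, "
    "region TEXT, sale_price REAL, unit_cost REAL)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?, ?)",
    [
        (212002, "Alice", "EMEA", 100.0, 60.0),
        (212003, "Bob", "EMEA", 100.0, 60.0),
        (212004, "Chen", "Pacific", 90.0, 50.0),
    ],
)

# OLTP-style query: fetch the single row for one order number.
oltp_row = conn.execute(
    "SELECT customer, region FROM orders WHERE order_no = ?", (212002,)
).fetchone()

# OLAP-style query: scan many rows and aggregate units sold and profit per region.
olap_rows = conn.execute(
    "SELECT region, COUNT(*), SUM(sale_price - unit_cost) "
    "FROM orders GROUP BY region ORDER BY region"
).fetchall()
```

The OLTP query touches one row by its key, while the OLAP query reads every row in the table to build the totals.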

Additional Points;
• RDS runs on virtual machines
• You cannot log in to these operating systems however.
• Patching of the RDS Operating System and DB is Amazon’s responsibility
• RDS is NOT Serverless
• Aurora Serverless IS Serverless

RDS – Back Ups, Multi-AZ & Read Replicas


There are two different types of Backups for RDS:
• Automated Backups
Automated Backups allow you to recover your database to any point in time within a
“retention period”. The retention period can be between one and 35 days. Automated
Backups will take a full daily snapshot and will also store transaction logs throughout the
day. When you do a recovery, AWS will first choose the most recent daily backup, and
then apply the transaction logs relevant to that day. This allows you to do a point in time
recovery down to a second, within the retention period.

Automated Backups are enabled by default. The backup data is stored in S3 and you get
free storage space equal to the size of your database. So if you have an RDS instance of
10 GB, you will get 10 GB worth of storage.
Backups are taken within a defined window. During the backup window, storage I/O
may be suspended while your data is being backed up and you may experience elevated
latency.

• Database Snapshots
DB Snapshots are done manually (ie they are user initiated.) They are stored even after
you delete the original RDS instance, unlike automated backups.
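
The point-in-time recovery steps described for Automated Backups can be sketched as follows. This is an illustrative sketch only, assuming snapshots and transaction logs are represented by their timestamps; plan_restore is a hypothetical helper, not an AWS API.

```python
from datetime import datetime


def plan_restore(snapshots, log_times, target):
    """Pick the latest daily snapshot at or before target, plus the logs to replay."""
    usable = [s for s in snapshots if s <= target]
    if not usable:
        raise ValueError("target is earlier than the oldest snapshot")
    base = max(usable)
    # Replay only the transaction logs written after the snapshot, up to the target.
    replay = [t for t in sorted(log_times) if base < t <= target]
    return base, replay


snapshots = [datetime(2024, 1, 1, 3), datetime(2024, 1, 2, 3)]
logs = [datetime(2024, 1, 2, h) for h in (1, 5, 9, 13)]
base, replay = plan_restore(snapshots, logs, datetime(2024, 1, 2, 10))
```

Here the restore starts from the 03:00 snapshot on the second day and replays the 05:00 and 09:00 logs to reach the 10:00 target.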

Restoring Backups
Whenever you restore either an Automatic Backup or a manual Snapshot, the restored version
of the database will be a new RDS instance with a new DNS endpoint.

Encryption At Rest
Encryption at rest is supported for MySQL, Oracle, SQL Server, PostgreSQL, MariaDB & Aurora.
Encryption is done using the AWS Key Management Service (KMS). Once your RDS
instance is encrypted, the data stored at rest in the underlying storage is encrypted, as are its
automated backups, read replicas, and snapshots.

Multi-AZ
Multi-AZ allows you to have an exact copy of your production database in another Availability
Zone. AWS handles the replication for you, so when your production database is written to, this
write will automatically be synchronized to the standby database. It is used for DR.

In the event of planned database maintenance, DB instance failure, or an Availability Zone
failure, Amazon RDS will automatically fail over to the standby so that database operations can
resume quickly without administrative intervention.
Multi-AZ is available for the following databases
• SQL Server
• Oracle
• MySQL Server
• PostgreSQL
• MariaDB

Read Replica
Read replicas allow you to have a read-only copy of your production database. This is achieved
by using asynchronous replication from the primary RDS instance to the read replica. You use
read replicas primarily for very read-heavy database workloads.

• It is used for scaling, not for DR!


• Must have automatic backups turned on in order to deploy a read replica.
• You can have up to 5 read replica copies for any database.
• You can have read replicas of read replicas (but watch out for latency.)
• Each read replica will have its own DNS end point.
• You can have read replicas that have Multi-AZ.
• You can create read replicas of Multi-AZ source databases.
• Read replicas can be promoted to be their own databases. This breaks the replication.
• You can have a read replica in a second region.

Read Replicas are available for the following databases
• MySQL Server
• PostgreSQL
• MariaDB
• Oracle
• Aurora

Redshift

Amazon Redshift is a fast and powerful, fully managed, petabyte-scale data warehouse service
in the cloud. Customers can start small for just $0.25 per hour with no commitments or upfront
costs and scale to a petabyte or more for $1,000 per terabyte per year, less than a tenth of
most other data warehousing solutions.

• It is used for Business Intelligence.
• Backups are enabled by default with a 1 day retention period.
• Maximum retention period is 35 days.
• Redshift always attempts to maintain at least 3 copies of your data (the original &
replica on the compute nodes and a backup in Amazon S3).
• Redshift can also asynchronously replicate your snapshots to S3 in another region for
disaster recovery.
Advanced Compression: Columnar data stores can be compressed much more than row-based
data stores because similar data is stored sequentially on disk. Amazon Redshift employs
multiple compression techniques and can often achieve significant compression relative to
traditional relational data stores. In addition, Amazon Redshift doesn’t require indexes or
materialized views, and so uses less space than traditional relational database systems. When
loading data into an empty table, Amazon Redshift automatically samples your data and selects
the most appropriate compression scheme.
Massively Parallel Processing (MPP): Amazon Redshift automatically distributes data and query
load across all nodes. Amazon Redshift makes it easy to add nodes to your data warehouse and
enables you to maintain fast query performance as your data warehouse grows.

Redshift is priced as follows;
• Compute Node Hours (Total number of hours you run across all your compute nodes for
the billing period. You are billed for 1 unit per node per hour, so a 3-node data
warehouse cluster running persistently for an entire month would incur 2,160 instance
hours. You will not be charged for leader node hours; only compute nodes will incur
charges.)
• Backup
• Data Transfer (Only within a VPC, not outside it)
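
The compute node hour figure above is simple arithmetic. A minimal sketch, assuming a 30-day month; the function name is made up for illustration.

```python
def compute_node_hours(compute_nodes: int, days: int, hours_per_day: int = 24) -> int:
    """Billable hours: one unit per compute node per hour (leader node is not billed)."""
    return compute_nodes * days * hours_per_day


# 3 compute nodes running for a 30-day month.
monthly_hours = compute_node_hours(compute_nodes=3, days=30)
```

With 3 nodes, 24 hours a day, and 30 days, this gives the 2,160 instance hours quoted in the notes.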

Security Considerations (Redshift):


• Encrypted in transit using SSL
• Encrypted at rest using AES-256 encryption
• By default Redshift takes care of key management. You can also:
o Manage your own keys through an HSM
o Use AWS Key Management Service.

Redshift Availability
• Currently only available in 1 AZ
• Can restore snapshots to new AZs in the event of an outage.

AURORA

Amazon Aurora is a MySQL-compatible, relational database engine that combines the speed and
availability of high-end commercial databases with the simplicity and cost-effectiveness of open source
databases. Amazon Aurora provides up to five times better performance than MySQL at a price point
one tenth that of a commercial database while delivering similar performance and availability.

• It starts with 10GB and scales in 10GB increments to 64TB (Storage Autoscaling)
• Compute resources can scale up to 32vCPUs and 244GB of Memory

• 2 copies of your data are contained in each Availability Zone, with a minimum of 3 Availability
Zones, giving 6 copies of your data.

Scaling Aurora
• Aurora is designed to transparently handle the loss of up to 2 copies of data without affecting
database write availability and up to three copies without affecting read availability.
• Aurora storage is also self-healing. Data blocks and disks are continuously scanned for errors and
repaired automatically.
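
The loss thresholds above can be written down as a small check. This is a sketch that only encodes the numbers stated in these notes (6 copies, writes survive the loss of 2, reads survive the loss of 3); the function name is hypothetical.

```python
TOTAL_COPIES = 6  # 2 copies in each of 3 Availability Zones


def aurora_availability(lost_copies: int) -> dict:
    """Apply the thresholds from the notes: writes survive 2 losses, reads survive 3."""
    if not 0 <= lost_copies <= TOTAL_COPIES:
        raise ValueError("lost_copies must be between 0 and 6")
    return {"write": lost_copies <= 2, "read": lost_copies <= 3}
```

For example, losing a whole Availability Zone removes 2 copies, which by these thresholds leaves both reads and writes available.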

Two Types of Aurora Replicas are available;


• Aurora Replicas (currently 15)
• MySQL Read Replicas (currently 5)

Backups With Aurora


• Automated backups are always enabled on Amazon Aurora DB Instances. Backups do not impact
database performance.
• You can also take snapshots with Aurora. This also does not impact on performance.
• You can share Aurora Snapshots with other AWS accounts.

ElastiCache

ElastiCache is a web service that makes it easy to deploy, operate, and scale an in-memory
cache in the cloud. The service improves the performance of web applications by allowing you
to retrieve information from fast, managed, in-memory caches, instead of relying entirely on
slower disk-based databases.

ElastiCache supports two open-source in-memory caching engines:


• Memcached
• Redis

ROUTE 53

Amazon Route 53

It is Amazon’s DNS service.

Some basic facts:

• IPv4 space is a 32 bit field and has over 4 billion different addresses (4,294,967,296)
• IPv6 has an address space of 128 bits, which is 340 undecillion addresses, i.e.
340,282,366,920,938,463,463,374,607,431,768,211,456 addresses
• If we look at common domain names such as google.com, bbc.co.uk, you will notice a
string of characters separated by dots (periods). The last word in a domain name
represents the “top level domain”. The second word in a domain name is known as a
second level domain name (this is optional though and depends on the domain name).
• These top level domain names are controlled by the Internet Assigned Numbers
Authority (IANA) in a root zone database which is essentially a database of all available
top level domains. You can view this database by visiting:
http://www.iana.org/domains/root/db
• Because all of the names in a given domain name have to be unique there needs to be a
way to organize this all so that domain names aren’t duplicated. This is where domain
registrars come in. A registrar is an authority that can assign domain names directly
under one or more top-level domains. These domains are registered with InterNIC, a
service of ICANN, which enforces uniqueness of domain names across the Internet. Each
domain name becomes registered in a central database known as the WhoIS database.
Popular domain registrars include Amazon, GoDaddy.com, 123-reg.co.uk etc.
• When you buy a domain, every DNS address begins with a Start Of Authority record or
an SOA
It contains information about:
o The name of the server that supplied the data for the zone.
o The administrator of the zone.
o The current version of the data file.
o The default number of seconds for the time-to-live field on resource records.
• NS Stands for Name Server Records
They are used by Top Level domain servers to direct traffic to the Content DNS server
which contains the authoritative DNS records.
• An “A” record is the fundamental type of DNS record. The “A” in A record stands for
“Address”. The A record is used by a computer to translate the name of the domain to
an IP address. For example, http://www.google.com to http://123.10.20.30.

• TTL: The length of time that a DNS record is cached on either the resolving server or the
user’s own local PC is equal to the value of the “Time To Live” (TTL) in seconds. The lower
the time to live, the faster changes to DNS records propagate throughout the
internet.
• CNAME or a Canonical Name can be used to resolve one domain name to another. For
example, you may have a mobile website with the domain name http://m.acloud.guru
that is used for when users browse to your domain name on their mobile devices. You
may also want the name http://mobile.acloud.guru to resolve to this same address.

• Alias Records are used to map resource record sets in your hosted zone to Elastic Load
Balancers, CloudFront distributions, or S3 buckets that are configured as websites.
Alias records work like a CNAME record in that you can map one DNS name
(www.example.com) to another ‘target’ DNS name (elb123.elb.amazonaws.com)
• Key difference – A CNAME can’t be used for naked domain names (zone apex record.)
You can’t have a CNAME for http://acloud.guru, it must be either an A record or an
Alias.
• You can buy domain names directly with AWS. It can take up to 3 days to register,
depending on the circumstances.
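
The IPv4 and IPv6 address counts in the basic facts above follow from the field sizes (32 bits and 128 bits). A quick check using Python's standard ipaddress module:

```python
import ipaddress

# The whole IPv4 space is the /0 network: 2**32 addresses.
ipv4_total = ipaddress.ip_network("0.0.0.0/0").num_addresses

# The whole IPv6 space is ::/0: 2**128 addresses.
ipv6_total = ipaddress.ip_network("::/0").num_addresses
```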

Routing Policies Available with Route53

• Simple Routing
If you choose the simple routing policy you can only have one record with multiple IP
addresses. If you specify multiple values in a record, Route 53 returns all values to the
user in a random order.
• Weighted Routing
Allows you to split your traffic based on different weights assigned.
For example, you can set 10% of your traffic to go to US-EAST-1 and 90% to go to EU-
WEST-1
You can set health checks on individual record sets.
If a record set fails a health check it will be removed from Route53 until it passes the
health check.
You can set SNS notifications to alert you if a health check is failed.

• Latency-based Routing
Allows you to route your traffic based on the lowest network latency for your end user
(ie, which region will give them the fastest response time).
To use latency-based routing, you create a latency resource record set for the Amazon
EC2 (or ELB) resource in each region that hosts your website. When Amazon Route 53
receives a query for your site, it selects the latency resource record set for the region
that gives the user the lowest latency. Route 53 then responds with the value associated
with that resource record set.

• Failover Routing
Failover routing policies are used when you want to create an active/passive set up. For
example, you may want your primary site to be in EU-WEST-2 and your secondary DR Site in AP-
SOUTHEAST-2.
Route53 will monitor the health of your primary site using a health check.
A health check monitors the health of your end points.

• Geolocation Routing
Geolocation routing lets you choose where your traffic will be sent based on the geographic
location of your users (i.e. the location from which DNS queries originate). For example, you
might want all queries from Europe to be routed to a fleet of EC2 instances that are specifically
configured for your European customers. These servers may have the local language of your
European customers and all prices are displayed in Euros.

• Geoproximity Routing (Traffic Flow only)


Geoproximity routing lets Amazon Route 53 route traffic to your resources based on the
geographic location of your users and your resources. You can also optionally choose to route
more traffic or less to a given resource by specifying a value, known as a bias. A bias expands or
shrinks the size of the geographic region from which traffic is routed to a resource.
To use Geoproximity routing, you must use Route 53 traffic flow.

• Multivalue Answer Policy


Multivalue answer routing lets you configure Amazon Route 53 to return multiple values, such
as IP addresses for your web servers, in response to DNS queries. You can specify multiple
values for almost any record, but multivalue answer routing also lets you check the health of
each resource, so Route 53 returns only values for healthy resources.
This is similar to simple routing however it allows you to put health checks on each record set.
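
Weighted routing with health checks can be sketched as a weighted random choice over the healthy records. This is a simplified model, not Route 53's implementation; the record layout and the pick_weighted helper are assumptions, and raising an error when nothing is healthy is a simplification made for this sketch.

```python
import random


def pick_weighted(records, rng):
    """Choose one healthy record with probability proportional to its weight."""
    healthy = [r for r in records if r["healthy"]]
    if not healthy:
        raise RuntimeError("no healthy records to return")
    weights = [r["weight"] for r in healthy]
    return rng.choices(healthy, weights=weights, k=1)[0]["value"]


records = [
    {"value": "us-east-1", "weight": 10, "healthy": True},
    {"value": "eu-west-1", "weight": 90, "healthy": True},
]
rng = random.Random(0)
draws = [pick_weighted(records, rng) for _ in range(10_000)]
us_share = draws.count("us-east-1") / len(draws)  # close to 0.10

# A record that fails its health check is skipped until it passes again.
records[1]["healthy"] = False
only_healthy = pick_weighted(records, rng)
```

Over many queries roughly 10% of answers point at us-east-1, and once eu-west-1 is marked unhealthy every answer goes to us-east-1.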

VIRTUAL PRIVATE CLOUD (VPC)

VPC

Amazon Virtual Private Cloud (Amazon VPC) lets you provision a logically isolated section of the
Amazon Web Services (AWS) cloud where you can launch AWS resources in a virtual network
that you define. You have complete control over your virtual networking environment,
including selection of your own IP address range, creation of subnets, and configuration of
route tables and network gateways.

You can easily customize the network configuration for your Amazon Virtual Private Cloud. For
example, you can create a public-facing subnet for your webservers that has access to the
Internet, and place your backend systems such as databases or applications servers in a private-
facing subnet with no Internet access. You can leverage multiple layers of security, including
security groups and network access control lists, to help control access to Amazon EC2
instances in each subnet.

Additionally, you can create a Hardware Virtual Private Network (VPN) connection between
your corporate datacenter and your VPC and leverage the AWS cloud as an extension of your
corporate datacenter.

Usable Internal IP Ranges in VPC

• 10.0.0.0 -- 10.255.255.255 (10/8 prefix)


• 172.16.0.0 -- 172.31.255.255 (172.16/12 prefix)
• 192.168.0.0 -- 192.168.255.255 (192.168/16 prefix)
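
A candidate CIDR block can be checked against the three ranges above with Python's standard ipaddress module. A minimal sketch; in_private_range is a made-up helper name.

```python
import ipaddress

PRIVATE_RANGES = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def in_private_range(cidr: str) -> bool:
    """True if the whole CIDR block sits inside one of the three ranges above."""
    network = ipaddress.ip_network(cidr)
    return any(network.subnet_of(parent) for parent in PRIVATE_RANGES)
```

For example, 172.31.0.0/16 falls inside the 172.16/12 range, while 172.32.0.0/16 does not.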

What can we do with a VPC?

• Launch instances into a subnet of your choosing


• Assign custom IP address ranges in each subnet
• Configure route tables between subnets
• Create internet gateway and attach it to our VPC
• Much better security control over your AWS resources
• Instance security groups
• Subnet network access control lists (ACLS)

Default VPC vs Custom VPC


• Default VPC is user friendly, allowing you to immediately deploy instances.
• All Subnets in default VPC have a route out to the internet.
• Each EC2 instance has both a public and private IP address.

VPC Peering
• Allows you to connect one VPC with another via a direct network route using private IP
addresses.
• Instances behave as if they were on the same private network
• You can peer VPC’s with other AWS accounts as well as with other VPCs in the same
account.
• Peering is in a star configuration: ie 1 central VPC peers with 4 others.
• NO TRANSITIVE PEERING!!! (If A peers with B and B peers with C, A & C are not peered
automatically)
• You can peer between regions.
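
The no-transitive-peering rule can be modelled as a lookup that only accepts direct peerings. This is a toy sketch; the VPC names and the can_route helper are hypothetical.

```python
def can_route(peerings, source, destination):
    """Traffic flows only over a direct peering; there is no hop through a third VPC."""
    return frozenset((source, destination)) in peerings


# A is peered with B, and B is peered with C.
peerings = {frozenset(("A", "B")), frozenset(("B", "C"))}
```

With these peerings, A can reach B and B can reach C, but A cannot reach C until a separate A to C peering is created.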

Remember;

• Think of a VPC as a logical datacenter in AWS.


• Consists of IGWs (or Virtual Private Gateways), Route Tables, Network Access Control
Lists, Subnets, and Security Groups
• 1 Subnet = 1 Availability Zone
• Security Groups are Stateful; Network Access Control Lists are Stateless
• When you create a VPC a default Route Table, Network Access Control Lists (NACL) and
a default Security Group is created.
• It won’t create any subnets, nor will it create a default internet gateway.
• US-East-1A in your AWS account can be a completely different availability zone to US-
East-1A in another AWS account. The AZ’s are randomized.
• Amazon always reserves 5 IP addresses within your subnets.
• You can only have 1 Internet Gateway per VPC.
• Security Groups can’t span VPCs.
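
The effect of the 5 reserved addresses on subnet sizing can be worked out directly. A small sketch using the standard ipaddress module; usable_addresses is a made-up helper name.

```python
import ipaddress

RESERVED_PER_SUBNET = 5  # figure from the notes


def usable_addresses(cidr: str) -> int:
    """Addresses left for your resources once the reserved ones are subtracted."""
    return ipaddress.ip_network(cidr).num_addresses - RESERVED_PER_SUBNET
```

A /24 subnet has 256 addresses, so 251 are usable; a /28 has 16, so only 11 are usable.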

NAT INSTANCES
• When creating a NAT instance, Disable Source/Destination Check on the Instance.
• NAT instances must be in a public subnet.
• There must be a route out of the private subnet to the NAT instance, in order for this to
work.
• The amount of traffic that NAT instances can support depends on the instance size. If
you are bottlenecking, increase the instance size.
• You can create high availability using Autoscaling Groups, multiple subnets in different
AZs, and a script to automate failover.
• Behind a Security Group

NAT Gateways
• Redundant inside the Availability Zone
• Preferred by the enterprise
• Starts at 5 Gbps and scales currently to 45 Gbps
• No need to patch
• Not associated with security groups
• Automatically assigned a public ip address
• Remember to update your route tables.
• No need to disable Source/Destination Checks.

• If you have resources in multiple Availability Zones and they share one NAT gateway, in
the event that the NAT gateway’s Availability zone is down, resources in the other
Availability Zones lose internet access. To create an Availability Zone-independent
architecture, create a NAT gateway in each Availability Zone and configure your routing
to ensure that resources use the NAT gateway in the same Availability Zone.

Network ACL

• Your VPC automatically comes with a default network ACL, and by default it allows all
outbound and inbound traffic.
• You can create custom network ACLs. By default, each custom network ACL denies all
inbound and outbound traffic until you add rules.
• Each subnet in your VPC must be associated with a network ACL. If you don’t explicitly
associate a subnet with a network ACL, the subnet is automatically associated with the
default network ACL.
• Block IP Addresses using network ACLs not Security Groups
• You can associate a network ACL with multiple subnets; however, a subnet can be
associated with only one network ACL at a time. When you associate a network ACL
with a subnet, the previous association is removed.
• Network ACLs contain a numbered list of rules that is evaluated in order, starting with
the lowest numbered rule.

• Network ACLs have separate inbound and outbound rules, and each rule can either
allow or deny traffic.
• Network ACLs are stateless; responses to allowed inbound traffic are subject to the rules
for outbound traffic (and vice versa.)
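
The rule evaluation order can be sketched as a first-match scan over the rules sorted by number. This is an illustrative model only; the rule dictionaries and the evaluate helper are assumptions, and the final deny stands in for the behaviour when no rule matches.

```python
import ipaddress


def evaluate(rules, source_ip, port):
    """Return the action of the lowest-numbered matching rule, else deny."""
    address = ipaddress.ip_address(source_ip)
    for rule in sorted(rules, key=lambda r: r["number"]):
        in_cidr = address in ipaddress.ip_network(rule["cidr"])
        in_ports = rule["from_port"] <= port <= rule["to_port"]
        if in_cidr and in_ports:
            return rule["action"]
    return "deny"  # nothing matched


# Hypothetical inbound rules: block one address, then allow HTTPS from anywhere.
inbound = [
    {"number": 200, "cidr": "0.0.0.0/0", "from_port": 443, "to_port": 443, "action": "allow"},
    {"number": 100, "cidr": "203.0.113.7/32", "from_port": 0, "to_port": 65535, "action": "deny"},
]
```

Rule 100 is checked before rule 200, so the blocked address is denied even though the later rule would allow it. Because network ACLs are stateless, a matching list of outbound rules would be evaluated separately for the response traffic.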

VPC FLOW LOGS


VPC Flow Logs is a feature that enables you to capture information about the IP traffic going to
and from network interfaces in your VPC. Flow log data is stored using Amazon CloudWatch
Logs. After you’ve created a flow log, you can view and retrieve its data in Amazon CloudWatch
Logs.

Flow logs can be created in 3 levels;


• VPC
• Subnet
• Network Interface Level.

Remember;
• You cannot enable flow logs for VPCs that are peered with your VPC unless the peer VPC
is in your account.
• You cannot tag a flow log.
• After you’ve created a flow log, you cannot change its configuration; for example, you
can’t associate a different IAM role with the flow log.

The following IP traffic is not monitored by Flow Logs;


• Traffic generated by instances when they contact the Amazon DNS server. If you use
your own DNS server, then all traffic to that DNS server is logged.
• Traffic generated by a Windows instance for Amazon Windows license activation.
• Traffic to and from 169.254.169.254 for instance metadata.
• DHCP traffic.
• Traffic to the reserved IP address for the default VPC router.

Bastion Host

A bastion host is a special purpose computer on a network specifically designed and configured
to withstand attacks. The computer generally hosts a single application, for example a proxy
server, and all other services are removed or limited to reduce the threat to the computer. It is
hardened in this manner primarily due to its location and purpose, which is either on the
outside of a firewall or in a demilitarized zone (DMZ) and usually involves access from untrusted
networks or computers.

• A NAT Gateway or NAT Instance is used to provide internet access to EC2 instances in
private subnets.
• A Bastion is used to securely administer EC2 instances (Using SSH or RDP). Bastions are
called Jump Boxes in Australia.
• You cannot use a NAT Gateway as a Bastion host.

Direct Connect
AWS Direct Connect is a cloud service solution that makes it easy to establish a dedicated
network connection from your premises to AWS. Using AWS Direct Connect, you can establish
private connectivity between AWS and your datacenter, office, or colocation environment,
which in many cases can reduce your network costs, increase bandwidth throughput, and
provide a more consistent network experience than Internet-based connections.

VPC Endpoint
A VPC endpoint enables you to privately connect your VPC to supported AWS services and VPC
endpoint services powered by PrivateLink without requiring an internet gateway, NAT device,
VPN connection, or AWS Direct Connect connection. Instances in your VPC do not require
public IP addresses to communicate with resources in the service. Traffic between your VPC and
the other service does not leave the Amazon network.

Endpoints are virtual devices. They are horizontally scaled, redundant, and highly available VPC
components that allow communication between instances in your VPC and services without
imposing availability risks or bandwidth constraints on your network traffic.

There are 2 types of VPC endpoints;


• Interface Endpoints
• Gateway Endpoints (currently support Amazon S3 & DynamoDB)

An interface endpoint is an elastic network interface with a private IP address that serves as an
entry point for traffic destined to a supported service. The following services are supported:

• Amazon API Gateway


• AWS CloudFormation
• Amazon CloudWatch
• Amazon CloudWatch Events
• Amazon CloudWatch Logs
• AWS CodeBuild
• AWS Config
• Amazon EC2 API
• Elastic Load Balancing API
• AWS Key Management Service
• Amazon Kinesis Data Streams
• Amazon SageMaker
• Amazon SageMaker Runtime
• Amazon SageMaker Notebook Instance
• AWS Secrets Manager
• AWS Security Token Service
• AWS Service Catalog
• Amazon SNS
• Amazon SQS
• AWS Systems Manager
• Endpoint Services Hosted by other AWS accounts
• Supported AWS Marketplace partner Services.

HA ARCHITECTURE

What is a Load Balancer?

It is a physical or virtual device that’s designed to help you balance network load across multiple servers.

Load Balancer Types


• Application Load Balancers
Best suited for load balancing of HTTP and HTTPS traffic. They operate at Layer 7 and are
application-aware. They are intelligent, and you can create advanced request routing,
sending specified requests to specific web servers.
• Network Load Balancer
They are best suited for load balancing of TCP traffic where extreme performance is
required. Operating at the connection level (Layer 4), Network Load Balancer are
capable of handling millions of requests per second, while maintaining ultra-low
latencies.
• Classic Load Balancer
They are legacy Elastic Load Balancers. You can load balance HTTP/HTTPS applications
and use Layer 7-specific features, such as X-Forwarded and sticky sessions. You can also
use strict Layer 4 load balancing for applications that rely purely on the TCP protocol.
If your application stops responding, the ELB (Classic Load Balancer) responds with a 504
error. This means that the application is having issues. This could be either at the Web
Server layer or at the Database Layer.
Identify where the application is failing, and scale it up or out where possible.

** If you need the IPv4 address of your end user, look for the X-Forwarded-For header.
** Instances monitored by ELB are reported as InService or OutOfService.
** Health checks check the instance health by talking to it.
** Load Balancers have their own DNS name. You are never given an IP address.
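
Reading the end user's address from the X-Forwarded-For header can be sketched as below. This assumes the usual comma-separated form, where the left-most entry is the original client; client_ip is a made-up helper and the headers are passed in as a plain dictionary with that exact key.

```python
from typing import Optional


def client_ip(headers: dict) -> Optional[str]:
    """Return the left-most X-Forwarded-For entry, which is the original client."""
    forwarded = headers.get("X-Forwarded-For", "")
    first = forwarded.split(",")[0].strip()
    return first or None
```

For a request that arrived with "203.0.113.9, 10.0.0.5" in the header, this returns 203.0.113.9; without the header it returns None.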

Advanced Load Balancer Theory

• Sticky Sessions enable your users to stick to the same EC2 instance.
Can be useful if you are storing information locally to that instance.
• Cross Zone Load Balancing enables you to load balance across multiple availability
zones.

• Path patterns allow you to direct traffic to different EC2 instances based on the URL
contained in the request.
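
Path-based routing can be sketched as a first-match lookup over wildcard patterns. This is a simplified illustration, not the load balancer's own rule engine; the patterns, target group names, and the route helper are all hypothetical.

```python
from fnmatch import fnmatchcase

# Hypothetical rules, checked in order; target group names are illustrative.
RULES = [("/images/*", "image-servers"), ("/api/*", "api-servers")]


def route(path: str, default: str = "web-servers") -> str:
    """Return the target group of the first rule whose pattern matches the path."""
    for pattern, target in RULES:
        if fnmatchcase(path, pattern):
            return target
    return default
```

Requests for /images/logo.png go to the image servers, /api/v1/users goes to the API servers, and anything else falls through to the default group.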

HA Architecture Example

• Always Design for failure


• Use multiple AZs and multiple Regions wherever you can.

CloudFormation
• Is a way of completely scripting your cloud environment
• Quick Start is a bunch of CloudFormation templates already built by AWS solutions
Architects allowing you to create complex environments very quickly.

Elastic Beanstalk
With Elastic Beanstalk, you can quickly deploy and manage applications in the AWS Cloud
without worrying about the infrastructure that runs those applications. You simply upload your
application, and Elastic Beanstalk automatically handles the details of capacity provisioning,
load balancing, scaling, and application health monitoring.

AWS APPLICATIONS

SQS

Amazon SQS is a web service that gives you access to a message queue that can be used to
store messages while waiting for a computer to process them.
Amazon SQS is a distributed queue system that enables web service applications to quickly and
reliably queue messages that one component in the application generates to be consumed by
another component. A queue is a temporary repository for messages that are awaiting
processing.

Using Amazon SQS, you can decouple the components of an application so they run
independently, easing message management between components. Any component of a
distributed application can store messages in a fail-safe queue.

Messages can contain up to 256 KB of text in any format. Any component can later retrieve the
messages programmatically using the Amazon SQS API.

The queue acts as a buffer between the component producing and saving data, and the
component receiving the data for processing. This means the queue resolves issues that arise if
the producer is producing work faster than the consumer can process it, or if the producer or
consumer are only intermittently connected to the network.

There are two types of queue:

• Standard Queue;
Amazon SQS offers standard as the default queue type. A standard queue lets you have
a nearly-unlimited number of transactions per second. Standard queues guarantee that
a message is delivered at least once. However, occasionally (because of the highly-
distributed architecture that allows high throughput), more than one copy of a message
might be delivered out of order. Standard queues provide best-effort ordering which
ensures that messages are generally delivered in the same order as they are sent.

• FIFO Queue;
The FIFO queue complements the standard queue. The most important features of this
queue type are FIFO (first-in-first-out) delivery and exactly-once processing: The order in
which messages are sent and received is strictly preserved and a message is delivered
once and remains available until a consumer processes and deletes it; duplicates are not
introduced into the queue.
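
Because a standard queue can deliver the same message more than once, consumers are usually written so that a repeat delivery is harmless. The sketch below shows one simple way to do that, by remembering message IDs that have already been handled. The `process_once` helper and the (id, body) tuples are assumptions made for illustration, not part of the SQS API.

```python
def process_once(messages, handler, seen_ids=None):
    """Call handler for each message ID only the first time it is seen."""
    seen_ids = set() if seen_ids is None else seen_ids
    for message_id, body in messages:
        if message_id in seen_ids:
            continue  # duplicate delivery from an at-least-once queue: skip it
        handler(body)
        seen_ids.add(message_id)
    return seen_ids
```

In a real system the set of seen IDs would need to live somewhere durable and shared between consumers; an in-memory set is only enough to show the idea.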

Visibility Timeout:
It is the amount of time that a message is invisible in the SQS queue after a reader picks it
up. Provided the job is processed and the reader deletes the message before the visibility
timeout expires, the message is removed from the queue. If the job is not processed within that
time, the message becomes visible again and another reader will process it. This could result
in the same message being delivered twice.
The maximum visibility timeout is 12 hours.
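
The behaviour can be pictured with a small in-memory model. This is only a sketch: `ToyQueue` and its methods are invented for illustration and are not the SQS API, and time is passed in explicitly as a number of seconds so the example stays deterministic.

```python
class ToyQueue:
    """In-memory stand-in for a queue with a visibility timeout (not the SQS API)."""

    def __init__(self, visibility_timeout: int):
        self.visibility_timeout = visibility_timeout
        self.invisible_until = {}  # message body -> time at which it is visible again

    def send(self, body: str) -> None:
        self.invisible_until[body] = 0  # visible immediately

    def receive(self, now: int):
        """Return one visible message and hide it for visibility_timeout seconds."""
        for body, visible_at in self.invisible_until.items():
            if now >= visible_at:
                self.invisible_until[body] = now + self.visibility_timeout
                return body
        return None  # nothing is visible right now

    def delete(self, body: str) -> None:
        """A reader deletes the message once it has finished processing it."""
        self.invisible_until.pop(body, None)
```

With a 30 second timeout, a message received at time 0 is hidden from a second reader at time 10, but reappears at time 30 if the first reader never deleted it.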

** SQS guarantees that your messages will be processed at least once.


** Amazon SQS long polling is a way to retrieve messages from your Amazon SQS queues.
While the regular short polling returns immediately (even if the message queue being polled is
empty), long polling doesn’t return a response until a message arrives in the message queue, or
the long poll times out.
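
The difference between the two polling styles can be sketched with Python's standard `queue` module standing in for the message queue. The `short_poll` and `long_poll` helpers are hypothetical names used only for this analogy; they are not SQS calls.

```python
import queue
import threading


def short_poll(q: queue.Queue):
    """Return immediately: a message if one is waiting, otherwise None."""
    try:
        return q.get_nowait()
    except queue.Empty:
        return None


def long_poll(q: queue.Queue, wait_seconds: float):
    """Wait up to wait_seconds for a message; return None if the wait times out."""
    try:
        return q.get(timeout=wait_seconds)
    except queue.Empty:
        return None


demo_queue = queue.Queue()
empty_result = short_poll(demo_queue)  # the queue is empty, so this returns at once

# A message arrives shortly afterwards; the long poll waits for it instead of giving up.
threading.Timer(0.1, demo_queue.put, args=["hello"]).start()
waited_result = long_poll(demo_queue, wait_seconds=2.0)
```

Long polling tends to mean fewer empty responses, because the reader waits for work instead of asking repeatedly.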

SIMPLE WORKFLOW SERVICE (SWF)

It is a web service that makes it easy to coordinate work across distributed application
components. SWF enables applications for a range of use cases, including media processing,
web application back-ends, business process workflows, and analytics pipelines, to be designed
as a coordination of tasks.

Tasks represent invocations of various processing steps in an application which can be
performed by executable code, web service calls, human actions, and scripts.

SWF Actors
• Workflow Starters – An application that can initiate (start) a workflow. Could be your
e-commerce website following the placement of an order, or a mobile app searching for
bus times.
• Deciders – Control the flow of activity tasks in a workflow execution. If something has
finished (or failed) in a workflow, a Decider decides what to do next.
• Activity workers – Carry out the activity tasks.
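
To make the Decider role more concrete, here is a minimal sketch of the kind of logic a decider might contain for an order workflow. The activity names, the `NEXT_ACTIVITY` table and the `decide` function are all hypothetical; a real decider would receive the workflow history from SWF and reply with decisions through the SWF API.

```python
# Hypothetical order workflow: each activity names the one that follows it.
NEXT_ACTIVITY = {
    None: "verify_order",  # nothing has run yet, so start with verification
    "verify_order": "charge_card",
    "charge_card": "ship_order",
    "ship_order": None,  # nothing left to schedule
}


def decide(last_activity, succeeded=True):
    """Return the decider's next action given the most recent activity result."""
    if not succeeded:
        return "fail_workflow"
    next_activity = NEXT_ACTIVITY[last_activity]
    return next_activity if next_activity is not None else "complete_workflow"
```

The activity workers would carry out whichever step the decider names, then report back so the decider can choose again.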

SWF vs SQS

• SQS has a retention period of up to 14 days; with SWF, workflow executions can last up
to 1 year
• Amazon SWF represents a task-oriented API, whereas Amazon SQS offers a message-
oriented API
• Amazon SWF ensures that a task is assigned only once and is never duplicated. With
Amazon SQS, you need to handle duplicated messages and may also need to ensure that
a message is processed only once.
• Amazon SWF keeps track of all the tasks and events in an application. With Amazon
SQS, you need to implement your own application-level tracking, especially if your
application uses multiple queues.

SIMPLE NOTIFICATION SERVICE (SNS)

Amazon Simple Notification Service (Amazon SNS) is a web service that makes it easy to set up,
operate, and send notifications from the cloud.
It provides developers with a highly scalable, flexible, and cost-effective capability to publish
messages from an application and immediately deliver them to subscribers or other
applications.

Besides pushing cloud notifications directly to mobile devices, Amazon SNS can also deliver
notifications by SMS text message or email, to Amazon Simple Queue Service (SQS) queues, or
to any HTTP endpoint.
SNS allows you to group multiple recipients using topics. A topic is an “access point” for
allowing recipients to dynamically subscribe for identical copies of the same notification.
One topic can support deliveries to multiple endpoint types; for example, you can group
together iOS, Android and SMS recipients. When you publish once to a topic, SNS delivers
appropriately formatted copies of your message to each subscriber.
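
The publish-once, deliver-to-many idea behind topics can be sketched in a few lines. `ToyTopic` is an invented in-memory class used only for illustration; it is not the SNS API, and the protocol labels are just strings.

```python
class ToyTopic:
    """In-memory publish/subscribe topic (illustrative only, not the SNS API)."""

    def __init__(self):
        self.subscribers = []  # list of (protocol, callback) pairs

    def subscribe(self, protocol, callback):
        self.subscribers.append((protocol, callback))

    def publish(self, message):
        """Push one copy of the message to every subscriber; return the delivery count."""
        for protocol, callback in self.subscribers:
            callback(f"[{protocol}] {message}")
        return len(self.subscribers)
```

The publisher calls `publish` once and does not need to know how many subscribers exist or which protocol each one uses.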

To prevent messages from being lost, all messages published to Amazon SNS are stored
redundantly across multiple availability zones.

SNS Benefits
• Instantaneous, push-based delivery (no polling)
• Simple APIs and easy integration with applications
• Flexible message delivery over multiple transport protocols
• Inexpensive, pay-as-you-go model with no up-front costs
• Web-based AWS Management Console offers the simplicity of a point-and-click
interface.

SNS vs SQS?
• Both are messaging services in AWS
• SNS – Push
• SQS – Polls (Pulls)

ELASTIC TRANSCODER

• Media Transcoder in the cloud.
• Converts media files from their original source format into different formats that will
play on smartphones, tablets, PCs, etc.
• Provides transcoding presets for popular output formats, which means that you don’t
need to guess about which settings work best on particular devices.
• Pay based on the minutes that you transcode and the resolution at which you
transcode.

• Just remember that Elastic Transcoder is a media transcoder in the cloud. It converts
media files from their original source format into different formats that will play on
smartphones, tablets, PCs, etc.

API GATEWAY

Amazon API Gateway is a fully managed service that makes it easy for developers to publish,
maintain, monitor, and secure APIs at any scale.
With a few clicks in the AWS Management Console, you can create an API that acts as a “front
door” for applications to access data, business logic, or functionality from your back-end
services, such as applications running on Amazon Elastic Compute Cloud (Amazon EC2), code
running on AWS Lambda, or any web application.

What Can API Gateway Do?


• Expose HTTPS endpoints to define a RESTful API
• Serverlessly connect to services like Lambda & DynamoDB
• Send each API endpoint to a different target
• Run efficiently with low cost
• Scale effortlessly
• Track and control usage by API key
• Throttle requests to prevent attacks
• Connect to CloudWatch to log all requests for monitoring
• Maintain multiple versions of your API

How do I configure API Gateway?


• Define an API (container)
• Define Resources and nested Resources (URL paths)
• For each Resource:
o Select supported HTTP methods (verbs)
o Set security
o Choose target (such as EC2, Lambda, DynamoDB, etc.)
o Set request and response transformations.

How do I deploy API Gateway?
• Deploy API to a stage:
o Uses API Gateway domain, by default
o Can use custom domain
o Now supports AWS Certificate Manager: free SSL/TLS certs.

API Gateway Caching


You can enable API caching in Amazon API Gateway to cache your endpoint’s response. With
caching, you can reduce the number of calls made to your endpoint and also improve the
latency of the requests to your API. When you enable caching for a stage, API Gateway caches
responses from your endpoint for a specified time-to-live (TTL) period, in seconds. API Gateway
then responds to the request by looking up the endpoint response from the cache instead of
making a request to your endpoint.
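
The effect of a TTL cache can be shown with a small sketch. The `TTLCache` class, the `fetch` callback and the 300 second value used in the usage note are assumptions for illustration only; API Gateway manages its cache for you, and this is not how you would configure it.

```python
class TTLCache:
    """Tiny time-to-live cache keyed by request path (illustrative only)."""

    def __init__(self, ttl_seconds: float):
        self.ttl_seconds = ttl_seconds
        self.entries = {}  # path -> (expiry time, cached response)

    def get(self, path, fetch, now: float):
        """Return the cached response if still fresh, otherwise call fetch(path)."""
        entry = self.entries.get(path)
        if entry is not None and now < entry[0]:
            return entry[1]  # cache hit: the backend endpoint is not called
        response = fetch(path)  # cache miss or expired entry: call the endpoint
        self.entries[path] = (now + self.ttl_seconds, response)
        return response
```

With a TTL of 300 seconds, a request at time 0 calls the endpoint, a request at time 299 is answered from the cache, and a request at time 300 calls the endpoint again.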

Same Origin Policy


In computing, the same-origin policy is an important concept in the web application security
model. Under the policy, a web browser permits scripts contained in a first web page to access
data in a second web page, but only if both web pages have the same origin.
This is done to prevent Cross-Site Scripting (XSS) attacks:
o Enforced by web browsers.
o Ignored by tools like Postman and curl.

CORS
CORS is one way the server at the other end (not the client code in the browser) can relax the
same-origin policy.
Cross-origin resource sharing (CORS) is a mechanism that allows restricted resources (e.g. fonts)
on a web page to be requested from another domain outside the domain from which the first
resource was served.

CORS in Action:
• Browser makes an HTTP OPTIONS call for a URL (OPTIONS is an HTTP method like GET,
PUT, and POST)
• Server returns a response that says:
“These other domains are approved to GET this URL.”
• If you see the error “Origin policy cannot be read at the remote resource”, you need to
enable CORS on API Gateway.
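
The server side of that exchange can be sketched as a function that decides which CORS headers to return for an OPTIONS request. The allowed origin and the `preflight_headers` helper are hypothetical; with API Gateway you would normally enable CORS in the service rather than write this yourself.

```python
ALLOWED_ORIGINS = {"https://app.example.com"}  # hypothetical approved origin


def preflight_headers(origin: str) -> dict:
    """Return the CORS headers a server might send in reply to an OPTIONS request."""
    if origin not in ALLOWED_ORIGINS:
        return {}  # no CORS headers: the browser blocks the cross-origin call
    return {
        "Access-Control-Allow-Origin": origin,
        "Access-Control-Allow-Methods": "GET, OPTIONS",
        "Access-Control-Allow-Headers": "Content-Type",
    }
```

The browser reads these response headers and only then decides whether the script on the page may make the real request.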

KINESIS

Streaming Data is data that is generated continuously by thousands of data sources, which
typically send in the data records simultaneously, and in small sizes (order of Kilobytes.)

• Purchases from online stores (think amazon.com)
• Stock prices
• Game data (as the gamer plays)
• Social network data
• Geospatial data (think uber.com)
• IoT sensor data

Amazon Kinesis is a platform on AWS to send your streaming data to. Kinesis makes it easy to
load and analyze streaming data, and also provides the ability for you to build your own
custom applications for your business needs.
There are three different types of Kinesis:

• Kinesis Streams
Consist of Shards;
o Each shard supports 5 transactions per second for reads, up to a maximum total data
read rate of 2 MB per second, and up to 1,000 records per second for writes, up to a
maximum total data write rate of 1 MB per second (including partition keys).
o The data capacity of your stream is a function of the number of shards that you
specify for the stream. The total capacity of the stream is the sum of the capacities
of its shards.
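
Since stream capacity is just the sum of the shard capacities, the number of shards a workload needs can be estimated with simple arithmetic. The sketch below uses the per-shard limits quoted in these notes; the `shards_needed` helper and the example workloads are made up for illustration.

```python
import math

# Per-shard limits as quoted in the notes above.
WRITE_MB_PER_SHARD = 1.0
WRITE_RECORDS_PER_SHARD = 1000
READ_MB_PER_SHARD = 2.0


def shards_needed(write_mb_per_s, records_per_s, read_mb_per_s):
    """Smallest shard count that satisfies all three per-shard limits."""
    return max(
        1,  # a stream always has at least one shard
        math.ceil(write_mb_per_s / WRITE_MB_PER_SHARD),
        math.ceil(records_per_s / WRITE_RECORDS_PER_SHARD),
        math.ceil(read_mb_per_s / READ_MB_PER_SHARD),
    )
```

For example, writing 3.5 MB per second needs 4 shards even if the record count is low, while 4,500 small records per second needs 5 shards even though the data volume is tiny.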

• Kinesis Firehose
Here the data has to be analyzed as it comes in. There is no data persistence, so you need
to do something with the data as soon as it arrives in Kinesis Firehose. The analyzed data
can then be stored in S3, Redshift, or an Elasticsearch cluster.
• Kinesis Analytics
Kinesis Analytics works with Kinesis Streams and with Kinesis Firehose. It can analyze the
data on the fly inside either service and then store it in S3, Redshift, or an
Elasticsearch cluster.

Web Identity Federation

Web Identity Federation lets you give your users access to AWS resources after they have
successfully authenticated with a web-based identity provider such as Amazon, Facebook, or
Google. Following successful authentication, the user receives an authentication code from the
Web ID provider, which they can trade for temporary AWS security credentials.

Amazon Cognito provides Web Identity Federation with the following features:

• Sign-up and sign-in to your apps
• Access for guest users
• Acts as an Identity Broker between your application and Web ID providers, so you don’t
need to write any additional code.
• Synchronizes user data for multiple devices.
• Recommended for all mobile applications that call AWS services.
• Allows users to authenticate with a Web Identity provider (Google, Facebook, Amazon)
• The user authenticates first with the Web ID provider and receives an authentication
token, which is exchanged for temporary AWS credentials allowing them to assume an
IAM role.
• Cognito is an Identity Broker which handles interaction between your applications and
the Web ID provider (you don’t need to write your own code to do this).
• User pools are user based. They handle things like user registration, authentication, and
account recovery.
• Identity pools authorize access to your AWS resources.

Amazon Cognito brokers between the app and Facebook or Google to provide temporary
credentials which map to an IAM role allowing access to the required resources.
No need for the application to embed or store AWS credentials locally on the device and it gives
users a seamless experience across all mobile devices.

Cognito User Pools are user directories used to manage sign-up and sign-in functionality for
mobile and web applications. Users can sign-in directly to the User Pool, or using Facebook,
Amazon, or Google. Cognito acts as an Identity Broker between the identity provider and AWS.
Successful authentication generates JSON Web Tokens (JWTs).
Identity Pools provide temporary AWS credentials to access AWS services like S3 or
DynamoDB.
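
A JWT is three base64url-encoded segments separated by dots, and the middle segment carries the claims about the user. The sketch below decodes that segment using only the standard library. It does not verify the signature, so it is for inspection only; the demo token and its claims are made up and were not issued by Cognito.

```python
import base64
import json


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWT segments are."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def decode_jwt_payload(token: str) -> dict:
    """Decode (NOT verify) the claims in the middle segment of a JWT."""
    payload_segment = token.split(".")[1]
    padding = "=" * (-len(payload_segment) % 4)  # restore the stripped padding
    return json.loads(base64.urlsafe_b64decode(payload_segment + padding))


# Build an unsigned demo token with made-up claims, purely for illustration.
demo_claims = {"sub": "user-123", "email": "jane@example.com"}
demo_token = ".".join([
    _b64url(json.dumps({"alg": "none"}).encode()),
    _b64url(json.dumps(demo_claims).encode()),
    "",  # empty signature segment
])
```

An application that relies on the claims must validate the token's signature first, which this sketch deliberately leaves out.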

Cognito tracks the association between user identity and the various different
devices they sign-in from. In order to provide a seamless user experience for your application,
Cognito uses Push Synchronization to push updates and synchronize user data across multiple
devices. Cognito uses SNS to send a notification to all the devices associated with a given user
identity whenever data stored in the cloud changes.

SERVERLESS

Lambda

AWS Lambda is a compute service where you can upload your code and create a Lambda
function. AWS Lambda takes care of provisioning and managing the servers that you use to run
the code. You don’t have to worry about operating systems, patching, scaling, etc.

You can use Lambda in the following ways:

• As an event-driven compute service where AWS Lambda runs your code in response to
events. These events could be changes to data in an Amazon S3 bucket or an Amazon
DynamoDB table.
• As a compute service to run your code in response to HTTP requests using Amazon API
Gateway or API calls made using AWS SDKs. This is what we use at A Cloud Guru.

Lambda supports the following Languages;


• Node.js
• Java
• Python
• C#
• Go

Lambda Pricing
• Priced on the number of requests
First 1 million requests are free. $0.20 per 1 million requests thereafter.
• Duration
Duration is calculated from the time your code begins executing until it returns or
otherwise terminates, rounded up to the nearest 100ms. The price depends on the
amount of memory you allocate to your function. You are charged $0.00001667 for
every GB-second used.
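
The two pricing components can be combined into a quick estimate. The sketch below uses only the figures quoted in these notes (including the 100 ms rounding); the `lambda_cost` helper and the example workload are hypothetical, and any free allowance on duration is ignored.

```python
import math

PRICE_PER_MILLION_REQUESTS = 0.20  # USD, after the free requests (from the notes)
FREE_REQUESTS = 1_000_000
PRICE_PER_GB_SECOND = 0.00001667  # USD (from the notes)


def lambda_cost(requests: int, avg_duration_ms: float, memory_mb: int) -> float:
    """Estimate a bill in USD; ignores any free allowance on duration."""
    billable_requests = max(0, requests - FREE_REQUESTS)
    request_cost = billable_requests / 1_000_000 * PRICE_PER_MILLION_REQUESTS

    billed_ms = math.ceil(avg_duration_ms / 100) * 100  # round up to the nearest 100 ms
    gb_seconds = requests * (billed_ms / 1000) * (memory_mb / 1024)
    duration_cost = gb_seconds * PRICE_PER_GB_SECOND

    return round(request_cost + duration_cost, 2)
```

For 3 million requests averaging 120 ms at 512 MB: requests cost 2 x $0.20 = $0.40; each call is billed as 200 ms at 0.5 GB, giving 300,000 GB-seconds, or about $5.00; the estimate is roughly $5.40.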

Benefits of Lambda
• No Servers!
• Continuous Scaling
• Very Cheap!
• Lambda functions are independent, 1 event = 1 function
• Lambda functions can trigger other lambda functions.

Remember
• Lambda architectures can get extremely complicated; AWS X-Ray allows you to debug
what is happening.
• Lambda can do things globally; for example, you can use it to back up S3 buckets to
other S3 buckets.
