Aws - Sa Notes
Aurora
ElastiCache
HA Architecture
SQS
SWF
SNS
Elastic Transcoder
Kinesis
Serverless
Lambda
AWS Global Infrastructure
Region
A Region is a geographical area. Each Region consists of 2 (or more) Availability Zones.
Availability Zone
An Availability Zone is one or more discrete data centers, each with redundant power,
networking and connectivity, housed in separate facilities. Because these data centers are close
together, they are counted as 1 Availability Zone.
Edge Locations
In every Region, there are edge locations. Edge locations are endpoints for AWS which are used
for caching content. Typically this consists of CloudFront, Amazon’s Content Delivery Network
(CDN).
There are many more Edge Locations than Regions. Currently there are over 150 Edge
Locations.
IDENTITY ACCESS MANAGEMENT (IAM)
IAM allows you to manage users and their level of access to the AWS console. It is important to
understand IAM and how it works, both for the exam and for administering a company’s AWS
account in real life.
Policies:
Policies are made up of documents, called Policy documents. These documents are in a format
called JSON and they give permissions as to what a User/ Group/Role is able to do.
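As a minimal sketch of what such a policy document looks like, the Python snippet below builds one as a dictionary and serializes it to JSON. The bucket name `example-bucket` and the single allowed action are hypothetical and chosen only for illustration.

```python
import json

# Hypothetical policy: allow read-only access to the objects in one bucket.
policy_document = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["s3:GetObject"],
            "Resource": "arn:aws:s3:::example-bucket/*",
        }
    ],
}

# Policy documents are plain JSON text.
policy_json = json.dumps(policy_document, indent=2)
print(policy_json)
```

A document like this would be attached to a User, Group or Role to grant the permissions it lists.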
Roles:
Key points:
• IAM is universal. It does not apply to regions at this time.
• The “root account” is simply the account created when you first set up your AWS account. It
has complete Admin access.
• New Users have No permissions when first created.
• New Users are assigned Access Key ID & Secret Access Keys when first created.
• These are not the same as a password. You cannot use the Access Key ID & Secret
Access Key to log in to the console. You can, however, use them to access AWS via the APIs and
the Command Line.
• You only get to view these once. If you lose them, you have to regenerate them. So,
save them in a secure location.
• Always set up Multi-Factor Authentication (MFA) on your root account.
• You can create and customize your own password rotation policy.
SIMPLE STORAGE SERVICE (S3)
S3 Storage
S3 stands for Simple Storage Service. It provides developers and IT teams with secure, durable,
highly-scalable object storage. Amazon S3 is easy to use, with a simple web services interface to
store and retrieve any amount of data from anywhere on the web.
S3 is a safe place to store your files. It is an Object-based storage. The data is spread across
multiple devices and facilities.
S3 is Object-based, i.e. it allows you to upload files. Files can be from 0 Bytes to 5 TB. Storage is
unlimited, and files are stored in buckets. A bucket is basically just like a folder.
S3 uses a universal namespace, i.e. bucket names must be unique globally. A typical bucket URL
looks like https://s3-eu-west-1.amazonaws.com/grbbucket
When you upload a file to S3, you will receive a HTTP 200 code if the upload was successful.
As it is object-based, it is not suitable for installing an OS or a DB (for that you need block-based
storage).
S3 is object based. Think of the objects just as files. Objects consist of the following:
o If you update AN EXISTING file or delete a file and read it immediately, you may
get the older version, or you may not. Basically changes to objects can take a
little bit of time to propagate.
• S3 Standard
99.99 % availability, 99.999999999% durability.
Stored redundantly across multiple devices in multiple facilities
Designed to sustain the loss of 2 facilities concurrently.
• S3 One Zone – IA
For when you want a lower-cost option for infrequently accessed data
but do not require the resilience of multiple Availability Zones.
• S3 – Intelligent Tiering
Designed to optimize costs by automatically moving data to the most cost-effective
access tier, without performance impact or operational overhead.
Data Archival Storage Classes;
• S3 Glacier
S3 Glacier is a secure, durable, and low-cost storage class for data archiving. You can
reliably store any amount of data at costs that are competitive with or cheaper than
on-premises solutions. Retrieval times are configurable from minutes to hours.
• Cross Region Replication Pricing
Applies if you need to replicate between buckets in 2 different regions, mainly for DR or high
availability.
S3 buckets can be configured to create access logs which log all requests made to the S3
bucket. These logs can be sent to another bucket, even a bucket in another account.
Encryption in Transit is achieved by SSL or TLS
Encryption at rest for S3 buckets (Server Side) is achieved by
S3 Versioning
• Stores all versions of an object (including all writes and even if you delete an object)
• Great Backup tool
• Once enabled, Versioning cannot be disabled, only suspended.
• Integrates with Lifecycle rules.
• Versioning’s MFA Delete capability, which uses multi-factor authentication, can be used
to provide an additional layer of security.
S3 Lifecycle Management
• Automates moving your objects between the different storage tiers.
• Can be used in conjunction with versioning.
• Can be applied to current versions and previous versions.
S3 Transfer Acceleration
S3 Transfer Acceleration utilizes the CloudFront Edge Network to accelerate your uploads to S3.
Instead of uploading directly to your S3 bucket, you upload to an edge location, which then
transfers the file to S3. You will get a distinct URL to upload to, for example:
acloudguru.s3-accelerate.amazonaws.com
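The distinct URL follows a fixed pattern: the bucket name followed by the `s3-accelerate` hostname. A minimal sketch, where the helper name `accelerate_endpoint` is hypothetical:

```python
def accelerate_endpoint(bucket_name: str) -> str:
    """Return the Transfer Acceleration hostname for the given bucket name."""
    return f"{bucket_name}.s3-accelerate.amazonaws.com"


endpoint = accelerate_endpoint("acloudguru")
print(endpoint)
```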
CLOUDFRONT
A content delivery network (CDN) is a system of distributed servers (network) that deliver
webpages and other web content to a user based on the geographic locations of the user, the
origin of the webpage, and a content delivery server.
CloudFront can be used to deliver your entire website, including dynamic, static, streaming, and
interactive content using a global network of edge locations. Requests for your content are
automatically routed to the nearest edge location, so content is delivered with the best possible
performance.
Terminologies
Edge Location: This is the location where content will be cached. This is separate to an
AWS Region/AZ. Edge Locations are not just READ only; you can write to them too
(i.e. put an Object on to them). Objects are cached for the life of the TTL (Time to
Live). You can clear cached objects, but you will be charged.
Origin: This is the origin of all the files that the CDN will distribute. This can be an S3
Bucket, an EC2 Instance, an Elastic Load Balancer, or Route 53.
To explain how CloudFront works, consider a web server hosted in London with users all
around the world. Edge locations are spread all over the world as well. The first user’s request
goes to an edge location. If the edge location does not have the content, it downloads it from
the origin and caches it for the time to live (TTL). For instance, if the TTL is 48 hours and
another user requests that content within that time, they get it a lot quicker from the
edge location.
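The cache-miss, cache-hit and expiry behaviour described above can be sketched as a toy model. This is not how CloudFront is implemented; the class `EdgeLocation`, the function `london_origin` and the explicit `now` timestamps are all assumptions made so that the example is deterministic.

```python
from typing import Callable, Dict, List, Tuple


class EdgeLocation:
    """Toy model of one edge location that caches objects for a fixed TTL."""

    def __init__(self, fetch_from_origin: Callable[[str], str], ttl_seconds: float):
        self.fetch_from_origin = fetch_from_origin
        self.ttl_seconds = ttl_seconds
        # Maps a path to (content, time at which the cached copy expires).
        self.cache: Dict[str, Tuple[str, float]] = {}

    def get(self, path: str, now: float) -> Tuple[str, str]:
        """Return (content, "HIT" or "MISS") for a request made at time `now`."""
        cached = self.cache.get(path)
        if cached is not None and now < cached[1]:
            return cached[0], "HIT"
        # Not cached, or the TTL has expired: go back to the origin.
        content = self.fetch_from_origin(path)
        self.cache[path] = (content, now + self.ttl_seconds)
        return content, "MISS"


origin_requests: List[str] = []


def london_origin(path: str) -> str:
    """Stand-in for the origin web server; records every request it serves."""
    origin_requests.append(path)
    return f"content of {path}"


edge = EdgeLocation(london_origin, ttl_seconds=48 * 3600)
first = edge.get("/index.html", now=0)           # first user: fetched from origin
second = edge.get("/index.html", now=3600)       # one hour later: served from cache
third = edge.get("/index.html", now=49 * 3600)   # after the 48 hour TTL: fetched again
print(first[1], second[1], third[1], len(origin_requests))
```

Only the first and third requests reach the origin; the second is answered from the edge cache.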
SNOWBALL
Snowball is a petabyte-scale data transport solution that uses secure appliances to transfer
large amounts of data into and out of AWS. Using Snowball addresses common challenges
with large-scale data transfers including high network costs, long transfer times, and security
concerns. Transferring data with Snowball is simple, fast, secure, and can be as little as one-fifth
the cost of high-speed internet.
Snowball comes in either a 50TB or 80TB size. Snowball uses multiple layers of security
designed to protect your data including tamper-resistant enclosures, 256-bit encryption, and an
industry-standard Trusted Platform Module (TPM) designed to ensure both security and full
chain-of-custody of your data. Once the data transfer job has been processed and verified, AWS
performs a software erasure of the Snowball appliance.
AWS Snowball Edge is a 100TB data transfer device with on-board storage and compute
capabilities. You can use Snowball Edge to move large amounts of data into and out of AWS, as
a temporary storage tier for large local datasets, or to support local workloads in remote or
offline locations.
Snowball Edge connects to your existing applications and infrastructure using standard storage
interfaces, streamlining the data transfer process and minimizing setup and integration.
Snowball Edge can cluster together to form a local storage tier and process your data on-
premises, helping ensure your applications continue to run even when they are not able to
access the cloud.
AWS Snowmobile is an Exabyte-scale data transfer service used to move extremely large
amounts of data to AWS. You can transfer up to 100PB per Snowmobile, a 45-foot long
ruggedized shipping container, pulled by a semi-trailer truck. Snowmobile makes it easy to
move massive volumes of data to the cloud, including video libraries, image repositories, or
even a complete data center migration. Transferring data with Snowmobile is secure, fast and
cost effective.
STORAGE GATEWAY
AWS Storage Gateway is a service that connects an on-premises software appliance with cloud-
based storage to provide seamless and secure integration between an organization’s on-
premises IT environment and AWS’s storage infrastructure. The service enables you to securely
store data to the AWS Cloud for scalable and cost-effective storage.
AWS Storage Gateway’s software appliance is available for download as a virtual machine (VM)
image that you install on a host in your datacenter. Storage Gateway supports either VMware
ESXi or Microsoft Hyper-V. Once you’ve installed your gateway and associated it with your AWS
account through the activation process, you can use the AWS Management Console to create
the storage gateway option that is right for you.
Three Different Types of Storage Gateway;
1. File Gateway (NFS)
Files are stored as objects in your S3 buckets, accessed through a Network File System
(NFS) mount point. Ownership, permissions, and timestamps are durably stored in S3 in
the user-metadata of the object associated with the file. Once objects are transferred to
S3, they can be managed as native S3 objects, and bucket policies such as versioning,
lifecycle management, and cross-region replication apply directly to objects stored in
your bucket.
2. Volume Gateway (iSCSI)
The volume interface presents your applications with disk volumes using the iSCSI block
protocol. Data written to these volumes can be asynchronously backed up as point-in-
time snapshots of your volumes, and stored in the cloud as Amazon EBS snapshots.
Snapshots are incremental backups that capture only changed blocks. All snapshot
storage is also compressed to minimize your storage charges.
- Stored Volumes
Stored volumes let you store your primary data locally, while asynchronously
backing up that data to AWS. Stored volumes provide your on-premises applications
with low-latency access to their entire datasets, while providing durable, off-site
backups. You can create storage volumes and mount them as iSCSI devices from
your on-premises application servers.
Data written to your stored volumes is stored on your on-premises storage
hardware. This data is asynchronously backed up to Amazon Simple Storage Service
(Amazon S3) in the form of Amazon Elastic Block Store (Amazon EBS) snapshots.
Stored volumes are 1 GB – 16 TB in size.
- Cached Volumes
Cached volumes let you use Amazon Simple Storage Service (Amazon S3) as your
primary data storage while retaining frequently accessed data locally in your storage
gateway. Cached volumes minimize the need to scale your on-premises storage
infrastructure, while still providing your applications with low-latency access to their
frequently accessed data. You can create storage volumes up to 32 TiB in size and
attach to them as iSCSI devices from your on-premises application servers. Your
gateway stores data that you write to these volumes in Amazon S3 and retains
recently read data in your on-premises storage gateway’s cache and upload buffer
storage. Cached volumes are 1 GB – 32 TB in size.
3. Tape Gateway (VTL)
Tape Gateway offers a durable, cost-effective solution to archive your data in the AWS
Cloud. The VTL interface it provides lets you leverage your existing tape-based backup
application infrastructure to store data on virtual tape cartridges that you create on
your tape gateway. Each tape gateway is preconfigured with a media changer and tape
drives, which are available to your existing client backup applications as iSCSI devices.
You add tape cartridges as you need to archive your data. Supported by NetBackup,
Backup Exec, Veeam etc.
ELASTIC COMPUTE CLOUD (EC2)
Amazon Elastic Compute Cloud (Amazon EC2) is a web service that provides resizable compute
capacity in the cloud. Amazon EC2 reduces the time required to obtain and boot new server
instances to minutes, allowing you to quickly scale capacity, both up and down, as your
computing requirements change.
1. On Demand
Allows you to pay a fixed rate by the hour (or by the second) with no commitment.
It is useful for;
o Users that want the low cost and flexibility of Amazon EC2 without any up-front
payment or long-term commitment
o Applications with short term, spiky, or unpredictable workloads that cannot be
interrupted.
o Applications being developed or tested on Amazon EC2 for the first time.
2. Reserved
Provides you with a capacity reservation and offers a significant discount on the hourly
charge for an instance. Contract terms are 1 year or 3 years.
It is useful for;
o Applications with steady state or predictable usage
o Applications that require reserved capacity.
o Users able to make upfront payments to reduce their total computing costs even
further.
Reserved Pricing Types;
a. Standard Reserved Instances: These offer up to 75% off On-Demand instances.
The more you pay up front and the longer the contract, the greater the discount.
b. Convertible Reserved Instances: These offer up to 54% off On-Demand instances, plus
the capability to change the attributes of the RI as long as the exchange results in the
creation of Reserved Instances of equal or greater value.
c. Scheduled Reserved Instances: These are available to launch within the time
windows you reserve. This option allows you to match your capacity reservation
to a predictable recurring schedule that only requires a fraction of a day, a week,
or a month.
3. Spot Pricing
Enables you to bid whatever price you want for instance capacity, providing for even
greater savings if your applications have flexible start and end times.
If the spot instance is terminated by Amazon EC2, you will not be charged for a partial
hour of usage. However, if you terminate the instance yourself, you will be charged for
any hour in which the instance ran.
It is useful for;
o Applications that have flexible start and end times.
o Applications that are only feasible at very low compute prices.
o Users with urgent computing needs for large amounts of additional capacity.
4. Dedicated Hosts
Physical EC2 server dedicated for your use. Dedicated Hosts can help you reduce costs
by allowing you to use your existing server-bound software licenses.
It is useful for;
o Useful for regulatory requirements that may not support multi-tenant
virtualization.
o Great for licensing which does not support multi-tenancy or cloud deployments.
o Can be purchased On-Demand (hourly.)
o Can be purchased as a Reservation for up to 70% off the On-Demand price.
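To make the "up to" discounts quoted above concrete, the sketch below applies each maximum discount to an On-Demand rate. The hourly rate of 0.10 is a hypothetical figure, not a real AWS price, and each discount is assumed to apply to the On-Demand price of the same resource.

```python
# Maximum discounts quoted in the notes, as fractions of the On-Demand price.
MAX_DISCOUNT = {
    "standard_reserved": 0.75,
    "convertible_reserved": 0.54,
    "dedicated_host_reservation": 0.70,
}


def best_case_hourly_rate(on_demand_rate: float, option: str) -> float:
    """Return the lowest possible hourly rate for an option, given its maximum discount."""
    return on_demand_rate * (1 - MAX_DISCOUNT[option])


hypothetical_on_demand_rate = 0.10  # assumed value for illustration only
best_case = {
    option: best_case_hourly_rate(hypothetical_on_demand_rate, option)
    for option in MAX_DISCOUNT
}
for option, rate in best_case.items():
    print(f"{option}: {rate:.3f} per hour")
```

These are best-case figures; the actual discount depends on the term length and how much is paid up front.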
EC2 Instance Types – Mnemonic
F For FPGA
I For IOPS
G Graphics
H High Disk Throughput
T Cheap general purpose (Think T2 Micro)
D For Density
R For RAM
M Main choice for general purpose apps
C For Compute
P Graphics (Think Pics)
X Extreme Memory
Z Extreme Memory AND CPU
A Arm-based workloads
U Bare Metal
Some Additional Points;
• When creating an EC2 instance, Termination Protection is turned off by default; you must
turn it on.
• On an EBS-backed instance, the default action is for the root EBS volume to be deleted
when the instance is terminated.
• EBS Root Volumes of your DEFAULT AMIs cannot be encrypted. You can use a third
party tool (such as BitLocker) to encrypt the root volume, or this can be done when
creating AMIs in the AWS console or using the API.
• Additional volumes can be encrypted.
Security Groups
With security groups, you can enable and disable ports (both inbound and outbound) for each
instance. Changes made to security groups are applied and reflected instantaneously.
All inbound traffic is blocked by default.
All Outbound traffic is allowed.
Changes to Security Groups take effect immediately.
You can have any number of EC2 instances within a security group.
You can have multiple security groups attached to EC2 Instances.
Security Groups are STATEFUL, which means that if you create an inbound rule allowing traffic
in, that traffic is automatically allowed back out again.
You cannot block a specific IP address using Security Groups; use Network Access Control
Lists instead.
You can specify allow rules, but not deny rules.
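The "allow rules only, everything else blocked" behaviour for inbound traffic can be sketched as a small evaluation function. This is a simplified model, not the AWS API: the `AllowRule` type, the `inbound_allowed` function and the example addresses are hypothetical, and port ranges and statefulness are left out.

```python
import ipaddress
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class AllowRule:
    """One inbound allow rule: protocol, single port and permitted source range."""
    protocol: str
    port: int
    cidr: str


def inbound_allowed(rules: List[AllowRule], protocol: str, port: int, source_ip: str) -> bool:
    """Traffic is allowed only if some rule matches; there are no deny rules."""
    source = ipaddress.ip_address(source_ip)
    return any(
        rule.protocol == protocol
        and rule.port == port
        and source in ipaddress.ip_network(rule.cidr)
        for rule in rules
    )


web_sg = [
    AllowRule("tcp", 443, "0.0.0.0/0"),       # HTTPS from anywhere
    AllowRule("tcp", 22, "203.0.113.0/24"),   # SSH from one example range only
]

https_from_anywhere = inbound_allowed(web_sg, "tcp", 443, "198.51.100.7")
ssh_from_outside = inbound_allowed(web_sg, "tcp", 22, "198.51.100.7")
empty_group = inbound_allowed([], "tcp", 80, "198.51.100.7")
print(https_from_anywhere, ssh_from_outside, empty_group)
```

Because the only way to refuse traffic is to not allow it, blocking one specific address inside an otherwise allowed range cannot be expressed, which is why the notes point to Network ACLs for that.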
ELASTIC BLOCK STORE
Amazon Elastic Block Store (EBS) provides persistent block storage volumes for use with
Amazon EC2 instances in the AWS Cloud. Each Amazon EBS volume is automatically replicated
within its Availability Zone to protect you from component failure, offering high availability and
durability.
Volumes & Snapshots
EBS vs Instance Store Volumes
All AMIs are categorized as either backed by Amazon EBS or backed by instance store.
For EBS Volumes: The root device for an instance launched from the AMI is an Amazon EBS
volume created from an Amazon EBS snapshot. EBS backed instances can be stopped. You will
not lose the data on this instance if it is stopped.
For Instance Store Volumes: The root device for an instance launched from the AMI is an
instance store volume created from a template stored in Amazon S3. Instance store-backed
instances cannot be stopped. If the underlying host fails, you will lose your data.
If you reboot both types, you will not lose your data. By default, both ROOT volumes will be
deleted on termination. However, with EBS volumes, you can tell AWS to keep the root device
volume.
CLOUD WATCH
Amazon CloudWatch is a monitoring service for your AWS resources, as well as the
applications that you run on AWS. CloudWatch basically monitors performance.
It can monitor things like;
• Compute
o EC2 Instances
o Auto scaling Groups
o Elastic Load Balancers
o Route53 Health Checks
• Storage & Content Delivery
o EBS Volumes
o Storage Gateways
o CloudFront
Host level Metrics Consist of:
• CPU
• Network
• Disk
• Status Check
With EC2, CloudWatch monitors events every 5 minutes by default. You can have 1 minute
intervals by turning on detailed monitoring.
AWS CloudTrail increases visibility into your user and resource activity by recording AWS
Management Console actions and API calls. You can identify which users and accounts called
AWS, the source IP address from which the calls were made, and when the calls occurred.
CloudTrail vs CloudWatch
You can interact with AWS from anywhere in the world just by using the command line (CLI).
You will need to set up access in IAM.
To use the AWS command line, log in to any AWS EC2 instance and prefix commands with “aws”.
E.g.: aws s3 ls
This command lists all the S3 buckets.
• Roles are more secure than storing your access key and secret access key on individual
EC2 instances.
• Roles are easier to manage.
• Roles can be assigned to an EC2 instance after it is created using both the console &
command line.
• Roles are universal – you can use them in any region.
Instance Metadata
Elastic File System (EFS)
Amazon Elastic File System (Amazon EFS) is a file storage service for Amazon Elastic Compute Cloud
(Amazon EC2) instances. Amazon EFS is easy to use and provides a simple interface that allows you to
create and configure file systems quickly and easily. With Amazon EFS, storage capacity is elastic,
growing and shrinking automatically as you add and remove files, so your applications have the storage
they need, when they need it.
• Partitioned Placement Group
When using partition placement groups, Amazon EC2 divides each group into logical segments
called partitions. Amazon EC2 ensures that each partition within a placement group has its own
set of racks. Each rack has its own network and power source. No two partitions within a
placement group share the same racks, allowing you to isolate the impact of hardware failure
within your application. Used for multiple EC2 instances running HDFS, HBase, and Cassandra.
AWS DATABASES
Relational Databases
Relational databases are what most of us are used to. They have been around since the 70’s.
Think of a traditional spreadsheet;
• Database
• Tables
• Row
• Fields
Eg;
• MS SQL Server
• Oracle
• MySQL Server
• PostgreSQL
• Aurora
• MariaDB
Relational Database Services (RDS) Features;
RDS has two key features;
"_id" : "51262c865caasdsadfbe0545435",
"firstname" : "John",
"surname" : "Smith",
"Age" : "23",
"address" : [
The basics of DynamoDB are as follows;
Data Warehousing
Used for business intelligence. Tools like Cognos, Jaspersoft, SQL Server Reporting Services,
Oracle Hyperion, SAP NetWeaver.
Data warehousing databases use a different type of architecture, both from a database
perspective and at the infrastructure layer.
Amazon’s Data Warehouse Solution is called Redshift. (Mainly for OLAP)
Used to pull in very large and complex data sets. Usually used by management to do queries on
data (such as current performance vs targets etc)
OLTP vs OLAP
Online Transaction Processing (OLTP) differs from Online Analytics Processing (OLAP) in
terms of the types of queries you will run.
OLTP Example:
Order number 212002
Pulls up a row of data such as Name, Date, Address to Deliver to, Delivery Status etc.
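The difference in query shape can be sketched with Python's built-in `sqlite3` module. The `orders` table, its columns and its rows are hypothetical; SQLite stands in here for both kinds of database purely to show the queries.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    "order_number INTEGER PRIMARY KEY, name TEXT, region TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    [
        (212002, "John Smith", "EU", 40.0),
        (212003, "Jane Doe", "EU", 60.0),
        (212004, "Ann Lee", "US", 25.0),
    ],
)

# OLTP-style query: look up the single row for one order number.
oltp_row = conn.execute(
    "SELECT name, region, amount FROM orders WHERE order_number = ?",
    (212002,),
).fetchone()

# OLAP-style query: aggregate across many rows to answer a business question.
olap_rows = conn.execute(
    "SELECT region, SUM(amount) FROM orders GROUP BY region ORDER BY region"
).fetchall()

print(oltp_row)
print(olap_rows)
```

The first query touches one row by its key; the second scans and sums the whole table, which is the kind of workload a data warehouse is built for.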
Additional Points;
• RDS runs on virtual machines
• You cannot log in to these operating systems however.
• Patching of the RDS Operating System and DB is Amazon’s responsibility
• RDS is NOT Serverless
• Aurora Serverless IS Serverless
Automated Backups are enabled by default. The backup data is stored in S3 and you get
free storage space equal to the size of your database. So if you have an RDS instance of
10 GB, you will get 10 GB worth of storage.
Backups are taken within a defined window. During the backup window, storage I/O
may be suspended while your data is being backed up and you may experience elevated
latency.
• Database Snapshots
DB Snapshots are done manually (ie they are user initiated.) They are stored even after
you delete the original RDS instance, unlike automated backups.
Restoring Backups
Whenever you restore either an Automatic Backup or a manual Snapshot, the restored version
of the database will be a new RDS instance with a new DNS endpoint.
Encryption At Rest
Encryption at rest is supported for MySQL, Oracle, SQL Server, PostgreSQL, MariaDB & Aurora.
Encryption is done using the AWS Key Management Service (KMS). Once your RDS
instance is encrypted, the data stored at rest in the underlying storage is encrypted, as are its
automated backups, read replicas, and snapshots.
Multi-AZ
Multi-AZ allows you to have an exact copy of your production database in another Availability
Zone. AWS handles the replication for you, so when your production database is written to, this
write will automatically be synchronized to the standby database. Multi-AZ is used for DR.
Read Replica
Read replicas allow you to have a read-only copy of your production database. This is achieved
by using asynchronous replication from the primary RDS instance to the read replica. You use
read replicas primarily for very read-heavy database workloads.
Read Replicas are available for the following databases
• MySQL Server
• PostgreSQL
• MariaDB
• Oracle
• Aurora
Redshift
Amazon Redshift is a fast and powerful, fully managed, petabyte-scale data warehouse service
in the cloud. Customers can start small for just $0.25 per hour with no commitments or upfront
costs and scale to a petabyte or more for $1,000 per terabyte per year, less than a tenth of
most other data warehousing solutions.
Redshift is priced as follows;
• Compute Node Hours (Total number of hours you run across all your compute nodes for
the billing period. You are billed for 1 unit per node per hour, so a 3-node data
warehouse cluster running persistently for an entire month would incur 2,160 instance
hours. You will not be charged for leader node hours; only compute nodes will incur
charges.)
• Backup
• Data Transfer (Only within a VPC, not outside it)
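The compute node hours figure quoted above is a simple product. A minimal sketch, assuming a 30-day month (the function name is hypothetical):

```python
def compute_node_hours(nodes: int, days: int, hours_per_day: int = 24) -> int:
    """Billable compute node hours: one unit per compute node per hour."""
    return nodes * days * hours_per_day


# A 3-node cluster running persistently for a 30-day month.
monthly_hours = compute_node_hours(nodes=3, days=30)
print(monthly_hours)
```

The leader node is not included in `nodes`, since only compute nodes incur charges.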
Redshift Availability
• Currently only available in 1 AZ
• Can restore snapshots to new AZs in the event of an outage.
AURORA
Amazon Aurora is a MySQL-compatible, relational database engine that combines the speed and
availability of high-end commercial databases with the simplicity and cost-effectiveness of open source
databases. Amazon Aurora provides up to five times better performance than MySQL at a price point
one tenth that of a commercial database while delivering similar performance and availability.
• 2 copies of your data are contained in each Availability Zone, with a minimum of 3 Availability
Zones, giving 6 copies of your data.
Scaling Aurora
• Aurora is designed to transparently handle the loss of up to 2 copies of data without affecting
database write availability and up to three copies without affecting read availability.
• Aurora storage is also self-healing. Data blocks and disks are continuously scanned for errors and
repaired automatically.
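The fault-tolerance figures above can be expressed as a small check. This is only a sketch of the stated thresholds (2 lost copies for writes, 3 for reads, out of 6); the function name `aurora_availability` is hypothetical.

```python
TOTAL_COPIES = 6           # 2 copies in each of 3 Availability Zones
MAX_LOST_FOR_WRITES = 2    # writes survive the loss of up to 2 copies
MAX_LOST_FOR_READS = 3     # reads survive the loss of up to 3 copies


def aurora_availability(copies_lost: int) -> dict:
    """Report whether writes and reads remain available after losing some copies."""
    if not 0 <= copies_lost <= TOTAL_COPIES:
        raise ValueError("copies_lost must be between 0 and 6")
    return {
        "writes": copies_lost <= MAX_LOST_FOR_WRITES,
        "reads": copies_lost <= MAX_LOST_FOR_READS,
    }


one_az_down = aurora_availability(2)     # a whole AZ holds 2 copies
az_plus_one = aurora_availability(3)     # an AZ plus one more copy
print(one_az_down, az_plus_one)
```

Losing an entire Availability Zone removes 2 copies, so the database stays writable; losing one further copy leaves it readable only.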
ElastiCache
ElastiCache is a web service that makes it easy to deploy, operate, and scale an in-memory
cache in the cloud. The service improves the performance of web applications by allowing you
to retrieve information from fast, managed, in-memory caches, instead of relying entirely on
slower disk-based databases.
ROUTE 53
Amazon Route 53
• TTL: The length of time that a DNS record is cached on either the Resolving Server or the
user’s own local PC is equal to the value of the “Time To Live” (TTL) in seconds. The lower
the time to live, the faster changes to DNS records propagate throughout the
internet.
• CNAME or a Canonical Name can be used to resolve one domain name to another. For
example, you may have a mobile website with the domain name http://m.acloud.guru
that is used for when users browse to your domain name on their mobile devices. You
may also want the name http://mobile.acloud.guru to resolve to this same address.
• Alias Records are used to map resource record sets in your hosted zone to Elastic Load
Balancers, CloudFront distributions, or S3 buckets that are configured as websites.
Alias records work like a CNAME record in that you can map one DNS name
(www.example.com) to another ‘target’ DNS name (elb123.elb.amazonaws.com)
• Key difference – A CNAME can’t be used for naked domain names (zone apex record.)
You can’t have a CNAME for http://acloud.guru, it must be either an A record or an
Alias.
• You can buy domain names directly with AWS & it can take up to 3 days to register
depending on the circumstances.
• Simple Routing
If you choose the simple routing policy you can only have one record with multiple IP
addresses. If you specify multiple values in a record, Route 53 returns all values to the
user in a random order.
• Weighted Routing
Allows you to split your traffic based on different weights assigned.
For example, you can set 10% of your traffic to go to US-EAST-1 and 90% to go to EU-
WEST-1
You can set health checks on individual record sets.
If a record set fails a health check it will be removed from Route53 until it passes the
health check.
You can set SNS notifications to alert you if a health check is failed.
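The 10% / 90% split from the weighted routing example can be simulated with a weighted random choice. This is a sketch of the idea only, not Route 53's actual selection logic; the seed and sample size are arbitrary.

```python
import random
from collections import Counter

# Weights from the example above: 10% to US-EAST-1, 90% to EU-WEST-1.
weights = {"us-east-1": 10, "eu-west-1": 90}


def pick_region(rng: random.Random) -> str:
    """Choose one region at random, in proportion to its weight."""
    regions = list(weights)
    return rng.choices(regions, weights=[weights[r] for r in regions], k=1)[0]


rng = random.Random(42)  # fixed seed so repeated runs give the same counts
samples = 10_000
counts = Counter(pick_region(rng) for _ in range(samples))
us_share = counts["us-east-1"] / samples
print(counts, us_share)
```

Over many queries the share sent to each region approaches its weight divided by the sum of all weights.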
• Latency-based Routing
Allows you to route your traffic based on the lowest network latency for your end user
(ie, which region will give them the fastest response time).
To use latency-based routing, you create a latency resource record set for the Amazon
EC2 (or ELB) resource in each region that hosts your website. When Amazon Route 53
receives a query for your site, it selects the latency resource record set for the region
that gives the user the lowest latency. Route 53 then responds with the value associated
with that resource record set.
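The selection step amounts to picking the region with the smallest latency and returning its record. In the sketch below the latency measurements and IP addresses are hypothetical example values.

```python
from typing import Dict


def lowest_latency_region(latency_ms: Dict[str, float]) -> str:
    """Return the region with the lowest measured latency for this user."""
    return min(latency_ms, key=latency_ms.__getitem__)


# One latency record set per region that hosts the site (example addresses).
records = {"eu-west-1": "203.0.113.10", "us-east-1": "198.51.100.20"}
# Hypothetical latencies as seen from one user's location.
measured = {"eu-west-1": 24.0, "us-east-1": 95.0}

region = lowest_latency_region(measured)
answer = records[region]
print(region, answer)
```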
• Failover Routing
Failover routing policies are used when you want to create an active/passive set up. For
example, you may want your primary site to be in EU-WEST-2 and your secondary DR Site in AP-
SOUTHEAST-2.
Route53 will monitor the health of your primary site using a health check.
A health check monitors the health of your end points.
• Geolocation Routing
Geolocation routing lets you choose where your traffic will be sent based on the geographic
location of your users (i.e. the location from which DNS queries originate). For example, you
might want all queries from Europe to be routed to a fleet of EC2 instances that are specifically
configured for your European customers. These servers may have the local language of your
European customers and all prices are displayed in Euros.
VIRTUAL PRIVATE CLOUD (VPC)
VPC
Amazon Virtual Private Cloud (Amazon VPC) lets you provision a logically isolated section of the
Amazon Web Services (AWS) cloud where you can launch AWS resources in a virtual network
that you define. You have complete control over your virtual networking environment,
including selection of your own IP address range, creation of subnets, and configuration of
route tables and network gateways.
You can easily customize the network configuration for your Amazon Virtual Private Cloud. For
example, you can create a public-facing subnet for your webservers that has access to the
Internet, and place your backend systems such as databases or applications servers in a private-
facing subnet with no Internet access. You can leverage multiple layers of security, including
security groups and network access control lists, to help control access to Amazon EC2
instances in each subnet.
Additionally, you can create a Hardware Virtual Private Network (VPN) connection between
your corporate datacenter and your VPC and leverage the AWS cloud as an extension of your
corporate datacenter.
Usable Internal IP Ranges in VPC
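As a rough sketch, the check below tests whether a candidate CIDR block is a sensible choice for a VPC. It assumes the private address ranges from RFC 1918 and the /16 to /28 block size that AWS accepts for a VPC; the function name is_usable_vpc_cidr is hypothetical.

```python
import ipaddress

# Private (internal) ranges defined by RFC 1918.
PRIVATE_RANGES = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_usable_vpc_cidr(cidr: str) -> bool:
    """True if the block is private and between /16 and /28 in size."""
    network = ipaddress.ip_network(cidr, strict=True)
    is_private = any(network.subnet_of(r) for r in PRIVATE_RANGES)
    return is_private and 16 <= network.prefixlen <= 28


print(is_usable_vpc_cidr("10.0.0.0/16"))     # True
print(is_usable_vpc_cidr("172.32.0.0/16"))   # False: outside 172.16.0.0/12
print(is_usable_vpc_cidr("192.168.0.0/29"))  # False: smaller than /28
```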
VPC Peering
• Allows you to connect one VPC with another via a direct network route using private IP
addresses.
• Instances behave as if they were on the same private network
• You can peer VPCs with other AWS accounts as well as with other VPCs in the same
account.
• Peering is in a star configuration, i.e. 1 central VPC peers with 4 others.
• NO TRANSITIVE PEERING!!! (If A peers with B and B peers with C, A and C are not peered
automatically.)
• You can peer between regions.
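The "no transitive peering" rule can be illustrated with a small model. This is only a sketch: the VPC names A, B and C and the can_route function are hypothetical, and real routing also depends on route tables and security rules.

```python
# Hypothetical peering connections: A-B and B-C, but no A-C connection.
PEERINGS = {frozenset({"A", "B"}), frozenset({"B", "C"})}


def can_route(source: str, destination: str) -> bool:
    """Traffic flows only over a direct peering; it never transits a third VPC."""
    return frozenset({source, destination}) in PEERINGS


print(can_route("A", "B"))  # True
print(can_route("B", "C"))  # True
print(can_route("A", "C"))  # False: A and C need their own peering connection
```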
Remember;
NAT INSTANCES
• When creating a NAT instance, Disable Source/Destination Check on the Instance.
• NAT instances must be in a public subnet.
• There must be a route out of the private subnet to the NAT instance, in order for this to
work.
• The amount of traffic that NAT instances can support depends on the instance size. If
you are bottlenecking, increase the instance size.
• You can create high availability using Autoscaling Groups, multiple subnets in different
AZs, and a script to automate failover.
• Behind a Security Group
NAT Gateways
• Redundant inside the Availability Zone
• Preferred by the enterprise
• Starts at 5 Gbps and scales currently to 45 Gbps
• No need to patch
• Not associated with security groups
• Automatically assigned a public IP address
• Remember to update your route tables.
• No need to disable Source/Destination Checks.
• If you have resources in multiple Availability Zones and they share one NAT gateway, in
the event that the NAT gateway’s Availability Zone is down, resources in the other
Availability Zones lose internet access. To create an Availability Zone-independent
architecture, create a NAT gateway in each Availability Zone and configure your routing
to ensure that resources use the NAT gateway in the same Availability Zone.
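The difference between a shared NAT gateway and one NAT gateway per Availability Zone can be sketched as follows. All subnet names, gateway names and zone names are hypothetical, and the model only looks at which gateway each private subnet's default route points to.

```python
# Which Availability Zone each (hypothetical) NAT gateway lives in.
NAT_AZ = {"nat-a": "az-a", "nat-b": "az-b"}

# Default-route target for each private subnet under the two designs.
SHARED = {"private-a": "nat-a", "private-b": "nat-a"}
PER_AZ = {"private-a": "nat-a", "private-b": "nat-b"}


def subnets_without_internet(routes: dict, failed_az: str) -> list:
    """Subnets whose default route points at a NAT gateway in the failed zone."""
    return sorted(s for s, nat in routes.items() if NAT_AZ[nat] == failed_az)


print(subnets_without_internet(SHARED, "az-a"))  # ['private-a', 'private-b']
print(subnets_without_internet(PER_AZ, "az-a"))  # ['private-a']
```

In the per-zone design the only subnet affected is the one in the failed zone itself, which is unavailable anyway.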
Network ACL
• Your VPC automatically comes with a default network ACL, and by default it allows all
outbound and inbound traffic.
• You can create custom network ACLs. By default, each custom network ACL denies all
inbound and outbound traffic until you add rules.
• Each subnet in your VPC must be associated with a network ACL. If you don’t explicitly
associate a subnet with a network ACL, the subnet is automatically associated with the
default network ACL.
• Block IP Addresses using network ACLs not Security Groups
• You can associate a network ACL with multiple subnets; however, a subnet can be
associated with only one network ACL at a time. When you associate a network ACL
with a subnet, the previous association is removed.
• Network ACLs contain a numbered list of rules that is evaluated in order, starting with
the lowest numbered rule.
• Network ACLs have separate inbound and outbound rules, and each rule can either
allow or deny traffic.
• Network ACLs are stateless; responses to allowed inbound traffic are subject to the rules
for outbound traffic (and vice versa.)
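The ordered evaluation of network ACL rules can be sketched in Python. This is a simplified model under stated assumptions: the Rule class and evaluate function are hypothetical, each rule matches a single port, and traffic that matches no numbered rule is denied.

```python
import ipaddress
from dataclasses import dataclass


@dataclass
class Rule:
    number: int
    cidr: str
    port: int
    allow: bool


def evaluate(rules: list, source_ip: str, port: int) -> bool:
    """Apply the lowest-numbered matching rule; deny if nothing matches."""
    address = ipaddress.ip_address(source_ip)
    for rule in sorted(rules, key=lambda r: r.number):
        if address in ipaddress.ip_network(rule.cidr) and rule.port == port:
            return rule.allow
    return False


inbound = [
    Rule(200, "0.0.0.0/0", 80, allow=True),        # allow HTTP from anywhere
    Rule(100, "203.0.113.7/32", 80, allow=False),  # block one address first
]
print(evaluate(inbound, "203.0.113.7", 80))   # False: rule 100 wins
print(evaluate(inbound, "198.51.100.1", 80))  # True: rule 200 matches
print(evaluate(inbound, "198.51.100.1", 22))  # False: no rule matches
```

Because network ACLs are stateless, a matching outbound rule list would need to be evaluated separately for the response traffic.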
Remember;
• You cannot enable flow logs for VPCs that are peered with your VPC unless the peer VPC
is in your account.
• You cannot tag a flow log.
• After you’ve created a flow log, you cannot change its configuration; for example, you
can’t associate a different IAM role with the flow log.
Bastion Host
A bastion host is a special purpose computer on a network specifically designed and configured
to withstand attacks. The computer generally hosts a single application, for example a proxy
server, and all other services are removed or limited to reduce the threat to the computer. It is
hardened in this manner primarily due to its location and purpose, which is either on the
outside of a firewall or in a demilitarized zone (DMZ) and usually involves access from untrusted
networks or computers.
• A NAT Gateway or NAT Instance is used to provide internet access to EC2 instances in
private subnets.
• A Bastion is used to securely administer EC2 instances (Using SSH or RDP). Bastions are
called Jump Boxes in Australia.
• You cannot use a NAT Gateway as a Bastion host.
Direct Connect
AWS Direct Connect is a cloud service solution that makes it easy to establish a dedicated
network connection from your premises to AWS. Using AWS Direct Connect, you can establish
private connectivity between AWS and your datacenter, office, or colocation environment,
which in many cases can reduce your network costs, increase bandwidth throughput, and
provide a more consistent network experience than Internet-based connections.
VPC Endpoint
A VPC endpoint enables you to privately connect your VPC to supported AWS services and VPC
endpoint services powered by PrivateLink without requiring an internet gateway, NAT device,
VPN connection, or AWS Direct Connect connection. Instances in your VPC do not require
public IP addresses to communicate with resources in the service. Traffic between your VPC and
the other service does not leave the Amazon network.
Endpoints are virtual devices. They are horizontally scaled, redundant, and highly available VPC
components that allow communication between instances in your VPC and services without
imposing availability risks or bandwidth constraints on your network traffic.
An interface endpoint is an elastic network interface with a private IP address that serves as an
entry point for traffic destined to a supported service. The following services are supported:
HA ARCHITECTURE
What is a Load Balancer?
It is a physical or virtual device that’s designed to help you balance the load of the network.
** If you need the IPv4 Address of your end user, look for the X-Forwarded-For header.
** Instances monitored by ELB are reported as InService or OutOfService.
** Health Checks check the instance health by talking to it.
** Load Balancers have their own DNS name. You are never given an IP address.
• Sticky Sessions enable your users to stick to the same EC2 instance.
Can be useful if you are storing information locally to that instance.
• Cross Zone Load Balancing enables you to load balance across multiple availability
zones.
• Path patterns allow you to direct traffic to different EC2 instances based on the URL
contained in the request.
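Path-based routing can be sketched with the standard library. The patterns and target names below are hypothetical, and fnmatch is used only as an approximation of wildcard matching; it is not the load balancer's own matcher.

```python
from fnmatch import fnmatchcase

# Hypothetical rules, checked in order: (path pattern, target group).
RULES = [
    ("/images/*", "image-servers"),
    ("/api/*", "api-servers"),
]
DEFAULT_TARGET = "web-servers"


def route(path: str) -> str:
    """Return the target for the first pattern that matches the request path."""
    for pattern, target in RULES:
        if fnmatchcase(path, pattern):
            return target
    return DEFAULT_TARGET


print(route("/images/logo.png"))  # image-servers
print(route("/api/orders/42"))    # api-servers
print(route("/index.html"))       # web-servers
```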
HA Architecture Example
CloudFormation
• Is a way of completely scripting your cloud environment
• Quick Start is a bunch of CloudFormation templates already built by AWS Solutions
Architects allowing you to create complex environments very quickly.
Elastic Beanstalk
With Elastic Beanstalk, you can quickly deploy and manage applications in the AWS Cloud
without worrying about the infrastructure that runs those applications. You simply upload your
application, and Elastic Beanstalk automatically handles the details of capacity provisioning,
load balancing, scaling, and application health monitoring.
AWS APPLICATIONS
SQS
Amazon SQS is a web service that gives you access to a message queue that can be used to
store messages while waiting for a computer to process them.
Amazon SQS is a distributed queue system that enables web service applications to quickly and
reliably queue messages that one component in the application generates to be consumed by
another component. A queue is a temporary repository for messages that are awaiting
processing.
Using Amazon SQS, you can decouple the components of an application so they run
independently, easing message management between components. Any component of a
distributed application can store messages in a fail-safe queue.
Messages can contain up to 256 KB of text in any format. Any component can later retrieve the
messages programmatically using the Amazon SQS API.
The queue acts as a buffer between the component producing and saving data, and the
component receiving the data for processing. This means the queue resolves issues that arise if
the producer is producing work faster than the consumer can process it, or if the producer or
consumer are only intermittently connected to the network.
• Standard Queue;
Amazon SQS offers standard as the default queue type. A standard queue lets you have
a nearly-unlimited number of transactions per second. Standard queues guarantee that
a message is delivered at least once. However, occasionally (because of the highly-
distributed architecture that allows high throughput), more than one copy of a message
might be delivered out of order. Standard queues provide best-effort ordering which
ensures that messages are generally delivered in the same order as they are sent.
• FIFO Queue;
The FIFO queue complements the standard queue. The most important features of this
queue type are FIFO (first-in-first-out) delivery and exactly-once processing: The order in
which messages are sent and received is strictly preserved and a message is delivered
once and remains available until a consumer processes and deletes it; duplicates are not
introduced into the queue.
Visibility Time Out:
It is the amount of time that the message is invisible in the SQS queue after a reader picks up
that message. Provided the job is processed before the visibility time out expires, the message
will then be deleted from the queue. If the job is not processed within that time, the message
will become visible again and another reader will process it. This could result in the same
message being delivered twice.
Visibility timeout maximum is 12 hours.
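The visibility timeout behaviour can be sketched with a toy in-memory queue. This is an illustration only, not the SQS API: the TinyQueue class and its methods are hypothetical, and the current time is passed in explicitly so the example is deterministic.

```python
class TinyQueue:
    """A toy queue that hides a message for a while after it is received."""

    def __init__(self, visibility_timeout: float):
        self.visibility_timeout = visibility_timeout
        self._messages = {}  # message id -> [body, time it becomes visible again]
        self._next_id = 0

    def send(self, body: str) -> None:
        self._messages[self._next_id] = [body, 0.0]
        self._next_id += 1

    def receive(self, now: float):
        """Return (id, body) for the first visible message and hide it, or None."""
        for message_id, entry in self._messages.items():
            if entry[1] <= now:
                entry[1] = now + self.visibility_timeout
                return message_id, entry[0]
        return None

    def delete(self, message_id: int) -> None:
        self._messages.pop(message_id, None)


queue = TinyQueue(visibility_timeout=30)
queue.send("resize image")
print(queue.receive(now=0))   # (0, 'resize image')
print(queue.receive(now=10))  # None: still invisible to other readers
print(queue.receive(now=31))  # (0, 'resize image'): not deleted in time, delivered again
queue.delete(0)
print(queue.receive(now=100))  # None: deleted after successful processing
```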
SWF
Amazon Simple Workflow Service (Amazon SWF) is a web service that makes it easy to coordinate work across distributed application
components. SWF enables applications for a range of use cases, including media processing,
web application back-ends, business process workflows, and analytics pipelines, to be designed
as a coordination of tasks.
SWF Actors
• Workflow Starters – An application that can initiate (start) a workflow. Could be your e-
commerce website following the placement of an order, or a mobile app searching for
bus times.
• Deciders – Control the flow of activity tasks in a workflow execution. If something has
finished (or failed) in a workflow, a Decider decides what to do next.
• Activity workers – Carry out the activity tasks.
SWF vs SQS
• SQS has a retention period of up to 14 days; with SWF, workflow executions can last up
to 1 year
• Amazon SWF represents a task-oriented API, whereas Amazon SQS offers a message-
oriented API
• Amazon SWF ensures that a task is assigned only once and is never duplicated. With
Amazon SQS, you need to handle duplicated messages and may also need to ensure that
a message is processed only once.
• Amazon SWF keeps track of all the tasks and events in an application. With Amazon
SQS, you need to implement your own application-level tracking, especially if your
application uses multiple queues.
SNS
Amazon Simple Notification Service (Amazon SNS) is a web service that makes it easy to set up,
operate, and send notifications from the cloud.
It provides developers with a highly scalable, flexible, and cost-effective capability to publish
messages from an application and immediately deliver them to subscribers or other
applications.
Besides pushing cloud notifications directly to mobile devices, Amazon SNS can also deliver
notifications by SMS text message or email, to Amazon Simple Queue Service (SQS) queues, or
to any HTTP endpoint.
SNS allows you to group multiple recipients using topics. A topic is an “access point” for
allowing recipients to dynamically subscribe for identical copies of the same notification.
One topic can support deliveries to multiple endpoint types – for example, you can group
together iOS, Android and SMS recipients. When you publish once to a topic, SNS delivers
appropriately formatted copies of your message to each subscriber.
To prevent messages from being lost, all messages published to Amazon SNS are stored
redundantly across multiple availability zones.
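The publish-once, deliver-to-all behaviour of a topic can be sketched as follows. This is a toy model and not the SNS API: the Topic class, its methods and the protocol labels are hypothetical.

```python
class Topic:
    """A toy topic: one publish call delivers a copy to every subscriber."""

    def __init__(self):
        self._subscribers = []  # list of (protocol, delivery callback)

    def subscribe(self, protocol: str, callback) -> None:
        self._subscribers.append((protocol, callback))

    def publish(self, message: str) -> int:
        """Push a copy formatted per protocol to each subscriber; return the count."""
        for protocol, callback in self._subscribers:
            callback(f"{protocol}: {message}")
        return len(self._subscribers)


delivered = []
topic = Topic()
topic.subscribe("sms", delivered.append)
topic.subscribe("email", delivered.append)
topic.publish("order shipped")
print(delivered)  # ['sms: order shipped', 'email: order shipped']
```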
SNS Benefits
• Instantaneous, push-based delivery (no polling)
• Simple APIs and easy integration with applications
• Flexible message delivery over multiple transport protocols
• Inexpensive, pay-as-you-go model with no up-front costs
• Web-based AWS Management Console offers the simplicity of a point-and-click
interface.
SNS vs SQS?
• Both Messaging Services in AWS
• SNS – Push
• SQS – Polls (Pulls)
ELASTIC TRANSCODER
• Just remember that Elastic Transcoder is a media transcoder in the cloud. It converts
media files from their original source format into different formats that will play on
smartphones, tablets, PCs etc.
API GATEWAY
Amazon API Gateway is a fully managed service that makes it easy for developers to publish,
maintain, monitor, and secure APIs at any scale.
With a few clicks in the AWS Management Console, you can create an API that acts as a “front
door” for applications to access data, business logic, or functionality from your back-end
services, such as applications running on Amazon Elastic Compute Cloud (Amazon EC2), code
running on AWS Lambda, or any web application.
How do I deploy API Gateway?
• Deploy API to a stage:
o Uses API Gateway domain, by default
o Can use custom domain
o Now supports AWS Certificate Manager: free SSL/TLS certs.
CORS
CORS is one way the server at the other end (not the client code in the browser) can relax the
same-origin policy.
Cross-origin resource sharing (CORS) is a mechanism that allows restricted resources (e.g. fonts)
on a web page to be requested from another domain outside the domain from which the first
resource was served.
CORS in Action:
• Browser makes an HTTP OPTIONS call for a URL (OPTIONS is an HTTP method like GET,
PUT, and POST)
• Server returns a response that says:
“These other domains are approved to GET this URL.”
• Error – “Origin policy cannot be read at the remote resource?” You need to enable CORS
on API Gateway.
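The server side of the OPTIONS exchange above can be sketched like this. The allow-lists and the preflight_response function are hypothetical; only the two Access-Control header names are standard CORS response headers.

```python
ALLOWED_ORIGINS = {"https://www.example.com"}  # hypothetical approved domain
ALLOWED_METHODS = {"GET", "OPTIONS"}


def preflight_response(origin: str, requested_method: str) -> dict:
    """Return CORS headers for an approved preflight request, else no headers."""
    if origin in ALLOWED_ORIGINS and requested_method in ALLOWED_METHODS:
        return {
            "Access-Control-Allow-Origin": origin,
            "Access-Control-Allow-Methods": ", ".join(sorted(ALLOWED_METHODS)),
        }
    return {}  # the browser will block the cross-origin request


print(preflight_response("https://www.example.com", "GET"))
print(preflight_response("https://evil.example.net", "GET"))  # {}
```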
KINESIS
Streaming Data is data that is generated continuously by thousands of data sources, which
typically send in the data records simultaneously, and in small sizes (order of Kilobytes.)
Amazon Kinesis is a platform on AWS to send your streaming data to. Kinesis makes it easy to
load and analyze streaming data, and also provides the ability for you to build your own
custom applications for your business needs.
3 Different Types of Kinesis:
• Kinesis Streams
Consist of Shards;
o Reads: 5 transactions per second, up to a maximum total data read rate of 2
MB per second.
o Writes: up to 1,000 records per second, up to a maximum total data write rate
of 1 MB per second (including partition keys).
o The data capacity of your stream is a function of the number of shards that you
specify for the stream. The total capacity of the stream is the sum of the capacities
of its shards.
• Kinesis Firehose
Here the data has to be analyzed as it comes in. There is no data persistence, so you need
to do something with the data as soon as it arrives in Kinesis Firehose. It can then store the
analyzed data on S3, Redshift or an Elasticsearch cluster.
• Kinesis Analytics
Kinesis Analytics works with Kinesis Streams and with Kinesis Firehose. Essentially, it
can analyze the data on the fly inside either service and then store this
data on S3, Redshift or an Elasticsearch cluster.
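Since the capacity of a Kinesis stream is the sum of its shards, the number of shards for a workload can be estimated from the per-shard limits listed under Kinesis Streams. This is a back-of-the-envelope sketch: the function name and the sample workload are hypothetical, and the limit of 5 read transactions per second is ignored.

```python
import math

WRITE_MB_PER_SHARD = 1        # MB per second, per shard
READ_MB_PER_SHARD = 2         # MB per second, per shard
WRITE_RECORDS_PER_SHARD = 1000  # records per second, per shard


def shards_needed(write_mb_per_s: float, read_mb_per_s: float, records_per_s: float) -> int:
    """Smallest shard count that satisfies every per-shard limit."""
    return max(
        1,
        math.ceil(write_mb_per_s / WRITE_MB_PER_SHARD),
        math.ceil(read_mb_per_s / READ_MB_PER_SHARD),
        math.ceil(records_per_s / WRITE_RECORDS_PER_SHARD),
    )


# 3.5 MB/s written, 6 MB/s read, 2,500 records/s written.
print(shards_needed(3.5, 6, 2500))  # 4 (the write throughput is the bottleneck)
```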
Web Identity Federation lets you give your users access to AWS resources after they have
successfully authenticated with a web-based identity provider like Amazon, Facebook, or
Google. Following successful authentication, the user receives an authentication code from the
Web ID provider, which they can trade for temporary AWS security credentials.
Amazon Cognito provides Web Identity Federation with the following features:
Amazon Cognito brokers between the app and Facebook or Google to provide temporary
credentials which map to an IAM role allowing access to the required resources.
No need for the application to embed or store AWS credentials locally on the device and it gives
users a seamless experience across all mobile devices.
Cognito User Pools are user directories used to manage sign-up and sign-in functionality for
mobile and web applications. Users can sign-in directly to the User Pool, or using Facebook,
Amazon, or Google. Cognito acts as an Identity Broker between the identity provider and AWS.
Successful authentication generates JSON Web Tokens (JWTs).
Identity Pools provide temporary AWS credentials to access AWS services like S3 or
DynamoDB.
Cognito Synchronization tracks the association between user identity and the various different
devices they sign-in from. In order to provide a seamless user experience for your application,
Cognito uses Push Synchronization to push updates and synchronize user data across multiple
devices. Cognito uses SNS to send a notification to all the devices associated with a given user
identity whenever data stored in the cloud changes.
SERVERLESS
Lambda
AWS Lambda is a compute service where you can upload your code and create a Lambda
function. AWS Lambda takes care of provisioning and managing the servers that you use to run
the code. You don’t have to worry about operating systems, patching, scaling, etc.
Lambda Pricing
• Priced on the number of requests
First 1 million requests are free. $0.20 per 1 million requests thereafter.
• Duration
Duration is calculated from the time your code begins executing until it returns or
otherwise terminates, rounded up to the nearest 100ms. The price depends on the
amount of memory you allocate to your function. You are charged $0.00001667 for
every GB-second used.
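The two pricing components can be combined into a rough monthly estimate. This sketch uses only the figures quoted above (1 million free requests, $0.20 per million after that, 100 ms rounding, $0.00001667 per GB-second); the function name and the sample workload are hypothetical, and any free allowance for duration is ignored.

```python
import math

PRICE_PER_MILLION_REQUESTS = 0.20
FREE_REQUESTS = 1_000_000
PRICE_PER_GB_SECOND = 0.00001667


def lambda_monthly_cost(requests: int, avg_duration_ms: float, memory_mb: int) -> float:
    """Estimate the monthly charge in dollars for one function."""
    billable_requests = max(0, requests - FREE_REQUESTS)
    request_cost = billable_requests / 1_000_000 * PRICE_PER_MILLION_REQUESTS

    # Duration is rounded up to the nearest 100 ms per invocation.
    billed_seconds = math.ceil(avg_duration_ms / 100) * 100 / 1000
    gb_seconds = requests * billed_seconds * memory_mb / 1024
    return request_cost + gb_seconds * PRICE_PER_GB_SECOND


# 3 million requests of 120 ms each at 512 MB:
# requests cost $0.40, duration is billed as 200 ms -> 300,000 GB-seconds.
print(round(lambda_monthly_cost(3_000_000, 120, 512), 2))
```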
Benefits of Lambda
• No Servers!
• Continuous Scaling
• Very Cheap!
• Lambda functions are independent, 1 event = 1 function
• Lambda functions can trigger other lambda functions.
Remember
• Lambda architectures can get extremely complicated; AWS X-Ray allows you to debug
what is happening.
• Lambda can do things globally; for example, you can use it to back up S3 buckets to other S3 buckets.