In the quizzes, you’ll often see a misleading “Corporate” support plan, which
does not exist.
Web console SAML authentication flow
SAML allows configuring federated user access to the corporate AWS account
(basically, you can reuse your corporate user management system to grant
access to AWS and avoid replicating all your users into IAM).
1. The flow starts from the corporate portal (not on the AWS console)
2. The user triggers the auth on the corporate portal which verifies the user
identity
3. The portal generates a SAML authentication response that includes
assertions and attributes about the user
4. The client browser is then redirected to the AWS single sign-on endpoint
posting the SAML assertion
5. The AWS console endpoint validates the SAML assertion and generates a
redirect to access the management console (using STS)
6. The browser follows the redirect, which brings the user into the AWS
console as an authenticated user
More info available in Enabling SAML 2.0 Federated Users to Access the AWS
Management Console.
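Step 5 (the assertion-for-credentials exchange) can be sketched with boto3. This is a minimal illustration of the STS call behind the flow, not the console's actual implementation; the ARNs and the assertion value are placeholders:

```python
import boto3

# Hypothetical input: in the real flow the browser POSTs the assertion to the
# AWS sign-in endpoint; here we call STS directly to show the exchange.
saml_assertion_b64 = "<base64-encoded SAMLResponse from the corporate IdP>"

sts = boto3.client("sts")
response = sts.assume_role_with_saml(
    RoleArn="arn:aws:iam::123456789012:role/SAMLFederatedRole",      # placeholder
    PrincipalArn="arn:aws:iam::123456789012:saml-provider/CorpIdP",  # placeholder
    SAMLAssertion=saml_assertion_b64,
    DurationSeconds=3600,
)

# Temporary credentials like these back the authenticated console session.
creds = response["Credentials"]
print(creds["AccessKeyId"], creds["Expiration"])
```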
S3
AWS Object Storage Solution.
Consistency model
Read-after-write consistency:
PUT (new objects)
Eventual consistency:
PUT (updates to existing objects)
DELETE
Updates are atomic: requesting a file immediately after an update will give
you either the old data or the new data (no partially updated or corrupted
data).
Storage classes
Frequent Access:
STANDARD
REDUCED_REDUNDANCY (RRS): for non-critical and reproducible data; you might
lose up to 0.01% of your objects per year
Infrequent Access (pay per read):
STANDARD_IA: offers quick reads (low latency), but you pay per retrieval.
ONEZONE_IA: like STANDARD_IA, but stored in only one availability zone, so
lower availability (if that specific zone is down you won’t be able to access
the files until the zone is back)
Archive
GLACIER: data is archived and needs to be explicitly restored (which takes
3-5 hours) to be available again. Cheapest solution.
Durability & Availability table
Class               Durability      Availability
STANDARD            99.999999999%   99.99%
STANDARD_IA         99.999999999%   99.9%
ONEZONE_IA          99.999999999%   99.5%
GLACIER             99.999999999%   99.99% (after restore)
REDUCED_REDUNDANCY  99.99%          99.99%
Durability is always 11 nines (99.999999999%) except for REDUCED_REDUNDANCY,
which is suitable for cases where you can afford to lose files (e.g. you can
regenerate them).
Availability is almost always 99.99%, except for STANDARD_IA (99.9%)
and ONEZONE_IA (99.5%); these are suitable for cases where you rarely have to
access the data, but when you do it has to be as fast as STANDARD
(you can’t wait hours like with GLACIER).
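As an illustration, the storage class is chosen per object at upload time. A minimal boto3 sketch (bucket and key names are placeholders); note that an object already archived in GLACIER must be explicitly restored before it can be read:

```python
import boto3

s3 = boto3.client("s3")

# Pick a storage class per object at upload time (names are placeholders).
s3.put_object(Bucket="my-bucket", Key="logs/2017-01.csv",
              Body=b"...", StorageClass="STANDARD_IA")
s3.put_object(Bucket="my-bucket", Key="scratch/tmp.csv",
              Body=b"...", StorageClass="ONEZONE_IA")

# An archived object must be restored before reads; the restored copy stays
# available for the number of days requested.
s3.restore_object(
    Bucket="my-bucket",
    Key="archive/2016.csv",
    RestoreRequest={"Days": 7, "GlacierJobParameters": {"Tier": "Standard"}},
)
```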
Encryption features
SSE-S3: fully managed encryption at rest (you don’t have to worry about
encryption keys).
SSE-C: encryption at rest with customer-provided encryption keys (the key
must be sent along with every upload and download request). The key is not
stored by AWS. Use this option if you want to manage your own encryption
keys, but don’t want to implement or leverage a client-side encryption
library.
SSE-KMS: encryption at rest using keys managed through the KMS service.
Allows you to define fine-grained permissions based on the permissions of the
KMS keys. Best solution for compliance (PCI-DSS, HIPAA, etc.).
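The three options map to different put_object parameters. A minimal boto3 sketch (bucket, keys, and the KMS alias are placeholders; the SSE-C key is a dummy 256-bit value):

```python
import boto3

s3 = boto3.client("s3")

# SSE-S3: S3 manages the keys entirely.
s3.put_object(Bucket="my-bucket", Key="a.txt", Body=b"data",
              ServerSideEncryption="AES256")

# SSE-KMS: encrypt with a KMS key (key alias below is a placeholder).
s3.put_object(Bucket="my-bucket", Key="b.txt", Body=b"data",
              ServerSideEncryption="aws:kms",
              SSEKMSKeyId="alias/my-app-key")

# SSE-C: you supply the raw key on every upload AND download; AWS never stores it.
key = b"0" * 32  # dummy 256-bit key
s3.put_object(Bucket="my-bucket", Key="c.txt", Body=b"data",
              SSECustomerAlgorithm="AES256", SSECustomerKey=key)
s3.get_object(Bucket="my-bucket", Key="c.txt",
              SSECustomerAlgorithm="AES256", SSECustomerKey=key)
```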
MFA Delete
For extra security, S3 supports the “MFA Delete” option, which requires a
user to enter an MFA code in order to delete a file. Protects against
accidental or unauthorized deletions.
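MFA Delete is part of the bucket versioning configuration and can only be changed by the root account. A hedged boto3 sketch (bucket name, device ARN, and MFA code are placeholders):

```python
import boto3

s3 = boto3.client("s3")

# The MFA parameter is the device serial (ARN) plus the current code,
# separated by a space (values below are placeholders).
s3.put_bucket_versioning(
    Bucket="my-bucket",
    MFA="arn:aws:iam::123456789012:mfa/root-account-mfa-device 123456",
    VersioningConfiguration={"Status": "Enabled", "MFADelete": "Enabled"},
)
```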
Transfer optimizations
If latency is an issue you can use S3 transfer acceleration.
If transfer speed is a problem or you have to transfer big files through HTTP,
you can use multi-part upload (see the sketch after this list).
If you need to transfer massive amounts of data into AWS and it might take
too long to do that over the wire, you can use Snowball.
If your storage needs to exist also on premise (hybrid cloud storage), you
can use Storage gateway.
Storage gateway offers 2 main volume modes:
Cached volumes: store files in the cloud and keep a local cache to speed up
reads
Stored volumes: optimized for low latency; store files locally and
asynchronously back up point-in-time snapshots to S3.
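For the multi-part upload option mentioned above, boto3’s transfer layer handles the part splitting and parallelism automatically. A sketch with illustrative thresholds (file and bucket names are placeholders):

```python
import boto3
from boto3.s3.transfer import TransferConfig

s3 = boto3.client("s3")

# boto3 switches to multipart upload once the file crosses multipart_threshold;
# parts are uploaded in parallel and can be retried individually.
config = TransferConfig(
    multipart_threshold=64 * 1024 * 1024,  # switch to multipart above 64 MB
    multipart_chunksize=16 * 1024 * 1024,  # 16 MB parts
    max_concurrency=8,
)
s3.upload_file("backup.tar.gz", "my-bucket", "backups/backup.tar.gz",
               Config=config)
```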
Cross Region Replication (CRR)
Automatically and asynchronously replicates objects across buckets in
different AWS regions; requires versioning to be enabled on both the source
and destination buckets.
EC2
An Amazon EC2 Dedicated Host is a physical server with EC2 instance capacity
fully dedicated to your use. Dedicated Hosts can help you address compliance
requirements and reduce costs by allowing you to use your existing server-
bound software licenses. There are 2 different dedicated hosting modes
(tenancy):
Dedicated Hosts: gives you additional visibility and control over how
instances are placed on a physical server, and you can consistently deploy
your instances to the same physical server over time. As a result, Dedicated
Hosts enable you to use your existing server-bound software licenses and
address corporate compliance and regulatory requirements.
Dedicated instances: less configurable (no host billing, no visibility of
sockets and cores, can’t add capacity).
You can change the tenancy of a Dedicated instance from “dedicated” to
“host”: after you’ve stopped the instance, change the tenancy using the
ModifyInstancePlacement API or the AWS Management Console.
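A minimal boto3 sketch of that tenancy change (the instance id is a placeholder):

```python
import boto3

ec2 = boto3.client("ec2")

# The instance must be stopped before its tenancy can be changed.
ec2.stop_instances(InstanceIds=["i-0123456789abcdef0"])
ec2.get_waiter("instance_stopped").wait(InstanceIds=["i-0123456789abcdef0"])

ec2.modify_instance_placement(
    InstanceId="i-0123456789abcdef0",
    Tenancy="host",
)
ec2.start_instances(InstanceIds=["i-0123456789abcdef0"])
```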
EBS backups and snapshots
You can back up the data on your Amazon EBS volumes to Amazon S3 by
taking point-in-time snapshots. Snapshots are incremental backups, which
means that only the blocks on the device that have changed after your most
recent snapshot are saved. This minimizes the time required to create the
snapshot and saves on storage costs by not duplicating data.
A snapshot is constrained to the region where it was created. After you
create a snapshot of an EBS volume, you can use it to create new volumes
in the same region. You can also copy snapshots across regions, making it
possible to use multiple regions for geographical expansion, data center
migration, and disaster recovery.
While snapshots are per region, volumes created from snapshots are tied to a
single availability zone in that region, so an EBS volume cannot be attached
to an EC2 instance in another AZ.
To take application consistent snapshots: Shut down the EC2 instance
and detach the EBS volume, then take the snapshot.
The most resilient way to backup EBS disks is to take regular snapshots.
EBS encryption:
Snapshots of encrypted volumes are automatically encrypted.
Volumes that are created from encrypted snapshots are automatically
encrypted.
When you copy an unencrypted snapshot that you own, you can encrypt it
during the copy process.
When you copy an encrypted snapshot that you own, you can re-encrypt it
with a different key during the copy process.
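A hedged boto3 sketch combining the points above: take a point-in-time snapshot, then copy it to another region while encrypting it during the copy (ids, regions, and the KMS key alias are placeholders):

```python
import boto3

# Take a point-in-time snapshot in the source region.
ec2_src = boto3.client("ec2", region_name="eu-west-1")
snap = ec2_src.create_snapshot(VolumeId="vol-0123456789abcdef0",
                               Description="nightly backup")
ec2_src.get_waiter("snapshot_completed").wait(SnapshotIds=[snap["SnapshotId"]])

# Copy it to another region for disaster recovery; copy_snapshot is called
# from the destination region, and encryption can be applied during the copy.
ec2_dst = boto3.client("ec2", region_name="us-east-1")
ec2_dst.copy_snapshot(
    SourceRegion="eu-west-1",
    SourceSnapshotId=snap["SnapshotId"],
    Encrypted=True,               # encrypt an unencrypted snapshot during copy
    KmsKeyId="alias/backup-key",  # placeholder; re-encrypts if source was encrypted
)
```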
Other:
Cannot use pre-existing MS Windows licenses
Important things when selecting an instance: I/O and memory
requirements.
ELB: AWS CloudTrail can be enabled to record Application Load Balancer
API calls for your account and deliver log files
To create EC2 instances from VMDK files you can use the VM Import/Export
service
Placement groups cannot be extended across availability zones
Limits / Defaults / SLA
Uptime SLA for Amazon EC2 and EBS within a given region is 99.95%
Maximum number of VMware VMs that can be migrated concurrently is 50
ECS
AWS managed container orchestration service.
DynamoDB
AWS managed NoSQL database service.
When you create a table or index in Amazon DynamoDB, you must specify
your capacity requirements for read and write activity. By defining your
throughput capacity in advance, DynamoDB can reserve the necessary
resources to meet the read and write activity your application requires, while
ensuring consistent, low-latency performance.
You specify throughput capacity in terms of read capacity units and write
capacity units.
One read capacity unit represents one strongly consistent read per
second, or two eventually consistent reads per second (you can read twice
as much if you opt for an eventually consistent read), for an item up to 4 KB
in size. If you need to read an item that is larger than 4 KB, DynamoDB will
need to consume additional read capacity units. The total number of read
capacity units required depends on the item size, and whether you want an
eventually consistent or strongly consistent read.
One write capacity unit represents one write per second for an item up to
1 KB in size. If you need to write an item that is larger than 1 KB, DynamoDB
will need to consume additional write capacity units. The total number of
write capacity units required depends on the item size.
Operation  Capacity Unit Size
READ       4 KB
WRITE      1 KB
For example, suppose that you create a table with 5 read capacity units and 5
write capacity units. With these settings, your application could:
Perform strongly consistent reads of up to 20 KB per second (4 KB × 5 read
capacity units).
Perform eventually consistent reads of up to 40 KB per second (8 KB × 5
read capacity units, twice as much read throughput).
Write up to 5 KB per second (1 KB × 5 write capacity units).
If you exceed your allocated capacity you will get a 400 Bad Request response
(HTTP API) or a ProvisionedThroughputExceededException (SDK). The SDK can offer
automatic retries with exponential backoff.
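To make the rounding rules concrete, here is a small Python sketch of the capacity-unit arithmetic (an informal helper mirroring the 4 KB / 1 KB rules above, not an AWS API):

```python
import math

def read_capacity_units(item_size_bytes, strongly_consistent=True):
    units = math.ceil(item_size_bytes / 4096)  # reads are billed in 4 KB steps
    return units if strongly_consistent else units / 2  # eventual reads cost half

def write_capacity_units(item_size_bytes):
    return math.ceil(item_size_bytes / 1024)   # writes are billed in 1 KB steps

# One 6 KB item: 2 RCUs strongly consistent, 1 RCU eventually consistent, 6 WCUs.
print(read_capacity_units(6 * 1024))         # 2
print(read_capacity_units(6 * 1024, False))  # 1.0
print(write_capacity_units(6 * 1024))        # 6
```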
Limits
The cumulative size of attributes per item must fit within the maximum
DynamoDB item size (400 KB).
Redshift
AWS analytics database (columnar data storage).
RDS
AWS managed relational database service.
Events that would cause Amazon RDS to initiate a failover to the standby
replica:
An Availability Zone outage
The primary DB instance fails
The DB instance’s server type is changed
The operating system of the DB instance is undergoing software patching
A manual failover of the DB instance was initiated using Reboot with
failover
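The last item can be triggered through the API. A minimal boto3 sketch (the instance identifier is a placeholder):

```python
import boto3

rds = boto3.client("rds")

# Manually exercise the failover path: reboot the primary and force promotion
# of the standby replica.
rds.reboot_db_instance(
    DBInstanceIdentifier="my-db-instance",
    ForceFailover=True,
)
```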
Limits
Maximum retention period for automated backup: 35 days
Other
Amazon RDS enables automated backups by default; the default settings depend
on the DB engine selected
RDS Provisioned IOPS (SSD) Storage is preferable for high-performance
OLTP workloads
A default CloudWatch metric for RDS: the number of current connections to the
database (DatabaseConnections)
VPC
Amazon Virtual Private Cloud. Allows you to create complex private networks
in the cloud.
Availability zone names
Availability Zones consist of one or more discrete data centers. Each account
gets its own shuffled mapping of availability zone names to physical
locations. As such, ‘eu-west-1b’ in one account is not necessarily the same
physical location as ‘eu-west-1b’ in another account (important in some
latency-related questions).
How to enable a site-to-site VPN:
Hardware VPN enabled on VPC
On premise customer gateway
Virtual private gateway
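A boto3 sketch of wiring up those three pieces (the ASN, public IP, and VPC id are placeholders; the on-premise device still has to be configured separately):

```python
import boto3

ec2 = boto3.client("ec2")

# 1. Customer gateway: represents the on-premise VPN device.
cgw = ec2.create_customer_gateway(BgpAsn=65000, PublicIp="203.0.113.10",
                                  Type="ipsec.1")

# 2. Virtual private gateway: the VPN endpoint on the AWS side, attached to the VPC.
vgw = ec2.create_vpn_gateway(Type="ipsec.1")
ec2.attach_vpn_gateway(VpcId="vpc-0123456789abcdef0",
                       VpnGatewayId=vgw["VpnGateway"]["VpnGatewayId"])

# 3. The VPN connection itself, linking the two gateways.
ec2.create_vpn_connection(
    CustomerGatewayId=cgw["CustomerGateway"]["CustomerGatewayId"],
    VpnGatewayId=vgw["VpnGateway"]["VpnGatewayId"],
    Type="ipsec.1",
    Options={"StaticRoutesOnly": True},
)
```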
Peering connections do not support Edge to Edge
Default of 5 elastic IPs per region
To enable network access logging you can:
Make use of OS-level logging tools such as iptables and log events to
CloudWatch or S3.
Set up a Flow Log for the group of instances and forward the logs to
CloudWatch.
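A minimal boto3 sketch of the Flow Log option (the VPC id, log group, and IAM role are placeholders):

```python
import boto3

ec2 = boto3.client("ec2")

# Capture all traffic for a VPC and ship it to CloudWatch Logs.
ec2.create_flow_logs(
    ResourceIds=["vpc-0123456789abcdef0"],
    ResourceType="VPC",
    TrafficType="ALL",  # or ACCEPT / REJECT
    LogGroupName="vpc-flow-logs",
    DeliverLogsPermissionArn="arn:aws:iam::123456789012:role/flow-logs-role",
)
```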
You can use a network address translation (NAT) gateway to enable instances
in a private subnet to connect to the internet or other AWS services, but
prevent the internet from initiating a connection with those instances.
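A hedged boto3 sketch of the typical setup (subnet and route table ids are placeholders): allocate an Elastic IP, create the NAT gateway in a public subnet, and point the private subnet’s default route at it.

```python
import boto3

ec2 = boto3.client("ec2")

# The NAT gateway lives in a PUBLIC subnet and needs an Elastic IP.
eip = ec2.allocate_address(Domain="vpc")
nat = ec2.create_nat_gateway(SubnetId="subnet-0aaa111bbb222ccc3",
                             AllocationId=eip["AllocationId"])
ec2.get_waiter("nat_gateway_available").wait(
    NatGatewayIds=[nat["NatGateway"]["NatGatewayId"]])

# Route the PRIVATE subnet's internet-bound traffic through the NAT gateway:
# outbound connections work, but the internet cannot initiate connections back.
ec2.create_route(
    RouteTableId="rtb-0ddd444eee555fff6",
    DestinationCidrBlock="0.0.0.0/0",
    NatGatewayId=nat["NatGateway"]["NatGatewayId"],
)
```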