Implementing Cloud Design Patterns For AWS - Sample Chapter
Marcus Young
Introduction
The paradigm for application development has shifted in many ways over the years. Instead of developing standalone applications aimed at specific system configurations, the trend has moved towards web applications. These applications present a very different set of challenges, not just for the developers, but also for the people who manage the systems that host them. The response to this need to build, test, and manage web applications has been to develop an abstraction on top of the hardware that allows entire virtualized environments to be brought up quickly and consistently.
Throughout these chapters, you will learn basic design principles for applications, along with their known issues. These principles may not be completely compatible with every application type, but they should serve as a basic toolkit for bigger design patterns. It is also very important to note that AWS adds new services all the time, some of which already solve these design problems out of the box at the time of writing. If your software or services handle sensitive data and have in-flight or at-rest requirements, be very careful to read how each individual AWS-provided service handles data.
The topics that are covered in this chapter are:
• Introduction to AWS
Introduction to AWS
Amazon Web Services (AWS) is a very large suite of Cloud services provided by Amazon. At a base level, AWS provides virtual machines and the services surrounding them. Many Cloud-based virtual machine services, such as Google Compute Engine, DigitalOcean, Rackspace, Windows Azure, and so on, provide the ability to bring up a machine from a supported base operating system image or snapshot, and it is up to the user to customize it further.
Amazon has made itself one of the leaders in Cloud hosting by providing not just virtual machines, but also configurable services and software implementations of the hardware found in data centers. For most large-scale systems, the move to Cloud infrastructure raises a huge set of questions about how to handle issues such as load balancing, content delivery networks, failover, and replication. The AWS suite can handle the same issues that a physical data center can, usually for a fraction of the cost, and it removes some of the red tape that comes with a data center, such as requesting provisioning, repairs, and scheduling downtime.
Amazon is constantly offering new services to tackle new and unique problems
encountered with Cloud infrastructure. However, this book may not cover every
service offered by Amazon. The services that this book will cover include:
• Database
• Application services
• Logging
Infrastructure as a Service
Infrastructure as a Service (IaaS) can be described as a service that provides virtual abstractions for hardware, servers, and networking components. The service provider owns all the equipment and is responsible for housing, running, and maintaining it. In this case, AWS provides APIs, SDKs, and a UI for creating and modifying virtual machines, their network components, routers, gateways, subnets, load balancers, and much more. Where a user with a physical data center would incur charges for the hardware, shelving, and access, IaaS removes these in favor of a per-hour (or per-use) payment model.
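As a rough illustration of that API surface, the following is a minimal sketch in Python using the boto3 SDK. It assumes AWS credentials are already configured, and the region, AMI ID, and instance type shown are placeholders rather than recommendations. It launches a single virtual machine and later gives the resources back by terminating it:

import boto3

# Create an EC2 client in the desired region (placeholder region).
ec2 = boto3.client("ec2", region_name="us-east-1")

# Launch a single virtual machine from a base image.
# The ImageId below is a placeholder; use any AMI available in your account.
response = ec2.run_instances(
    ImageId="ami-12345678",
    InstanceType="t2.micro",
    MinCount=1,
    MaxCount=1,
)
instance_id = response["Instances"][0]["InstanceId"]
print("Launched", instance_id)

# When the machine is no longer needed, terminate it to stop incurring charges.
ec2.terminate_instances(InstanceIds=[instance_id])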
Platform as a Service
While AWS itself is an IaaS provider, it contains a product named Elastic Beanstalk, which falls under the PaaS category of Cloud models. PaaS is described as the delivery of a computing platform, typically an operating system, a programming language execution environment, a database, or a web server. With Elastic Beanstalk, a user can easily turn code into a running environment without having to worry about any of the pieces underneath, such as setting up and maintaining the database, the web server, or the code runtime versions. It also allows the environment to be scaled without having to do anything other than define scaling policies through the configuration.
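To make this concrete, here is a hedged sketch in Python with boto3 that registers an Elastic Beanstalk application and spins up an environment for it. The application and environment names are placeholders, and because solution stack names change over time, the sketch picks one from the currently available list rather than hard-coding it:

import boto3

eb = boto3.client("elasticbeanstalk", region_name="us-east-1")

# Register an application; the name is a placeholder.
eb.create_application(ApplicationName="sample-app")

# Solution stack names change over time, so look up a current Python stack.
stacks = eb.list_available_solution_stacks()["SolutionStacks"]
python_stack = next(s for s in stacks if "Python" in s)

# Create a running environment; Elastic Beanstalk provisions the instances,
# load balancing, and scaling configuration underneath.
eb.create_environment(
    ApplicationName="sample-app",
    EnvironmentName="sample-env",
    SolutionStackName=python_stack,
)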
Software as a Service
AWS also provides a marketplace where a user can purchase official and third-party operating system images that provide configurable services such as databases, web applications, and more. This type of offering falls under the SaaS model. The simplest interpretation of the SaaS model is on-demand software, meaning that the user need only configure the software to use and interact with it. The draw of SaaS is that there is no need to learn how to configure and deploy the software to get it working in a larger stack, and the charges are generally per usage hour.
The AWS suite is both impressive and unique in that it doesn't fall under any one of the Cloud service models described previously. Before AWS made its name, virtualizing an entire environment or stack was usually not an easy task and involved a collection of different providers, each solving a specific part of the deployment puzzle. The cost of using many different providers to create a virtual stack might not be any cheaper than the initial hardware cost of moving equipment into a data center. Besides the cost of the providers themselves, having multiple providers also created the problem of scaling in one area and notifying another of the changes. While it made applications more resilient and scalable, this Frankenstein method usually did not simplify the problem as a whole.
Operations
For a developer, the biggest benefit of Cloud providers is the ability to throw away entire environments. In a traditional setting, developers usually develop their code locally, have access to a shared physical server, or have access to a virtual server of some type. The issues that usually arise out of these setups are that consistency is hard to enforce and the servers can become stale over time. If each developer works locally, inconsistency can arise very quickly. Different versions of the core language or software could be used and might behave differently from machine to machine; one developer might use Windows and prefer registry lookups while another uses a Mac and prefers environment variables. If the developers share a core server for development, other issues may arise, such as permission conflicts or attempts to modify services independently of each other, causing race conditions. No matter what problems exist, known or unknown, they could be solved by always starting from the same base operating system state.
The leading software for solving this issue is Vagrant. Vagrant provides the ability to spin up and destroy a virtual machine from a configuration file, along with a configuration management suite such as Puppet, Chef, Docker, or Ansible. Vagrant itself is agnostic to the Cloud hosting tool in the sense that it does not require AWS. It can spin up instances at AWS given the credentials, but it can also spin up virtual machines locally through VirtualBox and VMware.
Vagrant gives consistency back to the developers in the sense that it takes a base box (in AWS terms, an Amazon Machine Image or AMI) and configures it via one of the configuration suites, or via the shell, to create the same running virtual machine every time it is needed. If all the developers share the same configuration file, then all of them are mostly guaranteed to work against the same environment. That environment can be destroyed just as easily as it was created, giving the resources back and incurring no charges until it is needed again.
Bringing instances up and destroying them becomes a small, invisible piece of the developers' workflow. By enforcing a strategy like this on a team, a lot of issues can be found and resolved before they make their way up the chain to the testing or production environments.
More information on Vagrant can be found at http://www.vagrantup.com.
The other team I mentioned that benefits from moving to the Cloud is the operations team. This team differs greatly in responsibility from company to company, but it is safe to assume that it is heavily involved in monitoring the applications and systems for issues and possible optimizations. AWS provides so much infrastructure for monitoring and acting on metrics that an entire book could be dedicated to the topic. However, I'll focus only on auto scaling groups and CloudWatch.
In AWS, an auto scaling group defines scaling policies for instances based on schedules, custom metrics, or base metrics such as disk usage, CPU utilization, memory usage, and so on. An auto scaling group can act on these thresholds and scale up or down depending on how it is configured. In a non-Cloud environment, this same setup takes quite a bit of effort and depends on the software involved, whereas it is a core concept in AWS.
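As a rough sketch of how such a threshold can be wired up in Python with boto3 (the group name is a placeholder, and an auto scaling group with that name is assumed to exist already), a scale-out policy can be attached to a CloudWatch CPU alarm like this:

import boto3

autoscaling = boto3.client("autoscaling", region_name="us-east-1")
cloudwatch = boto3.client("cloudwatch", region_name="us-east-1")

# Add one instance to the group whenever the policy fires.
policy = autoscaling.put_scaling_policy(
    AutoScalingGroupName="web-asg",   # placeholder group name
    PolicyName="scale-out-on-cpu",
    AdjustmentType="ChangeInCapacity",
    ScalingAdjustment=1,
    Cooldown=300,
)

# Trigger the policy when average CPU stays above 70% for two 5-minute periods.
cloudwatch.put_metric_alarm(
    AlarmName="web-asg-high-cpu",
    Namespace="AWS/EC2",
    MetricName="CPUUtilization",
    Dimensions=[{"Name": "AutoScalingGroupName", "Value": "web-asg"}],
    Statistic="Average",
    Period=300,
    EvaluationPeriods=2,
    Threshold=70.0,
    ComparisonOperator="GreaterThanThreshold",
    AlarmActions=[policy["PolicyARN"]],
)

A matching scale-in policy and low-CPU alarm would complete the pair.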
Auto scaling groups can also automatically register instances with a load balancer so that they join its round-robin distribution. For an operations team, the benefit of moving to Amazon might justify itself simply by alleviating all the work involved in duplicating this functionality elsewhere, allowing the team to focus on creating deeper and more meaningful system health checks.
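For example, a group that launches instances from an existing launch configuration and registers them with a classic Elastic Load Balancer might be created with something like the following sketch; the launch configuration, load balancer name, and availability zones are all placeholders:

import boto3

autoscaling = boto3.client("autoscaling", region_name="us-east-1")

# Instances launched by this group are registered with the classic load
# balancer automatically and deregistered when they are scaled in.
autoscaling.create_auto_scaling_group(
    AutoScalingGroupName="web-asg",
    LaunchConfigurationName="web-launch-config",  # placeholder launch config
    MinSize=2,
    MaxSize=6,
    LoadBalancerNames=["web-elb"],                # placeholder ELB name
    AvailabilityZones=["us-east-1a", "us-east-1b"],
)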
Throw-away environments can also be beneficial to operations teams. A sibling product to Vagrant, released very recently, is Terraform. Terraform, like Vagrant, is agnostic to the hosting environment, but it does more than spin up virtual instances. Terraform is similar to CloudFormation in the sense that its goal is to take a central configuration file that describes all the resources it needs, map them into a dependency graph, optimize them, and create an entire stack. A common example for Terraform would be to create a production environment from a few virtual machines, load balancers, and Route53 DNS entries, and to set auto scaling policies. This flexibility would be nearly impossible in a traditional hardware setting and provides an on-demand mentality not just for the base application, but also for the entire infrastructure, leading to a more agile core.
More information on Terraform can be found at http://www.terraform.io.
AWS has a Service Level Agreement (SLA) policy for each service, which guarantees that, on their end, a certain percentage of uptime will be met. This implies a certain amount of downtime for scheduled maintenance and repairs of the underlying hardware.
As an AWS user, this means you can expect an e-mail at some point warning that instances will be stopped and will need to be restarted manually. While this is no different from a physical environment where the user schedules their own downtime, it does mean that instances can misbehave when the underlying hardware is starting to fail. Most of the replication and failover is handled underneath, but if the application is real-time and is stopped, it does mean that you, as a user, should have policies in place to handle this situation.
Over-provisioning
Another issue with having virtual machines in the Cloud is over-provisioning. When an instance is launched, an instance type is selected that corresponds to the virtualized hardware required for it. Without taking measures to ensure that replication or scaling happens across multiple data centers, there is a very real risk that when a new instance is needed, the hardware will not be immediately available. If scaling policies are in effect that specify that your application should scale out to a certain number of instances, but all of those instances are in a data center nearing its maximum capacity, the scaling policy will fail. This failure negates having a scaling policy in place, since it cannot always scale to the size required.
Under-provisioning
A topic that is rarely talked about but is very common is under-provisioning, the opposite of over-provisioning. We will start with an example: say we purchase a server for hosting a database and choose the smallest instance type possible. Let's assume that for the next few days this is the only machine running in a specific rack in the AWS data center. We are promised the resources of the instance we purchased, but because the surrounding hardware is free, it gives us a boost in performance that we unknowingly grow accustomed to.
After a few days, the hardware is provisioned for other customers, and we now get only the resources we were promised rather than the extra boost we received for free in the background. While monitoring, we now see a performance degradation! Where this database was originally able to perform so many transactions per second, it now does much less. The problem here is that we grew accustomed to processing power that technically was not ours, and now our database does not perform the way we expect it to.
Perhaps the promised amount is not suitable, but the database is live and holds customer data. To resolve this, we must stop the instance and change the instance type to something more powerful, which could have downstream effects or even mean full downtime for the customer. This is the danger of under-provisioning, and it is extremely hard to trace. Not knowing what kind of performance we should actually get (as promised in the SLA) can cause us to affect the customer, which is never ideal.
Replication
The previous examples are extreme and rarely encountered; replication, by contrast, is an everyday concern. In a traditional hosting environment where multiple application servers sit behind a load balancer, replication is not trivial. Replicating an application server requires registering it with the load balancer, which is usually done manually or requires configuration each time. AWS-provided Elastic Load Balancers (ELBs) are first-class entities, just like the virtual machines themselves. The registration between the two is abstracted and can be done with the click of a button, or automatically through auto scaling groups and start-up scripts.
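Scripted registration is a one-line call. The following sketch in Python with boto3 assumes a classic load balancer named web-elb and a running instance, both of which are placeholder identifiers:

import boto3

elb = boto3.client("elb", region_name="us-east-1")

# Register a running instance with the classic load balancer so that it
# starts receiving its share of the round-robin traffic.
elb.register_instances_with_load_balancer(
    LoadBalancerName="web-elb",
    Instances=[{"InstanceId": "i-0123456789abcdef0"}],  # placeholder instance
)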
Redundancy
Apart from replication, redundancy is another hot topic. Most database clustering takes redundancy into account but requires special configuration and initial setup. RDS allows replication to be specified at setup time and guarantees redundancy and uptime as per its SLA. Its Multi-AZ option guarantees that the replication crosses availability zones and provides automatic failover. Besides replication, special software or configuration is normally needed to store offsite backups. With S3, an instance may simply synchronize with an S3 bucket, either via the AWS CLI or the provided API; S3 itself is redundant storage that crosses data center sites. S3 is also a first-class entity, so permissions can be granted transparently to virtual machines.
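The following is a hedged sketch of both ideas in Python with boto3; the database identifier, credentials, bucket name, and file path are placeholders only:

import boto3

rds = boto3.client("rds", region_name="us-east-1")
s3 = boto3.client("s3", region_name="us-east-1")

# Create a MySQL instance with Multi-AZ replication and automatic failover.
rds.create_db_instance(
    DBInstanceIdentifier="app-db",        # placeholder identifier
    DBInstanceClass="db.t2.micro",
    Engine="mysql",
    MasterUsername="admin",               # placeholder credentials
    MasterUserPassword="change-me-please",
    AllocatedStorage=20,
    MultiAZ=True,
)

# Push a local backup file to an S3 bucket for offsite, redundant storage.
s3.upload_file("backup.sql", "app-backups-bucket", "backups/backup.sql")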
The previous database example hints at a set of issues known as high availability. The purpose of high availability is to provide redundancy through a load balancer, a proxy, or by crossing availability zones. This is a part of risk management and disaster recovery. The last thing an operations team would want is to have their database go down and find that its replica sits in New Orleans during Hurricane Katrina. This is an extreme and incredibly rare example, but the risk exists. The reason disaster recovery exists, and will always exist, is the simple fact that accidents happen, and they have happened to the ill-prepared.
Summary
Throughout this brief introduction to AWS, we learned not only about the background and the industry shift towards virtualized infrastructure, but also about where AWS fits in among some of its competitors. We discussed not only the kinds of problems AWS solves, but also the kinds of problems that can be encountered in Cloud infrastructure. There are countless unique problems to be solved in this dynamic paravirtual environment. Picking up consistent patterns throughout this book will help to strengthen applications of many forms against these issues. In the next chapter, we will go over some basic design patterns. It is one of the easier topics and covers data backups through instance snapshots, replication through machine imaging, scaling instance types, dynamic scaling through CloudWatch, and increasing the disk size when needed. These patterns help solve common provisioning issues for single instances.