
AWS Well-Architected Framework Cheatsheet


Regardless of the AWS certification you are preparing for, reviewing the AWS Well-Architected Framework should be on your to-do list. This post breaks down the framework in a quick and easy way so you can focus on learning the services and passing that exam. I can speak from personal experience: if this is your first AWS certification, reading the whitepaper cover to cover could easily substitute for your favorite sleep-aid. It’s rough… But no worries, we’ll make this as painless as possible!

The purpose of the AWS Well-Architected Framework whitepaper is to help you understand how best to go about using and integrating AWS services with your new or existing on-premise infrastructure. With this cheatsheet, you’ll better understand the pros and cons of choosing one service over another. In particular, we will cover how to design and operate architectures that are reliable, secure, efficient, and cost-effective.

The Five Pillars of the AWS Well-Architected Framework

Operational Excellence

The operational excellence pillar is the first pillar in the AWS Well-Architected Framework. It focuses on running and monitoring your systems to deliver business value, while continually improving the supporting processes and procedures. There are six principles to focus on within Operational Excellence for the cloud:

  1. Perform operations as code: Those days of racking and stacking large servers are done! In the cloud, you can spin up entire environments simply by using code, and update the environment’s details with code as well. The real benefit comes when you automate responses to events (good or bad). What do I mean by this? Well, if you happen to be pulling from a database and that database goes down, you can automatically redirect traffic to a backup database while rebuilding the one that just went down, all before you even wake up for your morning cup of coffee.
  2. Annotate Documentation: If you’ve ever worked in an on-premise datacenter, you’ve probably experienced the difficulties of documentation. I get it: the pace of change moves faster than your team can keep up with, and for many, logging what happened two hours ago is long forgotten while hangry employees try to grab some grub before the next fire arises. In a cloud environment, you can automate your documentation, even the “hand-written” kind, so this is no longer an issue. Now you and your team can focus on putting out the fires and staying well-fueled while the cloud handles the rest.
  3. Make Frequent, Small, Reversible Changes: This kind of goes without saying, but if there is one thing I’ve learned while working as a systems engineer, it is to make your changes gradually. Why? Because you don’t want to update 20+ components of your well-oiled machine, run into problems, and then be left sifting through your changes for what caused them.
    When these updates are automated, the nightmare can be even worse! So make your changes small and frequent, and make sure they are reversible.
  4. Refine Operations Procedures Frequently: This is pretty self-explanatory, but constantly looking to improve is a good practice. As you use your operations procedures, look for ways to refine them; this improves efficiency. A great way to start is by setting up regular game days to review and validate all the procedures, making sure they are effective and that the teams stay familiar with them.
  5. Anticipate Failure: If you’ve been in IT for any length of time, chances are you’ve heard someone say “it’s better to be proactive than reactive.” They’re right, and this point falls within that. Anticipating failure means finding the weaknesses in your architecture, identifying their sources, and working out how to remove or mitigate them. This testing will improve your architecture and let your team develop better response procedures on top of it.
  6. Learn from all operational failures: No one gets it right on the first go, which is why you should constantly seek to improve from past failures. Share what you’ve learned with your team members, and learn from others who are well seasoned both inside and outside of your organization. This will improve your ability to develop good architectures in the cloud.
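The “perform operations as code” idea in principle 1 can be sketched as a tiny failover routine. This is a hedged illustration, not AWS code: `primary`, `backup`, and `on_failover` are hypothetical callables standing in for a real database client and an automation hook.

```python
def query_with_failover(primary, backup, on_failover=None):
    """Try the primary data source; if it is down, optionally kick off
    an automated response (e.g. a rebuild) and serve from the backup."""
    try:
        return primary()
    except ConnectionError:
        if on_failover is not None:
            on_failover()  # e.g. trigger an automated rebuild of the primary
        return backup()
```

In a real AWS setup, the same idea is usually expressed through health checks and managed automation (for example, CloudWatch alarms triggering recovery actions) rather than living in application code.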

For more information on the Operational Excellence Pillar, click here.


Security

Securing your architecture is a primary focus of the AWS Well-Architected Framework. This pillar includes seven principles for security in the cloud:

  1. Implement a strong identity foundation: This principle is fairly straightforward. Centralize privilege management so things don’t get too complicated and overwhelming as your organization, and with it access to the various components of your architecture, scales. Implement the principle of least privilege, and enforce separation of duties with appropriate authorization for each interaction with your AWS resources.
  2. Enable traceability: When you have multiple people working in the same environment, it’s helpful to be able to say who did what and when they did it. This is a common security practice that we’ve been implementing for years in on-premise environments, so it only makes sense to bring it into the cloud.
  3. Apply security at all layers: If you had a treasure chest worth $1 million sitting in your living room, would you secure it with a fence around your property and say, “Ah, yes, that should keep the bad guys from stealing my crown jewels”? Of course not! You would probably fortify the treasure in a vault, surrounded by guards with automatic rifles, guard dogs, security alarms, lasers… you get the picture. The same goes for your cloud architecture. Just putting up a firewall isn’t good enough! The safest approach is to apply security at every level, from the outermost edge down to the actual storage of your data. This will frustrate any attacker who persists after your data and give those whose data you are managing the ability to sleep peacefully at night.
  4. Automate Security Best Practices: This is very helpful for those building software in the cloud. You can automate security mechanisms while scaling securely as needed.
  5. Protect data in transit and at rest: Classify your data and use security mechanisms, such as encryption, tokenization, and access controls, to safeguard its use.
  6. Keep people away from data: The further you keep humans from accessing and modifying data, the better. Within AWS you can design mechanisms to reduce the need for direct access or processing of data. Your organization will thank you!
  7. Prepare for security events: If your organization doesn’t currently have an incident management process, it might be a good idea to develop one. In your cloud architecture, you will want to run incident response simulations and use automation tools to increase handling speed.
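The least-privilege idea in principle 1 is usually expressed as narrowly scoped IAM policies. Here is a minimal example (the bucket name and prefix are placeholders): grant read access to one prefix of one S3 bucket and nothing else.

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "ReadReportsOnly",
      "Effect": "Allow",
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::example-bucket/reports/*"
    }
  ]
}
```

A user or role attached to this policy can fetch objects under `reports/` but cannot list the bucket, write to it, or touch any other AWS service, which is exactly the point.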

For more information on the Security Pillar, click here.


Reliability

The reliability pillar focuses on a system’s ability to recover from infrastructure disruptions, to scale resources to meet demand, and to mitigate disruptions. There are five principles to the Reliability Pillar:

  1. Test recovery procedures: In the cloud, you can simulate failures and validate your recovery procedures before a real outage ever happens, rather than only testing that the system works when everything is healthy.
  2. Automatically recover from failure: By monitoring a system for key performance indicators (KPIs), you can trigger automated recovery procedures when a failure is detected.
  3. Scale horizontally to increase aggregate system availability: Replace one large instance with numerous smaller ones to reduce the impact of a single point of failure, and distribute requests across those many smaller resources.
  4. Stop guessing capacity: Capacity planning has long been a guessing game in on-premise systems, one that cloud technology has greatly improved. In AWS, you can monitor your system’s resource usage and provision more resources when needed, and you can even automate that increase.
  5. Manage change in automation: Instead of managing systems individually by hand, make changes to your infrastructure through automation, so the changes to the automation itself are all you need to manage.
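Principle 2 (“automatically recover from failure”) boils down to comparing KPIs against thresholds and triggering a recovery action on a breach. A minimal sketch; the KPI names, thresholds, and `recover` callback are made up for illustration:

```python
def check_and_recover(kpis, thresholds, recover):
    """Trigger the recovery action for every KPI that breaches
    its threshold, and report which ones were breached."""
    breached = [name for name, value in kpis.items()
                if value > thresholds[name]]
    for name in breached:
        recover(name)  # e.g. replace the unhealthy instance
    return breached
```

In AWS, this role is typically played by CloudWatch alarms paired with Auto Scaling health checks or automation runbooks, rather than a hand-rolled loop like this.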

For more information on the Reliability Pillar, click here.

Performance Efficiency

Performance efficiency directly relates to the ability to use computing resources to meet system requirements and maintain the technological demands of your organization. There are five design principles to Performance Efficiency:

  1. Democratize advanced technologies: What does that even mean? Well, I’m glad you asked. Advanced tech can be hard to implement, so pushing the weight of implementation onto the cloud provider lightens your IT team’s workload and lets your organization focus on using these new technologies rather than building and maintaining them.
  2. Go global in minutes: This almost goes without saying, but with the cloud you can deploy your architecture across the world in minutes, and for pennies compared to trying to do it in-house.
  3. Use serverless architectures to remove the need for running and maintaining servers.
  4. Experiment more often: With virtual and automatable resources, it’s super easy to spin up and tear down testing environments, minimizing the harm to your production infrastructure and systems.
  5. Mechanical sympathy: Use the technology approach that aligns best with what you are trying to achieve. For example, consider your data access patterns when selecting a database or storage technology.
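Principle 3 (“use serverless architectures”) in its simplest form: a function with the AWS Lambda Python handler signature, which the platform runs and scales for you. The event shape here is invented for illustration.

```python
def handler(event, context):
    """Minimal AWS Lambda-style handler: you write the function;
    the platform provisions, runs, and scales it."""
    name = event.get("name", "world")
    return {"statusCode": 200, "body": f"Hello, {name}!"}
```

Deployed behind an API gateway, this would serve traffic with no instances for you to patch, size, or monitor, which is the whole appeal.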

For more information on the Performance Efficiency Pillar, click here.

Cost Optimization

Cloud technology has many benefits, one of them being cost. However, if not optimized correctly, you could find yourself paying just as much as you would running on-premise. That is why the Cost Optimization pillar plays a critical role in the AWS Well-Architected Framework. This pillar has five design principles:

  1. Adopt a consumption model: Pay only for what you use, and not a penny more. With this model, you can scale up or down to match your needs instead of spending so much energy forecasting future requirements.
  2. Measure overall efficiency: Calculate the business output of the system you are implementing and the costs associated with delivering it. This will help your organization make more solid decisions before adding new services to your existing infrastructure.
  3. Stop spending money on data center operations. AWS handles that for you so you can focus on your customers and projects.
  4. Analyze and attribute expenditure to help determine return on investment for any given system or resource you provision within the cloud.
  5. Use managed services to reduce the cost of ownership, so you can focus on business tasks instead of maintaining the systems that support them. Hand off that email server to be maintained by AWS.

For more information on the Cost Optimization Pillar, click here.

I hope this article helped! Please share your thoughts in the comments below. If you have any questions, feel free to reach out to me.
