Introduction:
Cloud tooling now lets DevOps teams deliver cloud infrastructure alongside the applications deployed on it. Did I just say, build a PaaS solution? Commercial PaaS solutions like OpenShift and Pivotal Cloud Foundry can be expensive and require specialized skills. They do speed up development and keep your enterprise cloud adoption vendor agnostic. However, adopting them calls for a strategic shift in the way your organization does application development. There is nothing wrong with that approach; it just takes time: POC, POV, road show, and then a decision. While PaaS solutions are great, an alternative is to use individual AWS services alongside open source tools that help provision, secure, run, and connect cloud computing infrastructure.
Operating knowledge of these tools, and orchestrating them into a cohesive workflow, can help your DevOps team run continuous deployment on cloud infrastructure with results comparable to commercial PaaS solutions. This approach is economical and manageable without hiring specialized skill sets. Why no specialized skill set? Because your development team already has the skills to build “Castles in the Cloud”. While these automations start out as point solutions, the end result is a full-blown product with its own governance and management lifecycle, and it integrates easily with the application delivery pipeline. Moreover, the solution provisions immutable EC2 instances that capture log information for monitoring and debugging. The underlying belief driving this approach: “Complete automation and seamless integration using non-commercial tools and services”.
Solution:
At first glance, the solution appears to be Elastic Beanstalk. Though Beanstalk produces immutable infrastructure, it has drawbacks when it comes to encrypting configuration and log data during infrastructure provisioning, which can be a challenge for organizations in highly regulated industries. The requirements to push service logs to an encrypted S3 bucket, to make the AMI generation process configuration driven, and to automate the monitoring and auditing of infrastructure call for a custom, comprehensive, configuration-driven solution. Moreover, highly regulated industries like finance and healthcare require complete encryption of data in transit and of logged data at rest.
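To make the encrypted-logs requirement concrete, here is a minimal sketch of enforcing default KMS encryption on the log bucket with boto3. The bucket name and key alias are hypothetical; this illustrates the requirement, not the full solution.

```python
# Minimal sketch: enforce KMS encryption at rest on the log bucket.
# Bucket name and KMS key alias are assumptions, not part of the original design.
import boto3

s3 = boto3.client("s3")

s3.put_bucket_encryption(
    Bucket="infra-validation-logs",  # hypothetical log bucket
    ServerSideEncryptionConfiguration={
        "Rules": [
            {
                "ApplyServerSideEncryptionByDefault": {
                    "SSEAlgorithm": "aws:kms",
                    "KMSMasterKeyID": "alias/infra-logs",  # hypothetical key
                }
            }
        ]
    },
)
```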
Cloud infrastructure automation can be broken into five key processes:
- Pre Provision
- Bakery
- Provision
- Validation
- Post Provision
Consider the above processes as individual workers, each accomplishing a fixed and independent task. AWS Step Functions can orchestrate the workflow among these activity workers and can be configured to build a comprehensive, configuration-driven, and dynamic infrastructure provisioning process. With Step Functions, the five processes become individual states that execute in sequence; the workflow remains in a given state until that state's activity worker completes its task. A state machine passes control from state to state, and each state invokes an activity worker built as a Lambda function. A minimal sketch of such a state machine follows.
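Here is a minimal sketch of that state machine, assuming boto3 and hypothetical Lambda ARNs, names, and IAM role. The article's "activity workers" are modeled here as plain Lambda Task states; a production definition would add retries, catches, and input/output handling.

```python
# Minimal sketch: a five-state machine chaining the provisioning stages.
# All ARNs, the account ID, and the role name are placeholders.
import json
import boto3

REGION = "us-east-1"        # assumption
ACCOUNT = "123456789012"    # placeholder account ID

def task(name, next_state=None):
    """Build a Task state invoking the hypothetical worker Lambda for a stage."""
    state = {
        "Type": "Task",
        "Resource": f"arn:aws:lambda:{REGION}:{ACCOUNT}:function:{name}-worker",
    }
    if next_state:
        state["Next"] = next_state
    else:
        state["End"] = True
    return state

definition = {
    "Comment": "Configuration-driven infrastructure provisioning pipeline",
    "StartAt": "PreProvision",
    "States": {
        "PreProvision": task("pre-provision", "Bakery"),
        "Bakery": task("bakery", "Provision"),
        "Provision": task("provision", "Validation"),
        "Validation": task("validation", "PostProvision"),
        "PostProvision": task("post-provision"),
    },
}

sfn = boto3.client("stepfunctions", region_name=REGION)
sfn.create_state_machine(
    name="infra-provisioning",  # hypothetical name
    definition=json.dumps(definition),
    roleArn=f"arn:aws:iam::{ACCOUNT}:role/sfn-exec-role",  # placeholder role
)
```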
A quick summary of each process/state:
- Pre Provision – This is the first stage of the process, triggered by the application's CI pipeline. Most enterprise CI pipelines are built using tools like Jenkins. The pipeline sends a notification to an SNS topic, and a Lambda function subscribed to the topic triggers the Step Functions execution (a minimal sketch of this trigger appears after this list). In this step, the activity gathers pertinent information from an application configuration file, combines it with process-specific configuration and environment-related information received from the pipeline trigger, then encrypts the result and saves it to the SSM Parameter Store as encrypted parameters. The application configuration file is generated by the application development teams using a rule-based UI that restricts access to AWS services per application needs.
- Bakery – This process is the heart of the automation solution and the next state after Pre Provision. It uses tools like Packer, InSpec, Chef, and the AWS CloudWatch agent. The state calls a Lambda activity worker that executes an SSM command, which starts a Packer build on a separate EC2 instance (see the Bakery sketch after this list). Packer pulls all the information required for the build from the encrypted Parameter Store, then uses Chef to layer the application, middleware, and other dependencies onto the AMI. After the Packer build, the application-specific AMI is encrypted and shared with the application AWS account owner for provisioning.
- Provision – Once the AMI is ready and shared with the application account owner, the next state in the automation process is Provision. This state calls a Lambda activity worker that executes another SSM command to run Terraform modules, which provision the following: an Application Load Balancer (ALB), a launch configuration (LC) referencing the AMI ID baked in the previous state, and an Auto Scaling group (ASG) to supply elasticity (see the Provision sketch after this list). At the end of this state, the entire application AWS physical architecture is up and running, and the ALB DNS name can be used to connect to the application. SSH access is removed to keep instances immutable.
- Validation – After the infrastructure is provisioned, automated InSpec scripts validate the OS and the services provisioned. This phase, too, is invoked by a Lambda activity worker. InSpec logs are moved to an encrypted S3 bucket (see the Validation sketch after this list), where the testing team reviews them and logs defects as necessary. These defects are then triaged and assigned to the respective teams.
- Post Provision – This is the last state in the process, where the newly provisioned infrastructure undergoes a smoke test before it is delivered to the application and testing teams. This state configures the EC2-based CloudWatch Logs to export to an encrypted S3 bucket, from which the logs can be pulled into log management tools like Splunk, where the operations team can build monitoring dashboards. In this step, all provisioned AWS services, along with the application ID, are also stored in a DynamoDB table for logging and auditing purposes (see the Post Provision sketch after this list). Lastly, this stage initiates blue-green deployments for a smoother transition to the new release.
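Pre Provision trigger: a minimal sketch of the SNS-subscribed Lambda, assuming boto3, a hypothetical state machine ARN, parameter path, and a JSON pipeline payload carrying an app_id field.

```python
# Minimal sketch: SNS-triggered Lambda that stores encrypted config and
# starts the state machine. ARN, parameter path, and payload shape are assumptions.
import json
import boto3

sfn = boto3.client("stepfunctions")
ssm = boto3.client("ssm")

STATE_MACHINE_ARN = "arn:aws:states:us-east-1:123456789012:stateMachine:infra-provisioning"

def handler(event, context):
    # SNS delivers the pipeline payload in Records[0].Sns.Message
    payload = json.loads(event["Records"][0]["Sns"]["Message"])
    app_id = payload["app_id"]

    # Save the merged configuration as a SecureString parameter,
    # which is KMS-encrypted at rest.
    ssm.put_parameter(
        Name=f"/provisioning/{app_id}/config",  # hypothetical path
        Value=json.dumps(payload),
        Type="SecureString",
        Overwrite=True,
    )

    # Kick off the state machine; each state runs one activity worker.
    sfn.start_execution(
        stateMachineArn=STATE_MACHINE_ARN,
        input=json.dumps({"app_id": app_id}),
    )
```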
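Bakery: a minimal sketch of the activity worker kicking off a Packer build over SSM Run Command. The builder instance ID and template path are assumptions; the Packer template itself would read its configuration from the Parameter Store.

```python
# Minimal sketch: start a Packer build on a separate builder instance via SSM.
# Instance ID and template path are placeholders.
import boto3

ssm = boto3.client("ssm")

def handler(event, context):
    app_id = event["app_id"]
    response = ssm.send_command(
        InstanceIds=["i-0abc123def456789"],  # hypothetical builder instance
        DocumentName="AWS-RunShellScript",
        Parameters={
            "commands": [
                # The template's provisioners pull app/middleware config
                # from the encrypted Parameter Store during the build.
                f"packer build -var app_id={app_id} /opt/bakery/ami.pkr.hcl"
            ]
        },
    )
    return {"command_id": response["Command"]["CommandId"], "app_id": app_id}
```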
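Provision: a minimal sketch of the worker running Terraform over SSM. The runner instance, module path, and variable names are assumptions; the baked AMI ID flows in as a Terraform variable.

```python
# Minimal sketch: apply the Terraform modules (ALB, LC, ASG) via SSM.
# Runner instance ID, paths, and variable names are placeholders.
import boto3

ssm = boto3.client("ssm")

def handler(event, context):
    app_id, ami_id = event["app_id"], event["ami_id"]
    ssm.send_command(
        InstanceIds=["i-0runner0123456789"],  # hypothetical Terraform runner
        DocumentName="AWS-RunShellScript",
        Parameters={
            "commands": [
                "cd /opt/provision/terraform",
                # The AMI baked in the previous state becomes the LC image.
                f"terraform apply -auto-approve -var ami_id={ami_id} -var app_id={app_id}",
            ]
        },
    )
```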
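Validation: a minimal sketch of shipping the InSpec report to the encrypted log bucket. The bucket name, key layout, and report path are assumptions.

```python
# Minimal sketch: upload InSpec results to the encrypted S3 log bucket.
# Bucket name, key layout, and report path are placeholders.
import boto3

s3 = boto3.client("s3")

def upload_inspec_report(app_id, report_path="/tmp/inspec-report.json"):
    with open(report_path, "rb") as report:
        s3.put_object(
            Bucket="infra-validation-logs",   # hypothetical bucket
            Key=f"inspec/{app_id}/report.json",
            Body=report,
            ServerSideEncryption="aws:kms",   # encrypt the object at rest
        )
```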
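Post Provision: a minimal sketch of the audit record, assuming a hypothetical DynamoDB table named "provisioned-resources" keyed on app_id.

```python
# Minimal sketch: record provisioned resources against the application ID
# for auditing. Table name and key schema are assumptions.
import time
import boto3

dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table("provisioned-resources")

def record_resources(app_id, resources):
    """Store every provisioned AWS resource against the application ID."""
    table.put_item(
        Item={
            "app_id": app_id,                  # partition key (assumption)
            "provisioned_at": int(time.time()),
            "resources": resources,            # e.g. ALB ARN, ASG name, AMI ID
        }
    )
```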
The above infrastructure automation process nukes and paves the infrastructure using AWS services. A new release, or an update to the base SOE image, triggers execution of the automation process. It can significantly improve the efficiency of deploying applications on AWS, greatly reduces EC2 provisioning time, and can bring down your AWS operating costs over time. Though custom, these automation solutions are complex and require deep knowledge of cloud-native services and the tools that build infrastructure through code. Perficient's Cloud Platform Services team is adept at such solutions and can help your organization look past the “pet EC2 instances” world. If you're interested in learning more, reach out to one of our specialists at sales@perficient.com and download our Amazon Web Services guide for additional information.