Devops Automation and Devops Best Practices

Gennovacap explains how devops automation and devops best practices can help cut costs, deploy apps faster, and achieve 99.9% uptime.
Devops Automation and Devops Best Practices for Scaling Startups

Summary

In this guide, we analyze a common devops problem that startup companies face as they scale up their applications. At Gennovacap, we build devops automations based on our devops best practices to help our customers scale to the moon. The business results: our devops automation resolves scalability issues, reduces cloud costs, and makes software delivery 6X to 10X faster.

To share these learnings, Gennovacap decided to publish our devops best practices for interested technical and business leaders. The following strategy proposes a well-architected solution, with a technical guide and a summary of the benefits for businesses that choose it.

Estimated reading time: 11 minutes

Tech Debt in Devops

Over the past 10 years, Gennovacap’s team has helped ship software for technology companies. We’ve seen firsthand how underrated devops is during the early stages of product development. A good majority of startup companies spend 80% of their development cycles shipping features and the other 20% patching bugs.

At the early stage, startups choose deployment tools like Jenkins or Heroku. These tools are good for getting started, but they don’t scale efficiently with operations and certainly aren’t built for automation. The tech debt and costs incurred from these tools grow as the applications grow. If a company’s customer base doubles in one year, devops automation becomes a critical strategy for sustaining that growth and scale. This is where we can help you with our devops best practices and devops automation strategies.

Fast and cheap Devops – the Growth Killer

There’s an old saying in software development that goes something like, “Fast, good, or cheap – pick two.” Anyone who has ever built software has felt the pressure of weighing the opposing forces of features, speed and cost against each other.

Startup companies usually choose fast and cheap for their early stage devops strategy, which is why they choose Heroku or Jenkins. At that early stage, it’s smarter to get the product launched and into customers’ hands so you can iterate quickly. However, when success catches up to you, you need a feature-rich devops process to improve development and product quality. At Gennovacap, we focus on quality with each devops iteration so that our customers can reach scaling goals, deploy features faster, and achieve high growth.

[Figure: the iron triangle of fast, good, and cheap]

Our Devops Automation and Managed Cloud Services Practices 

Last year, Gennovacap put together a case study covering Cloud Cost Optimization using AWS Devops: How an AI company saved 90% on cloud costs. CTOs at software companies always want to know how we did it. How did we save a company $18,000 / month using our devops best practices? 

The answer is not simple. In fact, it’s a lengthy process involving operations, code repositories, deployments, network, storage, pipelines, and databases in the cloud. At Gennovacap, we break this process into the following devops best practice areas:

  1. Devops Consulting
    • Upgrade the application build process, to make it ready for running on Kubernetes (AWS EKS) as a Docker container, by following the 12-factor app methodology
    • Adopt Infrastructure as Code practice for managing and evolving cloud resources, leveraging industry standard, cloud-agnostic tools and processes
      1. Infrastructure as code
      2. Platform as code
      3. Configuration as code
      4. Policy as code
    • Create CI/CD Processes for consistently and reliably building and deploying managed artifacts (see our latest article on 11 Open Source Kubernetes CI CD Tools to Improve Your Devops)
  2. Managed Cloud Services
    • 24x7x365 support, continuous monitoring, and incident response for AWS Resources, GCP Resources, or Azure Resources
      1. Logging
      2. Monitoring
      3. Alerting
      4. Tracing
    • Security and Disaster Recovery Plans
  3. Ongoing devops consulting 
    • Security patches, software upgrades, infrastructure maintenance
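
As one concrete illustration of the 12-factor methodology mentioned above, its third factor stores configuration in environment variables so that the same Docker image can run unchanged in every environment. The sketch below is written in Python for brevity, and the variable names (DATABASE_URL, REDIS_URL, WEB_CONCURRENCY) are illustrative assumptions rather than settings taken from a client project.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class AppConfig:
    database_url: str
    redis_url: str
    web_concurrency: int


def load_config(env=None):
    """Build the app config from environment variables (12-factor, factor III)."""
    # The variable names below are hypothetical examples, not a fixed convention.
    env = os.environ if env is None else env
    missing = [name for name in ("DATABASE_URL", "REDIS_URL") if not env.get(name)]
    if missing:
        # Fail fast at boot so a misconfigured container never serves traffic.
        raise RuntimeError("missing required settings: " + ", ".join(missing))
    return AppConfig(
        database_url=env["DATABASE_URL"],
        redis_url=env["REDIS_URL"],
        web_concurrency=int(env.get("WEB_CONCURRENCY", "2")),
    )
```

Because nothing environment-specific is baked into the image, promoting a build from dev to prod becomes a matter of changing the variables that Kubernetes injects into the container.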

Devops Automation Benefits

By implementing devops automation using these devops best practices, every company we consult for can achieve benefits like: 

  • Reduced Cloud Costs
  • Faster Releases
  • Compliance
  • Hardened Security
  • 99.9% Uptime

In the remainder of this article, we examine a case that lists some of the devops automation tools we use and outlines a basic cloud architecture for devops automation.

How Can Gennovacap’s Devops Best Practices Help My Company?

In the course of working with clients, we documented every issue and turned the lessons into a series of devops best practices. To illustrate them, we compiled a case from an existing client who faced scaling problems with their devops setup on AWS.

In this case, we break down the devops automation tools and managed support strategies we put in place for this startup company. By implementing these devops best practices, the client lowered costs by $5,400/month, deployed 15X faster, and reached 99.9% uptime.

Devops Automation Case Study

Company Profile

Founded: 2012

Company Market: B2B SaaS Startup

Customers: Globally focused small and medium-sized businesses with regional branches of multinational companies.

Cloud costs:  $9,000 / month 

Application Stack

  • Ruby on Rails
  • PostgreSQL
  • Background Jobs
  • Redis
  • Multi-tenant SaaS

Cloud Services

  • AWS OpsWorks
  • AWS Certificate Manager
  • RDS
  • EC2 (Ubuntu Linux)
  • Classic Load Balancer
  • CloudWatch Logs and Alarms

Application and Cloud Infrastructure: Vanilla Ruby on Rails and AWS EC2

The startup company built their software on a plain Ruby on Rails stack, using PostgreSQL on AWS RDS and EC2 instances.

Devops Challenges: Slow releases, Unstable Deployments, and Service Disruptions

They deployed the software on an early version of AWS OpsWorks, which did not include any CI/CD tools. As a result, release cycles were very slow and unstable. Above all, their engineering team needed to focus on the core product instead of triaging scaling issues. To sum up, here are the issues they faced:

  • Older cloud technology and devops tools caused a bottleneck in the devops process
  • Rising cloud costs – $9,000 / month
  • 2 days / week spent deploying and stabilizing code
  • Additional support needed for IT-related issues (servers, maintenance, security, upgrades)

Application Challenges: Memory Leaks and Auto Scaling Issues

In addition to the IT and devops problems, the multi-tenant SaaS application had fundamental problems of its own, such as memory leaks. The memory leaks forced the engineering team to add EC2 instances every week after each deployment. Once they fixed the leaks, they then had to spin the servers back down to keep costs low.

Further, they did not have autoscaling to handle large API requests, which often caused service disruptions for customers. In short, their devops needed a serious upgrade and they also needed a flexible auto scaling option for their application.

Solution: Resilient Cloud Architecture

The devops automation tools, cloud infrastructure, and application architecture we proposed for this startup company included Terraform, GitLab, AWS CodeBuild, AWS CodeDeploy, AWS EKS, AWS ECS, AWS EC2 Auto Scaling Spot Fleet, and numerous Kubernetes tools with monitoring and alerting.
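
To make the autoscaling part of this architecture more concrete, here is a minimal sketch of the replica calculation that Kubernetes documents for its Horizontal Pod Autoscaler: desiredReplicas = ceil(currentReplicas × currentMetricValue / desiredMetricValue). This article does not state which autoscaling mechanism was configured for the client, so treat this as an illustration of the general idea. The function name, the default bounds, and the sample numbers are assumptions, and the real controller also applies a tolerance band and stabilization windows that are omitted here.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10):
    """Replica count from the documented HPA ratio, clamped to the allowed range."""
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    # Scale the current replica count by how far the metric is from its target.
    raw = math.ceil(current_replicas * (current_metric / target_metric))
    # Never go below the floor or above the ceiling (hypothetical defaults).
    return max(min_replicas, min(max_replicas, raw))
```

For example, four pods averaging 90% CPU against a 60% target would scale out to six, and the same four pods at 30% would scale in to two.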

Solution: Devops Automation Tools and Cloud Resources

The following is a complete list of the AWS cloud resources and devops automation tools needed to solve the scaling problems, address the application issues, and provide monitoring and support:

  • Containers and Microservices
    • EKS
    • ECS
    • Docker
    • Kubernetes
  • Identity and Access
    • IAM with RBAC
  • Continuous Delivery and Continuous Integration
    • CodeBuild
    • GitLab
  • Infrastructure as code
    • CloudFormation
    • Terraform
    • Systems Manager
    • Config
  • Monitoring and logging
    • CloudWatch
    • Grafana
    • Prometheus
  • Version Control
    • GitLab
  • Databases 
    • RDS
  • Network Services
    • Auto Scaling
    • Load Balancers 
    • VPC
  • DNS
    • Route 53
  • Storage 
    • S3
  • Certificates
    • ACM certificate
  • Servers
    • EC2
    • EC2 Spot Instances

Technical Guide: Devops Automation and Devops Best Practices

To implement the full devops automation strategy, we took the steps listed below, which follow all of our devops best practices. For brevity’s sake, this article does not dive into how to configure each of these components and systems. If you’re interested in receiving the technical devops automation guides, please sign up for our newsletter.

  • Upgrade the application build process, to make it ready for running on Kubernetes (AWS EKS) as a Docker container, by following the 12-factor app methodology:
    • Minimum impact on existing development workflows
    • Immutable images that can be stored in an artifact repository
    • Ready to run on Kubernetes
    • Optimal resource usage
    • Native IAM integration for fine-grained EKS service roles
    • Native integration with AWS CloudWatch Logs log shipper
  • Adopt Infrastructure as Code practice for managing and evolving cloud resources, leveraging industry standard, cloud-agnostic tools and processes:
    • Use Terraform as an Infrastructure as Code tool to manage changes in a controlled way, leveraging Git as the source of truth and collaboration tool
    • Changes applied via CI/CD system (see next topic), avoiding hard-to-track manual changes applied via AWS console or API
    • Self-document changes via Git history
  • CI/CD Processes for consistently and reliably building and deploying managed artifacts:
    • Enable and set up build and test pipelines in Jenkins or another build platform (e.g., AWS CodeBuild or GitLab) using modern standards and patterns
    • Custom workflows for enabling live test environments for each development branch, that can be automatically disposed of as soon as testing is done
    • Allow developers to also make and propose changes via PRs (pull requests), that can then be automatically applied to the system once approved
    • Autoscaling workers for the build system for reduced costs and faster builds
    • Implement monitors for the entire CI/CD process that report the status of events of interest to developers and operators, and integrate them with team communication tools like Slack
    • GitOps Continuous Delivery pipeline:
      • Releases for multiple environments (dev, prod, etc.)
      • Controlled release rollouts (for prod)
      • Database migration controller
      • Deployment Rollback mechanism
      • Kubernetes cluster add-ons (DNS manager, load balancer ingress controller, among others)
  • Managed AWS Services and Resources:
    • EKS Cluster setup for each environment
    • Auto Scaling setup for different node pools (application and delayed job workers)
    • Container registry for Docker images, with automatic security assessments for vulnerabilities
    • Shared Application Load Balancers for decreasing costs on both prod and development environments
    • AWS SSO integration with EKS via IAM and RBAC
    • ElastiCache for Redis
    • RDS
    • Fine grained IAM policies for DNS records, S3 buckets and Service Roles
    • EKS integration with CloudWatch Logs, for both app and infrastructure layers
    • CloudWatch alarms for monitoring key platform components
    • ACM certificate integration for public-facing and internal-facing components
    • Automated DNS management with Route53 and EKS
    • VPC endpoints for private connections between the VPC and supported AWS resources
  • Security and Disaster Recovery:
    • Segmented VPC with isolated subnets and NAT gateways spawned across multiple availability zones for increased reliability
    • CloudTrail integration with CloudWatch Logs
    • Integrated secrets management between AWS resources, CI/CD systems, applications and Terraform
    • Backup AWS account setup
    • Enforcement of Multi-Factor authentication and password rollouts
    • SSO authentication for multi-account access
    • AWS Organizations setup (Backup and Main AWS accounts), including consolidated billing
    • Fine-grained policies for S3 access, DNS management, load balancer registration and certificate renewals 
    • Leverage in-house tools or managed services like Skeddly for managing database backups and S3 data replication on another AWS account
    • Extensive use of in-transit and at-rest data encryption mechanisms for inter-resource communications
    • Continuous and automatic rollout of security updates for all EC2 instances
    • Optional use of AWS Trusted Advisor, for extra security recommendations and reports
    • ClientVPN setup to provide access to internal resources
    • WAF integration via Terraform
  • Ongoing operational support:
    • Add an on-call site reliability engineer for monitoring and support 
    • Prometheus and Grafana stacks for analyzing historical cluster and application data, providing details that can be used to tune and optimize different operational aspects
    • Documentation and playbooks for common and potential issues
    • Alert routing setup for on-call engineers
    • Optional: Amazon DevOps Guru for insights and early fault detection and notification
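
One small detail behind the per-branch live test environments listed above is naming: a Kubernetes namespace must be a valid RFC 1123 DNS label (lowercase alphanumerics and hyphens, at most 63 characters, starting and ending with an alphanumeric), while Git branch names are far more permissive. The helper below is a hypothetical sketch of how a pipeline might derive one from the other. The function name and the "review" prefix are assumptions, not part of the setup described in this article.

```python
import re

MAX_LABEL_LENGTH = 63  # Kubernetes namespaces must be valid RFC 1123 DNS labels


def namespace_for_branch(branch, prefix="review"):
    """Derive a Kubernetes-safe namespace name from a Git branch name."""
    # Lowercase, then collapse every run of disallowed characters into one '-'.
    slug = re.sub(r"[^a-z0-9]+", "-", branch.lower()).strip("-")
    if not slug:
        raise ValueError(f"branch {branch!r} has no usable characters")
    # Trim to the label limit, then strip again in case the cut left a trailing '-'.
    return f"{prefix}-{slug}"[:MAX_LABEL_LENGTH].rstrip("-")
```

With this sketch, a branch named feature/ABC-123_new login maps to review-feature-abc-123-new-login. Very long branch names that share their first characters would collide after truncation, so a production version would likely append a short hash of the full branch name.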
Results: Saved $5,400/month, Released 15X Faster, and Reached 99.9% Uptime

By employing a cloud cost optimization strategy and migrating to AWS EC2 Spot Instances, we cut the client’s cloud costs by 60%. Additionally, they were able to deploy the application continuously, several times a day.

Furthermore, the alerting and monitoring solution notified the developers when the application failed because of memory leaks. The engineering team solved application problems faster, and bugs were triaged immediately. More stable releases, enhanced monitoring, and IT support for their infrastructure enabled them to reach 99.9% uptime.
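
As an illustration of the kind of alert that catches a memory leak before it takes the application down, the sketch below fires when enough of the most recent samples breach a threshold. The parameter names mirror the "evaluation periods" and "datapoints to alarm" settings of CloudWatch alarms, but the function itself is a hypothetical stand-in, not an AWS API, and the sample readings are invented.

```python
def should_alarm(samples, threshold, evaluation_periods=5, datapoints_to_alarm=3):
    """Return True when enough of the most recent samples breach the threshold."""
    if not 1 <= datapoints_to_alarm <= evaluation_periods:
        raise ValueError("datapoints_to_alarm must be between 1 and evaluation_periods")
    # Only the newest `evaluation_periods` samples are considered.
    window = samples[-evaluation_periods:]
    breaching = sum(1 for value in window if value > threshold)
    return breaching >= datapoints_to_alarm


# Invented memory-usage percentages for a leaking process, oldest first.
memory_pct = [62, 70, 78, 86, 91, 95]
leak_detected = should_alarm(memory_pct, threshold=85)
```

Requiring several breaching datapoints rather than one keeps a single short spike from paging the on-call engineer, while a steady climb like the one above still triggers the alarm.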

Cloud Cost Optimization:

  • Before Devops: $9,000 / month
  • After Devops: $3,600 / month

Released Software 15X Faster:

  • Before Devops:  1 time / week 
  • After Devops: 3 times / day 

99.9% Uptime:

  • Before Devops: 97.5% Uptime
  • After Devops: 99.9% Uptime
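
The three before and after pairs above can be cross-checked with a few lines of arithmetic. The sketch below makes two assumptions that the article does not state: releases are counted over a five-day working week, and a month is averaged to 730 hours. The function names are illustrative only.

```python
HOURS_PER_MONTH = 730  # assumption: average month, 8,760 hours / 12


def monthly_savings(before, after):
    """Return (dollars saved per month, savings as a fraction of the old bill)."""
    saved = before - after
    return saved, saved / before


def release_speedup(after_per_day, before_per_week, working_days_per_week=5):
    """Ratio of weekly release counts; the working-week length is an assumption."""
    return after_per_day * working_days_per_week / before_per_week


def downtime_hours_per_month(uptime_percent):
    """Hours of downtime per month implied by an uptime percentage."""
    return (100 - uptime_percent) / 100 * HOURS_PER_MONTH


saved, fraction = monthly_savings(9000, 3600)
speedup = release_speedup(after_per_day=3, before_per_week=1)
downtime_before = downtime_hours_per_month(97.5)
downtime_after = downtime_hours_per_month(99.9)
```

Under these assumptions the figures are mutually consistent: $5,400 is 60% of $9,000, and three releases a day over a five-day week is 15 times one release a week (counting seven days would give 21 times). Moving from 97.5% to 99.9% uptime cuts the implied downtime from roughly 18 hours a month to under one hour.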

This concludes our case study on Devops Automation and Devops Best Practices. We hope you enjoyed the article. Feel free to contact us if you’re interested in having Gennovacap help you cut cloud costs and scale for growth.
