Client
Leading payroll and HR solutions provider
Goal
Develop an automation strategy and framework that accommodates growth and ensures efficiency
Tools and Technologies
Ansible, AWS, Dynatrace, Gremlin, Groovy, Jenkins, Keptn, KICS, Python, Terraform
Business Challenge
The SRE (Site Reliability Engineering) shared services team faced a diverse set of needs relating to automation of infrastructure and services provisioning, configuration, and deployment.
The team was resource-constrained: limited in-house expertise in certain automation tools and technologies was delaying critical automation work. The team also needed to ensure system reliability and to scale its automation solutions to meet increasing demand as operations grew.
Solution
- Development of a comprehensive automation strategy to align with objectives, encompassing Terraform, Ansible, Python, Groovy, and other relevant technologies in the AWS environment
- Leveraging our expertise to bridge the knowledge gap, provide training, and augment the client team in handling complex automation tasks
- Implementation of a chaos engineering framework using Gremlin, Dynatrace, Keptn, and EDA tools, to proactively identify weaknesses and enhance system resilience
- Creation of a scalable automation framework that accommodates growing needs and ensures long-term efficiency
Outcomes
- A unified automation strategy that streamlined processes, reduced manual effort, and enhanced overall efficiency by 30%
- The implementation of chaos engineering and self-healing practices, which increased reliability by 20% to 50%
- A reduction in manual interventions, along with improved efficiency, that will result in cost savings of 25% to 50%