Site Reliability Engineering

What you'll get to do...

Elevate operational standards for smoother day-to-day performance
Build and optimize a robust CI/CD pipeline for automated deployments across applications and platforms
Oversee and support both private and public cloud infrastructure, ensuring efficiency and scalability
Own and improve our release process with a focus on efficiency, scalability, and quality
Design and implement new infrastructure components, deploying them in distributed environments
Collaborate closely with engineering, product, and business teams to coordinate release schedules and drive timely delivery
Administer and support our cloud infrastructure based on OpenStack
Manage host systems through problem analysis, isolation, and debugging
Contribute throughout the product development lifecycle, from inception to post-launch support, for large-scale cloud deployments
Actively contribute to open-source projects such as OpenStack and libvirt
For some roles, participate in on-call rotations to maintain system reliability
Troubleshoot and resolve code-level issues quickly and effectively to support our infrastructure and delight our customers

Your experience should include...

Mid-Level: 3+ years of experience in Site Reliability Engineering
Senior-Level: 5+ years of experience in Site Reliability Engineering
Expertise with AWS, in both cloud-native and cloud-agnostic deployments
Experience with CI/CD development using Kubernetes, Docker, and related technologies
Proficiency in Linux and Python development
Proven contributions to open-source projects
Experience with cloud platforms like AWS or Azure
Hands-on experience with virtualization technologies such as KVM and OpenStack
Strong coding skills in languages such as Go and Python, and with infrastructure-as-code tools such as Pulumi, Terraform, Ansible, and CDK

You might also have...

A degree in any discipline
