Description

Job Description:

  • Work closely with a wide range of container automation tooling, such as Kubernetes and AWS EKS
  • Design, implement, and maintain a secure, scalable compute platform as it evolves with the industry
  • Champion SRE methodologies around monitoring, alerting, and establishing SLOs and SLAs
  • Identify and execute on opportunities to optimize existing systems, improve infrastructure, and eliminate manual work through automation
  • Work alongside other teams to provide postmortem analysis of why services failed or became degraded
  • Design and build automation suites to streamline operational support
  • Good understanding of CNCF tools such as ArgoCD, Crossplane, and Kyverno
  • Established understanding of observability fundamentals (logging, metrics, tracing)
  • Ability to learn quickly, master our existing systems, and identify areas for improvement
  • Strong technical background and the ability to think creatively to solve problems
  • Familiarity with Kubernetes Operators, controllers, and CRDs
  • Participate in our on-call rotation for the production services we build
  • Deep understanding and application of computer science fundamentals: data structures, algorithms, and design patterns
  • Exposure to and understanding of cloud (AWS, Google Cloud, Azure, etc.) architectures and services
  • Excellent understanding of multi-cluster management and operating at scale

Education

Any Graduate