Description

Job Description:

Dynamic engineer with an understanding of application performance management and experience building monitoring and alerting solutions.

  • Troubleshoot incidents, identify root causes, fix and document problems, and deploy preventative solutions.

Required Experience

  • 5+ years of recent experience building automation and monitoring for observability (Prometheus/Grafana/ELK).
  • 5+ years of experience working on support projects, including rotational on-call duty to address failures.
  • 5+ years of recent experience with Kubernetes, Docker, and Helm, including end-to-end support of applications in this environment.
  • 5+ years of recent experience working in AWS and/or GCP.
  • 3+ years of full-stack Python development.
  • Strong communication skills for working effectively with team members and management.

Skills Preferred:

  • MLOps experience
  • MLE experience

Education

Any Graduate