Description


Job Description:

  • Experience with Jupyter Notebooks as an SRE
  • Experience with Kubernetes, Machine Learning workflows (preferably SageMaker), Python scripting, and Rubix
  • SRE experience in Machine Learning Data Flows
  • Answering and fixing support issues for Data Lab
  • Implementing and maintaining Infrastructure as Code and build pipelines
  • Taking measures to minimize on-call incidents
  • Conducting post-incident reviews
  • Documenting issue resolutions and previously undocumented knowledge
  • Working with dev teams to ensure that new features meet reliability and performance goals
  • Ability to work with geographically distributed teams


 

Education

Any Graduate