Description

Technically sound in AWS, Kubernetes, and Python, with basic SQL; MLOps knowledge such as MLflow is a plus.

Answering/fixing support issues for DatalaLab

Implementing and maintaining Infrastructure as Code and the build pipeline

Taking measures to minimize on-call incidents

Conducting post-incident reviews

Documenting issue resolutions and previously undocumented knowledge

Working with dev teams to ensure that new features meet reliability and performance goals

Ability to work with geographically distributed teams in India and SCV

The successful candidate will have several years of experience supporting large enterprise systems with at least 10 different upstream and downstream systems, and identifying issues from Splunk logs.

Excellent problem-solving skills and sound judgment about when to engage other team members.

 

Key Skills: DevOps Support SRE, AWS with SageMaker ML

Education

Any Graduate