We are seeking a highly skilled and motivated Machine Learning Ops Engineer to join our team.
Your primary responsibility will be to develop and implement the infrastructure and processes required to deploy, manage, and monitor machine learning models in production.
You will work closely with our R&D, Data Science, Engineering, and DevOps teams to shape our tech stack, building new MLOps pipelines as well as maintaining existing solutions.
Responsibilities:
- Collaborate closely with cross-functional teams including data scientists, software engineers, architects, and DevOps experts to devise and implement highly scalable and dependable machine learning infrastructure.
- Build and maintain end-to-end machine learning pipelines, including data preprocessing, model training, validation, deployment, and monitoring.
- Develop tools and automation frameworks to streamline the deployment and management of machine learning models in production environments.
- Optimize and fine-tune machine learning workflows and processes to ensure high performance, scalability, and reliability.
- Review data science models, refactor and optimize code, oversee containerization, deployment, and version control, and implement rigorous quality monitoring.
- Monitor and evaluate the performance and health of deployed machine learning models, and proactively troubleshoot any issues or bottlenecks.
- Collaborate with data engineers to ensure efficient data ingestion, storage, and retrieval for machine learning workflows.
- Implement robust security measures and privacy protocols to safeguard sensitive data and ensure strict adherence to regulatory requirements.
- Stay updated with the latest advancements and best practices in machine learning, cloud computing, and software engineering to drive continuous improvement.
Essential Competencies
- 5 to 8 years of work experience with end-to-end machine learning pipelines, including data preprocessing, model training, validation, deployment, and monitoring.
- Strong programming skills with Python, Go, or Spark; a good understanding of Linux; and knowledge of frameworks such as scikit-learn, Keras, PyTorch, and TensorFlow.
- In-depth understanding of machine learning concepts and algorithms, and of frameworks such as Kubeflow, MLflow, and Nvidia-toolkit.
- Proficiency in cloud platforms such as AWS, Azure, or Google Cloud, and experience with containerization technologies (e.g., Docker, Kubernetes) and CUDA libraries.
- Solid knowledge of software engineering principles, including version control, testing, and agile development methodologies.
- Experience with data preprocessing, feature engineering, and data visualization techniques.
- Familiarity with DevOps practices and tools, including CI/CD pipelines, infrastructure as code (IaC), and configuration management.
- Ability to understand the tools used by data scientists, and experience with software development and test automation.
- Strong problem-solving skills and the ability to analyze complex systems and workflows.
- Excellent communication and collaboration skills to work effectively in cross-functional teams.
Mandatory Skills
- Python, Kubernetes, Docker
- Cloud experience (AWS/GCP)
Preferred Skills
- Experience in AI/ML-based projects
Education
- BE/BTECH/MCA or equivalent degree from a UGC-accredited university
Role: Data Science & Machine Learning - Other
Industry Type: Software Product
Department: Data Science & Analytics
Employment Type: Full Time, Permanent
Role Category: Data Science & Machine Learning
Education
UG: B.Tech/B.E. in Any Specialization
PG: MCA in Computers
ANY GRADUATE