Key Responsibilities

Design, implement, and maintain robust machine learning pipelines that enable the development and deployment of models at scale.
Collaborate with data scientists and engineers to productionize ML models, ensuring scalability, efficiency, and robustness in the end-to-end ML lifecycle.
Develop automated data pipelines for data collection, preprocessing, and transformation to feed machine learning models.
Work with large-scale datasets and build efficient feature engineering pipelines for real-time or batch processing.
Implement and optimize machine learning algorithms, including supervised learning, unsupervised learning, deep learning, and reinforcement learning models.
Collaborate with software engineering teams to integrate ML models into production environments and ensure the reliability of models in live systems.
Monitor and maintain deployed models, including performance tracking, versioning, and retraining based on new data.
Leverage cloud platforms (e.g., AWS, GCP, Azure) and containerization technologies (e.g., Docker, Kubernetes) to scale model training, deployment, and inference.
Implement MLOps practices, including CI/CD for ML models, model versioning, monitoring, and automated testing.
Optimize model training and inference times using techniques such as distributed computing, parallel processing, and GPU acceleration.
Collaborate with cross-functional teams, including product managers, data engineers, and software developers, to ensure successful model deployment.
Stay up to date with the latest research and trends in machine learning engineering, scalable ML systems, and cloud-based ML solutions.

Required Qualifications

Bachelor’s or Master’s degree in Computer Science, Data Science, Software Engineering, or a related field.
5+ years of experience as a Machine Learning Engineer or Software Engineer with a focus on deploying and scaling machine learning models.
Strong proficiency in Python or Java, with hands-on experience in machine learning frameworks such as TensorFlow, PyTorch, Scikit-learn, or Keras.
Experience building scalable machine learning pipelines using Airflow, Kubeflow, or similar tools.
Proficiency with cloud computing platforms such as AWS, Google Cloud Platform (GCP), or Azure for model training, deployment, and scaling.
Familiarity with containerization technologies like Docker and orchestration tools like Kubernetes for deploying and managing ML models in production.
Experience in MLOps practices, including CI/CD for machine learning models, model versioning, and monitoring.
Strong understanding of data engineering concepts, including ETL processes, data pipelines, and large-scale data processing frameworks like Spark or Hadoop.
Experience working with relational (SQL) and non-relational (NoSQL) databases, and proficiency in data querying.
Excellent software development skills, including object-oriented programming, algorithm design, and unit testing.
Strong problem-solving skills and ability to work in a fast-paced, dynamic environment.

Preferred Qualifications

Experience with deep learning architectures (CNNs, RNNs, Transformers) and applying them to real-world problems in areas such as computer vision or NLP.
Familiarity with distributed computing frameworks like Dask, Ray, or Horovod.
Experience working with DevOps practices, including infrastructure as code (IaC), monitoring, and logging.
Knowledge of big data tools such as Apache Spark, Kafka, and Cassandra.
Experience in building real-time machine learning systems and streaming pipelines.
Strong understanding of optimization techniques for efficient model inference, including model compression and quantization.
Contributions to open-source ML tools or published papers in the machine learning community.
