Description

About the job
Responsibilities:

Design, implement, and optimize end-to-end data pipelines for AI and machine learning applications using cloud platforms such as Azure and AWS.
Collaborate with data scientists and other stakeholders to understand AI model requirements and deploy scalable solutions.
Develop and maintain data processing and feature engineering workflows for machine learning model training.
Implement data orchestration and workflow automation using tools like Apache Airflow, Azure Data Factory, or AWS Step Functions.
Work with big data technologies, including Apache Spark and Databricks, to process and analyze large datasets.
Implement data versioning and lineage tracking for model reproducibility and compliance.
Collaborate with cross-functional teams to design and implement AI-driven applications.
Ensure data quality, security, and compliance with data governance standards.
Optimize and tune data pipelines for performance, scalability, and cost-effectiveness.
Stay updated on the latest advancements in AI, machine learning, and data engineering.


Requirements:

Minimum 5 years of experience in data engineering with a focus on AI, machine learning, and AI enrichment.
Bachelor’s degree in Computer Science or a related field.
Proficiency in cloud-based data engineering platforms, including Azure, AWS, and Databricks.
Strong programming skills in languages such as Python or Scala.
Experience with machine learning frameworks and libraries.
Knowledge of data modelling, database design, and optimization.
Familiarity with data warehousing concepts and technologies.
Excellent problem-solving and analytical skills.
Ability to work collaboratively in a team environment.
Strong communication and documentation skills.


Preferred Skills:

Industry certifications related to data engineering, AI, or machine learning (e.g., Microsoft Certified: Azure AI Engineer Associate).
Experience with containerization and orchestration tools (e.g., Docker, Kubernetes).
Knowledge of MLOps practices for deploying and managing machine learning models.
Understanding of natural language processing (NLP) and computer vision.
Familiarity with distributed computing and parallel processing.

Education

Any Graduate