Description

Responsibilities:

Design, implement, and optimize end-to-end data pipelines for AI and machine learning applications using cloud platforms such as Azure and AWS

Collaborate with data scientists and other stakeholders to understand AI model requirements and deploy scalable solutions

Develop and maintain data processing and feature engineering workflows for machine learning model training

Implement data orchestration and workflow automation using tools like Apache Airflow, Azure Data Factory, or AWS Step Functions

Work with big data technologies, including Apache Spark and Databricks, to process and analyze large datasets

Implement data versioning and lineage tracking for model reproducibility and compliance

Collaborate with cross-functional teams to design and implement AI-driven applications

Ensure data quality, security, and compliance with data governance standards

Optimize and tune data pipelines for performance, scalability, and cost-effectiveness

Stay updated on the latest advancements in AI, machine learning, and data engineering

Requirements:

Minimum 5 years of experience in data engineering with a focus on AI, machine learning, and AI enrichment

Bachelor’s degree in Computer Science or a related field

Proficiency in cloud-based data engineering platforms, including Azure, AWS, and Databricks

Programming skills in languages such as Python or Scala

Knowledge of data modelling, database design, and optimization

Familiarity with data warehousing concepts and technologies

Excellent problem-solving and analytical skills

Ability to work collaboratively in a team environment

Strong communication and documentation skills

Education

Bachelor's degree