Responsibilities:
Design, implement, and optimize end-to-end data pipelines for AI and machine learning applications using cloud platforms such as Azure and AWS
Collaborate with data scientists and other stakeholders to understand AI model requirements and deploy scalable solutions
Develop and maintain data processing and feature engineering workflows for machine learning model training
Implement data orchestration and workflow automation using tools like Apache Airflow, Azure Data Factory, or AWS Step Functions
Work with big data technologies, including Apache Spark and Databricks, to process and analyze large datasets
Implement data versioning and lineage tracking for model reproducibility and compliance
Collaborate with cross-functional teams to design and implement AI-driven applications
Ensure data quality, security, and compliance with data governance standards
Optimize and tune data pipelines for performance, scalability, and cost-effectiveness
Stay updated on the latest advancements in AI, machine learning, and data engineering
Requirements:
Minimum 5 years of experience in data engineering with a focus on AI, machine learning, and AI enrichment
Bachelor’s degree in Computer Science or a related field
Proficiency in cloud-based data engineering platforms, including Azure, AWS, and Databricks
Programming skills in languages such as Python or Scala
Knowledge of data modelling, database design, and optimization
Familiarity with data warehousing concepts and technologies
Excellent problem-solving and analytical skills
Ability to work collaboratively in a team environment
Strong communication and documentation skills