Responsibilities
Design, implement, and maintain scalable data architectures, including databases, AWS data lakes, and data warehouses, in line with existing Engie tools and frameworks.
Build and optimize ETL (Extract, Transform, Load) processes to move data from various sources to data repositories.
Combine diverse data sources into cohesive, unified views (gold data sources) for analysis and reporting purposes.
Ensure data consistency, accuracy, and quality throughout the data lifecycle.
Manage and optimize the data infrastructure, including storage, processing, and retrieval systems, for efficiency and cost.
Collaborate with various teams to ensure the scalability, security, and performance of data systems.
Develop and maintain data processing code in languages and frameworks such as Python, SQL, and Spark, and ensure these jobs are automated and run on a regular schedule.
Collaborate with business stakeholders to gather and understand data-related requirements.
Create and maintain comprehensive documentation for data processes, pipelines, and architectures.
Ensure that data engineering best practices are followed.
Share best practices with all data stakeholders, both IT and business.
Hard skills:
Must have:
Proven experience as a Senior Data Engineer or in a similar role
Proven experience in data platform management
Hands-on experience with AWS services such as S3, Athena, Glue, and Redshift
Hands-on experience with Databricks (preferably on AWS)
Strong programming skills in Python and SQL, with Spark or another data engineering framework
Experience with data processing technologies
Knowledge of Agile processes (Scrum, Kanban)
Knowledge of Dataiku
Knowledge of Azure DevOps
A degree in any discipline