Key Responsibilities
Design and build data pipelines using Spark SQL and PySpark in Azure Databricks (see the illustrative sketch after this list).
Design and build ETL pipelines using Azure Data Factory (ADF).
Build and maintain a Lakehouse architecture in Azure Data Lake Storage (ADLS) and Databricks.
Perform data preparation tasks including data cleaning, normalization, deduplication, and type conversion (a second sketch after this list illustrates these steps).
Work with the DevOps team to deploy solutions in production environments.
Monitor and control data processes and take corrective action when errors are identified; this may include executing a workaround and then identifying the root cause and a permanent fix for the data errors.
Participate as a full member of the global Analytics team, providing solutions for and insights into data-related issues.
Collaborate with Data Science and Business Intelligence colleagues worldwide to share key learnings, leverage ideas and solutions, and propagate best practices. You will lead projects that include other team members and participate in projects led by other team members.
Apply change-management tools, including training, communication, and documentation, to manage upgrades, changes, and data migrations.
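For orientation, the sketch below shows a minimal PySpark/Spark SQL pipeline of the kind described above: raw files are read from ADLS, aggregated with Spark SQL, and written out as a Delta table that forms part of the Lakehouse layer. The storage path, column names, and table name are hypothetical placeholders, not references to an actual environment.

```python
from pyspark.sql import SparkSession

# Hypothetical ADLS path and target table, for illustration only.
RAW_PATH = "abfss://raw@examplestorage.dfs.core.windows.net/sales/"
CURATED_TABLE = "curated.sales_daily"

spark = SparkSession.builder.appName("sales_daily_pipeline").getOrCreate()

# Ingest raw CSV files from the landing zone into a DataFrame.
raw_df = (
    spark.read
    .option("header", "true")
    .option("inferSchema", "true")
    .csv(RAW_PATH)
)

# Expose the raw data as a temporary view so the transformation
# can be written in Spark SQL (column names are assumed).
raw_df.createOrReplaceTempView("raw_sales")

daily_df = spark.sql(
    """
    SELECT order_date,
           region,
           SUM(amount) AS total_amount,
           COUNT(*)    AS order_count
    FROM raw_sales
    GROUP BY order_date, region
    """
)

# Persist the result as a Delta table, the storage format behind the Lakehouse layer.
spark.sql("CREATE DATABASE IF NOT EXISTS curated")
(
    daily_df.write
    .format("delta")
    .mode("overwrite")
    .saveAsTable(CURATED_TABLE)
)
```

In Databricks, the curated Delta table then serves downstream Spark SQL, BI, and Data Science workloads; an equivalent flow can be orchestrated and scheduled from ADF.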
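The second sketch illustrates the data preparation steps named above: cleaning, normalization, type conversion, and deduplication. The sample rows and business key are invented for the example.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("data_preparation_example").getOrCreate()

# Hypothetical customer extract with inconsistent formatting and a duplicate row.
customers = spark.createDataFrame(
    [
        ("1001", "  alice smith ", "de", "2023-01-15"),
        ("1001", "Alice Smith",    "DE", "2023-01-15"),
        ("1002", "BOB JONES",      "us", "2023-02-03"),
    ],
    ["customer_id", "name", "country", "signup_date"],
)

prepared = (
    customers
    # Cleaning: strip stray whitespace from free-text fields.
    .withColumn("name", F.trim(F.col("name")))
    # Normalization: standardize casing for names and country codes.
    .withColumn("name", F.initcap(F.col("name")))
    .withColumn("country", F.upper(F.col("country")))
    # Type conversion: cast strings to proper numeric and date types.
    .withColumn("customer_id", F.col("customer_id").cast("int"))
    .withColumn("signup_date", F.to_date(F.col("signup_date"), "yyyy-MM-dd"))
    # Deduplication: keep one row per business key.
    .dropDuplicates(["customer_id"])
)

prepared.show()
```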
Required Qualifications
Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
Proven experience in data engineering, data warehousing, or data integration.
Proficiency in programming languages such as Python, Java, or Scala.
Strong understanding of SQL and database management systems.
Experience with big data technologies like Hadoop, Spark, or NoSQL databases.
Hands-on experience with cloud platforms such as AWS, Azure, or GCP.
Ability to design and optimize data models for diverse data types.
Expertise in building and maintaining ETL pipelines and data workflows.
Strong problem-solving skills and attention to detail.
Excellent communication and collaboration skills to work effectively with cross-functional teams.
Familiarity with data governance, security, and compliance best practices.