Key Responsibilities:
- Build, test, and maintain scalable ETL pipelines using Databricks and Apache Spark.
- Contribute to the development of data models and optimize queries for performance and efficiency.
- Participate in the design and implementation of data governance practices, ensuring data quality and consistency across pipelines.
- Assist in the implementation and maintenance of the medallion architecture (bronze, silver, gold layers) to streamline data processing and reporting; see the sketch after this list.
- Collaborate with cross-functional teams to integrate data management and reporting processes into broader business initiatives.
- Troubleshoot and optimize Databricks workflows to enhance performance and reduce costs.
- Maintain clear documentation of ETL pipelines, data models, and governance procedures to ensure transparency and scalability.
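For context on the medallion responsibility above, here is a minimal, illustrative PySpark sketch of a bronze/silver/gold flow of the kind this role would build on Databricks. It is a sketch only: the table names, paths, and columns (`/raw/orders/`, `bronze.orders`, `order_id`, `amount`, etc.) are hypothetical placeholders, not part of any actual codebase for this position.

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("medallion-sketch").getOrCreate()

# Bronze: land raw data as-is, tagging each record with ingestion time.
bronze = (
    spark.read.json("/raw/orders/")            # hypothetical landing path
    .withColumn("_ingested_at", F.current_timestamp())
)
bronze.write.mode("append").saveAsTable("bronze.orders")

# Silver: clean and conform — deduplicate and enforce types.
silver = (
    spark.table("bronze.orders")
    .dropDuplicates(["order_id"])
    .withColumn("order_ts", F.to_timestamp("order_ts"))
)
silver.write.mode("overwrite").saveAsTable("silver.orders")

# Gold: business-level aggregate ready for reporting.
gold = (
    spark.table("silver.orders")
    .groupBy("customer_id")
    .agg(F.sum("amount").alias("total_spend"))
)
gold.write.mode("overwrite").saveAsTable("gold.customer_spend")
```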
Skills Required:
- Experience in Databricks for building ETL pipelines.
- Experience in building data models and optimizing queries.
- Experience with data management and reporting processes.
- Experience with medallion architecture and data governance.
- Proficiency in SQL and Python for data manipulation and analysis.
- Strong problem-solving skills with a keen eye for detail.
- Databricks certification preferred.
- Bachelor's degree in Computer Science.