Job Overview:
We are seeking an experienced Data Engineer with expertise in Azure and Databricks, coupled with a solid understanding of data warehousing concepts, SQL, and Python. As a Data Engineer, you will play a crucial role in designing and building the data infrastructure behind our data-driven initiatives.
Responsibilities:
- Design, develop, and maintain robust data pipelines and infrastructure on the Azure cloud platform, leveraging Databricks for large-scale data processing and analytics.
- Collaborate with Data Architects to design and evolve the data warehouse architecture in line with changing business requirements.
- Develop and optimize Extract, Transform, Load (ETL) processes to integrate data from various sources into the data warehouse. Ensure data quality, consistency, and reliability throughout the ETL process.
- Identify and address performance bottlenecks in data pipelines and queries, and tune data processing and storage for performance and scalability.
- Proactively identify and resolve issues with data pipelines and infrastructure, and perform routine maintenance tasks as required to ensure the stability and reliability of data systems.
- Work closely with cross-functional teams including Data Scientists, Business Analysts, and Software Engineers to understand data requirements and deliver effective data solutions.
- Document data infrastructure, processes, and workflows. Establish and promote best practices for data engineering within the project.
Requirements:
- Bachelor's degree or higher in Computer Science, Engineering, or a related field.
- Proven experience as a Data Engineer or in a similar role, with a focus on the Azure cloud platform and Databricks.
- Experience developing and maintaining scalable data pipelines using Python and Apache Spark.
- Experience with Python programming for data processing and scripting.
- Strong knowledge of data warehousing concepts and best practices.
- Proficiency in SQL for data querying, stored procedures, and query optimization.
- Hands-on experience with Azure services such as Azure Data Factory, Azure SQL Database, Azure Databricks, and Microsoft Fabric.
- Familiarity with dimensional data modeling (facts, dimensions, and slowly changing dimensions).
- Excellent problem-solving and analytical skills.
- Strong communication and collaboration skills, with the ability to work effectively in a cross-functional team environment.
- Experience with agile development methodologies is a plus.
Preferred Qualifications:
- Certification in Azure, Databricks, or Snowflake.
- Experience with other cloud platforms such as AWS or Google Cloud Platform, and with Snowflake.
- Knowledge of big data technologies such as Hadoop.