Description

Data Engineer (Azure Databricks):

Responsibilities:

  • Design, develop, and maintain data pipelines using Azure Databricks for data ingestion, transformation, and loading into various data stores (e.g., Azure Data Lake Storage, Azure SQL Database, Azure Synapse Analytics).
  • Implement data processing solutions using Apache Spark on Azure Databricks, leveraging its capabilities for distributed computing, machine learning, and real-time data processing.
  • Develop and optimize data models and schemas for efficient data storage and retrieval.
  • Collaborate with data scientists, data analysts, and other stakeholders to understand their data needs and translate them into technical solutions.
  • Work with the team to define and implement data governance policies and best practices for data quality and security.
  • Monitor and troubleshoot data pipelines and applications on Azure Databricks, ensuring optimal performance and reliability.
  • Stay up to date with the latest advancements in Azure Databricks and related technologies.
  • Contribute to building a strong data culture within the organization.

Required Skills:

  • 10+ years of data engineering experience.
  • Strong proficiency in Python or Scala for data processing and manipulation in Azure Databricks.
  • Experience working with Apache Spark, including its core concepts, data structures, and APIs.
  • Familiarity with data ingestion and transformation techniques (e.g., ETL, ELT).
  • Experience with data warehousing and data lake concepts.
  • Hands-on experience with Azure Databricks, including workspace management, cluster configuration, and job scheduling.
  • Knowledge of Azure storage services such as Azure Data Lake Storage, Azure Blob Storage, and Azure SQL Database.
  • Understanding of data security principles and best practices.
  • Excellent problem-solving and analytical skills.
  • Strong communication and collaboration skills.

Education

Graduate in any discipline