Job Description:
We are seeking a talented Data Engineer to join our team, with a focus on building, maintaining, and optimizing data pipelines and cloud-based applications. The ideal candidate will have strong expertise in Python, Apache Spark, Azure Synapse Analytics, Azure Kubernetes Service (AKS), and CI/CD practices.
Key Responsibilities:
Develop, test, and deploy scalable data pipelines using Apache Spark and Python to process large-scale data sets.
Design and implement Azure Synapse Analytics solutions for data warehousing and real-time analytics.
Manage and orchestrate containerized applications using Azure Kubernetes Service (AKS).
Work closely with DevOps teams to automate deployment processes through robust CI/CD pipelines.
Ensure performance, reliability, and security of the data platform, optimizing solutions for high availability and low latency.
Collaborate with cross-functional teams to define technical requirements and improve data architecture.
Required Skills:
Strong experience in Python for data engineering and scripting tasks.
Proficiency in Apache Spark for distributed data processing.
Hands-on experience with Azure Synapse Analytics for building and managing data integration, ETL processes, and real-time analytics.
Experience with Azure Kubernetes Service (AKS) for container orchestration and management.
Experience with CI/CD pipelines and version control systems (e.g., Azure DevOps, GitHub, Jenkins).
Understanding of cloud-based architectures and deployment models.
Preferred Qualifications:
Experience with other Azure services such as Azure Data Lake, Azure Functions, and Azure Blob Storage.
Familiarity with infrastructure as code (IaC) tools like Terraform or ARM templates.
Knowledge of SQL and database performance tuning.
Strong problem-solving skills and ability to work in a collaborative team environment.
Bachelor's degree