Description

Job Summary: We are looking for a talented and motivated Data Engineer to join our team. The ideal candidate will have expertise in building, maintaining, and optimizing data pipelines and workflows. You will collaborate with cross-functional teams to deliver high-quality data solutions in an environment that leverages modern cloud architectures, big data technologies, and containerized environments.

Roles & Responsibilities:

• Design, develop, and maintain scalable ETL data pipelines using Dagster, NiFi, and Apache Spark to ensure reliable and timely data delivery.
• Implement data transformation workflows using dbt to ensure efficient data modelling and high-performance query execution.
• Leverage Python and libraries such as Pandas and NumPy for data manipulation, analysis, and building custom data transformations.
• Manage and monitor infrastructure with Prometheus and Grafana to ensure high availability and performance of data pipelines.
• Containerize applications and manage them using Docker and Kubernetes in a production environment.
• Architect, design, and manage cloud infrastructure on Google Cloud Platform (GCP), ensuring security, scalability, and cost optimization.
• Write complex SQL/T-SQL queries, stored procedures, and database transformation jobs to support business needs.
• Conduct complex ad-hoc data analysis for clients, ensuring data accuracy and actionable insights.
• Collaborate with data scientists, analysts, and business stakeholders to understand data needs and deliver solutions that drive decision-making.

Required Skills & Qualifications:

• Proficiency in Python and libraries such as Pandas and NumPy for data analysis and manipulation
• Experience with Dagster and dbt for orchestrating and managing data workflows
• Familiarity with Apache NiFi and Apache Spark for ETL processes and big data transformations
• Experience with Python optimisation techniques such as threading and multiprocessing is preferred
• Strong knowledge of Prometheus and Grafana for infrastructure monitoring and performance tuning
• Hands-on experience with Docker and Kubernetes for container orchestration in a production environment
• Cloud architecture experience, preferably with Google Cloud Platform (GCP), including services such as BigQuery, Dataflow, and Cloud Storage
• Expertise in database transformations, stored procedures, and creating and managing schedules in production environments
• Experience writing and optimizing complex SQL queries for relational databases (e.g., SQL Server, PostgreSQL, MySQL)
• Ability to handle ad-hoc client analysis and provide meaningful insights from data
• Bachelor’s or Master’s degree in Computer Science, Data Engineering, or a related field
• 2-3 years of experience as a Data Engineer or in a similar role

Education

Bachelor's Degree