Job Description
As a Lead Data Engineer, you will oversee the design, development, implementation, and maintenance of the organization's data infrastructure and pipelines. You will lead a team of data engineers and collaborate with cross-functional teams to ensure the efficient and effective management of data assets. This role requires a strong understanding of data engineering principles, excellent technical skills, and the ability to provide technical leadership.
Responsibilities:
- Lead and manage a team of data engineers, providing guidance, mentoring, and support to ensure successful project delivery.
- Collaborate with stakeholders, including data scientists, analysts, and product managers, to understand data requirements and translate them into technical solutions.
- Design and develop scalable, reliable, and efficient data pipelines and ETL processes to extract, transform, and load data from various sources into data warehouses or data lakes.
- Implement and maintain data governance and data quality frameworks to ensure data accuracy, integrity, and compliance with privacy regulations.
- Optimize data infrastructure and architecture to improve data processing performance and reduce latency.
- Identify and evaluate new technologies, tools, and frameworks to enhance data engineering capabilities and drive innovation.
- Collaborate with infrastructure and operations teams to ensure the availability, reliability, and security of data platforms.
- Define and enforce coding standards, best practices, and development methodologies to ensure high-quality code and maintainable solutions.
- Conduct performance reviews, provide feedback, and support the professional growth and development of team members.
- Stay up-to-date with industry trends, emerging technologies, and advancements in data engineering to drive continuous improvement and innovation within the team.
Qualifications:
- Bachelor's or Master's degree in Computer Science, Information Technology, or a related field, preferably from a Tier-1 college, institute, or university.
- Proven experience in data engineering, with a focus on designing and implementing data pipelines, ETL processes, and data warehousing solutions.
- Good experience with Azure Data Factory, Azure Data Lake, Azure Synapse, and Databricks, including building data models and data marts for use in Power BI.
- Skilled in end-to-end ETL/ELT processes within big data ecosystems.
- Strong hands-on experience writing efficient Python/PySpark code to manipulate data and draw insights, using libraries such as Pandas, NumPy, Matplotlib, Modin, Dask, and scikit-learn.
- A successful track record of manipulating, processing, and extracting value from large, disconnected datasets stored in systems such as MariaDB, MongoDB, and Elasticsearch.
- Familiarity with data extraction using third-party APIs.
- Proficiency in working with relational databases (SQL) and NoSQL databases.
- Excellent problem-solving and analytical skills with the ability to translate complex business requirements into technical solutions.