Key Responsibilities:
Develop and optimize T-SQL queries, stored procedures, and functions to support data extraction, transformation, and loading (ETL) processes.
Design, implement, and maintain data pipelines using Databricks, ensuring efficient processing and analysis of large volumes of data.
Automate data-related tasks and processes using Python or Java, improving efficiency and scalability.
Build and maintain RESTful APIs for data access and integration, ensuring security, scalability, and performance.
Monitor data pipelines and integrations, proactively identifying and resolving issues to minimize downtime and ensure data integrity.
Implement data governance and quality control measures to ensure the accuracy, completeness, and consistency of data.
Document data models, processes, and procedures, ensuring knowledge transfer and compliance with best practices.
Stay up-to-date with emerging technologies and industry trends, evaluating new tools and methodologies to improve data engineering practices.
Qualifications:
Bachelor's degree in Computer Science, Information Technology, or related field; Master's degree preferred.
8+ years of experience in data engineering, with a strong focus on T-SQL (SQL Server) development.
Proficiency in Databricks for data processing and analytics, including Spark SQL and the DataFrame API.
Strong understanding of API design principles and experience in developing RESTful APIs using frameworks like Flask or Spring Boot.
Proficiency in Python or Java for automation and scripting tasks.
Solid understanding of data warehousing concepts, dimensional modeling, and ETL best practices.
Experience with cloud platforms such as AWS, Azure, or GCP, including services like S3, Redshift, or BigQuery.
Excellent problem-solving skills and ability to troubleshoot complex data-related issues.
Strong communication and collaboration skills, with the ability to work effectively in a team environment.