Responsibilities:
- Develop and maintain robust data pipelines to ingest, process, and transform large volumes of structured and unstructured data.
- Collaborate with cross-functional teams to understand data requirements and implement efficient data solutions.
- Design and implement data models for optimal storage and retrieval, ensuring data quality and integrity.
- Use Python to build scalable, maintainable data applications and automation scripts.
- Implement and optimize ETL processes that move data reliably between systems (a minimal Python sketch of such a pipeline follows this list).
- Work with cloud-based technologies, particularly AWS, to build scalable and cost-effective data solutions.
- Collaborate with data scientists, analysts, and other stakeholders to support their data infrastructure needs.
- Monitor and troubleshoot data pipeline issues, ensuring data availability and reliability.
- Implement best practices for data security, privacy, and compliance.
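By way of illustration, a day-to-day pipeline task in this role might look like the minimal sketch below: extract a CSV from S3, apply a simple transform, and load the result back as Parquet. The bucket and key names are hypothetical, and the sketch assumes boto3, pandas, and pyarrow are installed; it shows the pattern, not a prescribed implementation.

```python
"""Minimal ETL sketch: pull a CSV from S3, clean it, write it back as Parquet.

Bucket and key names are hypothetical placeholders, not part of this posting.
Assumes boto3, pandas, and pyarrow are available.
"""
import io

import boto3
import pandas as pd

s3 = boto3.client("s3")


def run_pipeline(bucket: str, raw_key: str, clean_key: str) -> None:
    # Extract: download the raw object into memory.
    obj = s3.get_object(Bucket=bucket, Key=raw_key)
    df = pd.read_csv(io.BytesIO(obj["Body"].read()))

    # Transform: drop duplicate rows and normalize column names.
    df = df.drop_duplicates()
    df.columns = [c.strip().lower().replace(" ", "_") for c in df.columns]

    # Load: write the cleaned data back to S3 as Parquet for columnar reads.
    buf = io.BytesIO()
    df.to_parquet(buf, index=False)
    s3.put_object(Bucket=bucket, Key=clean_key, Body=buf.getvalue())


if __name__ == "__main__":
    run_pipeline("example-data-lake", "raw/events.csv", "clean/events.parquet")
```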
Skills and Qualifications:
- Bachelor's degree in Computer Science, Engineering, or a related field.
- Proven experience as a Data Engineer with a focus on building and maintaining data pipelines.
- Strong programming skills in Python, including experience with data manipulation and analysis libraries (e.g., pandas, NumPy).
- Hands-on experience with AWS services such as S3, Glue, Redshift, and Lambda.
- Proficiency in SQL and experience working with both relational and NoSQL databases (a toy SQL example follows this list).
- Familiarity with data modeling concepts and best practices.
- Experience with version control systems (e.g., Git) and continuous integration/deployment.
- Strong problem-solving and troubleshooting skills.
- Excellent communication and collaboration skills.
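As a small, self-contained illustration of the SQL proficiency above, the sketch below runs a typical data-quality check. It uses Python's standard-library sqlite3 driver so it runs anywhere; in practice the same query would target Redshift or another warehouse, and the table and column names are invented for the example.

```python
"""Toy SQL data-quality check, self-contained via the stdlib sqlite3 driver.

The orders table and its columns are invented for this example; a production
check would run the same kind of query against a warehouse such as Redshift.
"""
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL, customer_id INTEGER)"
)
conn.executemany(
    "INSERT INTO orders (amount, customer_id) VALUES (?, ?)",
    [(19.99, 1), (5.00, 2), (None, 2)],  # one row deliberately violates NOT NULL expectations
)

# A typical pipeline validation query: count rows with a missing amount.
null_amounts = conn.execute(
    "SELECT COUNT(*) FROM orders WHERE amount IS NULL"
).fetchone()[0]
print(f"rows with missing amount: {null_amounts}")
```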