Key Responsibilities:
Data Pipeline Development: Design, develop, and optimize ETL pipelines to ensure reliable and efficient data flow across systems.
Data Warehousing: Architect, build, and maintain data warehouse solutions to support analytical and reporting needs.
Data Quality & Integrity: Implement data quality checks and cleaning processes to ensure accuracy, consistency, and reliability of data.
Collaboration: Work closely with data analysts, data scientists, and cross-functional teams to understand data needs and deliver solutions.
Optimization: Improve data infrastructure and tune performance to speed up data processing and querying.
Documentation: Create and maintain documentation of data architecture, pipelines, and workflows for internal use and compliance.
Qualifications:
Bachelor’s degree in Computer Science, Engineering, or a related field.
10+ years of experience in data engineering, data warehousing, or related roles.
Proficiency in SQL and experience with database management systems (e.g., PostgreSQL, MySQL, Redshift).
Experience with ETL tools (e.g., Apache Airflow, Talend) and big data technologies (e.g., Hadoop, Spark).
Hands-on experience with cloud platforms (e.g., AWS, Azure, GCP) and cloud data storage solutions.
Programming skills in Python, Java, or Scala.
Strong understanding of data modeling, data architecture, and data governance principles.
Preferred Skills:
Familiarity with data visualization tools (e.g., Tableau, Power BI).
Knowledge of NoSQL databases (e.g., MongoDB, Cassandra).