Description

What You’ll Do

Design and implement data monitoring pipelines to proactively identify and resolve data quality issues before they impact downstream products

Collaborate with stakeholders to define requirements, develop metrics for data pipeline quality, negotiate data quality SLAs on behalf of downstream data product owners, and create monitoring solutions using Python, Spark, and Airflow

Serve as a technical lead for our data observability and federated query systems, playing a key role in democratizing data access at scale while upholding high data quality standards

Develop new methods to improve access to trustworthy data, accelerating the value delivered by the product data team

What You’ll Bring

Bachelor’s Degree in Computer Science, Engineering, or a related STEM field, with a focus on data processing

A Master’s Degree counts as 2 years of experience

A Ph.D. counts as 5 years of experience

3+ years of experience in Data Engineering or a similar role, with a proven track record of working with big data pipelines and analytics

2+ years of hands-on experience with SQL in scalable data warehouses (e.g., BigQuery, Snowflake)

Experience implementing data engineering best practices and optimizing data pipeline performance and scalability

Proficiency in cloud technologies, preferably GCP and/or AWS

Experience with Apache Airflow

2+ years of experience with Apache Spark

2+ years of post-degree coding experience in Python, Java, or an equivalent programming language

Deep understanding of distributed systems and effective data management

