The candidate needs expertise in developing and managing data pipelines, particularly in the Databricks Delta Lake format. The candidate must also have experience in
data modelling and end-user querying using Amazon Redshift or Snowflake, Amazon Athena, and Presto, as well as orchestration experience using Airflow.
What will you do?
Map cross-functional business processes and identify data locations.
Define data models based on query/export interfaces and reporting requirements.
Develop data pipelines to consolidate, transform, and ensure data quality.
Collaborate with Business Analysts to address key business questions.
Review and evaluate the organization’s current and desired state of Data Governance.
Develop data products aligned with robust data governance principles.
Establish a medallion-style definition for data in the data lake, enabling easy identification of authoritative data sources for analytics.
Review, edit, and occasionally author new data pipelines.
Implement unit tests for newly developed pipelines.
Provide guidance on continuous integration and continuous deployment (CI/CD) processes related to data pipeline development.
Act as a consultant to the business for ad-hoc analytic queries executed via Looker.
Enable data-driven decision-making by providing expert guidance on data consumption and newly developed capabilities like data products.
Establish gold-standard data sources.
Implement data cleanliness pipelines and processes.
Formalize model review and ownership processes.
Publish access and interfaces to gold data through formal data products.
What are we looking for?
Bachelor's degree in Computer Science, Engineering, or related field.
4+ years of experience as a Data Architect with a focus on data governance.
Strong proficiency in mapping business processes and defining data models.
Expertise in developing and managing data pipelines, particularly in the Databricks Delta Lake format.
Experience with Spark and Airflow for orchestration.
Familiarity with Looker for analytics.
Knowledge of best practices in unit testing and CI/CD for data pipelines.
Excellent communication and collaboration skills.
You will be preferred if you have:
AWS Data Analytics Specialty Certification
Experience with Agile development methodology
Bachelor's degree in Computer Science