Steer Clear:
Candidates with less than 3 years of experience on GCP
Candidates whose only cloud experience is AWS or Azure
Data Engineering Requirement
Programming
-> SQL
-> Python
-> Java (Optional)
GCP
-> BigQuery
-> Dataflow (Apache Beam)
-> Cloud Composer (Airflow)
-> GCS
-> GKE
-> Dataform (optional alternative to dbt)
Tools
-> dbt / Dataform (on GCP)
-> Test Automation on Data
Misc. (Good to Have):
-> Docker
-> Kubernetes
-> Microservices
Experience:
Data Platform Building (Mandatory)
-> Ingestion/Migration
-> Transformation/ETL
-> Analysis (Optional)
-> Visualization
(SAP BOBJ/Looker/Power BI)
-> Governance
(Unity Catalog (tool in Databricks)/Collibra (tool)/Data Catalog (service in GCP))
-> Security
(GCP services: IAM, KMS, DLP. Techniques: ACLs, row/column-level security in BigQuery)
Deployment
-> CI/CD
-> GitHub
-> Cloud Build (Service in GCP) + Terraform
Requirements:
- 10-15+ years (for senior) of proven experience in modern cloud data engineering, broad exposure to the wider data landscape, and solid software engineering experience.
- Prior experience architecting and building successful enterprise-scale data platforms in a greenfield environment is a must.
- Proficiency in building end-to-end data platforms and data services in GCP is a must.
- Proficiency in tools and technologies: BigQuery, Cloud Functions, Cloud Run, Dataform, Dataflow, Dataproc, SQL, Python, Airflow, Pub/Sub. SQL and Python are a must.
- Experience with microservices architectures: Kubernetes, Docker, and Cloud Run.
- Experience building semantic layers.
- Proficiency in architecting, designing, and developing batch and real-time streaming infrastructure and workloads.
- Solid experience with architecting and implementing metadata management including data catalogues, data lineage, data quality and data observability for big data workflows.
- Hands-on experience with GCP ecosystem and data lakehouse architectures.
- Strong understanding of data modeling, data architecture, and data governance principles.
- Strong experience with DataOps principles and test automation.
- Strong experience with observability tooling: Grafana, Datadog.