Key Responsibilities:
Data Architecture Design:
Design scalable, secure, and efficient data architectures on Google Cloud Platform (GCP) to support current and future business needs.
Create comprehensive data models, workflows, and frameworks to enable high-quality data storage, access, and processing on GCP.
Collaborate with technical teams to align the architecture with GCP best practices, ensuring the efficient use of GCP services such as BigQuery, Cloud Storage, Dataproc, and Pub/Sub.
Discovery & Assessment:
Conduct a thorough assessment of the existing on-premises or cloud data infrastructure to identify gaps, inefficiencies, and opportunities for migration to GCP.
Lead data discovery sessions with stakeholders to gather business requirements and map them to appropriate GCP solutions.
Identify data quality, data lineage, and metadata management needs during the discovery phase and incorporate them into the proposed architecture.
Data Migration Strategy:
Develop a data migration strategy that covers moving databases, data warehouses, and data lakes from legacy systems to GCP.
Provide expertise in designing ETL/ELT processes using tools such as Cloud Dataflow, Dataproc, or other GCP data integration services.
Ensure data integrity, security, and minimal downtime during the migration process, leveraging GCP’s native tools for migration and replication.
Data Governance & Security:
Establish a data governance framework that ensures the security, privacy, and compliance of data across the GCP environment.
Implement GCP data security best practices, including encryption, Identity and Access Management (IAM), and data masking to protect sensitive information.
Ensure that the architecture complies with applicable data protection regulations such as GDPR or CCPA.
Collaboration & Stakeholder Engagement:
Collaborate with cross-functional teams, including cloud architects, data engineers, and business stakeholders, to ensure alignment between business needs and data architecture.
Act as the primary point of contact for data-related discussions, providing guidance on data governance, performance optimization, and scalability on GCP.
Participate in workshops and meetings to present architecture designs, data migration plans, and GCP recommendations.
Optimization & Performance:
Optimize the architecture for high availability, scalability, and performance, leveraging GCP services such as BigQuery, Cloud Spanner, and Bigtable.
Design data models that are optimized for both transactional (OLTP) and analytical (OLAP) workloads.
Implement monitoring and alerting mechanisms to track data pipeline health and performance, using tools such as Cloud Monitoring (formerly Stackdriver).
Documentation & Reporting:
Create detailed documentation of the data architecture, including data flow diagrams, ER diagrams, metadata definitions, and security models.
Prepare executive-level reports and presentations on the progress of data discovery, architecture design, and migration status.
Maintain detailed records of all data-related decisions made during the GCP discovery engagement for future reference.
Required Qualifications & Skills:
Experience:
7+ years of experience in data architecture or related roles, with a focus on cloud environments.
Proven experience in designing and implementing data architectures on Google Cloud Platform (GCP) or other cloud platforms (AWS, Azure).
Hands-on experience in data migration projects, data modeling, and database design in cloud environments.
Experience working with large-scale data warehouses, data lakes, and complex ETL/ELT processes.
Technical Expertise:
Deep understanding of GCP services related to data processing and storage, such as BigQuery, Cloud Spanner, Bigtable, Cloud SQL, Cloud Storage, Dataproc, and Dataflow.
Proficiency in SQL, NoSQL databases, and data modeling best practices.
Experience with data integration tools and frameworks (e.g., Apache Beam, Dataflow, Dataprep, or Informatica).
Data Governance & Security:
Strong knowledge of data governance principles, data lineage, metadata management, and data cataloging on cloud platforms.
Expertise in implementing data security and compliance measures, including encryption, IAM, and data privacy regulations.
Familiarity with relevant regulatory frameworks (GDPR, CCPA, or PCI-DSS) and their implications for cloud data architecture.
Soft Skills:
Strong problem-solving skills with the ability to translate business requirements into technical data solutions.
Excellent communication and presentation skills, with the ability to interact with both technical and non-technical stakeholders.
Ability to lead and mentor data engineering teams on best practices for GCP data architecture and governance.
Certifications (Preferred):
Google Professional Data Engineer certification.
Google Professional Cloud Architect or other GCP certifications are advantageous.
Additional certifications in data management, such as Certified Data Management Professional (CDMP) or equivalent, are a plus.
Education:
Bachelor’s degree in Computer Science, Data Science, Information Systems, or a related field.
Master’s degree, or equivalent experience in data architecture and cloud environments, preferred.