Key Responsibilities:
Data Integration:
Implement and maintain data synchronization between on-premises Oracle databases and Snowflake using Kafka and CDC tools.
Support Data Modeling:
Assist in developing and optimizing the data model for Snowflake, ensuring it supports our analytics and reporting requirements.
Data Pipeline Development:
Design, build, and manage data pipelines for the ETL process, using Airflow for orchestration and Python for scripting, to transform raw data into a format suitable for our new Snowflake data model (see the pipeline sketch after this section).
Reporting Support:
Collaborate with the data architect to ensure the data within Snowflake is structured in a way that supports efficient and insightful reporting.
Technical Documentation:
Create and maintain comprehensive documentation of data pipelines, ETL processes, and data models to ensure best practices are followed and knowledge is shared within the team.
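To give a flavor of the pipeline work described above, the sketch below outlines a minimal Airflow 2.x DAG that extracts raw data, transforms it with Python, and loads it into Snowflake. It is an illustration only: the DAG id, task names, file path, and target model are hypothetical placeholders, not part of our actual codebase.

# Minimal sketch of an Airflow 2.x ETL DAG; the dag_id, task names, and paths
# below are illustrative placeholders.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract(**context):
    # Placeholder: pull a raw extract (e.g. CDC output landed in a stage or file).
    return "/tmp/raw_orders.csv"


def transform(**context):
    # Placeholder: reshape the raw extract to fit the Snowflake data model.
    raw_path = context["ti"].xcom_pull(task_ids="extract")
    print(f"transforming {raw_path}")


def load(**context):
    # Placeholder: load the transformed data into Snowflake, e.g. via
    # snowflake-connector-python or a COPY INTO statement.
    print("loading into Snowflake")


with DAG(
    dag_id="oracle_to_snowflake_etl",
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    transform_task = PythonOperator(task_id="transform", python_callable=transform)
    load_task = PythonOperator(task_id="load", python_callable=load)

    extract_task >> transform_task >> load_task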
Tools and Skillsets:
Data Engineering: Proven track record of developing and maintaining data pipelines and data integration projects.
Databases: Strong experience with Oracle, Snowflake, and Databricks.
Data Integration Tools: Proficiency in using Kafka and CDC tools for data ingestion and synchronization (see the CDC sketch at the end of this list).
Programming: Advanced proficiency in Python and SQL for data processing tasks.
Data Modeling: Understanding of data modeling principles and experience with data warehousing solutions.
Cloud Platforms: Knowledge of cloud infrastructure and services, preferably Azure, as it relates to Snowflake and Databricks integration.
Collaboration Tools: Experience with version control systems (like Git) and collaboration platforms.
CI/CD Implementation: Experience using CI/CD tools to automate the deployment of data pipelines and infrastructure changes, ensuring high-quality data processing with minimal manual intervention.
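For illustration only, here is a minimal sketch of what Kafka/CDC-based synchronization into Snowflake can look like, assuming Debezium-style JSON change events and the kafka-python and snowflake-connector-python libraries; the topic name, target table, columns, and connection settings are hypothetical placeholders.

import json

from kafka import KafkaConsumer  # kafka-python
import snowflake.connector

TOPIC = "oracle.sales.orders"        # hypothetical CDC topic
TARGET_TABLE = "ANALYTICS.ORDERS"    # hypothetical target table

consumer = KafkaConsumer(
    TOPIC,
    bootstrap_servers="kafka:9092",
    value_deserializer=lambda m: json.loads(m.decode("utf-8")),
    auto_offset_reset="earliest",
    enable_auto_commit=False,
)

conn = snowflake.connector.connect(  # placeholder connection settings
    account="my_account",
    user="etl_user",
    password="...",
    warehouse="ETL_WH",
    database="ANALYTICS",
    schema="PUBLIC",
)
cur = conn.cursor()

for message in consumer:
    event = message.value.get("payload", {})  # Debezium-style envelope
    op = event.get("op")                      # "c"/"u"/"r" = upsert, "d" = delete
    if op in ("c", "u", "r"):
        row = event["after"]
        cur.execute(
            f"MERGE INTO {TARGET_TABLE} t "
            "USING (SELECT %(id)s AS id, %(amount)s AS amount) s ON t.id = s.id "
            "WHEN MATCHED THEN UPDATE SET t.amount = s.amount "
            "WHEN NOT MATCHED THEN INSERT (id, amount) VALUES (s.id, s.amount)",
            {"id": row["id"], "amount": row["amount"]},
        )
    elif op == "d":
        cur.execute(
            f"DELETE FROM {TARGET_TABLE} WHERE id = %(id)s",
            {"id": event["before"]["id"]},
        )
    consumer.commit()  # commit the Kafka offset after the change is applied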
Education:
Bachelor's degree.

If you are interested, kindly send me a DM and I will connect with you.