Required Qualifications
· Familiarity with industry technology stacks for data management, including data ingestion, capture, processing, and curation
· ETL development experience with a strong SQL background, including the ability to analyze large data sets, identify trends and issues, and create structured outputs
· Experience building high-performance data processing frameworks leveraging Google Cloud Platform and Teradata
· Experience in building data pipelines supporting both batch and real-time streams to enable data collection, storage, processing, transformation and aggregation.
· Experience utilizing GCP services such as BigQuery, Composer, Dataflow, Pub/Sub, and Cloud Monitoring
· Experience performing ETL and data engineering work leveraging multiple Google Cloud components such as Dataflow, Dataproc, and BigQuery
· Experience with scheduling tools such as Airflow and Cloud Composer
· Experience with JIRA or other project management tools
· Experience with CI/CD automation pipelines for automated deployment and testing
· Experience with bash shell scripting, UNIX utilities, and UNIX commands
Nice-to-Have Qualifications
· Strong understanding of Kubernetes and Docker containers and experience deploying GCP services is a plus
· Knowledge of Scrum/Agile development methodologies is a plus
· Any experience with Spark, PySpark, or Kafka is a plus
· Data analysis and data mapping skills are a plus
· Knowledge of data manipulation with JSON and XML is a plus
Technical Skills
GCP Services: Dataflow, BigQuery, Cloud Storage, Dataproc, Airflow, Composer, Pub/Sub, and Memorystore/Redis
Programming languages: Java, Python
Databases: Teradata, BigQuery/Bigtable
Education: Any graduate degree