Description

Roles and Responsibilities:

  • Hands-on experience with PySpark, Redshift (SQL), and Airflow at a minimum
  • Strong hands-on command of the required tech skills and the flexible, right attitude to take on the lead role
  • Should be able to design and document data models at various levels
  • Working closely with stakeholders.
  • Building highly scalable, robust, and fault-tolerant systems.
  • Knowledge of the Hadoop ecosystem and the frameworks within it: HDFS, YARN, MapReduce, Apache Pig, Hive, Flume, Sqoop, ZooKeeper, Oozie, Impala, and Kafka
  • Must have experience with SQL-based technologies (e.g., MySQL, Oracle DB) and NoSQL technologies (e.g., Cassandra, MongoDB)
  • Should have Python, Scala, or Java programming skills
  • Discovering data acquisition opportunities
  • Finding ways and methods to extract value from existing data.
  • Improving the data quality, reliability, and efficiency of individual components and of the complete system.
  • A problem-solving mindset, working in an agile environment.

Education

Any Graduate