– 4+ years of development experience with the Hadoop ecosystem (Spark, Scala, Oozie, Pig, Hive, HDFS, MapReduce) and/or NoSQL technologies such as Cassandra or MongoDB, including experience with real-time and stream-processing systems. POC or training experience will not be considered.
– Excellent knowledge of Core Java, UNIX shell scripting, or PL/SQL stored procedures is required.
– Should have knowledge of different Hadoop distributions such as CDH 4/5, Hortonworks, MapR, and IBM BigInsights.
– Strong foundational knowledge of and experience with a range of big data components such as Hadoop/YARN, HDFS, MapReduce, Oozie, Falcon, Pig, Hive, ZooKeeper, Sqoop, and Flume.
– Develop MapReduce programs or Hadoop Streaming jobs.
– Develop Pig scripts and HiveQL queries to analyze structured, semi-structured, and unstructured data flows.
– Should have knowledge of table definitions, file formats, UDFs, data layout (partitions and buckets), debugging, and performance optimization.
– Excellent oral and written communication skills.