Write Sqoop jobs to import/export data to and from Hadoop
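A Sqoop import job of this kind might look like the sketch below. The JDBC URL, credentials, table, and target directory are hypothetical placeholders; the command is built into a variable so it can be reviewed before being run on a cluster node.

```shell
#!/bin/sh
# Sketch of a Sqoop import job; connection details are placeholders, not real systems.
JDBC_URL="jdbc:mysql://dbhost:3306/sales"
TABLE="orders"
TARGET_DIR="/user/etl/orders"

# Build the command so it can be inspected (or logged) before execution.
SQOOP_CMD="sqoop import \
  --connect $JDBC_URL \
  --username etl_user \
  --password-file /user/etl/.db_password \
  --table $TABLE \
  --target-dir $TARGET_DIR \
  --num-mappers 4 \
  --fields-terminated-by '\t'"

echo "$SQOOP_CMD"
# On a node with Sqoop installed, execute it with:
# eval "$SQOOP_CMD"
```

The matching export direction swaps `sqoop import` for `sqoop export` with an `--export-dir` pointing at the HDFS data.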
Create automation scripts using shell script and Python
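A common building block in such automation scripts is a retry wrapper around flaky cluster commands. This is a minimal sketch; the wrapped commands shown in comments are examples, not part of the original text.

```shell
#!/bin/sh
# Sketch of a retry wrapper for automation scripts.
# Usage: run_with_retry <max-attempts> <command> [args...]
run_with_retry() {
  max=$1; shift
  attempt=1
  while true; do
    if "$@"; then
      return 0
    fi
    if [ "$attempt" -ge "$max" ]; then
      echo "giving up after $attempt attempts: $*" >&2
      return 1
    fi
    attempt=$((attempt + 1))
    sleep 1
  done
}

# In a real script this might wrap e.g.: run_with_retry 3 hdfs dfs -ls /user/etl
run_with_retry 3 true && echo "job succeeded"   # prints "job succeeded"
```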
Knowledge of MapReduce, Spark, Hive, HBase; experience with Eclipse, XML, JSON
Program using SQL and RDBMS utilities
Responsible for setting up Hive structures, helping users troubleshoot issues with Hive/Impala/Spark/Sqoop, and migrating the structures between environments.
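Setting up a Hive structure typically means creating a database/table definition that can be replayed in each environment. The DDL below is a sketch with hypothetical database, column, and path names, built as a string so it can be reviewed and then handed to `hive -e` or `beeline` on a cluster.

```shell
#!/bin/sh
# Sketch of a Hive external-table definition; schema and location are placeholders.
DDL="
CREATE EXTERNAL TABLE IF NOT EXISTS sales.orders (
  order_id    BIGINT,
  customer_id BIGINT,
  amount      DECIMAL(10,2),
  order_ts    TIMESTAMP
)
PARTITIONED BY (order_date STRING)
STORED AS PARQUET
LOCATION '/user/etl/orders';
"
echo "$DDL"
# On a cluster node: hive -e "$DDL"   (or: beeline -u <jdbc-url> -e "$DDL")
```

Because the table is EXTERNAL and its schema lives in one script, the same DDL can be re-run to migrate the structure between dev, test, and production environments without moving the data.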
Monitor Hadoop cluster job performance and perform end-to-end tuning of Hadoop clusters and MapReduce routines against very large data sets.
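Per-job tuning often starts with memory, reducer-count, and compression settings passed on the command line. The property names below are standard MapReduce configuration keys; the values, jar, and class names are illustrative assumptions.

```shell
#!/bin/sh
# Sketch of per-job MapReduce tuning via -D generic options; values are illustrative.
TUNING_OPTS="-D mapreduce.map.memory.mb=2048 \
 -D mapreduce.reduce.memory.mb=4096 \
 -D mapreduce.job.reduces=20 \
 -D mapreduce.map.output.compress=true"

# my-job.jar / com.example.MyJob are hypothetical placeholders.
echo "hadoop jar my-job.jar com.example.MyJob $TUNING_OPTS /in /out"
```

Against very large data sets, map-output compression and a deliberate reducer count are usually the first knobs to revisit before deeper cluster-level tuning.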
File system management and monitoring, Hadoop HDFS support and maintenance.
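Routine file system monitoring can be scripted as a usage check against a threshold. The sketch below runs against the local file system with plain `df` so the logic is testable anywhere; on a cluster the commented `hdfs dfsadmin -report` line is the equivalent source, and the 80% threshold is an assumed example.

```shell
#!/bin/sh
# Sketch of a disk-usage alert; threshold and data source are assumptions.
THRESHOLD=80   # percent used that triggers a warning

used_pct() {
  # On a cluster, derive this from: hdfs dfsadmin -report  (DFS Used% line)
  df -P / | awk 'NR==2 {gsub("%","",$5); print $5}'
}

pct=$(used_pct)
if [ "$pct" -ge "$THRESHOLD" ]; then
  echo "WARN: filesystem ${pct}% full"
else
  echo "OK: filesystem ${pct}% full"
fi
```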
Manage and review Linux and Hadoop log files.
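A log review pass often reduces to sweeping the log directory for ERROR/FATAL lines. In this sketch the log directory and file contents are demo data so it runs anywhere; on a cluster node `LOG_DIR` would point at the actual Hadoop log directory (commonly `/var/log/hadoop`, though installs vary).

```shell
#!/bin/sh
# Sketch of a daily log sweep; LOG_DIR and its contents are demo placeholders.
LOG_DIR=$(mktemp -d)
printf '2024-05-01 INFO block report ok\n2024-05-01 ERROR DataNode heartbeat lost\n' \
  > "$LOG_DIR/hadoop-datanode.log"

# Count ERROR/FATAL lines across all logs in the directory.
errors=$(grep -hE 'ERROR|FATAL' "$LOG_DIR"/*.log | wc -l)
echo "error lines: $errors"
```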
Partner with the infrastructure, database, and business analytics teams to ensure high data quality and availability and to troubleshoot Hadoop issues.