Required Skill Sets:
Hadoop - ~7 years of experience
Docker
Python
Shell Scripting
JupyterHub
Samba
Linux
Node.js
Roles & Responsibilities:
• Installation, configuration, management, and monitoring of various Hadoop and database systems
• Perform upgrades, scripting, task automation, and backups/recovery
• Document installation and upgrade processes
• Create and maintain engineering documents, system designs, and written operational procedures
• Experience with monitoring systems and diagnostic tools
• Work with SQL engines such as Hive and Pig, including performance tuning at all levels
• Hands-on experience with Docker infrastructure
• Expertise in implementing and coordinating large Hadoop clusters and related infrastructure such as Hive, Spark, HDFS, HBase, Oozie, Flume, and ZooKeeper
• Experience running machine-learning environments such as JupyterHub
• Ability to code well in scripting languages (Shell, Python)
• Deploy and scale Hadoop infrastructure to support data pipelines and related services
• Build infrastructure capabilities to improve resiliency and efficiency of the systems and services at scale
• Drive data infrastructure / pipeline, services and upgrade/migration projects from start to finish
• Support day-to-day operations, administration, and maintenance of Hadoop/HDFS infrastructure
• Monitor and troubleshoot data clusters
• Capacity planning, management, and troubleshooting for HDFS, YARN/MapReduce, and Spark workloads
Any Graduate