Job Overview:
The Senior Linux Platform Engineer will be responsible for follow-the-sun global operations and management of the Linux Platforms as well as project work.
Working closely with team leads, team managers, project managers, and engineering teams, you will be a key contributor in deploying, maintaining, and supporting various flavors of OS and platform services, as well as multi-vendor hardware break-fix.
You are a positive, enthusiastic team player and self-starter with a flexible attitude, applying different techniques to help drive successful outcomes. You thrive working across geographical and cultural boundaries and will travel to other Arm sites as needed, sometimes at short notice.
Responsibilities:
You will focus on the thorough implementation, improvement, and lifecycle management of Linux-based systems, services, and tools provided by our suppliers and used within our team to ensure our SLAs are met.
- Build, upgrade, configure, and deploy Linux-based physical servers from scratch to support HPC infrastructure and services
- Maintain the availability of the production environment through proactive monitoring to improve reliability and efficiency
- Provide primary operational support for RHEL and for Dell, HPE, and Fujitsu hardware to maintain large-scale HPC clusters globally
- Build, configure, maintain, and improve various infrastructure services such as LDAP, HAProxy, BIND DNS / DHCP, SMTP, and others
- Maintain OME tools that support hardware from vendors such as Dell, HPE, and Fujitsu, covering lifecycle management, BIOS / firmware updates, and server management and improvements
- Coordinate and collaborate with the various Global and Infrastructure teams to ensure that a value-driven roadmap and strategy are established, prioritized, and addressed
- Proactively spot problems, areas for improvement, performance bottlenecks, and other issues or inefficiencies, and work towards automating and fixing them
- Create clear and concise documentation, share information and train team members across all towers
Required Skills and Experience:
- Advanced or expert level Linux administration skills, performance tuning, system deployment, PXE boot, hardware configuration and management.
- Experience with Linux-based server deployment and services such as BIND DNS, DHCP, LDAP, and HAProxy.
- Experience with infrastructure tools such as Foreman, Puppet, Ansible, Git, vCenter, ESXi, HAProxy, BlueCat BIND DNS, and Nlyte.
- Scripting and automation of repetitive tasks using csh, Bash, Python, or any other programming language.
- Solid understanding of and experience with monitoring and alerting tools such as PRTG, Nagios, and other SNMP-based incident management tools.
- Experience with vendor OEM operations tools such as Dell OME, HPE OneView, and Fujitsu ISM.
“Nice To Have” Skills and Experience:
- Good understanding of databases, including MySQL operational support, performance optimization, updates, and upgrades.
- Experience with IT service management (ITSM) tools such as ServiceNow and Jira.