Job Description:
- Configuration, deployment, and maintenance of the Linux Cluster Hardware and HPC Software applications suite, associated Storage, and network infrastructure.
- Administration of the teams Hosting and management systems that enables the HPC.
- Providing technical support and troubleshooting for end users’ issues related to HPC hardware and Solver software applications, evaluate, and perform Job performance and application testing.
- Work on HPC Operational and Strategic Projects efforts, participate in User Group Forums
- Ensuring compliance to enterprise IT security and technology controls.
- Evaluation and implementation of new tools and methods for improved operations and service delivery
Basic Qualifications:
- Bachelor’s degree in computer science, information systems, or engineering
- 2 years’ experience in administration of heterogeneous IT compute and storage infrastructure
- Extensive knowledge of Linux operating systems
- Strong Scripting capability in one or more languages – Python, powershell, shell/bash,etc, Azure/Gitlabd Dev-ops CICD pipelines
- Knowledge of TCP/IP fundamentals
Top Candidates will also have:
- Demonstrated experience with cloud-based computing resource deployment (Azure, AWS)
- Microsoft Azure fundamentals, administrator, or solutions architect certifications
- TCP/IP networking certification(s)
- Working knowledge of distributed/parallel file systems and storage appliances (Isilon, Netapp, Qumulo, etc)
- Experience with HPC deployment and middleware technologies (Bright Cluster manager, Altair PBS Pro, SLURM, Torque MOAB)