Job Summary:
We are seeking a skilled High-Performance Computing Systems Administrator to provide IT infrastructure support in our Developer Software Engineering lab in Oregon. The role involves building, configuring, and maintaining Linux-based high-performance computing clusters and fabrics, troubleshooting hardware and software, and developing automation scripts. The candidate will also ensure system usability, manage lab capacity, and maintain security compliance within a collaborative engineering environment.
Key Responsibilities:
- Installation, configuration, and maintenance of high-performance computing clusters.
- Troubleshooting production and pre-production servers, GPUs, switches, and software.
- Development and implementation of scripts for cluster provisioning and configuration.
- Collaboration with network and IT security teams on lab network and firewall designs.
- Documentation and communication of practices, findings, and troubleshooting guides.
Must-Have Skills:
- Expertise in Linux and Windows Server administration.
- Experience with virtualization technologies (Microsoft Hyper-V, VMware).
- Proficiency in scripting languages (Perl, Python, shell).
Industry Experience Required:
- Experience working in lab and data center environments is essential for this role.