Description

Responsibilities:
• Lead the architecture, design, and implementation of highly scalable and resilient AI infrastructure solutions utilizing Seldon Core AI software on OpenShift Enterprise Kubernetes in GPU-based environments.
• Provide technical leadership and mentorship to a team of engineers, guiding them in best practices for AI infrastructure design, deployment, and optimization.
• Collaborate closely with cross-functional teams including data scientists, software engineers, and DevOps engineers to understand requirements and drive innovative solutions.
• Develop and implement automation strategies for deployment, monitoring, and management of AI workloads, ensuring efficiency and reliability at scale.
• Drive performance optimization efforts to maximize resource utilization and throughput of GPU-based infrastructure for AI model training and inference.
• Establish and enforce security best practices to protect AI infrastructure and data assets against potential threats and vulnerabilities.
• Stay abreast of emerging technologies and industry trends in AI infrastructure, evaluating their potential impact and driving adoption where appropriate.

Qualifications:
• Extensive experience designing and implementing AI infrastructure solutions, with a focus on Seldon Core AI software and OpenShift Enterprise Kubernetes.
• Proven track record of technical leadership, including mentoring junior engineers and driving successful project outcomes.
• Expertise in scripting and automation using tools such as Ansible, Terraform, or similar, with a strong emphasis on infrastructure as code (IaC) principles.
• Deep understanding of GPU architecture and performance optimization techniques for AI workloads, with hands-on experience in tuning and scaling GPU-based infrastructure.

Education

Any Graduate