Description

KEY RESPONSIBILITIES:
Collaborating with product teams to understand new product requirements, including HPC (high-performance computing) and AI/ML products, and ensuring the platform supports a wide range of AI applications.
Collaborating closely with the broader GenAI teams and with internal AI engineers (on customer and engineering teams) to understand their needs.
Monitoring platform performance, implementing enhancements, and ensuring platform scalability and stability.
Establishing a security protocol for GenAI applications: Developing and implementing robust security measures to protect the GenAI Platform. This includes vulnerability assessments, regular security audits, and ensuring compliance with industry-standard security protocols.
Finding optimal solutions to deploy GenAI products in embedded environments across the automotive, industrial, networking, and storage verticals using sophisticated design techniques, services, and tools.
Collaborating with multi-functional teams, including system engineering, software engineering, mechanical/thermal engineering, operations, embedded teams, external vendors, and other partners, to successfully deliver a reliable and robust platform from concept to prototype to deployment.
Integrating and optimizing OTA (over-the-air) deployment methods and managing software stack deployments, including provisioning these services into the cloud.
Optimizing AI/ML pipelines and workloads in a production environment on the GenAI Platform.
Diagnosing and root-causing issues, and creating requirements for MLOps improvements based on customer and product requirements.
Building a strong Platform Enablement team: Hiring experienced professionals with the right expertise, defining clear roles and responsibilities, establishing communication protocols, and providing ongoing training and support to ensure team members are equipped to handle complex challenges.
Developing a roadmap for GenAI Platform enablement: Conducting a thorough analysis of current and future needs, identifying gaps, and prioritizing initiatives to deliver value quickly. Working closely with business stakeholders to understand their requirements and ensure that the platform meets their needs.
Developing a comprehensive training program: Training users of the GenAI Platform on the basics of cloud, Kubernetes, containers, MLOps, and SRE, as well as best practices for using the platform.


PREFERRED EXPERIENCE:
Demonstrated ability to architect and design complex technical solutions tailored to client requirements.
Extensive background and technical expertise in embedded systems, orchestration and automation systems, data centers, and cloud architecture (including containerization), as well as excellent communication and planning skills.
Understanding of embedded designs, including automotive, industrial, compute, storage, and networking.
Hands-on Linux and scripting skills.
Knowledge of OS kernels and system engineering.
Understanding of standard software engineering principles and enterprise system architecture, with an automate-and-scale approach.
Experience with automation; experience with productivity tools and process automation is a big plus.
A track record of quickly understanding new technologies outside of your domain expertise and deploying systems in sophisticated configurations, from hardware through multiple layers of software, in a fast-paced environment.
Strong problem-solving ability and experience in product engineering, failure analysis and debug, and hardware or test design.
Experience in large-scale QA environments for product bring-ups.

Education

Any graduate