Description

Education:
A Master’s or Ph.D. in Computer Science, Artificial Intelligence, Machine Learning, or a related field.

Experience:
12+ years of experience in AI/ML development, including at least 5 years working specifically with Generative AI and Large Language Models (LLMs).

Proven track record of building and deploying LLM-based applications, including hands-on experience integrating models into production environments.

Expertise in Monitoring and Observability:
In-depth experience with LLM monitoring tools and observability frameworks. Familiarity with systems like Prometheus, Grafana, and custom LLM tracking solutions.

Hands-on Knowledge of LLM Frameworks & Tools:
Expertise in industry-standard frameworks and platforms such as Hugging Face and OpenAI (including GPT-based models), or custom LLM solutions. Strong experience with training, fine-tuning, and deploying LLMs.

Experience with Agentic Frameworks:
Solid understanding of agent-based architectures and orchestration frameworks, specifically those related to agentic control systems, autonomy, and decision-making.

LangGraph Multi-Agent Framework:
Practical experience with the LangGraph framework or similar multi-agent systems, coordinating multiple agents efficiently to solve complex tasks in a distributed setting.

Proven Leadership:
A demonstrated ability to lead, mentor, and inspire AI/ML teams in a technical capacity. Experience guiding teams through complex challenges and leading technical projects from concept to delivery.

Desirable Qualifications:

Cloud Infrastructure Expertise:
Experience working with cloud platforms such as AWS, GCP, or Azure, particularly in deploying and managing AI/ML applications at scale.

Industry Contributions:
Contributions to open-source projects, research papers, or speaking engagements at AI/ML conferences.

Familiarity with Emerging AI Technologies:
Knowledge of cutting-edge AI techniques such as reinforcement learning, unsupervised learning, or hybrid models for multimodal tasks.

Skills:

Technical Skills:

  • Strong proficiency in Python, TensorFlow, PyTorch, and other ML frameworks.
  • Experience with cloud platforms (AWS, GCP, Azure) and containerization tools (Docker, Kubernetes).
  • Advanced understanding of model optimization, training techniques, and model interpretability.

Analytical Skills:

  • Excellent problem-solving and critical-thinking abilities.
  • Strong experience in debugging and optimizing complex AI systems in production.

Communication & Leadership:

  • Exceptional verbal and written communication skills, with the ability to explain complex ideas clearly to both technical and non-technical stakeholders.
  • Proven leadership skills with the ability to inspire and guide teams toward shared goals.

Education

Any Graduate