Description

What you'll be doing:

o Collaborate with colleagues across multiple teams (Data Science and Data Engineering) on unique machine learning system challenges at scale.

o Leverage distributed training systems to build scalable machine learning pipelines for model training and deployment in the IT/OT products space.

o Design and implement solutions to optimize distributed training execution in terms of model hyperparameter optimization, model training/inference latency, and system-level bottlenecks.

o Research and implement state-of-the-art LLMs for different business use cases, including fine-tuning and serving the LLMs.

o Ensure Client Model performance, uptime, and scale while maintaining high standards of code quality, thoughtful design, and monitoring.

o Optimize integration between popular machine learning libraries and cloud Client and data processing frameworks.

o Build deep learning models and algorithms with optimal parallelism and performance on CPUs/GPUs.

 

Your background and who you are:

o MS or Ph.D. in Computer Science, Software Engineering, Electrical Engineering, or related fields.

o 3+ years of industry experience with Python in a programming-intensive role.

o 2+ years of experience with one or more of the following machine learning topics: classification, clustering, optimization, recommendation systems, graph mining, deep learning.

o 3+ years of industry experience with distributed computing frameworks such as Spark and the Kubernetes ecosystem.

o 3+ years of industry experience with popular Client frameworks such as Spark MLlib, Keras, TensorFlow, PyTorch, and Hugging Face Transformers, and with libraries such as scikit-learn, spaCy, gensim, and CoreNLP.

o 3+ years of industry experience with major cloud computing services.

o Background or experience in building and scaling generative AI applications, specifically with tools like LangChain, PGVector, Pinecone, and AzureML.

o Prior experience building data products and an established track record of innovation would be a big plus.

 

Preferred Qualifications:

o Proficient in Python/PySpark coding

o Proficient in containerization services

o Proficient in Azure Client to deploy the models

o Experience working within a CI/CD framework

o Motivation to make downstream modelers' work smoother


 

Education

Any Graduate