Leverage distributed training systems to build scalable machine learning pipelines for model training and deployment in the IT/OT Products space
Design and implement solutions to optimize distributed training execution in terms of model hyperparameter optimization, model training/inference latency, and system-level bottlenecks
Research and implement state-of-the-art LLMs for different business use cases, including fine-tuning and serving the LLMs
Ensure ML model performance, uptime, and scale, maintaining high standards of code quality, thoughtful design, and monitoring
Optimize integration between popular machine learning libraries and cloud ML and data processing frameworks
Build deep learning models and algorithms with optimal parallelism and performance on CPUs/GPUs
Your background and who you are
MS or PhD in Computer Science, Software Engineering, Electrical Engineering, or related fields
3 years of industry experience with Python in a programming-intensive role
2 years of experience with one or more of the following machine learning topics: classification, clustering, optimization, recommendation systems, graph mining, deep learning
3 years of industry experience with distributed computing frameworks such as Spark, the Kubernetes ecosystem, etc.
3 years of industry experience with popular ML frameworks such as Spark MLlib, Keras, TensorFlow, PyTorch, and Hugging Face Transformers, and libraries like scikit-learn, spaCy, gensim, CoreNLP, etc.
3 years of industry experience with major cloud computing services
Background or experience in building and scaling Generative AI applications, specifically around frameworks like LangChain, pgvector, Pinecone, and Azure ML
Prior experience in building data products and an established track record of innovation would be a big plus
An effective communicator, you shall be an ambassador for Machine Learning engineering at external forums and be able to explain technical concepts to a non-technical audience
Preferred Qualifications
Proficient Python/PySpark coding experience
Proficient in containerization services
Proficient in deploying models with Azure ML
Experience working in a CI/CD framework
Motivation to make downstream modelers' work smoother
Background or experience in building and scaling Generative AI applications, specifically around frameworks like LangChain, pgvector, Pinecone, and Azure ML
Industry experience with popular ML frameworks such as Spark MLlib, Keras, TensorFlow, PyTorch, and Hugging Face Transformers, and libraries like scikit-learn, spaCy, gensim, CoreNLP, etc.
Experience in designing scalable services controller architecture using FastAPI