Description

REQUIREMENTS

  • Formal training in machine learning: dimensionality reduction, clustering, embeddings, and sequence classification algorithms
  • Experience with deep learning frameworks such as PyTorch, TensorFlow, and Hugging Face Transformers
  • Practical experience with natural language processing methods, models, and libraries such as spaCy, word2vec, TensorFlow, Keras, PyTorch, Flair, and BERT
  • Practical experience with large language models, prompt engineering, fine-tuning and benchmarking, using frameworks such as LangChain and LlamaIndex
  • Strong Python background
  • Knowledge of AWS, GCP, Azure, or another cloud platform
  • Understanding of data modeling principles and complex data models
  • Proficiency with relational and NoSQL databases as well as vector stores (e.g., Postgres, Elasticsearch/OpenSearch, ChromaDB)
  • Knowledge of Scala and distributed computing systems such as Spark or Ray highly preferred
  • Knowledge of API development, containerization, and machine learning deployment highly preferred
  • Experience with MLOps/AIOps highly preferred

Education

MS in Data Science or Computer Science