Description

Responsibilities:

  • Design, develop, and implement data extraction systems using Large Language Models (LLMs) such as GPT-4.
  • Collaborate with data scientists, engineers, and product managers to understand data extraction requirements and objectives.
  • Fine-tune and customize LLMs for specific data extraction tasks, ensuring high accuracy and efficiency.
  • Create and maintain data pipelines for the extraction, processing, and storage of large datasets.
  • Test and optimize LLM performance to enhance data extraction capabilities.
  • Develop and document best practices for LLM-based data extraction processes.
  • Stay current with the latest advancements in AI and LLM technologies to continually improve data extraction methodologies.
  • Troubleshoot and resolve issues related to data extraction processes and models.

Qualifications:

  • Bachelor's or Master's degree in Computer Science, Data Science, AI, or a related field.
  • Proven experience with LLMs and natural language processing (NLP) technologies.
  • Proficiency in programming languages such as Python, with experience in AI/ML libraries (e.g., TensorFlow, PyTorch).
  • Strong understanding of data structures, algorithms, and software engineering principles.
  • Experience with data extraction, ETL processes, and database management.
  • Familiarity with cloud computing platforms (e.g., AWS, Google Cloud, Azure) and containerization technologies (e.g., Docker, Kubernetes).
  • Excellent problem-solving skills and the ability to work in a fast-paced, collaborative environment.
  • Strong written and verbal communication skills, with the ability to convey technical concepts to non-technical stakeholders.

Education

Bachelor's or Master's degree