Job Description:
Must Haves:
Python
Pyspark
AWS
Databricks
RAG- Retrieval augmented generation, or RAG, is an architectural approach that can improve the efficacy of large language model (LLM) applications by leveraging custom data. This is done by retrieving data/documents relevant to a question or task and providing them as context for the LLM.
Day to Day
Build new data products.
Aggregate the Data
Pull data on a daily basis to look at trends.
This team is heavily invested in Lakehouse data bricks to run reports, the also use spark, Python, SQL, databricks to test and datalakes to optimize the metadata.
Nice to Have:
Someone who comes from an investment background but not necessarily, highly regulated would also be nice
Not looking for a Data Scientist but a true Engineer.
Any Graduate