Description

Responsibilities:

• Writing Scala code with tools such as Apache Spark and Apache Arrow to build a hosted, multi-cluster data warehouse for Web3

• Developing database optimizers, query planners, query and data routing mechanisms, cluster-to-cluster communication, and workload management techniques

• Scaling up from proof of concept to “cluster scale” (and eventually to hundreds of clusters with hundreds of terabytes each), in terms of both infrastructure/architecture and problem structure

• Codifying best practices as accessible, reusable patterns, templates, and codebases that facilitate metadata capture and management

• Working with a team of software engineers to build a bigger, better, faster, and more optimized HTAP (hybrid transactional/analytical processing) database, using Apache Spark, Apache Arrow, and a wealth of other open-source data tools

• Understanding data and analytics use cases across Web3 and blockchains


Skills & Qualifications

• 6+ years of experience engineering software and data platforms / enterprise-scale data warehouses, preferably with knowledge of the open-source Apache stack (especially Apache Spark, Apache Arrow, Apache Kafka, and Apache Ignite/Geode)

• 3+ years of experience with Scala and Apache Spark (or Apache Kafka)

• Rock-solid engineering fundamentals; experience with query planning, query optimization, and distributed data warehouse systems is preferred but not required

• Nice to have: knowledge of blockchain indexing, Web3 compute paradigms, proofs, and consensus mechanisms

• Experience with rapid development cycles in a web-based environment

• Strong scripting and test automation knowledge

• Nice to have: passion for Web3, blockchain, and decentralization, plus a basic understanding of how data and analytics play into them

Education

Bachelor's degree