Responsibilities:
• Writing Scala code with tools such as Apache Spark and Apache Arrow to build a hosted, multi-cluster data warehouse for Web3 (a minimal illustrative sketch follows this list)
• Developing database optimizers, query planners, query and data routing mechanisms, cluster-to-cluster communication, and workload management techniques
• Scaling up from proof of concept to “cluster scale” (and eventually hundreds of clusters with hundreds of terabytes each), in terms of both infrastructure/architecture and problem structure
• Codifying best practices for future reuse as accessible, reusable patterns, templates, and code bases that facilitate metadata capture and management
• Working with a team of software engineers writing new code to build a bigger, faster, more highly optimized HTAP (hybrid transactional/analytical processing) database, using Apache Spark, Apache Arrow, and a wealth of other open-source data tools
• Understanding data and analytics use cases across Web3 and blockchains
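For a flavor of the day-to-day work, here is a minimal, illustrative Scala/Spark sketch of the kind of analytics job this role involves. The dataset, schema, and aggregation are hypothetical placeholders, not the team's actual code; real workloads would read Parquet or Arrow-backed sources on a cluster rather than inline test rows.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

object BlockStats {
  def main(args: Array[String]): Unit = {
    // Local session for sketch purposes; a production job would run cluster-scale.
    val spark = SparkSession.builder()
      .appName("block-stats")
      .master("local[*]")
      .getOrCreate()

    import spark.implicits._

    // Hypothetical on-chain dataset: (chain, block_number, tx_count) is a placeholder schema.
    val blocks = Seq(
      ("ethereum", 17000000L, 150),
      ("ethereum", 17000001L, 98),
      ("polygon",  42000000L, 210)
    ).toDF("chain", "block_number", "tx_count")

    // Simple per-chain aggregation: block count and average transactions per block.
    val stats = blocks
      .groupBy($"chain")
      .agg(count("*").as("blocks"), avg($"tx_count").as("avg_txs"))

    stats.show()
    spark.stop()
  }
}
```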
Skills & Qualifications
• 6+ years of experience engineering software and data platforms / enterprise-scale data warehouses, preferably with knowledge of the open-source Apache stack (especially Apache Spark, Apache Arrow, Apache Kafka, and Apache Ignite/Geode)
• 3+ years of experience with Scala and Apache Spark (or Apache Kafka)
• Rock-solid engineering fundamentals; experience with query planning, query optimization, and distributed data warehouse systems is preferred but not required
• Nice to have: knowledge of blockchain indexing, Web3 compute paradigms, proofs and consensus mechanisms, etc.
• Experience with rapid development cycles in a web-based environment
• Strong scripting and test automation knowledge
• Nice to have: passion for Web3, blockchain, and decentralization, plus a baseline understanding of how data and analytics play into them
• Bachelor's degree