Description

Responsibilities
• Develop and maintain highly performant, scalable systems capable of transforming, analyzing, and querying data from distributed sources to feed data visualization interfaces
• Create processes to schedule, execute, and monitor data transformation workflows
• Design, implement, and maintain APIs that provide fast data access for a web-based application
• Collaboratively and pragmatically solve scientific software engineering challenges in interactive data analysis and visualization
• Work with computational scientists, biologists, and other software engineers to elucidate the emerging needs of our scientists, whether they are working at the keyboard or the bench
• Collaborate with distributed scientific and engineering teams to support your software development efforts
• Contribute to the broader scientific community through open-source software development

Required Qualifications
• BS or higher in Bioinformatics, Computer Science, or a related field
• Expertise (5+ years of experience) in Python, including designing and developing high-performance systems and packages
• Expertise in building, deploying, maintaining, and monitoring APIs
• Expertise in designing, running, and maintaining workflow processes, containers, schedulers, and systems on on-premises servers and in the cloud
• Experience with modern, efficient file formats for large datasets
• Experience with scientific computing packages (SciPy, NumPy, pandas, etc.)
• Proficiency with cloud infrastructure, particularly AWS, to establish APIs and data services or databases
• Expertise in storing and extracting large amounts of data via cloud-based systems, including S3 buckets
• Demonstrated adherence to best practices in software engineering, particularly usability, version control, testing, and appropriate use of abstraction
• Passion for continuous learning and teaching others
• As the team is distributed between the US and Canada, the successful candidate should work in the Eastern or Pacific time zone

Nice-to-haves
• Familiarity with formal build, release, and deployment processes and continuous integration frameworks
• Experience with Kubernetes, AWS Lambda, or other FaaS and containerized workloads
• Maintaining reproducible deployment infrastructure (including IaaS), monitoring events, and performing system maintenance
• Data wrangling, processing, and analysis in Python and/or R
• Biological domain knowledge, specifically in single-cell genomics
• Familiarity with MultiAssayExperiment and other representations of biological information
• Experience building interactive visualization applications using modern frameworks and technologies (e.g., React, Vue, Svelte; D3.js, WebGL)
• Building interactive data apps in R and Python (Shiny, Streamlit, etc.)
