I’ve created many open-source projects and made significant contributions to others:
Compiler for the Biological Expression Language
Framework for reproducible data integration in BEL
Interactive exploration of networks encoded in the BEL
Harmonization of biological ontologies and controlled vocabularies
Target prioritization framework using gene expression and network representation learning
The most expansive knowledge graph embedding framework to date
Platform and web application for integrating and curating pathway databases
Web application for exploration of pathway databases
Benchmarking pathway databases in functional enrichment analysis and prediction methods
Sequence-based representation learning
Drug repositioning based on bioactivity pattern matching and GWAS
Drug repositioning framework based on network representation learning
Patient stratification framework based on network representation learning
Generation of structural causal models (SCMs) from BEL graphs
Representation and manipulation of probabilistic expressions
Multimodal Transformers for biomedical text and knowledge graph data
A deep learning library for drug-drug interaction, polypharmacy side effect, and synergy prediction
A library implementing a comprehensive collection of metrics for the evaluation of recommender systems
I’ve created many databases through curation, automated assembly, and coordinating the work of others:
Ontology of phenomena related to neurodegeneration
Curated knowledge graphs describing neurodegeneration in BEL
Multimodal Mechanistic Signatures for Neurodegenerative Diseases
Predicted and curated mappings between named biological entities
What's the current version for each biological database?
An integrative meta-registry of biological databases, ontologies, and nomenclatures
Comprehensive metadatabase of identifiers and names for biological entities
Comprehensive metadatabase of cross-references between databases of biological entities
Connecting roles in the ChEBI ontology to their targets
Scientific Programming Training
I care deeply about reproducibility, especially in scientific software development. However, most PIs do not teach it as a core value, and the skills it requires are part of neither scientific nor informatics curricula. I’m creating resources to help fill that gap:
- Blog: Dealing with Big Pull Requests
- Blog: Flake8
- Blog: Packaging
- Blog: CLIs
- Blog: CLIs and Flask
- Video: Writing Reusable, Reproducible Python: Documentation, Packaging, Continuous Integration, and Beyond
- Video: Reusable, Reproducible, Useful Computational Science in Python (July 2021)
- GitHub: Using Flask, Celery, and Docker
- GitHub: Examples