Charles Tapley Hoyt

My name is Charles Tapley Hoyt. I’m a bio/cheminformatician using biological knowledge graphs to generate biological hypotheses that assist in drug discovery and precision medicine.

Here are some more details about me and my research. My résumé can be found here, and everything else is on ORCID.


  • Referring to SARS-CoV-2 Proteins in BEL

    Many of the proteins in the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) are cleavage products of the replicase polyprotein 1ab (uniprot:P0DTD1). Unfortunately, the bioinformatics community is not so comfortable with proteins like this and nomenclature remains tricky. Luckily, the Biological Expression Language (BEL) has exactly the right tool to encode information about these proteins using the fragment() function.

  • How to Code with Me - Making a CLI

    One of the cardinal sins in computational science is to hard code a file path in your analysis. This post is a guide to reorganizing your code to avoid this and then to generate a command line interface (CLI) using click.

  • The Curation of Neurodegeneration Supporting Ontology

    While I led the curation program in the Human Brain Pharmacome project during my Ph.D. from 2018 to 2019 at Fraunhofer, we built the Curation of Neurodegeneration Supporting Ontology (CONSO). This post outlines the project’s needs for quality control and re-curation that led to its generation, the curation process, and how CONSO constitutes an example of how to follow the guidelines I proposed in a previous blog post on building ontologies.

  • How to Code with Me - Organizing a Package

    This blog post is the next installment in the series about all of the very particular ways I do software development in Python. This round is about where to put your code, your tests, your CLI, and the right metadata for each.

  • A Reading List of Academic Articles using the Biological Expression Language (BEL)

    This post is evolving from a reading list to a review of the academic papers published that are either about or use the Biological Expression Language (BEL). It’s divided into the categories of software/visualization tools, algorithms/analytical frameworks, data integration, natural language processing, curation workflows, and downstream applications.

  • The Trouble with Ontologies, or, How to Build an Ontology

    Everyone’s talking about biomedical ontologies! Let’s look at where most people go wrong and how to do it right.

  • A Listing of Publicly Available Content in the Biological Expression Language (BEL)

    While many researchers have a pathway or pathology of interest, their first time curating content in the Biological Expression Language (BEL) may seem intimidating. This post lists several disease maps and BEL content sources that are directly available for re-use.

  • An Incomplete History of Selventa and the Biological Expression Language (BEL)

    The company and community that surround the Biological Expression Language (BEL) are enigmatic, to say the least. This post represents the best I could do to tell the history of Selventa and BEL.

  • How to Code with Me - Flake8 Hell

    As scientists, we place huge importance on the communication of our results. We spend lots of time on editing, revising, and formatting so people can understand what we did. We also write a lot of code, so why aren’t we investing the same amount of love? Enter, flake8.

  • Inspector Javert's Xref Database

    On top of the issue of resolving identifiers to their names, the bioinformatics community has a hard time figuring out when two identifiers from different databases are equivalent. You know who else has the same problem? Inspector Javert. Get ready for a Les Misérables-themed post on how to address this long-standing problem.

  • Ooh Na Na, What's My Name?

    We have a big problem in the bioinformatics community with namespaces, identifiers, and names. And nobody’s posed the question better than Rihanna herself.

  • Summarizing ChemRxiv

    A few months ago, the question was posed on science Twitter: “How many people have published on ChemRxiv?”

  • How to Fix Your Monolithic Pull Request

    We’ve all been there. You started a new branch from master. You had a very specific goal in mind, The Original Goal. You made a pull request (PR) to go with it, too, The Original Pull Request. But then, you had an idea! And also, someone on your team asked you to solve another problem! Now the original code you wrote to address The Original Goal relies on that code … and now you’ve got dozens of files changed, hundreds of lines of diff, and nobody (including you) can understand what you’ve done. Like I said, we’ve all been there. Here’s what you can do to fix it:

  • Host a Graduate Seminar Before Writing Your Thesis

    The other day I saw a tweet lamenting the drag that is literature review during preparation for writing your thesis.

  • Encoding Biology in Knowledge Graphs

    How many molecular biology papers have you read today? This week? This month? If you’re like me, it’s not so many, and we’re falling behind very quickly. Here’s a chart made by the new PubMed that summarizes how many papers were published mentioning RAS in the last 50 years.

  • Biosemantics vs. Biopragmatics

    In language, semantics describes the names and meanings of words. The bioinformatics community has aptly adopted biosemantics as a concept that encompasses the issues with the names and meanings of biological entities, usually in natural language processing and data integration. However, semantics does not capture the context of words, and biosemantics fails to describe the biological context and complex relationships between biological entities.
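
For a taste of the idea behind "How to Code with Me - Making a CLI" in the list above, here is a minimal sketch of moving a hard-coded file path into a command line argument. It is not the code from that post: the post uses click, while this sketch swaps in the standard library's argparse so it runs without installing anything. The `count_lines` function and the `path` argument are hypothetical stand-ins for a real analysis.

```python
import argparse
from pathlib import Path


def count_lines(path: Path) -> int:
    """Count the lines in a text file (a hypothetical stand-in for a real analysis)."""
    with path.open() as file:
        return sum(1 for _ in file)


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that takes the input path as an argument instead of hard coding it."""
    parser = argparse.ArgumentParser(description="Count the lines in a file.")
    parser.add_argument("path", type=Path, help="path to the file to analyze")
    return parser


def main(argv=None) -> int:
    """Parse the arguments (from argv, or from sys.argv if argv is None) and run the analysis."""
    args = build_parser().parse_args(argv)
    print(count_lines(args.path))
    return 0
```

Calling `main(["data.txt"])` runs the analysis on whatever file the caller names, so the analysis code itself never has to mention a location on disk. In a real script, `main()` would typically be called under an `if __name__ == "__main__":` guard.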
