International Society for Biocuration Presents: Curate This!
While researchers typically communicate their work through poster presentations, oral presentations, and written communication, programmers often give (live) demonstrations. I’m not aware of any technical or practical barriers that would keep curators from doing the same, and I have always wished that curators did this more often. This post is about how I planned to make this a reality by starting a podcast with the International Society for Biocuration (ISB) entitled ISB Presents: Curate This!.
The key first step was to decide on the podcast’s goals and target audience. The primary goal of Curate This! is to explicitly show the process of curation and have an informal discussion about the challenges associated with it. Episodes should be short and require as little preparatory work as possible from both the interviewer and the interviewee, so that the podcast can scale. I also decided with the ISB that it should be hosted as an ISB podcast, not as something just from me. This better fits the message for the curation community and is overall a better governance decision to support longevity (if we’re successful).
It’s not a goal of this podcast to give a background on curation - there are plenty of resources available from the ISB that cover this. It’s also not a goal of this podcast to focus on the curator themselves, such as how they became a curator - the ISB hosts two Careers in Biocuration sessions each year, one at the in-person conference and one virtually, that cover this. The target audience for this podcast is practitioners.
Script
Curate This! episodes are split into five segments. The first and last, an opening and parting remarks, are recorded by the interviewer after the interview is done. The bulk of the episode is the interview itself, which has three segments: introduction, demonstration, and reflections.
I’ve written out the questions and the concept for each segment below to serve as a resource for potential interviewees to read ahead of time and prepare themselves as well as a resource for interviewers to follow and stay on task.
Introduction
The goal of the first segment of the interview is to describe the history, goals, and uses of your curated resource in around five minutes. We’ll loosely use the following question lists:
- Basic
  - What is your resource called (if not already mentioned in the introduction)?
  - When was your resource established?
  - What kind of information is in your resource?
  - Do you develop/reuse any data standards?
- Impact
  - Who uses (or could use) your resource and why? How do you assess this?
  - Have you seen any cool citations of your resource?
  - What is the broader impact in the basic and translational research space in biomedicine (or beyond)?
- Personnel
  - What does your resource’s team look like?
  - How many people/groups work on your resource?
  - Is it developed and maintained by a group within your institution, as a community effort, or somewhere in between?
  - If it’s a community effort, how do you do project management and communication? E.g., Slack, GitHub, Trello, etc.
  - How do you onboard new curators? If there’s a difference between internal/external, what does this dichotomy look like?
Demonstration
The goal of the second segment of the interview is to demonstrate, live, how you contribute a curation to your resource, in ten to thirty minutes. Here’s what makes a satisfying live demonstration:
- Show how you select what you’re going to curate.
- How do you find content? For example, if you curate text from literature or patents, do you have a search query that runs on a chronological basis?
- How do you prioritize content? For example, do you use ranking from a search system, or a more sophisticated document classifier?
- What do you look for in the text?
- Do you use external ontologies, terminologies, or semantic spaces to tag named entities?
- What kinds of assumptions do you make as a curator? For example, if you’re curating relationships between proteins, do you assume that authors refer to proteins using their corresponding gene names?
- How do you report the confidence of your curation (and its components)?
- What kind of metadata do you capture, e.g., the curator’s ORCID, the time of curation, or anything else?
Ideally, you should prepare a curation ahead of time so you can quickly walk through the process during the live demo, rather than needing time to think and consider (though, this might be more realistic!).
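To make the questions about confidence and metadata more concrete, here is a minimal sketch of what a single curation record could look like. It is only an illustration: the `CurationRecord` class, its field names, the three-level confidence scale, and all identifiers are hypothetical placeholders, not the data model of any particular resource.

```python
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone


def _utc_now() -> str:
    """Return the current UTC time as an ISO 8601 string."""
    return datetime.now(timezone.utc).isoformat()


@dataclass
class CurationRecord:
    """A hypothetical curated statement plus its provenance metadata."""

    subject: str  # identifier for the first entity (placeholder)
    predicate: str  # relationship asserted between the two entities
    obj: str  # identifier for the second entity (placeholder)
    source: str  # where the evidence came from, e.g., a paper identifier
    curator_orcid: str  # who made the curation
    confidence: str = "medium"  # illustrative scale: low / medium / high
    curated_at: str = field(default_factory=_utc_now)  # when it was made


record = CurationRecord(
    subject="example:protein-a",
    predicate="interacts with",
    obj="example:protein-b",
    source="example:paper-1",
    curator_orcid="0000-0000-0000-0000",  # placeholder, not a real ORCID
    confidence="high",
)
print(asdict(record))
```

Keeping the curator, timestamp, and confidence next to the statement itself is one simple way to make each curation traceable. Your resource may well capture different or additional fields.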
Reflections
The goal of the third segment of the interview is to reflect on the demonstration and conclude the interview with parting thoughts, in around five minutes.
- Next Steps
  - What happens next after curation?
  - Does the data get reflected on the website immediately?
  - Does a second curator check things?
  - Are more substantial releases made periodically?
  - What are some difficulties/challenges in curating your resource?
  - What could authors/journals/publishers do to make it easier to curate?
  - What data should they include (that they don’t)?
  - How should data look?
  - What kinds of standards would you like to see developed?
  - Contrast curating a “good” paper versus a “bad” paper.
- Longevity and Sustainability
  - How/where do you think that AI has a place in the curation and maintenance of your resource?
  - What’s the funding situation like?
  - How much do you estimate it costs to maintain this resource per year?
First episode
In our inaugural episode, I interviewed Dr. Susan (Sue) Bello, a curator for the Mouse Genome Informatics (MGI) knowledge base and Alliance of Genome Resources (AGR) who works at the Jackson Laboratory in Maine. Sue is also the ISB executive committee chair. She showed us how she curates alleles in MGI using the paper “Mice deficient in TWIK-1 are more susceptible to kainic acid-induced seizures” (Kim et al., 2025).
Let us Interview You
If you curate a resource and want to be featured on the podcast, please reach out. My contact information is at the bottom of my blog, or the ISB can be contacted here.