[bioontology-support] capturing precise metadata descriptions (names) using BioPortal
jgraybeal at stanford.edu
Thu Oct 5 09:38:40 PDT 2017
I'm also replying off-line, but I'd like to share some general scenarios with the list. The question is, how can BioPortal be used to capture precise metadata descriptions for data sets and data values?
In a case where data is being submitted to a larger repository or data framework, either the whole data set, or each of the data records, may need to be described. This description will need to specify the names of things—the name of a medical condition, or of the sensor type that produced the data, or of the particular chemicals that could be in the environment, to pick just three examples. BioPortal is a repository of names, organized into vocabularies or "ontologies", that can be specified in precise ways.
There are three approaches to creating data descriptions (that is, metadata) that include these precise names.
(1) Manually improving your existing metadata
Let's assume we have a big data set, all in a database table, with maybe 10 columns that need labels, and two columns that are filled out with specific named values (like a disease, or a person's gender). For each column label, we want a well-defined name to use as a label, say for Sensor Type. To find a specific name to use from BioPortal, a user can visit https://bioportal.bioontology.org, enter 'sensor type' into the class search field at upper left, and obtain a list of possible selections. It won't be obvious at first which is the right one, so you'll probably need to click on several, look at the description of each item on the resulting page (and maybe check whether there are appropriate specific examples under the item in the tree listing on the left), and decide whether it is a good term to describe the column of data from the table. If it is a good term, then the ID under the Details section (it might look something like http://purl.bioontology.org/ontology/SNOMEDCT/408746007) is used to unambiguously refer to the chosen descriptive term.
For data values within a column, let's say Gender, one can also use specific descriptive terms for the content. For example, a search on Gender finds the Radiology Lexicon's term http://www.radlex.org/RID/#RID5652, which has two terms under it, labeled 'female' (http://www.radlex.org/RID/#RID5654) and 'male' (http://www.radlex.org/RID/#RID5653). These can be used in place of your gender data values ('male'/'female', 'M'/'F', or similar) to be very concrete. Note that other concepts for 'gender' in other ontologies might include more comprehensive lists of terms that can be used as values.
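As a minimal sketch of that substitution, the snippet below maps raw gender values to the two RadLex term IDs quoted above. The raw codes ('M', 'F', and so on), the table names, and the function name are hypothetical, chosen only for illustration; a real data set may need a different or longer list.

```python
# RadLex term IDs copied from the message above.
GENDER_TERMS = {
    "female": "http://www.radlex.org/RID/#RID5654",
    "male": "http://www.radlex.org/RID/#RID5653",
}

# Hypothetical raw codes that might appear in a gender column.
RAW_TO_LABEL = {"f": "female", "female": "female", "m": "male", "male": "male"}


def to_term_iri(raw_value):
    """Return the RadLex term ID for a raw gender value, or None if unrecognized."""
    label = RAW_TO_LABEL.get(raw_value.strip().lower())
    return GENDER_TERMS.get(label)
```

Returning None for unrecognized values, rather than guessing, leaves it to a person to decide whether another ontology's 'gender' concept offers a better match.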
It's free and easy to use BioPortal to get precise concepts in this way, and no login is required for the above scenario.
(2) Use BioPortal via an API to let users automatically choose controlled terms
For someone already building a system that needs to automatically offer value choices to users, an automated system to serve those terms may be required. For example, Injury Type can be filled out with a lot of different values (see the tree under http://bioportal.bioontology.org/ontologies/MEDDRA?p=classes&conceptid=10022117). In this case, the BioPortal API supports queries to find all the values under a particular term. This requires an API key, which comes with the login account obtained with free registration. API documentation is available at http://data.bioontology.org/documentation. With the API, software can automatically offer users specific terms from BioPortal, and use those terms to populate your metadata.
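A rough sketch of such a query in Python follows. It assumes the API's class "descendants" endpoint, the "Authorization: apikey token=..." header, and a paged response whose items sit under "collection" with a "prefLabel" field; check these against the API documentation before relying on them. The full class ID used in the usage note is inferred from the conceptid in the link above, and the function names are made up for this example.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "http://data.bioontology.org"


def descendants_url(ontology, class_iri):
    """Build the URL listing all terms under a class (the class ID must be URL-encoded)."""
    encoded = urllib.parse.quote(class_iri, safe="")
    return f"{API_BASE}/ontologies/{ontology}/classes/{encoded}/descendants"


def fetch_descendant_labels(ontology, class_iri, api_key):
    """Fetch the first page of descendant terms and return their preferred labels."""
    request = urllib.request.Request(
        descendants_url(ontology, class_iri),
        headers={"Authorization": f"apikey token={api_key}"},
    )
    with urllib.request.urlopen(request) as response:
        page = json.load(response)
    # Assumed response shape: {"collection": [{"prefLabel": ...}, ...], ...}
    return [item["prefLabel"] for item in page["collection"]]
```

With a valid key, something like fetch_descendant_labels("MEDDRA", "http://purl.bioontology.org/ontology/MEDDRA/10022117", "YOUR_API_KEY") should return labels that could populate a pick list; only the first page of results is read here.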
(3) Use CEDAR as an advanced interface for controlled terms in metadata
A more recently available option to easily specify and use controlled vocabularies and terms from BioPortal is to use CEDAR, http://metadatacenter.org/orientation (videos at https://metadatacenter.org/videos). CEDAR has both user interfaces and APIs that can let a person or team build metadata templates using BioPortal's controlled terms, and create their own controlled terms if necessary. Users can then fill out the templates with metadata, either in CEDAR itself, or with separately developed software that uses CEDAR's APIs to leverage CEDAR's services. More information is available at the CEDAR web site, https://metadatacenter.org, and people can play with the system by registering (no charge) at https://cedar.metadatacenter.net.
Different solutions may be appropriate
On Oct 4, 2017, at 2:50 PM, support at bioontology.org wrote:
One of our investigators is resubmitting a grant application and the reviewers strongly suggest the use of descriptive terms from existing biomedical ontologies (e.g. NCBO, BioPortal https://bioportal.bioontology.org).
We have a vast array of environmental exposure data and need to describe how we would subscribe to BioPortal for sharing the data elements/metadata. Could you briefly explain how I could best do that in a few sentences?
Technical Program Manager
Center for Expanded Data Annotation and Retrieval /+/ NCBO BioPortal
Stanford Center for Biomedical Informatics Research