[bioontology-support] are mapping counts accurate?

Michael Dorf mdorf at stanford.edu
Fri Oct 19 17:08:50 PDT 2018


Hi Mark,

Thanks for contacting us and bringing this issue to our attention.  Some time ago, we implemented a system that prevents expensive COUNT queries from running live against our 4store backend. These queries used to bog down our servers, often resulting in downtime. The COUNT queries were executed by paged REST services, like the one that retrieves all mappings for a given ontology.  To determine the correct number of pages for a given call, our system would first execute a COUNT query and store the result in the output. The new system pre-caches these counts, so when a paged service call is made, the count is retrieved from a static repository. Unfortunately, there appears to be a bug in this process that triggers the behavior you are seeing. I’ve created an issue in our GitHub repository to track our progress on fixing this problem:

https://github.com/ncbo/ontologies_linked_data/issues/88

For your specific example, it’s best to simply use an iterator to go through ALL pages of available mappings until you hit an empty collection instead of relying on the reported totalCount. For example:

http://data.bioontology.org/ontologies/EXACT/mappings?page=1&pagesize=500
vs
http://data.bioontology.org/ontologies/EXACT/mappings?page=50&pagesize=500
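
A minimal R sketch of that iteration, in case it helps. It assumes the service returns an empty collection for any page past the last one, as described above, and it reuses the endpoint and Authorization header from the quoted script. The names fetch_mappings_page, collect_all_mappings and max_pages are made up for illustration and are not part of the BioPortal API.

```r
# Hypothetical helper: fetch one page of mappings for an ontology.
# Requires the httr and jsonlite packages to be installed.
fetch_mappings_page <- function(ontology, page, page_size, api_key,
                                rest_url = "http://data.bioontology.org") {
  url <- paste0(rest_url, "/ontologies/", ontology,
                "/mappings?page=", page, "&pagesize=", page_size)
  response <- httr::GET(
    url,
    httr::add_headers(Authorization = paste0("apikey token=", api_key)),
    httr::accept_json()
  )
  httr::stop_for_status(response)
  jsonlite::fromJSON(httr::content(response, "text", encoding = "UTF-8"),
                     simplifyDataFrame = FALSE)
}

# Hypothetical helper: request pages 1, 2, 3, ... and stop at the first
# page whose collection is empty, ignoring any reported total count.
# max_pages is only a safety cap against an endless loop.
collect_all_mappings <- function(fetch_page, max_pages = 10000) {
  mappings <- list()
  for (page_number in seq_len(max_pages)) {
    page <- fetch_page(page_number)
    if (length(page$collection) == 0) {
      break
    }
    mappings <- c(mappings, page$collection)
  }
  mappings
}

# Example call (needs a valid API key, so it is left commented out):
# all_mappings <- collect_all_mappings(
#   function(p) fetch_mappings_page("EXACT", p, 500, API_KEY)
# )
```

Passing the page fetcher in as a function keeps the stopping rule separate from the HTTP call, so the loop can be tried against canned pages before pointing it at the live service.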

Thanks again for your report. Hope this works as a workaround for what you are trying to accomplish.

Michael


On Oct 19, 2018, at 6:36 AM, Miller, Mark <markampa at pennmedicine.upenn.edu> wrote:

I have been downloading your term mappings via the REST API.  Thanks.  I put an example of how I do it (in R) at the end of this message.

I noticed that the number of mappings I get rarely agrees with the counts I calculate from a page like http://bioportal.bioontology.org/ontologies/EXACT?p=mappings

Pasting that page into Excel and calculating the sum seems to say that there are 1110 BioPortal mappings from any EXACT term.

But my code below retrieves 1894 mappings.  If I take out the match method and then make the results non-redundant, I get 1893.  If I also take out the match ontology column and make it unique (on just the source term and the match term), I get 674 mappings.

Another example: my method gets 10 mappings from EXACT to ICO, but the web page says 9.  However, when I click on the ICO link on the EXACT mappings page (http://bioportal.bioontology.org/mappings/show/EXACT?target=http://data.bioontology.org/ontologies/ICO&height=600&width=800), I get 10 mappings!  (See immediately below.)


Thanks,
Mark



An ontology for experimental actions | Informed Consent Ontology | Source

information content entity<http://bioportal.bioontology.org/visualize/virtual/ICO?conceptid=http%3A%2F%2Fpurl.obolibrary.org%2Fobo%2FIAO_0000030> | IAO_0000030<http://bioportal.bioontology.org/visualize/virtual/EXACT?conceptid=http%3A%2F%2Fpurl.obolibrary.org%2Fobo%2FIAO_0000030> | SAME_URI

one-dimensional temporal region<http://bioportal.bioontology.org/visualize/virtual/ICO?conceptid=http%3A%2F%2Fpurl.obolibrary.org%2Fobo%2FBFO_0000038> | BFO_0000038<http://bioportal.bioontology.org/visualize/virtual/EXACT?conceptid=http%3A%2F%2Fpurl.obolibrary.org%2Fobo%2FBFO_0000038> | SAME_URI

object<http://bioportal.bioontology.org/visualize/virtual/ICO?conceptid=http%3A%2F%2Fpurl.obolibrary.org%2Fobo%2FBFO_0000030> | object<http://bioportal.bioontology.org/visualize/virtual/EXACT?conceptid=http%3A%2F%2Fwww.owl-ontologies.com%2FOntology1184060740.owl%23OWLClass_fede789e_08ff_4d4a_bf4a_cd6f393a670d> | LOOM

material entity<http://bioportal.bioontology.org/visualize/virtual/ICO?conceptid=http%3A%2F%2Fpurl.obolibrary.org%2Fobo%2FBFO_0000040> | BFO_0000040<http://bioportal.bioontology.org/visualize/virtual/EXACT?conceptid=http%3A%2F%2Fpurl.obolibrary.org%2Fobo%2FBFO_0000040> | SAME_URI

role<http://bioportal.bioontology.org/visualize/virtual/ICO?conceptid=http%3A%2F%2Fpurl.obolibrary.org%2Fobo%2FBFO_0000023> | role<http://bioportal.bioontology.org/visualize/virtual/EXACT?conceptid=http%3A%2F%2Fwww.owl-ontologies.com%2FOntology1184060740.owl%23OWLClass_705c1d8b_d7b2_455e_9f80_d4bc62eb57b1> | LOOM

Version<http://bioportal.bioontology.org/visualize/virtual/ICO?conceptid=http%3A%2F%2Fusefulinc.com%2Fns%2Fdoap%23Version> | version<http://bioportal.bioontology.org/visualize/virtual/EXACT?conceptid=http%3A%2F%2Fwww.owl-ontologies.com%2FOntology1184060740.owl%23OWLClass_870d9f06_a65a_415d_b7cc_e1dbbbcc9b2c> | LOOM

centrifuge<http://bioportal.bioontology.org/visualize/virtual/ICO?conceptid=http%3A%2F%2Fpurl.obolibrary.org%2Fobo%2FOBI_0400106> | centrifuge<http://bioportal.bioontology.org/visualize/virtual/EXACT?conceptid=http%3A%2F%2Fwww.owl-ontologies.com%2FOntology1184060740.owl%23Class_53> | LOOM

temporal region<http://bioportal.bioontology.org/visualize/virtual/ICO?conceptid=http%3A%2F%2Fpurl.obolibrary.org%2Fobo%2FBFO_0000008> | BFO_0000008<http://bioportal.bioontology.org/visualize/virtual/EXACT?conceptid=http%3A%2F%2Fpurl.obolibrary.org%2Fobo%2FBFO_0000008> | SAME_URI

site<http://bioportal.bioontology.org/visualize/virtual/ICO?conceptid=http%3A%2F%2Fpurl.obolibrary.org%2Fobo%2FBFO_0000029> | BFO_0000029<http://bioportal.bioontology.org/visualize/virtual/EXACT?conceptid=http%3A%2F%2Fpurl.obolibrary.org%2Fobo%2FBFO_0000029> | SAME_URI

textual entity<http://bioportal.bioontology.org/visualize/virtual/ICO?conceptid=http%3A%2F%2Fpurl.obolibrary.org%2Fobo%2FIAO_0000300> | IAO_0000300<http://bioportal.bioontology.org/visualize/virtual/EXACT?conceptid=http%3A%2F%2Fpurl.obolibrary.org%2Fobo%2FIAO_0000300> | SAME_URI





library(jsonlite)
library(httr)

# assign api key here
API_KEY  <-  ""

PAGE_SIZE <- 500
REST_URL <- "http://data.bioontology.org"

sourceOntoAbbr  <-  "RXNORM"

# sourceOntoAbbr  <-  "EXACT"

next_page <-
  paste0(REST_URL,
         "/ontologies/",
         sourceOntoAbbr,
         "/mappings?pagesize=",
         PAGE_SIZE)

all.output.chunks <- list()

# keep requesting pages until the API stops returning a nextPage link
while (!is.null(next_page)) {
  print(next_page)

  httpResponse <-
    GET(next_page,
        add_headers("Authorization" = paste0("apikey token=", API_KEY)),
        accept_json())

  page <-
    fromJSON(content(httpResponse, "text"), simplifyDataFrame = FALSE)

  next_page <- page$links$nextPage

  print(page$page)
  print(page$pageCount)

  one.output.chunk <-
    lapply(page$collection, function(current.collection) {


      matchMeth <- current.collection$source

      temp1 <- current.collection$classes[[1]]$links
      sourceOnt <- temp1$ontology
      inputTerm <- current.collection$classes[[1]]$`@id`

      temp2 <- current.collection$classes[[2]]$links
      matchOnt <- temp2$ontology
      matchTerm <- current.collection$classes[[2]]$`@id`

      return(
        list(
          "inputTerm" = inputTerm,
          "matchMeth" = matchMeth,
          "matchOnt" = matchOnt,
          "matchTerm" = matchTerm,
          "sourceOnt" = sourceOnt
        )
      )

    })
  one.output.chunk <- do.call(rbind.data.frame, one.output.chunk)

  all.output.chunks[[page$page]] <- one.output.chunk

}

all.output.chunks <- do.call(rbind.data.frame, all.output.chunks)

write.table(all.output.chunks, sep = "\t", row.names = FALSE, file = paste0(sourceOntoAbbr, "_bioportal_mappings.tsv"))
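
The de-duplicated counts quoted earlier in this message (all rows, rows with the match method dropped, and unique source-term/match-term pairs) could be computed from all.output.chunks roughly as follows. This is only a sketch: the function name count_distinct_mappings and the toy data frame are made up for illustration, and the column names are the ones built in the script above.

```r
# Hypothetical helper: count mappings at three levels of de-duplication.
count_distinct_mappings <- function(mappings) {
  c(
    # every row returned by the API
    all_rows = nrow(mappings),
    # drop the match method, then keep distinct rows
    ignoring_method = nrow(unique(
      mappings[, c("inputTerm", "matchOnt", "matchTerm", "sourceOnt")]
    )),
    # distinct (source term, match term) pairs only
    term_pairs_only = nrow(unique(mappings[, c("inputTerm", "matchTerm")]))
  )
}

# Toy data (made up) with the same columns as all.output.chunks
toy_mappings <- data.frame(
  inputTerm = c("A", "A", "A", "B"),
  matchMeth = c("LOOM", "SAME_URI", "LOOM", "LOOM"),
  matchOnt  = c("ICO", "ICO", "OBI", "ICO"),
  matchTerm = c("X", "X", "X", "Y"),
  sourceOnt = c("EXACT", "EXACT", "EXACT", "EXACT"),
  stringsAsFactors = FALSE
)

toy_counts <- count_distinct_mappings(toy_mappings)
print(toy_counts)
```

On the real data the call would be count_distinct_mappings(all.output.chunks).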