[bioontology-support] Still "writing" MTHSPL triples after 24 hrs, even with 128 GB RAM

Miller, Mark markampa at pennmedicine.upenn.edu
Thu Apr 4 08:03:55 PDT 2019


I submitted an issue to the ncbo/umls2rdf GitHub repo today.  I thought I'd share it here, too, in case there isn't much overlap in the audiences.

https://github.com/ncbo/umls2rdf/issues/29

I'm running the umls2rdf script on an Ubuntu 16 AWS EC2 server, and I bump the RAM up to 128 GB for these runs. I have extracted several other, larger sources with little or no difficulty. I'm using UMLS 2018AA and extracting on CUIs.
I haven't done any MySQL tuning, but the SQL portion of the extraction goes quickly: less than 5 minutes, I think. I have tried this with the MTHSPL content combined with other sources in a single mmsys extract/MySQL database, and I have also tried MTHSPL in a database all by itself, an approach that helped with some of the other sources.
The triple writing has been going for over a day, but I don't think the Turtle file has grown beyond roughly 400 MB in the last 10 hours. top shows the python process at 100% CPU but fairly small RAM usage: about 10 GB, I think.
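One way to check the "file has stopped growing" impression more precisely is to sample the output file's size at intervals. The following is only a minimal sketch: the function names, the sampling interval, and the output path in the usage note are hypothetical, and nothing here is part of umls2rdf itself.

```python
import os
import time


def sample_sizes(path, samples=3, interval_s=1.0):
    """Return the file size in bytes at each of `samples` checks.

    `path` is whatever Turtle file the writer is producing
    (an assumption; the real filename depends on the local setup).
    """
    sizes = []
    for i in range(samples):
        sizes.append(os.path.getsize(path))
        # Sleep between checks, but not after the last one.
        if i < samples - 1:
            time.sleep(interval_s)
    return sizes


def has_stalled(sizes):
    """True if the file did not grow between the first and last sample."""
    return len(sizes) >= 2 and sizes[-1] <= sizes[0]
```

For example, `has_stalled(sample_sizes("MTHSPL.ttl", samples=6, interval_s=600))` would compare sizes across roughly 50 minutes; the filename here is a placeholder. Output buffering can delay growth on disk, so a flat size over a short window is a hint rather than proof.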
Running select count(distinct CUI) from MRCONSO; in an MTHSPL-only database says there are 58,041 CUIs used by MTHSPL. I loaded the Turtle content I had after one day into a triplestore, and the following query shows only 3,633 CUIs from MTHSPL.

PREFIX umls: <http://bioportal.bioontology.org/ontologies/umls/>
select (count(distinct ?o) as ?count)
where
{
    graph <https://www.nlm.nih.gov/research/umls/sourcereleasedocs/current/MTHSPL/> {
        ?s umls:cui ?o
    }
}
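To put the two counts side by side, here is a small arithmetic sketch using only the numbers reported above (58,041 CUIs in MRCONSO, 3,633 CUIs in the triplestore). The function name is hypothetical, and the comparison assumes both counts refer to the same set of MTHSPL CUIs.

```python
def cui_coverage(written, expected):
    """Return (fraction written, number still missing)."""
    if expected <= 0:
        raise ValueError("expected must be positive")
    return written / expected, expected - written


fraction, missing = cui_coverage(written=3633, expected=58041)
print(f"{fraction:.2%} of CUIs written, {missing} still missing")
# prints "6.26% of CUIs written, 54408 still missing"
```

In other words, after more than a day the output covers only about 6% of the expected CUIs.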
