[bioontology-support] [BioPortal] Feedback from Nicolas Joannin

Michael Dorf mdorf at
Wed Oct 17 11:44:33 PDT 2018

Hi Nicolas,

Thank you for bringing this to our attention. We don’t offer actual download files for UMLS ontologies (unless you supply us with your UMLS license, offline), but I definitely see what you mean when looking at the user interface that displays these terms:<>

Roughly twice per year the UMLS [1] publishes a release of a group of ontologies, which we import into BioPortal. ICD9CM and ICD10CM are among the ontologies pulled in when we do the UMLS imports. At this point, I am not sure exactly where in the import process the wrong encoding gets introduced. I’ve logged this as an issue in our GitHub tracker:

Thanks again for your report!



On Oct 15, 2018, at 1:51 AM, support at<mailto:support at> wrote:

Name: Nicolas Joannin

Email: nicolas.joannin at<mailto:nicolas.joannin at>



Encoding problems with your ICD9CM and ICD10CM TTL files:


The files you provide for ICD9CM and ICD10CM have erroneous characters, e.g. "MÃ©niÃ¨re's disease" instead of "Ménière's disease".

This type of character replacement is typical when a UTF-8 encoded source file is read as a Windows-1252 encoded file (probably the default encoding of the machine used to process the data).
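
To illustrate the mechanism, here is a minimal sketch in Python. It is not the actual import code, which is not shown in this thread; it only assumes the mismatch described above (UTF-8 bytes decoded as Windows-1252, which Python calls cp1252).

```python
original = "Ménière's disease"

# Simulate the suspected bug: the UTF-8 bytes of the label are decoded
# with the wrong codec (Windows-1252 instead of UTF-8).
garbled = original.encode("utf-8").decode("cp1252")
print(garbled)  # MÃ©niÃ¨re's disease

# Undo the mistake: turn the garbled text back into the bytes it came
# from, then decode those bytes with the correct codec.
repaired = garbled.encode("cp1252").decode("utf-8")
print(repaired == original)  # True
```

Each accented letter takes two bytes in UTF-8, and Windows-1252 maps each of those bytes to its own character, which is why every accented letter turns into a two-character sequence.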

I would suggest that when next updating these files you consider updating your script to read the source data with the proper encoding.
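
As a sketch of what that could look like, assuming the processing script is written in Python and reads the source as text (both are assumptions; `read_labels` and `labels.txt` are hypothetical names), passing the encoding explicitly avoids depending on the machine's default:

```python
import os
import tempfile


def read_labels(path):
    """Read one label per line, always decoding the file as UTF-8."""
    # Without encoding=..., open() may fall back to the platform default,
    # which can be Windows-1252 on some Windows machines.
    with open(path, encoding="utf-8") as f:
        return [line.rstrip("\n") for line in f]


# Small demonstration using a temporary UTF-8 file.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "labels.txt")  # hypothetical file name
    with open(path, "w", encoding="utf-8") as f:
        f.write("Ménière's disease\n")
    labels = read_labels(path)

print(labels)
```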


bioontology-support mailing list
bioontology-support at<mailto:bioontology-support at>
