[protege-discussion] populate my individuals from database : protege-discussion Digest, Vol 76, Issue 26
Sluka, James Patrick
jsluka at indiana.edu
Tue Nov 13 12:10:04 PST 2012
> can anyone help me to find an efficient way to populate my ontology
> individuals from a database table or spreadsheet
How often do you need to do this?
--------------------------------
If this is a "one-off" process, that is, you've got a big list of
individuals and you just need to import it one time, then you can
pretty much do it using Excel and a bit of cutting and pasting in a
text editor.
I had to do this a few months ago: take a basic structure in Excel
that has classes, subclasses, definitions, external links, edit dates,
etc. and convert it into an OWL-2 ontology.
To do it, I built a very simple OWL-2 in Protege 4 with the basic
types, qualities, links etc. and saved the ontology as RDF/XML.
I then examined that file and extracted the syntax for each kind of
entry, for example the syntax for creating a class (subClass of xxxx,
definition, etc.).
In Excel you can use a spreadsheet cell to create a class (or
individual) where the details for the class (individual) are values of
other columns that are inserted into the correct RDF/XML syntax.
To create many individuals it is likely that the syntax is exactly the
same for all of them, just the details (name etc.) change. So the same
Excel formula can be used for all the individuals. Write the formula
once then paste it down an entire column.
Here is the Excel formula I used to create classes, one class per row,
with the info for the class taken from several cells in the same Excel
row:
="<owl:Class rdf:about=""&CBO_0_9;"&Classes!E6&""">
<rdfs:label>"&Classes!E6&"</rdfs:label>
<rdfs:subClassOf rdf:resource=""&CBO_0_9;"&Classes!F6&"""/>
<rdfs:comment>"&Classes!L6&"</rdfs:comment>"&C14&$C$1&B14&K14&L14&M14&"
</owl:Class>"
(In the above, Excel cell references were to another sheet in the
workbook named "Classes".)
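If you would rather not fight Excel's quoting, the same "fill in a
template once per row" idea can be sketched in a few lines of Python.
This is only an illustration, not what I actually used: the column
names (name, class, label) and the owl:NamedIndividual template are
assumptions, so copy the real syntax from a file that your own Protege
saved. The &CBO_0_9; entity is the one from my formula above.

```python
import csv
import io
from xml.sax.saxutils import escape

# Assumed template for one individual; replace it with the syntax found in
# an RDF/XML file saved by your own Protege.
INDIVIDUAL_TEMPLATE = (
    '<owl:NamedIndividual rdf:about="&CBO_0_9;{name}">\n'
    '    <rdf:type rdf:resource="&CBO_0_9;{cls}"/>\n'
    '    <rdfs:label>{label}</rdfs:label>\n'
    '</owl:NamedIndividual>'
)


def rows_to_rdfxml(csv_text):
    """Return one RDF/XML block per CSV row, joined with newlines.

    Expects the (hypothetical) columns: name, class, label.
    Names and classes are assumed to be safe to use inside an IRI;
    only the free-text label is XML-escaped.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    blocks = []
    for row in reader:
        blocks.append(INDIVIDUAL_TEMPLATE.format(
            name=row["name"].strip(),
            cls=row["class"].strip(),
            label=escape(row["label"]),
        ))
    return "\n".join(blocks)


# Made-up sample data, standing in for a sheet exported from Excel as CSV.
SAMPLE = "name,class,label\nCell_1,PhysicalEntityType,Cell number 1\n"
print(rows_to_rdfxml(SAMPLE))
```

Because the output never passes through an Excel cell, there are no
doubled quote marks to clean up afterwards.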
You can put <cr>'s into an Excel cell with <alt><enter> and that's how
I got the line breaks in the formula above. The formula above gives a
cell with the following text contents (again, this is for one row of
the Excel table and creates one OWL class for that row):
"<owl:Class rdf:about=""&CBO_0_9;PhysicalEntityType"">
<rdfs:label>PhysicalEntityType</rdfs:label>
<rdfs:subClassOf rdf:resource=""&CBO_0_9;CBO_Object""/>
<rdfs:comment>A PhysicalEntityType is a CBO:Class (BFO:snap:quality)
class of physical objects. For example, a Cell, a basement membrane of
a diffusible Molecule.</rdfs:comment>
<dc:date rdf:datatype=""&xsd;date"">21 July 2011</dc:date>
<dc:creator>jps</dc:creator>
<rdfs:seeAlso>BFO:snap:quality</rdfs:seeAlso>
<rdfs:isDefinedBy>urn:miriam:dummy:1</rdfs:isDefinedBy>
<rdfs:isDefinedBy>urn:miriam:dummy:2</rdfs:isDefinedBy>
<rdfs:isDefinedBy>urn:miriam:dummy:3</rdfs:isDefinedBy>
</owl:Class>"
I then copied the column with the results from Excel into a basic text
editor and removed the extra quote marks (each single " is removed and
each "" is converted into ").
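The quote cleanup can also be scripted instead of done by hand in the
editor. A minimal sketch, assuming the pasted text is already in a
Python string; the function name is made up for this example:

```python
import re


def strip_excel_quotes(text):
    """Undo the quoting Excel adds when a multi-line cell is copied out.

    Each doubled quote ("") becomes one quote ("), and each remaining
    single quote is removed.
    """
    # Match a quote plus an optional second quote, and keep only the
    # second one (which is empty when the quote stood alone).
    return re.sub(r'"("?)', r'\1', text)


pasted = '"<owl:Class rdf:about=""&CBO_0_9;PhysicalEntityType"">\n</owl:Class>"'
print(strip_excel_quotes(pasted))
```

Note that this removes every lone quote mark, so it only works if the
text in your cells does not contain quote marks of its own.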
I then pasted that big chunk of XML into the original template OWL-2
file. If everything is done right, that file can be opened in Protege 4
and will be a valid OWL-2 ontology. It took a few tries to get the
syntax and quote marks correct, so save your Excel worksheet so you can
go back and fix any problems with the formulas.
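Before opening the result in Protege, a quick well-formedness check
catches most quote-mark mistakes. The sketch below pastes a chunk in
front of the closing </rdf:RDF> tag of a template and runs it through
Python's standard XML parser. The template and the example.org IRIs
are placeholders I made up; use the file your Protege saved. This only
checks that the XML parses, not that the ontology is valid OWL-2.

```python
import xml.etree.ElementTree as ET

# Placeholder template; the example.org IRIs are hypothetical.
TEMPLATE = '''<?xml version="1.0"?>
<!DOCTYPE rdf:RDF [
    <!ENTITY CBO_0_9 "http://example.org/CBO_0_9#">
]>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
         xmlns:owl="http://www.w3.org/2002/07/owl#">
    <owl:Ontology rdf:about="http://example.org/CBO_0_9"/>
</rdf:RDF>
'''


def splice(template, chunk):
    """Insert chunk just before the last closing </rdf:RDF> tag."""
    head, closing, tail = template.rpartition("</rdf:RDF>")
    if not closing:
        raise ValueError("template has no closing </rdf:RDF> tag")
    return head + chunk + "\n" + closing + tail


def xml_error(document):
    """Return None if the document parses, else the parser's message."""
    try:
        ET.fromstring(document)
    except ET.ParseError as err:
        return str(err)
    return None


chunk = ('<owl:Class rdf:about="&CBO_0_9;PhysicalEntityType">\n'
         '<rdfs:label>PhysicalEntityType</rdfs:label>\n'
         '</owl:Class>')
problem = xml_error(splice(TEMPLATE, chunk))
print("looks well-formed" if problem is None else "XML problem: " + problem)
```

If the parser reports a line and column, that usually points straight
at the row whose formula needs fixing.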
This is a pretty crude hack but if you only need to do it once (or
perhaps only occasionally) it is probably easier than writing a Protege
plugin to import from an Excel file.
Jim Sluka
Indiana Univ.