[protege-discussion] Source Control Mightmare

Mark Feblowitz MarkFeblowitz at comcast.net
Wed Jan 31 05:26:04 PST 2007


Ah - I look forward to reading that paper.

I've had the same challenge when using ClearCase with our many Owl ontologies.

I have often noticed issues - even in Protege Owl 
- with the way ontologies are serialized to files.

It seems to be both a convenience and an 
efficiency to be able to write any of a number of 
semantically equivalent but differently ordered 
Owl assertions to a file. Yet it does play havoc with xml file comparators.

Specifically in Protege, I've noticed that the 
order of at least some parts of the serialized 
owl files can be mostly or totally reversed each 
time the ontology is serialized to a file. When 
an owl file is later read back in, the order in 
which a "multi-parent" individual is associated 
with its various parents is also reversed. This 
changes the behavior of the instance editor, 
which seems to favor the first parent in the set of asserted parents.

It occurred to me that one might be able to write 
a separate tool that takes in an owl file and 
reorders it into some repeatable uniform order. 
It wouldn't be easy, though, since in many cases 
there is no convenient way to determine a 
repeatable ordering of elements (i.e., you 
couldn't know what the original/intended ordering 
of class membership was, without some additional 
annotation). You'd have to use some convention, 
e.g., alphabetically sorting the node in the 
tree. Not pretty, and possibly not completely 
repeatable, but even partial uniformity would help with the merge problem.
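
A minimal sketch of such a reordering tool, in Python using only the standard library. It is an illustration under stated assumptions, not something Protege or jena provides: the names canonicalize and canonical_string are made up here, the "convention" is simply to sort sibling elements by tag, then attributes, then serialized content, and it assumes sibling order carries no meaning (which is not true for ordered constructs such as rdf:parseType="Collection" lists).

```python
import xml.etree.ElementTree as ET


def _drop_blank(text):
    # Whitespace-only text is just indentation; treat it as absent.
    return text if text and text.strip() else None


def canonicalize(elem):
    """Recursively reorder an element tree into a repeatable order (in place)."""
    elem.text = _drop_blank(elem.text)
    elem.tail = _drop_blank(elem.tail)

    # Put attributes in sorted order so serialization does not depend on input order.
    attrs = sorted(elem.attrib.items())
    elem.attrib.clear()
    elem.attrib.update(attrs)

    for child in elem:
        canonicalize(child)

    # Children are already canonical, so their serialized form is a stable tie-breaker.
    elem[:] = sorted(
        elem,
        key=lambda c: (c.tag, sorted(c.attrib.items()), ET.tostring(c, encoding="unicode")),
    )
    return elem


def canonical_string(xml_text):
    """Parse XML text and return it with siblings in canonical order."""
    return ET.tostring(canonicalize(ET.fromstring(xml_text)), encoding="unicode")


# Two serializations of the same content, differing only in order and indentation.
first = '<root><b name="y"/><a name="x"/><b name="x"/></root>'
second = '<root>\n  <a name="x"/>\n  <b name="x"/>\n  <b name="y"/>\n</root>'
print(canonical_string(first) == canonical_string(second))  # True
```

Running both branches' files through something like this before a merge would at least remove differences that are purely a matter of ordering, leaving the real changes for the comparator.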

It would be better if there were a uniform 
serialization technique built into jena, 
switch-controllable in Protege. The default could 
be for efficient saves, but switchable to perform 
(somewhat) uniformly ordered saves. I assume that 
a special, owl-specific serializer would be 
necessary, if the model was known to be Owl RDF.
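
To make the idea of a switch concrete, here is a small Python sketch. It does not use or imitate the jena API; serialize and its sort_output flag are hypothetical names, and the triples are plain tuples of strings standing in for a real model.

```python
def serialize(triples, sort_output=False):
    """Render (subject, predicate, object) string triples, one per line.

    sort_output is a hypothetical switch: False keeps the incoming order
    (the cheap default), True sorts the triples so repeated saves match.
    """
    ordered = sorted(triples) if sort_output else list(triples)
    return "".join(f"{s} {p} {o} .\n" for s, p, o in ordered)


run_1 = [(":Dog", "rdf:type", "owl:Class"), (":Cat", "rdf:type", "owl:Class")]
run_2 = list(reversed(run_1))  # same assertions, reversed order

print(serialize(run_1) == serialize(run_2))              # False
print(serialize(run_1, True) == serialize(run_2, True))  # True
```

The default path does no extra work, so only those who need diff-friendly files pay for the sort.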

I'm not saying that anybody should forbid other 
techniques. What's needed is a practical means of 
improving the situation, for those of us who 
have to maintain complex ontologies in a production setting.

Is this at all feasible?

Mark

At 05:24 AM 1/31/2007, Jan Henke wrote:
>Hi Thomas,
>
>there can be several different serializations of an ontology, all of them
>having the same semantics. Therefore it would not be correct to forbid any
>of them. For this reason, versioning just on the basis of some serialization
>doesn't work for ontologies. However there are approaches for solving this.
>See for instance http://iswc2006.semanticweb.org/items/Noy2006fj.pdf
>
>Best regards
>Jan
>
>
>
>
> > -----Original Message-----
> > From: protege-discussion-bounces at mailman.stanford.edu
> > [mailto:protege-discussion-bounces at mailman.stanford.edu] On
> > Behalf Of Thomas Russ
> > Sent: Tuesday, 30 January 2007 18:59
> > To: User support for Core Protege and the Protege-Frames editor
> > Subject: Re: [protege-discussion] Source Control Mightmare
> >
> >
> > On Jan 30, 2007, at 9:45 AM, Samson Tu wrote:
> >
> > >
> > > Have you tried to do the version comparison and merging using the
> > > Prompt plugin that comes with Protege? That's what the plugin is
> > > designed to do.
> >
> > That certainly helps with the merging step, but it doesn't
> > solve the problem of a real impedance mismatch between the
> > somewhat random order in which terms are saved by Protege and
> > the assumption of small, incremental and local changes that is
> > made by source control systems like CVS or SVN.
> >
> > Defining a canonical order in which to save information would
> > greatly aid in using such source control tools with Protege
> > ontologies.  This would be a big help for large projects, so
> > I need to express my support for John's feature request.
> >
> > It wouldn't really be all that hard to do, either.  All that
> > is required is to decide on the order to save (i.e., Classes
> > or Properties/Slots first) and then sort the objects by their
> > name before saving.  I have done this for an export plugin I
> > wrote and it isn't all that difficult.  An additional sort on
> > template slot information for classes will also cause the
> > substructure to be sorted.
> >
> > That would at least cause the terms to appear in the same
> > order when there are no changes and that would go a long way
> > to making the resulting files work well with source control tools.
> >
> > If there is concern about the cost of sorting the objects
> > each time one saves, then this could be addressed by
> > introducing a configuration property that determines if one
> > wants sorted output or not.  My feeling is that sorting
> > doesn't add much overhead on saving, but I haven't used this
> > on very large ontologies.
> >
> > But I think this would be a good feature to include in the
> > next version of Protege.
> >
> >
> > >
> > > John Patrick wrote:
> > >> Greetings,
> > >>
> > >> I've searched the message archives but have been unable to find
> > >> similar problems. I've been using Protege for the last 6 months and
> > >> have slowly started to have more and more issues with how Protege
> > >> saves owl files.
> > >>
> > >> The project I'm on is maintained in a perforce repository and
> > >> branched as required, once a branch is finished or stable it is
> > >> merged back into the main branch. The issue comes when you try to
> > >> merge owl files.
> > >> A merge takes about 3 days, of which over 2.5 days is just sorting
> > >> out manually merging the owl files. Identifying changes which have
> > >> occurred in both branches and then implementing those changes.
> > >>
> > >> Is there a way of getting Protege to group and sort
> > >> objects/properties/attributes when it saves an owl file? I don't
> > >> mind how it's ordered or grouped; I'd just like some conformity to
> > >> how it does it.
> > >>
> > >> John Patrick
> > >> _______________________________________________
> > >> protege-discussion mailing list
> > >> protege-discussion at lists.stanford.edu
> > >> https://mailman.stanford.edu/mailman/listinfo/protege-discussion
> > >>
> > >> Instructions for unsubscribing:
> > >> http://protege.stanford.edu/doc/faq.html#01a.03
> > >>
> > >
> > >
> > > --
> > > Samson Tu                    email: swt at stanford.edu
> > > Senior Research Scientist    web: www.stanford.edu/~swt/
> > > Stanford Medical Informatics phone: 1-650-725-3391
> > > Stanford University          fax: 1-650-725-7944
> > >
> >
> >
>



