[protege-discussion] Source Control Mightmare
MarkFeblowitz at comcast.net
Wed Jan 31 05:26:04 PST 2007
Ah - I look forward to reading that paper.
I've had the same challenge when using ClearCase with our many Owl ontologies.
I have often noticed issues - even in Protege Owl
- with the way ontologies are serialized to files.
It seems to be both convenient and efficient
to be able to write any of a number of
semantically equivalent but differently ordered
OWL assertions to a file. Yet it plays havoc with XML file comparators.
Specifically in Protege, I've noticed that the
order of at least some parts of the serialized
OWL files can be mostly or totally reversed each
time the ontology is serialized to a file. When
an OWL file is later read back in, the order in
which a "multi-parent" individual is associated
with its various parents is also reversed,
changing the behavior of the instance editor,
which seems to favor the first parent in the set of asserted parents.
It occurred to me that one might be able to write
a separate tool that takes in an owl file and
reorders it into some repeatable uniform order.
It wouldn't be easy, though, since in many cases
there is no convenient way to determine a
repeatable ordering of elements (i.e., you
couldn't know what the original/intended ordering
of class membership was, without some additional
annotation). You'd have to use some convention,
e.g., alphabetically sorting the nodes in the
tree. Not pretty, and possibly not completely
repeatable, but even partial uniformity would help with the merge problem.
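To make the idea concrete, here is a rough sketch of such a reordering tool. It is only an illustration under several assumptions, not anything Protege or Jena actually does: it assumes the file is RDF/XML, it uses Python's standard xml.etree.ElementTree module, and the names canonicalize and canonical_rdfxml are made up for this example. Elements are sorted by tag and then by their rdf:about, rdf:ID or rdf:resource value. Anonymous nodes that carry none of these keep their incoming relative order, so the result is only partially uniform.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
# Keep familiar prefixes in the output instead of ns0, ns1, ...
ET.register_namespace("owl", "http://www.w3.org/2002/07/owl#")
ET.register_namespace("rdfs", "http://www.w3.org/2000/01/rdf-schema#")


def _key(elem):
    """Sort by tag, then by the first identifying RDF attribute found."""
    for attr in ("about", "ID", "resource"):
        value = elem.get("{%s}%s" % (RDF, attr))
        if value is not None:
            return (elem.tag, value)
    # Assumption: fall back to literal text; anonymous nodes tie here and
    # keep their input order because sorted() is stable.
    return (elem.tag, (elem.text or "").strip())


def canonicalize(elem):
    """Recursively sort child elements in place (hypothetical helper)."""
    parse_type = elem.get("{%s}parseType" % RDF)
    if parse_type == "Literal":
        return  # XML literal content must not be touched
    children = list(elem)
    for child in children:
        canonicalize(child)
        child.tail = "\n"  # one element per line, original indentation dropped
    if children:
        elem.text = "\n"
        # rdf:parseType="Collection" encodes an ordered list: keep its order.
        if parse_type != "Collection":
            elem[:] = sorted(children, key=_key)


def canonical_rdfxml(text):
    """Return the RDF/XML document in `text` with a repeatable element order."""
    root = ET.fromstring(text)
    canonicalize(root)
    return ET.tostring(root, encoding="unicode")
```

Running both branches' files through something like canonical_rdfxml before diffing or merging should leave only the real changes, at least for named classes, properties and individuals. Comments and any DOCTYPE entity declarations are not preserved by this sketch.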
It would be better if there were a uniform
serialization technique built into Jena,
switch-controllable in Protege. The default could
be for efficient saves, but switchable to perform
(somewhat) uniformly ordered saves. I assume that
a special, owl-specific serializer would be
necessary, if the model was known to be Owl RDF.
I'm not saying that anybody should forbid other
techniques. What's needed is a practical means of
improving the situation, for those of us who
have to maintain complex ontologies in a production setting.
Is this at all feasible?
At 05:24 AM 1/31/2007, Jan Henke wrote:
>There can be several different serializations of an ontology, all of them
>having the same semantics. Therefore it would not be correct to forbid any
>of them. For this reason, versioning just on the basis of some serialization
>doesn't work for ontologies. However, there are approaches for solving this.
>See for instance http://iswc2006.semanticweb.org/items/Noy2006fj.pdf
> > -----Original Message-----
> > From: protege-discussion-bounces at mailman.stanford.edu
> > [mailto:protege-discussion-bounces at mailman.stanford.edu] On
> > Behalf Of Thomas Russ
> > Sent: Tuesday, 30 January 2007 18:59
> > To: User support for Core Protege and the Protege-Frames editor
> > Subject: Re: [protege-discussion] Source Control Mightmare
> > On Jan 30, 2007, at 9:45 AM, Samson Tu wrote:
> > >
> > > Have you tried to do the version comparison and merging using the
> > > Prompt plugin that comes with Protege? That's what the plugin is
> > > designed to do.
> > That certainly helps with the merging step, but it doesn't
> > solve the problem of a real impedance mismatch between the
> > somewhat random order in which terms are saved by Protege and
> > the assumption of small, incremental and local changes that is
> > made by source control systems like CVS or SVN.
> > Defining a canonical order in which to save information would
> > greatly aid in using such source control tools with Protege
> > ontologies. This would be a big help for large projects, so
> > I need to express my support for John's feature request.
> > It wouldn't really be all that hard to do, either. All that
> > is required is to decide on the order to save (i.e., Classes
> > or Properties/Slots first) and then sort the objects by their
> > name before saving. I have done this for an export plugin I
> > wrote and it isn't all that difficult. An additional sort on
> > template slot information for classes will also cause the
> > substructure to be sorted.
> > That would at least cause the terms to appear in the same
> > order when there are no changes and that would go a long way
> > to making the resulting files work well with source control tools.
> > If there is concern about the cost of sorting the objects
> > each time one saves, then this could be addressed by
> > introducing a configuration property that determines if one
> > wants sorted output or not. My feeling is that sorting
> > doesn't add much overhead on saving, but I haven't used this
> > on very large ontologies.
> > But I think this would be a good feature to include in the
> > next version of Protege.
> > >
> > > John Patrick wrote:
> > >> Greetings,
> > >>
> > >> I've searched the message archives but have been unable to find
> > >> similar problems. I've been using Protege for the last 6 months and
> > >> have slowly started to have more and more issues with how Protege saves
> > >> owl files.
> > >>
> > >> The project I'm on is maintained in a perforce repository and
> > >> branched as required, once a branch is finished or stable it is
> > >> merged back into the main branch. The issue comes when you try to
> > >> merge owl files.
> > >> A merge takes about 3 days, of which over 2.5 days is just sorting
> > >> out manually merging the owl files: identifying changes which have
> > >> occurred in both branches and then implementing those changes.
> > >>
> > >> Is there a way of getting Protege to group and sort
> > >> objects/properties/attributes when it saves an owl file? I don't
> > >> mind how it's ordered or grouped; I'd just like some conformity in how
> > >> it does it.
> > >>
> > >> John Patrick
> > >> _______________________________________________
> > >> protege-discussion mailing list
> > >> protege-discussion at lists.stanford.edu
> > >> https://mailman.stanford.edu/mailman/listinfo/protege-discussion
> > >>
> > >> Instructions for unsubscribing: http://protege.stanford.edu/doc/faq.html#01a.03
> > >>
> > >
> > >
> > > --
> > > Samson Tu email: swt at stanford.edu
> > > Senior Research Scientist web: www.stanford.edu/~swt/
> > > Stanford Medical Informatics phone: 1-650-725-3391
> > > Stanford University fax: 1-650-725-7944
> > >