[p4-feedback] Questions re the Future of 4.0

Timothy Redmond tredmond at stanford.edu
Thu Apr 23 11:51:12 PDT 2009


> I am currently testing against 3.4 since I hope to use Protégé as a  
> backend for several web-based applications.  Ideally, I would have a  
> frontend which would present results on the web, and an  
> administrative backend for data entry.  I have noticed several  
> limitations with 3.4 in these areas, and was wondering what was  
> slated for 4.0.

We are planning on having a Protege-4 version of web protege.  This is  
already funded but I am not sure of the start date for this project.

> 1) Will concurrent transactions be supported?  Do I understand  
> correctly that in 3.4 they are synchronized/blocking operations?

Protege 4 will support sandboxing of changes which I think will turn  
out to be a much better model than transactions.  This is a very  
natural approach with the owl api.  It will allow a user/process to  
make local copies of a change, employ a reasoner or other tools to  
evaluate these changes and then finally commit them to the server.   
Protege 4 will support a pluggable locking mechanism whereby different  
users/processes can lock different portions of the ontology for their  
work.  Two plugins that we will probably develop early are the NCI  
locking mechanism and Julian's locking mechanism.
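
To make the sandboxing idea a little more concrete, here is a minimal sketch in Python. It is only an illustration of the general pattern (record changes locally, evaluate them, then commit), not Protege 4 or owl api code. The names Change, Server, Sandbox and commit_if are hypothetical, axioms are plain strings, and the "reasoner" is just a callback that accepts or rejects the previewed result.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    """One ontology edit: add or remove a single axiom (here just a string)."""
    kind: str   # "add" or "remove"
    axiom: str


def apply_changes(axioms, changes):
    """Return a new axiom set with the changes applied in order."""
    result = set(axioms)
    for change in changes:
        if change.kind == "add":
            result.add(change.axiom)
        elif change.kind == "remove":
            result.discard(change.axiom)
        else:
            raise ValueError(f"unknown change kind: {change.kind}")
    return result


class Server:
    """Hypothetical stand-in for the shared ontology on the server."""

    def __init__(self, axioms=()):
        self.axioms = set(axioms)

    def commit(self, changes):
        self.axioms = apply_changes(self.axioms, changes)


class Sandbox:
    """Collects local changes without touching the server until commit."""

    def __init__(self, server):
        self.server = server
        self.pending = []

    def record(self, change):
        self.pending.append(change)

    def preview(self):
        # What the ontology would look like if the pending changes were applied.
        return apply_changes(self.server.axioms, self.pending)

    def commit_if(self, check):
        # 'check' plays the role of a reasoner or other evaluation tool.
        if not check(self.preview()):
            return False
        self.server.commit(self.pending)
        self.pending = []
        return True
```

A user would record a few changes, inspect preview(), and call commit_if with whatever check is appropriate. Nothing reaches the server unless the check passes. Locking of ontology portions is left out of this sketch entirely.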

Protege 3.4 uses transactions but the approach taken there has some  
disadvantages.  First, the transaction model in Protege 3.4 is derived  
from the underlying database storage mechanism.  We have discovered  
that there is a mismatch between the type of transaction locking  
provided by the database and the desired locking for ontologies.  In  
practice concurrent transactions often get in each other's way.  In
addition, making transactions work with ontologies proved extremely  
complex in Protege 3.4.   I hope we can avoid that type of complexity  
in Protege 4.

Protege 4 will also handle concurrency more efficiently.  In Protege
3.4 the knowledge base is synchronized by a single coarse-grained
lock.  In Protege 4, the owl api will use fine-grained locks to
achieve thread-safety for multiple readers.  (Writers will still take
a coarse-grained lock).
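
As an illustration of why letting multiple readers proceed helps, here is a small Python sketch of a readers-writer lock. This is a swapped-in, textbook technique and an assumption on my part about the general shape of the idea; it is not the owl api's actual locking code. The class name ReadWriteLock and its methods are hypothetical.

```python
import threading


class ReadWriteLock:
    """Many readers may hold the lock at once; a writer holds it alone."""

    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0       # number of readers currently inside
        self._writing = False   # True while a writer holds the lock

    def acquire_read(self):
        with self._cond:
            # Readers only wait for an active writer, never for each other.
            while self._writing:
                self._cond.wait()
            self._readers += 1

    def release_read(self):
        with self._cond:
            self._readers -= 1
            if self._readers == 0:
                self._cond.notify_all()

    def acquire_write(self):
        with self._cond:
            # A writer waits until there is no writer and no reader.
            while self._writing or self._readers > 0:
                self._cond.wait()
            self._writing = True

    def release_write(self):
        with self._cond:
            self._writing = False
            self._cond.notify_all()
```

With a single coarse-grained lock, two threads that only read would still block each other. Here they do not, while a writer still gets exclusive access. This simple version can starve writers if readers keep arriving, which a production lock would need to address.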

> 2) Are there plans to better isolate the GUI, server and client  
> components of Protégé?

This is not planned in the immediate future for Protege 3.4.  The  
server definitely will run headless in the current implementation.  I  
believe that client code can also run headless but I will have to  
check this.  The client and the server are well isolated from one  
another.  But you are right there is some unfortunate linkage between  
the ontology model and the GUI.

> 3) In 3.4, the project and remote project manager are singletons.   
> This means a client can only connect to one server project at a  
> time.  Will this limitation be removed in 4.0?

First note that when you open a collaborative project, you are  
actually opening several server projects simultaneously - the main  
project, the annotations/changes project and the chat project.   On  
the other hand, a server can only serve up Protege once because the  
server object is a singleton.

There is a gui limitation that a Protege 3.4 client can only edit one  
main project at a time.   Protege 3.4 has always worked this way.  In  
Protege 4 this has never been the case.  From the start, Protege 4 could
be used to edit as many ontologies simultaneously as you can load.

> One problem that I think may be related to this is that in 3.4  
> transactions in a particular VM are not fully isolated.  In  
> particular, if I create two sessions in the same application, begin  
> a transaction in model A, and create a new individual w/o  
> committing, model B sees the new individual (but the properties are  
> null until A commits).  If A and B are created in different  
> classloaders/VMs, B does not see the new individual until A commits.


Transactions are fully isolated in Protege 3.4 but it does depend on
the backend.  If the server uses a file backend then the transaction
isolation level is pretty low and there is no protection.  But with a  
database backend, the transaction isolation level can go up to what is  
supported by the database backend.  We run nightly junits that test  
that the isolation is enforced and these junits do run on a single  
jvm.  (Not for any special reason other than to simplify some already  
quite complex tests).

There is another limitation that you might have bumped into.  You  
can't use the same session to connect twice to the same project in  
Protege 3.4.  The server only sends updates for a project once for
the session and these updates then get divided between the two remote  
projects.  Also the notion of a transaction is defined relative to a  
session.  There is a RemoteServer.cloneSession method that can help  
with this problem.  I mention this because you say you are having  
different experiences in the same vs. different jvms.
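
The effect of sharing one session can be pictured with a toy model in Python. This is only a sketch of the behaviour described above, assuming updates are queued once per session. ToyServer, open_session, publish and poll are hypothetical names, and the sketch deliberately does not call RemoteServer.cloneSession, whose signature is not shown here.

```python
from collections import deque


class ToyServer:
    """Delivers each update exactly once per session (not once per project view)."""

    def __init__(self):
        self._queues = {}   # session id -> updates not yet delivered

    def open_session(self, session_id):
        self._queues[session_id] = deque()

    def publish(self, update):
        # Every open session gets its own copy of the update.
        for queue in self._queues.values():
            queue.append(update)

    def poll(self, session_id):
        # Hand out the next pending update for this session, or None.
        queue = self._queues[session_id]
        return queue.popleft() if queue else None
```

If two remote projects both poll with the same session id, each update goes to whichever one asks first, so each sees only part of the stream. If each has its own session, both see every update, which is the situation a cloned session is meant to restore.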

> 4) To what degree will the 4.0 API be compatible with the 3.4 one?   
> In other words, if I code to 3.4 for now, how complex will the  
> migration path to 4.0 be?

There is not much common ground between the two implementations.   
Perhaps if you can isolate the code that uses the Protege 3.4 owl api/ 
owl api from the rest of your app this would help.  They are both  
Swing based so perhaps some of the gui code could survive.

> 5) Will the client-server architecture be the same as 3.4? (i.e. DB/ 
> File<-->Server<-(rmi)->Client<-->App?

The architecture will be very different.   But Protege 4 will probably  
support rmi and web-service based communication and will also have a
database backend.  Simply creating a client-server will not depend  
much on the database backend.  But for more features such as  
WebProtege the database back end will be very important.

The Protege 3.4 architecture was based on implementing the Protege 3.4  
owl api over the wire.  This approach makes many demands on the  
network and requires sophisticated caching protocols to ensure that  
the client will be responsive even when the network is slow.  The  
Protege 4 architecture [1,2] is based on the idea of change  
management.  Most of the network communication will involve updates of  
changes made to the ontology between the server and different  
clients.  This mechanism will decouple the client from the server.  In  
an extreme case, a client could even go offline for an extended period  
of time and have his changes committed when he gets back.  (Like I can  
do with e-mail.)
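
The offline case can be sketched roughly as follows in Python. This is a minimal illustration of change-based synchronization under my own assumptions, not the design in [1,2]. ChangeServer, OfflineClient, edit and sync are hypothetical names, changes are (kind, axiom) tuples, and there is no conflict detection: for any given axiom the most recently committed change simply wins.

```python
class ChangeServer:
    """Keeps every committed change in commit order."""

    def __init__(self):
        self.log = []

    def changes_since(self, revision):
        # 'revision' is the number of log entries the caller has already seen.
        return self.log[revision:]

    def commit(self, changes):
        self.log.extend(changes)
        return len(self.log)   # new revision number


class OfflineClient:
    """Edits locally while offline, then exchanges changes on sync."""

    def __init__(self, server):
        self.server = server
        self.revision = 0   # server changes already applied locally
        self.axioms = set()
        self.outbox = []    # local changes not yet sent to the server

    def _apply(self, kind, axiom):
        if kind == "add":
            self.axioms.add(axiom)
        else:
            self.axioms.discard(axiom)

    def edit(self, kind, axiom):
        # Works without any network access.
        self._apply(kind, axiom)
        self.outbox.append((kind, axiom))

    def sync(self):
        # Pull what others committed while we were away...
        for kind, axiom in self.server.changes_since(self.revision):
            self._apply(kind, axiom)
        # ...re-apply our own edits on top so local state matches the
        # order the server log will have, then push them.
        for kind, axiom in self.outbox:
            self._apply(kind, axiom)
        self.revision = self.server.commit(self.outbox)
        self.outbox = []
```

Only the changes cross the network, never the whole ontology, so a client that has been offline for a long time just has a longer list to pull and push when it reconnects.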

> 6) When will the above features be released?

It is hard to say.  Work is starting now.  But one of the things that  
will have to happen first is the update of Protege 4 to the full set  
of OWL 2.0 features.   I don't really want to commit but I am hoping  
that in 3-6 months we will have something.

> 7) Lastly, was there any load testing done on Protégé 4.0 (or 3.4  
> for that matter?)  I'm curious of what magnitude of data can be  
> reasonably supported in client-server mode.


This question probably needs some refining.  In standalone mode, there  
has been quite a bit of load testing.  Tania recently loaded SNOMED  
into a database and Matthew has done several experiments with the owl  
api.  The owl api is much more lightweight than Protege 3.4 and  
Matthew has achieved some impressive results.  I don't have  
architecture/timing pairs handy but I wouldn't be surprised if these  
can easily be found on the internet (at least for the owl api).

For the Protege 4 client server, these standalone results will probably
give an accurate picture of the limits because the client will be
required to do the parsing and navigation of the ontology.  Even for
really large ontologies, the changes being transferred between clients
will not grow so the network will not be overloaded.  In Protege 3.4
with its dependence on caches and interaction with the server for each
call, the situation is more complex.  We have a user base that edits
the NCI thesaurus (a moderately large ontology) using the client-server
on a regular basis.

-Timothy


[1] http://sunsite.informatik.rwth-aachen.de/Publications/CEUR-WS/Vol-432/owled2008eu_submission_33.pdf
[2] https://bmir-gforge.stanford.edu/gf/project/owleditor/wiki/?pagename=ClientServerDesignIdea


On Apr 23, 2009, at 6:48 AM, Miceli, Gino (FOEL) wrote:

> Dear Protégé Team,
>
> First of all, I would like to compliment you on a fine job of  
> Protégé so far.  I've been evaluating it to model and maintain some
> of our knowledge bases, and the tools provided seem quite powerful and
> well thought out.
>
> I am currently testing against 3.4 since I hope to use Protégé as a  
> backend for several web-based applications.  Ideally, I would have a  
> frontend which would present results on the web, and an  
> administrative backend for data entry.  I have noticed several  
> limitations with 3.4 in these areas, and was wondering what was  
> slated for 4.0.  In particular:
>
> 1) Will concurrent transactions be supported?  Do I understand  
> correctly that in 3.4 they are synchronized/blocking operations?
> 2) Are there plans to better isolate the GUI, server and client  
> components of Protégé?
> 3) In 3.4, the project and remote project manager are singletons.   
> This means a client can only connect to one server project at a  
> time.  Will this limitation be removed in 4.0?  One problem that I  
> think may be related to this is that in 3.4 transactions in a  
> particular VM are not fully isolated.  In particular, if I create  
> two sessions in the same application, begin a transaction in model  
> A, and create a new individual w/o committing, model B sees the new  
> individual (but the properties are null until A commits).  If A and  
> B are created in different classloaders/VMs, B does not see the new  
> individual until A commits.
> 4) To what degree will the 4.0 API be compatible with the 3.4 one?   
> In other words, if I code to 3.4 for now, how complex will the  
> migration path to 4.0 be?
> 5) Will the client-server architecture be the same as 3.4? (i.e. DB/ 
> File<-->Server<-(rmi)->Client<-->App?
> 6) When will the above features be released?
> 7) Lastly, was there any load testing done on Protégé 4.0 (or 3.4  
> for that matter?)  I'm curious of what magnitude of data can be  
> reasonably supported in client-server mode.
>
> Sorry for the flood of questions, but our group must decide in the
> coming days if we can use Protégé to model our knowledge, and these
> answers may quell concerns and allow us to go forward.
>
> Again, thanks for all your efforts and I look forward to learning  
> more soon!
>
> Best regards,
>
> --------------------------------------------
> Gino Miceli
> System Development Specialist
> Food and Agriculture Organization of the UN
> Forest Communication Service (FOEL)



