[locrefdev] Re: Proposal for new GO Flat File format (fwd) [NCBI tracking system #15018423]

Chris Mungall cjm at fruitfly.org
Fri May 30 16:50:05 PDT 2003


On Thu, 29 May 2003, Michael Ovetsky wrote:

> Hi All,
>
> After reviewing new GO Flat File format proposal, I have an impression that
> the best format to satisfy the stated design goals (Human readability, Ease
> of parsing, Extensibility, Minimal redundancy) would be XML.
>
> 1) Human readability
> XML is humanly readable and self-explanatory

Personally I find XML a headache to read, and harder to edit directly (OK,
so 99% of the time you want to use a custom viewer or editor, but often it
is useful to work directly with the format).

> 2) Ease of parsing
> There are plenty XML parsers for all possible platforms, so there will be no
> need to write a new low-level parser.

Absolutely - that's why we're providing obo-text <-> obo-xml converters.

A text format also has the advantage of easy parsing with Perl, Unix
command-line tools, etc.
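
To illustrate the kind of lightweight parsing meant here, a minimal Python
sketch follows. It assumes a hypothetical layout of blank-line-separated
records made of "tag: value" lines; the sample data and the tag names (id,
name, is_a) are illustrative assumptions, not the syntax of the attached
proposal.

```python
from collections import defaultdict

# Hypothetical sample; the layout and tag names are illustrative assumptions.
SAMPLE = """\
id: GO:0000001
name: example term one
is_a: GO:0000002

id: GO:0000002
name: example term two
"""


def parse_records(text):
    """Split text into blank-line-separated records of 'tag: value' lines."""
    records = []
    current = defaultdict(list)
    for line in text.splitlines():
        line = line.strip()
        if not line:
            # A blank line closes the current record, if one is open.
            if current:
                records.append(dict(current))
                current = defaultdict(list)
            continue
        # Split on the first colon only, so values may contain colons.
        tag, _, value = line.partition(":")
        current[tag.strip()].append(value.strip())
    if current:
        records.append(dict(current))
    return records


records = parse_records(SAMPLE)
for rec in records:
    print(rec["id"][0], rec["name"][0])
```

Each tag maps to a list of values, so a tag that repeats within a record
needs no special handling.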

We are currently using CVS rather than a database as a back end. This
actually works incredibly well. The obo text format will work extremely
well with CVS, whereas XML may prove more problematic.
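
CVS diffs and merges files line by line, so a format with one tag per line
tends to produce small, readable changes. The sketch below uses Python's
difflib to show this on a hypothetical record; the tag names and the edit
are assumptions made for illustration only.

```python
import difflib

# Hypothetical record before and after an edit; tag names are assumptions.
old = """\
id: GO:0000001
name: example term
""".splitlines(keepends=True)

new = """\
id: GO:0000001
name: example term
comment: added by a curator
""".splitlines(keepends=True)

diff = list(difflib.unified_diff(old, new, fromfile="old", tofile="new"))

# Keep only added/removed content lines, skipping the '---'/'+++' file headers.
changed = [
    line for line in diff
    if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
]

print("".join(diff), end="")
```

Adding one tag shows up as one added line. An XML file can behave the same
way if it is consistently formatted, but a tool that re-indents or reorders
elements on save can make every line appear changed.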

Just because obo-text will be the native format used by DAG Edit, and
because that is the format we are storing in CVS, it does not mean that
everyone has to go out and write parsers for this format. We will provide
obo-xml, as well as the old format and others such as RDF and OWL.

> 3) Extensibility
> New tags can always be added to the document definition

The obo text format is extensible - new tags can easily be added. Granted,
there may be specific syntax in the tag value, but I think it's the best
of both worlds.
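
One way to picture this is a reader that applies a value parser to the tags
it knows and passes unknown tags through untouched. This is only a sketch:
the tag names and the comma-separated value syntax below are hypothetical
and are not taken from the proposal.

```python
# Hypothetical per-tag value parsers; tag names and value syntax are assumptions.
VALUE_PARSERS = {
    "synonym": lambda value: [part.strip() for part in value.split(",")],
}


def read_line(line):
    """Parse one 'tag: value' line, applying tag-specific syntax if known."""
    tag, _, value = line.partition(":")
    tag, value = tag.strip(), value.strip()
    # Unknown tags fall back to the identity parser, so new tags never break
    # an older reader; it simply keeps the raw string.
    parse = VALUE_PARSERS.get(tag, lambda v: v)
    return tag, parse(value)


print(read_line("synonym: foo, bar"))
print(read_line("brand_new_tag: anything goes"))
```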

> 4) Minimal redundancy
> Since XML describes tree-type data structures, it is easy to avoid
> relational database-type redundancy.

Can you expand on that point? In my experience trees can lead to
redundancy, whereas relational databases have a formal theory for avoiding
redundancy.
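
A small sketch of the concern, using a hypothetical four-term graph in
which term D has two parents. Stored relationally, each term and each
parent link is recorded once. Expanded into a pure nested tree, D (and
anything below it) has to be repeated under every parent. The term names
are made up for illustration.

```python
# Hypothetical DAG stored relationally: (child, parent) rows.
# Term D has two parents, B and C.
PARENT_LINKS = [("B", "A"), ("C", "A"), ("D", "B"), ("D", "C")]


def children_of(term):
    """Return the direct children of a term, in a stable order."""
    return sorted(child for child, parent in PARENT_LINKS if parent == term)


def as_tree(term):
    """Expand the graph below `term` into nested dicts, as nested elements would."""
    return {child: as_tree(child) for child in children_of(term)}


def count(tree, name):
    """Count how many times `name` occurs anywhere in the nested tree."""
    return sum((key == name) + count(sub, name) for key, sub in tree.items())


tree = {"A": as_tree("A")}
print(tree)
print("copies of D in the tree:", count(tree, "D"))
```

XML can avoid the duplication by using identifiers and references instead
of nesting, but at that point the document is encoding a graph rather than
relying on its tree structure.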

> Michael Ovetsky

Thanks for the comments. I agree with the overall sentiment that XML is
good for programmers, but I hope I have justified our approach.

cheers,
Chris



> On Fri, 25 Apr 2003, John Richter wrote:
>
> > Hello, everyone. As promised at the last GO meeting, I've come up with a
> > proposal for the new GO Flat File format (based largely on the format of
> > the GO.defs file), tentatively titled GO-EFF (Gene Ontology Extensible
> > Flat File). The proposal is attached to this email as a text file.
> >
> > Please consider this proposal carefully, and email the list with any
> > concerns you have. I have tried to come up with a design that is
> > familiar, yet usable, extensible, and compact.
> >
> > A warning: This proposal suggests discarding the current method of using
> > indentation to indicate parentage, and adopting a more record-based
> > approach. If this is intolerable to our community, I have developed an
> > alternate proposal that preserves the current document structure.
> > However, I feel that as the GO grows more expressive, the
> > indentation-based method will grow harder and harder to parse, and harder
> > and harder for humans to read. I hope this proposal will address that
> > problem, and remain palatable to users who have grown accustomed to the
> > format that has served us well for so long.
> >
> > 	-John
>
> ---------------------------------------------------------------------------
> Sue Rhee                         	rhee at acoma.stanford.edu
> The Arabidopsis Information Resource	URL: www.arabidopsis.org
> Carnegie Institution of Washington	FAX: +1-650-325-6857
> Department of Plant Biology		Tel: +1-650-325-1521 ext. 251
> 260 Panama St.
> Stanford, CA 94305
> U.S.A.
> ---------------------------------------------------------------------------
>
>
> --
> This message is from the GOFriends moderated mailing list.  A list of public
> announcements and discussion of the Gene Ontology (GO) project.
> Problems with the list?           E-mail: owner-gofriends at geneontology.org
> Subscribing   send   "subscribe"   to   gofriends-request at geneontology.org
> Unsubscribing send   "unsubscribe"  to  gofriends-request at geneontology.org
> Web:          http://www.geneontology.org/
>
>
>

