[Gofriends] Announcement: Changes to the GO Consortium gene association file format
edimmer at ebi.ac.uk
Tue Mar 2 08:05:03 PST 2010
On the 1st of June 2010, the format of GO annotation files available from
the GO Consortium web and ftp site will change, with files being supplied
in a 17-column GAF format (GAF2.0) instead of the current 15 columns.
The format of the new annotation file is described in the GO web page below:
In essence, this format change means that two columns will be added to the
end of the current tab-delimited file format:
Column 16 (Annotation Extension). This column will provide additional
cross references to other ontologies that can be used to qualify or
enhance a GO annotation. Cross-references will be prefaced by an
appropriate GO relationship and references to multiple ontologies will be
provided. Targets of certain processes or functions can also be included
in this field to indicate the gene, gene product, or chemical involved.
The final structure of data in this field is currently being discussed. As
the Consortium agrees on the presentation of certain types of data in
column 16, details will be announced to the GO Friends email list. It
will be optional for GO Consortium annotation groups to supply data in
this column.
Column 17 (Gene Product Form ID). This column will specifically identify
which gene product is being annotated. For example, it may include protein
sequence identifiers that specify distinct protein variants produced by
differential splicing, alternative translational starts,
post-translational cleavage or post-translational modification.
Identifiers for functional RNAs can also be included in this column. The
addition of this new column means that the DB_Object_ID (column 2) will
always reference a gene or protein identifier that has a 1-1
correspondence to a gene. It will be optional for GO Consortium
annotation groups to supply data in this column.
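As a rough illustration of how a consumer might use column 17, the sketch
below returns the Gene Product Form ID when one is supplied and otherwise
falls back to the identifier built from columns 1 and 2. The function name
annotated_entity and the example identifiers are hypothetical, and the
"DB:DB_Object_ID" fallback is an assumption about how a user may wish to
label the entry, not something the format requires.

```python
def annotated_entity(columns):
    """Return the most specific identifier available for one annotation row.

    `columns` is the list of tab-separated fields of a single row
    (15 fields for GAF1.0, 17 for GAF2.0).
    """
    # Column 17 (index 16) is the optional Gene Product Form ID.
    form_id = columns[16].strip() if len(columns) > 16 else ""
    if form_id:
        return form_id
    # Assumed fallback: column 1 (DB) joined to column 2 (DB_Object_ID).
    return f"{columns[0]}:{columns[1]}"


# Hypothetical rows: only the fields used above are filled in.
gaf1_row = ["UniProtKB", "P12345"] + [""] * 13
gaf2_row = gaf1_row + ["", "UniProtKB:P12345-2"]
print(annotated_entity(gaf1_row))
print(annotated_entity(gaf2_row))
```

Because the column is optional, a row with an empty column 17 is handled
in the same way as a 15-column row.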
It will not be mandatory for users to use the information supplied in
columns 16 and 17; this data will enhance the annotation descriptiveness,
but will not radically modify the interpretation of the annotation
information supplied in columns 1 to 15.
Before the 1st of June switchover date, the GO Consortium's gene-associations
ftp directory (ftp://ftp.geneontology.org/pub/go/gene-associations/) will
only supply users with GAF1.0 formatted files. On the 1st of June, the
gene-associations directory will switch to supplying GAF2.0 files. Please
ensure that your gene association parsers are updated to accept this
format change by this date.
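For parser maintainers, one possible way to accept both formats during the
transition is sketched below. It is a minimal example rather than a
reference implementation: the function name parse_gaf_line and the
constants are hypothetical, and it assumes that lines beginning with "!"
are comments and that 15-column rows can be padded with two empty fields
so downstream code always sees 17 columns.

```python
GAF1_COLUMNS = 15  # current format
GAF2_COLUMNS = 17  # adds Annotation Extension and Gene Product Form ID


def parse_gaf_line(line):
    """Split one line of an annotation file into exactly 17 fields.

    Returns None for blank lines and for comment lines (assumed to
    start with "!"). Raises ValueError for any other column count.
    """
    line = line.rstrip("\r\n")  # strip line endings only, keep trailing tabs
    if not line or line.startswith("!"):
        return None
    fields = line.split("\t")
    if len(fields) not in (GAF1_COLUMNS, GAF2_COLUMNS):
        raise ValueError(f"expected 15 or 17 columns, found {len(fields)}")
    # Pad GAF1.0 rows so columns 16 and 17 are present but empty.
    fields += [""] * (GAF2_COLUMNS - len(fields))
    return fields


# Hypothetical 15-column row with placeholder values c1..c15.
old_style = "\t".join(f"c{i}" for i in range(1, 16)) + "\n"
record = parse_gaf_line(old_style)
print(len(record), repr(record[15]), repr(record[16]))
```

Padding to a fixed width means code that reads only columns 1 to 15 keeps
working unchanged, while code that wants the new columns can test whether
they are empty.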
GO Consortium managers