[go-friends] UniProt-GOA release [2016-05-11]

Tony Sawford tonys at ebi.ac.uk
Wed May 11 03:09:21 PDT 2016


UniProt-GOA release: 11 May 2016
================================

UniProt-GOA (UniProt GO Annotation) is a project run by the UniProt
group that provides assignments of gene products to the Gene Ontology
(GO) resource.

The data can be obtained via:

           EBI FTP: ftp://ftp.ebi.ac.uk/pub/databases/GO/goa/

           GO FTP: ftp://ftp.geneontology.org/pub/go/gene-associations/

           GO SVN: http://www.geneontology.org/GO.svn.help.shtml

For further information read: http://www.ebi.ac.uk/GOA or contact
goa at ebi.ac.uk.

Latest News
===========

Changes to GOA annotation files
===============================

Following discussions with the GO Consortium, we shall be making some 
changes to the set of annotation files that we publish. These changes 
will be implemented at the next GOA release, which is scheduled for the 
week of 6th June.

i) Changes to species-specific annotation sets

Note that in the following text, <species> is one of: human, chicken, 
cow, pig, dog, mouse, rat, arabidopsis, zebrafish, worm, yeast, fly, or 
dicty.

For each species, we currently publish two sets of annotations, in both 
GAF 2.1 and GPAD 1.1 format:

- gene_association.goa_<species> and gp_association.goa_<species> - 
these files contain annotations to proteins that are part of the UniProt 
complete proteome for the species *plus* isoforms; they also contain 
annotations to complexes and RNAs

- gene_association.goa_ref_<species> and 
gp_association.goa_ref_<species> - these files contain annotations to 
proteins that are part of the UniProt reference proteome for the species 
(the “Gene Centric Reference Proteome”, or GCRP)

With effect from the next GOA release, we shall cease production of the 
above files, and replace them with the following four annotation sets 
per species, which we will provide in both GAF and GPAD format, the 
format being indicated by the file suffix:

- goa_<species>.[gaf|gpa] - annotations to canonical accessions from the 
UniProt reference proteome
- goa_<species>_isoform.[gaf|gpa] - annotations to isoforms from the 
UniProt reference proteome
- goa_<species>_complex.[gaf|gpa] - annotations to complexes
- goa_<species>_rna.[gaf|gpa] - annotations to RNAs

Note that annotations to proteins that are not part of the UniProt 
reference proteome for a species will still be available in the full 
goa_uniprot annotation set (goa_uniprot_all.[gaf|gpa]).
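
As a rough illustration of the new naming scheme, the sketch below builds 
the eight annotation file names for one species. The function and constant 
names are invented for this example, and the names are built only from the 
pattern described above; check the FTP site for the files actually 
published (they may, for instance, carry a compression suffix).

```python
# Species keys as listed in this announcement.
SPECIES = (
    "human", "chicken", "cow", "pig", "dog", "mouse", "rat",
    "arabidopsis", "zebrafish", "worm", "yeast", "fly", "dicty",
)

# Suffix per annotation set: canonical accessions, isoforms, complexes, RNAs.
CATEGORY_SUFFIXES = ("", "_isoform", "_complex", "_rna")

# File extension indicates the format: GAF or GPAD.
FORMATS = ("gaf", "gpa")


def new_annotation_files(species):
    """Return the new-style annotation file names for one species (hypothetical helper)."""
    if species not in SPECIES:
        raise ValueError(f"unknown species key: {species!r}")
    return [
        f"goa_{species}{suffix}.{ext}"
        for suffix in CATEGORY_SUFFIXES
        for ext in FORMATS
    ]
```

For example, new_annotation_files("mouse") starts with goa_mouse.gaf and 
goa_mouse.gpa and ends with goa_mouse_rna.gaf and goa_mouse_rna.gpa.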

ii) Changes to species-specific metadata files

For each species, we also currently publish two metadata files, in GPI 
1.1 format, which contain information (name, symbol, synonyms, etc.) 
about both annotated and unannotated gene products:

- gp_information.goa_<species> - contains metadata for all proteins that 
are part of the UniProt complete proteome for the species, plus metadata 
for complexes and RNAs from that species

- gp_information.goa_ref_<species> - contains metadata only for those 
proteins that are part of the UniProt reference proteome for the species

With effect from the next GOA release, we shall cease production of the 
above two files for each species, and replace them with the following:

- goa_<species>.gpi - metadata for canonical accessions from the UniProt 
reference proteome (GCRP) for the species
- goa_<species>_isoform.gpi - metadata for isoforms from the UniProt GCRP
- goa_<species>_complex.gpi - metadata for complexes
- goa_<species>_rna.gpi - metadata for RNAs

Note that the files will contain entries for all gene products for that 
particular category, whether they are annotated or not.

iii) Changes to other annotation sets

For consistency with the new naming convention used for the 
species-specific files, we shall be changing the names of the following 
files; their content, however, will remain unchanged:

- gene_association.goa_uniprot     will be renamed to goa_uniprot_all.gaf
- gp_association.goa_uniprot       will be renamed to goa_uniprot_all.gpa
- gene_association.goa_ref_uniprot will be renamed to goa_uniprot_gcrp.gaf
- gp_association.goa_ref_uniprot   will be renamed to goa_uniprot_gcrp.gpa
- gp_information.goa_uniprot       will be renamed to goa_uniprot_all.gpi
- gp_information.goa_ref_uniprot   will be renamed to goa_uniprot_gcrp.gpi
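
Scripts that fetch these files by name could translate old names with a 
lookup table such as the one sketched below. The mapping is copied from the 
list above; the helper name translate_name is made up for this example, and 
passing unknown names through unchanged is a design choice of the sketch, 
not something prescribed by GOA.

```python
# Old file name -> new file name, as listed in this announcement.
RENAMES = {
    "gene_association.goa_uniprot": "goa_uniprot_all.gaf",
    "gp_association.goa_uniprot": "goa_uniprot_all.gpa",
    "gene_association.goa_ref_uniprot": "goa_uniprot_gcrp.gaf",
    "gp_association.goa_ref_uniprot": "goa_uniprot_gcrp.gpa",
    "gp_information.goa_uniprot": "goa_uniprot_all.gpi",
    "gp_information.goa_ref_uniprot": "goa_uniprot_gcrp.gpi",
}


def translate_name(old_name):
    """Return the new name for a renamed file, or the input if it is not in the table."""
    return RENAMES.get(old_name, old_name)
```

Note that the species-specific files are not covered by this table, since 
they are being replaced by differently scoped sets, not simply renamed.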


Examples of the new species-specific file sets, for human and mouse, are 
available at ftp://ftp.ebi.ac.uk/pub/contrib/goa/new-files/

If you have any comments or questions about these changes, please email 
us at goa at ebi.ac.uk.

Changes to annotations created by logical inference
===================================================

With effect from this release, we have made a change to the format of 
annotations that are created by logical inference based on 
inter-ontology links between molecular function and biological process 
terms, or between biological process and cellular component terms. These 
annotations can be identified easily, as they have "GOC" in the 
assigned_by column of the GAF and GPAD files.

In previous releases, these logically inferred annotations retained the 
reference, with/from, and evidence code of the annotation to the 
asserted GO term from which they were derived. Now, however, inferred 
annotations will have GO_REF:0000108 
(http://www.geneontology.org/cgi-bin/references.cgi#GO_REF:0000108) as 
their reference, and one of the two evidence codes ECO:0000364 (evidence 
based on logical inference from manual annotation used in automatic 
assertion) or ECO:0000366 (evidence based on logical inference from 
automatic annotation used in automatic assertion), depending on whether 
the annotation to the asserted term is manual or electronic, 
respectively; the with/from column is also populated with the identifier 
of the GO term in the original asserted annotation.
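
A minimal sketch of how these inferred annotations might be picked out of a 
GPAD file follows. It assumes the GPAD 1.1 column order, with the GO term 
in column 4, the reference in column 5, the evidence code in column 6, 
with/from in column 7, and assigned_by in column 10; please verify this 
against the format specification before relying on it. The function name 
and the keys of the returned dictionaries are invented for the example.

```python
# The two evidence codes described above, mapped to the kind of
# asserted annotation from which the inference was made.
INFERRED_ECO = {
    "ECO:0000364": "manual",
    "ECO:0000366": "electronic",
}


def inferred_annotations(lines):
    """Yield one dict per logically inferred (assigned_by == "GOC") GPAD line."""
    for line in lines:
        # Skip header/comment lines and blank lines.
        if line.startswith("!") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        # Assumed GPAD 1.1 layout: assigned_by is the 10th column.
        if len(cols) < 10 or cols[9] != "GOC":
            continue
        yield {
            "object": f"{cols[0]}:{cols[1]}",
            "go_id": cols[3],
            "reference": cols[4],
            "evidence": cols[5],
            # None if the evidence code is not one of the two listed above.
            "asserted_kind": INFERRED_ECO.get(cols[5]),
            "asserted_term": cols[6],
        }
```

The function accepts any iterable of text lines, so an open file handle can 
be passed to it directly.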

Again, if you have any comments or questions about these changes, please 
email us at goa at ebi.ac.uk.


Regards,
UniProt-GOA Production

