
use of Uniprot accession Vs GenBank Accession in With column

Emily Dimmer edimmer at
Wed Jan 31 08:34:33 PST 2007


Yes, this is true: there is only one UniProtKB record when two proteins 
are from the same species and are 100% identical.
I thought this discussion started over what type of accession should be 
used in the 'with' column for IPI-evidenced annotations. If the 
proteins are identical and the experiment (e.g. a protein binding assay) 
is done with the protein, how much of a problem is this? Surely it's more 
correct, and more meaningful to the user, to use a protein identifier.

For an example of two genes encoding the same protein sequence, see 
Q9SW34 (S61G1_ARATH).
You can see the two gene name lines here:

GN   Name=SEC61G1; OrderedLocusNames=At4g24920; ORFNames=F13M23.60;
GN   and
GN   Name=SEC61G2; OrderedLocusNames=At5g50460; ORFNames=MBA10.1;
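
To make the point concrete, here is a minimal Python sketch that reads the GN lines above and shows that the single record Q9SW34 describes two distinct genes. The function name parse_gn_lines is hypothetical, and the parsing assumes the simple "Key=Value;" layout seen in this entry; it does not handle continuation lines or multi-valued fields that a full UniProtKB flat file may contain.

```python
import re

# GN lines copied from the UniProtKB entry Q9SW34 quoted above.
GN_LINES = """\
GN   Name=SEC61G1; OrderedLocusNames=At4g24920; ORFNames=F13M23.60;
GN   and
GN   Name=SEC61G2; OrderedLocusNames=At5g50460; ORFNames=MBA10.1;
"""


def parse_gn_lines(text):
    """Return one dict per gene described in a block of GN lines.

    Hypothetical helper: a bare "and" line is treated as the separator
    between two genes, as in the entry above.
    """
    genes = [{}]
    for line in text.splitlines():
        if not line.startswith("GN"):
            continue
        content = line[2:].strip()
        if content == "and":
            genes.append({})
            continue
        # Collect every "Key=Value;" pair on the line.
        for key, value in re.findall(r"(\w+)=([^;]+);", content):
            genes[-1][key] = value.strip()
    return [gene for gene in genes if gene]


genes = parse_gn_lines(GN_LINES)
for gene in genes:
    print(gene["Name"], gene["OrderedLocusNames"])
```

With the lines above this prints one row per gene (SEC61G1 and SEC61G2), so the UniProtKB accession alone identifies the protein but not which of the two loci encodes it.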


Doug Howe wrote:

> I seem to recall that identical proteins generated from distinct genes 
> are represented by a single UniProt record.  If that is still true, 
> isn't that a case where an EMBL accession would be better in the with 
> field?
> -Doug
> On Wed, 31 Jan 2007, Valerie Wood wrote:
>> Emily Dimmer wrote:
>>> Hi,
>>> Just a quick note, GenBank Accessions are exactly the same as EMBL 
>>> accessions. All EMBL accessions are cross-referenced in UniProt. 
>>> Therefore if you *did* want to find a UniProtKB accession, you 
>>> should be able simply to enter the GenBank accession into the UniProt 
>>> website (or search via SRS etc.) and it will bring up the 
>>> equivalent UniProt entry (I do realize that for some groups there is 
>>> an issue of a UniProtKB accession not yet existing for an equivalent 
>>> GenBank accession).
>> In the cases where there is no UniProt ID, it may be a problem to 
>> refer to the GenBank/EMBL accession number, as this will often be a 
>> cosmid or contig and contain multiple CDSs. In these cases you can't 
>> refer to the gene/protein uniquely with an EMBL ID.
>> Presumably though, for the cases where there is no Swiss-Prot/TrEMBL 
>> ID, the likelihood that you would be using this as a dbxref in 
>> the with column for an ISS is very small (I have never come across 
>> one). Can't we all agree to track down the UniProt ID (which is 
>> relatively straightforward) or, in cases where there isn't one, 
>> contact UniProt to work out why?
>> Val
>>> Cheers,
>>> Emily
>>> Midori Harris wrote:
>>>> Actually, GB or GenBank would also be acceptable, because they're 
>>>> listed as synonyms in GO.xrf_abbs (the filtering script allows 
>>>> anything in the 'abbreviation' or 'synonym' fields).
>>>> m
>>>> On Tue, 30 Jan 2007, Karen Christie wrote:
>>>>> Note that the abbreviation selected by GO for the IDs for GenBank, 
>>>>> DDBJ, and EMBL is EMBL, so that's the namespace that needs to be 
>>>>> used in the gene_association files for GO.
>>>>> -Karen
>>>>> On Tue, 30 Jan 2007, Pankaj Jaiswal wrote:
>>>>>> Got it. We will use the GB one.
>>>>>> BTW the GenBank ID is different from the GenBank accession. The 
>>>>>> GenBank ID is the ID exclusive to the GenBank database entry. One 
>>>>>> GB accession can have mappings to several GenBank IDs.
>>>>>> Pankaj
>>>>>> Karen Christie wrote:
>>>>>>> Hi Pankaj,
>>>>>>> GenBank IDs are already allowed in the with column. The main 
>>>>>>> requirement is that the abbreviation (or namespace) for the 
>>>>>>> source of the ID be included in the GO.xrf_abbs file. There is 
>>>>>>> already an entry for IDs coming from GenBank/DDBJ/EMBL, so these 
>>>>>>> IDs are already permissible.
>>>>>>> -Karen
>>>>>>> abbreviation: EMBL
>>>>>>> database: International Nucleotide Sequence Database 
>>>>>>> Collaboration, comprising EMBL-EBI International Nucleotide 
>>>>>>> Sequence Data Library (EMBL-Bank), DNA DataBank of Japan (DDBJ), 
>>>>>>> and NCBI GenBank
>>>>>>> object: Sequence accession number
>>>>>>> example_id: EMBL:AA816246
>>>>>>> example_id: DDBJ:AA816246
>>>>>>> example_id: GB:AA816246
>>>>>>> synonym: DDBJ
>>>>>>> synonym: GB
>>>>>>> synonym: GenBank
>>>>>>> generic_url:
>>>>>>> generic_url:
>>>>>>> generic_url:
>>>>>>> On Tue, 30 Jan 2007, Pankaj Jaiswal wrote:
>>>>>>>> Hi Everyone,
>>>>>>>> I know it is accepted SOP to include either the UniProt 
>>>>>>>> accession number or the individual database's own gene/protein 
>>>>>>>> ID in the WITH column of the association tables.
>>>>>>>> In practice, however, finding the UniProt entry is a lot of 
>>>>>>>> work, because DDBJ and GenBank often do not cross-reference 
>>>>>>>> each other using the UniProt accession. The best alternative 
>>>>>>>> seems to be the GenBank accession number, which almost all the 
>>>>>>>> databases, including UniProt, DDBJ, EMBL, PIR etc., use for 
>>>>>>>> cross-references. It is also the most suitable ID for finding 
>>>>>>>> the particular nucleotide/protein accession we are looking for 
>>>>>>>> with the same query, no matter which db is queried.
>>>>>>>> I hope you will consider my request to adopt the GenBank 
>>>>>>>> accession number, unless there is a better option.
>>>>>>>> Thanks
>>>>>>>> Pankaj
>>>>>>>> -- 
>>>>>>>> Pankaj Jaiswal
>>>>>>>> G-15, Bradfield Hall
>>>>>>>> Dept. of Plant Breeding and Genetics
>>>>>>>> Cornell University
>>>>>>>> Ithaca, NY-14853, USA
>>>>>>>> Ph. +1-607-255-3103 / 4199
>>>>>>>> fax: +1-607-255-6683
>> -- 
>> Valerie Wood
>> S. pombe Genome Project
>> Wellcome Trust Sanger Institute
>> Wellcome Trust Genome Campus
>> Hinxton, Cambridge, CB10 1HH
>> Tel: 01223 496909
>> Fax: 01223 494919
>> email: val at

    Emily Dimmer
    GOA and IntAct Database Curator
    Wellcome Trust Genome Campus
    Cambridge CB10 1SD, U.K.
    Tel:     +44 1223 494654
    Fax:    +44 1223 494468
    email:  edimmer at
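
The prefix rule described earlier in the thread (the filtering script accepts any value found in the 'abbreviation' or 'synonym' fields of GO.xrf_abbs) can be sketched roughly as below. This is an illustration only and not the actual GO filtering script: the function names allowed_prefixes and with_value_is_allowed are hypothetical, the stanza is the one quoted in the thread with its empty generic_url lines left out, and exact, case-sensitive prefix matching is an assumption.

```python
# Stanza copied from the GO.xrf_abbs entry quoted in this thread
# (the empty generic_url lines are omitted).
XRF_STANZA = """\
abbreviation: EMBL
object: Sequence accession number
example_id: EMBL:AA816246
example_id: DDBJ:AA816246
example_id: GB:AA816246
synonym: DDBJ
synonym: GB
synonym: GenBank
"""


def allowed_prefixes(stanza_text):
    """Collect the 'abbreviation' and 'synonym' values of one stanza."""
    prefixes = set()
    for line in stanza_text.splitlines():
        tag, sep, value = line.partition(":")
        if sep and tag.strip() in ("abbreviation", "synonym"):
            prefixes.add(value.strip())
    return prefixes


def with_value_is_allowed(with_value, prefixes):
    """Check the database prefix of a 'with' value such as 'GB:AA816246'.

    Assumption: the prefix must match an allowed value exactly
    (case-sensitive) and must be followed by a non-empty accession.
    """
    prefix, sep, accession = with_value.partition(":")
    return bool(sep) and bool(accession) and prefix in prefixes


PREFIXES = allowed_prefixes(XRF_STANZA)
for value in ("EMBL:AA816246", "GB:AA816246", "AA816246"):
    print(value, with_value_is_allowed(value, PREFIXES))
```

Under these assumptions EMBL:, DDBJ:, GB: and GenBank: prefixed values all pass, while a bare accession with no database prefix is rejected.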

More information about the go-discuss mailing list