use of Uniprot accession Vs GenBank Accession in With column

Gavin Sherlock sherlock at genome.Stanford.EDU
Wed Jan 31 09:03:40 PST 2007

Interesting dilemma.

Clearly the result is based on a protein, but what if there are two
genes, A and B, whose DNA sequences may differ, but whose protein
products have identical sequences?  What if one of them is expressed
under some circumstances, allowing its product to interact with
protein X, and the other is never expressed under circumstances that
would allow its product to interact with protein X?  In this case,
when annotating protein X, knowing the gene whose product it  
interacts with would be important.  Of course, I have no examples of  
this, and no reason to expect that they might exist, but it is a  
formal possibility, and there are certainly examples in the  
literature where synonymous changes can affect function.
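
To make the scenario concrete, here is a minimal sketch using the Q9SW34 gene name lines quoted further down in this thread. It only shows that a single UniProtKB accession can resolve to two loci, so the accession alone does not say which gene was meant. The helper name parse_gn_lines and the simplified line handling are assumptions for illustration, not part of any UniProt tool.

```python
# The three GN lines are copied from the Q9SW34 (S61G1_ARATH) record quoted in this thread.
GN_LINES = """\
GN   Name=SEC61G1; OrderedLocusNames=At4g24920; ORFNames=F13M23.60;
GN   and
GN   Name=SEC61G2; OrderedLocusNames=At5g50460; ORFNames=MBA10.1;
"""


def parse_gn_lines(text):
    """Hypothetical helper: split GN lines into one dict per gene.

    Assumes each gene is described on a single GN line and that genes
    are separated by a line whose content is just 'and'.
    """
    genes = []
    current = {}
    for line in text.splitlines():
        if not line.startswith("GN"):
            continue
        body = line[2:].strip()
        if body == "and":
            # A bare 'and' closes the description of the current gene.
            if current:
                genes.append(current)
            current = {}
            continue
        for token in body.split(";"):
            key, sep, value = token.partition("=")
            if sep:
                current[key.strip()] = value.strip()
    if current:
        genes.append(current)
    return genes


GENES = parse_gn_lines(GN_LINES)
LOCI = [gene["OrderedLocusNames"] for gene in GENES]
print("Q9SW34 maps to", len(GENES), "genes:", LOCI)
```

If the distinction between the two genes matters for an annotation, the locus names recovered here are what a protein accession by itself cannot convey.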


On Jan 31, 2007, at 8:34 AM, Emily Dimmer wrote:

> Hi,
> Yes this is true, there is only one UniProtKB record when two  
> proteins are from the same species and 100% identical.
> I thought this discussion was started about what type of accessions
> should be used in the 'with' column for IPI-evidenced
> annotations ... if the proteins are identical and the experiment,
> e.g. a protein binding assay, is done with the protein, how much
> of a problem is this? Surely it's more correct and more meaningful to
> the user to use a protein identifier.
> For an example of two genes encoding the same protein sequence see
> Q9SW34 (S61G1_ARATH).
> You can see the two gene name lines here:
> GN   Name=SEC61G1; OrderedLocusNames=At4g24920; ORFNames=F13M23.60;
> GN   and
> GN   Name=SEC61G2; OrderedLocusNames=At5g50460; ORFNames=MBA10.1;
> Emily
> Doug Howe wrote:
>> I seem to recall that identical proteins generated from distinct  
>> genes are represented by a single UniProt record.  If that is  
>> still true, isn't that a case where an EMBL accession would be  
>> better in the with field?
>> -Doug
>> On Wed, 31 Jan 2007, Valerie Wood wrote:
>>> Emily Dimmer wrote:
>>>> Hi,
>>>> Just a quick note, GenBank Accessions are exactly the same as  
>>>> EMBL accessions. All EMBL accessions are cross-referenced in  
>>>> UniProt. Therefore if you *did* want to find a UniProtKB  
>>>> accession, you should be able just to enter the  GenBank  
>>>> accession into the UniProt website (or search via SRS etc...)  
>>>> and it will bring up the equivalent UniProt entry (I do realize
>>>> that for some groups there is an issue of a UniProtKB accession  
>>>> not yet existing for an equivalent GenBank accession).
>>> In the cases where there is no UniProt ID, it may be a problem to
>>> refer to the GenBank/EMBL accession number, as this will often be
>>> a cosmid or contig and contain multiple CDSs. In these cases you
>>> can't refer to the gene/protein uniquely with an EMBL ID.
>>> Presumably though, for the cases where there is no Swiss-Prot/
>>> TrEMBL ID, the likelihood that you would be using this as a
>>> dbxref in the with column for an ISS is very small (I have never
>>> come across one). Can't we all agree to track down the UniProt ID
>>> (which is relatively straightforward), or in cases where there
>>> isn't one, contact UniProt to work out why?
>>> Val
>>>> Cheers,
>>>> Emily
>>>> Midori Harris wrote:
>>>>> Actually, GB or GenBank would also be acceptable, because
>>>>> they're listed as synonyms in GO.xrf_abbs (the filtering
>>>>> script allows anything in the 'abbreviation' or 'synonym'
>>>>> fields).
>>>>> m
>>>>> On Tue, 30 Jan 2007, Karen Christie wrote:
>>>>>> Note that the abbreviation selected by GO for the IDs for  
>>>>>> GenBank, DDBJ, and EMBL is EMBL, so that's the namespace that  
>>>>>> needs to be used in the gene_association files for GO.
>>>>>> -Karen
>>>>>> On Tue, 30 Jan 2007, Pankaj Jaiswal wrote:
>>>>>>> Got it. We will use the GB one.
>>>>>>> BTW GenBank ID is different than the GenBank Accession.  
>>>>>>> GenBank ID is the ID exclusive for the GenBank database  
>>>>>>> entry. One GB accession can have mappings to several GenBank  
>>>>>>> IDs.
>>>>>>> Pankaj
>>>>>>> Karen Christie wrote:
>>>>>>>> Hi Pankaj,
>>>>>>>> GenBank IDs are already allowed in the with column. The main  
>>>>>>>> requirement is that the abbreviation (or namespace) for the  
>>>>>>>> source of the ID be included in the GO.xrf_abbs file. There  
>>>>>>>> is already an entry for IDs coming from GenBank/DDBJ/EMBL,  
>>>>>>>> so these IDs are already permissible.
>>>>>>>> -Karen
>>>>>>>> abbreviation: EMBL
>>>>>>>> database: International Nucleotide Sequence Database  
>>>>>>>> Collaboration, comprising EMBL-EBI International Nucleotide  
>>>>>>>> Sequence Data Library (EMBL-Bank), DNA DataBank of Japan  
>>>>>>>> (DDBJ), and NCBI GenBank
>>>>>>>> object: Sequence accession number
>>>>>>>> example_id: EMBL:AA816246
>>>>>>>> example_id: DDBJ:AA816246
>>>>>>>> example_id: GB:AA816246
>>>>>>>> synonym: DDBJ
>>>>>>>> synonym: GB
>>>>>>>> synonym: GenBank
>>>>>>>> generic_url:
>>>>>>>> generic_url:
>>>>>>>> generic_url:
>>>>>>>> On Tue, 30 Jan 2007, Pankaj Jaiswal wrote:
>>>>>>>>> Hi Everyone,
>>>>>>>>> I know it is an accepted SOP to include either the UniProt
>>>>>>>>> accession number or the individual database's own gene/
>>>>>>>>> protein ID in the WITH column of the association tables.
>>>>>>>>> However, in practice it is often too much work to find the
>>>>>>>>> UniProt entry, because DDBJ and GenBank often do not
>>>>>>>>> cross-reference each other using the UniProt accession. The
>>>>>>>>> best alternative is to use the GenBank accession number,
>>>>>>>>> which almost all the databases, including UniProt, DDBJ,
>>>>>>>>> EMBL, PIR etc., use to cross-reference. It is also the most
>>>>>>>>> suitable ID for finding the particular nucleotide/protein
>>>>>>>>> accession we are looking for with the same query, no matter
>>>>>>>>> which db is queried.
>>>>>>>>> I hope you will consider my request to adopt the GenBank
>>>>>>>>> accession number, unless there is a better option.
>>>>>>>>> Thanks
>>>>>>>>> Pankaj
>>>>>>>>> -- 
>>>>>>>>> Pankaj Jaiswal
>>>>>>>>> G-15, Bradfield Hall
>>>>>>>>> Dept. of Plant Breeding and Genetics
>>>>>>>>> Cornell University
>>>>>>>>> Ithaca, NY-14853, USA
>>>>>>>>> Ph. +1-607-255-3103 / 4199
>>>>>>>>> fax: +1-607-255-6683
>>> --
>>> Valerie Wood
>>> S. pombe Genome Project
>>> Wellcome Trust Sanger Institute
>>> Wellcome Trust Genome Campus
>>> Hinxton, Cambridge, CB10 1HH
>>> Tel: 01223 496909
>>> Fax: 01223 494919
>>> email: val at
> -- 
> ************************************
>    Emily Dimmer
>    GOA and IntAct Database Curator
>    Wellcome Trust Genome Campus
>    Hinxton
>    Cambridge CB10 1SD, U.K.
>    Tel:     +44 1223 494654
>    Fax:    +44 1223 494468
>    email:  edimmer at
> ************************************
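
The rule described in the thread (a prefix is acceptable if it appears in the 'abbreviation' or 'synonym' fields of GO.xrf_abbs) can be sketched roughly as follows. This is not the actual GO filtering script. The stanza text is an excerpt of the EMBL entry quoted above, and the function names, the case-sensitive matching, and the assumption that multiple IDs in a With value are separated by '|' are illustrative choices.

```python
# Excerpt of the EMBL stanza from GO.xrf_abbs as quoted in this thread.
STANZA = """\
abbreviation: EMBL
object: Sequence accession number
example_id: EMBL:AA816246
synonym: DDBJ
synonym: GB
synonym: GenBank
"""


def allowed_prefixes(stanza_text):
    """Collect every value of the 'abbreviation' and 'synonym' fields."""
    prefixes = set()
    for line in stanza_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() in ("abbreviation", "synonym"):
            prefixes.add(value.strip())
    return prefixes


def with_value_is_allowed(with_value, prefixes):
    """Check the database prefix of each ID in a With value.

    Assumption: IDs look like 'PREFIX:accession' and several IDs are
    separated by '|'. Matching is exact and case-sensitive here.
    """
    ids = [part for part in with_value.split("|") if part]
    return bool(ids) and all(
        ":" in part and part.split(":", 1)[0] in prefixes for part in ids
    )


PREFIXES = allowed_prefixes(STANZA)
print(sorted(PREFIXES))
print(with_value_is_allowed("GB:AA816246", PREFIXES))
print(with_value_is_allowed("XYZ:12345", PREFIXES))
```

Under these assumptions EMBL, DDBJ, GB and GenBank are all accepted as prefixes, which matches the point made in the thread that GB and GenBank pass the filter even though EMBL is the listed abbreviation.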

More information about the go-discuss mailing list