use of Uniprot accession Vs GenBank Accession in With column

Michael Ashburner (Genetics) ma11 at
Wed Jan 31 06:47:57 PST 2007


PID IS NOT the same as the GB/EMBL/DDBJ accession number.
You are correct in that the PID is this:
> /protein_id="AAT37941.1"
but the corresponding GB/EMBL/DDBJ accession number is, in the case you cite

Any single ACCESSION can have 0, 1 or >1 PID. The last is especially
true for genomic sequences.

As has been pointed out to you, if you look up an ACCESSION number
at the EBI you will see the UniProt xlink. So if you go to and paste AY607689 into the top page query box
you will get this record, which includes these data:

FT                   /codon_start=1
FT                   /product="low temperature-induced low molecular weight
FT                   integral membrane protein LTI6a"
FT                   /note="OsLti6a"
FT                   /db_xref="GOA:Q8H5T6"
FT                   /db_xref="InterPro:IPR000612"
FT                   /db_xref="UniProtKB/Swiss-Prot:Q8H5T6"
FT                   /protein_id="AAT37941.1"
FT                   GIIYAVWVITK"
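
To make the point concrete, here is a minimal sketch of how a script could read the PID and the UniProt cross-reference out of feature-table lines like the ones above. It is an illustration only: the names RECORD, QUALIFIER, parse_qualifiers and pid_and_uniprot are hypothetical, the input is just the excerpt quoted in this message, and it assumes the text holds a single CDS feature. Since one ACCESSION can carry more than one PID, a real script would have to group qualifiers per feature.

```python
import re

# Feature-table lines copied from the excerpt above (single CDS assumed).
RECORD = '''\
FT                   /codon_start=1
FT                   /product="low temperature-induced low molecular weight
FT                   integral membrane protein LTI6a"
FT                   /note="OsLti6a"
FT                   /db_xref="GOA:Q8H5T6"
FT                   /db_xref="InterPro:IPR000612"
FT                   /db_xref="UniProtKB/Swiss-Prot:Q8H5T6"
FT                   /protein_id="AAT37941.1"
'''

# Matches a line that starts a qualifier, e.g. /db_xref="GOA:Q8H5T6".
# Values that wrap onto a following line are kept only up to the line break.
QUALIFIER = re.compile(r'^FT\s+/(\w+)="?([^"\n]*)"?\s*$')


def parse_qualifiers(text):
    """Return {qualifier name: [values]} for every qualifier line in text."""
    found = {}
    for line in text.splitlines():
        match = QUALIFIER.match(line)
        if match:
            found.setdefault(match.group(1), []).append(match.group(2))
    return found


def pid_and_uniprot(text):
    """Return (protein_id values, UniProt accessions) found in text."""
    qualifiers = parse_qualifiers(text)
    pids = qualifiers.get("protein_id", [])
    uniprot = [
        xref.split(":", 1)[1]
        for xref in qualifiers.get("db_xref", [])
        if xref.startswith("UniProtKB/")
    ]
    return pids, uniprot


if __name__ == "__main__":
    print(pid_and_uniprot(RECORD))
```

Run on the excerpt, this pairs the one PID in the record with the one UniProt accession listed in its db_xref lines; records without a UniProt xref would simply return an empty second list.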


> Date: Wed, 31 Jan 2007 09:20:38 -0500
> From: Pankaj Jaiswal <pj37 at>
> To: "Michael Ashburner (Genetics)" <ma11 at>
> CC: go at, val at, edimmer at, midori at, kchris at, annotation at
> Subject: Re: use of Uniprot accession Vs GenBank Accession in With column
> Hi,
> PID is the same as GB/EMBL/DDBJ accession number
> e.g.
> /protein_id="AAT37941.1"
> referred in nucleotide entry
> is the same as accession number in
> VERSION     AAT37941.1  GI:47717899
> The version is fine, that refers to any new updates in the entry and 
> they are all tracked. However, in most cases it is not that significant.
> I also raised the problem because a citation rarely refers to 
> Uniprot accessions. Almost always they refer to GB/EMBL/DDBJ 
> accessions. In that case a curator has to go and find the possible 
> Uniprot accession, as Emily has suggested. This, I think, is extra 
> curation load. There are other problems cited in this mail 
> thread as well. So my suggestion is to adopt a universal system that always 
> refers to an EMBL/GB/DDBJ accession number in the association files, and 
> some magic script should be able to link back to all the respective dbs 
> and not just one source.
> On the other hand we should encourage the GB to provide Xrefs to the 
> Uniprot accessions also. I have seen them in unigenes/genes/genomes but 
> not always in protein and nucleotide dbs.
> -Pankaj
> Michael Ashburner (Genetics) wrote:
> > All
> > 
> > Am I being thick or not ? It seems as if the obvious object to refer
> > to, if Uniprot ID is not available, is the PID contained within GenBank
> > EMBL records. This is shared between GB, EMBL and DDBJ. It is versioned
> > and gets over the problem that Val points to:
> > 'it may be a problem to refer 
> > to the Genbank/EMBL accession number as this will often be a cosmid or 
> > contig and contain multiple CDS- in these cases you can't refer to the 
> > gene/protein uniquely  with an EMBL ID.'
> > 
> > Michael
> > 
> > 
