use of Uniprot accession Vs GenBank Accession in With column

Michael Ashburner (Genetics) ma11 at gen.cam.ac.uk
Wed Jan 31 06:47:57 PST 2007


Pankaj

PID IS NOT the same as the GB/EMBL/DDBJ accession number.
You are correct in that the PID is this:
> /protein_id="AAT37941.1"
but the corresponding GB/EMBL/DDBJ accession number is, in the case you cite
ACCESSION   AY607689


Any single ACCESSION can have 0, 1 or >1 PID. The last is especially
true for genomic sequences.

As has been pointed out to you, if you look up an ACCESSION number
at the EBI you will see the UniProt cross-reference. So if you go to
www.ebi.ac.uk and paste AY607689 into the query box on the top page,
you will get this record, which includes these data:

FT                   /codon_start=1
FT                   /product="low temperature-induced low molecular weight
FT                   integral membrane protein LTI6a"
FT                   /note="OsLti6a"
FT                   /db_xref="GOA:Q8H5T6"
FT                   /db_xref="InterPro:IPR000612"
FT                   /db_xref="UniProtKB/Swiss-Prot:Q8H5T6"
FT                   /protein_id="AAT37941.1"
FT                   /translation="MADSTATCIDIILAIILPPLGVFFKFGCGIEFWICLLLTFFGYLP
FT                   GIIYAVWVITK"
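
To make the lookup above concrete, here is a minimal sketch (not an EBI tool) that reads the CDS qualifiers quoted in this message and pulls out the PID and the UniProt cross-reference. The names RECORD, parse_qualifiers and pid_to_uniprot are made up for illustration, and the parser assumes the simple "FT  /key=value" layout shown here; it is not a full EMBL flat-file parser.

```python
import re
from collections import defaultdict

# Feature-table lines copied from the AY607689 record quoted above.
RECORD = '''\
FT                   /codon_start=1
FT                   /product="low temperature-induced low molecular weight
FT                   integral membrane protein LTI6a"
FT                   /note="OsLti6a"
FT                   /db_xref="GOA:Q8H5T6"
FT                   /db_xref="InterPro:IPR000612"
FT                   /db_xref="UniProtKB/Swiss-Prot:Q8H5T6"
FT                   /protein_id="AAT37941.1"
FT                   /translation="MADSTATCIDIILAIILPPLGVFFKFGCGIEFWICLLLTFFGYLP
FT                   GIIYAVWVITK"
'''

# A qualifier line looks like: FT   /key=value
QUALIFIER = re.compile(r'^FT\s+/(\w+)=(.*)$')


def parse_qualifiers(text):
    """Collect /key=value qualifiers, joining wrapped continuation lines."""
    quals = defaultdict(list)
    key = None
    for line in text.splitlines():
        match = QUALIFIER.match(line)
        if match:
            key = match.group(1)
            quals[key].append(match.group(2).strip('"'))
        elif line.startswith('FT') and key is not None:
            # Continuation of the previous value. Sequence data is joined
            # without a space, free text with one.
            sep = '' if key == 'translation' else ' '
            quals[key][-1] += sep + line[2:].strip().strip('"')
    return quals


def pid_to_uniprot(quals):
    """Return (PID, list of UniProt accessions) for one CDS feature."""
    pid = quals['protein_id'][0]
    uniprot = [xref.split(':', 1)[1]
               for xref in quals['db_xref']
               if xref.startswith('UniProtKB')]
    return pid, uniprot


quals = parse_qualifiers(RECORD)
pid, uniprot = pid_to_uniprot(quals)
print(pid, uniprot)
```

A record with several CDS features would need one such lookup per feature, which is the case where a single ACCESSION carries more than one PID.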


Michael

> Date: Wed, 31 Jan 2007 09:20:38 -0500
> From: Pankaj Jaiswal <pj37 at cornell.edu>
> To: "Michael Ashburner (Genetics)" <ma11 at gen.cam.ac.uk>
> CC: go at genome.stanford.edu, val at sanger.ac.uk, edimmer at ebi.ac.uk,
>     midori at ebi.ac.uk, kchris at genome.stanford.edu, annotation at genome.stanford.edu
> Subject: Re: use of Uniprot accession Vs GenBank Accession in With column
> 
> Hi,
> 
> PID is the same as GB/EMBL/DDBJ accession number
> e.g.
> /protein_id="AAT37941.1"
> referred in nucleotide entry
> http://www.ncbi.nlm.nih.gov/entrez/viewer.fcgi?db=nucleotide&val=47717898
> is the same as accession number in
> http://www.ncbi.nlm.nih.gov/entrez/viewer.fcgi?val=AAT37941
> ACCESSION   AAT37941
> VERSION     AAT37941.1  GI:47717899
> 
> The version is fine; it refers to any new updates in the entry, and
> they are all tracked. However, in most cases it is not that significant.
> 
> The problem I raised also arises because a citation rarely refers to
> UniProt accessions; almost always it refers to GB/EMBL/DDBJ
> accessions. In that case a curator has to go and find the corresponding
> UniProt accession, as Emily has suggested. This, I think, is extra
> curation load. There are other problems as well, cited in this mail
> thread. So my suggestion is to adopt a universal system that always
> refers to an EMBL/GB/DDBJ accession number in the association files;
> some magic script should then be able to link back to all the respective
> dbs, not just one source.
> 
> On the other hand we should encourage the GB to provide Xrefs to the 
> Uniprot accessions also. I have seen them in unigenes/genes/genomes but 
> not always in protein and nucleotide dbs.
> 
> -Pankaj
> 
> Michael Ashburner (Genetics) wrote:
> > All
> > 
> > Am I being thick or not? It seems as if the obvious object to refer
> > to, if a UniProt ID is not available, is the PID contained within
> > GenBank/EMBL records. This is shared between GB, EMBL and DDBJ. It is versioned
> > and gets over the problem that Val points to:
> > 'it may be a problem to refer 
> > to the Genbank/EMBL accession number as this will often be a cosmid or 
> > contig and contain multiple CDS- in these cases you can't refer to the 
> > gene/protein uniquely  with an EMBL ID.'
> > 
> > Michael
> > 
> > 
> 



