Phone call minutes

Evelyn Camon camon at
Mon Apr 2 04:04:25 PDT 2007

 >UniProt does not have anything.

can I suggest changing that to:

UniProtKB may not have all proteins, as it relies on CDS data coming from 
EMBL/DDBJ/GenBank as well as data directly submitted via the SPIN 
submission tool:

 >For example, the farm animal grouped in UniPark do not
 > have UniProt ids, but IPI accession Ids.

 > New in UniProt: SPCL to GO (subcellular localization to GO)
Change to:




change to:

For example, many of the proteins from farm animal species (e.g. pig, cow) 
do not yet have UniProt IDs, but they may have Ensembl, UniParc or IPI 
accession IDs.

J Clark wrote:
> Hi Harold,
> Thanks for the great minutes. I have added in the part where I took 
> minutes and pasted below. Would you like to put the whole lot up on the 
> wiki?
> Thanks,
> Jen
> Minutes for GO Outreach call of 3/30/07
> Rama Balakrishnan
> Evelyn Camon
> Jennifer Clark
> Harold Drabkin*
> Michelle Gwinn
> Pascale Gaudet
> Fiona McCarthy
> We attempted to come to a finished product about the IEA flowchart(s). A 
> basic idea is that how one gets IEA annotation depends upon many 
> factors, including whether one can make use of UniProt and where the 
> sequences were submitted (EMBL/GenBank, etc.).
> How to obtain records for one taxon ID from UniProt?
> Sending everything “through” UniProt has limitations. UniProt does not 
> have anything. For example, the farm animal grouped in UniPark do not 
> have UniProt ids, but IPI accession Ids.
> UniProt does not have a great deal of prokaryotic products. TIGR may be 
> a better source for comparisons.
> Things to emphasize
> InterPro domain to GO mappings are meant to be broad. ISS via BLAST 
> helps you get more specific.
> HAMAP to GO : a more manual GO mapping
> New in UniProt: SPCL to GO (subcellular localization to GO)
> Harold attempted to clarify MGI IEA flow chart
> ===================
> During the night we download all the UniProt records for the mouse taxon 
> id. Each UniProt record has a section that lists EMBL and GenBank 
> records (nucleic acid version).
> If any ids match any gene in the MGI database then we keep that 
> record. This record is attached to the marker. That means that we load 
> the Swiss-Prot ids, and the two accessions are linked in a relational 
> database.
> Each record also contains keywords. We don't load the keywords but we map 
> them to GO in house. The keyword2go mapping could be used but MGI makes 
> their own as they have particular needs.
> We load the EC numbers and the domains into the database where they are 
> known to apply to a given gene product.
> (Unless it is a TrEMBL record, in which case we'd only do an EC number.) 
> Not all domains are taken, as we'd get odd results from a partially 
> curated record,
> e.g. for the s6kinase domain there were thousands of them.
> Every night we load GO also.
> Some very broad mapping terms may be filtered, e.g. enzyme.
> =============
> Questions about “marker”: it could be changed to gene. The term is again 
> specific to MGI (and maybe others) because we have many seq_ids (nucleic 
> acid and protein) that are collected under the thing we call Marker. 
> (Originally, MGI was a chromosome mapping db, so marker was appropriate: 
> something one followed in crosses.) Evelyn indicated that there will 
> be/are now manual rules applied to translating the keywords to GO 
> terms, and that perhaps the rules should be shared. However, MGI 
> strictly takes the keywords in the record and takes the GO term that 
> each keyword maps to in the translation table with NO inspection (done 
> nightly by HAL2000).
> Think of ISS or IEA as a “suggestion”.
> When using ISS, note the context within an organism when deciding 
> whether to take the GO terms of the organism that the ISS points to.
> Michelle: TIGR does not make much use of the UniProt resource, but uses 
> Prosite, Pfam, and TIGRE2GO mappings.
> If one wanted to do the IEA on one's own, say for a sequence that is 
> nowhere else other than at the user's site, then it would appear that the 
> best approach would be a domain scan using any of several tools, and 
> then using the IP2GO mappings.
> ==============================================
> Harold Drabkin wrote:
>> Here is what I have so far, based on my scribbles. If you can remember 
>> anything please add.
>> Harold
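
For anyone drafting the flowchart, the nightly MGI load described in the
minutes above might be sketched roughly as follows. This is only an
illustration in Python: the record class, the function names, the
accessions and the sample mapping entries are all assumptions made up for
the example, not MGI's actual schema, code, or in-house keyword table.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class UniProtRecord:
    """Hypothetical, simplified stand-in for one parsed UniProt entry."""
    accession: str
    nucleotide_ids: list[str]                      # EMBL/GenBank cross-references
    keywords: list[str] = field(default_factory=list)


def link_records_to_markers(records, seqid_to_marker):
    """Keep only records whose EMBL/GenBank ids match a known marker.

    seqid_to_marker is an assumed lookup from nucleotide sequence id to
    marker id; records with no matching id are dropped.
    """
    linked = {}
    for rec in records:
        markers = {seqid_to_marker[s] for s in rec.nucleotide_ids
                   if s in seqid_to_marker}
        for marker in markers:
            linked.setdefault(marker, []).append(rec)
    return linked


def iea_annotations(linked, keyword_to_go, broad_terms):
    """Translate keywords to GO ids with no inspection, skipping broad terms."""
    annotations = set()
    for marker, recs in linked.items():
        for rec in recs:
            for kw in rec.keywords:
                go_id = keyword_to_go.get(kw)
                if go_id is not None and go_id not in broad_terms:
                    annotations.add((marker, go_id, "IEA"))
    return annotations
```

The broad-term filter is passed in as a plain set so that each group can
decide for itself which mapped terms are too general to keep.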

Evelyn Camon
GOA Coordinator
Senior Scientific Curator
European Bioinformatics Institute
E-mail: camon at

More information about the go-discuss mailing list