[go-helpdesk] Enrichment GO Help query

Paola Roncaglia paola at ebi.ac.uk
Wed Jun 27 06:34:48 PDT 2012


Dear Alfons,

Since your fungal species is an anamorph for an Aspergillus species, 
yes, you might be able to use the Aspergillus information in the 
database. However, since the Aspergillus genome is still being annotated 
as far as I know, you might prefer not to restrict your search to that 
single fungal species. Also I'm not sure I completely understand your 
approach. If you use your list of induced/repressed genes as an input in 
the AmiGO BLAST tool, rather than the full list of Eurotium genes on 
your microarray, you will still obtain a (long) list of gene products 
that have different P values, and you would not want to use all of them 
in subsequent analyses. For instance, if I input the protein fasta 
sequence of glycerol 3-phosphate dehydrogenase from Eurotium herbariorum 
(http://www.ncbi.nlm.nih.gov/protein/109156739?report=fasta) in the 
AmiGO BLAST tool, the resulting list of gene products ranges from a very 
significant P value (1.6e-122) down to 7.7e-14. You would want to filter 
your results so as to keep only the top match gene product to use in 
subsequent analyses, if you were to follow this approach - and blast 
your induced/repressed genes one by one and not all together. (The BLAST 
tool on AmiGO is also maintained by the GO Software group in case they 
have any comment.)
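
The filtering step described above could be sketched as follows. This is only a minimal illustration, assuming the BLAST results have already been parsed into (query, subject, P value) tuples; the function name, the cutoff of 1e-10, and the hit labels "hit_A"/"hit_B" are hypothetical, and only the two P values come from the example above.

```python
from typing import Dict, Iterable, Tuple

# (query_id, subject_id, p_value); the layout is an assumption for this sketch
Hit = Tuple[str, str, float]


def best_hit_per_query(hits: Iterable[Hit],
                       max_p: float = 1e-10) -> Dict[str, Tuple[str, float]]:
    """Keep, for each query, only the hit with the smallest P value.

    Hits whose P value exceeds max_p (an arbitrary, hypothetical cutoff)
    are discarded before the comparison.
    """
    best: Dict[str, Tuple[str, float]] = {}
    for query, subject, p in hits:
        if p > max_p:
            continue
        if query not in best or p < best[query][1]:
            best[query] = (subject, p)
    return best


# Toy input: one query with the two extreme P values quoted above.
hits = [
    ("gpd_Eurhe", "hit_A", 1.6e-122),
    ("gpd_Eurhe", "hit_B", 7.7e-14),
]
print(best_hit_per_query(hits))
```

Running one query at a time, as suggested above, simply means that each call sees the hits of a single gene; the function also works on a pooled list because it groups by query ID.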

Sorry I can't be of more help and thanks for your patience,
Paola

On 6/27/12 1:53 PM, Alfons Weig wrote:
>
> Hi,
>
> it is Eurotium herbariorum.
>
> Best regards
>
> Alfons
>
> *Dr. Alfons Weig*
>
> DNA-Analytik & Ökoinformatik - Univ. Bayreuth
>
> Universitätsstrasse 30
>
> 95447 Bayreuth - Germany
>
> Tel. +49 (0)921-552457
>
> www.daneco.uni-bayreuth.de
>
> *From:* Paola Roncaglia [mailto:paola at ebi.ac.uk]
> *Sent:* Wednesday, 27 June 2012 14:47
> *To:* Alfons Weig
> *Cc:* go-helpdesk at mailman.stanford.edu; go-software at mailman.stanford.edu
> *Subject:* Re: AW: AW: AW: Enrichment GO Help query
>
> Dear Alfons,
>
> Thanks for your reply. Could you please indicate what the fungal 
> species is? That might help to answer your questions.
>
> Thanks and regards,
> Paola
>
> On 6/27/12 1:27 PM, Alfons Weig wrote:
>
> Dear Paola,
>
> the data are probably not yet included in public databases, although 
> the genome data and the annotations are already available from the DOE 
> JGI. Let's wait and hear what the west coast will suggest!
>
> I have also tried to use the BLAST tool at AmiGO. If I understood it 
> correctly, it could be used to blast protein sequences against 
> annotated proteins, and the GO terms associated with the blast hits 
> could then be used to look for GO term enrichment.
>
> I think the Aspergillus annotations could be used as a background, but 
> I was not able to combine more than one BLAST result in subsequent 
> analyses. Would this approach be an alternative?
>
> Best regards
>
> Alfons
>
> *From:* Paola Roncaglia [mailto:paola at ebi.ac.uk]
> *Sent:* Wednesday, 27 June 2012 14:15
> *To:* Alfons Weig
> *Cc:* go-helpdesk at mailman.stanford.edu 
> <mailto:go-helpdesk at mailman.stanford.edu>; 
> go-software at mailman.stanford.edu <mailto:go-software at mailman.stanford.edu>
> *Subject:* Re: AW: AW: Enrichment GO Help query
>
> Dear Alfons,
>
> The error message shown below suggests that the AmiGO enrichment tool 
> may not be able to process unrecognized IDs even if you provide a 
> background set containing GO annotations for all those IDs. Before 
> suggesting other solutions, I'd wait to hear back from the GO Software 
> group if they have any comments, as AmiGO is maintained by them. 
> They're located on the West Coast of the US, so it may be a few hours 
> before they're able to get back to you.
>
> As for the GAF file format, you may find details here:
> http://www.geneontology.org/GO.format.gaf-2_0.shtml
> It seems that your data have not been submitted to, or included in, 
> any public database. If that is the case, you wouldn't be able to 
> create a properly formatted GAF file, as column 1 would have to 
> indicate the database. I was hoping that a simple tab-delimited text 
> file, containing the information you do have, would be sufficient for 
> the tool, but this may not be the case if AmiGO can't relate your IDs 
> to any database.
> In your example below, if you did have a database that you could 
> indicate, that would have to be column 1 (not "GO" ) if you wanted to 
> make a GAF file. And yes, the gene product IDs should go in column 2.
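
To make the column layout concrete, here is a minimal sketch of writing one GAF 2.0 row (17 tab-separated columns) from the kind of table the sequencing project provided. Several values are assumptions and not part of the original data: the database label "JGI", the reference placeholder, and the evidence code "IEA" are hypothetical stand-ins, and it is untested whether AmiGO would accept a file built this way.

```python
# Map the long aspect names from the project table to GAF one-letter codes.
ASPECT = {
    "biological_process": "P",
    "molecular_function": "F",
    "cellular_component": "C",
}

GAF_HEADER = "!gaf-version: 2.0"


def gaf_line(protein_id: str, go_acc: str, go_type: str, *,
             db: str = "JGI",                     # hypothetical database label
             reference: str = "PLACEHOLDER_REF",  # hypothetical; a real DB:Reference is required
             evidence: str = "IEA",               # assumption: electronic annotation
             taxon: str = "taxon:41413",
             date: str = "20120627",
             assigned_by: str = "JGI") -> str:
    """Return one GAF 2.0 row; optional columns are left empty."""
    cols = [
        db,               # 1  DB
        protein_id,       # 2  DB Object ID
        protein_id,       # 3  DB Object Symbol (ID reused, no symbol available)
        "",               # 4  Qualifier
        go_acc,           # 5  GO ID
        reference,        # 6  DB:Reference
        evidence,         # 7  Evidence Code
        "",               # 8  With/From
        ASPECT[go_type],  # 9  Aspect
        "",               # 10 DB Object Name
        "",               # 11 DB Object Synonym
        "protein",        # 12 DB Object Type
        taxon,            # 13 Taxon
        date,             # 14 Date
        assigned_by,      # 15 Assigned By
        "",               # 16 Annotation Extension
        "",               # 17 Gene Product Form ID
    ]
    return "\t".join(cols)


print(GAF_HEADER)
print(gaf_line("12647", "GO:0001584", "molecular_function"))
```

The point of the sketch is only the column order: the database goes in column 1, the gene product ID in column 2, and the aspect is a single letter in column 9.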
>
> I'm sorry I don't have any more suggestions at this time but please 
> bear with us while we wait to hear from the GO Software group.
> Thank you and best regards,
>
> Paola Roncaglia, for GO help.
>
> On 6/27/12 11:48 AM, Alfons Weig wrote:
>
> Hello Paola,
>
> I have tried to follow your instructions but was not able to overcome 
> the error shown below. I am not sure in which column of a GAF file the 
> ID should show up, but I assume it is column 2.
>
> I prepared two files
>
> Ex. 1, without col 1 filled in:
>
>     100258  jgi|Eurhe1|100258|CE89580_18773  0  GO:0006468  0  ND  biological_process  protein amino acid phosphorylation  protein  taxon:41413  20120627  GO  0
>     100258  jgi|Eurhe1|100258|CE89580_18773  0  GO:0018110  0  ND  molecular_function  histone arginine kinase activity  protein  taxon:41413  20120627  GO  0
>     100258  jgi|Eurhe1|100258|CE89580_18773  0  GO:0004871  0  ND  molecular_function  signal transducer activity  protein  taxon:41413  20120627  GO  0
>     100258  jgi|Eurhe1|100258|CE89580_18773  0  GO:0018106  0  ND  biological_process  peptidyl-histidine phosphorylation  protein  taxon:41413  20120627  GO  0
>     100258  jgi|Eurhe1|100258|CE89580_18773  0  GO:0004673  0  ND  molecular_function  protein histidine kinase activity  protein  taxon:41413  20120627  GO  0
>     100258  jgi|Eurhe1|100258|CE89580_18773  0  GO:0016772  0  ND  molecular_function  transferase activity, transferring phosphorus-containing groups  protein  taxon:41413  20120627  GO  0
>
> Ex. 2, with col 1 filled in (I am not sure whether 'GO' is correct)
>
> GO  100258  jgi|Eurhe1|100258|CE89580_18773  0  GO:0006468  0  ND  P  protein amino acid phosphorylation  protein  taxon:41413  20120627  GO  0
> GO  100258  jgi|Eurhe1|100258|CE89580_18773  0  GO:0018110  0  ND  F  histone arginine kinase activity  protein  taxon:41413  20120627  GO  0
> GO  100258  jgi|Eurhe1|100258|CE89580_18773  0  GO:0004871  0  ND  F  signal transducer activity  protein  taxon:41413  20120627  GO  0
> GO  100258  jgi|Eurhe1|100258|CE89580_18773  0  GO:0018106  0  ND  P  peptidyl-histidine phosphorylation  protein  taxon:41413  20120627  GO  0
> GO  100258  jgi|Eurhe1|100258|CE89580_18773  0  GO:0004673  0  ND  F  protein histidine kinase activity  protein  taxon:41413  20120627  GO  0
> GO  100258  jgi|Eurhe1|100258|CE89580_18773  0  GO:0016772  0  ND  F  transferase activity, transferring phosphorus-containing groups  protein  taxon:41413  20120627  GO  0
>
> Does this information help?
>
> Best regards
>
> Alfons
>
> *From:* Paola Roncaglia [mailto:paola at ebi.ac.uk]
> *Sent:* Wednesday, 27 June 2012 11:48
> *To:* Alfons Weig
> *Cc:* go-helpdesk at mailman.stanford.edu 
> <mailto:go-helpdesk at mailman.stanford.edu>; 
> go-software at mailman.stanford.edu <mailto:go-software at mailman.stanford.edu>
> *Subject:* Re: AW: Enrichment GO Help query
>
> Dear Alfons,
>
> Thank you for your email. If I understand correctly, the identifiers 
> in the column labeled "#proteinID" have been arbitrarily assigned by 
> your sequencing lab? Are these identifiers already included in any 
> public database? Because if so, I might point you to a way to convert 
> your gene product IDs into something that the enrichment tool will 
> recognize.
>
> You might still be able to use the AmiGO enrichment tool as follows - 
> I'm not entirely sure this will work as I've never used it with a 
> newly sequenced species, but it's worth a try, and I'm cc-ing the GO 
> Software group on this email in case they have any comments:
>
> In the field labeled "Input your gene products", enter the list of 
> identifiers for your induced/repressed genes. These identifiers should 
> be in the same format as the ones in the column labeled "#proteinID" 
> in your example (they should be a subset of those proteinIDs). You may 
> simply write the IDs in the box (separated by a whitespace), or upload 
> a text file with the IDs separated by a newline or a whitespace.
>
> Then, in the field labeled "Input your background set", upload the 
> full dataset you received from the genome sequencing project. You may 
> not need a full GAF file; the information you have in your example 
> might suffice. If this does not work, we might have to explore other 
> solutions.
>
> In the field labeled "Select the database filter", click on "No 
> selection".
>
> In the field labeled "IEA annotations", "use IEAs in calculation" - 
> select "yes". Because your fungal species is newly sequenced, most of 
> the annotations are likely to be IEAs (inferred from electronic 
> annotation); please review the manual page for the AmiGO enrichment 
> tool, if you haven't already done so:
> http://wiki.geneontology.org/index.php/AmiGO_Manual:_Term_Enrichment
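
For orientation, the idea behind a study-set versus background-set comparison can be sketched with a plain hypergeometric test. This is an illustration only, not a description of how the AmiGO tool is implemented: it assumes a table of (proteinId, goAcc) pairs like the one in the example dataset, does not propagate annotations to parent terms, and applies no multiple-testing correction. All function names are hypothetical.

```python
from collections import defaultdict
from math import comb
from typing import Dict, Iterable, Tuple


def hypergeom_tail(k: int, K: int, n: int, N: int) -> float:
    """P(X >= k) when n genes are drawn from N, of which K carry the term."""
    upper = min(K, n)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)


def enrichment(annotations: Iterable[Tuple[str, str]],
               study: Iterable[str]) -> Dict[str, float]:
    """Return an uncorrected P value for every term seen in the study set.

    annotations: (protein_id, go_acc) pairs for the whole background.
    study: IDs of the induced/repressed genes (a subset of the background).
    """
    genes_by_term = defaultdict(set)
    background = set()
    for gene, term in annotations:
        genes_by_term[term].add(gene)
        background.add(gene)
    study_set = set(study) & background
    results = {}
    for term, genes in genes_by_term.items():
        k = len(genes & study_set)
        if k:
            results[term] = hypergeom_tail(k, len(genes), len(study_set),
                                           len(background))
    return results


# Toy background built from the four example rows of the project table.
annotations = [
    ("12647", "GO:0001584"),
    ("12647", "GO:0007186"),
    ("12647", "GO:0016021"),
    ("13104", "GO:0000166"),
]
print(enrichment(annotations, ["12647"]))
```

With a realistic background of several thousand annotated genes, the resulting P values would still need a multiple-testing correction before interpretation.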
>
> I presume your microarray is not a commercial one. If it were 
> commercial, there may be other publicly available enrichment tools you 
> could use - feel free to let us know in that case.
>
> As for your AmiGO BLAST questions:
> A manual page for the tool is available here: 
> http://wiki.geneontology.org/index.php/AmiGO_Manual:_BLAST
> In particular, see the section "Entering sequences". You may use more 
> than one gene, but the format to use depends, again, on whether your 
> gene products are included in any public database or not. If not, you 
> may use FASTA sequences. Please refer to the manual for full details. 
> You probably would not want to use BLAST results to do an enrichment 
> analysis, as those results will be from species that may be quite 
> different from your fungal one.
>
> I hope this answers your queries. If the AmiGO term enrichment doesn't 
> work, please let us know and we'll look into other solutions.
>
> With best regards,
>
> Paola Roncaglia, for GO help.
>
> On 6/26/12 3:08 PM, Alfons Weig wrote:
>
> Dear Paola,
>
> thank you for your fast reply.
>
> Unfortunately, I do not have a full gene association file. I have 
> already looked at your GAF 1.0 and GAF 2.0 specifications, but there 
> are too many columns in there that I cannot fill with my data. The 
> only dataset I received from the genome sequencing project has the 
> following content (4 lines as an example). It is probably too little 
> to create a valid GAF background file.
>
> #proteinId  gotermId  goName                                                gotermType          goAcc
> 12647       815       rhodopsin-like receptor activity                      molecular_function  GO:0001584
> 12647       5270      G-protein coupled receptor protein signaling pathway  biological_process  GO:0007186
> 12647       9321      integral to membrane                                  cellular_component  GO:0016021
> 13104       162       nucleotide binding                                    molecular_function  GO:0000166
>
> etc.
>
> I have also tried to figure out what exactly a "list containing gene 
> products" as the background set would mean (see the screenshot below, 
> taken from the GO Term Enrichment page), but I could not find any 
> example of that type of file.
>
> So, I was looking for a tool which would allow me to enter GO terms 
> directly (taken from induced/repressed genes) and to overlay them on a 
> GO graph.
>
> I have also seen the BLAST tool at AmiGO, but I could not figure out 
> how to combine BLAST results from more than one gene to initiate a 
> subsequent GO term enrichment analysis; I can only use the results of 
> one single gene, right?
>
> Thanks for your help!
>
> Best regards
>
> Alfons
>
> *From:* Paola Roncaglia [mailto:paola at ebi.ac.uk]
> *Sent:* Tuesday, 26 June 2012 15:46
> *To:* a.weig at uni-bayreuth.de <mailto:a.weig at uni-bayreuth.de>
> *Cc:* go-helpdesk at mailman.stanford.edu 
> <mailto:go-helpdesk at mailman.stanford.edu>
> *Subject:* Re: Enrichment GO Help query
>
> Dear Alfons,
>
> Thank you for writing to GO.
>
> I have a few questions so I may provide you with better indications. 
> Do you have a gene association file for the fungal species you're 
> working on? That is, a file that includes both the full list of genes 
> in your fungus' genome (and/or the genes on the microarray) and the GO 
> terms associated to each gene. Is the microarray you're using a 
> commercial one?
>
> For your reference, a list of GO tools for enrichment analysis is 
> available here:
> http://www.geneontology.org/GO.tools_by_type.term_enrichment.shtml
>
> With best regards,
>
> Paola Roncaglia, for GO help.
>
>
>
>
>
>
> Subject: GO Help query (from website)
> From: a.weig at uni-bayreuth.de <mailto:a.weig at uni-bayreuth.de>
> Date: Tue, 26 Jun 2012 00:14:36 -0700
> To: go-helpdesk at mailman.stanford.edu <mailto:go-helpdesk at mailman.stanford.edu>
>
> Email:a.weig at uni-bayreuth.de  <mailto:a.weig at uni-bayreuth.de>
> Name: Ontology (from Alfons Weig)
> Text: Hello,
> I have identified induced/repressed genes by microarray analysis from a fully sequenced fungus. Most of these genes were already annotated with GO terms during the sequencing and annotation project. However, this organism is not included among the reference organisms in AmiGO's GO Term Enrichment tool.
> I am looking for a tool where I can enter GO terms directly to see whether some terms are enriched. My hope was to use GO Slimmer with the fungal or Aspergillus slims, but, again, genes are required as input.
>   
> Is there any tool where I can analyse a subset of GO terms for enrichment/slim analyses?
>   
> Best regards
> Alfons Weig
>
>
>
>
>
>
> -- 
> Dr Paola Roncaglia
> GO Editorial Office
> EMBL-EBI
> Wellcome Trust Genome Campus
> Hinxton
> Cambridgeshire, UK
> CB10 1SD
> p: +44 1223 492600
> f: +44 1223 494468

-- 
Dr Paola Roncaglia
GO Editorial Office
EMBL-EBI
Wellcome Trust Genome Campus
Hinxton
Cambridgeshire, UK
CB10 1SD
p: +44 1223 492600
f: +44 1223 494468

-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: image/png
Size: 19843 bytes
Desc: not available
URL: <http://mailman.stanford.edu/pipermail/go-helpdesk/attachments/20120627/a1c81597/attachment.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: image/png
Size: 11321 bytes
Desc: not available
URL: <http://mailman.stanford.edu/pipermail/go-helpdesk/attachments/20120627/a1c81597/attachment-0001.png>

