how to define GO groups with a certain size boundary

Albert Vilella avilella at gmail.com
Tue Nov 8 06:55:37 PST 2005


Hi all go-dev-elopers and gofriends,

> >> I am trying to split a large subset of the GO-annotated FlyBase genes
> >> into GO groups to use them as categories for my analyses.
> >>
> >> But I would like to merge the groups that have too few genes into
> >> their upper GO level and split those that accumulate too many genes
> >> into their lower GO level.
> >>
> >> From what I have seen in most of the published articles, this is the
> >> reverse of what most people (for example, in the microarray field)
> >> are doing: performing an analysis to obtain a subset of genes, then
> >> looking for enrichments in the GO DB distribution of ids.
> >>
> >> I would like to ask if anybody has come across this situation before
> >> or if anyone has any suggestions about how to do this.
> >>
> >> It is worth mentioning that this merge/split trick was used in the
> >> chimp genome paper ("Initial sequence of the chimpanzee genome and
> >> comparison with the human genome" (Nature)), although they did it
> >> "a posteriori", only in the categories that showed a significant
> >> difference in a given analysis.

On Tue, 8 Nov 2005 at 13:45 +0000, Jane Lomax wrote:

> Hi Albert - as far as I know there isn't an easy way to do this, but
> I'll explain the long way to do it. It may be worth emailing GO friends
> (gofriends at geneontology.org) though, as there are often tools out
> there that we don't yet know about.
>
> So a way you could do it would be to create your own set of high-level
> GO categories (called a GO slim, see
> http://www.geneontology.org/GO.slims.shtml), and then use this to sort
> your annotation set into those categories. You can create a GO slim
> using DAG-Edit (http://www.geneontology.org/GO.tools.shtml#in_house) -
> there are some instructions on doing this here:
>
> http://www.geneontology.org/GO.teaching.resources.shtml#tut
>
> (I will get round to making proper documentation for doing this soon,
> I promise!). Then you can use the Perl script map2slim
> (http://www.geneontology.org/GO.slims.shtml#script) to 'bucket' your
> annotations into your categories - it may be easier for you to use a
> web-based implementation of this script, e.g. Generic GO Term Mapper
> (http://www.geneontology.org/GO.tools.other.shtml#ggtm).
>
> The difficult thing will be that you will have to keep adjusting your
> GO slim set and re-running the mapping script to see which categories
> have too many annotations until you get the correct balance. It would
> be nice to have a tool to do this automatically.
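
To make the "adjust the slim, re-run, check the counts" loop concrete, here is a minimal Python sketch of the counting step only. It is not map2slim and does not use its interface: the function name `bucket_counts` and the toy input dictionaries are hypothetical, and this simplified version counts a gene under every slim term that is an ancestor of one of its annotations, which may differ from how the real script assigns terms.

```python
def bucket_counts(parents, annotations, slim_terms):
    """Count genes per slim term.

    parents:     dict term -> list of parent terms (hypothetical toy DAG)
    annotations: dict gene -> set of terms the gene is annotated to
    slim_terms:  set of chosen high-level terms (the candidate GO slim)
    """
    def ancestors(term, seen=None):
        # Collect the term itself plus everything reachable via parents.
        seen = set() if seen is None else seen
        if term not in seen:
            seen.add(term)
            for parent in parents.get(term, ()):
                ancestors(parent, seen)
        return seen

    buckets = {term: set() for term in slim_terms}
    for gene, terms in annotations.items():
        for term in terms:
            # A gene lands in every slim term at or above its annotation.
            for hit in ancestors(term) & slim_terms:
                buckets[hit].add(gene)
    return {term: len(genes) for term, genes in buckets.items()}
```

After each run, slim terms whose count is too high would be swapped for their children and terms whose count is too low would be dropped in favour of their parent, then the count is repeated.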

Yes, from my humble biology-degree, not-graph-theory-guru background, I
understand that the tricky bit is balancing the categories: splitting
the large GO ids into their children and merging the small GO ids into
their parent.

From the tiny amount of investigation I have done, I believe that if GO
were a cyclic graph, this would be a challenging algorithm.

But with a DAG with the characteristics of GO, I bet there must be a
graph theory guru who can enlighten me with an algorithm that makes
these adjustments.
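
One possible shape for such an algorithm, as a rough Python sketch and not an established method: walk down from the root, keep any term whose gene set fits under a maximum size, split larger terms into their children, and fold children below a minimum size back into the parent. The function name `balance_categories`, the size thresholds, and the dictionary inputs are all assumptions made for illustration. Because GO is a DAG, a term can be reached through more than one parent, so a gene may end up in several categories, and the genes folded back into a parent can themselves fall outside the size bounds.

```python
def balance_categories(children, direct, root, min_size, max_size):
    """Return {term: set_of_genes} after a top-down split/merge pass.

    children: dict term -> list of child terms (hypothetical toy DAG)
    direct:   dict term -> set of genes annotated directly to that term
    """
    cache = {}

    def genes_under(term):
        # Genes annotated to the term or any descendant (memoised for the DAG).
        if term not in cache:
            genes = set(direct.get(term, ()))
            for child in children.get(term, ()):
                genes |= genes_under(child)
            cache[term] = genes
        return cache[term]

    categories = {}

    def visit(term):
        genes = genes_under(term)
        kids = children.get(term, ())
        if len(genes) <= max_size or not kids:
            # Small enough, or a leaf that cannot be split: keep whole.
            categories[term] = set(genes)
            return
        # Too large: split. Small children are merged into this term,
        # together with the genes annotated directly to it.
        leftover = set(direct.get(term, ()))
        for child in kids:
            if len(genes_under(child)) < min_size:
                leftover |= genes_under(child)
            else:
                visit(child)
        if leftover:
            categories[term] = leftover

    visit(root)
    return categories
```

For example, with a toy tree where "A" has children "A1" (three genes) and "A2" (one gene) plus one directly annotated gene, `min_size=2` and `max_size=3` keep "A1" as its own category and merge the "A2" gene into "A".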

Manually balancing the DAG is worth a try.

Does anyone have any hints?

Bests,

    Albert.


--
This message is from the GOFriends moderated mailing list.  A list of public
announcements and discussion of the Gene Ontology (GO) project.
Problems with the list?           E-mail: owner-gofriends at geneontology.org
Subscribing   send   "subscribe"   to   gofriends-request at geneontology.org
Unsubscribing send   "unsubscribe"  to  gofriends-request at geneontology.org
Web:          http://www.geneontology.org/


