[Gofriends] matrices from terms

Gabriel Berriz gberriz at
Tue Nov 18 06:23:00 PST 2008

On 2008.11.14 Fri, at 14:06, Chris Mungall wrote:
> The generally recommended way to do this is via the graph_path table
> in the GO database. You can query either a local installation, or a
> public mirror, via
> The documentation for this table is here:
> However, at this time this table contains the transitive closure
> computed by considering all edges, ignoring edge labels (relations).
> Thus it does not satisfy your requirement that only the is_a relation
> is considered

Chris, couldn't one get the information Nicholas wants directly from  
the table term2term in the GO database?  Unlike the graph_path table,  
this table does provide edge labels.  One could use a recursive  
descendants function together with a children function that returns  
only is_a children (from term2term).  In Perl-ish it would look like  

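sub descendants {
   my $node = shift;

   # By convention, a node counts among its own descendants.
   my @descendants = ( $node );

   # children() is expected to return only the is_a children of $node.
   for my $child ( children( $node ) ) {
     push @descendants, descendants( $child );
   }

   return @descendants;
}
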
In English, the descendants of a node are the union of its
children's descendants, plus the node itself.  (Here we follow the
convention, also followed by the graph_path table, of including the
node among its descendants.)

This is just a sketch.  As written it returns a list of descendants  
that will generally contain duplicates.  It would be better if these  
duplicates were removed.  Also, its performance can be optimized  
significantly by avoiding unnecessary recursion through some form of  
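
One way to address both points, sketched in Python under some assumptions: keep a set of nodes already seen, so that each node is reported once and its children are looked up only once.  This swaps the recursion for an explicit stack plus a visited set, which is only one possible approach and not necessarily the one meant above.  The names TOY_IS_A and toy_children are hypothetical stand-ins for a real children function.

```python
def descendants(node, children):
    """Return the set containing node and every node reachable from it.

    children is any callable that maps a node to its direct is_a children.
    """
    seen = {node}          # the node counts among its own descendants
    stack = [node]
    while stack:
        current = stack.pop()
        for child in children(current):
            if child not in seen:   # skip subtrees already walked
                seen.add(child)
                stack.append(child)
    return seen


# Hypothetical toy graph: D is reachable from A through both B and C.
TOY_IS_A = {
    "A": ["B", "C"],
    "B": ["D"],
    "C": ["D"],
    "D": [],
}


def toy_children(node):
    return TOY_IS_A.get(node, [])


all_of_a = descendants("A", toy_children)
```

Because the result is a set, D appears once even though two paths lead to it.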

The children function used by the descendants function would be  
implemented using something like the following SQL query:

SELECT term2_id FROM term2term tt, term t WHERE
tt.relationship_type_id = t.id AND t.acc = 'is_a' AND tt.term1_id = ?;
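
As a rough illustration of how such a children function might wrap the query, here is a Python sketch that runs against a tiny in-memory SQLite database.  Several things are assumptions: the two tables are cut down to just the columns named in the query, the join condition is taken to be tt.relationship_type_id = t.id, and the rows and TOY: accessions are made up.  A real GO database would need its own driver and connection in place of sqlite3.

```python
import sqlite3

# Same query as above, written with an explicit JOIN (assumed join on t.id).
IS_A_CHILDREN_SQL = """
SELECT tt.term2_id
FROM term2term tt
JOIN term t ON tt.relationship_type_id = t.id
WHERE t.acc = 'is_a' AND tt.term1_id = ?
"""


def is_a_children(conn, term_id):
    """Return the ids of the direct is_a children of term_id."""
    rows = conn.execute(IS_A_CHILDREN_SQL, (term_id,)).fetchall()
    return [row[0] for row in rows]


def make_toy_db():
    """Build a hypothetical, minimal stand-in for the two tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE term (id INTEGER PRIMARY KEY, acc TEXT);
        CREATE TABLE term2term (
            relationship_type_id INTEGER,
            term1_id INTEGER,   -- parent
            term2_id INTEGER    -- child
        );
        INSERT INTO term (id, acc) VALUES
            (1, 'is_a'), (2, 'part_of'),
            (10, 'TOY:0010'), (11, 'TOY:0011'), (12, 'TOY:0012');
        INSERT INTO term2term VALUES
            (1, 10, 11),   -- 11 is_a 10
            (2, 10, 12);   -- 12 part_of 10
    """)
    return conn


conn = make_toy_db()
kids = is_a_children(conn, 10)
```

Only term 11 comes back for term 10 here, since the edge to term 12 is labelled part_of rather than is_a.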


Gabriel F. Berriz, PhD
Bioinformatics Developer
Roth Lab
Biological Chemistry and Molecular Pharmacology -- Harvard Medical  
Seeley G. Mudd Building 322B
Boston, MA 02115-5701
Telephone: 617.432.3555
Fax: 617.432.3557
