GO and PROSITE

Rolf Apweiler apweiler at ebi.ac.uk
Sat Apr 6 04:53:10 PST 2002


Hi Xinghua,

> We have tried different methods to associate GO terms with PROSITE
> patterns.  The results are interesting and promising. However, most of
> the time, the GO term tends to be more general than the annotation of
> the PROSITE pattern.  We need some "gold standard" to evaluate our results.
> Alternatively, we can make our system available on the web and have
> friends at geneontology evaluate the results.  Eventually, if the system works
> well, it can be part of GO.  I hope this will be of interest?

I believe you are wasting your time. This mapping is already done and is
constantly updated. I have just sent you the mappings. As Wolfgang already
said:

> > PROSITE is part of InterPro, see www.ebi.ac.uk/interpro for details.
> > Nicky Mulder started to map InterPro entries to GO terms. As all PROSITE
> > patterns have exactly one InterPro accession, her mapping can be
> > translated easily from InterPro -> GO to PROSITE -> GO. If you have
> > difficulties doing that, mail again, we surely can help with that.
> >
> > Similarly, searching for PROSITE patterns is part of searching for
> > InterPro entries. You can use this online or download and install it at
> > your site. "InterProScan" has an option to do the mapping to GO terms.
> > Input is an amino acid sequence, output is a set of GO terms.

If you want to know more about InterPro2GO (and thus also PROSITE2GO), here
are the details:

InterPro (Apweiler et al., 2001) is an integrated documentation resource for
protein families, domains and sites, developed initially as a means of
rationalising the complementary efforts of the PROSITE (Falquet et al., 2002),
PRINTS (Attwood et al., 2002), Pfam (Bateman et al., 2002) and ProDom (Corpet
et al., 2000) databases. The project has now been extended to include SMART
(Letunic et al., 2002) and TIGRFAMs (Haft et al., 2001).

InterPro entries provide annotation describing a set of related proteins, some
of which may have identical molecular functions, be involved in the same
processes, and perform their function in the same cellular locations. Mapping
InterPro entries to GO terms thus provides an automatic means of assigning GO
terms to the corresponding proteins. GO terms were assigned to InterPro
entries by manually inspecting the abstract of each entry and the annotation
of the proteins in its match list, and then mapping the appropriate GO terms,
at any level, that apply to the whole protein and not necessarily only to the
domain described. The associated GO terms should also apply to all proteins
with true hits to all signatures in the InterPro entry. For each associated
term, the name of the term and its GO accession number are given, and these
are visible in InterPro entries, with links to the EBI QuickGO browser. In
this way, all proteins belonging to InterPro entries mapped to GO terms can be
automatically mapped to these GO terms. An additional advantage is that
multifunctional proteins can be mapped to multiple GO terms through
associations with more than one matched InterPro entry.
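
As a rough illustration of the transfer step described above, the following
Python sketch takes the union of GO terms over every InterPro entry a protein
matches. All accessions and the names go_terms_for_protein, interpro2go and
protein_matches are made-up placeholders for illustration, not real InterPro
data or any InterPro tool's API.

```python
from typing import Dict, Iterable, Set


def go_terms_for_protein(matched_entries: Iterable[str],
                         interpro2go: Dict[str, Set[str]]) -> Set[str]:
    """Return the union of GO terms over all InterPro entries a protein matches."""
    terms: Set[str] = set()
    for entry in matched_entries:
        # Entries without a mapping contribute nothing.
        terms |= interpro2go.get(entry, set())
    return terms


# Placeholder data (hypothetical accessions, for illustration only).
interpro2go = {
    "IPR_A": {"GO:0000001"},
    "IPR_B": {"GO:0000002", "GO:0000003"},
    "IPR_C": set(),  # an entry that could not be mapped at all
}
protein_matches = {
    "P_X": ["IPR_A", "IPR_B"],  # multifunctional: matches two entries
    "P_Y": ["IPR_C"],
}

protein2go = {protein: go_terms_for_protein(entries, interpro2go)
              for protein, entries in protein_matches.items()}
```

Here P_X picks up terms from both of its matched entries, which is the
multifunctional case mentioned above, while P_Y gets no terms because its only
entry has no mapping.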

Some entries could be mapped to very deep level (specific) GO terms, while
entries describing wider families or common domains could only be mapped to
higher level terms or could not be mapped at all. In many cases where there is
a parent/child relationship in InterPro, a protein can be mapped to a high
level term through the parent entry as well as to a specific term through a
more specific child entry.

The integrity of the InterPro to GO mappings is maintained by running regular
sanity checks on the data. The checks include searching for mappings from
secondary or deleted InterPro accession numbers, and mappings to obsolete or
non-existent GO terms. The reports are manually checked and corrected.
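
The kind of check described above could look roughly like the sketch below.
It is only an assumption about how such a report might be produced, not the
actual InterPro procedure; the function name check_mappings, the input sets
and all accessions are hypothetical.

```python
from typing import Iterable, List, Set, Tuple


def check_mappings(mappings: Iterable[Tuple[str, str]],
                   current_entries: Set[str],
                   secondary_entries: Set[str],
                   valid_go: Set[str],
                   obsolete_go: Set[str]) -> List[Tuple[str, str, str]]:
    """Report (entry, GO id, problem) for every suspicious mapping."""
    problems = []
    for entry, go_id in mappings:
        # InterPro side: secondary or deleted accession numbers.
        if entry in secondary_entries:
            problems.append((entry, go_id, "secondary InterPro accession"))
        elif entry not in current_entries:
            problems.append((entry, go_id, "deleted or unknown InterPro accession"))
        # GO side: obsolete or non-existent terms.
        if go_id in obsolete_go:
            problems.append((entry, go_id, "obsolete GO term"))
        elif go_id not in valid_go:
            problems.append((entry, go_id, "non-existent GO term"))
    return problems


# Placeholder data (hypothetical accessions, for illustration only).
report = check_mappings(
    mappings=[("IPR_A", "GO:0000001"),   # fine
              ("IPR_OLD", "GO:0000001"), # secondary accession
              ("IPR_A", "GO:0000009"),   # obsolete term
              ("IPR_GONE", "GO:0000404")],  # deleted entry, unknown term
    current_entries={"IPR_A"},
    secondary_entries={"IPR_OLD"},
    valid_go={"GO:0000001"},
    obsolete_go={"GO:0000009"},
)
```

The resulting report is what would then be checked and corrected by hand.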

The data is available directly from the InterPro database at
http://www.ebi.ac.uk/interpro/ or through an InterPro-to-GO flat-file available
from the EBI ftp site [ftp://ftp.ebi.ac.uk/pub/databases/interpro]. This file
lists the InterPro entries and corresponding GO terms. A protein-to-GO file is
also produced, which maps proteins via their InterPro entries to GO terms. The results are
used for GO association files and for the Proteome Analysis pages
[http://www.ebi.ac.uk/proteome/]. The data is also available via the EBI SRS
server at http://srs6.ebi.ac.uk/. It is possible to search a sequence against
InterPro using InterProScan at http://www.ebi.ac.uk/interpro/scan.html, and
then link the results to the appropriate GO term through the InterPro-to-GO
associations.
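
To show how the flat file can be turned into a PROSITE -> GO table, here is a
minimal Python sketch. It assumes that each mapping line looks like
"InterPro:<accession> <name> > GO:<term name> ; GO:<id>" and that header lines
start with "!"; please check this against the file you download. The PROSITE
to InterPro table, the function names and all accessions below are made-up
placeholders.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, Set

# Assumed line layout: InterPro:IPRnnnnnn name > GO:term name ; GO:nnnnnnn
LINE_RE = re.compile(r"^InterPro:(IPR\d+)\s.*>\s*GO:.*;\s*(GO:\d+)\s*$")


def parse_interpro2go(lines: Iterable[str]) -> Dict[str, Set[str]]:
    """Collect the GO ids listed for each InterPro accession."""
    mapping = defaultdict(set)
    for line in lines:
        line = line.strip()
        if not line or line.startswith("!"):  # skip blank and header lines
            continue
        match = LINE_RE.match(line)
        if match:
            mapping[match.group(1)].add(match.group(2))
    return dict(mapping)


def prosite_to_go(prosite2interpro: Dict[str, str],
                  interpro2go: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Each PROSITE pattern has exactly one InterPro accession, so just compose."""
    return {ps: interpro2go.get(ipr, set())
            for ps, ipr in prosite2interpro.items()}


# Placeholder input (hypothetical accessions, for illustration only).
sample_lines = [
    "!placeholder header line",
    "InterPro:IPR999991 Example family > GO:example function ; GO:9999991",
    "InterPro:IPR999991 Example family > GO:example process ; GO:9999992",
    "InterPro:IPR999992 Example domain > GO:example component ; GO:9999993",
]
prosite2interpro = {"PS99991": "IPR999991",
                    "PS99992": "IPR999992",
                    "PS99993": "IPR999993"}  # entry with no GO mapping

interpro2go = parse_interpro2go(sample_lines)
prosite2go = prosite_to_go(prosite2interpro, interpro2go)
```

With a real file you would pass the open file object to parse_interpro2go
instead of sample_lines. Patterns whose InterPro entry has no mapping simply
end up with an empty set.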


Cheers

Rolf


--
This message is from the GOFriends moderated mailing list.  A list of public
announcements and discussion of the Gene Ontology (GO) project.
Problems with the list?           E-mail: owner-gofriends at geneontology.org
Subscribing   send   "subscribe"   to   gofriends-request at geneontology.org
Unsubscribing send   "unsubscribe"  to  gofriends-request at geneontology.org
Web:          http://www.geneontology.org/


