[SO-devel] Re: annotating to pseudogenes

chris mungall cjm at fruitfly.org
Mon Mar 20 15:37:18 PST 2006


Here's how I would represent this, if we decided to go this way:

is-a hierarchy:

pseudogene: <current def>
  non-expressed-pseudogene: a pseudogene that lacks expression
  expressed-pseudogene: a pseudogene that expresses ncRNA that lacks function

the relation between ncRNA and expressed-pseudogene is trickier. We  
should not use is-a. Currently there is a member_of relation between a  
transcript and a gene in SO, and there should be a parallel structure  
for pseudos.

We can't say ncRNA member_of expressed-pseudogene, since this implies  
that all ncRNAs are members of some expressed-pseudogene using the  
standard ALL-SOME relation definitions.

We could introduce a subtype "nonfunctional ncRNA" and say  
nonfunctional-ncRNA member_of expressed-pseudogene.

This is also problematic, as there are presumably functional genes that  
have some nonfunctional transcripts.
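
To make the ALL-SOME reading concrete, here is a minimal Python sketch. It assumes the usual class-level reading, "A member_of B" means every instance of A is member_of some instance of B. The Instance class, the all_some helper and the instance names (gene_1, pseudo_1, rna_a, rna_b, rna_c) are all hypothetical and exist only for illustration; none of them come from SO.

```python
from dataclasses import dataclass, field


@dataclass
class Instance:
    name: str
    types: set                                 # classes this instance belongs to
    links: dict = field(default_factory=dict)  # relation name -> list of Instance


def all_some(instances, subject_class, relation, object_class):
    """True if every instance of subject_class has `relation` to at least
    one instance of object_class (vacuously True if there are no subjects)."""
    subjects = [i for i in instances if subject_class in i.types]
    return all(
        any(object_class in target.types for target in s.links.get(relation, []))
        for s in subjects
    )


# Hypothetical instances: one ordinary gene and one expressed pseudogene.
gene_1 = Instance("gene_1", {"gene"})
pseudo_1 = Instance("pseudo_1", {"pseudogene", "expressed-pseudogene"})

# A functional ncRNA from the gene, a nonfunctional one from the pseudogene.
rna_a = Instance("rna_a", {"transcript", "ncRNA"}, {"member_of": [gene_1]})
rna_b = Instance("rna_b", {"transcript", "ncRNA", "nonfunctional-ncRNA"},
                 {"member_of": [pseudo_1]})
world = [gene_1, pseudo_1, rna_a, rna_b]

# "ncRNA member_of expressed-pseudogene" fails: rna_a belongs to a gene.
ncrna_claim = all_some(world, "ncRNA", "member_of", "expressed-pseudogene")

# The subtype version holds only while every nonfunctional ncRNA comes
# from an expressed pseudogene.
subtype_claim = all_some(world, "nonfunctional-ncRNA", "member_of",
                         "expressed-pseudogene")

# A functional gene with a nonfunctional transcript breaks it again.
rna_c = Instance("rna_c", {"transcript", "ncRNA", "nonfunctional-ncRNA"},
                 {"member_of": [gene_1]})
subtype_claim_with_c = all_some(world + [rna_c], "nonfunctional-ncRNA",
                                "member_of", "expressed-pseudogene")

print(ncrna_claim, subtype_claim, subtype_claim_with_c)  # False True False
```

The single counterexample rna_c is enough to falsify the class-level statement, which is why the subtype approach does not settle the question by itself.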

In fact, defining an expressed-pseudogene as a pseudogene which
expresses ncRNA is problematic, as all ncRNAs are member_of some gene -
and a pseudogene is not a gene.

The only solution which guarantees consistency in the ontology is to
define transcript as having a function, and to have nonfunctional-ncRNA
as a sibling of ncRNA.

Of course it may be simpler to declare expressed-pseudogene oxymoronic
- anything that is expressed is a gene.

On Mar 20, 2006, at 9:33 AM, Karen Christie wrote:

> Hi
>
> I have a question along the lines of Val's comment. I was wondering what
> the scientific community would expect to see with respect to how a
> "pseudogene" that expresses a transcript would be annotated in SO. Suzi
> has said that if a pseudogene is discovered to express a transcript, then
> the 'pseudogene' annotation should be removed and replaced with something
> else, i.e. 'ncRNA'. However, is that consistent with what the people
> studying these do, i.e. if they discover that a 'pseudogene' is expressed,
> do they stop calling it a 'pseudogene'? I'm not sure that they do.
>
> In the exposure to this issue that I've had, it seems that people DO
> continue to call it a pseudogene, but add the adjective 'expressed' so
> that they now refer to it as an 'expressed pseudogene'. If this is common
> practice, to refer to 'pseudogenes', i.e. genes that don't express the
> product they might have been expected to based on sequence similarity to a
> functional gene, that actually do express a product, often an ncRNA, as
> 'expressed pseudogenes', then perhaps SO should reflect that usage, rather
> than impose a perhaps artificially strict definition that pseudogenes
> never express any product. Perhaps we could have a SO term for 'expressed
> pseudogene' to capture this particular class of features, perhaps with
> dual parentage both under 'pseudogene' and under 'ncRNA'.
>
> Like I said at the beginning, this is a question about how would
> researchers expect an 'expressed pseudogene' to be annotated, and I'm far
> from an expert on pseudogenes. I'm also more familiar with GO practice
> than SO, but it seems that if researchers still refer to a feature as a
> pseudogene, even after it has been discovered to produce a transcript that
> may function as an ncRNA, then SO should attempt to reflect the usage of
> the research community.
>
> -Karen
>
>
>
> On Mon, 20 Mar 2006 val at sanger.ac.uk wrote:
>
>>
>>
>> So, if a gene with a degraded protein coding CDS was found to have
>> functionality as an ncRNA, can you annotate:
>>
>> i) an ncRNA feature (with appropriate GO terms)
>> ii) the degraded CDS region as a pseudogene, or pseudogenic exon or whatever
>>
>> It sounds as if not, but for complete annotation of features you would
>> probably still want to capture the degraded protein coding region.
>> Certainly this would be useful information to anybody who didn't know the
>> full history of the feature. Is there any reason that both cannot be
>> captured as different partially overlapping feature types?
>>
>> Val
>>
>>
>>
>>
>> Quoting Suzanna Lewis <suzi at fruitfly.org>:
>>
>>>
>>> On Mar 19, 2006, at 9:52 AM, Karen Eilbeck wrote:
>>>
>>>> Before we all agree on the proposed definition for pseudogenes, we
>>>> need to address some issues.
>>>> Firstly, if we use this definition, then a region that is a pseudogene
>>>> that turns out to also be a functional non-coding RNA will also be an
>>>> ncRNA.
>>>> I'm not sure if this will affect supporting software.
>>>
>>> Non, non, mon cheri.
>>>
>>> If it turns out to be a functional non-coding RNA then it is *not* a
>>> pseudogene (even with this modified definition). The annotation calling
>>> it a "pseudogene" would necessarily have to be updated (removed) as
>>> soon as proof of functionality is found.
>>>
>>> -S
>>>
>>>>
>>>> Secondly, the definition is supposed to explicitly describe the
>>>> feature, and the phrase "that is thought to be" adds vagueness rather
>>>> than clarity.
>>>>
>>>> I'm sure there will be ample discussion of this term in St Croix.
>>>>
>>>> --Karen
>>>>
>>>>
>>>>
>>>> On Mar 19, 2006, at 5:27 AM, Richard Durbin wrote:
>>>>
>>>>> Thank you for this thoughtful analysis and proposal. I support the
>>>>> revised definition as well.
>>>>> Richard
>>>>>
>>>>> Michael Ashburner (Genetics) wrote:
>>>>>
>>>>>> I could live with that changed definition.
>>>>>> Michael
>>>>>>
>>>>>>
>>>>>>> Envelope-to: ma11 at gen.cam.ac.uk
>>>>>>> Delivery-date: Wed, 15 Mar 2006 18:44:22 +0000
>>>>>>> From: Karen Christie <kchris at genome.stanford.edu>
>>>>>>> To: Hubert Renauld <hjr at sanger.ac.uk>
>>>>>>> Cc: ruth at galton.ucl.ac.uk, Karen Eilbeck <eilbeck at fruitfly.org>,
>>>>>> ...
>>>>>>
>>>>>>> List-Archive:
>>>>>>> <http://sourceforge.net/mailarchive/forum.php?forum=song-devel>
>>>>>>> Date: Wed, 15 Mar 2006 09:46:15 -0800 (PST)
>>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> My comment here is more about the SO definition of pseudogene, than
>>>>>>> about whether or not to use GO terms to annotate features given that
>>>>>>> label.
>>>>>>>
>>>>>>> I was thinking about the SO definition of pseudogene this morning and was
>>>>>>> wondering if part of the problem is that the definition may have been
>>>>>>> written from a protein-centric view of genes and what the possible
>>>>>>> functions of genes are. If I understand correctly, and Rama and some of
>>>>>>> the other people who are more up on pseudogenes may correct me, most
>>>>>>> pseudogenes have been designated as such by virtue of being 1) similar to
>>>>>>> a known protein-coding gene and 2) being thought to NOT express that
>>>>>>> particular protein that might have been expected based on its similarity
>>>>>>> to a known protein coding gene. In other words, most people when
>>>>>>> designating something as a pseudogene were only thinking about its
>>>>>>> protein coding ability. It seems that all of the cases that people have
>>>>>>> mentioned in this thread where a pseudogene is expressed result in the
>>>>>>> production of an RNA from a pseudogene that resembles a protein, thus the
>>>>>>> pseudogene is not producing the gene product that it might have been
>>>>>>> expected to have based on its sequence similarity to a protein coding
>>>>>>> gene. It seems to me that even when it has been discovered that a
>>>>>>> "pseudogene" produces an RNA transcript that may have activity in
>>>>>>> regulating the gene it is related to, that the community still calls these
>>>>>>> "pseudogenes" because they do not produce the protein product expected
>>>>>>> based on sequence similarity to the known functional gene.
>>>>>>>
>>>>>>> It seems possible that there may be pseudogenes of ncRNA genes as well
>>>>>>> as of protein coding genes, but perhaps we can revise the definition of
>>>>>>> pseudogene to be a little more accurate. While GO and SO do need to be
>>>>>>> precise and rigorous, often more so than the literature, we also do need
>>>>>>> to reflect the community usage of terms.
>>>>>>>
>>>>>>> Here are my thoughts on a possible revision of the SO def of
>>>>>>> pseudogene; the current SO def is below.
>>>>>>>
>>>>>>> Possible revision:
>>>>>>>
>>>>>>> def: "A sequence that closely resembles a known functional gene, at
>>>>>>> another locus within a genome, that is thought to be non-functional, with
>>>>>>> respect to producing the expected gene product based on sequence
>>>>>>> similarity with the known functional gene, as a consequence of (usually
>>>>>>> several) mutations that prevent either its transcription or translation
>>>>>>> (or both). In general, pseudogenes result from either reverse
>>>>>>> transcription of a transcript of their \"normal\" paralog (SO:0000043) (in
>>>>>>> which case the pseudogene typically lacks introns and includes a poly(A)
>>>>>>> tail) or from recombination (SO:0000044) (in which case the pseudogene is
>>>>>>> typically a tandem duplication of its \"normal\" paralog)."
>>>>>>> [http://www.ucl.ac.uk/~ucbhjow/b241/glossary.html]
>>>>>>> subset: SOFA
>>>>>>>
>>>>>>>
>>>>>>> Current SO def:
>>>>>>>
>>>>>>> def: "A sequence that closely resembles a known functional gene, at
>>>>>>> another locus within a genome, that is non-functional as a consequence of
>>>>>>> (usually several) mutations that prevent either its transcription or
>>>>>>> translation (or both). In general, pseudogenes result from either reverse
>>>>>>> transcription of a transcript of their \"normal\" paralog (SO:0000043) (in
>>>>>>> which case the pseudogene typically lacks introns and includes a poly(A)
>>>>>>> tail) or from recombination (SO:0000044) (in which case the pseudogene is
>>>>>>> typically a tandem duplication of its \"normal\" paralog)."
>>>>>>> [http://www.ucl.ac.uk/~ucbhjow/b241/glossary.html]
>>>>>>> subset: SOFA
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> -------------------------------------------------------
>>>>>>> This SF.Net email is sponsored by xPML, a groundbreaking  
>>>>>>> scripting
>>>>>>> language
>>>>>>> that extends applications into web and mobile media. Attend the
>>>>>>> live webcast
>>>>>>> and join the prime developer group breaking into this new coding
>>>>>>> territory!
>>>>>>> http://sel.as-us.falkag.net/sel?
>>>>>>> cmd=lnk&kid=110944&bid=241720&dat=121642
>>>>>>> _______________________________________________
>>>>>>> SOng-devel mailing list
>>>>>>> SOng-devel at lists.sourceforge.net
>>>>>>> https://lists.sourceforge.net/lists/listinfo/song-devel
>>>>>>>


