[SO-devel] Re: annotating to pseudogenes
chris mungall
cjm at fruitfly.org
Mon Mar 20 15:37:18 PST 2006
Here's how I would represent this, if we decided to go this way:
is-a hierarchy:
  pseudogene: <current def>
    non-expressed-pseudogene: a pseudogene that lacks expression
    expressed-pseudogene: a pseudogene that expresses ncRNA that lacks
      function
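As a rough illustration of the hierarchy above, here is a minimal Python sketch. The term names are the proposed ones from this message, not existing SO identifiers, and the helper names (IS_A, ancestors, is_subtype) are made up for the example.

```python
# Proposed is-a links, child -> parent. These term names come from the
# proposal in this message and are not existing SO terms.
IS_A = {
    "non-expressed-pseudogene": "pseudogene",
    "expressed-pseudogene": "pseudogene",
}


def ancestors(term, is_a=IS_A):
    """Return the chain of is-a parents for a term, nearest first."""
    chain = []
    while term in is_a:
        term = is_a[term]
        chain.append(term)
    return chain


def is_subtype(child, parent, is_a=IS_A):
    """True if child is the same term as parent or descends from it."""
    return child == parent or parent in ancestors(child, is_a)
```

With these links, anything annotated as either subtype would still be found by a query for pseudogene, which is the point of using is-a here.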
The relation between ncRNA and expressed-pseudogene is trickier. We
should not use is-a. Currently there is a member_of relation between a
transcript and a gene in SO, and there should be a parallel structure
for pseudos.
We can't say ncRNA member_of expressed-pseudogene, since this implies
that all ncRNAs are members of some expressed-pseudogene using the
standard ALL-SOME relation definitions.
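To make the ALL-SOME reading concrete, here is a small Python sketch. It assumes the usual interpretation (a class-level "A rel B" holds only if every instance of A is related to at least one instance of B). The instance names and the all_some_holds helper are invented for illustration.

```python
def all_some_holds(subjects, relation, objects):
    """Return True if every subject is related to at least one object.

    subjects: instances of class A
    relation: set of (a_instance, b_instance) pairs
    objects:  instances of class B
    """
    return all(
        any((s, o) in relation for o in objects)
        for s in subjects
    )


# Toy data, invented for illustration only.
ncrnas = {"rna1", "rna2", "rna3"}
expressed_pseudogenes = {"pseudo1"}
# Instance-level member_of links: only rna1 comes from the pseudogene.
member_of = {("rna1", "pseudo1"), ("rna2", "geneA"), ("rna3", "geneB")}

# Asserting "ncRNA member_of expressed-pseudogene" at the class level
# would require this to be True; with the toy data it is False.
claim_holds = all_some_holds(ncrnas, member_of, expressed_pseudogenes)
```

A single ncRNA that belongs to an ordinary gene is enough to falsify the class-level statement, which is why the relation cannot be asserted on ncRNA as a whole.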
We could introduce a subtype "nonfunctional ncRNA" and say
nonfunctional-ncRNA member_of expressed-pseudogene.
This is also problematic, as there are presumably functional genes that
have some nonfunctional transcripts.
In fact, defining an expressed-pseudogene as a pseudogene which
expresses ncRNA is problematic, as all ncRNAs are member_of some gene -
and a pseudogene is not a gene.
The only solution which guarantees consistency in the ontology is to
define transcript as having a function, and to have nonfunctional-ncRNA
as a sibling of ncRNA.
Of course it may be simpler to declare expressed-pseudogene oxymoronic
- anything that is expressed is a gene.
On Mar 20, 2006, at 9:33 AM, Karen Christie wrote:
> Hi
>
> I have a question along the lines of Val's comment. I was wondering
> what the scientific community would expect to see with respect to how
> a "pseudogene" that expresses a transcript would be annotated in SO.
> Suzi has said that if a pseudogene is discovered to express a
> transcript, then the 'pseudogene' annotation should be removed and
> replaced with something else, i.e. 'ncRNA'. However, is that
> consistent with what the people studying these do, i.e. if they
> discover that a 'pseudogene' is expressed, do they stop calling it a
> 'pseudogene'? I'm not sure that they do.
>
> In the exposure to this issue that I've had, it seems that people DO
> continue to call it a pseudogene, but add the adjective 'expressed' so
> that they now refer to it as an 'expressed pseudogene'. If this is
> common practice, to refer to 'pseudogenes', i.e. genes that don't
> express the product they might have been expected to based on sequence
> similarity to a functional gene, that actually do express a product,
> often an ncRNA, as 'expressed pseudogenes', then perhaps SO should
> reflect that usage, rather than impose a perhaps artificially strict
> definition that pseudogenes never express any product. Perhaps we
> could have a SO term for 'expressed pseudogene' to capture this
> particular class of features, perhaps with dual parentage both under
> 'pseudogene' and under 'ncRNA'.
>
> Like I said at the beginning, this is a question about how
> researchers would expect an 'expressed pseudogene' to be annotated,
> and I'm far from an expert on pseudogenes. I'm also more familiar with
> GO practice than SO, but it seems that if researchers still refer to a
> feature as a pseudogene, even after it has been discovered to produce
> a transcript that may function as an ncRNA, then SO should attempt to
> reflect the usage of the research community.
>
> -Karen
>
>
>
> On Mon, 20 Mar 2006 val at sanger.ac.uk wrote:
>
>>
>>
>> So, if a gene with a degraded protein coding CDS was found to have
>> functionality as a ncRNA, can you annotate:
>>
>> i) a ncRNA feature (with appropriate GO terms)
>> ii) the degraded CDS region as a pseudogene, or pseudogenic exon or
>> whatever
>>
>> It sounds as if not, but for complete annotation of features you
>> would probably still want to capture the degraded protein coding
>> region. Certainly this would be useful information to anybody who
>> didn't know the full history of the feature. Is there any reason that
>> both cannot be captured as different partially overlapping feature
>> types?
>>
>> Val
>>
>>
>>
>>
>> Quoting Suzanna Lewis <suzi at fruitfly.org>:
>>
>>>
>>> On Mar 19, 2006, at 9:52 AM, Karen Eilbeck wrote:
>>>
>>>> Before we all agree on the proposed definition for pseudogenes, we
>>>> need to address some issues.
>>>> Firstly if we use this definition, then a region that is a
>>>> pseudogene that turns out to also be a functional non-coding RNA,
>>>> will also be a ncRNA.
>>>> I'm not sure if this will affect supporting software.
>>>
>>> Non, non, mon cheri.
>>>
>>> If it turns out to be a functional non-coding RNA then it is *not* a
>>> pseudogene (even with this modified definition). The annotation
>>> calling it a "pseudogene" would necessarily have to be updated
>>> (removed) as soon as proof of functionality is found.
>>>
>>> -S
>>>
>>>>
>>>> Secondly the definition is supposed to explicitly describe the
>>>> feature, and the phrase "that is thought to be" adds vagueness
>>>> rather than clarity.
>>>>
>>>> I'm sure there will be ample discussion of this term in St Croix.
>>>>
>>>> --Karen
>>>>
>>>>
>>>>
>>>> On Mar 19, 2006, at 5:27 AM, Richard Durbin wrote:
>>>>
>>>>> Thank you for this thoughtful analysis and proposal. I support the
>>>>> revised definition as well.
>>>>> Richard
>>>>>
>>>>> Michael Ashburner (Genetics) wrote:
>>>>>
>>>>>> I could live with that changed definition.
>>>>>> Michael
>>>>>>
>>>>>>
>>>>>>> Envelope-to: ma11 at gen.cam.ac.uk
>>>>>>> Delivery-date: Wed, 15 Mar 2006 18:44:22 +0000
>>>>>>> X-Cam-SpamDetails: scanned, SpamAssassin (score=0)
>>>>>>> X-Cam-AntiVirus: No virus found
>>>>>>> X-Cam-ScannerInfo: http://www.cam.ac.uk/cs/email/scanner/
>>>>>>> From: Karen Christie <kchris at genome.stanford.edu>
>>>>>>> To: Hubert Renauld <hjr at sanger.ac.uk>
>>>>>>> Cc: ruth at galton.ucl.ac.uk, Karen Eilbeck <eilbeck at fruitfly.org>,
>>>>>> ...
>>>>>>
>>>>>>> List-Archive:
>>>>>>> <http://sourceforge.net/mailarchive/forum.php?forum=song-devel>
>>>>>>> Date: Wed, 15 Mar 2006 09:46:15 -0800 (PST)
>>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> My comment here is more about the SO definition of pseudogene,
>>>>>>> than about whether or not to use GO terms to annotate features
>>>>>>> given that label.
>>>>>>>
>>>>>>> I was thinking about the SO definition of pseudogene this morning
>>>>>>> and was wondering if part of the problem is that the definition
>>>>>>> may have been written from a protein-centric view of genes and
>>>>>>> what the possible functions of genes are. If I understand
>>>>>>> correctly, and Rama and some of the other people who are more up
>>>>>>> on pseudogenes may correct me, most pseudogenes have been
>>>>>>> designated as such by virtue of being 1) similar to a known
>>>>>>> protein-coding gene and 2) being thought to NOT express that
>>>>>>> particular protein that might have been expected based on its
>>>>>>> similarity to a known protein coding gene. In other words, most
>>>>>>> people when designating something as a pseudogene were only
>>>>>>> thinking about its protein coding ability. It seems that all of
>>>>>>> the cases that people have mentioned in this thread where a
>>>>>>> pseudogene is expressed result in the production of an RNA from a
>>>>>>> pseudogene that resembles a protein, thus the pseudogene is not
>>>>>>> producing the gene product that it might have been expected to
>>>>>>> have based on its sequence similarity to a protein coding gene.
>>>>>>> It seems to me that even when it has been discovered that a
>>>>>>> "pseudogene" produces an RNA transcript that may have activity in
>>>>>>> regulating the gene it is related to, that the community still
>>>>>>> calls these "pseudogenes" because they do not produce the protein
>>>>>>> product expected based on sequence similarity to the known
>>>>>>> functional gene.
>>>>>>>
>>>>>>> It seems possible that there may be pseudogenes of ncRNA genes as
>>>>>>> well as of protein coding genes, but perhaps we can revise the
>>>>>>> definition of pseudogene to be a little more accurate. While GO
>>>>>>> and SO do need to be precise and rigorous, often more so than the
>>>>>>> literature, we also do need to reflect the community usage of
>>>>>>> terms.
>>>>>>>
>>>>>>> Here are my thoughts on a possible revision of the SO def of
>>>>>>> pseudogene; the current SO def is below.
>>>>>>>
>>>>>>> Possible revision:
>>>>>>>
>>>>>>> def: "A sequence that closely resembles a known functional gene,
>>>>>>> at another locus within a genome, that is thought to be
>>>>>>> non-functional, with respect to producing the expected gene
>>>>>>> product based on sequence similarity with the known functional
>>>>>>> gene, as a consequence of (usually several) mutations that
>>>>>>> prevent either its transcription or translation (or both). In
>>>>>>> general, pseudogenes result from either reverse transcription of
>>>>>>> a transcript of their \"normal\" paralog (SO:0000043) (in which
>>>>>>> case the pseudogene typically lacks introns and includes a
>>>>>>> poly(A) tail) or from recombination (SO:0000044) (in which case
>>>>>>> the pseudogene is typically a tandem duplication of its
>>>>>>> \"normal\" paralog)."
>>>>>>> [http://www.ucl.ac.uk/~ucbhjow/b241/glossary.html]
>>>>>>> subset: SOFA
>>>>>>>
>>>>>>>
>>>>>>> Current SO def:
>>>>>>>
>>>>>>> def: "A sequence that closely resembles a known functional gene,
>>>>>>> at another locus within a genome, that is non-functional as a
>>>>>>> consequence of (usually several) mutations that prevent either
>>>>>>> its transcription or translation (or both). In general,
>>>>>>> pseudogenes result from either reverse transcription of a
>>>>>>> transcript of their \"normal\" paralog (SO:0000043) (in which
>>>>>>> case the pseudogene typically lacks introns and includes a
>>>>>>> poly(A) tail) or from recombination (SO:0000044) (in which case
>>>>>>> the pseudogene is typically a tandem duplication of its
>>>>>>> \"normal\" paralog)."
>>>>>>> [http://www.ucl.ac.uk/~ucbhjow/b241/glossary.html]
>>>>>>> subset: SOFA
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> -------------------------------------------------------
>>>>>>> This SF.Net email is sponsored by xPML, a groundbreaking
>>>>>>> scripting
>>>>>>> language
>>>>>>> that extends applications into web and mobile media. Attend the
>>>>>>> live webcast
>>>>>>> and join the prime developer group breaking into this new coding
>>>>>>> territory!
>>>>>>> http://sel.as-us.falkag.net/sel?
>>>>>>> cmd=lnk&kid=110944&bid=241720&dat=121642
>>>>>>> _______________________________________________
>>>>>>> SOng-devel mailing list
>>>>>>> SOng-devel at lists.sourceforge.net
>>>>>>> https://lists.sourceforge.net/lists/listinfo/song-devel
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>>
>>>
>>>
>>
>>
>>
>>
>>
>>
>
>
>