[SO-devel] Re: annotating to pseudogenes

Karen Christie kchris at genome.Stanford.EDU
Tue Mar 21 09:42:47 PST 2006


Hi Chris,

I agree representing this would be tricky, but I think not in quite the
way you suggest below. The thing with the 'expressed pseudogenes' is that
the transcript they make is functional. It's not functional with respect
to encoding the protein product that it might have been predicted to
encode based on sequence similarity with a gene that produces that
protein, but the transcript is often characterized as having a function,
typically something like regulating expression of the gene that is
functional for producing the protein product.

The biology might be clearer if I make up an example.

Gene X1 produces a protein product called X1p.

Gene X2 has sequence similarity to gene X1, but contains mutations that
would prevent the production of a protein. Thus gene X2 is classified as a
pseudogene.

People later discover that gene X2 produces an RNA transcript, let's call
it X2-rna, and that X2-rna is involved in regulating the expression of the
transcript from gene X1. 


On the basis of that example, here are some comments about how we might
represent this, if we choose to do so.  Since the ncRNA called X2-rna in
this example has a regulatory function, we don't need to try to represent
"nonfunctional ncRNA". We would just need to be careful that our
definition of pseudogene is precise about the specific non-functionality
that was used as the basis of designating it a pseudogene, i.e. that it
does not encode the protein product that it would have been predicted to
on the basis of its sequence similarity to a known functional gene that
does produce a protein product, and that the def does not make statements
indicating that pseudogenes are absolutely non-functional. The
modification I suggested to the definition of the term 'pseudogene' tried
to address that, so that might be enough on that aspect.
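
To make the distinction concrete, here is a minimal Python sketch of the
made-up X1/X2 example. Everything in it is hypothetical: the Feature class,
its field names, and the two predicate functions are illustrative only and
are not SO terms or part of any SO tooling. It contrasts a strict reading
(non-functional in every sense) with the revised reading (non-functional
only with respect to the expected product).

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Feature:
    """A toy sequence feature; all field names are illustrative assumptions."""
    name: str
    resembles: Optional[str] = None      # name of a known functional gene, if any
    makes_expected_product: bool = True  # makes the product predicted from similarity?
    other_functions: List[str] = field(default_factory=list)  # e.g. regulatory roles


def is_pseudogene_strict(f: Feature) -> bool:
    """Strict reading: resembles a functional gene and has no function at all."""
    return (f.resembles is not None
            and not f.makes_expected_product
            and not f.other_functions)


def is_pseudogene_revised(f: Feature) -> bool:
    """Revised reading: non-functional only with respect to the expected product."""
    return f.resembles is not None and not f.makes_expected_product


# Gene X1 makes protein X1p; gene X2 resembles X1 but only makes X2-rna.
x1 = Feature("X1")
x2 = Feature("X2", resembles="X1", makes_expected_product=False,
             other_functions=["X2-rna regulates expression of X1"])

print(is_pseudogene_strict(x2))   # False
print(is_pseudogene_revised(x2))  # True
print(is_pseudogene_revised(x1))  # False
```

Under the strict reading X2 stops being a pseudogene once X2-rna is found
to have a role; under the revised reading it stays one, which is closer to
how 'expressed pseudogene' seems to be used.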

-Karen


On Mon, 20 Mar 2006, chris mungall wrote:

> Here's how I would represent this, if we decided to go this way:
> 
> is-a hierarchy:
> 
> pseudogene: <current def>
>   non-expressed-pseudogene: a pseudogene that lacks expression
>   expressed-pseudogene: a pseudogene that expresses ncRNA that lacks function
> 
> the relation between ncRNA and expressed-pseudogene is trickier. We  
> should not use is-a. Currently there is a member_of relation between a  
> transcript and a gene in SO, and there should be a parallel structure  
> for pseudos.
> 
> We can't say ncRNA member_of expressed-pseudogene, since this implies  
> that all ncRNAs are members of some expressed-pseudogene using the  
> standard ALL-SOME relation definitions.
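
The ALL-SOME reading can be sketched in a few lines of Python. This is only
an illustration under stated assumptions: the instance names, the member_of
pairs, and the helper all_some are made up, and this is not how SO or any
ontology reasoner is implemented. A class-level statement "A member_of B"
holds only if every instance of A is member_of at least one instance of B.

```python
from typing import Dict, Set, Tuple

# Hypothetical instances, grouped by class (names are made up for illustration).
instances: Dict[str, Set[str]] = {
    "ncRNA": {"X2-rna", "some-tRNA"},
    "expressed-pseudogene": {"X2"},
    "gene": {"tRNA-gene"},
}

# Instance-level member_of links: (transcript, parent feature).
member_of: Set[Tuple[str, str]] = {
    ("X2-rna", "X2"),
    ("some-tRNA", "tRNA-gene"),
}


def all_some(sub: str, obj: str) -> bool:
    """True if every instance of `sub` is member_of some instance of `obj`."""
    return all(
        any((s, o) in member_of for o in instances[obj])
        for s in instances[sub]
    )


# The class-level statement fails: some-tRNA belongs to an ordinary gene.
print(all_some("ncRNA", "expressed-pseudogene"))  # False
```

A single ncRNA that belongs to an ordinary gene is enough to make the
class-level statement false.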
> 
> We could introduce a subtype "nonfunctional ncRNA" and say  
> nonfunctional-ncRNA member_of expressed-pseudogene.
> 
> This is also problematic, as there are presumably functional genes that  
> have some nonfunctional transcripts.
> 
> In fact defining an expressed-pseudogene as being a pseudogene which
> expresses ncRNA is problematic, as all ncRNA are member_of some gene -
> and a pseudogene is not a gene.
> 
> The only solution which guarantees consistency in the ontology is to  
> define transcript as having a function, and having nonfunctional-ncRNA  
> as a sibling of ncRNA
> 
> Of course it may be simpler to declare expressed-pseudogene oxymoronic  
> - anything that is expressed is a gene
> 
> On Mar 20, 2006, at 9:33 AM, Karen Christie wrote:
> 
> > Hi
> >
> > I have a question along the lines of Val's comment. I was wondering what
> > the scientific community would expect to see with respect to how a
> > "pseudogene" that expresses a transcript would be annotated in SO. Suzi
> > has said that if a pseudogene is discovered to express a transcript, then
> > the 'pseudogene' annotation should be removed and replaced with something
> > else, i.e. 'ncRNA'. However, is that consistent with what the people
> > studying these do, i.e. if they discover that a 'pseudogene' is expressed,
> > do they stop calling it a 'pseudogene'? I'm not sure that they do.
> >
> > In the exposure to this issue that I've had, it seems that people DO
> > continue to call it a pseudogene, but add the adjective 'expressed' so
> > that they now refer to it as an 'expressed pseudogene'. If this is common
> > practice, to refer to 'pseudogenes', i.e. genes that don't express the
> > product they might have been expected to based on sequence similarity to a
> > functional gene, that actually do express a product, often an ncRNA, as
> > 'expressed pseudogenes', then perhaps SO should reflect that usage, rather
> > than impose a perhaps artificially strict definition that pseudogenes
> > never express any product. Perhaps we could have an SO term for 'expressed
> > pseudogene' to capture this particular class of features, perhaps with
> > dual parentage both under 'pseudogene' and under 'ncRNA'.
> >
> > Like I said at the beginning, this is a question about how researchers
> > would expect an 'expressed pseudogene' to be annotated, and I'm far
> > from an expert on pseudogenes. I'm also more familiar with GO practice
> > than SO, but it seems that if researchers still refer to a feature as a
> > pseudogene, even after it has been discovered to produce a transcript that
> > may function as an ncRNA, then SO should attempt to reflect the usage of
> > the research community.
> >
> > -Karen
> >
> >
> >
> > On Mon, 20 Mar 2006 val at sanger.ac.uk wrote:
> >
> >>
> >>
> >> So, if a gene with a degraded protein coding CDS was found to have
> >> functionality as an ncRNA, can you annotate:
> >>
> >> i) a ncRNA feature (with appropriate GO terms)
> >> ii) the degraded CDS region as a pseudogene, or pseudogenic exon or whatever
> >>
> >> It sounds as if not, but for complete annotation of features you would
> >> probably still want to capture the degraded protein coding region.
> >> Certainly this would be useful information to anybody who didn't know the
> >> full history of the feature. Is there any reason that both cannot be
> >> captured as different partially overlapping feature types?
> >>
> >> Val
> >>
> >>
> >>
> >>
> >> Quoting Suzanna Lewis <suzi at fruitfly.org>:
> >>
> >>>
> >>> On Mar 19, 2006, at 9:52 AM, Karen Eilbeck wrote:
> >>>
> >>>> Before we all agree on the proposed definition for pseudogenes, we
> >>>> need to address some issues.
> >>>> Firstly, if we use this definition, then a region that is a pseudogene
> >>>> that turns out to also be a functional non-coding RNA will also be an
> >>>> ncRNA.
> >>>> I'm not sure if this will affect supporting software.
> >>>
> >>> No, no, my dear.
> >>>
> >>> If it turns out to be a functional non-coding RNA then it is *not* a
> >>> pseudogene (even with this modified definition). The annotation calling
> >>> it a "pseudogene" would necessarily have to be updated (removed) as
> >>> soon as proof of functionality is found.
> >>>
> >>> -S
> >>>
> >>>>
> >>>> Secondly, the definition is supposed to explicitly describe the
> >>>> feature, and the phrase "that is thought to be" adds vagueness rather
> >>>> than clarity.
> >>>>
> >>>> I'm sure there will be ample discussion of this term in St Croix.
> >>>>
> >>>> --Karen
> >>>>
> >>>>
> >>>>
> >>>> On Mar 19, 2006, at 5:27 AM, Richard Durbin wrote:
> >>>>
> >>>>> Thank you for this thoughtful analysis and proposal. I support the
> >>>>> revised definition as well.
> >>>>> Richard
> >>>>>
> >>>>> Michael Ashburner (Genetics) wrote:
> >>>>>
> >>>>>> I could live with that changed definition.
> >>>>>> Michael
> >>>>>>
> >>>>>>
> >>>>>>> Envelope-to: ma11 at gen.cam.ac.uk
> >>>>>>> Delivery-date: Wed, 15 Mar 2006 18:44:22 +0000
> >>>>>>> From: Karen Christie <kchris at genome.stanford.edu>
> >>>>>>> To: Hubert Renauld <hjr at sanger.ac.uk>
> >>>>>>> Cc: ruth at galton.ucl.ac.uk, Karen Eilbeck <eilbeck at fruitfly.org>,
> >>>>>> ...
> >>>>>>
> >>>>>>> List-Archive:
> >>>>>>> <http://sourceforge.net/mailarchive/forum.php?forum=song-devel>
> >>>>>>> Date: Wed, 15 Mar 2006 09:46:15 -0800 (PST)
> >>>>>>>
> >>>>>>> Hi,
> >>>>>>>
> >>>>>>> My comment here is more about the SO definition of pseudogene than
> >>>>>>> about whether or not to use GO terms to annotate features given that
> >>>>>>> label.
> >>>>>>>
> >>>>>>> I was thinking about the SO definition of pseudogene this morning and
> >>>>>>> was wondering if part of the problem is that the definition may have
> >>>>>>> been written from a protein-centric view of genes and what the possible
> >>>>>>> functions of genes are. If I understand correctly, and Rama and some of
> >>>>>>> the other people who are more up on pseudogenes may correct me, most
> >>>>>>> pseudogenes have been designated as such by virtue of being 1) similar
> >>>>>>> to a known protein-coding gene and 2) being thought to NOT express that
> >>>>>>> particular protein that might have been expected based on its
> >>>>>>> similarity to a known protein coding gene. In other words, most people
> >>>>>>> when designating something as a pseudogene were only thinking about its
> >>>>>>> protein coding ability. It seems that all of the cases that people have
> >>>>>>> mentioned in this thread where a pseudogene is expressed result in the
> >>>>>>> production of an RNA from a pseudogene that resembles a protein-coding
> >>>>>>> gene, thus the pseudogene is not producing the gene product that it
> >>>>>>> might have been expected to have based on its sequence similarity to a
> >>>>>>> protein coding gene. It seems to me that even when it has been
> >>>>>>> discovered that a "pseudogene" produces an RNA transcript that may have
> >>>>>>> activity in regulating the gene it is related to, the community still
> >>>>>>> calls these "pseudogenes" because they do not produce the protein
> >>>>>>> product expected based on sequence similarity to the known functional
> >>>>>>> gene.
> >>>>>>>
> >>>>>>> It seems possible that there may be pseudogenes of ncRNA genes as well
> >>>>>>> as of protein coding genes, but perhaps we can revise the definition of
> >>>>>>> pseudogene to be a little more accurate. While GO and SO do need to be
> >>>>>>> precise and rigorous, often more so than the literature, we also do
> >>>>>>> need to reflect the community usage of terms.
> >>>>>>>
> >>>>>>> Here are my thoughts on a possible revision of the SO def of
> >>>>>>> pseudogene; the current SO def is below.
> >>>>>>>
> >>>>>>> Possible revision:
> >>>>>>>
> >>>>>>> def: "A sequence that closely resembles a known functional gene, at
> >>>>>>> another locus within a genome, that is thought to be non-functional,
> >>>>>>> with respect to producing the expected gene product based on sequence
> >>>>>>> similarity with the known functional gene, as a consequence of (usually
> >>>>>>> several) mutations that prevent either its transcription or translation
> >>>>>>> (or both). In general, pseudogenes result from either reverse
> >>>>>>> transcription of a transcript of their \"normal\" paralog (SO:0000043)
> >>>>>>> (in which case the pseudogene typically lacks introns and includes a
> >>>>>>> poly(A) tail) or from recombination (SO:0000044) (in which case the
> >>>>>>> pseudogene is typically a tandem duplication of its \"normal\"
> >>>>>>> paralog)." [http://www.ucl.ac.uk/~ucbhjow/b241/glossary.html]
> >>>>>>> subset: SOFA
> >>>>>>>
> >>>>>>>
> >>>>>>> Current SO def:
> >>>>>>>
> >>>>>>> def: "A sequence that closely resembles a known functional gene, at
> >>>>>>> another locus within a genome, that is non-functional as a consequence
> >>>>>>> of (usually several) mutations that prevent either its transcription or
> >>>>>>> translation (or both). In general, pseudogenes result from either
> >>>>>>> reverse transcription of a transcript of their \"normal\" paralog
> >>>>>>> (SO:0000043) (in which case the pseudogene typically lacks introns and
> >>>>>>> includes a poly(A) tail) or from recombination (SO:0000044) (in which
> >>>>>>> case the pseudogene is typically a tandem duplication of its \"normal\"
> >>>>>>> paralog)." [http://www.ucl.ac.uk/~ucbhjow/b241/glossary.html]
> >>>>>>> subset: SOFA
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> -------------------------------------------------------
> >>>>>>> This SF.Net email is sponsored by xPML, a groundbreaking  
> >>>>>>> scripting
> >>>>>>> language
> >>>>>>> that extends applications into web and mobile media. Attend the
> >>>>>>> live webcast
> >>>>>>> and join the prime developer group breaking into this new coding
> >>>>>>> territory!
> >>>>>>> http://sel.as-us.falkag.net/sel?
> >>>>>>> cmd=lnk&kid=110944&bid=241720&dat=121642
> >>>>>>> _______________________________________________
> >>>>>>> SOng-devel mailing list
> >>>>>>> SOng-devel at lists.sourceforge.net
> >>>>>>> https://lists.sourceforge.net/lists/listinfo/song-devel
> >>>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>
> >>>>>
> >>>>
> >>>
> >>>
> >>>
> >>>
> >>
> >>
> >>
> >>
> >>
> >>
> >
> >
> >
> 



