Curator prediction of NOT kinase activity

Alexander D. Diehl adiehl at informatics.jax.org
Wed Mar 8 06:08:56 PST 2006


Yes, that's true, which is why Val wrote earlier of improving Interpro 
mappings.

I would rather see a manual deletion or suppression of the IEA statement 
than put in a contradictory statement which can only serve to confuse 
users, particularly if it is not tied to a definitive reference which 
users can use to judge the validity of the annotation.

-- Alex


Sandra Orchard wrote:
> The main problem with that from a GO point of view is that you already 
> have the kinase annotation by IEA. Without the NOT annotation, that 
> will remain and be incorrect.
>
> Sandra
>
> Alexander D. Diehl wrote:
>
>> Hi,
>>
>> Sorry to be so late on this discussion, but I would like to suggest that 
>> perhaps the annotation is inappropriate on the grounds that NOT 
>> annotations in general are very tricky assertions, and are best based 
>> on actual experimental evidence, which itself is very dependent on 
>> the experimental conditions used.  There's a universe of proteins 
>> that I could provide NOT kinase activity for, but of course most 
>> people would not confuse them with kinases.  The question here is 
>> more whether the original algorithms for predicting kinase activity 
>> need to be refined so that proteins lacking the correct residues in 
>> the active site are not flagged as kinases in the first place.  Such 
>> refinement, of course, ought best to be based on experimental 
>> evidence (wet science) that feeds the computational algorithm 
>> development, and that's where expert input is needed.
>>
>> I thus vote against any annotation at all in this case.
>>
>> -- Alex
>>
>>
>> Evelyn Camon wrote:
>>
>>>> what were the objections to a GO_REF, this would be unambiguous 
>>>
>>>
>>> Hi,
>>>
>>> The issue of using the GO_REF vs extension of the evidence codes is on
>>> the GO Consortium meeting agenda.
>>>
>>> Arguments against GO_REF include:
>>>
>>> - I don't think the users will read about GO_REF (is that our problem?)
>>> - I don't think the users will even see GO_REF in most tools, e.g.
>>>   microarray tools; we are also limited in UniProt ffl to 4 fields of
>>>   GO information.
>>> - Users can filter on GO_REF plus evidence code, but are we simply
>>>   avoiding extending the number of useful GO evidence codes? For
>>>   manual codes I can see that extending the codes might slow down
>>>   curation, but more granular IEA codes are created electronically,
>>>   so no extra effort from the curator is required.
>>> - Biology is complex. Although I don't like to see information lost, I
>>>   think we further complicate what was a simple annotation process and
>>>   output.
>>>
>>> Arguments in favour of GO_REF include:
>>>
>>> - With many different techniques, it is not practical (or is it?) to
>>>   create new evidence codes.
>>> - It helps to disambiguate annotations.
>>>
>>> I am really not wishing to start a new thread here...can we leave 
>>> GO_REF and codes to Consortium meeting, I am still collecting 
>>> ideas..you could reply to me directly if you wish me to collect 
>>> further 'for' and 'against' examples for discussion.
>>>
>>> cheers
>>> Evelyn
>>>
>>>
>>>>
>>>>
>>>> So all annotations which use
>>>>
>>>> NOT with the ISS evidence code and GO_REF:xxx and a dbxref to an
>>>> alignment
>>>> mean that:
>>>>
>>>> The curator or an expert has looked at the alignment of this sequence
>>>> to the associated database entry, protein family or HMM, and on the
>>>> basis of the absence of critical residues has inferred that this family
>>>> member is unlikely to possess the associated activity.
>>>>
>>>> Or words to that effect.
>>>>
>>>>
>>>>
>>>> David Hill wrote:
>>>>
>>>>> I can see your point, but that would mean that every database should
>>>>> have an internal ISS reference. We have one, but it is only for
>>>>> orthology and subsequent inheritance of GO terms. Ours doesn't address
>>>>> this issue of a NOT. I think in either case, a user might be confused.
>>>>> If I were a naive user and I saw the annotation as you describe, I might
>>>>> think the NOT was a mistake because the proteins were so similar. If the
>>>>> protein in the with field is taken from the paper that discusses the
>>>>> critical residues, then maybe it would be less confusing. I'm not sure.
>>>>> I think we could generate confusion both ways.
>>>>>
>>>>> David
>>>>>
>>>>> Pascale Gaudet wrote:
>>>>>
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> Is this allowed? I thought that the reference had to directly relate
>>>>>> to the annotation. In this case I would have used our standard
>>>>>> 'dictybase curators ISS' reference, because that's how the annotation
>>>>>> was made.
>>>>>>
>>>>>> If I was to see the annotation as you describe it, I might be tempted
>>>>>> to go and look at the reference, and I would be very confused because
>>>>>> it doesn't talk about the protein at all.
>>>>>>
>>>>>> Pascale
>>>>>>
>>>>>>
>>>>>> At 07:49 AM 3/8/2006 -0500, David Hill wrote:
>>>>>>
>>>>>>
>>>>>>> This is a bit out of the ordinary, but what about an ISS evidence
>>>>>>> code with an active kinase and then a reference to a paper that
>>>>>>> identifies the critical residues for kinase activity?
>>>>>>>
>>>>>>> David
>>>>>>>
>>>>>>> Midori Harris wrote:
>>>>>>>
>>>>>>>
>>>>>>>> Seems to me it would be a valuable part of the story, but not
>>>>>>>> necessarily the whole thing. It would tell you what the important
>>>>>>>> residues are, but would miss out the part about observing that those
>>>>>>>> residues are altered/absent in this particular protein. Also, citing
>>>>>>>> only the important-residue reference could give the impression that
>>>>>>>> that paper (or whatever it is) actually states that protein XYZ
>>>>>>>> doesn't have the activity -- which I assume is not the case.
>>>>>>>>
>>>>>>>> m
>>>>>>>>
>>>>>>>> On Wed, 8 Mar 2006, jyoti khadake wrote:
>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> In this particular instance would the reference which identifies
>>>>>>>>> residues important for the kinase activity in members of the family
>>>>>>>>> be the appropriate reference?
>>>>>>>>>
>>>>>>>>> JK
>>>>>>>>>
>>>>>>>>> Midori Harris wrote:
>>>>>>>>>
>>>>>>>>>> The reference has to identify the source of the information. In this
>>>>>>>>>> case, it comes from what the curator knows, and from the work she did
>>>>>>>>>> examining the protein sequence. So I don't think the protein ID would
>>>>>>>>>> suffice, because it would capture nothing of the curator's involvement.
>>>>>>>>>> The advantage of a GO_REF is that we could include everything the
>>>>>>>>>> curator did, and make it unambiguous ... but it's not for me to decide
>>>>>>>>>> whether that advantage outweighs the problems (btw, what are the
>>>>>>>>>> arguments against a GO_REF?)
>>>>>>>>>>
>>>>>>>>>> m
>>>>>>>>>>
>>>>>>>>>> On Wed, 8 Mar 2006, Emily Dimmer wrote:
>>>>>>>>>>
>>>>>>>>>>> So if using the ISS code with these kinds of annotations, what
>>>>>>>>>>> reference information should be provided? Should the reference
>>>>>>>>>>> field refer back to the protein's identifier? Or to a specific
>>>>>>>>>>> GO_REF (which isn't ideal)
>>>>>>>>>>> e.g.
>>>>>>>>>>> UniProt  P12345  GO:0004672  UniProt:P12345  ISS  F  protein  taxon:9606  20060308  UniProt
>>>>>>>>>>>
>>>>>>>>>>> Midori Harris wrote:
>>>>>>>>>>>
>>>>>>>>>>>> The documentation for ISS says that it can be used for predicted
>>>>>>>>>>>> or observed sequence features, and that in such cases the 'with'
>>>>>>>>>>>> field can be left blank. If we choose to regard altered 'active'
>>>>>>>>>>>> site residues as features -- which seems reasonable -- ISS will
>>>>>>>>>>>> work.
>>>>>>>>>>>>
>>>>>>>>>>>> Also, using IC would not solve the reference problem, so you
>>>>>>>>>>>> would still have to either (a) make a GO_REF entry or (b) think
>>>>>>>>>>>> of something else to use as the reference.
>>>>>>>>>>>>
>>>>>>>>>>>> m
>>>>>>>>>>>>
>>>>>>>>>>>> On Wed, 8 Mar 2006, Evelyn Camon wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> ok..so sequence similar to what?? the sequence/domain for the
>>>>>>>>>>>>> active kinase??? or could we have Inferred by Curator from
>>>>>>>>>>>>> Sequence (ICS??)..hmmm
>>>>>>>>>>>>>
>>>>>>>>>>>>> Ev
>>>>>>>>>>>>>
>>>>>>>>>>>>> Valerie Wood wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> I think I prefer ISS, because this is essentially a judgement
>>>>>>>>>>>>>> which has been made by assessing the sequence.....
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Evelyn Camon wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I'm not keen on the GO_REF idea I'm afraid...could we propose
>>>>>>>>>>>>>>> that IC could be used without GO ID on these odd occasions...not
>>>>>>>>>>>>>>> sure what publication you would use though...
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Ev
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Sandra Orchard wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Most kinase recognition patterns are HMMs which can only predict a
>>>>>>>>>>>>>>>> domain but will not tell you if it is active or not. The kinases in
>>>>>>>>>>>>>>>> these examples were hit by the HMMs. The only methods which will give
>>>>>>>>>>>>>>>> any indication of activity are ProSite patterns which specifically say
>>>>>>>>>>>>>>>> a particular residue needs to be in a particular position. The HMMs
>>>>>>>>>>>>>>>> are correct in that these are part of the kinase family, but are
>>>>>>>>>>>>>>>> inactive members of it; they are not false positives in that sense.
>>>>>>>>>>>>>>>> This is true for many different classes of enzyme.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> And I do not remove enzyme InterPro2GO annotation just because a
>>>>>>>>>>>>>>>> family contains a few inactive members - all the big enzyme families
>>>>>>>>>>>>>>>> do and they can only really be recognised by manual annotation.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Sandra
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Valerie Wood wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Hi Emily,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> A few comments which may be relevant:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Out of interest, which protein kinase family is this (i.e. which
>>>>>>>>>>>>>>>>> Interpro domain)? Is it a family where some (but not all) members
>>>>>>>>>>>>>>>>> are protein kinases, in which case the mapping should be removed?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Alternatively, if this appears to be a spurious hit, instead of
>>>>>>>>>>>>>>>>> adding a NOT annotation, you can get spurious matches suppressed by
>>>>>>>>>>>>>>>>> Interpro as false positives (I often do this for S. pombe).
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Or, could it be a sequencing or gene prediction error?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Val
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Midori Harris wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I think there's no doubt whatsoever that this information should be
>>>>>>>>>>>>>>>>>> captured. The question is what to put for reference and evidence.
>>>>>>>>>>>>>>>>>> The best evidence code is probably TAS, although one could possibly
>>>>>>>>>>>>>>>>>> also make a case for ISS (note that IC is restricted to inferences
>>>>>>>>>>>>>>>>>> from other GO annotations, so isn't suitable).
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> For a reference, one possibility is to add an item to the GO_REF
>>>>>>>>>>>>>>>>>> collection; then there would be an ID to plug into the file.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> m
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Wed, 8 Mar 2006, Emily Dimmer wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> One of our annotators, who is an expert on protein kinases, has
>>>>>>>>>>>>>>>>>>> looked at the sequence of a putative protein kinase and from
>>>>>>>>>>>>>>>>>>> noticing a couple of amino acid changes at its active site, has
>>>>>>>>>>>>>>>>>>> predicted that it does not possess any kinase activity - she did
>>>>>>>>>>>>>>>>>>> not use any software and there is no published work on this
>>>>>>>>>>>>>>>>>>> protein.
>>>>>>>>>>>>>>>>>>> Do you think this type of annotation should be represented in GO
>>>>>>>>>>>>>>>>>>> (we feel this annotation is of high quality and adds valuable
>>>>>>>>>>>>>>>>>>> information to a protein which has not yet been characterized),
>>>>>>>>>>>>>>>>>>> and if so how should this annotation be shown?
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>> Emily
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> -- 
>>>>>>>>>>>>>>> Evelyn Camon
>>>>>>>>>>>>>>> GOA Coordinator
>>>>>>>>>>>>>>> Senior Scientific Curator
>>>>>>>>>>>>>>> European Bioinformatics Institute
>>>>>>>>>>>>>>> Tel:01223-494465
>>>>>>>>>>>>>>> Fax:01223-494468
>>>>>>>>>>>>>>> E-mail: camon at ebi.ac.uk
>>>>>>>>>>>>>>> URL: http://www.ebi.ac.uk/goa
>>>>>>>>>>>>>>>
>>>>>>> -- 
>>>>>>> David P. Hill, Ph.D.
>>>>>>> Senior Scientific Curator
>>>>>>> Mouse Genome Informatics
>>>>>>> Gene Ontology Consortium
>>>>>>> The Jackson Laboratory
>>>>>>> 600 Main Street
>>>>>>> Bar Harbor, ME 04609-1500
>>>>>>> tel:207-288-6430
>>>>>>> http://www.informatics.jax.org
>>>>>>> http://www.geneontology.org


-- 
Alexander Diehl, Ph.D.
Scientific Curator
Mouse Genome Informatics
The Jackson Laboratory
600 Main Street
Bar Harbor, ME  04609

email:  adiehl at informatics.jax.org
work:  +1 (207) 288-6427
fax:  +1 (207) 288-6131
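
The suppression approach discussed in this thread (dropping an IEA statement when a manual NOT annotation exists for the same protein and term, rather than leaving the two to contradict each other) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the Annotation record and the suppress_contradicted_iea function are hypothetical names and not part of any GO tool, P12345 and GO_REF:xxx are the placeholder values used in the thread, and Q99999 is a made-up accession.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """Hypothetical, simplified annotation record (not a full annotation file row)."""
    object_id: str       # protein accession, e.g. the thread's placeholder P12345
    go_id: str           # GO term, e.g. GO:0004672 as in the thread's example
    evidence: str        # evidence code, e.g. IEA or ISS
    reference: str       # reference, e.g. the thread's placeholder GO_REF:xxx
    qualifier: str = ""  # "NOT" marks a negative annotation


def suppress_contradicted_iea(annotations):
    """Drop positive IEA annotations that a manual NOT annotation contradicts."""
    # Collect (protein, term) pairs that carry a non-IEA NOT annotation.
    negated = {
        (a.object_id, a.go_id)
        for a in annotations
        if a.qualifier == "NOT" and a.evidence != "IEA"
    }
    # Keep everything except positive IEA statements for those pairs.
    return [
        a
        for a in annotations
        if not (
            a.evidence == "IEA"
            and a.qualifier != "NOT"
            and (a.object_id, a.go_id) in negated
        )
    ]


annotations = [
    Annotation("P12345", "GO:0004672", "IEA", "GO_REF:xxx"),
    Annotation("P12345", "GO:0004672", "ISS", "GO_REF:xxx", qualifier="NOT"),
    Annotation("Q99999", "GO:0004672", "IEA", "GO_REF:xxx"),  # made-up accession
]
kept = suppress_contradicted_iea(annotations)
for a in kept:
    print(a.object_id, a.qualifier or "-", a.go_id, a.evidence)
```

In this sketch the manual NOT annotation is kept and only the electronic statement for the same protein is dropped. Whether the NOT itself should be recorded at all is the open question of the thread, and the filter takes no position on it.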




More information about the go-discuss mailing list