Curator prediction of NOT kinase activity
Evelyn Camon
camon at ebi.ac.uk
Wed Mar 8 06:18:41 PST 2006
There are plans to rank InterPro2GO mappings according to whether a match is
marginal or true. Whether yet another GO_REF gets created so that you can
distinguish these is yet to be discussed.
Evelyn
Alexander D. Diehl wrote:
> Yes, that's true, which is why Val wrote earlier of improving InterPro
> mappings.
>
> I would rather see a manual deletion or suppression of the IEA statement
> than put in a contradictory statement which can only serve to confuse
> users, particularly if it is not tied to a definitive reference which
> users can use to judge the validity of the annotation.
>
> -- Alex
>
>
> Sandra Orchard wrote:
>
>> The main problem with that from a GO point of view is that you already
>> have the kinase annotation by IEA. Without the NOT annotation, that
>> will remain and be incorrect.
>>
>> Sandra
>>
>> Alexander D. Diehl wrote:
>>
>>> Hi,
>>>
>>> Sorry to be so late on this discussion, but I would like to suggest that
>>> perhaps the annotation is inappropriate on the grounds that NOT
>>> annotations in general are very tricky assertions, and are best based
>>> on actual experimental evidence, which itself is very dependent on
>>> the experimental conditions used. There's a universe of proteins
>>> that I could provide NOT kinase activity for, but of course most
>>> people would not confuse them with kinases. The question here is
>>> more whether the original algorithms for predicting kinase activity
>>> need to be refined so that proteins lacking the correct residues in
>>> the active site are not flagged as kinases in the first place. Such
>>> refinement, of course, ought best to be based on experimental
>>> evidence (wet science) that feeds the computational algorithm
>>> development, and that's where expert input is needed.
>>>
>>> I thus vote against any annotation at all in this case.
>>>
>>> -- Alex
>>>
>>>
>>> Evelyn Camon wrote:
>>>
>>>>> what were the objections to a GO_REF, this would be unambiguous
>>>>
>>>> Hi,
>>>>
>>>> The issue of using the GO_REF vs extension of the evidence codes is on
>>>> the GO Consortium meeting agenda.
>>>>
>>>> Arguments against GO_REF include:
>>>>
>>>> I don't think the users will read about GO_REF (is that our problem?)
>>>> I don't think the users will even see GO_REF in most tools,
>>>> microarray, we are also limited in UniProt ffl to 4 fields of GO
>>>> information
>>>> Users can filter on GO_REF plus evidence code, but are we simply
>>>> avoiding extending the number of useful GO evidence codes? For
>>>> manual codes I can see that extending the codes might slow down
>>>> curation, but more granular IEA codes are created electronically,
>>>> so no extra effort from the curator is required.
>>>> Biology is complex. Although I don't like to see information lost, I
>>>> think we further complicate what was a simple annotation process and
>>>> output.
>>>>
>>>> Arguments in favour of GO_REF include:
>>>>
>>>> many different techniques, so it is not practical (or is it?) to create
>>>> new evidence codes; a GO_REF helps to disambiguate annotations
>>>>
>>>> I am really not wishing to start a new thread here...can we leave
>>>> GO_REF and codes to the Consortium meeting? I am still collecting
>>>> ideas..you could reply to me directly if you wish me to collect
>>>> further 'for' and 'against' examples for discussion.
>>>>
>>>> cheers
>>>> Evelyn
>>>>
>>>>
>>>>>
>>>>>
>>>>> So all annotations which use
>>>>>
>>>>> NOT with the ISS evidence code and GO_REF:xxx and a dbxref to an
>>>>> alignment
>>>>> mean that:
>>>>>
>>>>> The curator or an expert has looked at the alignment of this sequence
>>>>> to the associated database entry, protein family or HMM, and on the
>>>>> basis of the absence of critical residues has inferred that this family
>>>>> member is unlikely to possess the associated activity.
>>>>>
>>>>> Or words to that effect.
>>>>>
>>>>> David Hill wrote:
>>>>>
>>>>>> I can see your point, but that would mean that every database should
>>>>>> have an internal ISS reference. We have one, but it is only for
>>>>>> orthology and subsequent inheritance of GO terms. Ours doesn't address
>>>>>> this issue of a NOT. I think in either case, a user might be confused.
>>>>>> If I were a naive user and I saw the annotation as you describe, I might
>>>>>> think the NOT was a mistake because the proteins were so similar. If the
>>>>>> protein in the with field is taken from the paper that discusses the
>>>>>> critical residues, then maybe it would be less confusing. I'm not sure.
>>>>>> I think we could generate confusion both ways.
>>>>>>
>>>>>> David
>>>>>>
>>>>>> Pascale Gaudet wrote:
>>>>>>
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> Is this allowed? I thought that the reference had to directly relate
>>>>>>> to the annotation. In this case I would have used our standard
>>>>>>> 'dictybase curators ISS' reference, because that's how the annotation
>>>>>>> was made.
>>>>>>>
>>>>>>> If I was to see the annotation as you describe it, I might be tempted
>>>>>>> to go and look at the reference, and I would be very confused because
>>>>>>> it doesn't talk about the protein at all.
>>>>>>>
>>>>>>> Pascale
>>>>>>>
>>>>>>>
>>>>>>> At 07:49 AM 3/8/2006 -0500, David Hill wrote:
>>>>>>>
>>>>>>>
>>>>>>>> This is a bit out of the ordinary, but what about an ISS evidence
>>>>>>>> code with an active kinase and then a reference to a paper that
>>>>>>>> identifies the critical residues for kinase activity?
>>>>>>>>
>>>>>>>> David
>>>>>>>>
>>>>>>>> Midori Harris wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>>> Seems to me it would be a valuable part of the story, but not
>>>>>>>>> necessarily the whole thing. It would tell you what the important
>>>>>>>>> residues are, but would miss out the part about observing that those
>>>>>>>>> residues are altered/absent in this particular protein. Also, citing
>>>>>>>>> only the important-residue reference could give the impression that
>>>>>>>>> that paper (or whatever it is) actually states that protein XYZ
>>>>>>>>> doesn't have the activity -- which I assume is not the case.
>>>>>>>>>
>>>>>>>>> m
>>>>>>>>>
>>>>>>>>> On Wed, 8 Mar 2006, jyoti khadake wrote:
>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> In this particular instance would the reference which identifies
>>>>>>>>>> residues important for the kinase activity in members of the family
>>>>>>>>>> be the appropriate reference?
>>>>>>>>>>
>>>>>>>>>> JK
>>>>>>>>>>
>>>>>>>>>> Midori Harris wrote:
>>>>>>>>>>
>>>>>>>>>>> The reference has to identify the source of the information. In this
>>>>>>>>>>> case, it comes from what the curator knows, and from the work she did
>>>>>>>>>>> examining the protein sequence. So I don't think the protein ID would
>>>>>>>>>>> suffice, because it would capture nothing of the curator's
>>>>>>>>>>> involvement. The advantage of a GO_REF is that we could include
>>>>>>>>>>> everything the curator did, and make it unambiguous ... but it's not
>>>>>>>>>>> for me to decide whether that advantage outweighs the problems (btw,
>>>>>>>>>>> what are the arguments against a GO_REF?)
>>>>>>>>>>>
>>>>>>>>>>> m
>>>>>>>>>>>
>>>>>>>>>>> On Wed, 8 Mar 2006, Emily Dimmer wrote:
>>>>>>>>>>>
>>>>>>>>>>>> So if using the ISS code with these kinds of annotations, what
>>>>>>>>>>>> reference information should be provided? Should the reference
>>>>>>>>>>>> field refer back to the protein's identifier? Or to a specific
>>>>>>>>>>>> GO_REF (which isn't ideal)
>>>>>>>>>>>> e.g.
>>>>>>>>>>>> UniProt P12345 GO:0004672 UniProt:P12345 ISS F protein taxon:9606 20060308 UniProt
>>>>>>>>>>>>
>>>>>>>>>>>> Midori Harris wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> The documentation for ISS says that it can be used for predicted
>>>>>>>>>>>>> or observed sequence features, and that in such cases the 'with'
>>>>>>>>>>>>> field can be left blank. If we choose to regard altered 'active'
>>>>>>>>>>>>> site residues as features -- which seems reasonable -- ISS will
>>>>>>>>>>>>> work.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Also, using IC would not solve the reference problem, so you
>>>>>>>>>>>>> would still have to either (a) make a GO_REF entry or (b) think
>>>>>>>>>>>>> of something else to use as the reference.
>>>>>>>>>>>>>
>>>>>>>>>>>>> m
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Wed, 8 Mar 2006, Evelyn Camon wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> ok..so sequence similar to what?? the sequence/domain for the
>>>>>>>>>>>>>> active kinase??? or could we have Inferred by Curator from
>>>>>>>>>>>>>> Sequence (ICS??)..hmmm
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Ev
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Valerie Wood wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I think I prefer ISS, because this is essentially a judgement
>>>>>>>>>>>>>>> which has been made by assessing the sequence.....
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Evelyn Camon wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I'm not keen on the GO_REF idea I'm afraid...could we propose
>>>>>>>>>>>>>>>> that IC could be used without GO ID on these odd occasions...not
>>>>>>>>>>>>>>>> sure what publication you would use though...
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Ev
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Sandra Orchard wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Most kinase recognition patterns are HMMs which can only
>>>>>>>>>>>>>>>>> predict a domain but will not tell you if it is active or not.
>>>>>>>>>>>>>>>>> The kinases in these examples were hit by the HMMs. The only
>>>>>>>>>>>>>>>>> methods which will give any indication of activity are ProSite
>>>>>>>>>>>>>>>>> patterns which specifically say a particular residue needs to
>>>>>>>>>>>>>>>>> be in a particular position. The HMMs are correct in that these
>>>>>>>>>>>>>>>>> are part of the kinase family, but are inactive members of it;
>>>>>>>>>>>>>>>>> they are not false positives in that sense. This is true for
>>>>>>>>>>>>>>>>> many different classes of enzyme.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> And I do not remove enzyme InterPro2GO annotation just because
>>>>>>>>>>>>>>>>> a family contains a few inactive members - all the big enzyme
>>>>>>>>>>>>>>>>> families do and they can only really be recognised by manual
>>>>>>>>>>>>>>>>> annotation.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Sandra
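
The distinction drawn in the message above, between a family-level hit and a
check that specific residues sit at specific positions, can be sketched in a
few lines of Python. This is a toy illustration only: the sequences, the
alignment columns, and the names REQUIRED, ACTIVE, INACTIVE and
missing_critical_residues are all invented for the example and do not describe
any real kinase family, HMM or ProSite pattern.

```python
# Toy illustration only: the positions, residues and sequences below are
# invented and do not describe any real kinase family or ProSite pattern.

# Alignment column (0-based) -> residues accepted at that column.
REQUIRED = {2: "K", 7: "D"}

ACTIVE = "GAKVLHRDLK"    # hypothetical family member with both residues intact
INACTIVE = "GARVLHRNLK"  # hypothetical member with both residues substituted


def missing_critical_residues(aligned_seq, required):
    """Return {column: (accepted, found)} for every required column that fails."""
    problems = {}
    for column, accepted in sorted(required.items()):
        # Treat columns beyond the end of the sequence as gaps.
        found = aligned_seq[column] if column < len(aligned_seq) else "-"
        if found not in accepted:
            problems[column] = (accepted, found)
    return problems


print(missing_critical_residues(ACTIVE, REQUIRED))    # {}
print(missing_critical_residues(INACTIVE, REQUIRED))  # {2: ('K', 'R'), 7: ('D', 'N')}
```

Both toy sequences would count as members of the same family; only the
position-specific check separates them, which is roughly the judgement the
curator made by eye.
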
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Valerie Wood wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Hi Emily,
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> A few comments which may be relevant:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Out of interest, which protein kinase family is this (i.e.
>>>>>>>>>>>>>>>>>> which InterPro domain)? Is it a family where some (but not
>>>>>>>>>>>>>>>>>> all) members are protein kinases, in which case the mapping
>>>>>>>>>>>>>>>>>> should be removed?
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Alternatively, if this appears to be a spurious hit, instead
>>>>>>>>>>>>>>>>>> of adding a NOT annotation, you can get spurious matches
>>>>>>>>>>>>>>>>>> suppressed by InterPro as false positives (I often do this
>>>>>>>>>>>>>>>>>> for S. pombe).
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Or, could it be a sequencing or gene prediction error?
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Val
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Midori Harris wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> I think there's no doubt whatsoever that this information
>>>>>>>>>>>>>>>>>>> should be captured. The question is what to put for
>>>>>>>>>>>>>>>>>>> reference and evidence. The best evidence code is probably
>>>>>>>>>>>>>>>>>>> TAS, although one could possibly also make a case for ISS
>>>>>>>>>>>>>>>>>>> (note that IC is restricted to inferences from other GO
>>>>>>>>>>>>>>>>>>> annotations, so isn't suitable).
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> For a reference, one possibility is to add an item to the
>>>>>>>>>>>>>>>>>>> GO_REF collection; then there would be an ID to plug into
>>>>>>>>>>>>>>>>>>> the file.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> m
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Wed, 8 Mar 2006, Emily Dimmer wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> One of our annotators, who is an expert on protein kinases,
>>>>>>>>>>>>>>>>>>>> has looked at the sequence of a putative protein kinase
>>>>>>>>>>>>>>>>>>>> and, from noticing a couple of amino acid changes at its
>>>>>>>>>>>>>>>>>>>> active site, has predicted that it does not possess any
>>>>>>>>>>>>>>>>>>>> kinase activity - she did not use any software and there
>>>>>>>>>>>>>>>>>>>> is no published work on this protein.
>>>>>>>>>>>>>>>>>>>> Do you think this type of annotation should be represented
>>>>>>>>>>>>>>>>>>>> in GO (we feel this annotation is of high quality and adds
>>>>>>>>>>>>>>>>>>>> valuable information to a protein which has not yet been
>>>>>>>>>>>>>>>>>>>> characterized), and if so how should this annotation be
>>>>>>>>>>>>>>>>>>>> shown?
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>> Emily
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>> Evelyn Camon
>>>>>>>>>>>>>>>> GOA Coordinator
>>>>>>>>>>>>>>>> Senior Scientific Curator
>>>>>>>>>>>>>>>> European Bioinformatics Institute
>>>>>>>>>>>>>>>> Tel:01223-494465
>>>>>>>>>>>>>>>> Fax:01223-494468
>>>>>>>>>>>>>>>> E-mail: camon at ebi.ac.uk
>>>>>>>>>>>>>>>> URL: http://www.ebi.ac.uk/goa
>>>>>>>>
>>>>>>>> --
>>>>>>>> David P. Hill, Ph.D.
>>>>>>>> Senior Scientific Curator
>>>>>>>> Mouse Genome Informatics
>>>>>>>> Gene Ontology Consortium
>>>>>>>> The Jackson Laboratory
>>>>>>>> 600 Main Street
>>>>>>>> Bar Harbor, ME 04609-1500
>>>>>>>> tel:207-288-6430
>>>>>>>> http://www.informatics.jax.org
>>>>>>>> http://www.geneontology.org
>
--
Evelyn Camon
GOA Coordinator
Senior Scientific Curator
European Bioinformatics Institute
Tel:01223-494465
Fax:01223-494468
E-mail: camon at ebi.ac.uk
URL: http://www.ebi.ac.uk/goa
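
For readers who want to see the kind of record under discussion, here is a
minimal Python sketch of a NOT annotation made with ISS and a GO_REF, plus the
"filter on GO_REF plus evidence code" step mentioned in the thread. It assumes
the 15-column tab-separated gene association layout; the object symbol is a
made-up stand-in, GO_REF:xxx and InterPro:IPRxxxxxx are placeholders rather
than real identifiers, and the helper names (parse_line, is_not_iss_go_ref)
are hypothetical.

```python
from collections import namedtuple

# Column order assumed here: the 15-column tab-separated annotation layout.
GAF_COLUMNS = (
    "db", "db_object_id", "db_object_symbol", "qualifier", "go_id",
    "reference", "evidence", "with_from", "aspect", "db_object_name",
    "synonym", "db_object_type", "taxon", "date", "assigned_by",
)
Annotation = namedtuple("Annotation", GAF_COLUMNS)


def parse_line(line):
    """Split one tab-separated annotation line into named fields."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != len(GAF_COLUMNS):
        raise ValueError(f"expected {len(GAF_COLUMNS)} columns, got {len(fields)}")
    return Annotation(*fields)


def is_not_iss_go_ref(annotation):
    """True for a NOT annotation made with ISS and backed by a GO_REF."""
    qualifiers = annotation.qualifier.split("|")
    return (
        "NOT" in qualifiers
        and annotation.evidence == "ISS"
        and annotation.reference.startswith("GO_REF:")
    )


# Hypothetical record modelled on the example in the thread. "GO_REF:xxx" is
# the thread's own placeholder; the symbol column simply repeats the accession.
curated = parse_line("\t".join([
    "UniProt", "P12345", "P12345", "NOT", "GO:0004672", "GO_REF:xxx",
    "ISS", "", "F", "", "", "protein", "taxon:9606", "20060308", "UniProt",
]))

# The electronic annotation it would sit alongside (placeholder 'with' value).
electronic = curated._replace(
    qualifier="", evidence="IEA", with_from="InterPro:IPRxxxxxx"
)

print(is_not_iss_go_ref(curated))
print(is_not_iss_go_ref(electronic))
```

A tool that shows only the GO ID and evidence code would not surface the
GO_REF at all, which is one of the objections raised above.
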
More information about the go-discuss
mailing list