Formalising Uncertainty:
An Ontology of Reasoning, Certainty and Attribution (ORCA)

Anita de Waard, Disruptive Technologies Director, Elsevier Labs, Jericho, VT, USA
Jodi Schneider, PhD Researcher, DERI, Galway, Ireland
  
Outline
•  Background:
    –  Metadiscourse, epistemic modality, and knowledge attribution, oh my!
    –  Some related work: genre studies, linguistics, NLP
•  Our model:
    –  What it models
    –  The ontology
    –  How can we find this in text?
•  Possible applications:
    –  Possible uses
    –  Next steps
  
Background

Scientists make uncertain claims

Uncertainty
These miRNAs neutralize p53-mediated CDK inhibition, possibly through direct inhibition of the expression of the tumor-suppressor LATS2.
  
But uncertainty gets lost while citing

Uncertainty
These miRNAs neutralize p53-mediated CDK inhibition, possibly through direct inhibition of the expression of the tumor-suppressor LATS2.

Certainty
Two oncogenic miRNAs, miR-372 and miR-373, directly inhibit the expression of Lats2, thereby allowing tumorigenic growth in the presence of p53 (Voorhoeve et al., 2006)
  
Uncertainty in action:
“[Y]ou can transform .. fiction into fact just by adding or subtracting references”, Bruno Latour [1]
•  Voorhoeve et al., 2006: These miRNAs neutralize p53-mediated CDK inhibition, possibly through direct inhibition of the expression of the tumor suppressor LATS2.
•  Kloosterman and Plasterk, 2006: In a genetic screen, miR-372 and miR-373 were found to allow proliferation of primary human cells that express oncogenic RAS and active p53, possibly by inhibiting the tumor suppressor LATS2 (Voorhoeve et al., 2006).
•  Yabuta et al., 2007: [On the other hand,] two miRNAs, miRNA-372 and -373, function as potential novel oncogenes in testicular germ cell tumors by inhibition of LATS2 expression, which suggests that Lats2 is an important tumor suppressor (Voorhoeve et al., 2006).
•  Okada et al., 2011: Two oncogenic miRNAs, miR-372 and miR-373, directly inhibit the expression of Lats2, thereby allowing tumorigenic growth in the presence of p53 (Voorhoeve et al., 2006).
  
Uncertainty = Hedging:
•  Why do authors hedge?
    –  Make a claim ‘pending […] acceptance in the community’ [2]
    –  ‘Create A Research Space’ – hedging allows authors to insert themselves into the discourse in a community [3]
    –  ‘the strongest claim a careful researcher can make’ [4]
•  Hedging cues, speculative language, modality/negation:
    –  Light et al. [5]: finding speculative language
    –  Wilbur et al. [6]: focus, polarity, certainty, evidence, and directionality
    –  Thompson et al. [7]: level of speculation, type/source of the evidence and level of certainty
•  Sentiment detection (e.g. Kim and Hovy [8], among many others):
    –  Holder of the opinion, strength, polarity as ‘mathematical function’ acting on main propositional content
    –  Wide applications in product reviews, but not (yet) in science!
  
Our Model

Our model for epistemic evaluations:
For a Proposition P, an epistemically marked clause E is an evaluation of P, written $E_{V,B,S}(P)$, with:
    –  V = Value:
            3 = Assumed true, 2 = Probable, 1 = Possible, 0 = Unknown,
            (-1 = possibly untrue, -2 = probably untrue, -3 = assumed untrue)
    –  B = Basis:
            Reasoning
            Data
    –  S = Source:
            A = speaker is author A, explicit
            IA = speaker is author A, implicit
            N = other author N, explicit
            NN = other author NN, implicit
Model suggested by Eduard Hovy, Information Sciences Institute, University of Southern California
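The triple (V, B, S) attached to a proposition can be written down as a small data structure. The following is a minimal sketch in Python, not part of the original slides: the class and member names are illustrative assumptions, and the example proposition is a paraphrase of a sentence annotated later in the talk.

```python
from dataclasses import dataclass
from enum import Enum, IntEnum


class Value(IntEnum):
    """Epistemic value V, using the integer scale from the slide."""
    ASSUMED_UNTRUE = -3
    PROBABLY_UNTRUE = -2
    POSSIBLY_UNTRUE = -1
    UNKNOWN = 0
    POSSIBLE = 1
    PROBABLE = 2
    ASSUMED_TRUE = 3


class Basis(Enum):
    """Basis B of the evaluation."""
    REASONING = "Reasoning"
    DATA = "Data"


class Source(Enum):
    """Source S: who holds the evaluation, and whether that is stated."""
    AUTHOR_EXPLICIT = "A"
    AUTHOR_IMPLICIT = "IA"
    OTHER_EXPLICIT = "N"
    OTHER_IMPLICIT = "NN"


@dataclass(frozen=True)
class Evaluation:
    """E_{V,B,S}(P): an epistemic evaluation of a proposition P."""
    proposition: str
    value: Value
    basis: Basis
    source: Source


# Illustrative instance (paraphrased proposition, hypothetical annotation).
example = Evaluation(
    proposition="miR-373 suppresses LATS2 expression",
    value=Value.POSSIBLE,
    basis=Basis.DATA,
    source=Source.AUTHOR_EXPLICIT,
)
```

Using an integer-backed enumeration for Value keeps the ordering of the scale, so two evaluations of the same proposition can be compared directly.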
  
Adding Epistemic Evaluation

Together, Lats2 and ASPP1 shunt p53 to proapoptotic promoters and promote the death of polyploid cells [1]. (…)
    Value = 3, Source = N, Basis = 0

Further biochemical characterization of hMOBs showed that only hMOB1A and hMOB1B interact with both LATS1 and LATS2 in vitro and in vivo [39]. (…)
    Value = 3, Source = N, Basis = Data

Our findings reveal that miR-373 would be a potential oncogene and it participates in the carcinogenesis of human esophageal cancer by suppressing LATS2 expression.
    Value = 1, Source = Author, Basis = Data

Furthermore, we demonstrated that the direct inhibition of LATS2 protein was mediated by miR-373 and manipulated the expression of miR-373 to affect esophageal cancer cells growth.
    Value = 2 (3?), Source = Author, Basis = Data
  
Finding hedges in text [9]:
•  Modal auxiliary verbs (e.g. can, could, might)
•  Qualifying adverbs and adjectives (e.g. interestingly, possibly, likely, potential, somewhat, slightly, powerful, unknown, undefined)
•  References, either external (e.g. ‘[Voorhoeve et al., 2006]’) or internal (e.g. ‘See fig. 2a’).
•  Reporting/epistemic verbs (e.g. suggest, imply, indicate, show)
    –  either within the clause: ‘These results suggest that...’
    –  or in a subordinate clause governed by a reporting-verb matrix clause: ‘{These results suggest that} indeed, this represents the true endogenous activity.’
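The four cue types listed above lend themselves to a simple lexicon lookup. The sketch below is an assumption-laden illustration in Python, not the method used in [9]: the word lists are only the examples given on the slide, the two reference patterns are hypothetical regular expressions, and the inflection handling is deliberately crude (it would miss forms such as "implies" or "shown").

```python
import re

# Cue lists copied from the slide; a real lexicon would be far larger.
CUES = {
    "modal_auxiliary": ["can", "could", "might"],
    "qualifier": ["interestingly", "possibly", "likely", "potential",
                  "somewhat", "slightly", "powerful", "unknown", "undefined"],
    "reporting_verb": ["suggest", "imply", "indicate", "show"],
}

# Hypothetical patterns for the two reference types named on the slide.
REFERENCE_PATTERNS = {
    "external_reference": re.compile(
        r"\[[^\]]*\bet al\.[^\]]*\]|\([^)]*\bet al\.[^)]*\)"),
    "internal_reference": re.compile(r"\b[Ss]ee fig\.\s*\w+"),
}


def find_cues(sentence: str) -> list[tuple[str, str]]:
    """Return (cue_type, matched_text) pairs found in the sentence."""
    hits = []
    for cue_type, words in CUES.items():
        for word in words:
            # Allow regular suffixes (suggests, suggested) for verbs only.
            suffix = r"(?:s|ed|ing)?" if cue_type == "reporting_verb" else ""
            pattern = re.compile(rf"\b{re.escape(word)}{suffix}\b", re.IGNORECASE)
            hits.extend((cue_type, m.group(0)) for m in pattern.finditer(sentence))
    for cue_type, pattern in REFERENCE_PATTERNS.items():
        hits.extend((cue_type, m.group(0)) for m in pattern.finditer(sentence))
    return hits
```

Applied to the Voorhoeve et al. sentence quoted earlier, this lookup would flag only "possibly", which is exactly the hedge that later citations drop.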
  
Manual identification:

Value               Modal Aux   Reporting Verb   Ruled by RV   Adverbs/Adjectives   References   None       Total
Total Value = 3     1 (0.5%)    81 (40%)         24 (12%)      7 (4%)               41 (20%)     47 (24%)   201 (100%)
Total Value = 2                 29 (51%)         23 (40%)      1 (2%)               4 (7%)                  57 (100%)
Total Value = 1     9 (27%)     11 (33%)         11 (33%)      1 (3%)               1 (3%)                  33 (100%)
Total Value = 0                 9 (64%)          3 (21%)       1 (7%)               1 (7%)                  14 (100%)
Total No Modality               16 (37%)         3 (7%)        0                    3 (7%)       22 (50%)   44 (100%)
Overall Total       10 (2%)     146 (23%)        64 (10%)      10 (2%)              50 (8%)      69 (11%)   640 (100%)
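The counts in this table can be cross-checked with a few lines of arithmetic. The sketch below is an illustration in Python under one stated assumption: cells that are blank on the slide are read as 0, and each count is assigned to the column for which the column sums reproduce the Overall Total row. The variable names are hypothetical.

```python
# Counts per cue type, transcribed from the table (blank cells read as 0).
COLUMNS = ["modal_aux", "reporting_verb", "ruled_by_rv",
           "adverb_adjective", "reference", "none"]
ROWS = {
    "value_3":     [1, 81, 24, 7, 41, 47],
    "value_2":     [0, 29, 23, 1, 4, 0],
    "value_1":     [9, 11, 11, 1, 1, 0],
    "value_0":     [0, 9, 3, 1, 1, 0],
    "no_modality": [0, 16, 3, 0, 3, 22],
}
ROW_TOTALS = {"value_3": 201, "value_2": 57, "value_1": 33,
              "value_0": 14, "no_modality": 44}
OVERALL_TOTAL = dict(zip(COLUMNS, [10, 146, 64, 10, 50, 69]))


def column_sums(rows):
    """Sum each cue-type column over all rows."""
    return {col: sum(counts[i] for counts in rows.values())
            for i, col in enumerate(COLUMNS)}


# Sum of the listed cells in each row.
row_sums = {name: sum(counts) for name, counts in ROWS.items()}
```

Under this reading, every row sum matches its Total cell and every column sum matches the Overall Total row. The listed row totals add up to 349; the 640 printed in the last cell is left as it appears on the slide and is not checked here.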
  
Most prevalent clause type:
“These results suggest that...”

Adverb/Connective     thus, therefore, together, recently, in summary
Determiner/Pronoun    it, this, these, we/our
Adjective             previous, future, better
Noun phrase           data, report, study, result(s); method or reference
Modal                 form of ‘to be’, may, remain
Adjective             often, recently, generally
Verb                  show, obtain, consider, view, reveal, suggest, hypothesize, indicate, believe
Preposition           that, to
  
Reporting verbs vs. epistemic value:

Value = 0 (unknown)          establish, (remain to be) elucidated, be (clear/useful), (remain to be) examined/determined, describe, make difficult to infer, report
Value = 1 (hypothetical)     be important, consider, expect, hypothesize (5x), give insight, raise possibility that, suspect, think
Value = 2 (probable)         appear, believe, implicate (2x), imply, indicate (12x), play a role, represent, suggest (18x), validate (2x)
Value = 3 (presumed true)    be able/apparent/important/positive/visible, compare (2x), confirm (2x), define, demonstrate (15x), detect (5x), discover, display (3x), eliminate, find (3x), identify (4x), know, need, note (2x), observe (2x), obtain (success/results, 3x), prove to be, refer, report (2x), reveal (3x), see (2x), show (24x), study, view
  
Finding Claimed Knowledge Updates (CKUs) [10]:
Definition:
1) A CKU expresses a proposition about biological entities
2) A CKU is a new proposition
3) The authors present the CKU as factual:
=> Strength = Certainty
4) A CKU is derived from experimental work described in the article:
=> Basis = Data
5) The ownership is attributed to the author(s) of the article.
=> Source = Author, Explicit
3), 4) and 5) are either explicitly expressed or structurally conveyed:
Here we used mass spectrometry to identify HuD as a novel SMN-interacting partner
Our analysis of known HuD-associated mRNAs identified cpg15 mRNA as a highly abundant mRNA in HuD IPs
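The five criteria in the definition above amount to a conjunction of checks on an annotated claim. The following Python sketch illustrates that reading only; the `Claim` record and its field names are hypothetical, and the annotations in the usage example are assumed rather than taken from [10].

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Claim:
    """An annotated claim; field names are illustrative, not from the paper."""
    text: str
    about_biological_entities: bool  # criterion 1
    is_new: bool                     # criterion 2
    strength: str                    # criterion 3
    basis: str                       # criterion 4
    source: str                      # criterion 5


def is_cku(claim: Claim) -> bool:
    """True when all five criteria of the CKU definition hold."""
    return (
        claim.about_biological_entities
        and claim.is_new
        and claim.strength == "Certainty"
        and claim.basis == "Data"
        and claim.source == "Author, Explicit"
    )


# Usage with assumed annotations for one sentence quoted on the slide.
hud = Claim(
    text="Here we used mass spectrometry to identify HuD as a novel "
         "SMN-interacting partner",
    about_biological_entities=True,
    is_new=True,
    strength="Certainty",
    basis="Data",
    source="Author, Explicit",
)
```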
  
Automatic hedge detection with the Xerox Incremental Parser:

Concept-matching:
    Match concept patterns with rules
    Assign features to keywords, dependencies and sentences

General linguistic analysis of running texts:
    Extract syntactic dependencies between words
    Chunking
    Part-of-speech disambiguation
    Segment the sentences into words
    Segment the text into sentences
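To make the layering concrete, here is a toy stand-in in Python for the lowest and highest layers of such a pipeline: sentence segmentation, word segmentation, and one concept-matching rule. It is not the Xerox Incremental Parser and uses none of its rules; the word lists, the rule (a first-person or deictic marker together with a factual reporting verb) and all function names are assumptions made for illustration. Part-of-speech tagging, chunking and dependency extraction are skipped entirely.

```python
import re

# Hypothetical keyword features; the verbs come from the Value = 3 list.
DEICTIC = {"here", "we", "our"}
FACTUAL_VERBS = {"identify", "identified", "demonstrate", "demonstrated",
                 "show", "showed"}


def split_sentences(text: str) -> list[str]:
    """Naive sentence segmentation on ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence: str) -> list[str]:
    """Naive word segmentation: lowercase alphanumeric runs, hyphens kept."""
    return re.findall(r"[a-z0-9][a-z0-9-]*", sentence.lower())


def candidate_ckus(text: str) -> list[str]:
    """Sentences containing both a deictic marker and a factual verb."""
    matches = []
    for sentence in split_sentences(text):
        tokens = set(tokenize(sentence))
        if tokens & DEICTIC and tokens & FACTUAL_VERBS:
            matches.append(sentence)
    return matches
```

The naive splitter would break on abbreviations such as "et al.", which is one reason a real system does the linguistic analysis first and the concept matching on top of it.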
                                                             	
  
Result: CKUs appear throughout the paper
  
                        bio-event 	
  
                       entity 1       event name         entity 2             location


                        HuD           interaction         SMN           motor neurons
    Title          Abstract         Intro.          Results         Figures       Discussion       Citation
Interaction of   Here we used   Here we        Together with   SMN               Our MS and     Furthermore,
survival of      mass           identify HuD   our co-IP       interacts         co-IP data     these findings
motor            spectrometry   as a novel     data, these     with HuD.         demonstrate    are consistent
neuron           to identify    interacting    results                           a strong       with recent
(SMN) and        HuD as a       partner of     indicate that                     interaction    studies
HuD proteins     novel          SMN,           SMN                               between        demonstrating
[with mRNA       neuronal                      associates                        SMN and        that the
cpg15 rescues    SMN-                          with HuD in                       HuD in         interaction of
motor neuron     interacting                   motor                             spinal motor   HuD with the
axonal           partner.                      neurons.                          neuron         spinal
deficits]                                                                        axons.         muscular
                                                                                                atrophy
                                                                                                (SMA)
                                                                                                protein SMN
                                                                                                …
  
The formal model
© Jodi Schneider, with thanks to Siggi Handschuh
  
orca [11]
vocab.deri.ie/orca

Example Usage
<claim> orca:hasBasis orca:Data .
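The usage line above is a single RDF statement. As a hedged sketch, the Python helper below writes such statements out as Turtle text. Only `orca:hasBasis` and `orca:Data` appear on the slide; the namespace IRI `http://vocab.deri.ie/orca#`, the example claim IRI and the helper name `to_turtle` are assumptions, and the vocabulary itself should be consulted for the actual terms.

```python
# Namespace IRI is an assumption based on the address shown on the slide.
ORCA_PREFIX = "@prefix orca: <http://vocab.deri.ie/orca#> ."


def to_turtle(claim_iri: str, statements: dict[str, str]) -> str:
    """Serialise {property: value} pairs about one claim as Turtle text."""
    body = " ;\n".join(
        f"    orca:{prop} orca:{value}" for prop, value in statements.items()
    )
    return "\n".join([ORCA_PREFIX, "", f"<{claim_iri}>\n{body} ."])


# The claim IRI is a placeholder; hasBasis/Data are the terms from the slide.
print(to_turtle("http://example.org/claim/1", {"hasBasis": "Data"}))
```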
  
Basis	
  
Source	
  
ConfidenceLevel	
  
How to represent the hierarchy?
    lack of knowledge < hypothetical knowledge < dubitative knowledge < doxastic knowledge
•  skos:broaderThan – not appropriate
•  skos Collections add an unwanted layer of complexity.
•  Our approach: transitive properties “lessCertain” and “moreCertain”

Transitive properties used for ConfidenceLevel

ConfidenceLevel & its Relationships
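Declaring "lessCertain" transitive means that only adjacent levels need to be asserted; the remaining comparisons follow by inference. The Python sketch below illustrates that consequence with a plain transitive closure. The level identifiers are hypothetical spellings of the four levels named above, and treating "moreCertain" as the inverse of "lessCertain" is an assumption, not something stated on the slide.

```python
# Direct "lessCertain" assertions, one per adjacent pair in the ordering.
LESS_CERTAIN = {
    ("LackOfKnowledge", "HypotheticalKnowledge"),
    ("HypotheticalKnowledge", "DubitativeKnowledge"),
    ("DubitativeKnowledge", "DoxasticKnowledge"),
}


def transitive_closure(pairs):
    """Repeatedly add (a, d) whenever (a, b) and (b, d) are present."""
    closure = set(pairs)
    while True:
        new = {(a, d)
               for (a, b) in closure
               for (c, d) in closure
               if b == c} - closure
        if not new:
            return closure
        closure |= new


less_certain = transitive_closure(LESS_CERTAIN)
# Assumed inverse relation.
more_certain = {(b, a) for (a, b) in less_certain}
```

Three asserted pairs yield six inferred comparisons, including the one between lack of knowledge and doxastic knowledge, without any extra grouping layer.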
  
Possible Applications

Add knowledge value/basis/source to a bio-event

Biological statement with epistemic markup, each followed by its epistemic evaluation:

Our findings reveal that miR-373 would be a potential oncogene and it participates in the carcinogenesis of human esophageal cancer by suppressing LATS2 expression.
    Value = Probable, Source = Author, Basis = Data

Further biochemical characterization of hMOBs showed that only hMOB1A and hMOB1B interact with both LATS1 and LATS2 in vitro and in vivo [39].
    Value = Presumed true, Source = Reference, Basis = Data

Moreover, the mechanisms by which tumor suppressor genes are inhibited may vary between tumors.
    Value = Possible, Source = Unknown, Basis = Unknown
  
E.g. to augment Medscan [12]

Biological statement with Medscan/epistemic markup:
    Furthermore, we present evidence that the secretion of nesfatin-1 into the culture media was dramatically increased during the differentiation of 3T3-L1 preadipocytes into adipocytes (P < 0.001) and after treatments with TNF-alpha, IL-6, insulin, and dexamethasone (P < 0.01).
MedScan Analysis:
    IL-6 → NUCB2 (nesfatin-1)
    Relation: MolTransport
    Effect: Positive
    CellType: Adipocytes
    Cell Line: 3T3-L1
Epistemic evaluation:
    Value = Probable
    Source = Author
    Basis = Data
  
Or Biological Expression Language [13]:

Biological statement with BEL/epistemic markup:
    These miRNAs neutralize p53-mediated CDK inhibition, possibly through direct inhibition of the expression of the tumor-suppressor LATS2.
BEL representation:
    Increased abundance of miR-372 decreases: Increased activity of TP53 decreases activity of CDK protein family
    r(MIR:miR-372) -| (tscript(p(HUGO:Trp53)) -| kin(p(PFH:"CDK Family")))

    Increased abundance of miR-372 decreases abundance of LATS2
    r(MIR:miR-372) -| r(HUGO:LATS2)
Epistemic evaluation:
    Value = Possible
    Source = Unknown
    Basis = Unknown
  
Using ORCA for Nanopublications [14]:
•  Use to indicate Strength, Basis, Source of Assertions:

Knowledge Strength, Basis, Source        Methods        Authors, DOIs
  
Next steps:
•  Continuing experiments with automated detection
•  Can be used in Claim-Evidence network projects, e.g. Data2Semantics or DIKB
•  Could replace more complicated models of argumentation
•  Ontology is available for all to use!
  
Thank you!
•  Funding:
    –  Elsevier Labs
    –  NWO Casimir programme
•  Collaborators:
    –  Henk Pander Maat, UU
    –  Agnes Sandor, XRCE
    –  Siegfried Handschuh, DERI
    –  Rinke Hoekstra & co, VU
    –  Richard Boyce & co, UPitt
    –  Maria Liakata, EBI
    –  Sophia Ananiadou & co, NaCTeM
•  Discussion partners:
    –  Phil Bourne, UCSD
    –  Ed Hovy
    –  Gully Burns, ISI
    –  Joanne Luciano, RPI
    –  Tim Clark et al., Harvard
  
Questions?

Anita de Waard
a.dewaard@elsevier.com
http://elsatglabs.com/labs/anita/

Jodi Schneider
jodi.schneider@deri.org
http://jodischneider.com/jodi.html
  
References
[1] Latour, B. and Woolgar, S. (1979). Laboratory Life: The Social Construction of Scientific Facts. Sage.
[2] Myers, G. (1992). ‘In this paper we report’: Speech acts and scientific facts. Journal of Pragmatics 17, 295-313.
[3] Swales, J. (1990). Genre Analysis: English in Academic and Research Settings. Cambridge University Press.
[4] Salager-Meyer, F. (1994). Hedges and Textual Communicative Function in Medical English Written Discourse. English for Specific Purposes 13(2), 149-170.
[5] Light, M., Qiu, X.Y. and Srinivasan, P. (2004). The language of bioscience: facts, speculations, and statements in between. BioLINK 2004: Linking Biological Literature, Ontologies and Databases, 17-24.
[6] Wilbur, W.J., Rzhetsky, A. and Shatkay, H. (2006). New directions in biomedical text annotations: definitions, guidelines and corpus construction. BMC Bioinformatics 7:356.
[7] Thompson, P., Venturi, G. et al. (2008). Categorising modality in biomedical texts. Proc. LREC 2008 Workshop on Building and Evaluating Resources for Biomedical Text Mining.
[8] Kim, S.-M. and Hovy, E.H. (2004). Determining the Sentiment of Opinions. COLING conference, Geneva.
[9] de Waard, A. and Pander Maat, H. (2012). Epistemic Modality and Knowledge Attribution in Scientific Discourse: A Taxonomy of Types and Overview of Features. Workshop on Detecting Structure in Scholarly Discourse, ACL 2012.
[10] Sándor, Á. and de Waard, A. (2012). Identifying Claimed Knowledge Updates in Biomedical Research Articles. Workshop on Detecting Structure in Scholarly Discourse, ACL 2012.
[11] de Waard, A. and Schneider, J. (2012). Formalising Uncertainty: An Ontology of Reasoning, Certainty and Attribution (ORCA). SATBI+SWIM, ISWC 2012.
[12] Medscan
[13] Biological Expression Language, http://www.openbel.org
[14] Groth et al. (2010). ‘The anatomy of a nanopublication’. Information Services & Use 30:51-6.
  

Mais conteúdo relacionado

Semelhante a Talk at ISWC 2012 Workshop on Semantic Technologies Applied to Biomedical Informatics and Individualized Medicine (SATBI+SWIM 2012)

The Narrative Structure of Research Articles, or, Why Science is Like a Fairy...
The Narrative Structure of Research Articles, or, Why Science is Like a Fairy...The Narrative Structure of Research Articles, or, Why Science is Like a Fairy...
The Narrative Structure of Research Articles, or, Why Science is Like a Fairy...Anita de Waard
 
How to persuade with data
How to persuade with dataHow to persuade with data
How to persuade with dataAnita de Waard
 
Correlation globes of the exposome 2016
Correlation globes of the exposome 2016Correlation globes of the exposome 2016
Correlation globes of the exposome 2016Chirag Patel
 
How Can Ngs Forward Research Essay
How Can Ngs Forward Research EssayHow Can Ngs Forward Research Essay
How Can Ngs Forward Research EssayStefanie Yang
 
Transcriptomics and lexico-syntactic analysis
Transcriptomics and lexico-syntactic analysisTranscriptomics and lexico-syntactic analysis
Transcriptomics and lexico-syntactic analysisLars Juhl Jensen
 
Why life is so complicated
Why life is so complicatedWhy life is so complicated
Why life is so complicatedAnita de Waard
 
Taking A Look At Influenza A Virus
Taking A Look At Influenza A VirusTaking A Look At Influenza A Virus
Taking A Look At Influenza A VirusNicole Gomez
 
Essential Biology 04.1 Chromosomes, Genes, Alleles, Mutations
Essential Biology 04.1   Chromosomes, Genes, Alleles, MutationsEssential Biology 04.1   Chromosomes, Genes, Alleles, Mutations
Essential Biology 04.1 Chromosomes, Genes, Alleles, MutationsStephen Taylor
 
Publishing for the 21st Century: Experiences from the NEUROSCIENCE INFORMATIO...
Publishing for the 21st Century: Experiences from the NEUROSCIENCE INFORMATIO...Publishing for the 21st Century: Experiences from the NEUROSCIENCE INFORMATIO...
Publishing for the 21st Century: Experiences from the NEUROSCIENCE INFORMATIO...Neuroscience Information Framework
 
International Journal of Biometrics and Bioinformatics(IJBB) Volume (4) Issue...
International Journal of Biometrics and Bioinformatics(IJBB) Volume (4) Issue...International Journal of Biometrics and Bioinformatics(IJBB) Volume (4) Issue...
International Journal of Biometrics and Bioinformatics(IJBB) Volume (4) Issue...CSCJournals
 
Hsiao-DevNeurobiol2014
Hsiao-DevNeurobiol2014Hsiao-DevNeurobiol2014
Hsiao-DevNeurobiol2014Katie K. Hsiao
 
cloning. Second, it is sensitive. Activities canbe detected
cloning. Second, it is sensitive. Activities canbe detected cloning. Second, it is sensitive. Activities canbe detected
cloning. Second, it is sensitive. Activities canbe detected WilheminaRossi174
 
Accelerating Scientific Research Through Machine Learning and Graph
Accelerating Scientific Research Through Machine Learning and GraphAccelerating Scientific Research Through Machine Learning and Graph
Accelerating Scientific Research Through Machine Learning and GraphNeo4j
 




Talk at ISWC 2012 Workshop on Semantic Technologies Applied to Biomedical Informatics and Individualized Medicine (SATBI+SWIM 2012)

  • 1. Formalising Uncertainty: An Ontology of Reasoning, Certainty and Attribution (ORCA)
      Anita de Waard, Disruptive Technologies Director, Elsevier Labs, Jericho, VT, USA
      Jodi Schneider, PhD Researcher, DERI, Galway, Ireland
  • 2. Outline
      •  Background:
          –  Metadiscourse, epistemic modality, and knowledge attribution, oh my!
          –  Some related work: genre studies, linguistics, NLP
      •  Our model:
          –  What it models
          –  The ontology
          –  How can we find this in text?
      •  Possible applications:
          –  Possible uses
          –  Next steps
  • 4. Scientists make uncertain claims
      Uncertainty: These miRNAs neutralize p53-mediated CDK inhibition, possibly through direct inhibition of the expression of the tumor-suppressor LATS2.
  • 5. But uncertainty gets lost while citing
      Uncertainty: These miRNAs neutralize p53-mediated CDK inhibition, possibly through direct inhibition of the expression of the tumor-suppressor LATS2.
      Certainty: Two oncogenic miRNAs, miR-372 and miR-373, directly inhibit the expression of Lats2, thereby allowing tumorigenic growth in the presence of p53 (Voorhoeve et al., 2006)
  • 6. Uncertainty in action:
      “[Y]ou can transform .. fiction into fact just by adding or subtracting references”, Bruno Latour [1]
      •  Voorhoeve et al., 2006: These miRNAs neutralize p53-mediated CDK inhibition, possibly through direct inhibition of the expression of the tumor suppressor LATS2.
      •  Kloosterman and Plasterk, 2006: In a genetic screen, miR-372 and miR-373 were found to allow proliferation of primary human cells that express oncogenic RAS and active p53, possibly by inhibiting the tumor suppressor LATS2 (Voorhoeve et al., 2006).
      •  Yabuta et al., 2007: [On the other hand,] two miRNAs, miRNA-372 and -373, function as potential novel oncogenes in testicular germ cell tumors by inhibition of LATS2 expression, which suggests that Lats2 is an important tumor suppressor (Voorhoeve et al., 2006).
      •  Okada et al., 2011: Two oncogenic miRNAs, miR-372 and miR-373, directly inhibit the expression of Lats2, thereby allowing tumorigenic growth in the presence of p53 (Voorhoeve et al., 2006).
  • 7. Uncertainty = Hedging:
      •  Why do authors hedge?
          –  Make a claim ‘pending […] acceptance in the community’ [2]
          –  ‘Create A Research Space’: hedging allows authors to insert themselves into the discourse in a community [3]
          –  ‘the strongest claim a careful researcher can make’ [4]
      •  Hedging cues, speculative language, modality/negation:
          –  Light et al [5]: finding speculative language
          –  Wilbur et al [6]: focus, polarity, certainty, evidence, and directionality
          –  Thompson et al [7]: level of speculation, type/source of the evidence and level of certainty
      •  Sentiment detection (e.g. Kim and Hovy [8] a.m.o.):
          –  Holder of the opinion, strength, polarity as ‘mathematical function’ acting on main propositional content
          –  Wide applications in product reviews; but not (yet) in science!
  • 9. Our model for epistemic evaluations:
      For a Proposition P, an epistemically marked clause E is an evaluation of P, where E = E_{V,B,S}(P), with:
          –  V = Value: 3 = assumed true, 2 = probable, 1 = possible, 0 = unknown (-1 = possibly untrue, -2 = probably untrue, -3 = assumed untrue)
          –  B = Basis: Reasoning, Data
          –  S = Source:
              A = speaker is author A, explicit
              IA = speaker is author A, implicit
              N = other author N, explicit
              NN = other author NN, implicit
      Model suggested by Eduard Hovy, Information Sciences Institute, University of Southern California
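The evaluation E_{V,B,S}(P) on this slide can be sketched as a small record type. This is only an illustration under stated assumptions: the class and field names (`EpistemicEvaluation`, `Value`, `Basis`, `Source`) are hypothetical and are not taken from the ORCA vocabulary; only the value scale and the source codes come from the slide.

```python
from dataclasses import dataclass
from enum import Enum, IntEnum


class Value(IntEnum):
    """Epistemic value scale from the slide (-3 .. 3)."""
    ASSUMED_UNTRUE = -3
    PROBABLY_UNTRUE = -2
    POSSIBLY_UNTRUE = -1
    UNKNOWN = 0
    POSSIBLE = 1
    PROBABLE = 2
    ASSUMED_TRUE = 3


class Basis(Enum):
    REASONING = "Reasoning"
    DATA = "Data"


class Source(Enum):
    AUTHOR_EXPLICIT = "A"
    AUTHOR_IMPLICIT = "IA"
    OTHER_EXPLICIT = "N"
    OTHER_IMPLICIT = "NN"


@dataclass(frozen=True)
class EpistemicEvaluation:
    """Hypothetical container for one evaluation E_{V,B,S}(P)."""
    proposition: str
    value: Value
    basis: Basis
    source: Source

    def label(self) -> str:
        # Compact rendering of the three subscripts V, B, S.
        return f"E[V={int(self.value)}, B={self.basis.value}, S={self.source.value}]"


example = EpistemicEvaluation(
    proposition="only hMOB1A and hMOB1B interact with both LATS1 and LATS2",
    value=Value.ASSUMED_TRUE,
    basis=Basis.DATA,
    source=Source.OTHER_EXPLICIT,
)
print(example.label())
```

Using an integer enum for Value keeps the ordering of the scale, so comparisons such as `Value.PROBABLE > Value.POSSIBLE` work directly.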
  • 10. Adding Epistemic Evaluation
      •  Together, Lats2 and ASPP1 shunt p53 to proapoptotic promoters and promote the death of polyploid cells [1]. (…)
          Value = 3, Source = N, Basis = 0
      •  Further biochemical characterization of hMOBs showed that only hMOB1A and hMOB1B interact with both LATS1 and LATS2 in vitro and in vivo [39]. (…)
          Value = 3, Source = N, Basis = Data
      •  Our findings reveal that miR-373 would be a potential oncogene and it participates in the carcinogenesis of human esophageal cancer by suppressing LATS2 expression.
          Value = 1, Source = Author, Basis = Data
      •  Furthermore, we demonstrated that the direct inhibition of LATS2 protein was mediated by miR-373 and manipulated the expression of miR-373 to affect esophageal cancer cells growth.
          Value = 2 (3?), Source = Author, Basis = Data
  • 11. Finding hedges in text [9]:
      •  Modal auxiliary verbs (e.g. can, could, might)
      •  Qualifying adverbs and adjectives (e.g. interestingly, possibly, likely, potential, somewhat, slightly, powerful, unknown, undefined)
      •  References, either external (e.g. ‘[Voorhoeve et al., 2006]’) or internal (e.g. ‘See fig. 2a’).
      •  Reporting/epistemic verbs (e.g. suggest, imply, indicate, show)
          –  either within the clause: ‘These results suggest that...’
          –  or in a subordinate clause governed by reporting-verb matrix clause ‘{These results suggest that} indeed, this represents the true endogenous activity.’
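The four cue types listed above can be approximated with simple pattern matching. The sketch below is an assumption-laden illustration, not the method used in [9]: the word lists are only the examples from the slide, the regular expressions are hypothetical, and real hedge detection would need parsing to tell which clause a cue governs.

```python
import re

# Cue patterns built only from the examples on the slide (hypothetical, not exhaustive).
PATTERNS = {
    "modal_auxiliary": re.compile(r"\b(can|could|might)\b", re.IGNORECASE),
    "qualifier": re.compile(
        r"\b(interestingly|possibly|likely|potential|somewhat|slightly|"
        r"powerful|unknown|undefined)\b",
        re.IGNORECASE,
    ),
    "reporting_verb": re.compile(
        r"\b(suggest(s|ed|ing)?|impl(y|ies|ied|ying)|"
        r"indicat(e|es|ed|ing)|show(s|ed|n|ing)?)\b",
        re.IGNORECASE,
    ),
    # External reference such as "[Voorhoeve et al., 2006]" or internal "see fig. 2a".
    "reference": re.compile(r"\[[^\]]*\d{4}[^\]]*\]|\bsee fig\.\s*\w+", re.IGNORECASE),
}


def find_hedge_cues(sentence):
    """Return (cue_type, matched_text) pairs, grouped by cue type."""
    found = []
    for cue_type, pattern in PATTERNS.items():
        for match in pattern.finditer(sentence):
            found.append((cue_type, match.group(0)))
    return found


claim = ("These miRNAs neutralize p53-mediated CDK inhibition, possibly through "
         "direct inhibition of the expression of the tumor-suppressor LATS2.")
print(find_hedge_cues(claim))
```

A surface matcher like this only flags candidate cues; deciding which proposition a cue modifies is the part that the dependency analysis described later in the deck is meant to handle.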
  • 12. Manual identification:
      Value               | Modal Aux | Reporting Verb | Ruled by RV | Adverbs/Adjectives | References | None     | Total
      Total Value = 3     | 1 (0.5%)  | 81 (40%)       | 24 (12%)    | 7 (4%)             | 41 (20%)   | 47 (24%) | 201 (100%)
      Total Value = 2     | -         | 29 (51%)       | 23 (40%)    | 1 (2%)             | 4 (7%)     | -        | 57 (100%)
      Total Value = 1     | 9 (27%)   | 11 (33%)       | 11 (33%)    | 1 (3%)             | 1 (3%)     | -        | 33 (100%)
      Total Value = 0     | -         | 9 (64%)        | 3 (21%)     | 1 (7%)             | 1 (7%)     | -        | 14 (100%)
      Total No Modality   | -         | 16 (37%)       | 3 (7%)      | 0                  | 3 (7%)     | 22 (50%) | 44 (100%)
      Overall Total       | 10 (2%)   | 146 (23%)      | 64 (10%)    | 10 (2%)            | 50 (8%)    | 69 (11%) | 640 (100%)
  • 13. Most prevalent clause type: “These results suggest that...”
      Adverb/Connective  | thus, therefore, together, recently, in summary
      Determiner/Pronoun | it, this, these, we/our
      Adjective          | previous, future, better
      Noun phrase        | data, report, study, result(s); method or reference
      Modal              | form of ‘to be’, may, remain
      Adjective          | often, recently, generally
      Verb               | show, obtain, consider, view, reveal, suggest, hypothesize, indicate, believe
      Preposition        | that, to
  • 14. Reporting verbs vs. epistemic value:
      Value = 0 (unknown)        | establish, (remain to be) elucidated, be (clear/useful), (remain to be) examined/determined, describe, make difficult to infer, report
      Value = 1 (hypothetical)   | be important, consider, expect, hypothesize (5x), give insight, raise possibility that, suspect, think
      Value = 2 (probable)       | appear, believe, implicate (2x), imply, indicate (12x), play a role, represent, suggest (18x), validate (2x)
      Value = 3 (presumed true)  | be able/apparent/important/positive/visible, compare (2x), confirm (2x), define, demonstrate (15x), detect (5x), discover, display (3x), eliminate, find (3x), identify (4x), know, need, note (2x), observe (2x), obtain (success/results, 3x), prove to be, refer, report (2x), reveal (3x), see (2x), show (24x), study, view
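The verb lists above suggest a simple lookup from a reporting verb to a default epistemic value. The following is a minimal sketch under assumptions: it uses only a subset of single-word verbs from the slide, expects an already lemmatized verb, and the function name `epistemic_value` is hypothetical. The Value = 0 row is omitted because most of its entries are multiword constructions, and "report" is omitted because the slide lists it under both Value = 0 and Value = 3.

```python
# Subset of single-word reporting verbs from the slide, keyed by epistemic value.
VERBS_BY_VALUE = {
    1: ["consider", "expect", "hypothesize", "suspect", "think"],
    2: ["appear", "believe", "implicate", "imply", "indicate",
        "represent", "suggest", "validate"],
    3: ["confirm", "demonstrate", "detect", "discover", "find",
        "identify", "observe", "reveal", "show"],
}

# Invert the table so each verb lemma maps to its value.
VALUE_OF_VERB = {
    verb: value for value, verbs in VERBS_BY_VALUE.items() for verb in verbs
}


def epistemic_value(lemma):
    """Return the default value for a verb lemma, or None if it is not listed."""
    return VALUE_OF_VERB.get(lemma.lower())


for verb in ("suggest", "show", "hypothesize", "report"):
    print(verb, epistemic_value(verb))
```

Such a table can only give a default: as the annotated examples earlier in the deck show, the same verb can receive a different value once modals or qualifiers in the clause are taken into account.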
  • 15. Finding Claimed Knowledge Updates [10]:
      Definition:
          1) A CKU expresses a proposition about biological entities
          2) A CKU is a new proposition
          3) The authors present the CKU as factual: => Strength = Certainty
          4) A CKU is derived from experimental work described in the article: => Basis = Data
          5) The ownership is attributed to the author(s) of the article: => Source = Author, Explicit
      3), 4) and 5) are either explicitly expressed or structurally conveyed:
          –  Here we used mass spectrometry to identify HuD as a novel SMN-interacting partner
          –  Our analysis of known HuD-associated mRNAs identified cpg15 mRNA as a highly abundant mRNA in HuD IPs
  • 16. Automatic hedge detection with The Xerox Incremental Parser:
      Concept-matching:
          –  Match concept patterns with rules
          –  Assign features to keywords, dependencies and sentences
      General linguistic analysis of running texts:
          –  Extract syntactic dependencies between words
          –  Chunking
          –  Part-of-speech disambiguation
          –  Segment the sentences into words
          –  Segment the text into sentences
  • 17. Result: CKUs appear throughout the paper
      bio-event: entity 1 = HuD, event name = interaction, entity 2 = SMN, location = motor neurons
      Title: Interaction of survival of motor neuron (SMN) and HuD proteins [with mRNA cpg15 rescues motor neuron axonal deficits]
      Abstract: Here we used mass spectrometry to identify HuD as a novel neuronal SMN-interacting partner.
      Intro.: Here we identify HuD as a novel interacting partner of SMN,
      Results: Together with our co-IP data, these results indicate that SMN associates with HuD in motor neurons.
      Figures: SMN interacts with HuD.
      Discussion: Our MS and co-IP data demonstrate a strong interaction between SMN and HuD in spinal motor neuron axons.
      Citation: Furthermore, these findings are consistent with recent studies demonstrating that the interaction of HuD with the spinal muscular atrophy (SMA) protein SMN …
  • 18. The Xerox Incremental Parser:
      Concept-matching:
          –  Match concept patterns with rules
          –  Assign features to keywords, dependencies and sentences
      General linguistic analysis of running texts:
          –  Extract syntactic dependencies between words
          –  Chunking
          –  Part-of-speech disambiguation
          –  Segment the sentences into words
          –  Segment the text into sentences
  • 19. The formal model. © Jodi Schneider, with thanks to Siggi Handschuh
  • 20. orca [11] vocab.deri.ie/orca
  • 21. Example Usage: <claim> orca:hasBasis orca:Data .
  • 25. How to represent the hierarchy?
      lack of knowledge < hypothetical knowledge < dubitative knowledge < doxastic knowledge
      •  skos:broaderThan: not appropriate
      •  skos Collections add an unwanted layer of complexity.
      •  Our approach: transitive properties “lessCertain” and “moreCertain”
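The effect of declaring “lessCertain” and “moreCertain” as transitive properties can be illustrated without an RDF reasoner. In this sketch only the adjacent links of the chain above are asserted and the remaining pairs are derived by computing a transitive closure, which is what a reasoner would infer. The level names used as identifiers here are hypothetical stand-ins, not the actual ORCA terms.

```python
# Directly asserted lessCertain links (adjacent levels only); names are illustrative.
LESS_CERTAIN = {
    ("LackOfKnowledge", "HypotheticalKnowledge"),
    ("HypotheticalKnowledge", "DubitativeKnowledge"),
    ("DubitativeKnowledge", "DoxasticKnowledge"),
}


def transitive_closure(pairs):
    """Add (a, d) whenever (a, b) and (b, d) are present, until nothing changes."""
    closure = set(pairs)
    while True:
        new = {(a, d) for (a, b) in closure for (c, d) in closure if b == c} - closure
        if not new:
            return closure
        closure |= new


def less_certain(a, b):
    """True if level a is (directly or by inference) less certain than level b."""
    return (a, b) in transitive_closure(LESS_CERTAIN)


def more_certain(a, b):
    """moreCertain is treated here as the inverse of lessCertain."""
    return less_certain(b, a)


print(less_certain("LackOfKnowledge", "DoxasticKnowledge"))
print(more_certain("DoxasticKnowledge", "HypotheticalKnowledge"))
```

With three asserted links the closure contains six pairs, so the full ordering of the four levels is recoverable from the adjacent links alone.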
  • 26. Transitive properties used for ConfidenceLevel
  • 27. ConfidenceLevel & its Relationships
  • 29. Add knowledge value/basis/source to a bio-event
      Biological statement with epistemic markup, followed by its epistemic evaluation:
      •  Our findings reveal that miR-373 would be a potential oncogene and it participates in the carcinogenesis of human esophageal cancer by suppressing LATS2 expression.
          Value = Probable, Source = Author, Basis = Data
      •  Further biochemical characterization of hMOBs showed that only hMOB1A and hMOB1B interact with both LATS1 and LATS2 in vitro and in vivo [39].
          Value = Presumed true, Source = Reference, Basis = Data
      •  Moreover, the mechanisms by which tumor suppressor genes are inhibited may vary between tumors.
          Value = Possible, Source = Unknown, Basis = Unknown
  • 30. E.g. to augment Medscan [13]
      Biological statement with Medscan/epistemic markup: Furthermore, we present evidence that the secretion of nesfatin-1 into the culture media was dramatically increased during the differentiation of 3T3-L1 preadipocytes into adipocytes (P < 0.001) and after treatments with TNF-alpha, IL-6, insulin, and dexamethasone (P < 0.01).
      MedScan Analysis: IL-6 → NUCB2 (nesfatin-1); Relation: MolTransport; Effect: Positive; CellType: Adipocytes; Cell Line: 3T3-L1
      Epistemic evaluation: Value = Probable, Source = Author, Basis = Data
  • 31. Or Biological Expression Language [14]:
      Biological statement with BEL/epistemic markup: These miRNAs neutralize p53-mediated CDK inhibition, possibly through direct inhibition of the expression of the tumor-suppressor LATS2.
      BEL representation:
          –  Increased abundance of miR-372 decreases: Increased activity of TP53 decreases activity of CDK protein family
              r(MIR:miR-372) -| (tscript(p(HUGO:Trp53)) -| kin(p(PFH:"CDK Family")))
          –  Increased abundance of miR-372 decreases abundance of LATS2
              r(MIR:miR-372) -| r(HUGO:LATS2)
      Epistemic evaluation: Value = Possible, Source = Unknown, Basis = Unknown
  • 32. Using ORCA for Nanopublications [15]:
      •  Use to indicate Strength, Basis, Source of Assertions
      [Diagram labels: Knowledge Strength, Basis, Source; Methods; Authors, DOIs]
  • 33. Next steps:
      •  Continuing experiments with automated detection
      •  Can be used in Claim-Evidence network projects, e.g. Data2Semantics or DIKB
      •  Could replace more complicated models of argumentation
      •  Ontology is available for all to use!
  • 34. Thank you! • Funding: • Discussion partners: – Elsevier Labs – Phil Bourne, UCSD – NWO Casimir programme – Ed Hovy, • Collaborators: – Gully Burns, ISI – Henk Pander Maat, UU – Joanne Luciano, RPI – Agnes Sandor, XRCE – Tim Clark et al., Harvard – Siegfried Handschuh, DERI – Rinke Hoekstra & co, VU – Richard Boyce & co, UPitt – Maria Liakata, EBI – Sophia Ananiadou & co, NaCTeM
  • 35. Questions?
      Anita de Waard, a.dewaard@elsevier.com, http://elsatglabs.com/labs/anita/
      Jodi Schneider, jodi.schneider@deri.org, http://jodischneider.com/jodi.html
  • 36. References
      [1] Latour, B. and Woolgar, S. (1979). Laboratory Life: the Social Construction of Scientific Facts. Sage.
      [2] Myers, G. (1992). ‘In this paper we report’: Speech acts and scientific facts. Journal of Pragmatics 17, 295-313.
      [3] Swales, J. (1990). Genre Analysis: English in Academic and Research Settings. Cambridge University Press.
      [4] Salager-Meyer, F. (1994). Hedges and Textual Communicative Function in Medical English Written Discourse. English for Specific Purposes, Vol. 13, No. 2, pp. 149-170.
      [5] Light, M., Qiu, X.Y. and Srinivasan, P. (2004). The language of bioscience: facts, speculations, and statements in between. BioLINK 2004: Linking Biological Literature, Ontologies and Databases, 17-24.
      [6] Wilbur, W.J., Rzhetsky, A. and Shatkay, H. (2006). New directions in biomedical text annotations: definitions, guidelines and corpus construction. BMC Bioinformatics 7:356.
      [7] Thompson, P., Venturi, G. et al. (2008). Categorising modality in biomedical texts. Proc. LREC 2008 Workshop on Building and Evaluating Resources for Biomedical Text Mining.
      [8] Kim, S-M. and Hovy, E.H. (2004). Determining the Sentiment of Opinions. COLING conference, Geneva.
      [9] de Waard, A. and Pander Maat, H. (2012). Epistemic Modality and Knowledge Attribution in Scientific Discourse: A Taxonomy of Types and Overview of Features. Workshop on Detecting Structure in Scholarly Discourse, ACL 2012.
      [10] Sándor, À. and de Waard, A. (2012). Identifying Claimed Knowledge Updates in Biomedical Research Articles. Workshop on Detecting Structure in Scholarly Discourse, ACL 2012.
      [11] de Waard, A. and Schneider, J. (2012). Formalising Uncertainty: An Ontology of Reasoning, Certainty and Attribution (ORCA). SATBI+SWIM, ISWC 2012.
      [12] Medscan
      [13] Biological Expression Language – http://www.openbel.org
      [14] Groth et al (2010). ‘The anatomy of a nanopublication’. Information Services & Use 30:51-6