Presentation from a lecture at the IT-university, University of Copenhagen, September 2013. Covers trends, where data is coming from, what cognitive computing is, what Watson is, how it works, and how to apply it to real-world issues.
How a Jeopardy-winning machine makes the World a Smarter Place
[Slide image: scanned, partly handwritten medical form (pre-certification memo with patient history, review of systems, and an ear/nose/throat exam checklist; date of service December 5, 2012). The OCR text is not recoverable.]
How a Jeopardy-winning machine
makes the World a Smarter Place
Kim Escherich, Executive Innovation Architect
“Can we design a computing system that rivals a human’s ability to answer
questions posed in natural language, interpreting meaning and context and
retrieving, analyzing and understanding vast amounts of information in real-time?”
Kim Escherich
escherich@dk.ibm.com
+45 2880 4733
internetofthings.dk
escherich.biz
@kescherich
/escherich
/in/escherich
kescherich@gmail.com
We have only just begun to
build a new era of computing
powered by cognitive systems
Transforming how organizations
think, act, and operate
Learning through interactions
Delivering evidence based
responses driving better outcomes
Editor's Notes
What is bringing about the need for a new era of computing? In large part, it is the explosion of data: not just the typical structured data we find in computer databases, but data from voice, social media, and sensors throughout the world. Up to 80 percent of this data is projected to be unstructured by 2015. As you can see, data is just beginning its rapid growth. We’re still on the blade part of the hockey stick.
Main point: Data is growing at an astounding rate. It is growing so fast that we often lack the ability to use it to its full potential. The highly unstructured nature of this data makes the challenge that much more difficult. This is a real problem for business, because it makes informed decisions harder to make. Business leaders need a way to find hidden patterns and isolate the valuable nuggets they need to make business decisions. Further speaking points: Yet the rewards for finding a way to harness the data into useful information are great; 54% of companies in this year’s study with MIT/Sloan are using analytics for competitive advantage, and that number has surged 57% in just the past 12 months. “Dying of thirst in an ocean of data” is an apt analogy. Data is everywhere. 90% of it didn't exist just two years ago. The vast majority of it is useless for any given goal and therefore amounts to noise, a hindrance to finding the key information needed at a specific time and place. Additional information: See information and stats
As this chart shows, the issue with data isn’t just the volume of data. IT systems need to be able to deal with the speed of the data, the various forms of data, and the growing awareness that much of our data is uncertain. On Jeopardy!, Watson competed well across all of these attributes. Arvind Krishna will talk more about these aspects of big data tomorrow.
To know we are making progress on scientific problems like open-domain QA, well-defined challenges help demonstrate that we can solve concrete and difficult tasks. As you might know, Jeopardy! is a long-standing, well-regarded and highly challenging television quiz show in the US that requires human contestants to quickly understand and answer richly expressed natural language questions over a staggering array of topics. The Jeopardy! Challenge uniquely provides a palpable, compelling and notable way to drive the technology of Question Answering along key dimensions. If you are familiar with the quiz show, you know it asks an incredibly broad range of questions over a huge variety of topics. In a single round there is a grid of 6 categories, and for each category 5 rows with increasing $ values. Once a cell is chosen by one of the three players, a question, or what is often called a clue, is revealed. Here you see some example questions. <read some of the questions> Jeopardy! uses complex and often subtle language to describe what is being asked. To win you have to be extraordinarily precise. You must deliver the exact answer, no more and no less. It is not good enough for it to be somewhere in the top 2, 10 or 20 documents; you must know it exactly and get it in first place, otherwise you get no credit. In fact, you lose points. You must demonstrate accurate confidences. That is, you must know what you know: if you buzz in and then get it wrong, you lose the dollar value of the question. And you have to do all this very quickly: deeply analyze huge volumes of content, consider many possible answers, compute your confidence and buzz in, all in just seconds. As we shall see, competing with human champions at this game represents a Grand Challenge in Automatic Open-Domain Question Answering. <STOP><NEXT SLIDE>
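The buzz-in trade-off described in these notes can be illustrated with a small calculation. The following is a minimal sketch, not Watson's actual strategy: it assumes a correct answer wins the clue's dollar value and a wrong answer loses the same amount, so buzzing only pays off when the estimated confidence is above 50%. The function names and the `min_gain` parameter are hypothetical.

```python
def expected_gain(confidence: float, clue_value: float) -> float:
    """Expected score change from buzzing in.

    confidence: estimated probability that the top answer is correct.
    clue_value: dollar value of the clue (won if right, lost if wrong).
    """
    return confidence * clue_value - (1.0 - confidence) * clue_value


def should_buzz(confidence: float, clue_value: float, min_gain: float = 0.0) -> bool:
    """Buzz only when the expected gain exceeds min_gain (a hypothetical knob)."""
    return expected_gain(confidence, clue_value) > min_gain


if __name__ == "__main__":
    # Illustrative confidences for a $400 clue.
    for p in (0.4, 0.75, 0.9):
        print(p, expected_gain(p, 400), should_buzz(p, 400))
```

This is why accurate confidence matters as much as accurate answers: an overconfident system buzzes on clues where the expected gain is really negative.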
Main point: At the core of what makes Watson different are three powerful technologies: natural language, hypothesis generation, and evidence-based learning. But Watson is more than the sum of its individual parts. Watson is about bringing these capabilities together in a way that has never been done before, resulting in a fundamental change in the way businesses look at quickly solving problems. Solutions that learn with each iteration. Capable of navigating human communication. Dynamically evaluating hypotheses to questions asked. Responses optimized based on relevant data. Ingesting and analyzing Big Data. Discovering new patterns and insights in seconds. Further speaking points: Looking at these one by one, understanding natural language and the way we speak breaks down the communication barrier that has stood between people and their machines for so long. Hypothesis generation bypasses the historically deterministic way that computers function and recognizes that there are various probabilities of various outcomes rather than a single definitive “right” response. And adaptation and learning help Watson continuously improve in the same way that humans learn: it keeps track of which of its selections were chosen by users and which responses got positive feedback, thus improving future response generation. Additional information: The result is a machine that functions alongside us as an assistant rather than something we wrestle with to get an adequate outcome.
Here we see the same question and the same parse, but on the other side we see that there exists a passage containing the RIGHT answer BUT with only one key word in common. <read the green passage> The system must consider in parallel and in detail a huge amount of content just to get a SHOT at this evidence, and then must find and weigh the right inferences that will allow it to match and score with an accurate confidence, for example in this case <click> Date Math, Statistical Paraphrasing and Geospatial reasoning. And it's still not 100% certain. What if, for example, the passage said “considered landing in” rather than “landed in”, or what if there was just a preponderance of weaker evidence for another answer? Question Answering technology tries to understand what the user is really asking for and to deliver precise and correct responses. But natural language is hard. Meaning can be expressed in so many different ways, and to achieve high levels of precision and confidence you must consider much more information and analyze it much more deeply. What is needed is a radically different approach that explores many different plausible interpretations in parallel and collects and evaluates all sorts of evidence in support or in refutation of those possibilities.
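One way to picture how several kinds of evidence could be weighed into a single confidence is a weighted logistic combination. The sketch below is an illustration under stated assumptions, not the method these notes describe: the scorer names mirror the examples on the slide (keyword overlap, date math, paraphrasing, geospatial reasoning), while the weights, the bias and all scores are invented for the example.

```python
import math
from typing import Mapping

# Hypothetical weights per evidence scorer; a real system would learn these.
WEIGHTS = {
    "keyword_overlap": 1.0,
    "date_math": 1.5,
    "paraphrase": 1.5,
    "geospatial": 1.0,
}
BIAS = -2.5  # hypothetical: with no evidence, confidence stays low


def combine_evidence(scores: Mapping[str, float]) -> float:
    """Merge per-scorer evidence (each in [0, 1]) into one confidence in (0, 1)."""
    z = BIAS + sum(w * scores.get(name, 0.0) for name, w in WEIGHTS.items())
    return 1.0 / (1.0 + math.exp(-z))  # logistic squashing


if __name__ == "__main__":
    # A passage sharing only one keyword with the question...
    shallow = combine_evidence({"keyword_overlap": 0.2})
    # ...versus the same passage after deeper analysis finds supporting inferences.
    deep = combine_evidence(
        {"keyword_overlap": 0.2, "date_math": 0.9, "paraphrase": 0.8, "geospatial": 0.9}
    )
    print(f"keyword only: {shallow:.2f}, with deeper evidence: {deep:.2f}")
```

With keyword overlap alone the passage scores poorly; only the additional inferences lift it above even odds, which is the point the slide makes about shallow matching.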
Main point: Healthcare is a great example of how these challenges come to life. Physicians cannot keep up with the explosive growth of medical information, which is doubling every five years. Reading journals is the primary way new medical information is delivered, yet the vast majority of physicians don’t spend anywhere near enough time to keep up with it. Meanwhile, diagnosis, treatments, and preventable deaths leave huge room for improvement. Further speaking points: Imagine you’re in a hospital waiting room with 9 others waiting to be seen. Chances are, two of you are going to be misdiagnosed. Preventable medical errors kill 44K-98K Americans every year. That’s enough to fill a big college football stadium. Imagine what the total would be worldwide. As Steven Shapiro, Chief Medical and Scientific Officer at UPMC, says, “Medicine has become too complex (and only) about 20% of the knowledge clinicians use today is evidence based”. We just spoke about the gap between today’s IT needs and traditional IT. Surely, a new approach to IT can help address some of the healthcare difficulties described here. Additional information: statistics on right.
Main point: Watson is evolving to meet the unique challenges of healthcare. But it already has the core capabilities to handle a variety of healthcare-specific needs. Further speaking points: It understands natural language. While this is valuable in any industry, healthcare is especially qualitative and verbal in its operations and stands to benefit strongly from Watson’s conversational approach. It analyzes large volumes of unstructured data, which is especially prevalent in medicine: physician notes, medical journals, clinical trials, pathology results, blogs, Wikipedia. It generates and evaluates hypotheses and presents responses with confidence, just like healthcare professionals already do. Unlike some other fields, medicine has always looked at diagnosis and treatment options as probabilistic rather than ever identifying a single certain answer. It supports iterative dialogue to refine results, so as more information is made available, different hypotheses grow and shrink in likelihood. It learns from results over time. During Jeopardy! trials, Watson was trained iteratively so it could learn from its experience. The same applies in healthcare.
Main point: A great way to understand how Watson works is through a simulation. In this case, we can see how an incoming patient is diagnosed with increasing precision as more information is made available. Further speaking points: A woman describes her symptoms to her healthcare professional. Watson can handle alternate meanings and misspellings (e.g., her voice is “horse” rather than “hoarse”). Based on the symptoms alone, Watson considers five possible diagnoses and scores probabilities for each based on the evidence. In the case of present symptoms, influenza seems most likely. It then considers explicitly absent symptoms (no abdominal pain, no cough, no shortness of breath, etc.) and now considers diabetes the most probable diagnosis. It correlates the various symptoms and evaluates the relationships between them. So based on the symptoms alone, a UTI is most likely, with diabetes close behind. A family history shows strong diabetes likelihood, enough to outweigh the symptoms-driven UTI likelihood, but not by much. Her patient history brings UTI back up above diabetes. Looking at the medications she is using brings another hypothesis into the mix but does not alter the balance of the two most likely diagnoses. So tests are done for both. Tests confirm the presence of a UTI. Additional information: Note that the iterative process matches the process the physician would use in an unaided diagnosis. The same iterative process also applies in situations beyond healthcare.
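The way the leading diagnosis shifts as each new kind of evidence arrives can be mimicked with a simple multiplicative update followed by renormalization. This is a toy sketch only: the three hypotheses, the stage names and every factor below are invented to reproduce the order of events in the walkthrough, and none of it reflects Watson's real scoring or any clinical data.

```python
from typing import Dict, List, Tuple

Beliefs = Dict[str, float]


def update(beliefs: Beliefs, factors: Dict[str, float]) -> Beliefs:
    """Scale each hypothesis by how well the new evidence fits it, then renormalize."""
    scaled = {h: p * factors.get(h, 1.0) for h, p in beliefs.items()}
    total = sum(scaled.values())
    return {h: v / total for h, v in scaled.items()}


# Hypothetical evidence stages; a factor > 1 supports a hypothesis, < 1 counts against it.
STAGES: List[Tuple[str, Dict[str, float]]] = [
    ("present symptoms", {"influenza": 3.0, "diabetes": 1.5, "UTI": 2.0}),
    ("absent symptoms", {"influenza": 0.2, "diabetes": 2.0}),
    ("symptom correlations", {"UTI": 2.0}),
    ("family history", {"diabetes": 1.5}),
    ("patient history", {"UTI": 1.5}),
    ("test results", {"UTI": 5.0, "diabetes": 0.2}),
]


def run(stages: List[Tuple[str, Dict[str, float]]]) -> List[Tuple[str, str]]:
    """Return (stage name, leading hypothesis) after each stage, starting from equal beliefs."""
    hypotheses = ["influenza", "diabetes", "UTI"]
    beliefs: Beliefs = {h: 1.0 / len(hypotheses) for h in hypotheses}
    leaders = []
    for name, factors in stages:
        beliefs = update(beliefs, factors)
        leaders.append((name, max(beliefs, key=lambda h: beliefs[h])))
    return leaders


if __name__ == "__main__":
    for stage, leader in run(STAGES):
        print(f"after {stage}: {leader}")
```

Each stage only reweights the existing hypotheses, so no single piece of evidence decides the outcome; the ranking reflects everything seen so far, which is the iterative behavior the simulation is meant to show.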