Will artificial intelligence change how readers
use the research literature?
jake.lever@glasgow.ac.uk
Jake Lever, University of Glasgow
This talk will cover:
• Why text mining is going to get
bigger and bigger in science
• An application in precision
medicine
• What are these new large
language models and how do they
work?
• How will they fit into text mining?
• What’s next?
Overview of the talk
Text mining will become essential in science
• Too many papers to read!
• (for some areas of
science)
• Artificial intelligence
methods may help:
• Extract knowledge from
papers
• Summarise them
• More?
image generated with Midjourney
• We can now gather huge amounts of data on each person
• But what does it mean?
Biomedicine is becoming a “big data” science
https://www.genome.gov/about-genomics/fact-sheets/Sequencing-Human-Genome-cost
https://nanoporetech.com/about-us/news/oxford-nanopore-announces-ps100-million-140m-fundraising-global-investors
Interpreting biological experiments is getting more challenging
Gene    Event
AP2B1   Overexpressed
CDB10   Deletion
PDDC1   Point Mutation
TPSD1   Underexpressed
WFDC5   Promoter Methylated
BCL6B   Alternative Splicing
Search tools & knowledge bases are valuable
Search Tools
✅ Flexible to different queries
✅ Easier to maintain
✅ Deals well with new
literature
❌ May require users to read
(many) papers
❌ Cannot easily be used by
automated analyses
Knowledge Bases
✅ Little paper reading
✅ Structured & searchable
✅ Usable by automated
analyses
❌ Only good if a KB exists for
your problem
❌ Huge cost burden to create
and maintain
Could text mining be a solution?
Can we use natural language
processing (NLP) to create
knowledge bases for our
specific information need?
http://www.anthropologyofknowledge.nu/2015/11/10/the-challenge-of-digital-reading
Knowledge provenance is essential in biomedicine
Where did you get that information
to make that clinical/experimental
decision?
• Clinicians and scientists need
to see the underlying data
• There are often legal
ramifications for clinical
decision making
An application in precision medicine
What are large language models?
Where do large language models fit?
What’s coming next?
Primary application: Precision medicine
“The right drug to the right
patient at the right time”
• Relies heavily on the latest
research
• Groups around the world are
manually reviewing literature
constantly
Interpretation is the
bottleneck of precision
medicine
Good, Benjamin M., et al. "Organizing knowledge to enable personalization of medicine in cancer." Genome biology 15.8 (2014): 438.
• Creating a knowledge base
requires expert knowledge
• Biocurators need text
mining tools to help them
triage the literature
Biocuration
Pharmacogenomics Knowledge Base (PharmGKB)
Codeine
Morphine
Motivation
Can text-mining find relevant
knowledge in the literature to assist
curation?
“Our results show that the V433M mutation of the
CYP4F2 gene affects metabolism of warfarin.”
Step 1: Get available biomedical publications
Takeuchi, Fumihiko, et al. "A genome-wide association study confirms VKORC1, CYP2C9, and CYP4F2 as principal genetic determinants of warfarin dose." PLoS genetics 5.3 (2009): e1000433.
PubMed
PMC Open
Access subset
PMC Author
Manuscript
Example Paper
… After these quality control steps, a total of 1053 warfarin
patients and 325,997 GWAS SNPs were retained for analysis.
The GWAS SNPs included two SNPs not on the
HumanCNV370 array but which are highly predictive of
warfarin dose [rs9923231 (VKORC1) and rs1799853
(CYP2C9*2)] which we genotyped by TaqMan assay (Applied
Biosystems).
Defining CNV Regions
Although we retained 325,997 GWAS SNPs for association
testing of ...
Step 2: Find chemicals, mutations and genes
Annotations for:
• Genes
• Mutations
• Chemicals
• Diseases
• Species
• Cell lines
Step 3: Find sentences that mention chemical, mutation
and specific keywords
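Steps 2 and 3 can be pictured with a toy sketch. The chemical and keyword lists and the function names below are illustrative assumptions, not the PGxMine implementation, which would rely on dedicated entity annotation tools. Only the `rs` number pattern for SNP identifiers is a standard convention.

```python
import re

# Hypothetical mini-dictionaries; a real pipeline would use full entity annotators.
CHEMICALS = {"warfarin"}
KEYWORDS = {"dose", "metabolism", "response", "predictive"}
RSID = re.compile(r"\brs\d+\b")  # dbSNP-style identifiers such as rs9923231


def split_sentences(text):
    """Naive sentence splitter: break after ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def candidate_sentences(text):
    """Keep sentences that mention a chemical, a mutation and a keyword."""
    hits = []
    for sentence in split_sentences(text):
        words = set(re.findall(r"[a-z0-9]+", sentence.lower()))
        chemicals = sorted(CHEMICALS & words)
        mutations = RSID.findall(sentence)
        if chemicals and mutations and KEYWORDS & words:
            hits.append((sentence, chemicals, mutations))
    return hits


text = (
    "After these quality control steps, a total of 1053 warfarin patients "
    "and 325,997 GWAS SNPs were retained for analysis. "
    "The GWAS SNPs included two SNPs not on the HumanCNV370 array but which "
    "are highly predictive of warfarin dose [rs9923231 (VKORC1) and "
    "rs1799853 (CYP2C9*2)] which we genotyped by TaqMan assay (Applied Biosystems)."
)

for sentence, chemicals, mutations in candidate_sentences(text):
    print(chemicals, mutations)  # ['warfarin'] ['rs9923231', 'rs1799853']
```

The first sentence mentions warfarin but no mutation, so only the second sentence survives the filter and moves on to relation extraction.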
Step 4: Identify pharmacogenomic relations between
each chemical/mutation pair
• 1000 sentences manually annotated
• 80%/20% split for evaluation
• Avg Precision: 78%
• Avg Recall: 25%
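To make the two scores concrete, here is a minimal sketch of how precision and recall are computed. The counts are hypothetical, chosen only so that they reproduce the reported 78% and 25%; they are not figures from the talk. The F1 score is an extra summary that the slide does not report.

```python
def precision(tp, fp):
    """Fraction of extracted relations that are correct."""
    return tp / (tp + fp)


def recall(tp, fn):
    """Fraction of true relations that the system found."""
    return tp / (tp + fn)


def f1(p, r):
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r)


# Hypothetical counts (assumption): 39 correct extractions,
# 11 wrong extractions, 117 true relations that were missed.
tp, fp, fn = 39, 11, 117

p, r = precision(tp, fp), recall(tp, fn)
print(f"precision={p:.2f} recall={r:.2f} f1={f1(p, r):.2f}")
# precision=0.78 recall=0.25 f1=0.38
```

High precision with low recall means that what the system reports is mostly right, but it misses many relations. That trade-off suits a tool whose job is to point curators at promising papers.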
Step 5: Convert to structured data
The GWAS SNPs included two SNPs not on the HumanCNV370 array but
which are highly predictive of warfarin dose [rs9923231 (VKORC1) and
rs1799853 (CYP2C9*2)] which we genotyped by TaqMan assay (Applied
Biosystems).
Chemical   Gene     Mutation    PubMed ID
warfarin   VKORC1   rs9923231   19300499
warfarin   CYP2C9   rs1799853   19300499
warfarin   CYP2C9   *2          19300499
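As a rough illustration of what "structured data" means here, the rows above can be held as typed records and written out as tab-separated text. The class name `Association` and the function `to_tsv` are assumptions made for this sketch, not part of PGxMine.

```python
import csv
import io
from dataclasses import astuple, dataclass, fields


@dataclass(frozen=True)
class Association:
    chemical: str
    gene: str
    mutation: str
    pubmed_id: str


# The three rows shown on the slide.
rows = [
    Association("warfarin", "VKORC1", "rs9923231", "19300499"),
    Association("warfarin", "CYP2C9", "rs1799853", "19300499"),
    Association("warfarin", "CYP2C9", "*2", "19300499"),
]


def to_tsv(associations):
    """Serialise associations as tab-separated text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow([f.name for f in fields(Association)])
    writer.writerows(astuple(a) for a in associations)
    return buffer.getvalue()


print(to_tsv(rows))
```

Keeping the PubMed ID on every row preserves provenance: each extracted association can be traced back to the paper it came from.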
Results
# of papers                           7,170
# of sentences                        15,228
# of associations                     19,930
# of unique gene/mutation pairs       6,099
% of associations found in full-text  58.9
PGxMine Resource
• Relation extraction used
to mine
pharmacogenomics
sentences from PubMed
& PubMed Central
• Used by PharmGKB
curators to prioritize new
papers for publication
https://pgxmine.pharmgkb.org/
Lever, Jake, et al. "PGxMine: text mining for curation of PharmGKB." Pacific Symposium on Biocomputing 2020.
Successful evaluation by PharmGKB curators
Top 100 chemical/mutation associations not in PharmGKB
• 57 lead directly to at least one curatable paper
  ○ 37 to one paper
  ○ 16 to two papers
  ○ 3 to three papers
  ○ 1 to five papers
• 24 would likely lead indirectly to curatable papers through citations
• 19 did not lead to curatable papers
83 curatable papers found directly with
PGxMine in top 100 hits!
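The headline figure follows from the breakdown on the slide. A short check, assuming that no paper is shared between two associations (the slide does not say either way):

```python
# Associations that led directly to N curatable papers (numbers from the slide).
direct_hits = {1: 37, 2: 16, 3: 3, 5: 1}
indirect = 24   # would likely lead to papers through citations
no_paper = 19   # did not lead to curatable papers

associations_with_direct_paper = sum(direct_hits.values())
curatable_papers = sum(n * count for n, count in direct_hits.items())
total_associations = associations_with_direct_paper + indirect + no_paper

print(associations_with_direct_paper)  # 57
print(curatable_papers)                # 83
print(total_associations)              # 100
```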
Now integrated
into PharmGKB
as automated
annotations
What are large language models?
Language models can calculate the probability of the next word
Can you guess the next word?
Peter Piper picked a peck of pickled peppers,
A peck of pickled peppers Peter Piper picked;
If Peter Piper picked a peck of pickled …?
From previous text:
“peppers” appears 2/2
times after “pickled”
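The counting behind that "2/2" can be sketched in a few lines. This is a toy bigram model, meaning it looks only at the single previous word. The function name is an assumption for illustration, and modern language models are far more elaborate.

```python
import re
from collections import Counter, defaultdict


def next_word_probabilities(text, previous_word):
    """Estimate P(next word | previous word) by counting adjacent word pairs."""
    words = re.findall(r"[a-z']+", text.lower())
    followers = defaultdict(Counter)
    for current, following in zip(words, words[1:]):
        followers[current][following] += 1
    counts = followers[previous_word]
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()}


rhyme = (
    "Peter Piper picked a peck of pickled peppers, "
    "A peck of pickled peppers Peter Piper picked; "
    "If Peter Piper picked a peck of pickled"
)

print(next_word_probabilities(rhyme, "pickled"))  # {'peppers': 1.0}
```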
• Look at lots of text (scraped from the internet)
• Some serious problems with bias and hate speech
• See which words follow one another
Where do these probabilities come from?
Example corpora from the recent Gopher language model paper
Rae, Jack W., et al. "Scaling language models: Methods, analysis & insights from training gopher." arXiv preprint arXiv:2112.11446 (2021).
You need more than the previous word to guess the next
it showed 9 o’clock on my ?
Some words are more important for context
Language models have often been named after Muppet characters:
• ELMo
• BERT
BERT used a new idea: Transformers
Language models with deep learning
https://muppet.fandom.com/wiki/Sesame_Street
Playing language games
Predict the next word: It showed 9 o’clock on my __________
Predict a masked word: It showed 9 _________ on my watch
Spot the corrupted word: It showed 9 o’clock on my stapler -> stapler is wrong
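Each of these games turns ordinary text into a training puzzle with a known answer, so no manual labelling is needed. A minimal sketch of how the three puzzle types could be generated from one sentence; the function names and the tiny replacement vocabulary are assumptions for illustration only.

```python
import random


def next_word_example(words):
    """Hide the final word; the model must predict it."""
    return " ".join(words[:-1] + ["____"]), words[-1]


def masked_word_example(words, position):
    """Hide one word inside the sentence; the model must fill it in."""
    masked = words[:position] + ["____"] + words[position + 1:]
    return " ".join(masked), words[position]


def corrupted_word_example(words, position, vocabulary, rng):
    """Swap one word for a different random word; the model must spot where."""
    replacement = rng.choice([w for w in vocabulary if w != words[position]])
    corrupted = words[:position] + [replacement] + words[position + 1:]
    return " ".join(corrupted), position


sentence = "It showed 9 o'clock on my watch".split()
rng = random.Random(0)

print(next_word_example(sentence))       # ("It showed 9 o'clock on my ____", 'watch')
print(masked_word_example(sentence, 3))  # ('It showed 9 ____ on my watch', "o'clock")
print(corrupted_word_example(sentence, 6, ["stapler", "watch", "river"], rng))
```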
• Building and using a language model requires
GPUs become essential and expensive!
https://www.scan.co.uk/products/pny-nvidia-a100-80gb-hbm2-graphics-card-6912-cores-195-tflops-sp-97-tflops-dp
• Only large companies can build
these very large language models
And scale up!
https://www.theverge.com/2023/3/13/23637675/microsoft-chatgpt-bing-millions-dollars-supercomputer-openai
Banks, Carl. The Sport of Tycoons. 1974
Once upon a time …
Using a language model to generate new text
Word Probability
there 65 / 100
a 22 / 100
it 11 / 100
the 3 / 100
Once upon a time there …
Using a language model to generate new text
Word Probability
was 61 / 100
were 23 / 100
had 3 / 100
could 1 / 100
Once upon a time there was a young man who lived
down by a river. He did not want to go to school so he…
Using a language model to generate new text
● Keep generating
word-by-word
● Picking the most likely word
doesn’t create the most
interesting text
○ So pick randomly but
weight by the probabilities
Word Probability
ran 22 / 100
hid 21 / 100
told 15 / 100
stole 6 / 100
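The difference between always picking the most likely word and picking randomly in proportion to the probabilities can be sketched as below. The table only lists four candidate words, so the sketch samples among those four. `random.choices` accepts unnormalised weights, so the values do not need to sum to 100.

```python
import random


def pick_most_likely(probabilities):
    """Greedy choice: always return the highest-probability word."""
    return max(probabilities, key=probabilities.get)


def sample_next_word(probabilities, rng):
    """Random choice, weighted by each word's probability."""
    words = list(probabilities)
    weights = list(probabilities.values())
    return rng.choices(words, weights=weights, k=1)[0]


# Values out of 100, as listed on the slide.
next_word = {"ran": 22, "hid": 21, "told": 15, "stole": 6}

rng = random.Random(42)
print(pick_most_likely(next_word))  # ran
print([sample_next_word(next_word, rng) for _ in range(5)])
```

Greedy picking would write "ran" every time; weighted sampling lets "hid" or "told" appear almost as often, which is what makes generated stories differ from run to run.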
Do language models know anything?
The largest city in Scotland is …
Word Probability
Glasgow 56 / 100
Edinburgh 35 / 100
London 7 / 100
Dundee 2 / 100
How?
The huge amount
of text used to train
the system must
have contained
text about Scottish
cities
But, language models will hallucinate
Source: https://www.unite.ai/preventing-hallucination-in-gpt-3-and-other-complex-language-models/
• Stochastic parrots paper
• Fantastic summary of the
limitations of large language
models
• Language models encode the
biases in the text used to train
them
• They don’t “know” anything - they
just regurgitate grammatical
patterns that they’ve seen
Language models are stochastic parrots
Bender, Emily M., et al. "On the Dangers of Stochastic Parrots: Can Language Models Be Too Big?🦜." Proceedings of the 2021 ACM conference on fairness, accountability, and transparency. 2021.
Where do large language models fit?
• Prone to confident
hallucination
• Probably mostly factual -
but really difficult to spot
mistakes
• Can’t cite their sources
• But they are very good at
working with language
Asking language models directly for knowledge is tricky
Feed text to a large language model and ask for interpretation
Language is
complex and large
language models
can summarize
Hall, Jeff M., et al. "Linkage of early-onset familial breast cancer to chromosome 17q21." Science 250.4988 (1990): 1684-1689.
They can also
extract structured
knowledge
• Ongoing work to understand if they
are much better than existing
methods
• Writing tasks as “prompts” is a bit of
an art
How good are they at useful language tasks?
• This article does not
exist
• Sometimes you get lucky and it cites real things.
Other times, it may make up publications
(with authors, journals, DOIs, etc.).
For better or worse, readers will ask chat systems for knowledge
What’s coming next?
• What happens when there is a new prime minister?
• Language models are trapped by the text they are trained
on
• ChatGPT was trained on text up to 2021
Language models get stuck in the past
The current prime minister of the UK is …
• Allow a language
model to use a
search engine to
help it pick the next
word
• Would enable citing
of sources!
• But are the sources
good?
Retrieval-enhanced
language models
Guu, Kelvin, et al. "Retrieval augmented language model pre-training." International conference on machine learning. PMLR, 2020.
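The retrieval idea can be pictured with a deliberately simple sketch: find the stored text that best matches the question and put it, with its source label, in front of the language model. This is a toy word-overlap retriever, not the method of the paper cited above, which trains the retriever together with the model. The documents, source labels, and function names are placeholders invented for illustration.

```python
import re

# Hypothetical mini-collection; texts and source labels are placeholders.
DOCUMENTS = [
    {"source": "doc-1", "text": "Glasgow is the largest city in Scotland by population."},
    {"source": "doc-2", "text": "Warfarin dose is influenced by variants in VKORC1 and CYP2C9."},
]


def tokens(text):
    """Lower-cased set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question, documents):
    """Return the document sharing the most words with the question."""
    return max(documents, key=lambda d: len(tokens(question) & tokens(d["text"])))


def build_prompt(question, documents):
    """Prepend the retrieved text and its source label to the question."""
    best = retrieve(question, documents)
    return (
        f"Context [{best['source']}]: {best['text']}\n"
        f"Question: {question}\n"
        "Answer (cite the context source):"
    )


print(build_prompt("Which genes influence warfarin dose?", DOCUMENTS))
```

Because the source label travels with the retrieved text, the answer can point back to it. Whether that source is any good remains a separate question.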
The Bing chatbot (using GPT-4) can cite sources
But even a language model with sources can still hallucinate
• Scientists cannot read all the
necessary papers so text
mining can help
• Large language models
seem strangely good at
many general tasks
• There is a lot of hype
• We need to be sceptical
about their abilities and
accuracy
Conclusions
Acknowledgements
Jones Lab @ UBC
● Steven Jones
● Martin Jones
● Eric Zhao
● Jasleen Grewal
● Luka Culibrk
● Melika Bonakdar
Griffith Lab @ WashU
● Obi Griffith
● Malachi Griffith
● Kilannin Krysiak
● Arpad Danos
● Jason Saliba
Helix Lab @ Stanford
● Russ Altman
● Teri Klein
● Michelle Whirl-Carrillo
● PharmGKB team
jake.lever@glasgow.ac.uk

UKSG 2023 - Will artificial intelligence change how readers use the research literature?

UKSG 2024 - Author Identity Metadata: Why a Small Publisher Can Address a Maj...UKSG: connecting the knowledge community
 
UKSG 2024 - Captivate, Connect, and Convert: Unlocking the art of Collections...
UKSG 2024 - Captivate, Connect, and Convert: Unlocking the art of Collections...UKSG 2024 - Captivate, Connect, and Convert: Unlocking the art of Collections...
UKSG 2024 - Captivate, Connect, and Convert: Unlocking the art of Collections...UKSG: connecting the knowledge community
 
UKSG 2024 - A critical review of transitional agreements in the UK: why, how,...
UKSG 2024 - A critical review of transitional agreements in the UK: why, how,...UKSG 2024 - A critical review of transitional agreements in the UK: why, how,...
UKSG 2024 - A critical review of transitional agreements in the UK: why, how,...UKSG: connecting the knowledge community
 
UKSG 2024 - What next for sustainable open scholarship? The Cambridge Univers...
UKSG 2024 - What next for sustainable open scholarship? The Cambridge Univers...UKSG 2024 - What next for sustainable open scholarship? The Cambridge Univers...
UKSG 2024 - What next for sustainable open scholarship? The Cambridge Univers...UKSG: connecting the knowledge community
 

Mais de UKSG: connecting the knowledge community (20)

UKSG 2024 Plenary 4 - Combining Open Access research and large language model...
UKSG 2024 Plenary 4 - Combining Open Access research and large language model...UKSG 2024 Plenary 4 - Combining Open Access research and large language model...
UKSG 2024 Plenary 4 - Combining Open Access research and large language model...
 
UKSG 2024 Plenary 3 - There is No List: (How) Can We Combat “Predatory” Publi...
UKSG 2024 Plenary 3 - There is No List: (How) Can We Combat “Predatory” Publi...UKSG 2024 Plenary 3 - There is No List: (How) Can We Combat “Predatory” Publi...
UKSG 2024 Plenary 3 - There is No List: (How) Can We Combat “Predatory” Publi...
 
UKSG 2024 Plenary 2 - Let's Talk About Green
UKSG 2024 Plenary 2 - Let's Talk About GreenUKSG 2024 Plenary 2 - Let's Talk About Green
UKSG 2024 Plenary 2 - Let's Talk About Green
 
UKSG 2024 Plenary 2 - Are we there yet? A review of transitional agreements i...
UKSG 2024 Plenary 2 - Are we there yet? A review of transitional agreements i...UKSG 2024 Plenary 2 - Are we there yet? A review of transitional agreements i...
UKSG 2024 Plenary 2 - Are we there yet? A review of transitional agreements i...
 
UKSG 2024 Plenary 2 - What did we Read, What did we Publish: Distilling the d...
UKSG 2024 Plenary 2 - What did we Read, What did we Publish: Distilling the d...UKSG 2024 Plenary 2 - What did we Read, What did we Publish: Distilling the d...
UKSG 2024 Plenary 2 - What did we Read, What did we Publish: Distilling the d...
 
UKSG 2024 Lightning 2 - How GetFTR Supports Discovery and Access of OA Content
UKSG 2024 Lightning 2 - How GetFTR Supports Discovery and Access of OA ContentUKSG 2024 Lightning 2 - How GetFTR Supports Discovery and Access of OA Content
UKSG 2024 Lightning 2 - How GetFTR Supports Discovery and Access of OA Content
 
UKSG 2024 Lightning 2 - Advocating for data sharing: messaging frameworks for...
UKSG 2024 Lightning 2 - Advocating for data sharing: messaging frameworks for...UKSG 2024 Lightning 2 - Advocating for data sharing: messaging frameworks for...
UKSG 2024 Lightning 2 - Advocating for data sharing: messaging frameworks for...
 
UKSG 2024 Lightning 2 - All Watched Over By Machines That Love Open Research
UKSG 2024 Lightning 2 - All Watched Over By Machines That Love Open ResearchUKSG 2024 Lightning 2 - All Watched Over By Machines That Love Open Research
UKSG 2024 Lightning 2 - All Watched Over By Machines That Love Open Research
 
UKSG 2024 Lightning 1 - Responding to the UN SDG Publishers Compact – Bristol...
UKSG 2024 Lightning 1 - Responding to the UN SDG Publishers Compact – Bristol...UKSG 2024 Lightning 1 - Responding to the UN SDG Publishers Compact – Bristol...
UKSG 2024 Lightning 1 - Responding to the UN SDG Publishers Compact – Bristol...
 
UKSG 2024 Lightning 1 - Practical steps towards an open research culture: Bui...
UKSG 2024 Lightning 1 - Practical steps towards an open research culture: Bui...UKSG 2024 Lightning 1 - Practical steps towards an open research culture: Bui...
UKSG 2024 Lightning 1 - Practical steps towards an open research culture: Bui...
 
UKSG 2024 - Open infrastructure and standards: small bodies, big impact
UKSG 2024 - Open infrastructure and standards: small bodies, big impactUKSG 2024 - Open infrastructure and standards: small bodies, big impact
UKSG 2024 - Open infrastructure and standards: small bodies, big impact
 
UKSG 2024 - Reckoning or Retreat? A Longitudinal Look at DEIA in Scholarly Co...
UKSG 2024 - Reckoning or Retreat? A Longitudinal Look at DEIA in Scholarly Co...UKSG 2024 - Reckoning or Retreat? A Longitudinal Look at DEIA in Scholarly Co...
UKSG 2024 - Reckoning or Retreat? A Longitudinal Look at DEIA in Scholarly Co...
 
UKSG 2024 - You don't know what you've got till it's gone: Future directions ...
UKSG 2024 - You don't know what you've got till it's gone: Future directions ...UKSG 2024 - You don't know what you've got till it's gone: Future directions ...
UKSG 2024 - You don't know what you've got till it's gone: Future directions ...
 
UKSG 2024 - Vision, mission, passion: how UK University Presses collaborate t...
UKSG 2024 - Vision, mission, passion: how UK University Presses collaborate t...UKSG 2024 - Vision, mission, passion: how UK University Presses collaborate t...
UKSG 2024 - Vision, mission, passion: how UK University Presses collaborate t...
 
UKSG - 2024 - Fostering an Open Research culture: ARU's Graduate Trainee Seco...
UKSG - 2024 - Fostering an Open Research culture: ARU's Graduate Trainee Seco...UKSG - 2024 - Fostering an Open Research culture: ARU's Graduate Trainee Seco...
UKSG - 2024 - Fostering an Open Research culture: ARU's Graduate Trainee Seco...
 
UKSG 2024 - Creating credibility through community: Encouraging high quality ...
UKSG 2024 - Creating credibility through community: Encouraging high quality ...UKSG 2024 - Creating credibility through community: Encouraging high quality ...
UKSG 2024 - Creating credibility through community: Encouraging high quality ...
 
UKSG 2024 - Author Identity Metadata: Why a Small Publisher Can Address a Maj...
UKSG 2024 - Author Identity Metadata: Why a Small Publisher Can Address a Maj...UKSG 2024 - Author Identity Metadata: Why a Small Publisher Can Address a Maj...
UKSG 2024 - Author Identity Metadata: Why a Small Publisher Can Address a Maj...
 
UKSG 2024 - Captivate, Connect, and Convert: Unlocking the art of Collections...
UKSG 2024 - Captivate, Connect, and Convert: Unlocking the art of Collections...UKSG 2024 - Captivate, Connect, and Convert: Unlocking the art of Collections...
UKSG 2024 - Captivate, Connect, and Convert: Unlocking the art of Collections...
 
UKSG 2024 - A critical review of transitional agreements in the UK: why, how,...
UKSG 2024 - A critical review of transitional agreements in the UK: why, how,...UKSG 2024 - A critical review of transitional agreements in the UK: why, how,...
UKSG 2024 - A critical review of transitional agreements in the UK: why, how,...
 
UKSG 2024 - What next for sustainable open scholarship? The Cambridge Univers...
UKSG 2024 - What next for sustainable open scholarship? The Cambridge Univers...UKSG 2024 - What next for sustainable open scholarship? The Cambridge Univers...
UKSG 2024 - What next for sustainable open scholarship? The Cambridge Univers...
 

Último

Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3JemimahLaneBuaron
 
Sanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfSanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfsanyamsingh5019
 
Russian Call Girls in Andheri Airport Mumbai WhatsApp 9167673311 💞 Full Nigh...
Russian Call Girls in Andheri Airport Mumbai WhatsApp  9167673311 💞 Full Nigh...Russian Call Girls in Andheri Airport Mumbai WhatsApp  9167673311 💞 Full Nigh...
Russian Call Girls in Andheri Airport Mumbai WhatsApp 9167673311 💞 Full Nigh...Pooja Nehwal
 
Web & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdfWeb & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdfJayanti Pande
 
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...Krashi Coaching
 
Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)eniolaolutunde
 
Paris 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityParis 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityGeoBlogs
 
1029 - Danh muc Sach Giao Khoa 10 . pdf
1029 -  Danh muc Sach Giao Khoa 10 . pdf1029 -  Danh muc Sach Giao Khoa 10 . pdf
1029 - Danh muc Sach Giao Khoa 10 . pdfQucHHunhnh
 
APM Welcome, APM North West Network Conference, Synergies Across Sectors
APM Welcome, APM North West Network Conference, Synergies Across SectorsAPM Welcome, APM North West Network Conference, Synergies Across Sectors
APM Welcome, APM North West Network Conference, Synergies Across SectorsAssociation for Project Management
 
9548086042 for call girls in Indira Nagar with room service
9548086042  for call girls in Indira Nagar  with room service9548086042  for call girls in Indira Nagar  with room service
9548086042 for call girls in Indira Nagar with room servicediscovermytutordmt
 
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111Sapana Sha
 
Interactive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationInteractive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationnomboosow
 
Disha NEET Physics Guide for classes 11 and 12.pdf
Disha NEET Physics Guide for classes 11 and 12.pdfDisha NEET Physics Guide for classes 11 and 12.pdf
Disha NEET Physics Guide for classes 11 and 12.pdfchloefrazer622
 
Activity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfActivity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfciinovamais
 
Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...
Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...
Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...fonyou31
 
Accessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactAccessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactdawncurless
 
1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdfQucHHunhnh
 
The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13Steve Thomason
 

Último (20)

Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3
 
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
 
Sanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfSanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdf
 
Russian Call Girls in Andheri Airport Mumbai WhatsApp 9167673311 💞 Full Nigh...
Russian Call Girls in Andheri Airport Mumbai WhatsApp  9167673311 💞 Full Nigh...Russian Call Girls in Andheri Airport Mumbai WhatsApp  9167673311 💞 Full Nigh...
Russian Call Girls in Andheri Airport Mumbai WhatsApp 9167673311 💞 Full Nigh...
 
Web & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdfWeb & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdf
 
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
 
Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)
 
Paris 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityParis 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activity
 
1029 - Danh muc Sach Giao Khoa 10 . pdf
1029 -  Danh muc Sach Giao Khoa 10 . pdf1029 -  Danh muc Sach Giao Khoa 10 . pdf
1029 - Danh muc Sach Giao Khoa 10 . pdf
 
APM Welcome, APM North West Network Conference, Synergies Across Sectors
APM Welcome, APM North West Network Conference, Synergies Across SectorsAPM Welcome, APM North West Network Conference, Synergies Across Sectors
APM Welcome, APM North West Network Conference, Synergies Across Sectors
 
9548086042 for call girls in Indira Nagar with room service
9548086042  for call girls in Indira Nagar  with room service9548086042  for call girls in Indira Nagar  with room service
9548086042 for call girls in Indira Nagar with room service
 
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
 
Interactive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationInteractive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communication
 
Disha NEET Physics Guide for classes 11 and 12.pdf
Disha NEET Physics Guide for classes 11 and 12.pdfDisha NEET Physics Guide for classes 11 and 12.pdf
Disha NEET Physics Guide for classes 11 and 12.pdf
 
Mattingly "AI & Prompt Design: The Basics of Prompt Design"
Mattingly "AI & Prompt Design: The Basics of Prompt Design"Mattingly "AI & Prompt Design: The Basics of Prompt Design"
Mattingly "AI & Prompt Design: The Basics of Prompt Design"
 
Activity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfActivity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdf
 
Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...
Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...
Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...
 
Accessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactAccessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impact
 
1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdf
 
The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13
 

UKSG 2023 - Will artificial intelligence change how readers use the research literature?

  • 1. Will artificial intelligence change how readers use the research literature? jake.lever@glasgow.ac.uk Jake Lever, University of Glasgow
  • 2. This talk will cover: • Why text mining is going to get bigger and bigger in science • An application in precision medicine • What are these new large language models and how do they work? • How will they fit into text mining? • What’s next? Overview of the talk
  • 3. Text mining will become essential in science • Too many papers to read! • (for some areas of science) • Artificial intelligence methods may help: • Extract knowledge from papers • Summarise them • More? image generated with Midjourney
  • 4. • We can now gather huge amounts of data on each person • But what does it mean? Biomedicine is becoming a “big data” science https://www.genome.gov/about-genomics/fact-sheets/Sequencing-Human-Genome-cost https://nanoporetech.com/about-us/news/oxford-nanopore-announces-ps100-million-140m-fundraising-global-investors
  • 5. Interpreting biological experiments is getting more challenging. Gene: Event. AP2B1: Overexpressed; CDB10: Deletion; PDDC1: Point Mutation; TPSD1: Underexpressed; WFDC5: Promoter Methylated; BCL6B: Alternative Splicing
  • 6. Search tools & knowledge bases are valuable Search Tools Knowledge Bases
  • 7. Search tools & knowledge bases are valuable Search Tools ✅ Flexible to different queries ✅ Easier to maintain ✅ Deals well with new literature ❌ May require users to read (many) papers ❌ Cannot easily be used by automated analyses
  • 8. Search tools & knowledge bases are valuable Search Tools ✅ Flexible to different queries ✅ Easier to maintain ✅ Deals well with new literature ❌ May require users to read (many) papers ❌ Cannot easily be used by automated analyses Knowledge Bases ✅ Little paper reading ✅ Structured & searchable ✅ Usable by automated analyses ❌ Only good if a KB exists for your problem ❌ Huge cost burden to create and maintain
  • 9. Could text mining be a solution? Can we use natural language processing (NLP) to create knowledge bases for our specific information need? http://www.anthropologyofknowledge.nu/2015/11/10/the-challenge-of-digital-reading
  • 10. Knowledge provenance is essential in biomedicine Where did you get that information to make that clinical/experimental decision? • Clinicians and scientists need to see the underlying data • There are often legal ramifications for clinical decision making
  • 11. An application in precision medicine What are large language models? Where do large language models fit? What’s coming next?
  • 12. An application in precision medicine What are large language models? Where do large language models fit? What’s coming next?
  • 13. Primary application: Precision medicine “The right drug to the right patient at the right time” • Relies heavily on the latest research • Groups around the world are manually reviewing literature constantly
  • 14. Interpretation is the bottleneck of precision medicine Good, Benjamin M., et al. "Organizing knowledge to enable personalization of medicine in cancer." Genome biology 15.8 (2014): 438.
  • 15. • Creating a knowledge base requires expert knowledge • Biocurators need text mining tools to help them triage the literature Biocuration
  • 16. Pharmacogenomics Knowledge Base (PharmGKB) Codeine Morphine
  • 17. Motivation Can text-mining find relevant knowledge in the literature to assist curation? “Our results show that the V433M mutation of the CYP4F2 gene affects metabolism of warfarin.”
  • 18. Step 1: Get available biomedical publications Takeuchi, Fumihiko, et al. "A genome-wide association study confirms VKORC1, CYP2C9, and CYP4F2 as principal genetic determinants of warfarin dose." PLoS genetics 5.3 (2009): e1000433. PubMed PMC Open Access subset PMC Author Manuscript Example Paper … After these quality control steps, a total of 1053 warfarin patients and 325,997 GWAS SNPs were retained for analysis. The GWAS SNPs included two SNPs not on the HumanCNV370 array but which are highly predictive of warfarin dose [rs9923231 (VKORC1) and rs1799853 (CYP2C9*2)] which we genotyped by TaqMan assay (Applied Biosystems). Defining CNV Regions Although we retained 325,997 GWAS SNPs for association testing of ...
  • 19. Step 2: Find chemicals, mutations and genes Takeuchi, Fumihiko, et al. "A genome-wide association study confirms VKORC1, CYP2C9, and CYP4F2 as principal genetic determinants of warfarin dose." PLoS genetics 5.3 (2009): e1000433. Annotations for: • Genes • Mutations • Chemicals • Diseases • Species • Cell lines Example Paper … After these quality control steps, a total of 1053 warfarin patients and 325,997 GWAS SNPs were retained for analysis. The GWAS SNPs included two SNPs not on the HumanCNV370 array but which are highly predictive of warfarin dose [rs9923231 (VKORC1) and rs1799853 (CYP2C9*2)] which we genotyped by TaqMan assay (Applied Biosystems). Defining CNV Regions Although we retained 325,997 GWAS SNPs for association testing of ...
  • 20. Step 3: Find sentences that mention chemical, mutation and specific keywords Takeuchi, Fumihiko, et al. "A genome-wide association study confirms VKORC1, CYP2C9, and CYP4F2 as principal genetic determinants of warfarin dose." PLoS genetics 5.3 (2009): e1000433. Example Paper … After these quality control steps, a total of 1053 warfarin patients and 325,997 GWAS SNPs were retained for analysis. The GWAS SNPs included two SNPs not on the HumanCNV370 array but which are highly predictive of warfarin dose [rs9923231 (VKORC1) and rs1799853 (CYP2C9*2)] which we genotyped by TaqMan assay (Applied Biosystems). Defining CNV Regions Although we retained 325,997 GWAS SNPs for association testing of ...
  • 21. Step 4: Identify pharmacogenomic relations between each chemical/mutation pair Takeuchi, Fumihiko, et al. "A genome-wide association study confirms VKORC1, CYP2C9, and CYP4F2 as principal genetic determinants of warfarin dose." PLoS genetics 5.3 (2009): e1000433. Example Paper … After these quality control steps, a total of 1053 warfarin patients and 325,997 GWAS SNPs were retained for analysis. The GWAS SNPs included two SNPs not on the HumanCNV370 array but which are highly predictive of warfarin dose [rs9923231 (VKORC1) and rs1799853 (CYP2C9*2)] which we genotyped by TaqMan assay (Applied Biosystems). Defining CNV Regions Although we retained 325,997 GWAS SNPs for association testing of ... Evaluation: 1000 sentences manually annotated; 80/20% split for evaluation; Avg Precision: 78%; Avg Recall: 25%
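The precision and recall figures quoted for Step 4 can be read as follows: precision is the share of extracted relations that are correct, and recall is the share of annotated relations that the system finds. A minimal sketch of the two formulas, assuming made-up counts (`tp`, `fp`, `fn` below are hypothetical and chosen only so the ratios match 78% and 25%; they are not the talk's data):

```python
def precision_recall(tp, fp, fn):
    """Return (precision, recall) from true-positive, false-positive
    and false-negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

# Hypothetical counts, NOT from the talk: 39 correct extractions,
# 11 incorrect extractions, 117 annotated relations that were missed.
precision, recall = precision_recall(tp=39, fp=11, fn=117)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

High precision with low recall suits curation triage: most of what is surfaced is worth a curator's time, even though many relations in the literature are missed.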
  • 22. Step 5: Convert to structured data The GWAS SNPs included two SNPs not on the HumanCNV370 array but which are highly predictive of warfarin dose [rs9923231 (VKORC1) and rs1799853 (CYP2C9*2)] which we genotyped by TaqMan assay (Applied Biosystems). Columns: Chemical, Gene, Mutation, PubMed ID. Rows: (warfarin, VKORC1, rs9923231, 19300499); (warfarin, CYP2C9, rs1799853, 19300499); (warfarin, CYP2C9, *2, 19300499)
  • 23. Results: # of papers: 7,170; # of sentences: 15,228; # of associations: 19,930; # of unique gene/mutation pairs: 6,099; % of associations found in full-text: 58.9
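Steps 3 to 5 above can be sketched in a few lines. This is a simplified illustration, not the PGxMine implementation: the lexicons `CHEMICALS` and `GENES` are hypothetical stand-ins for a proper entity tagger, a regular expression replaces the trained relation classifier of Step 4, and star alleles such as `*2` are not handled.

```python
import re

# Hypothetical mini-lexicons; a real pipeline would use entity annotations.
CHEMICALS = {"warfarin"}
GENES = {"VKORC1", "CYP2C9", "CYP4F2"}

# An rsID followed by a gene symbol in parentheses, e.g. "rs9923231 (VKORC1)".
RSID_WITH_GENE = re.compile(r"\b(rs\d+)\s*\(([A-Za-z0-9]+)")

def extract_rows(sentence, pubmed_id):
    """Return (chemical, gene, mutation, pubmed_id) rows for one sentence."""
    lowered = sentence.lower()
    chemicals = [c for c in sorted(CHEMICALS) if c in lowered]
    rows = []
    for match in RSID_WITH_GENE.finditer(sentence):
        mutation, gene = match.group(1), match.group(2)
        if gene in GENES:
            for chemical in chemicals:
                rows.append((chemical, gene, mutation, pubmed_id))
    return rows

sentence = ("The GWAS SNPs included two SNPs not on the HumanCNV370 array but "
            "which are highly predictive of warfarin dose [rs9923231 (VKORC1) "
            "and rs1799853 (CYP2C9*2)] which we genotyped by TaqMan assay.")
rows = extract_rows(sentence, "19300499")
for row in rows:
    print(row)
```

Each row keeps the PubMed ID, so every extracted association can be traced back to the sentence and paper it came from.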
  • 24. PGxMine Resource • Relation extraction used to mine pharmacogenomics sentences from PubMed & PubMed Central • Used by PharmGKB curators to prioritize new papers for publication https://pgxmine.pharmgkb.org/ Lever, Jake, et al. "PGxMine: text mining for curation of PharmGKB." Pacific Symposium on Biocomputing 2020.
  • 25. Successful evaluation by PharmGKB curators. Top 100 chemical/mutation associations not in PharmGKB: 57 lead directly to at least one curatable paper (37 to one paper, 16 to two papers, 3 to three papers, 1 to five papers); 24 would likely lead indirectly to curatable papers through citations; 19 did not lead to curatable papers. 83 curatable papers found directly with PGxMine in top 100 hits! Lever, Jake, et al. "PGxMine: text mining for curation of PharmGKB." Pacific Symposium on Biocomputing 2020.
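The totals on the evaluation slide can be checked with a little arithmetic. The sketch below only re-adds the numbers quoted above; the variable names are illustrative.

```python
# Number of associations that led directly to N curatable papers,
# keyed by N, as quoted on the slide.
papers_per_association = {1: 37, 2: 16, 3: 3, 5: 1}

direct_associations = sum(papers_per_association.values())
curatable_papers = sum(n * count for n, count in papers_per_association.items())

# Outcome of each of the top 100 associations.
outcomes = {"direct": direct_associations, "indirect": 24, "none": 19}

print(direct_associations)     # associations leading directly to a paper
print(curatable_papers)        # papers found directly
print(sum(outcomes.values()))  # associations reviewed
```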
  • 26. Now integrated into PharmGKB as automated annotations Lever, Jake, et al. "PGxMine: text mining for curation of PharmGKB." Pacific Symposium on Biocomputing 2020.
  • 27. An application in precision medicine What are large language models? Where do large language models fit? What’s coming next?
  • 28. Language models can calculate the probability of the next word
  • 29. Can you guess the next word? Peter Piper picked a peck of pickled peppers, A peck of pickled peppers Peter Piper picked; If Peter Piper picked a peck of pickled …?
  • 30. From previous text: “peppers” appears 2/2 times after “pickled” Can you guess the next word? Peter Piper picked a peck of pickled peppers, A peck of pickled peppers Peter Piper picked; If Peter Piper picked a peck of pickled …?
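The counting idea on the slide above can be sketched as a tiny bigram model: tally which word follows each word, then turn the tallies into probabilities. This is a toy illustration of the principle, not how large language models are built.

```python
import re
from collections import Counter, defaultdict

text = ("Peter Piper picked a peck of pickled peppers, "
        "A peck of pickled peppers Peter Piper picked; "
        "If Peter Piper picked a peck of pickled")

# Lower-case the rhyme and keep only the words.
words = re.findall(r"[a-z]+", text.lower())

# following[w] counts the words seen immediately after w.
following = defaultdict(Counter)
for current, nxt in zip(words, words[1:]):
    following[current][nxt] += 1

def next_word_probabilities(word):
    """Estimate P(next word | word) from the counts."""
    counts = following[word]
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

probs = next_word_probabilities("pickled")
print(probs)
```

In this text "pickled" is followed by "peppers" both times it has a successor, so "peppers" receives all of the probability mass.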
  • 31. • Look at lots of text (scraped from the internet) • Some serious problems with bias and hate speech • See which words follow one another Where do these probabilities come from? Example corpora from the recent Gopher language model paper Rae, Jack W., et al. "Scaling language models: Methods, analysis & insights from training gopher." arXiv preprint arXiv:2112.11446 (2021).
  • 32. You need more than the previous word to guess the next: “It showed 9 o’clock on my ?” Some words are more important for context
  • 33. Language models have often been named after Muppet characters: • ELMo • BERT BERT used a new idea: Transformers Language models with deep learning https://muppet.fandom.com/wiki/Sesame_Street
  • 34. Playing language games. Predict the next word: It showed 9 o’clock on my __________ Predict a masked word: It showed 9 _________ on my watch Spot the corrupted word: It showed 9 o’clock on my stapler -> stapler is wrong
  • 35. • Building and using a language model requires GPUs. GPUs become essential and expensive! https://www.scan.co.uk/products/pny-nvidia-a100-80gb-hbm2-graphics-card-6912-cores-195-tflops-sp-97-tflops-dp
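The three language games above can each be expressed as a way of turning one sentence into a training example. A minimal sketch, assuming whitespace tokenisation and a `[MASK]` placeholder (both are simplifications chosen for illustration):

```python
def next_word_example(tokens):
    """Predict the next word: context is everything but the last token."""
    return tokens[:-1], tokens[-1]

def masked_word_example(tokens, position, mask="[MASK]"):
    """Predict a masked word: hide one token and keep it as the answer."""
    masked = list(tokens)
    answer = masked[position]
    masked[position] = mask
    return masked, answer

def corrupted_word_example(tokens, position, replacement):
    """Spot the corrupted word: swap one token and label each position."""
    corrupted = list(tokens)
    corrupted[position] = replacement
    labels = [i == position for i in range(len(tokens))]
    return corrupted, labels

tokens = "It showed 9 o'clock on my watch".split()

context, target = next_word_example(tokens)
masked, answer = masked_word_example(tokens, position=3)
corrupted, labels = corrupted_word_example(tokens, position=6, replacement="stapler")

print(context, "->", target)
print(masked, "->", answer)
print(corrupted, "->", labels)
```

No manual labelling is needed in any of the three games: the answer always comes from the original sentence itself.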
  • 36. • Only large companies can build these very large language models And scale up! https://www.theverge.com/2023/3/13/23637675/microsoft-chatgpt-bing-millions-dollars-supercomputer-openai Banks, Carl. The Sport of Tycoons. 1974
  • 37. Using a language model to generate new text: “Once upon a time …” Word: Probability. there: 65 / 100; a: 22 / 100; it: 11 / 100; the: 3 / 100
  • 38. Using a language model to generate new text: “Once upon a time there …” Word: Probability. was: 61 / 100; were: 23 / 100; had: 3 / 100; could: 1 / 100
  • 39. Using a language model to generate new text: “Once upon a time there was a young man who lived down by a river. He did not want to go to school so he…” ● Keep generating word-by-word ● Picking the most likely word doesn’t create the most interesting text ○ So pick randomly but weight by the probabilities Word: Probability. ran: 22 / 100; hid: 21 / 100; told: 15 / 100; stole: 6 / 100
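The difference between always taking the most likely word and picking randomly, weighted by probability, can be sketched with the table from the slide above. One assumption to note: only the four listed words are considered, so their weights are effectively renormalised among themselves.

```python
import random

# Probabilities (out of 100) for the next word, as listed on the slide.
candidates = {"ran": 22, "hid": 21, "told": 15, "stole": 6}

def greedy_pick(table):
    """Always choose the single most likely word."""
    return max(table, key=table.get)

def sampled_pick(table, rng):
    """Choose a word at random, weighted by its probability."""
    words = list(table)
    weights = list(table.values())
    return rng.choices(words, weights=weights, k=1)[0]

rng = random.Random(0)  # fixed seed so runs are repeatable
greedy = greedy_pick(candidates)
samples = [sampled_pick(candidates, rng) for _ in range(1000)]

print(greedy)
print({word: samples.count(word) for word in candidates})
```

Greedy picking returns the same continuation every time, whereas weighted sampling lets less likely words such as "stole" appear occasionally, which is what makes generated text more varied.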
  • 40. Do language models know anything? The largest city in Scotland is …
  • 41. Do language models know anything? “The largest city in Scotland is …” Word: Probability. Glasgow: 56 / 100; Edinburgh: 35 / 100; London: 7 / 100; Dundee: 2 / 100
  • 42. Do language models know anything? “The largest city in Scotland is …” Word: Probability. Glasgow: 56 / 100; Edinburgh: 35 / 100; London: 7 / 100; Dundee: 2 / 100. How? The huge amount of text used to train the system must have contained text about Scottish cities
  • 43. But, language models will hallucinate Source: https://www.unite.ai/preventing-hallucination-in-gpt-3-and-other-complex-language-models/
  • 44. • Stochastic parrots paper • Fantastic summary of the limitations of large language models • Language models encode the biases in the text used to train them • They don’t “know” anything - they just regurgitate grammatical patterns that they’ve seen Language models are stochastic parrots Bender, Emily M., et al. "On the Dangers of Stochastic Parrots: Can Language Models Be Too Big?🦜." Proceedings of the 2021 ACM conference on fairness, accountability, and transparency. 2021.
  • 45. An application in precision medicine What are large language models? Where do large language models fit? What’s coming next?
  • 46. • Prone to confident hallucination • Probably mostly factual - but really difficult to spot mistakes • Can’t cite their sources • But they are very good at working with language Using language models directly to ask for knowledge is tricky
  • 47. Feed text to a large language model and ask for interpretation
  • 48. Language is complex and large language models can summarize Hall, Jeff M., et al. "Linkage of early-onset familial breast cancer to chromosome 17q21." Science 250.4988 (1990): 1684-1689.
  • 49. They can also extract structured knowledge Hall, Jeff M., et al. "Linkage of early-onset familial breast cancer to chromosome 17q21." Science 250.4988 (1990): 1684-1689.
  • 50. • Ongoing work to understand if they are much better than existing methods • Writing tasks as “prompts” is a bit of an art How good are they at useful language tasks?
  • 51. • This article does not exist • Sometimes you get lucky and it cites real things. Other times, it may make up publications (with authors, journals, DOIs, etc.). For better or worse, readers will ask chat systems for knowledge
  • 52. An application in precision medicine What are large language models? Where do large language models fit? What’s coming next?
  • 53. • What happens when there is a new prime minister? • Language models are trapped by the text they are trained on • ChatGPT was trained on text up to 2021 Language models get stuck in the past The current prime minister of the UK is …
  • 54. • Allow a language model to use a search engine to help it pick the next word • Would enable citing of sources! • But are the sources good? Retrieval-enhanced language models Guu, Kelvin, et al. "Retrieval augmented language model pre-training." International conference on machine learning. PMLR, 2020.
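The retrieval idea above can be sketched as two steps: find the most relevant documents for a question, then hand them to the language model alongside the question so the answer can point back to them. This toy version is an illustration only and is not the method of Guu et al.: the document store `DOCUMENTS`, its ids, and the word-overlap scoring are all assumptions made for the example, and no language model is actually called.

```python
import re

# Hypothetical three-document "library"; ids and texts are illustrative.
DOCUMENTS = {
    "doc1": "Glasgow is the largest city in Scotland.",
    "doc2": "Edinburgh is the capital of Scotland.",
    "doc3": "Warfarin dose is influenced by variants in VKORC1.",
}

def tokenize(text):
    """Lower-case a text and return its set of words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(query, documents, k=2):
    """Rank documents by how many words they share with the query."""
    query_words = tokenize(query)
    scored = [(len(query_words & tokenize(text)), doc_id)
              for doc_id, text in documents.items()]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored[:k]]

def build_prompt(query, documents, doc_ids):
    """Put the retrieved sources in front of the question."""
    sources = "\n".join(f"[{d}] {documents[d]}" for d in doc_ids)
    return (f"Sources:\n{sources}\n\n"
            f"Question: {query}\nAnswer, citing sources by id:")

query = "What is the largest city in Scotland?"
top = retrieve(query, DOCUMENTS)
prompt = build_prompt(query, DOCUMENTS, top)
print(prompt)
```

Because the sources travel with the question, a reader can check the answer against them, but the answer is only as good as what was retrieved.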
  • 55. The Bing chatbot (using GPT-4) can cite sources
  • 56. But even a language model with sources can still hallucinate
  • 57. • Scientists cannot read all the necessary papers so text mining can help • Large language models seem strangely good at many general tasks • There is a lot of hype • We need to be sceptical about their abilities and accuracy Conclusions
  • 58. Acknowledgements Jones Lab @ UBC ● Steven Jones ● Martin Jones ● Eric Zhao ● Jasleen Grewal ● Luka Culibrk ● Melika Bonakdar Griffith Lab @ WashU ● Obi Griffith ● Malachi Griffith ● Kilannin Krysiak ● Arpad Danos ● Jason Saliba Helix Lab @ Stanford ● Russ Altman ● Teri Klein ● Michelle Whirl-Carrillo ● PharmGKB team jake.lever@glasgow.ac.uk