BioAssay Express: Creating and exploiting assay metadata


CINF 24: BioAssay Express: Creating and exploiting assay metadata
collaborativedrug.com
bioassayexpress.com
bae@collaborativedrug.com
Philip Cheung
American Chemical Society
Sunday, August 25, 2019 - 3:10 PM -- Grand Ballroom A, Omni San Diego Hotel

A little bit about me…
Graduated from Harvey Mudd College in 1996 with a B.S. in Biology
SAIC – 1996 – 1997
High Throughput Robotics Systems For ISB (Leroy Hood)
Orillion 1997 – 1999
WorldCom MCI / PlayStation 2 Event Management System
Medibuy.com 1999 – 2001
Global Health Exchange Between Premier Hospital Group
Pfizer Global Research and Development 2001 – 2009
Pfizer Global Crystal Structure Database
Oncology Project Support (Computational Biology)
Ophthalmology Indication Discovery (Machine Learning)
Dart NeuroScience (2009-2018)
Bioinformatics Group Leader
Independent Consultant (2018-Present)
Currently supporting multiple informatics / bioinformatics companies in San
Francisco, Boston, and San Diego

So what is assay informatics and why is it
exciting?

The IDEAL Cycle of Assay Management
• Plan experiments & capture ideas
• Perform experiments & capture data
• Analyze data & identify trends
• Store & protect the results
• Retrieve data & build knowledge -
across Concepts / Time / Projects

The REALISTIC Cycle of Assay Management
• Plan experiments & capture ideas
• Perform experiments & capture data
• Analyze data & identify trends
• Store & organize the results
• Retrieve data & build knowledge –
across Concepts / Time / Projects
• Post-It Note Edits & Lost Attributions
• Incomplete “Data Dump” & Lost Data
• Siloed data & Incomplete information
• Lost & non-reproducible data (crisis!)
• Inaccessible & unusable data leading to…
TIME WASTED & OPPORTUNITIES LOST!

• No Common Vocabulary
• Limited Assay Mining/Searching
Capabilities
Barrier to Collaboration
Failure to Provide Insight
Even “Best Case” Assay Management is Inefficient

Efficiently & Quickly
Organize Assay Data
Machine Readable Format
Common Vocabulary through
Ontology Markup
Introduces Assay Informatics,
Providing New Insight by
Querying Biologic, Chemical,
and Assay Meta Data
BioAssay Express Leads the Field of Assay Informatics

Capture Metadata Unambiguously with BioAssay Express
Ambiguous, Unstructured:
Cytoxicity test; incubate hek cells for 24hrs. Add 1 compound from library (tox 21); triplicate. 16 pt dilution. Inc. 48hrs at 37C. Standard kit protocol using thermo Glo kit. Fluorescence in triplicate. Report IC50
Ambiguous, Structured:
Cell Line: HEK
Mode of Action: Cytoxcity
Assay Kit: GLO (Thermo)
Detection Method: Fluorescence
Detection Instrument:
Perturbagen Type: Tox 21
Result: EC50
Units:
Unambiguous, Structured:
Cell Line: HEK-293T Cell [BTO_0002181]
Mode of Action: Cytotoxicity [BAO_0000090]
Assay Kit: CytoTox-Glo (Promega) [BAO_0140009]
Detection Method: Luminescence [BAO_0000045]
Detection Instrument: EnVision Multilabel Reader [BAO_0000701]
Perturbagen Type: Tox 21 Compound Library [BAO_0070002]
Result: IC50 [BAO_0000190]
Units: nanomolar [UO_0000065]
Problems flagged in the ambiguous versions: Missing, Wrong, Incomplete, Spelling
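A minimal sketch of what "unambiguous, structured" could look like in code: each field carries a label plus an ontology term ID copied from the slide. The `Annotation` class, the `REQUIRED_FIELDS` list, and the `missing_fields` helper are illustrative assumptions, not part of BioAssay Express.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Field names come from the slide; treating all eight as required is an assumption.
REQUIRED_FIELDS = [
    "Cell Line", "Mode of Action", "Assay Kit", "Detection Method",
    "Detection Instrument", "Perturbagen Type", "Result", "Units",
]


@dataclass(frozen=True)
class Annotation:
    label: str               # human-readable value
    term_id: Optional[str]   # ontology term ID, or None if the value is unmapped


def missing_fields(record: Dict[str, Annotation]) -> List[str]:
    """Return required fields that are absent or have no ontology term ID."""
    return [
        name for name in REQUIRED_FIELDS
        if name not in record or record[name].term_id is None
    ]


# Values and term IDs as shown in the "Unambiguous, Structured" column.
unambiguous = {
    "Cell Line": Annotation("HEK-293T Cell", "BTO_0002181"),
    "Mode of Action": Annotation("Cytotoxicity", "BAO_0000090"),
    "Assay Kit": Annotation("CytoTox-Glo (Promega)", "BAO_0140009"),
    "Detection Method": Annotation("Luminescence", "BAO_0000045"),
    "Detection Instrument": Annotation("EnVision Multilabel Reader", "BAO_0000701"),
    "Perturbagen Type": Annotation("Tox 21 Compound Library", "BAO_0070002"),
    "Result": Annotation("IC50", "BAO_0000190"),
    "Units": Annotation("nanomolar", "UO_0000065"),
}

# Free-text values with no term IDs, as in the "Ambiguous, Structured" column.
ambiguous = {
    "Cell Line": Annotation("HEK", None),
    "Result": Annotation("EC50", None),
}

print(missing_fields(unambiguous))
print(missing_fields(ambiguous))
```

A check like this only catches missing or unmapped fields; wrong values (such as EC50 recorded instead of IC50) still need curation.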

BioAssay Express Enables True Assay Informatics
[Figure labels: Metadata, Assays]

That seems like a lot of work? Can Machine
Learning help?

BioAssay Express - In Action!
Assay Registration is a Breeze!
• Insert Protocol
• Text Mining
• Expert Ontologies
• Predictive Text
• Machine Learning
• NLP
• Correlation Models
• Human Curation
• Accuracy is key

Dose Response assay for agonists of 5-Hydroxytryptamine
(Serotonin) Receptor Subtype 1A (5HT1A)
Assay Description:
Widely expressed in the human brain, 5-hydroxytryptamine (5-HT,
serotonin) receptors have been shown to have an important role in
depression as well as other cognitive and metabolic disorders [1, 2].
Discovering novel modulators of the 5-HT1A serotonin receptor
may not only help probe the function of this receptor, but also help
better understand the complex relationship among the 5-HT
receptor subtypes.
Protocol Summary:
As with the primary HTS assay, a Chinese Hamster Ovary
(CHO) cell line stably transfected with human 5HT1a receptor,
the nuclear factor of activated T-cell-beta lactamase (NFAT-BLA)
reporter construct and the G-alpha-15 promiscuous coupling protein
was used (Invitrogen, part K1083).
Cells were cultured in T-175 sq. cm flasks (Corning, part 431080) at
37 deg C and 95% RH. The assay began by dispensing 10 microliters
of cell suspension to each test well of a 384 well plate.
BioAssay Express: Optimized for Low False Positives

Assay Fingerprints: Bringing Informatics to Metadata
Assay Property Grid (axes: Assays, Metadata)
• Generate assay fingerprints
• Compare hundreds of assays at a glance
• Find, Share, and Innovate
• Blue boxes = exact match
• Blue lines = match inferred from hierarchy
• Why are my results different from others
doing the “same” assay?
• Has anyone studied this disease variant in
neurons?
• Did results vary when we switched
instrument models?
• Has someone else already done my
experiment?
• What other programs have already screened
my target? Can I jumpstart my new program
with some existing chemistry?
• I want to do some machine learning – can I
find some other “appropriate” experiments?
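The difference between an exact match and a match inferred from hierarchy can be sketched as below. This is only an illustration: the `PARENT` table is a made-up toy hierarchy with placeholder term names, "inferred" is assumed to mean that one term is an ancestor of the other, and `compare` / `compare_assays` are hypothetical helpers rather than the BioAssay Express algorithm.

```python
from typing import Dict, List, Optional

# Toy hierarchy: child term -> parent term. Placeholder names, not real ontology IDs.
PARENT: Dict[str, Optional[str]] = {
    "HEK-293T cell": "HEK-293 cell",
    "HEK-293 cell": "human cell line",
    "CHO cell": "hamster cell line",
    "human cell line": None,
    "hamster cell line": None,
}


def ancestors(term: Optional[str]) -> List[str]:
    """Return the term followed by all of its ancestors, closest first."""
    chain = []
    while term is not None:
        chain.append(term)
        term = PARENT.get(term)
    return chain


def compare(a: str, b: str) -> str:
    """Classify two annotations as 'exact', 'inferred', or 'none'."""
    if a == b:
        return "exact"
    if a in ancestors(b) or b in ancestors(a):
        return "inferred"  # one term sits above the other in the hierarchy
    return "none"


def compare_assays(x: Dict[str, str], y: Dict[str, str]) -> Dict[str, str]:
    """Compare two assays field by field over the fields they share."""
    return {name: compare(x[name], y[name]) for name in sorted(x.keys() & y.keys())}


assay_a = {"Cell Line": "HEK-293T cell", "Result": "IC50"}
assay_b = {"Cell Line": "human cell line", "Result": "IC50"}
print(compare_assays(assay_a, assay_b))
```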

So, I have a database…
how do I apply this to my data?

So how can I use this technology?
So let’s take a look at the steps required to process a
semi-structured dataset like clinicaltrials.gov

Structured because there are tables that hold categories of data

Unfortunately, it’s unstructured because the content does not map to an
ontology consistently.

So, how do I FAIR-ify my data?

So how do we go from unstructured to structured data?
Step 1 – Review the data you’re importing (here, ClinicalTrials.gov) and map your data to ontologies.
Step 2 – Perform the import; some percentage will map perfectly.
Step 3 – Curate the remaining data using BAE’s NLP models.

Step 1 – Map your data to ontologies
Open source BioAssay Template Tool
https://github.com/cdd/bioassay-template

Step 2 – Import: Generate JSON files

Step 2 – Import: Or import using a CSV file
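As a rough sketch of the import step, the same records can be serialized either as JSON or as CSV using only the Python standard library. The column names (`study_id`, `condition`, `intervention`) and the placeholder rows are assumptions made for illustration; this is not the BioAssay Express import schema.

```python
import csv
import io
import json
from typing import Dict, List

# Placeholder records; the field names are hypothetical, not the BAE schema.
rows: List[Dict[str, str]] = [
    {"study_id": "EXAMPLE-001", "condition": "Multiple Myeloma", "intervention": "Drug A"},
    {"study_id": "EXAMPLE-002", "condition": "Malaria", "intervention": "Drug B"},
]


def to_json(records: List[Dict[str, str]]) -> str:
    """Serialize the records as a JSON array."""
    return json.dumps(records, indent=2)


def to_csv(records: List[Dict[str, str]]) -> str:
    """Serialize the records as CSV with a header row taken from the first record."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(records[0].keys()))
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


print(to_json(rows))
print(to_csv(rows))
```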

Step 2 – Some percentage will map out of the box
In this example, I imported a small
subset of ClinicalTrials.gov: 18,825 of
the 313,472 available studies (~6%).
If I were interested in multiple myeloma,
I could write a script that maps
aggressively and assigns all of
these as “multiple myeloma”, or I could
assign only the perfect matches and
let BAE’s NLP help me with the
remaining mappings.
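The two strategies described above might look like this in a script. Everything here is a sketch under stated assumptions: the `ONTOLOGY` table and its `EX:0001` identifier are placeholders rather than real ontology terms, and the function names are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical label -> term ID table; "EX:0001" is a placeholder, not a real ontology ID.
ONTOLOGY: Dict[str, str] = {"multiple myeloma": "EX:0001"}


def map_exact(text: str) -> Optional[str]:
    """Conservative: map only when the whole value equals a known label."""
    return ONTOLOGY.get(text.strip().lower())


def map_aggressive(text: str) -> Optional[str]:
    """Aggressive: map when a known label appears anywhere in the value."""
    lowered = text.lower()
    for label, term_id in ONTOLOGY.items():
        if label in lowered:
            return term_id
    return None


def split_for_curation(values: List[str]) -> Tuple[List[Tuple[str, str]], List[str]]:
    """Assign perfect matches now and return the rest for assisted curation."""
    mapped, to_curate = [], []
    for value in values:
        term_id = map_exact(value)
        if term_id is not None:
            mapped.append((value, term_id))
        else:
            to_curate.append(value)
    return mapped, to_curate


conditions = ["Multiple Myeloma", "Relapsed multiple myeloma", "Smoldering Myeloma"]
mapped, to_curate = split_for_curation(conditions)
print(mapped)
print(to_curate)
```

The aggressive variant maps more records up front but risks collapsing distinct conditions into one term; the conservative variant leaves those decisions to curation.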

Step 3 – Machine Assisted Curation
Importing and mapping data allows BAE to build Bayesian models
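To illustrate how already-mapped records can drive suggestions for unmapped ones, here is a small multinomial naive Bayes classifier with add-one smoothing. It is a generic textbook sketch, not the model BAE actually builds; the `TermSuggester` class and the training phrases are invented for this example.

```python
import math
from collections import Counter, defaultdict
from typing import List


class TermSuggester:
    """Ranks candidate terms for a piece of text using naive Bayes."""

    def __init__(self) -> None:
        self.word_counts = defaultdict(Counter)  # term -> word frequencies
        self.doc_counts = Counter()              # term -> number of training texts
        self.vocab = set()

    def train(self, text: str, term: str) -> None:
        """Record one curated (text, term) pair."""
        words = text.lower().split()
        self.word_counts[term].update(words)
        self.doc_counts[term] += 1
        self.vocab.update(words)

    def suggest(self, text: str) -> List[str]:
        """Return known terms ordered from most to least likely."""
        words = text.lower().split()
        total_docs = sum(self.doc_counts.values())
        scores = {}
        for term, n_docs in self.doc_counts.items():
            counts = self.word_counts[term]
            denominator = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for word in words:
                # Add-one smoothing keeps unseen words from zeroing the score.
                score += math.log((counts[word] + 1) / denominator)
            scores[term] = score
        return sorted(scores, key=scores.get, reverse=True)


suggester = TermSuggester()
suggester.train("relapsed multiple myeloma", "multiple myeloma")
suggester.train("refractory multiple myeloma", "multiple myeloma")
suggester.train("plasmodium falciparum malaria", "malaria")
suggester.train("uncomplicated malaria in children", "malaria")
print(suggester.suggest("newly diagnosed multiple myeloma"))
```

Each record a curator confirms can be fed back through `train`, so the ranking improves as more data is mapped.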

Interesting questions you can visualize
In which cancer trials was Celecoxib used? What other combination therapies were also used in those trials?

Interesting questions you can visualize
What new interventions are being used for malaria?

We’re picking up steam
* Extensive Proof Of Concept Study

But let’s talk about Robots now!

Current Directions -- Projects
An aspirational goal for our team is to build a metadata schema based on semantic web
vocabularies that is comprehensive enough that the text description becomes optional.
There are many challenges involved in creating the ELN-to-robot loop. Here we provide some
insights into our collaboration with UCSF automation experts at the Small Molecule Discovery
Center.

The high-level goal – ELN to robot
Green Button Go (GBG) by Biosero
Cellario by HighRes Biosolutions
Adapter/Builder
Director by Wako (Fuji)
- Model the protocols at the step level so
we can export them out to automation
systems
- Import the results back into the system

Oftentimes protocols are more complex: they branch and merge

Break the protocol down into steps
• Model the Dependencies
• Equipment
dependencies
• Reagent dependencies
• Previous step
dependencies
• Model the Steps
• Protocol Steps

Sample Protocol
https://pubchem.ncbi.nlm.nih.gov/bioassay/346#section=Protocol
Plate Layout
Plate Action
(Centrifuge)
Assay Start
(Add Compounds)
Incubation
Readout

Next Steps
• Continue our work with UCSF with
simple proof of concept protocols
• Reach out to Vendors and see if we
can integrate into their simulation
platforms
• Thermo
• Wako
• Keep reaching out and learning
from experts like yourself.
http://www.bioassayexpress.com

Try it out!
• Collaborative Drug Discovery
• Alex Clark
• Hande Kücük McGuinty
• Peter Gedeck
• Samantha Jeschonek
• Barry Bunin
• For more info
• bae@collaborativedrug.com http://www.bioassayexpress.com

Smart Drug Discovery Software Saves Scientists Time
Session: CINF: Sci-Mix
Location: Exhibit Hall B
Date & Time: Monday, Aug 26, 8:00 PM
