2. Cannabis Indica and Cannabis Sativa genomes
We import $1B/year in hemp products
$45B - $113B Cannabis black market
$1.3B California “legal” dispensary driven market (Growing 50% a year)
45% grown in the US (over 22M lbs)
40% Mexican
10% Canadian
5% Other
3. Therapeutic Index
35,000 annual deaths from alcohol in the US.
25-30% of violent crimes in the US involve EtOH; 50% in the UK.
80% of domestic violence involves EtOH
Lisbon Football games successfully reduce riots by promoting Cannabis.
Steve Fox
Paul Armentano
Mason Tvert
4. Social and political considerations
Over 1M US citizens imprisoned every year for Cannabis.
7.8M citizens imprisoned in 10 years
$50B/year in Prisons. Private prisons growing 17%/year.
25% of Global prisoners in the US
Rate of prescription drug overdoses per 100K- source CDC
5. Medical Excuse or Medical Use?
Endocannabinoid Pathway is Pervasive
Plays a critical role in the following disease etiologies.
Analgesics – Estimated $75B US market
Chronic Pain- Estimated to be over $200B US health care cost
(Source: Institute for Pain management)
Cancer Pain, wasting, apoptosis –
Estimated over $220B US health care cost (NCI- number is $117B)
MS spasticity- $10B Global
Diabetes and weight management – Very Large
6. Obesity and BMI have human variants in Endocannabinoid genes
Certain rare Human FAAH & MGLL genotypes are associated with high Anandamide plasma levels and obesity.
Patients with these genotypes will impact clinical trials
7. Human variation of the Receptors
Are there populations with mutations in CB1, CB2, “CB3” which may require custom dosages of Cannabinoids?
8. Over 85 Cannabinoids discovered
Cannabis has gone through a breeding bottleneck under prohibition, and we believe many silent chemotypes will be found in the genomes of existing strains.
http://en.wikipedia.org/wiki/Cannabinoids
What are the genetic pathways?
Which enzymes have variants in these pathways?
Are there extinct synthases in the genome which can be discovered/recovered?
Terpenoids?
11. Genetic Bottleneck of Prohibition
The unit of measure in US penal codes is weight based, not volume or %THCA anchored.
This drives the underground market towards plant matter with higher THCA concentration.
Many precursors in the pathways are shared, suggesting that higher THCA concentration has come at the cost of lower CBDA and other therapeutic cannabinoids.
12. Many Parts of the Cannabinoid pathways are still unknown
Why Sequence the Genome?
1) Chemical synthesis produces racemics
2) The plant grows quickly and productively.
3) Trend is towards cocktails of cannabinoids and terpenoids
Discovery of pathways can aid in breeding and synthetic biology approaches to manufacturing
13. Predictive Genomics
Mutations in FAD binding domain compromise and/or deactivate THC production in strains (Sirikantaramas et al.)
THC Synthase Annotation
14. Applying Sequencing to Cannabinoids
1.6Gb (n=2) Dye based Estimate
Sequencing supports 650Mb-1.0Gb (n=2)
10 Chromosomes
De Novo shotgun to 327X coverage
131Gb 2x100 ILMN, 300bp inserts
De Novo assembly with CLC Bio and SOAPdenovo on a 64Gb RAM Mac
2 references
Sativa
Indica
65% AT
0.5% to 1% polymorphism rate
300Mb assembly with CLC Bio.
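The coverage figure on this slide is simple arithmetic: fold coverage is total sequenced bases divided by genome size. The sketch below is only an illustration of that relationship, using the 131Gb and 327X figures from the slide; the function names are hypothetical, and the genome size it back-calculates is implied by those two numbers rather than stated in the deck.

```python
def fold_coverage(total_bases: float, genome_size: float) -> float:
    """Average depth of coverage: sequenced bases divided by genome size."""
    return total_bases / genome_size


def implied_genome_size(total_bases: float, coverage: float) -> float:
    """Genome size that a given yield and reported coverage imply."""
    return total_bases / coverage


# Figures taken from the slide: 131Gb of 2x100 reads, reported as 327X.
TOTAL_BASES = 131e9
REPORTED_COVERAGE = 327

implied = implied_genome_size(TOTAL_BASES, REPORTED_COVERAGE)
read_pairs = TOTAL_BASES / (2 * 100)  # 2x100 reads: 200 bases per pair

print(f"Implied genome size: {implied / 1e6:.1f} Mb")
print(f"Approximate read pairs: {read_pairs / 1e6:.0f} million")
```

The implied size comes out near 0.4Gb, which is smaller than the 650Mb-1.0Gb range quoted above, so the 327X figure presumably assumes a smaller reference size than the full genome estimate.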
15. Alignment of Assembly to Peach
Gene Finding
BLAST2GO
Pseudo-assembly to other plants helps annotate non-polyA expressed and/or conserved regions.
16. Whole Genome sequencing reveals genetics of THCA Synthase allozymes
THCA synthase gene
Blue reads are paired reads
Red and Green reads are unpaired
Vertical lines are SNPs
Copia Transposon: 8X higher coverage than rest of genome, implying many more copies than just 2
Mechanism for higher THC gene copy number
Lots of SNPs in the transposons since 740X of transposons are collapsed into this assembly.
17. Move to Triple Backcrossed Cultivars
LA Confidential
http://uf4a.org/
18. Cannabis Indica
• Database includes
• LA Confidential- Highly phased DNA sequence (13.5Gb)
• Chemdawg- High Coverage DNA Sequence (131Gb)
LA Conf. 3X Backcrossed Assembly sums to 722Mb (All Contigs) & 676Mb (>500bp contigs)
21. THCA Synthase and its various paralogs
Long reads help to phase the SNPs in THC Synthase
Single reads
454: 700bp reads preserve phase.
SNPs
Are these 8 other copies of diverged THCA synthase making THC, or could they be the other silent chemotypes?
RNA-Seq can demonstrate expression
Phase is critical for Amino Acid prediction
Failure to phase:
ATTCGTCTGCA [T/A] TTCTTCCTGAT [G/C] GGGCGCTG [A/C] TTT
I R L (Q or H) F F L (M or I) G R (Stop or C) F
Possible peptides:
IRLQFFLMGRstop
IRLQFFLMGRCF
IRLQFFLIGRstop
IRLQFFLIGRCF
IRLHFFLMGRstop
IRLHFFLMGRCF
IRLHFFLIGRstop
IRLHFFLIGRCF
2^N Peptide predictions, where N = # unphased SNVs
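The 2^N blow-up can be reproduced directly from the sequence on this slide. The following is a minimal sketch, not the pipeline used for the talk: it enumerates every combination of alleles at the three unphased SNVs and translates each with the standard genetic code. The names `SEGMENTS`, `SNV_ALLELES`, `translate` and `enumerate_peptides` are illustrative.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Sequence from the slide, split at the three unphased SNV positions.
SEGMENTS = ["ATTCGTCTGCA", "TTCTTCCTGAT", "GGGCGCTG", "TTT"]
SNV_ALLELES = [("T", "A"), ("G", "C"), ("A", "C")]


def translate(dna: str) -> str:
    """Translate in frame 1, writing 'stop' and halting at a stop codon."""
    peptide = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        residue = CODON_TABLE[dna[i:i + 3]]
        if residue == "*":
            peptide.append("stop")
            break
        peptide.append(residue)
    return "".join(peptide)


def enumerate_peptides(segments, snv_alleles):
    """Return the set of peptides over all allele combinations (2^N haplotypes)."""
    peptides = set()
    for combo in product(*snv_alleles):
        dna = segments[0]
        for allele, segment in zip(combo, segments[1:]):
            dna += allele + segment
        peptides.add(translate(dna))
    return peptides


peptides = enumerate_peptides(SEGMENTS, SNV_ALLELES)
for p in sorted(peptides):
    print(p)
```

With N = 3 unphased SNVs this yields the 8 candidate peptides listed on the slide; reads long enough to span all three sites would reduce that to the haplotypes actually present.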
22. Other data emerges
RNA Seq- Mexican Sativa
Purple Kush- Indica
USO-1-Hemp
Finola-Hemp
23. Polymorphisms across 3 cultivars
ChemDawg sequenced to 327X coverage with 2x100 reads
High AT content discovered, High polymorphism rate discovered
3x backcrossed LA Confidential (DNA Genetics) sequenced to over 15X
Lower polymorphism rate.
SNV genome wide (CD = Chemdawg, LA = LA Confidential, PK = Purple Kush)
Comparison   Heterozygotes   Homozygotes   Total       Ti/Tv
CD X CD      1,413,345       100,274       1,513,619   1.64
LA X LA      925,602         0             925,602     1.72
CD X LA      1,960,931       1,506,345     3,467,276   1.62
LA X CD      1,357,810       1,491,827     2,849,637   1.84
LA X PK      1,854,661       1,988,717     3,843,378   1.76
CD X PK      3,000,128       1,573,243     4,573,371   1.69
PK X PK      1,085,040       221,657       1,306,697   1.66

SNV in the coding regions
Comparison         Heterozygotes   Homozygotes   Total
LA Conf X PKUSH    94,853          78,251        173,104
Chemdawg X PKUSH   302,449         94,467        396,916
Pkush X Pkush
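The Ti/Tv column is the ratio of transitions (A<->G, C<->T) to transversions (all other substitutions). As a rough sketch of how such a ratio is computed from a list of SNV calls, assuming each call is a (reference base, alternate base) pair with differing bases; the function names and the example calls are hypothetical and are not data from the talk:

```python
# Transitions swap purine for purine (A<->G) or pyrimidine for pyrimidine (C<->T).
TRANSITIONS = {frozenset("AG"), frozenset("CT")}


def is_transition(ref: str, alt: str) -> bool:
    """True if the substitution ref->alt is a transition."""
    return frozenset((ref.upper(), alt.upper())) in TRANSITIONS


def ti_tv_ratio(snvs) -> float:
    """Transition/transversion ratio for an iterable of (ref, alt) pairs."""
    snvs = list(snvs)
    ti = sum(1 for ref, alt in snvs if is_transition(ref, alt))
    tv = len(snvs) - ti
    if tv == 0:
        raise ValueError("no transversions; ratio is undefined")
    return ti / tv


# Made-up calls for illustration only.
example_calls = [("A", "G"), ("C", "T"), ("G", "A"), ("A", "C"), ("G", "T")]
print(ti_tv_ratio(example_calls))
```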
24. RNA-Seq data from 5 tissues
Mature Bud
Early Bud
Mature Leaf
Early Leaf/Petiole
Root
25. Characterizing THCA Synthase like genes
LA Confidential Contigs with BLAST hit to THC Synthase
Purple Kush assembly hole filled by 454 long reads
29. What markets are enabled with this?
Understanding Cannabichromene requires Schedule I licenses (time) and is a longer term project.
Armed with the genome we can design qPCR assays to quantitate Cannabinoid RNA and Mold for better labeling.
Courtagen also has the potential for Q400 ELISA assays for Pesticides and Mold
The Medical Cannabis Industry needs better labeling, and POC assays are required to manage diversion concerns inherent in centralized testing labs.
Can we sequence patients to better understand cannabinoids and metabolic disease?
30. Avantra’s Biomarker Platform Highlights
Simplified Multiplex Assay
Fully automated Multiplex ELISA (20-plex) on a chip with all reagents on board
Most applications measure five to seven different analytes
Minimal sample requirements - 100uL
Sample types: Serum, Plasma, Blood and other non particulate samples
Highly Precise and Accurate
3-4 log dynamic range on multiple analytes
Reproducibility - low Intra/inter assay CV below 10%
Instrument to instrument CV less than 0.3%
Improved accuracy with six replicates per analyte
10’s of picogram sensitivity
Fast and User Friendly Workflow
Less than 1 minute sample prep
Assay run time between 15-40 minutes
Bench-top system for non-specialized technician
Compact foot print – 1.8 square feet
Typical Calibration Curves
[Plot: Signal Intensity (RFU) vs. Log Concentration (ng/mL) for TIMP-1, HGF, ICAM-1, TIE-2, VEGF-R2, FGF-Basic, IL8, E-selectin, PlGF and VEGF]
Company focus: Merge Genomic Data with Biomarker data
31. CLIA Certified for Mitochondrial Sequencing
1100 nuclear genes including CB1, CB2, FAAH and MGLL
20,000X coverage of Mitochondrial Genome
32. Courtagen’s CLIA sequencing pipeline
1. Customer Acquisition (Saliva, Blood, Tissue)
2. Courtagen CLIA Laboratory
3. Bioinformatics
4. Biomarkers Databases
5. Personalized Web/iPad App Portal
6. Ongoing Service
Mito LR PCR, 2 different libraries
Nextera Library generation
Embedded controls
Haloplex 1100 genes
1:2000 children affected: Sequencing can save $100-$200K per year in costs.
Thought to be responsible for 10-20% of Autism
33. mtSEEK PDx assay feature: Embedded controls
Control human DNA 1: NA19240 (Purify DNA)
Control human DNA 2: NA12878 (Purify DNA)
Mix DNAs at precise ratios
Make Barcoded Library for each mixture
2 or more mixtures depending on application
90%:10% Mix 1 DNA Barcode CCCCCC
95%:5% Mix 2 DNA Barcode GGGGG
98%:2% Mix 3 DNA Barcode CACACA
99%:1% Mix 4 DNA Barcode GTGTGT
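Each mixture fixes the expected allele fraction at any site where the two control genomes differ, which is what lets the controls act as a sensitivity ladder. The sketch below assumes the simplest case, a site where the minor-component control carries the variant on all its copies and the major-component control carries the reference; the function names and the 10,000X depth used in the example are illustrative.

```python
# Mix ratios from the slide: (major component %, minor component %).
MIXES = {
    "Mix 1": (90, 10),
    "Mix 2": (95, 5),
    "Mix 3": (98, 2),
    "Mix 4": (99, 1),
}


def expected_minor_fraction(major_pct: float, minor_pct: float) -> float:
    """Expected variant allele fraction contributed by the minor component."""
    return minor_pct / (major_pct + minor_pct)


def expected_variant_reads(depth: int, fraction: float) -> float:
    """Expected number of variant-supporting reads at a given depth."""
    return depth * fraction


DEPTH = 10_000  # assumed per-library coverage, for illustration
for name, (major, minor) in MIXES.items():
    frac = expected_minor_fraction(major, minor)
    reads = expected_variant_reads(DEPTH, frac)
    print(f"{name}: expected fraction {frac:.2%}, about {reads:.0f} reads at {DEPTH}X")
```

Comparing the observed fraction in each barcoded mixture against these expectations indicates the lowest mixing ratio at which the variant is still reliably detected in that run.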
34. Barcoded Embedded DNA Controls
Barcoded Mixture Controls: Attach unique DNA barcode
Patients (Clinical patient DNA 1, 2…50): Attach unique DNA barcode
Mix Controls with Clinical samples
Sequence samples and barcodes
De-multiplex barcodes
Controls in every run provide sensitivity and specificity
35. mtSEEK PDx Assay Features
CLIA validated assay with CPT codes
NUMTs Free capture technique (5% Heteroplasmy sensitivity)
Two Libraries made from each patient
only report genotypes observed in both libraries
Automated Nextera library generation
Barcoded Embedded Controls
Each Library Sequenced to 10,000X coverage
2 x 150bp reads assists in reducing noise from NUMTs
Dual indexing used to eliminate Patient mis-ID and sequencing artifacts
Saliva, Blood and Tissue CLIA validated
3 Day TAT. Backlog + Shipping and Approval= 3 week TAT
Consistent Nextera Library Generation MiSeq 2 X 150bp Sequencing
NUMTs Depletion step
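The rule of reporting only genotypes seen in both libraries amounts to intersecting two call sets. A minimal sketch, assuming each variant call is represented as a (position, reference base, alternate base) tuple; the representation, the function name and the positions below are hypothetical and are not patient or assay data:

```python
def concordant_calls(library_a, library_b):
    """Keep only variant calls observed in both independent libraries."""
    return set(library_a) & set(library_b)


# Made-up calls for illustration: (position, ref, alt).
library_1 = {(100, "A", "G"), (200, "T", "C"), (300, "G", "A")}
library_2 = {(100, "A", "G"), (200, "T", "C"), (400, "C", "T")}

reportable = sorted(concordant_calls(library_1, library_2))
discarded = sorted(set(library_1) ^ set(library_2))

print("Reported:", reportable)
print("Seen in one library only:", discarded)
```

A call that appears in only one library is treated as a possible library-preparation or sequencing artifact and dropped, at the cost of sequencing every patient twice.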
36. Summary- Clearing the Smoke
Phased Genome Sequence provides:
Key cannabinoid synthase pathways now resolved
Synthetic biology approach for therapeutic cannabinoid manufacturing enabled
Toolkit to design RT qPCR assays for sequences predictive of cannabinoid content and mold content. Critical to bring better labeling and regulation to the growing dispensary based market for medical cannabis.
1 in 3 people will get cancer in their lifetime. 1 in 4 will die with or from it. Anything non-toxic and showing preliminary signs of cancer specific apoptosis is a priority.
Guzman et al., Nature Cancer Review, 2004
37. Acknowledgements
In 6 months we started a company, sequenced a genome, Booked Revenue and
were acquired (Now a division of Courtagen Life Sciences).
2 Guys and a Garage
Christian Giannini
Lots of outsourcing
Doug Smith- Beckman Genomics
Karin Fredrickson, James Knight – Roche 454
Brian O’Connor, Sara Grimm- Nimbus Informatics
Tim Harkins- Life Technologies
Medicinal Plant Genomics Resource
Harm Van Bakel- Toronto
CLCBio
We are Hiring!
Genetic Counselors
Bioinformatics Scientists
http://www.courtagen.com/
Editor's Notes
Stay scientific and don’t be influenced by 30-year-old stigmas. Better Cannabis regulation is needed. FDA trials on complex drug cocktails are expensive, which makes them unlikely to be a pharmaceutical priority while a generic is ever present.