Introduction slides to set the scene for the QTLNetMiner demo available on YouTube https://www.youtube.com/watch?v=1FDCVrlB6G4x. For updates follow @KeywanHP.
QTLNetMiner - Efficient search and prioritization of gene evidence networks
1. Rothamsted Research
where knowledge grows
QTLNetMiner – Efficient search and
prioritization of gene evidence networks
WheatIS Annual Meeting, San Diego
9 January 2015
Keywan Hassani-Pak
2. Many gene discovery routes exploit genetic or transcriptome data
to produce markers for breeding or reverse genetics
Routes to candidate gene discovery
Diagram labels: Traits; Gene Expression; QTL/GWAS; Candidate Genes; Prioritization; Validation; Markers for Breeding or GM (steps numbered 1 to 3)
3. Gene Prioritization – a knowledge discovery challenge!
Orthologous Genes
Arabidopsis,
Rice, Yeast etc.
Lists of candidate genes
Gene Expression
Evaluation of different types of evidence
Expensive and labour-intensive
Literature
Phenotype, Gene Ontologies
Pathways
Omics data
Traits
5. Data integration and transformation using Ondex
• Ondex parsers for many data sources to
transform raw data into semantic networks
• Accession mapping or text mining to link
concepts from different data sources
• Updating the data warehouse requires downloading
new data and re-running the integration workflow
Ondex: free, open-source, developed in Java
www.ondex.org
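The accession-mapping step on this slide can be illustrated with a minimal Python sketch. It is not Ondex code (Ondex itself is written in Java); the concept dictionaries, the `map_by_accession` function and the accession strings are all hypothetical, chosen only to show how concepts from different sources might be linked when they share an accession.

```python
from collections import defaultdict
from itertools import combinations

def map_by_accession(concepts):
    """Link concepts from different sources that share an accession (illustrative only)."""
    by_accession = defaultdict(set)
    for concept in concepts:
        for accession in concept["accessions"]:
            by_accession[accession].add(concept["id"])

    # One 'equivalent' link per pair of concepts that share at least one accession.
    links = set()
    for ids in by_accession.values():
        for left, right in combinations(sorted(ids), 2):
            links.add((left, right, "equivalent"))
    return sorted(links)

# Hypothetical concepts from three hypothetical sources.
concepts = [
    {"id": "srcA:1", "accessions": {"ACC:0001"}},
    {"id": "srcB:7", "accessions": {"ACC:0001", "ACC:0002"}},
    {"id": "srcC:3", "accessions": {"ACC:0003"}},
]
links = map_by_accession(concepts)
```

Here only the first two concepts are linked, because they are the only pair that shares an accession.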
6. Building a Wheat Information Network through
integration of publicly available datasets
Wheat Genes Homology/Domains Annotations
5A
5B
5D
TTG2
seed color
seed coat development
DNA-binding WRKY
WRKY1
PMID 19129166
Inferred from Mutant
Phenotype
PMID: 15598800
GO
TO
encodes
text-mining
Mutations in TTG2 cause
phenotypic defects in trichome
development and seed color
pigmentation. PMID: 17766401
41% identity
EnsemblCompara
7. QTLNetMiner – Mining large semantic networks
for gene-trait discovery
Arabidopsis, Wheat,
Poplar at Rothamsted
Barley in collaboration with
IPK, Germany
Potato & Solanaceae
in collaboration with INTA,
Argentina
Animals in collaboration
with Roslin Institute, UK
• Web: https://ondex.rothamsted.ac.uk/QTLNetMiner
• Code: https://github.com/KeywanHP/QTLNetMiner
8. QTLNetMiner search interface
Define a QTL region
you are interested in.
Include a list of gene names and see
if they are related to your keyword.
Let us suggest alternative
search terms to improve
your results.
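As a rough illustration of what restricting a search to a QTL region could involve, the sketch below keeps only genes that overlap a user-defined interval. The `Gene` class, the `genes_in_region` function, the gene names and the coordinates are all hypothetical and are not taken from QTLNetMiner.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Gene:
    name: str
    chromosome: str
    start: int
    end: int

def genes_in_region(genes, chromosome, region_start, region_end):
    """Return genes on the given chromosome that overlap [region_start, region_end]."""
    return [
        gene for gene in genes
        if gene.chromosome == chromosome
        and gene.start <= region_end
        and gene.end >= region_start
    ]

# Hypothetical genes and coordinates, for illustration only.
genes = [
    Gene("gene1", "1A", 100, 500),
    Gene("gene2", "1A", 900, 1500),
    Gene("gene3", "1B", 200, 400),
]
hits = genes_in_region(genes, "1A", 400, 1000)
hit_names = [gene.name for gene in hits]
```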
14. ... zoom into regions of interest
TRAES_1AL_0404BC790
TRAES_1BL_1D865A8CC
TRAES_1DL_5BAB0B6BC
WRKY43
CML9
Calcium signalling
Mechanical stimulus
response
Calcium ion detection
Stress tolerance
GO
GO
GO
TO
WRKY
Mutations of the AtCML9 gene also alter the
expression of several stress-regulated genes,
suggesting that AtCML9 is involved in salt
stress tolerance through its effects on the ABA-
mediated pathways.
15. Associating genes with trait terms through guilt by association in a
labelled & directed multi-graph (Ondex network)
QTLNetMiner – Semantic motif search
auxin
cytokinin
strigolactone
CCD
MAX
subapical shoots
axillary branching
shoot branching
hormone
?
Integrated knowledge network User input (prior knowledge)
Gene
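The idea of guilt by association in a labelled, directed multi-graph can be sketched as a bounded path search from a gene to any concept whose text mentions a query keyword. This is a simplified illustration, not the QTLNetMiner semantic motif implementation; the function `find_evidence_paths`, the edge labels and the toy network are assumptions made for the example.

```python
from collections import deque

def find_evidence_paths(edges, concept_text, gene, keyword, max_hops=3):
    """Breadth-first search from a gene to concepts whose text mentions the keyword."""
    adjacency = {}
    for source, label, target in edges:
        adjacency.setdefault(source, []).append((label, target))

    paths = []
    queue = deque([[gene]])  # a path alternates node, edge label, node, ...
    while queue:
        path = queue.popleft()
        node = path[-1]
        hops = len(path) // 2
        if node != gene and keyword.lower() in concept_text.get(node, "").lower():
            paths.append(path)
        if hops < max_hops:
            for label, target in adjacency.get(node, []):
                if target not in path[::2]:  # do not revisit nodes on this path
                    queue.append(path + [label, target])
    return paths

# Hypothetical toy network; names and edges are illustrative, not real data.
edges = [
    ("GeneA", "encodes", "ProteinA"),
    ("ProteinA", "ortholog", "ProteinB"),
    ("ProteinA", "has_domain", "Domain1"),
    ("ProteinB", "published_in", "Pub1"),
    ("GeneB", "encodes", "ProteinC"),
]
concept_text = {
    "Pub1": "Mutant shows increased shoot branching",
    "Domain1": "DNA-binding domain",
}
paths = find_evidence_paths(edges, concept_text, "GeneA", "shoot branching")
```

In this toy network GeneA reaches the keyword through an orthologous protein and a publication, while GeneB has no such path. Limiting `max_hops` keeps the search local, so that only short, interpretable evidence chains are reported.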
16. • Scoring genes based on an information retrieval metric that
reflects how relevant a term is to a gene in a collection
• Developed a metric that takes into account
1. The amount of supporting evidence (tdf)
2. The specificity of evidence to a gene (IDFmean)
Candidate gene prioritisation
$$\mathrm{Score}(t, X) = \mathrm{tdf}(t, X) \times \mathrm{IDF}_{\mathrm{mean}}(X)$$
t: query terms
X: set of documents associated with a gene
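The slide does not spell out how tdf and IDFmean are computed, so the following Python sketch rests on two assumptions: tdf(t, X) is taken to be the number of documents in X that mention at least one query term, and IDFmean(X) is taken to be the mean, over the documents in X, of log(total genes / number of genes linked to that document). The function name, the inputs and the toy data are hypothetical.

```python
import math

def score(query_terms, gene_docs, doc_text, doc_gene_count, total_genes):
    """Score = tdf * IDFmean, under the assumptions stated in the text above."""
    if not gene_docs:
        return 0.0
    # Assumed tdf: number of the gene's documents that mention any query term.
    tdf = sum(
        1 for doc in gene_docs
        if any(term.lower() in doc_text[doc].lower() for term in query_terms)
    )
    # Assumed IDFmean: documents linked to few genes count as more specific.
    idf_mean = sum(
        math.log(total_genes / doc_gene_count[doc]) for doc in gene_docs
    ) / len(gene_docs)
    return tdf * idf_mean

# Hypothetical evidence documents for one gene.
doc_text = {"d1": "late blight resistance in potato", "d2": "general annotation"}
doc_gene_count = {"d1": 1, "d2": 100}  # number of genes each document is linked to
gene_score = score(["late blight"], ["d1", "d2"], doc_text, doc_gene_count, total_genes=100)
```

Under these assumptions a gene scores highly when many of its documents match the query and those documents are linked to few other genes.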
17. Gene ranking – Example
Query: Phytophthora infestans|late blight resistance|response to pathogen|LRR
Score:
5.72
Score:
2.71
18. • Compatible with iOS, Android and Microsoft mobile devices
Replace the Java applet network viewer with Cytoscape.js
Replace the Flash GViewer with KineticJS
• Develop a federated version (Solr, RDF, SPARQL) of
QTLNetMiner instead of centralised data warehousing
• Tighter integration with gene expression and variation
databases to improve gene ranking algorithm
Current and future development
Editor's Notes
QTL are regions on the genetic map that are associated with variation observed in a phenotype
Biomass traits: branching, height, leaf number etc.
What is going on underneath a QTL? We go from willow to poplar to Arabidopsis and other species