SlideShare uma empresa Scribd logo
1 de 19
Baixar para ler offline
Who’s this guy?
Knowledge
Discovery in
Databases
through
Complex
Networks:
application to
phylodynamics
Luiz Max F.
de Carvalho
Scientific
Computing
Programme
(PROCC),
Fiocruz
Pan American
Center for
Foot-andMouth Disease
(PAHO/WHO)
WaFiS 2012
Knowledge
Discovery in
Databases
(KDD)
Complex

Knowledge Discovery in Databases through
Complex Networks: application to
phylodynamics
Luiz Max F. de Carvalho
Scientific Computing Programme (PROCC), Fiocruz
Pan American Center for Foot-and-Mouth Disease
(PAHO/WHO)
WaFiS 2012

September 28, 2012
Outline
Knowledge
Discovery in
Databases
through
Complex
Networks:
application to
phylodynamics
Luiz Max F.
de Carvalho
Scientific
Computing
Programme
(PROCC),
Fiocruz
Pan American
Center for
Foot-andMouth Disease
(PAHO/WHO)

1 Knowledge Discovery in Databases (KDD)

2 Complex Networks

3 Example 1: Chitin pathway phylogeny

4 Example 2: Foot-and-mouth disease virus in South America
WaFiS 2012
Knowledge
Discovery in
Databases
(KDD)
Complex
Knowledge Discovery in Databases (KDD)
Knowledge
Discovery in
Databases
through
Complex
Networks:
application to
phylodynamics
Luiz Max F.
de Carvalho
Scientific
Computing
Programme
(PROCC),
Fiocruz
Pan American
Center for
Foot-andMouth Disease
(PAHO/WHO)
WaFiS 2012
Knowledge
Discovery in
Databases
(KDD)
Complex

Lots of data
human brain very limited processing capacity
Information → Knowledge
Increasing number of molecular data (sequences, 3D
structures, antigenicity,. . . )
Is it possible to explore these databases to discover useful
stuff?
Well. . . Let’s see
Knowledge
Discovery in
Databases
through
Complex
Networks:
application to
phylodynamics
Luiz Max F.
de Carvalho
Scientific
Computing
Programme
(PROCC),
Fiocruz
Pan American
Center for
Foot-andMouth Disease
(PAHO/WHO)
WaFiS 2012
Knowledge
Discovery in
Databases
(KDD)
Complex
[We may use] Complex Networks
Knowledge
Discovery in
Databases
through
Complex
Networks:
application to
phylodynamics
Luiz Max F.
de Carvalho
Scientific
Computing
Programme
(PROCC),
Fiocruz
Pan American
Center for
Foot-andMouth Disease
(PAHO/WHO)
WaFiS 2012
Knowledge
Discovery in
Databases
(KDD)
Complex

Graphs → G = (V , E )
Yeah, but how?
Knowledge
Discovery in
Databases
through
Complex
Networks:
application to
phylodynamics
Luiz Max F.
de Carvalho
Scientific
Computing
Programme
(PROCC),
Fiocruz
Pan American
Center for
Foot-andMouth Disease
(PAHO/WHO)
WaFiS 2012
Knowledge
Discovery in
Databases
(KDD)
Complex

We can explore the ”dynamic signature” of these Complex
Networks, i.e., study and compare their structural properties.
Some useful formulas:
Clustering Coefficient < c >:
Degree distribution PK =
Diameter: max(d(i, j))

3×#triangles
#triples

∞
K =K

pK
Ok, Let’s work then
Knowledge
Discovery in
Databases
through
Complex
Networks:
application to
phylodynamics
Luiz Max F.
de Carvalho
Scientific
Computing
Programme
(PROCC),
Fiocruz
Pan American
Center for
Foot-andMouth Disease
(PAHO/WHO)
WaFiS 2012
Knowledge
Discovery in
Databases
(KDD)
Complex

1

Grab n sequences;

2

Create an n × n matrix using some kind of (normalized)
distance (say, S);

3

For each σ ∈ [0, 1] build M(σ) such that:
mij (σ) =

1 if Sij > σ,
0 if Sij < σ.

In a sense, we are transforming a single network in a family of
networks.
Analysis
Knowledge
Discovery in
Databases
through
Complex
Networks:
application to
phylodynamics
Luiz Max F.
de Carvalho
Scientific
Computing
Programme
(PROCC),
Fiocruz
Pan American
Center for
Foot-andMouth Disease
(PAHO/WHO)
WaFiS 2012
Knowledge
Discovery in
Databases
(KDD)
Complex

We shall explore the relationships between these networks:
First, define a higher-order neighborhood indicator function,
such that you binarize the adjacency matrix with regard the
ˆ
path length , obtaining a matrix M = D M( ). Then
=1
δ(α, β) =

1
N2

N

N

(
i=1 j=1

ˆ
mij (α) mij (β)
ˆ
−
)
D(α)
D(β)

Evaluating δ(σ, σ + ∆σ) can give some interesting insights.

(1)
Example 1: Chitin pathway phylogeny
Knowledge
Discovery in
Databases
through
Complex
Networks:
application to
phylodynamics
Luiz Max F.
de Carvalho
Scientific
Computing
Programme
(PROCC),
Fiocruz
Pan American
Center for
Foot-andMouth Disease
(PAHO/WHO)

Proteins related to the chitin metabolic pathway from
1605 complete genomes;

WaFiS 2012

BLAST distances (which are asymmetric);
Knowledge
Discovery in
Databases
(KDD)
Complex

Search for phylogenetic relationships
Example 1: Some results
Knowledge
Discovery in
Databases
through
Complex
Networks:
application to
phylodynamics
Luiz Max F.
de Carvalho
Scientific
Computing
Programme
(PROCC),
Fiocruz
Pan American
Center for
Foot-andMouth Disease
(PAHO/WHO)
WaFiS 2012
Knowledge
Discovery in
Databases
(KDD)
Complex
Example 1: Some more results
Knowledge
Discovery in
Databases
through
Complex
Networks:
application to
phylodynamics
Luiz Max F.
de Carvalho
Scientific
Computing
Programme
(PROCC),
Fiocruz
Pan American
Center for
Foot-andMouth Disease
(PAHO/WHO)
WaFiS 2012
Knowledge
Discovery in
Databases
(KDD)
Complex
Example 1: The expected Network(s)
Knowledge
Discovery in
Databases
through
Complex
Networks:
application to
phylodynamics
Luiz Max F.
de Carvalho
Scientific
Computing
Programme
(PROCC),
Fiocruz
Pan American
Center for
Foot-andMouth Disease
(PAHO/WHO)
WaFiS 2012
Knowledge
Discovery in
Databases
(KDD)
Complex
Example 2: Foot-and-mouth disease virus in South
America
Knowledge
Discovery in
Databases
through
Complex
Networks:
application to
phylodynamics
Luiz Max F.
de Carvalho
Scientific
Computing
Programme
(PROCC),
Fiocruz
Pan American
Center for
Foot-andMouth Disease
(PAHO/WHO)

S was built with phylogenetic (TN93) distances for NT
and JTT distances for AA;
Try to make sense of a somewhat big data set (167 seqs);
Extract some nice patterns;

WaFiS 2012
Knowledge
Discovery in
Databases
(KDD)
Complex
Indexes × σ
Knowledge
Discovery in
Databases
through
Complex
Networks:
application to
phylodynamics
Luiz Max F.
de Carvalho
Scientific
Computing
Programme
(PROCC),
Fiocruz
Pan American
Center for
Foot-andMouth Disease
(PAHO/WHO)

(a)
WaFiS 2012
Knowledge
Discovery in
Databases
(KDD)
Complex

(b)
A nice network
Knowledge
Discovery in
Databases
through
Complex
Networks:
application to
phylodynamics
Luiz Max F.
de Carvalho
Scientific
Computing
Programme
(PROCC),
Fiocruz
Pan American
Center for
Foot-andMouth Disease
(PAHO/WHO)
WaFiS 2012
Knowledge
Discovery in
Databases
(KDD)
Complex
Some more developments
Knowledge
Discovery in
Databases
through
Complex
Networks:
application to
phylodynamics
Luiz Max F.
de Carvalho
Scientific
Computing
Programme
(PROCC),
Fiocruz
Pan American
Center for
Foot-andMouth Disease
(PAHO/WHO)
WaFiS 2012
Knowledge
Discovery in
Databases
(KDD)
Complex
Related Work
Knowledge
Discovery in
Databases
through
Complex
Networks:
application to
phylodynamics
Luiz Max F.
de Carvalho
Scientific
Computing
Programme
(PROCC),
Fiocruz
Pan American
Center for
Foot-andMouth Disease
(PAHO/WHO)
WaFiS 2012
Knowledge
Discovery in
Databases
(KDD)
Complex

Identify transmission clusters (HIV, HCV) (Lewis et al,
2008,Plos Medicine)
Explore scale-free behavior in phylodynamics (Shiino,
2012, Frontiers in Microbiology)
Future Directions
Knowledge
Discovery in
Databases
through
Complex
Networks:
application to
phylodynamics
Luiz Max F.
de Carvalho
Scientific
Computing
Programme
(PROCC),
Fiocruz
Pan American
Center for
Foot-andMouth Disease
(PAHO/WHO)
WaFiS 2012
Knowledge
Discovery in
Databases
(KDD)
Complex

Explore the spatial aspect in the construction of S
ˆ
Maybe S = µ + S(G )α
Power law analysis
Implement assortativity
Suggestions. . .
Thank You!
Knowledge
Discovery in
Databases
through
Complex
Networks:
application to
phylodynamics
Luiz Max F.
de Carvalho
Scientific
Computing
Programme
(PROCC),
Fiocruz
Pan American
Center for
Foot-andMouth Disease
(PAHO/WHO)
WaFiS 2012
Knowledge
Discovery in
Databases
(KDD)
Complex

Mais conteúdo relacionado

Semelhante a Complex Networks and Data Mining on genetic databases

Community Finding with Applications on Phylogenetic Networks [Thesis]
Community Finding with Applications on Phylogenetic Networks [Thesis]Community Finding with Applications on Phylogenetic Networks [Thesis]
Community Finding with Applications on Phylogenetic Networks [Thesis]
Luís Rita
 
Capturing Context in Scientific Experiments: Towards Computer-Driven Science
Capturing Context in Scientific Experiments: Towards Computer-Driven ScienceCapturing Context in Scientific Experiments: Towards Computer-Driven Science
Capturing Context in Scientific Experiments: Towards Computer-Driven Science
dgarijo
 
Supervised deep learning_embeddings_for_the_predic
Supervised deep learning_embeddings_for_the_predicSupervised deep learning_embeddings_for_the_predic
Supervised deep learning_embeddings_for_the_predic
hema latha
 

Semelhante a Complex Networks and Data Mining on genetic databases (20)

Emerging challenges in data-intensive genomics
Emerging challenges in data-intensive genomicsEmerging challenges in data-intensive genomics
Emerging challenges in data-intensive genomics
 
Ontologies Ontop Databases
Ontologies Ontop DatabasesOntologies Ontop Databases
Ontologies Ontop Databases
 
Scott Edmunds: GigaScience - a journal or a database? Lessons learned from th...
Scott Edmunds: GigaScience - a journal or a database? Lessons learned from th...Scott Edmunds: GigaScience - a journal or a database? Lessons learned from th...
Scott Edmunds: GigaScience - a journal or a database? Lessons learned from th...
 
Community Finding with Applications on Phylogenetic Networks [Thesis]
Community Finding with Applications on Phylogenetic Networks [Thesis]Community Finding with Applications on Phylogenetic Networks [Thesis]
Community Finding with Applications on Phylogenetic Networks [Thesis]
 
RDVW Hands-on session: Python
RDVW Hands-on session: PythonRDVW Hands-on session: Python
RDVW Hands-on session: Python
 
DisGeNET Tutorial SWAT4LS 2015-12-07
DisGeNET Tutorial SWAT4LS 2015-12-07DisGeNET Tutorial SWAT4LS 2015-12-07
DisGeNET Tutorial SWAT4LS 2015-12-07
 
Bioinformatica 29-09-2011-t1-bioinformatics
Bioinformatica 29-09-2011-t1-bioinformaticsBioinformatica 29-09-2011-t1-bioinformatics
Bioinformatica 29-09-2011-t1-bioinformatics
 
D02-NextGenSeq-MOLGENIS
D02-NextGenSeq-MOLGENISD02-NextGenSeq-MOLGENIS
D02-NextGenSeq-MOLGENIS
 
Challenges with implementing a new diagnostic platform in Post Entry Quarantine
Challenges with implementing a new diagnostic platform in Post Entry QuarantineChallenges with implementing a new diagnostic platform in Post Entry Quarantine
Challenges with implementing a new diagnostic platform in Post Entry Quarantine
 
Data Integration vs Transparency: Tackling the tension
Data Integration vs Transparency: Tackling the tensionData Integration vs Transparency: Tackling the tension
Data Integration vs Transparency: Tackling the tension
 
Capturing Context in Scientific Experiments: Towards Computer-Driven Science
Capturing Context in Scientific Experiments: Towards Computer-Driven ScienceCapturing Context in Scientific Experiments: Towards Computer-Driven Science
Capturing Context in Scientific Experiments: Towards Computer-Driven Science
 
A scalable ontology reasoner via incremental materialization
A scalable ontology reasoner via incremental materializationA scalable ontology reasoner via incremental materialization
A scalable ontology reasoner via incremental materialization
 
Biomedical Clusters, Clouds and Commons - DePaul Colloquium Oct 24, 2014
Biomedical Clusters, Clouds and Commons - DePaul Colloquium Oct 24, 2014Biomedical Clusters, Clouds and Commons - DePaul Colloquium Oct 24, 2014
Biomedical Clusters, Clouds and Commons - DePaul Colloquium Oct 24, 2014
 
Connecting data across our clinical data warehouses: UC-Research eXchange (UC...
Connecting data across our clinical data warehouses: UC-Research eXchange (UC...Connecting data across our clinical data warehouses: UC-Research eXchange (UC...
Connecting data across our clinical data warehouses: UC-Research eXchange (UC...
 
Supervised deep learning_embeddings_for_the_predic
Supervised deep learning_embeddings_for_the_predicSupervised deep learning_embeddings_for_the_predic
Supervised deep learning_embeddings_for_the_predic
 
COVID-19 VACCINATION CLASSIFICATION OF OPINION MINING WITH SEMANTIC KNOWLEDGE...
COVID-19 VACCINATION CLASSIFICATION OF OPINION MINING WITH SEMANTIC KNOWLEDGE...COVID-19 VACCINATION CLASSIFICATION OF OPINION MINING WITH SEMANTIC KNOWLEDGE...
COVID-19 VACCINATION CLASSIFICATION OF OPINION MINING WITH SEMANTIC KNOWLEDGE...
 
BioData World Basel 2018
BioData World Basel 2018BioData World Basel 2018
BioData World Basel 2018
 
Transparency in the Data Supply Chain
Transparency in the Data Supply ChainTransparency in the Data Supply Chain
Transparency in the Data Supply Chain
 
Top Cited Articles in Data Mining - International Journal of Data Mining & Kn...
Top Cited Articles in Data Mining - International Journal of Data Mining & Kn...Top Cited Articles in Data Mining - International Journal of Data Mining & Kn...
Top Cited Articles in Data Mining - International Journal of Data Mining & Kn...
 
API-Centric Data Integration for Human Genomics Reference Databases: Achieve...
 API-Centric Data Integration for Human Genomics Reference Databases: Achieve... API-Centric Data Integration for Human Genomics Reference Databases: Achieve...
API-Centric Data Integration for Human Genomics Reference Databases: Achieve...
 

Último

The basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxThe basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptx
heathfieldcps1
 
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in DelhiRussian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
kauryashika82
 
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdfBASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
SoniaTolstoy
 

Último (20)

Interactive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationInteractive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communication
 
The basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxThe basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptx
 
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in DelhiRussian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
 
Measures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeMeasures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and Mode
 
9548086042 for call girls in Indira Nagar with room service
9548086042  for call girls in Indira Nagar  with room service9548086042  for call girls in Indira Nagar  with room service
9548086042 for call girls in Indira Nagar with room service
 
APM Welcome, APM North West Network Conference, Synergies Across Sectors
APM Welcome, APM North West Network Conference, Synergies Across SectorsAPM Welcome, APM North West Network Conference, Synergies Across Sectors
APM Welcome, APM North West Network Conference, Synergies Across Sectors
 
Holdier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdfHoldier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdf
 
Advance Mobile Application Development class 07
Advance Mobile Application Development class 07Advance Mobile Application Development class 07
Advance Mobile Application Development class 07
 
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptxINDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
 
Disha NEET Physics Guide for classes 11 and 12.pdf
Disha NEET Physics Guide for classes 11 and 12.pdfDisha NEET Physics Guide for classes 11 and 12.pdf
Disha NEET Physics Guide for classes 11 and 12.pdf
 
Z Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot GraphZ Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot Graph
 
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
 
Key note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfKey note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdf
 
Código Creativo y Arte de Software | Unidad 1
Código Creativo y Arte de Software | Unidad 1Código Creativo y Arte de Software | Unidad 1
Código Creativo y Arte de Software | Unidad 1
 
General AI for Medical Educators April 2024
General AI for Medical Educators April 2024General AI for Medical Educators April 2024
General AI for Medical Educators April 2024
 
Introduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsIntroduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The Basics
 
Sanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfSanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdf
 
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdfBASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
 
Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104
 
Accessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactAccessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impact
 

Complex Networks and Data Mining on genetic databases

  • 1. Who’s this guy? Knowledge Discovery in Databases through Complex Networks: application to phylodynamics Luiz Max F. de Carvalho Scientific Computing Programme (PROCC), Fiocruz Pan American Center for Foot-andMouth Disease (PAHO/WHO) WaFiS 2012 Knowledge Discovery in Databases (KDD) Complex Knowledge Discovery in Databases through Complex Networks: application to phylodynamics Luiz Max F. de Carvalho Scientific Computing Programme (PROCC), Fiocruz Pan American Center for Foot-and-Mouth Disease (PAHO/WHO) WaFiS 2012 September 28, 2012
  • 2. Outline Knowledge Discovery in Databases through Complex Networks: application to phylodynamics Luiz Max F. de Carvalho Scientific Computing Programme (PROCC), Fiocruz Pan American Center for Foot-andMouth Disease (PAHO/WHO) 1 Knowledge Discovery in Databases (KDD) 2 Complex Networks 3 Example 1: Chitin pathway phylogeny 4 Example 2: Foot-and-mouth disease virus in South America WaFiS 2012 Knowledge Discovery in Databases (KDD) Complex
  • 3. Knowledge Discovery in Databases (KDD) Knowledge Discovery in Databases through Complex Networks: application to phylodynamics Luiz Max F. de Carvalho Scientific Computing Programme (PROCC), Fiocruz Pan American Center for Foot-andMouth Disease (PAHO/WHO) WaFiS 2012 Knowledge Discovery in Databases (KDD) Complex Lots of data human brain very limited processing capacity Information → Knowledge Increasing number of molecular data (sequences, 3D structures, antigenicity,. . . ) Is it possible to explore these databases to discover useful stuff?
  • 4. Well. . . Let’s see Knowledge Discovery in Databases through Complex Networks: application to phylodynamics Luiz Max F. de Carvalho Scientific Computing Programme (PROCC), Fiocruz Pan American Center for Foot-andMouth Disease (PAHO/WHO) WaFiS 2012 Knowledge Discovery in Databases (KDD) Complex
  • 5. [We may use] Complex Networks Knowledge Discovery in Databases through Complex Networks: application to phylodynamics Luiz Max F. de Carvalho Scientific Computing Programme (PROCC), Fiocruz Pan American Center for Foot-andMouth Disease (PAHO/WHO) WaFiS 2012 Knowledge Discovery in Databases (KDD) Complex Graphs → G = (V , E )
  • 6. Yeah, but how? Knowledge Discovery in Databases through Complex Networks: application to phylodynamics Luiz Max F. de Carvalho Scientific Computing Programme (PROCC), Fiocruz Pan American Center for Foot-andMouth Disease (PAHO/WHO) WaFiS 2012 Knowledge Discovery in Databases (KDD) Complex We can explore the ”dynamic signature” of these Complex Networks, i.e., study and compare their structural properties. Some useful formulas: Clustering Coefficient < c >: Degree distribution PK = Diameter: max(d(i, j)) 3×#triangles #triples ∞ K =K pK
  • 7. Ok, Let’s work then Knowledge Discovery in Databases through Complex Networks: application to phylodynamics Luiz Max F. de Carvalho Scientific Computing Programme (PROCC), Fiocruz Pan American Center for Foot-andMouth Disease (PAHO/WHO) WaFiS 2012 Knowledge Discovery in Databases (KDD) Complex 1 Grab n sequences; 2 Create an n × n matrix using some kind of (normalized) distance (say, S); 3 For each σ ∈ [0, 1] build M(σ) such that: mij (σ) = 1 if Sij > σ, 0 if Sij < σ. In a sense, we are transforming a single network in a family of networks.
  • 8. Analysis Knowledge Discovery in Databases through Complex Networks: application to phylodynamics Luiz Max F. de Carvalho Scientific Computing Programme (PROCC), Fiocruz Pan American Center for Foot-andMouth Disease (PAHO/WHO) WaFiS 2012 Knowledge Discovery in Databases (KDD) Complex We shall explore the relationships between these networks: First, define a higher-order neighborhood indicator function, such that you binarize the adjacency matrix with regard the ˆ path length , obtaining a matrix M = D M( ). Then =1 δ(α, β) = 1 N2 N N ( i=1 j=1 ˆ mij (α) mij (β) ˆ − ) D(α) D(β) Evaluating δ(σ, σ + ∆σ) can give some interesting insights. (1)
  • 9. Example 1: Chitin pathway phylogeny Knowledge Discovery in Databases through Complex Networks: application to phylodynamics Luiz Max F. de Carvalho Scientific Computing Programme (PROCC), Fiocruz Pan American Center for Foot-andMouth Disease (PAHO/WHO) Proteins related to the chitin metabolic pathway from 1605 complete genomes; WaFiS 2012 BLAST distances (which are asymmetric); Knowledge Discovery in Databases (KDD) Complex Search for phylogenetic relationships
  • 10. Example 1: Some results Knowledge Discovery in Databases through Complex Networks: application to phylodynamics Luiz Max F. de Carvalho Scientific Computing Programme (PROCC), Fiocruz Pan American Center for Foot-andMouth Disease (PAHO/WHO) WaFiS 2012 Knowledge Discovery in Databases (KDD) Complex
  • 11. Example 1: Some more results Knowledge Discovery in Databases through Complex Networks: application to phylodynamics Luiz Max F. de Carvalho Scientific Computing Programme (PROCC), Fiocruz Pan American Center for Foot-andMouth Disease (PAHO/WHO) WaFiS 2012 Knowledge Discovery in Databases (KDD) Complex
  • 12. Example 1: The expected Network(s) Knowledge Discovery in Databases through Complex Networks: application to phylodynamics Luiz Max F. de Carvalho Scientific Computing Programme (PROCC), Fiocruz Pan American Center for Foot-andMouth Disease (PAHO/WHO) WaFiS 2012 Knowledge Discovery in Databases (KDD) Complex
  • 13. Example 2: Foot-and-mouth disease virus in South America Knowledge Discovery in Databases through Complex Networks: application to phylodynamics Luiz Max F. de Carvalho Scientific Computing Programme (PROCC), Fiocruz Pan American Center for Foot-andMouth Disease (PAHO/WHO) S was built with phylogenetic (TN93) distances for NT and JTT distances for AA; Try to make sense of a somewhat big data set (167 seqs); Extract some nice patterns; WaFiS 2012 Knowledge Discovery in Databases (KDD) Complex
  • 14. Indexes × σ Knowledge Discovery in Databases through Complex Networks: application to phylodynamics Luiz Max F. de Carvalho Scientific Computing Programme (PROCC), Fiocruz Pan American Center for Foot-andMouth Disease (PAHO/WHO) (a) WaFiS 2012 Knowledge Discovery in Databases (KDD) Complex (b)
  • 15. A nice network Knowledge Discovery in Databases through Complex Networks: application to phylodynamics Luiz Max F. de Carvalho Scientific Computing Programme (PROCC), Fiocruz Pan American Center for Foot-andMouth Disease (PAHO/WHO) WaFiS 2012 Knowledge Discovery in Databases (KDD) Complex
  • 16. Some more developments Knowledge Discovery in Databases through Complex Networks: application to phylodynamics Luiz Max F. de Carvalho Scientific Computing Programme (PROCC), Fiocruz Pan American Center for Foot-andMouth Disease (PAHO/WHO) WaFiS 2012 Knowledge Discovery in Databases (KDD) Complex
  • 17. Related Work Knowledge Discovery in Databases through Complex Networks: application to phylodynamics Luiz Max F. de Carvalho Scientific Computing Programme (PROCC), Fiocruz Pan American Center for Foot-andMouth Disease (PAHO/WHO) WaFiS 2012 Knowledge Discovery in Databases (KDD) Complex Identify transmission clusters (HIV, HCV) (Lewis et al, 2008,Plos Medicine) Explore scale-free behavior in phylodynamics (Shiino, 2012, Frontiers in Microbiology)
  • 18. Future Directions Knowledge Discovery in Databases through Complex Networks: application to phylodynamics Luiz Max F. de Carvalho Scientific Computing Programme (PROCC), Fiocruz Pan American Center for Foot-andMouth Disease (PAHO/WHO) WaFiS 2012 Knowledge Discovery in Databases (KDD) Complex Explore the spatial aspect in the construction of S ˆ Maybe S = µ + S(G )α Power law analysis Implement assortativity Suggestions. . .
  • 19. Thank You! Knowledge Discovery in Databases through Complex Networks: application to phylodynamics Luiz Max F. de Carvalho Scientific Computing Programme (PROCC), Fiocruz Pan American Center for Foot-andMouth Disease (PAHO/WHO) WaFiS 2012 Knowledge Discovery in Databases (KDD) Complex