FBW
12-03-2019
Biological Databases
Wim Van Criekinge
Data Warehousing and Decision Support
Views and Decision Support
• OLAP queries are typically aggregate queries.
– Precomputation is essential for interactive response
times.
– The CUBE is in fact a collection of aggregate
queries, and precomputation is especially important:
lots of work on what is best to precompute given a
limited amount of space to store precomputed
results.
• Warehouses can be thought of as a collection
of asynchronously replicated tables and
periodically maintained views.
– This has renewed interest in view maintenance!
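As a rough illustration of the CUBE point above: a CUBE over k dimensions amounts to one GROUP BY query per subset of those dimensions (2^k in total). The sketch below runs each of those queries explicitly. It assumes Python's built-in sqlite3 module and a made-up sales_facts table; neither comes from the slides.

```python
import sqlite3
from itertools import combinations


def cube(conn, table, dims, measure):
    """Return {grouping columns: {group key: SUM(measure)}} for every subset of dims."""
    results = {}
    for k in range(len(dims), -1, -1):
        for group in combinations(dims, k):
            if group:
                cols = ", ".join(group)
                sql = f"SELECT {cols}, SUM({measure}) FROM {table} GROUP BY {cols}"
            else:
                # The empty grouping is the grand total.
                sql = f"SELECT SUM({measure}) FROM {table}"
            results[group] = {tuple(row[:-1]): row[-1] for row in conn.execute(sql)}
    return results


# Hypothetical fact table, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales_facts (category TEXT, state TEXT, sales INTEGER)")
conn.executemany(
    "INSERT INTO sales_facts VALUES (?, ?, ?)",
    [("Laptop", "Wisconsin", 10), ("Laptop", "Ohio", 5), ("Phone", "Wisconsin", 7)],
)

aggregates = cube(conn, "sales_facts", ("category", "state"), "sales")
for grouping, totals in aggregates.items():
    print(grouping, totals)
```

With two dimensions this already issues four queries, which is why choosing what to precompute matters once the number of dimensions grows.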
View Modification (Evaluate On Demand)
View:
CREATE VIEW RegionalSales(category, sales, state)
AS SELECT P.category, S.sales, L.state
FROM Products P, Sales S, Locations L
WHERE P.pid = S.pid AND S.locid = L.locid
Query:
SELECT R.category, R.state, SUM(R.sales)
FROM RegionalSales AS R GROUP BY R.category, R.state
Modified query:
SELECT R.category, R.state, SUM(R.sales)
FROM (SELECT P.category, S.sales, L.state
FROM Products P, Sales S, Locations L
WHERE P.pid = S.pid AND S.locid = L.locid) AS R
GROUP BY R.category, R.state
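To check that view modification does not change the answer, the query on the view and the hand-expanded query can be run side by side. This is a minimal sketch using Python's built-in sqlite3 module as a stand-in database; the sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Products (pid INTEGER PRIMARY KEY, category TEXT);
CREATE TABLE Locations (locid INTEGER PRIMARY KEY, state TEXT);
CREATE TABLE Sales (pid INTEGER, locid INTEGER, sales INTEGER);
-- Invented sample rows.
INSERT INTO Products VALUES (1, 'Laptop'), (2, 'Phone');
INSERT INTO Locations VALUES (1, 'Wisconsin'), (2, 'Ohio');
INSERT INTO Sales VALUES (1, 1, 10), (1, 2, 5), (2, 1, 7), (1, 1, 3);
CREATE VIEW RegionalSales(category, sales, state) AS
  SELECT P.category, S.sales, L.state
  FROM Products P, Sales S, Locations L
  WHERE P.pid = S.pid AND S.locid = L.locid;
""")

# Query written against the view.
on_view = conn.execute("""
    SELECT R.category, R.state, SUM(R.sales)
    FROM RegionalSales AS R
    GROUP BY R.category, R.state
    ORDER BY R.category, R.state
""").fetchall()

# Same query with the view definition substituted in place of the view name.
modified = conn.execute("""
    SELECT R.category, R.state, SUM(R.sales)
    FROM (SELECT P.category, S.sales, L.state
          FROM Products P, Sales S, Locations L
          WHERE P.pid = S.pid AND S.locid = L.locid) AS R
    GROUP BY R.category, R.state
    ORDER BY R.category, R.state
""").fetchall()

print(on_view == modified)
```

The ORDER BY clauses are only there so the two result lists can be compared directly.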
View Materialization (Precomputation)
• Suppose we precompute RegionalSales and store it with a clustered B+ tree index on [category, state, sales].
– Then the previous query can be answered by an index-only scan.
Index on precomputed view is great:
SELECT R.state, SUM(R.sales)
FROM RegionalSales R
WHERE R.category = 'Laptop'
GROUP BY R.state
Index is less useful (must scan entire leaf level):
SELECT R.category, SUM(R.sales)
FROM RegionalSales R
WHERE R.state = 'Wisconsin'
GROUP BY R.category
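The difference between the two queries can be observed in a query plan. The sketch below assumes SQLite (via Python's sqlite3 module) as a stand-in: SQLite does not cluster a table on a secondary index, but its "covering index" plan is the counterpart of an index-only scan. The table name RegionalSalesMat, the index name, and the rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical materialized copy of the RegionalSales view.
CREATE TABLE RegionalSalesMat (category TEXT, state TEXT, sales INTEGER);
INSERT INTO RegionalSalesMat VALUES
  ('Laptop', 'Wisconsin', 10), ('Laptop', 'Ohio', 5), ('Phone', 'Wisconsin', 7);
CREATE INDEX idx_cat_state_sales ON RegionalSalesMat (category, state, sales);
""")


def plan(sql):
    """Return the detail strings of SQLite's query plan for sql."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


# Restricts the leading index column: the index can be searched directly.
by_category = plan(
    "SELECT state, SUM(sales) FROM RegionalSalesMat "
    "WHERE category = 'Laptop' GROUP BY state"
)

# Restricts only the second index column: every index entry has to be visited.
by_state = plan(
    "SELECT category, SUM(sales) FROM RegionalSalesMat "
    "WHERE state = 'Wisconsin' GROUP BY category"
)

print(by_category)
print(by_state)
```

The exact plan wording varies between SQLite versions, so the output is printed rather than quoted here; the first plan should be a search on the index, the second a full scan.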
Materialized Views
• A view whose tuples are stored in the database
is said to be materialized.
– Provides fast access, like a (very high-level) cache.
– Need to maintain the view as the underlying tables
change.
– Ideally, we want incremental view maintenance
algorithms.
• Close relationship to data warehousing, OLAP,
(asynchronously) maintaining distributed
databases, checking integrity constraints, and
evaluating rules and triggers.
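One way to picture incremental view maintenance is a summary table kept up to date by triggers, so each change to the base table adjusts the stored aggregate instead of recomputing it. This is a sketch under stated assumptions: SQLite through Python's sqlite3 module, a simplified single-table Sales schema, and a hypothetical SalesByCategory summary table. It is not how any particular warehouse product implements materialized views.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Sales (category TEXT, state TEXT, sales INTEGER);
-- Hypothetical materialized view: SUM(sales) per category.
CREATE TABLE SalesByCategory (category TEXT PRIMARY KEY, total INTEGER NOT NULL);

CREATE TRIGGER sales_insert AFTER INSERT ON Sales BEGIN
  INSERT OR IGNORE INTO SalesByCategory VALUES (NEW.category, 0);
  UPDATE SalesByCategory SET total = total + NEW.sales
  WHERE category = NEW.category;
END;

CREATE TRIGGER sales_delete AFTER DELETE ON Sales BEGIN
  UPDATE SalesByCategory SET total = total - OLD.sales
  WHERE category = OLD.category;
END;
""")

conn.executemany(
    "INSERT INTO Sales VALUES (?, ?, ?)",
    [("Laptop", "Wisconsin", 10), ("Laptop", "Ohio", 5), ("Phone", "Wisconsin", 7)],
)
conn.execute("DELETE FROM Sales WHERE state = 'Ohio'")

# The incrementally maintained totals should match a full recomputation.
materialized = conn.execute(
    "SELECT category, total FROM SalesByCategory ORDER BY category"
).fetchall()
recomputed = conn.execute(
    "SELECT category, SUM(sales) FROM Sales GROUP BY category ORDER BY category"
).fetchall()
print(materialized, recomputed)
```

The sketch ignores updates, and a category whose rows are all deleted stays in the summary with a total of 0; a fuller version would handle both.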
Issues in View Materialization
• What views should we materialize, and
what indexes should we build on the
precomputed results?
• Given a query and a set of materialized
views, can we use the materialized
views to answer the query?
• How frequently should we refresh
materialized views to make them
consistent with the underlying tables?
(And how can we do this
incrementally?)
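For the second question, a simple case where a materialized view can answer a different query: a SUM grouped by (category, state) can be rolled up to a SUM by category or by state without touching the base tables. The sketch below shows the idea in plain Python; the view contents are invented and the function name roll_up is hypothetical.

```python
from collections import defaultdict

# Invented contents of a view holding SUM(sales) grouped by (category, state).
sales_by_category_state = {
    ("Laptop", "Wisconsin"): 13,
    ("Laptop", "Ohio"): 5,
    ("Phone", "Wisconsin"): 7,
}


def roll_up(view, keep):
    """Re-aggregate a SUM view onto the key positions listed in keep."""
    totals = defaultdict(int)
    for key, total in view.items():
        totals[tuple(key[i] for i in keep)] += total
    return dict(totals)


sales_by_category = roll_up(sales_by_category_state, keep=(0,))
sales_by_state = roll_up(sales_by_category_state, keep=(1,))
print(sales_by_category)
print(sales_by_state)
```

This works directly for SUM, COUNT, MIN and MAX; an average cannot be rolled up from averages alone and needs the sum and count stored separately.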
Toad Edge for MySQL
Install BioSQL locally
• Get the latest version of MySQL (MAMP, MariaDB)
• Download biosqldb-mysql.sql
• Remove type=innodb
• Launch the database server
• Connect using Toad (port 8889)
• Run CREATE DATABASE biosql;
• Set it as the active database
• Use the worksheet to execute biosqldb-mysql.sql
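The "remove type=innodb" step can also be scripted instead of done by hand. This is a minimal sketch, assuming the option appears in the schema file as TYPE=INNODB after each CREATE TABLE; the function name strip_table_type and the shortened sample statement are hypothetical. Without the option, the table uses the server's default storage engine, which is InnoDB on MySQL 5.5 and later.

```python
import re


def strip_table_type(schema_sql):
    """Remove legacy 'TYPE=INNODB' table options from a MySQL schema script."""
    return re.sub(r"\s*\bTYPE\s*=\s*INNODB\b", "", schema_sql, flags=re.IGNORECASE)


# Shortened, illustrative statement; not copied from the real schema file.
sample = (
    "CREATE TABLE biodatabase (\n"
    "  biodatabase_id INT(10) UNSIGNED NOT NULL auto_increment,\n"
    "  name VARCHAR(128) NOT NULL\n"
    ") TYPE=INNODB;"
)
cleaned = strip_table_type(sample)
print(cleaned)
```

To apply it to the downloaded file, read biosqldb-mysql.sql as text, pass the contents through the function, and write the result to a new file before executing it.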
MySQL and Python DB API (pymysql)
Database drivers
pymysql Installation
pip install pymysql
MySQL Installation
brew install mysql
# Path Setting and inserting into .bash_profile
export MYSQL_PATH=/usr/local/Cellar/mysql/5.7.14
export PATH=$PATH:$MYSQL_PATH/bin
MySQL Start
Start: mysql.server start
Connect as the root user: mysql -u root
Create a database:
CREATE DATABASE djangogirls;
Exit:
exit
Connecting to MySQL using a Client Tool
Client tools such as Toad, Sequel Pro, and DataGrip help to manage databases.
But the tool for today is PyCharm!
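import pymysql

print("Uploading data")
# Connect to the local MySQL server (MAMP listens on port 8889).
db = pymysql.connect(host="localhost", port=8889, user="root",
                     passwd="root", db="db")
try:
    cursor = db.cursor()
    # cursor.execute("DROP TABLE IF EXISTS USER")
    # Pass the values separately so that the driver quotes them.
    sql = "INSERT INTO tb (tb_id, tb_name, tb_age, tb_sex) VALUES (%s, %s, %s, %s)"
    cursor.execute(sql, ("1", "Demo", "26", "ma"))
    db.commit()
finally:
    # Close the connection even if the insert fails.
    db.close()
print("Done")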
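Values can be handed to the driver separately from the SQL text (parameter binding), which avoids quoting mistakes when the data comes from a file or a user. The sketch below wraps that pattern in a small helper; insert_row is a hypothetical name, and Python's built-in sqlite3 module stands in for MySQL so the example runs without a server. pymysql uses %s as its placeholder, while sqlite3 uses ?, hence the placeholder argument.

```python
import sqlite3


def insert_row(conn, table, row, placeholder="%s"):
    """Insert a dict as one row using DB-API parameter binding.

    Table and column names are interpolated into the SQL text, so they
    must come from trusted code; only the values are bound by the driver.
    """
    columns = ", ".join(row)
    marks = ", ".join([placeholder] * len(row))
    sql = f"INSERT INTO {table} ({columns}) VALUES ({marks})"
    cursor = conn.cursor()
    try:
        cursor.execute(sql, tuple(row.values()))
        conn.commit()
    except Exception:
        # Undo the partial change before re-raising the error.
        conn.rollback()
        raise
    finally:
        cursor.close()


# sqlite3 stands in for MySQL here so the sketch runs without a server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tb (tb_id INTEGER, tb_name TEXT, tb_age INTEGER, tb_sex TEXT)")
insert_row(
    conn,
    "tb",
    {"tb_id": 1, "tb_name": "Demo", "tb_age": 26, "tb_sex": "ma"},
    placeholder="?",
)
rows = conn.execute("SELECT * FROM tb").fetchall()
print(rows)
```

With a pymysql connection the same helper would be called with the default placeholder.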
Import from BioPython to BioSQL
# Connecting to a BioSQL database - http://biopython.org/wiki/BioSQL
from Bio import Entrez
from Bio import SeqIO
from BioSQL import BioSeqDatabase

server = BioSeqDatabase.open_database(driver="pymysql", host="localhost", port=8889,
                                      user="root", passwd="root", db="bio2019")
# Create the sub-database, then look it up by name.
db = server.new_database("test2")
db = server["test2"]

# NCBI asks for a contact e-mail address with Entrez requests.
Entrez.email = "A.N.Other@example.com"
handle = Entrez.efetch(db="nucleotide", rettype="gb", retmode="text", id="6273291,6273290,6273289")
print("Loading into BioSQL")
count = db.load(SeqIO.parse(handle, "genbank"))
print("Loaded %i records" % count)
server.adaptor.commit()
print("Ended successfully")
Lab for Bioinformatics and computational genomics
The Technical Feasibility Argument
The Quality Argument
The Price Argument
The Logistics Argument
Recreational genomics
• Experimental designs are outdated by technological advances
• Genetic background (reference genome) as a concept will need to be
updated
• Traits dependent on multiple loci are “complicated”: educate and
provide tools to deal with it
• Eye color … why not the ear wax/asparagus or unibrow example
• … metabolize nutrients (newborns ?)
• … metabolize drugs in case you need it urgently ?
“several 23andMe users have reported taking the FDA’s
advice of reviewing their genetic results with their
physicians, only to find the doctors unprepared, unwilling,
or downright hostile to helping interpret the data”
my genome is too important (for me)
to leave it (only) to doctors
NXTGNT biohackerspace …
PGMv2: Personal Genomics Manifesto
• Everyone should have the power and legitimacy to discover, develop and find new things about their own genome data. Intelligent exploration, experimentation and trial to push the boundaries of knowledge are a basic human right.
• Personal genome data access should be affordable to all, irrespective of nationality, gender, social background or any other circumstance. Not having access to a personal genetic test is in itself a new kind of discrimination.
• Whether one wants to share genome data or keep it private should be a matter of personal choice. Whatever attitude a person has towards personal genome privacy, it should be utterly respected. Corporate interest can never compromise any human right. Laws must fully protect individual human rights of equality for every person, irrespective of predicted risks from genetic data.
• Stating that genetic tests merely provide non-clinical information misses the point of what personal genomics is all about. Most genomic information is uninterpretable and may well be meaningless. But those are not reasons to deny it to people. Genetic test results are not unrelated to someone’s health, one’s ability to respond to certain drugs and one’s ethnic ancestry.
• Education in risks and opportunities for personal genetic testing should be the primary aim of policy makers. Restricting access for interested people makes no sense and is virtually impossible to enforce. Personal genomics data and tools for its interpretation should become accessible to everyone.
Overview
• Who ? Where ?
• > Genetics
• Technology: Next Gen Sequencing
• Personal …. Medicine/Genomics
• Manifesto
• The App
^[now][transl⎮comput]ational[epi]genomic$
Disha NEET Physics Guide for classes 11 and 12.pdf
 
1029 - Danh muc Sach Giao Khoa 10 . pdf
1029 -  Danh muc Sach Giao Khoa 10 . pdf1029 -  Danh muc Sach Giao Khoa 10 . pdf
1029 - Danh muc Sach Giao Khoa 10 . pdf
 
Russian Call Girls in Andheri Airport Mumbai WhatsApp 9167673311 💞 Full Nigh...
Russian Call Girls in Andheri Airport Mumbai WhatsApp  9167673311 💞 Full Nigh...Russian Call Girls in Andheri Airport Mumbai WhatsApp  9167673311 💞 Full Nigh...
Russian Call Girls in Andheri Airport Mumbai WhatsApp 9167673311 💞 Full Nigh...
 
Accessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactAccessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impact
 
Measures of Dispersion and Variability: Range, QD, AD and SD
Measures of Dispersion and Variability: Range, QD, AD and SDMeasures of Dispersion and Variability: Range, QD, AD and SD
Measures of Dispersion and Variability: Range, QD, AD and SD
 
Mastering the Unannounced Regulatory Inspection
Mastering the Unannounced Regulatory InspectionMastering the Unannounced Regulatory Inspection
Mastering the Unannounced Regulatory Inspection
 
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdfBASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
 
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxSOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
 
A Critique of the Proposed National Education Policy Reform
A Critique of the Proposed National Education Policy ReformA Critique of the Proposed National Education Policy Reform
A Critique of the Proposed National Education Policy Reform
 
Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...
Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...
Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...
 
The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13
 

2019 03 05_biological_databases_part4_v_upload

  • 4. Data Warehousing and Decision Support
  • 5. Views and Decision Support
      • OLAP queries are typically aggregate queries.
          – Precomputation is essential for interactive response times.
          – The CUBE is in fact a collection of aggregate queries, and precomputation is especially important: lots of work on what is best to precompute given a limited amount of space to store precomputed results.
      • Warehouses can be thought of as a collection of asynchronously replicated tables and periodically maintained views.
          – Has renewed interest in view maintenance!
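To make the remark that the CUBE is a collection of aggregate queries concrete, here is a minimal sketch in plain Python. The function name `cube`, the dictionary-based rows and the sample figures are illustrative assumptions, not part of the slides; a real warehouse would choose which of these group-bys to precompute rather than computing them all.

```python
from collections import defaultdict
from itertools import combinations


def cube(rows, dims, measure):
    """Compute SUM(measure) for every subset of the dimensions (2^k group-bys)."""
    result = {}
    for k in range(len(dims) + 1):
        for group in combinations(dims, k):
            totals = defaultdict(float)
            for row in rows:
                # The grouping key is the tuple of values for the chosen dimensions.
                key = tuple(row[d] for d in group)
                totals[key] += row[measure]
            result[group] = dict(totals)
    return result


# Hypothetical sample data, invented for this illustration.
sample_rows = [
    {"category": "Laptop", "state": "Wisconsin", "sales": 100.0},
    {"category": "Laptop", "state": "Wisconsin", "sales": 50.0},
    {"category": "Phone", "state": "Wisconsin", "sales": 30.0},
    {"category": "Laptop", "state": "Ohio", "sales": 70.0},
]

sales_cube = cube(sample_rows, ("category", "state"), "sales")
```

With two dimensions the result holds four aggregates: the grand total, one per category, one per state, and one per (category, state) pair.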
  • 6. View Modification (Evaluate On Demand)
      View:
        CREATE VIEW RegionalSales(category,sales,state)
        AS SELECT P.category, S.sales, L.state
        FROM Products P, Sales S, Locations L
        WHERE P.pid=S.pid AND S.locid=L.locid
      Query:
        SELECT R.category, R.state, SUM(R.sales)
        FROM RegionalSales AS R
        GROUP BY R.category, R.state
      Modified Query:
        SELECT R.category, R.state, SUM(R.sales)
        FROM (SELECT P.category, S.sales, L.state
              FROM Products P, Sales S, Locations L
              WHERE P.pid=S.pid AND S.locid=L.locid) AS R
        GROUP BY R.category, R.state
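The equivalence between the query on the view and the modified query can be checked with a small script. This is a sketch that uses Python's built-in `sqlite3` module as a stand-in for the MySQL server used elsewhere in the deck; the table contents are invented for the illustration, and an `ORDER BY` is added only to make the two result lists directly comparable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Products (pid INTEGER PRIMARY KEY, category TEXT);
CREATE TABLE Locations (locid INTEGER PRIMARY KEY, state TEXT);
CREATE TABLE Sales (pid INTEGER, locid INTEGER, sales REAL);

-- Hypothetical sample rows.
INSERT INTO Products VALUES (1, 'Laptop'), (2, 'Phone');
INSERT INTO Locations VALUES (10, 'Wisconsin'), (20, 'Ohio');
INSERT INTO Sales VALUES (1, 10, 100.0), (1, 10, 50.0), (2, 10, 30.0), (1, 20, 70.0);

CREATE VIEW RegionalSales(category, sales, state) AS
  SELECT P.category, S.sales, L.state
  FROM Products P, Sales S, Locations L
  WHERE P.pid = S.pid AND S.locid = L.locid;
""")

# Query written against the view.
view_query = """
SELECT R.category, R.state, SUM(R.sales)
FROM RegionalSales AS R
GROUP BY R.category, R.state
ORDER BY R.category, R.state
"""

# Modified query: the view name is replaced by its defining SELECT.
modified_query = """
SELECT R.category, R.state, SUM(R.sales)
FROM (SELECT P.category, S.sales, L.state
      FROM Products P, Sales S, Locations L
      WHERE P.pid = S.pid AND S.locid = L.locid) AS R
GROUP BY R.category, R.state
ORDER BY R.category, R.state
"""

via_view = conn.execute(view_query).fetchall()
via_subquery = conn.execute(modified_query).fetchall()
conn.close()
```

Both `via_view` and `via_subquery` hold the same rows, which is the point of view modification: the view is evaluated on demand by substituting its definition into the query.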
  • 7. View Materialization (Precomputation)
      • Suppose we precompute RegionalSales and store it with a clustered B+ tree index on [category,state,sales].
          – Then, previous query can be answered by an index-only scan.
      Index on precomputed view is great!
        SELECT R.state, SUM(R.sales)
        FROM RegionalSales R
        WHERE R.category='Laptop'
        GROUP BY R.state
      Index is less useful (must scan entire leaf level):
        SELECT R.category, SUM(R.sales)
        FROM RegionalSales R
        WHERE R.state='Wisconsin'
        GROUP BY R.category
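The two queries on the precomputed view can be tried out as follows. This is a sketch under stated assumptions: `sqlite3` stands in for the real database, the stored rows are invented, and the SQLite index `idx_regional_sales` is an ordinary secondary B-tree index rather than the clustered B+ tree the slide describes. The second query is written with `R.category` in the SELECT list so that it is valid together with `GROUP BY R.category`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The materialized view is simply a stored table holding the view's tuples.
conn.execute("CREATE TABLE RegionalSales (category TEXT, state TEXT, sales REAL)")
conn.executemany(
    "INSERT INTO RegionalSales VALUES (?, ?, ?)",
    [
        ("Laptop", "Wisconsin", 100.0),
        ("Laptop", "Wisconsin", 50.0),
        ("Phone", "Wisconsin", 30.0),
        ("Laptop", "Ohio", 70.0),
    ],
)

# Composite index with category as the leading column (hypothetical name).
conn.execute("CREATE INDEX idx_regional_sales ON RegionalSales (category, state, sales)")

# The selection is on the leading index column, so matching entries are contiguous.
by_state = conn.execute(
    "SELECT R.state, SUM(R.sales) FROM RegionalSales R "
    "WHERE R.category = ? GROUP BY R.state ORDER BY R.state",
    ("Laptop",),
).fetchall()

# The selection is on the second index column, so all index entries must be examined.
by_category = conn.execute(
    "SELECT R.category, SUM(R.sales) FROM RegionalSales R "
    "WHERE R.state = ? GROUP BY R.category ORDER BY R.category",
    ("Wisconsin",),
).fetchall()
conn.close()
```

Both queries return correct answers either way; the difference the slide points at is how much of the index has to be read to produce them.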
  • 8. Materialized Views
      • A view whose tuples are stored in the database is said to be materialized.
          – Provides fast access, like a (very high-level) cache.
          – Need to maintain the view as the underlying tables change.
          – Ideally, we want incremental view maintenance algorithms.
      • Close relationship to data warehousing, OLAP, (asynchronously) maintaining distributed databases, checking integrity constraints, and evaluating rules and triggers.
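Incremental view maintenance can be illustrated for a simple SUM aggregate. The sketch below is plain Python, not any particular database's algorithm: the helper names (`recompute`, `apply_insert`, `apply_delete`) and the sample rows are assumptions made for the example. Each group stores a running sum and a row count, so an insert or delete on the base data touches only one group instead of triggering a full recomputation.

```python
def recompute(rows):
    """Full recomputation: (sum, count) per (category, state) group."""
    view = {}
    for category, state, sales in rows:
        total, count = view.get((category, state), (0.0, 0))
        view[(category, state)] = (total + sales, count + 1)
    return view


def apply_insert(view, row):
    """Incrementally fold one inserted base row into the materialized view."""
    category, state, sales = row
    total, count = view.get((category, state), (0.0, 0))
    view[(category, state)] = (total + sales, count + 1)


def apply_delete(view, row):
    """Incrementally remove one deleted base row; drop the group when it empties."""
    category, state, sales = row
    total, count = view[(category, state)]
    if count == 1:
        del view[(category, state)]
    else:
        view[(category, state)] = (total - sales, count - 1)


# Hypothetical base data and updates.
base = [("Laptop", "Wisconsin", 100.0), ("Laptop", "Wisconsin", 50.0), ("Phone", "Ohio", 30.0)]
materialized = recompute(base)

apply_insert(materialized, ("Phone", "Wisconsin", 20.0))
apply_delete(materialized, ("Laptop", "Wisconsin", 50.0))
apply_delete(materialized, ("Phone", "Ohio", 30.0))

# The same state obtained the expensive way, for comparison.
expected = recompute([("Laptop", "Wisconsin", 100.0), ("Phone", "Wisconsin", 20.0)])
```

Keeping the count next to the sum is a design choice that lets deletions remove a group once its last row disappears; a sum alone could not distinguish an empty group from one that totals zero.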
  • 9. Issues in View Materialization
      • What views should we materialize, and what indexes should we build on the precomputed results?
      • Given a query and a set of materialized views, can we use the materialized views to answer the query?
      • How frequently should we refresh materialized views to make them consistent with the underlying tables? (And how can we do this incrementally?)
  • 10. Toad Edge for MySQL
  • 11. Install BIOSQL locally
      • Get the latest version of MySQL (MAMP, MariaDB)
      • Download biosqldb-mysql.sql
      • Remove type=innodb
      • Launch the database server
      • Connect using Toad (port 8889)
      • CREATE DATABASE biosql;
      • Set as active database
      • Use the worksheet to execute biosqldb-mysql.sql
  • 18. MySQL and the Python DB API (pymysql)
  • 21. MySQL Installation
        brew install mysql
        # Path setting, inserted into .bash_profile
        export MYSQL_PATH=/usr/local/Cellar/mysql/5.7.14
        export PATH=$PATH:$MYSQL_PATH/bin
  • 22. MySQL Start
      • Start: mysql.server start
      • Connection by root user: mysql -u root
      • Creating database: CREATE DATABASE djangogirls;
      • Exit: exit
  • 23. Connecting to MySQL using a Client Tool
      Tools that help to manage databases include Toad, Sequel Pro, DataGrip, etc. But the tool for today is PyCharm!
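  • 24.
        import pymysql

        print("Uploading data")

        # Connect to the local MySQL server (same port as the Toad connection above)
        db = pymysql.connect(host="localhost", port=8889, user="root", passwd="root", db="db")
        cursor = db.cursor()
        # cursor.execute("DROP TABLE IF EXISTS USER")

        # Insert one demo row into table tb
        sql = "insert into tb (tb_id,tb_name,tb_age,tb_sex) values ('1','Demo','26','ma')"
        cursor.execute(sql)

        # Make the insert permanent, then release the connection
        db.commit()
        db.close()
        print("Done")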
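The insert above builds the SQL string with the values written inline. A common alternative under the Python DB API is to pass the values separately as parameters. The sketch below shows that pattern with the standard-library `sqlite3` module as a stand-in, so it runs without a MySQL server; the table definition is an assumption made for the example. With pymysql the same call would use `%s` placeholders instead of `?`.

```python
import sqlite3

# In-memory database as a stand-in for the MySQL server (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tb (tb_id INTEGER PRIMARY KEY, tb_name TEXT, tb_age INTEGER, tb_sex TEXT)"
)

row = (1, "Demo", 26, "ma")
try:
    # The connection used as a context manager commits on success
    # and rolls back if the block raises.
    with conn:
        conn.execute(
            "INSERT INTO tb (tb_id, tb_name, tb_age, tb_sex) VALUES (?, ?, ?, ?)",
            row,
        )
    stored = conn.execute("SELECT tb_id, tb_name, tb_age, tb_sex FROM tb").fetchall()
finally:
    # Close the connection even if the insert fails.
    conn.close()
```

Passing parameters this way leaves quoting and escaping to the driver, which matters as soon as the values come from user input or from a file.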
  • 25. Import from BioPython to BIOSQL
        # Connecting to a BioSQL database - http://biopython.org/wiki/BioSQL
        from Bio import Entrez
        from Bio import SeqIO
        from BioSQL import BioSeqDatabase

        # Open the BioSQL schema stored in the MySQL database "bio2019"
        server = BioSeqDatabase.open_database(driver="pymysql", host="localhost", port=8889,
                                              user="root", passwd="root", db="bio2019")

        # Create the sub-database "test2" and get a handle to it
        db = server.new_database("test2")
        db = server["test2"]

        # Fetch three GenBank records from NCBI
        Entrez.email = "A.N.Other@example.com"
        handle = Entrez.efetch(db="nucleotide", rettype="gb", retmode="text",
                               id="6273291,6273290,6273289")

        print("Loading into BIOSQL")
        count = db.load(SeqIO.parse(handle, "genbank"))
        print("Loaded %i records" % count)

        # Commit so the loaded records are stored permanently
        server.adaptor.commit()
        print("ended successfully")
  • 26. Lab for Bioinformatics and computational genomics
  • 34.
      • The Technical Feasibility Argument
      • The Quality Argument
      • The Price Argument
      • The Logistics Argument
  • 36. Recreational genomics
  • 37. Recreational genomics
      • Experimental designs are outdated by technological advances
      • Genetic background (reference genome) as a concept will need to be updated
      • Traits dependent on multiple loci are “complicated”: educate and provide tools to deal with it
  • 38. Recreational genomics
  • 39. Recreational genomics
      • Eye color … why not the ear wax/asparagus or unibrow example
      • … metabolize nutrients (newborns?)
      • … metabolize drugs in case you need it urgently?
  • 40. Recreational genomics
  • 41. Recreational genomics: “several 23andMe users have reported taking the FDA’s advice of reviewing their genetic results with their physicians, only to find the doctors unprepared, unwilling, or downright hostile to helping interpret the data”
  • 43. Recreational genomics
  • 45. Recreational genomics
  • 46. Recreational genomics
  • 48. My genome is too important (for me) to leave it (only) to doctors
  • 50. NXTGNT biohackerspace …
  • 51. PGMv2: Personal Genomics Manifesto
  • 52. PGMv2: Personal Genomics Manifesto
      Everyone should have the power and legitimacy to be able to discover, develop and find new things about their own genome data. Intelligent exploration, experimentation and trial to push the boundaries of knowledge are a basic human right.
  • 53. PGMv2: Personal Genomics Manifesto
      Personal genome data access should be affordable to all irrespective of nationality, gender, social background or any other circumstance. Not having access to a personal genetic test is in itself a new kind of discrimination.
  • 54. PGMv2: Personal Genomics Manifesto
      Whether one wants to share genome data or keep it private should be a matter of personal choice. Whatever attitude a person has towards personal genome privacy, it should be utterly respected. Corporate interest can never compromise any human right. Laws must fully protect individual human rights of equality for every person, irrespective of predicted risks from genetic data.
  • 55. PGMv2: Personal Genomics Manifesto
      Stating that genetic tests merely provide non-clinical information misses the point of what personal genomics is all about. Most genomic information is uninterpretable and may well be meaningless. But those are not reasons to deny it to people. Genetic test results are not unrelated to someone’s health, one’s ability to respond to certain drugs and one’s ethnic ancestry.
  • 56. PGMv2: Personal Genomics Manifesto
      Education in risks and opportunities for personal genetic testing should be the primary aim of policy makers. Restricting access to interested people makes no sense and it is virtually impossible to ensure. Access to personal genomics data and tools for its interpretation should become accessible to everyone.
  • 57. Overview
      • Who? Where?
      • > Genetics
      • Technology: Next Gen Sequencing
      • Personal …. Medicine/Genomics
      • Manifesto
      • The App
      ^[now][transl⎮comput]ational[epi]genomic$