Biostatistics and Statistical
      Bioinformatics



            Setia Pramana

    Universitas Brawijaya Malang,
           7 October 2011

                                    1
BECOMING A
STATISTICIAN?


                2
Who Needs Statisticians?
• Can a statistician only become a lecturer or teacher?
• No. There are many more applied fields.
• My classmates work in:
   – Information and Communication
     Technology
   – Research and Development
   – Government: Ministry of Finance, PLN,
     Bank Indonesia, Danareksa, etc.
   – Entrepreneurship
   – Many more...
• Writer...
• Read the book: 9 Summers 10 Autumns
                                              3
4
BIOSTATISTICIANS



                   5
Biostatistics

 • The study of statistics as applied to biological
   areas such as biological laboratory
   experiments, medical research (including
   clinical research), and public health services
   research.
 • ‘Biostatistics, far from being an unrelated
   mathematical science, is a discipline essential
   to modern medicine – a pillar in its edifice’
   (Journal of the American Medical Association,
   1966)



                                                      6
Biostatistics

 • Public Health:
    – Epidemiology
    – Modeling Infectious Diseases: HIV, HCV
    – Disease Mapping
    – Genetics: family-related diseases

 • Bioinformatics
    – Image Processing
    – Data Mining
    – Pattern recognition
    – etc
                                               7
Biostatistics

 • Agriculture
    – Experimental Design
    – Genetics
 • Biomedical Research
 • Evidence-based medicine
 • Clinical studies
 • Drug Development




                               8
Statistical Methods?
•   t-test
•   ANOVA
•   Regression
•   Cluster analysis
•   Discriminant analysis
•   Non-Linear Modeling
•   Multiple comparison
•   Linear Mixed Model
•   Bayesian
•   Etc.

                            9
BIOSTATISTICIANS IN DRUG
DEVELOPMENT


                           10
Drug Development

 • Takes 10-15 years
 • Costs far more than 1 million USD; commonly
   cited estimates are on the order of 1 billion USD
 • Goal: to ensure that only drugs that are both
   safe and effective can be marketed.
 • Stages:
   - Drug Discovery
   - Pre-clinical Development
   - Clinical Development -> 4 Phases
 Statisticians are involved in all stages (a must)


                                                     11
Pharmaceutical development      discovery of compound; synthesis and
                                purification of drug substance;
                                manufacturing procedures
Pre-clinical (animal) studies   pharmacological profile; acute toxicity;
                                effects of long-term usage
Investigational New Drug application

Phase I clinical trials         small; focus on safety
Phase II clinical trials        medium size; focus on safety and
                                short-term efficacy
Phase III clinical trials       large and comparative; focus on
                                efficacy and cost benefits
New Drug Application

Phase IV clinical trials        "real world" experience; demonstrate
                                cost benefits; rare adverse reactions
                                                                     12
International Conference on
Harmonization (ICH)
 • Aim: international harmonization of
   requirements for drug research and
   development, so that information generated in
   one country or area is acceptable to
   other countries or areas.
 • Regions: Europe, USA, Japan.
 • All clinical trials must follow ICH guidelines.
 • Statistics plays an important role.
 • Statistical Principles for Clinical Trials (ICH
   E9).


                                                      13
Preclinical and Clinical Development

 • Statisticians are involved from the beginning
   of the study
 • Planning the study
    – Formulating the hypothesis
    – Choosing the endpoint
    – Choosing the design and sample size
 • Conduct of the study
    – Patient accrual
    – Data collection
 • Data quality control and data analysis
 • Publication of results
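
To make "choosing the design and sample size" concrete, here is a minimal sketch of the textbook normal-approximation formula for comparing two group means with equal group sizes. The planning values (a difference of 5 units, a standard deviation of 10, 5% two-sided significance, 80% power) are hypothetical and chosen only for illustration; a real study would take them from its protocol.

```python
import math
from statistics import NormalDist


def sample_size_two_means(delta, sigma, alpha=0.05, power=0.80):
    """Per-group sample size for a two-sided comparison of two means.

    Normal approximation:
        n = 2 * (z_{1 - alpha/2} + z_{power})^2 * sigma^2 / delta^2
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    n = 2 * (z_alpha + z_power) ** 2 * sigma ** 2 / delta ** 2
    return math.ceil(n)  # round up to whole patients


# Hypothetical planning values, not taken from the slides.
n_per_group = sample_size_two_means(delta=5.0, sigma=10.0)
print(n_per_group)
```

Halving the difference to be detected roughly quadruples the required sample size, which is one reason the endpoint and the design are fixed together at the planning stage.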
                                                   14
BIOINFORMATICS



                 15
Bioinformatics

 • Bioinformatics is a science straddling the
   domains of biomedicine, informatics,
   mathematics and statistics.
 • Applying computational techniques to biological
   data

 •   Functional Genomics
 •   Proteomics
 •   Sequence Analysis
 •   Phylogenetics
 •   Etc.
                                                  16
“Informatics” in Bioinformatics

 • Databases
    – Building, Querying
    – Object DB
 • Text String Comparison
    – Text Search
 • Finding Patterns
    – AI / Machine Learning
    – Clustering
    – Data mining
 • etc

                                  17
Central Dogma of Molecular Biology

• Genes contain
  construction
  information
• All structure and
  function are made
  up of proteins




                                     18
Genomics

 • Premise: Physiological changes -> Gene
   expression changes -> mRNA abundance
   level changes

 • Objective: Use gene expression levels
   measured via DNA microarrays to identify a
   set of genes that are differentially expressed
   across two sets of samples (e.g., in diseased
   cells compared to normal cells)




                                                    19
Microarray Technology

 • DNA microarrays are a new and promising
   biotechnology that allows the expression of
   thousands of genes to be monitored simultaneously




                                                 20
Gene Expression Analysis

• Overview of the
  process of
  generating high
  throughput gene
  expression data
  using
  microarrays.




                           21
Preprocessed data

 Genes    C1    C2    C3    T1    T2    T3
 G8521    6.89  7.18  6.60  7.40  7.15  7.40
 G8522    6.78  6.55  6.37  6.89  6.78  6.92
 G8523    6.52  6.61  6.72  6.51  6.59  6.46
 G8524    5.67  5.69  5.88  7.43  7.16  7.31
 G8525    5.64  5.91  5.61  7.41  7.49  7.41
 G8526    4.63  4.85  5.72  5.71  5.47  5.79
 G8527    8.28  7.88  7.84  8.12  7.99  7.97
 G8528    7.81  7.58  7.24  7.79  7.38  8.60
 G8529    4.26  4.20  4.82  3.11  4.94  3.08
 G8530    7.36  7.45  7.31  7.46  7.53  7.35
 G8531    5.30  5.36  5.70  5.41  5.73  5.77
 G8532    5.84  5.48  5.93  5.84  5.73  5.75
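
As a rough illustration of how such a table might be screened for differentially expressed genes, the sketch below computes a per-gene difference in means and a Welch t statistic, then ranks genes by the absolute statistic. It assumes that C1 to C3 are control arrays and T1 to T3 are treated arrays, which the slide does not state explicitly. The choice of the Welch statistic is ours; with only three arrays per group the variance estimates are unstable, so the ranking is indicative at best.

```python
from math import sqrt
from statistics import mean, variance

# Values copied from the slide; assumed layout: (C1..C3, T1..T3) per gene.
data = {
    "G8521": ([6.89, 7.18, 6.60], [7.40, 7.15, 7.40]),
    "G8522": ([6.78, 6.55, 6.37], [6.89, 6.78, 6.92]),
    "G8523": ([6.52, 6.61, 6.72], [6.51, 6.59, 6.46]),
    "G8524": ([5.67, 5.69, 5.88], [7.43, 7.16, 7.31]),
    "G8525": ([5.64, 5.91, 5.61], [7.41, 7.49, 7.41]),
    "G8526": ([4.63, 4.85, 5.72], [5.71, 5.47, 5.79]),
    "G8527": ([8.28, 7.88, 7.84], [8.12, 7.99, 7.97]),
    "G8528": ([7.81, 7.58, 7.24], [7.79, 7.38, 8.60]),
    "G8529": ([4.26, 4.20, 4.82], [3.11, 4.94, 3.08]),
    "G8530": ([7.36, 7.45, 7.31], [7.46, 7.53, 7.35]),
    "G8531": ([5.30, 5.36, 5.70], [5.41, 5.73, 5.77]),
    "G8532": ([5.84, 5.48, 5.93], [5.84, 5.73, 5.75]),
}


def welch_t(control, treated):
    """Welch two-sample t statistic (treated minus control)."""
    diff = mean(treated) - mean(control)
    se = sqrt(variance(control) / len(control) + variance(treated) / len(treated))
    return diff / se


# For each gene: (difference in means, t statistic).
results = {
    gene: (mean(t) - mean(c), welch_t(c, t)) for gene, (c, t) in data.items()
}

# Rank genes from the largest to the smallest absolute t statistic.
ranked = sorted(results, key=lambda g: abs(results[g][1]), reverse=True)

for gene in ranked:
    diff, t_stat = results[gene]
    print(f"{gene}  diff={diff:+.2f}  t={t_stat:+.2f}")
```

On these values G8525 and G8524 stand out clearly from the other genes.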
                                            22
Applications

 • Drugs with high efficacy and few or no side effects
 • Personalized medicine
 • Disease-related genes
 • Biological discovery
    – new and better molecular diagnostics
    – new molecular targets for therapy
    – finding and refining biological pathways
 • Molecular diagnosis of leukemia, breast
   cancer, etc.
 • Appropriate treatment for a given genetic signature
 • Potential new drug targets
                                                 23
Challenges

 • Very large data sets, difficult to visualize
 • Too few columns (samples), usually <
   100
 • Too many rows (genes), usually > 1,000
 • Testing so many genes is likely to lead to false
   positives
 • For exploration, a large set of all relevant genes
   is desired
 • For diagnostics or identification of therapeutic
   targets, the smallest set of genes is needed
 • The model needs to be explainable to biologists
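
The false-positive problem comes from running one test per gene. One common remedy, not named on the slide and shown here only as a sketch, is to control the false discovery rate with the Benjamini-Hochberg adjustment. The raw p-values below are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values for ten genes.
p_raw = [0.001, 0.008, 0.039, 0.041, 0.042, 0.060, 0.074, 0.205, 0.212, 0.216]
p_adj = benjamini_hochberg(p_raw)

naive = [i for i, p in enumerate(p_raw) if p <= 0.05]     # no correction
selected = [i for i, p in enumerate(p_adj) if p <= 0.05]  # FDR at 5%

print(len(naive), "genes pass without correction")
print(len(selected), "genes pass after adjustment")
```

With thousands of genes instead of ten, the gap between the uncorrected and the adjusted list is usually much larger.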
                                                   24
Microarray Data Analysis Types

• Gene Selection
   – find genes for therapeutic targets
• Classification (Supervised)
   – identify disease (biomarker study)
   – predict outcome / select best treatment
• Clustering (Unsupervised)
   – find new biological classes / refine existing ones
   – understand regulatory relationships/pathways
   – exploration



                                                     25
Gene Selection

 • Modified t-test
 • Significance Analysis of Microarrays (SAM)
 • Limma (Linear Models for Microarray Data)
 • Random forest
 • Lasso (least absolute shrinkage and selection
   operator)
 • Linear Mixed model
 • Elastic-net
 • Etc.
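
To give a feel for what a "modified" t-test changes, the sketch below adds a small positive constant s0 to the denominator of the t statistic, in the spirit of the SAM statistic d = diff / (s + s0). Genes whose variance happens to be tiny then no longer get inflated scores. Setting s0 to the median of the gene-wise standard errors is a simplifying assumption made here; SAM chooses its constant by its own procedure, and limma moderates variances differently. The four genes are taken from the preprocessed data slide, again assuming three control and three treated arrays.

```python
from math import sqrt
from statistics import mean, median, variance

# Subset of the slide's table; assumed layout: (controls, treated).
genes = {
    "G8523": ([6.52, 6.61, 6.72], [6.51, 6.59, 6.46]),
    "G8524": ([5.67, 5.69, 5.88], [7.43, 7.16, 7.31]),
    "G8525": ([5.64, 5.91, 5.61], [7.41, 7.49, 7.41]),
    "G8529": ([4.26, 4.20, 4.82], [3.11, 4.94, 3.08]),
}


def diff_and_se(control, treated):
    """Difference in means and its pooled-variance standard error."""
    n1, n2 = len(control), len(treated)
    pooled_var = (
        (n1 - 1) * variance(control) + (n2 - 1) * variance(treated)
    ) / (n1 + n2 - 2)
    se = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return mean(treated) - mean(control), se


stats = {g: diff_and_se(c, t) for g, (c, t) in genes.items()}

# Assumption for illustration only: s0 is the median gene-wise standard error.
s0 = median([se for _, se in stats.values()])

ordinary = {g: d / se for g, (d, se) in stats.items()}
moderated = {g: d / (se + s0) for g, (d, se) in stats.items()}

for g in genes:
    print(f"{g}  ordinary t={ordinary[g]:+.2f}  moderated d={moderated[g]:+.2f}")
```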


                                                   26
Visualization

 •   Dimensionality reduction
 •   PCA (Principal Component Analysis)
 •   Biplot
 •   Multidimensional scaling
 •   Etc




                                          27
Clustering

 • Cluster the genes
 • Cluster the
   arrays/conditions
 • Cluster both
   simultaneously

 • K-means
 • Hierarchical
 • Biclustering
   algorithms
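
A minimal sketch of clustering the genes, using agglomerative hierarchical clustering with average linkage on a few rows of the preprocessed data slide. Centering each gene around its own mean, using Euclidean distance, and stopping at two clusters are choices made here for illustration, not something the slides prescribe.

```python
from math import dist

# A few genes from the slide's table (columns C1, C2, C3, T1, T2, T3).
raw = {
    "G8523": [6.52, 6.61, 6.72, 6.51, 6.59, 6.46],
    "G8524": [5.67, 5.69, 5.88, 7.43, 7.16, 7.31],
    "G8525": [5.64, 5.91, 5.61, 7.41, 7.49, 7.41],
    "G8527": [8.28, 7.88, 7.84, 8.12, 7.99, 7.97],
    "G8530": [7.36, 7.45, 7.31, 7.46, 7.53, 7.35],
    "G8532": [5.84, 5.48, 5.93, 5.84, 5.73, 5.75],
}


def center(values):
    """Subtract the gene's own mean so clusters reflect the pattern, not the level."""
    m = sum(values) / len(values)
    return [v - m for v in values]


def average_linkage(a, b, points):
    """Mean Euclidean distance between all members of cluster a and cluster b."""
    pairs = [dist(points[x], points[y]) for x in a for y in b]
    return sum(pairs) / len(pairs)


def agglomerate(points, n_clusters):
    """Merge the two closest clusters until n_clusters remain."""
    clusters = [[name] for name in points]
    while len(clusters) > n_clusters:
        i, j = min(
            ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))),
            key=lambda ij: average_linkage(clusters[ij[0]], clusters[ij[1]], points),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


points = {gene: center(values) for gene, values in raw.items()}
clusters = agglomerate(points, n_clusters=2)
print(clusters)
```

Transposing the table and running the same function would cluster the arrays/conditions instead of the genes.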

                       28
Clustering

• Cluster or
  Classify genes
  according to
  tumors

• Cluster tumors
  according to
  genes




                   29
Biclustering

 • A biclustering method is an unsupervised
   learning method which looks for sub-matrices
   in a data matrix with a high similarity of
   elements.
 • Algorithms: statistics-based, AI, machine
   learning.
 • BiclustGUI: A User Friendly Interface for
   Biclustering Analysis
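
One way to quantify "high similarity of elements" in a sub-matrix is the mean squared residue used by Cheng and Church's biclustering algorithm: it is zero when every entry equals row effect plus column effect plus a constant. The slides do not say which score is meant, so this is only one possible choice, and the small matrix below is hypothetical.

```python
def mean_squared_residue(matrix, rows, cols):
    """Mean squared residue of the sub-matrix given by row and column indices."""
    sub = [[matrix[r][c] for c in cols] for r in rows]
    n_rows, n_cols = len(rows), len(cols)
    row_means = [sum(row) / n_cols for row in sub]
    col_means = [sum(sub[i][j] for i in range(n_rows)) / n_rows for j in range(n_cols)]
    overall = sum(row_means) / n_rows
    total = sum(
        (sub[i][j] - row_means[i] - col_means[j] + overall) ** 2
        for i in range(n_rows)
        for j in range(n_cols)
    )
    return total / (n_rows * n_cols)


# Hypothetical data: the first three columns follow an additive pattern,
# the last column does not.
matrix = [
    [1.0, 2.0, 3.0, 9.0],
    [2.0, 3.0, 4.0, 1.0],
    [5.0, 6.0, 7.0, 4.0],
]

score_bicluster = mean_squared_residue(matrix, rows=[0, 1, 2], cols=[0, 1, 2])
score_full = mean_squared_residue(matrix, rows=[0, 1, 2], cols=[0, 1, 2, 3])
print(score_bicluster, score_full)
```

A biclustering algorithm of this family searches for row and column subsets whose score stays below a chosen threshold.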




                                                  30
Bicluster Structure




                      31
Software/Statistical Packages

 •   Minitab
 •   SAS
 •   SPSS
 •   R
 •   S-Plus
 •   Matlab
 •   Stata




                                32
• R is now growing, especially in bioinformatics
   – Statistics, data analysis, machine learning
   – Free
   – High Quality
   – Open Source
   – Extendable (you can submit and publish
     your own package!!)
   – Can be integrated with other languages (C/
     C++, Java, Python)
   – Large active user community
   – Command-based (a drawback for some users)
                                                   33
Summary

• Statisticians can flexibly get involved in many
  fields.
• Statistics provides the tools; the applications
  range widely.
• Biostatisticians have many opportunities in
  public health services (e.g., the Centers for Disease
  Control and Prevention, CDC), pharmaceutical
  companies, research institutions, etc.
• Statistical bioinformatics: cutting-edge
  technology -> methods are growing -> many
  more developments in the future.


                                                    34
Thank you for your
       attention...



        hafidztio@yahoo.com
http://setiopramono.wordpress.com



                                    35

Biostatistics and Statistical Bioinformatics

  • 1. Biostatistics and Statistical Bioinformatics Setia Pramana Universitas Brawijaya Malang, 7 October 2011 1
  • 3. Who Need Statisticians? • Can only become a lecturer/teacher? • NO…… More applied fields: • My classmates work in: – Information and Communication Technology. – Research and Developments – Governments: Ministry of Finance, PLN, Bank Indonesia, Danareksa, etc. – Entrepreneur – Many more... • Writer.... • Read the book: 9 Summers 10 Autumns 3
  • 4. 4
  • 6. Biostatistics • The study of statistics as applied to biological areas such as Biological laboratory experiments, medical research (including clinical research), and public health services research. • Biostatistics, far from being an unrelated mathematical science, is a discipline essential to modern medicine – a pillar in its edifice’ (Journal of the American Medical Association (1966) 6
  • 7. Biostatistics • Public Health: – Epidemiology – Modeling Infectious Diseases: HIV, HCV – Disease Mapping – Genetics: family related disease • Bioinformatics – Image Processing – Data Mining – Pattern recognition – etc 7
  • 8. Biostatistics • Agriculture – Experimental Design – Genetics • Biomedical Research • Evidence-based medicine • Clinical studies • Drug Development 8
  • 9. Statistical Methods? • t-test • ANOVA • Regression • Cluster analysis • Discriminant analysis • Non-Linear Modeling • Multiple comparison • Linear Mixed Model • Bayesian • Etc, • z 9
  • 11. Drugs Development • Takes 10-15 years • Cost more than 1 million USD • To ensure that only the drugs that are that are both safe and effective can be marketed. • Stages: - Drug Discovery - Pre-clinical Development - Clinical Development -> 4 Phases Statisticians are involved in all stages (a must) 11
  • 12. discovery of compound; synthesis Pharmaceutical development and purification of drug substance; manufacturing procedures Pre-clinical (animal) studies pharmacological profile; acute toxicity; effects of long-term usage Investigational New Drug application Phase I clinical trials small; focus on safety medium size; focus on safety and Phase II clinical trials short-term efficacy; Phase III clinical trials large and comparative; focus on efficacy and cost benefits New Drug Application „real world” experience; demonstrate Phase IV clinical trials cost benefits; rare adverse reactions 12 12
  • 13. International Conference on Harmonization (ICH) • The international harmonization of requirements for drug research and development so that information generated in one country or area would be acceptable to other countries or areas. • Regions: Europe, USA, Japan. • All clinical trials must follow ICH regulations. • Statistics plays important role. • Statistical Principles for Clinical Trials (ICH E9). 13
  • 14. Preclinical and Clinical Development • Statisticians are involved from the beginning of the study • Planning the study – Formulating the hypothesis – Choosing the endpoint – Choosing the design and sample size • Conduct of the study – Patient accrual – Data collection • Data Quality control, Data analysis • Publication of results 14
  • 16. Bioinformatics • Bioinformatics is a science straddling the domains of biomedical, informatics, mathematics and statistics. • Applying computational techniques to biology data • Functional Genomics • Proteomics • Sequence Analysis • Phylogenetic • Etc,. 16
  • 17. “Informatics” in Bioinformatics • Databases – Building, Querying – Object DB • •Text String Comparison – Text Search • Finding Patterns – AI / Machine Learning – Clustering – Data mining • etc 17
  • 18. Central Dogma of Molecular Biology • Genes contain the construction information • All structure and function are made up of proteins
  • 19. Genomics • Premise: Physiological changes -> Gene expression changes -> mRNA abundance level changes • Objective: Use gene expression levels measured via DNA microarrays to identify a set of genes that are differentially expressed across two sets of samples (e.g., in diseased cells compared to normal cells)
  • 20. Microarray Technology • DNA microarrays are a new and promising biotechnology that allows the expression of thousands of genes to be monitored simultaneously
  • 21. Gene Expression Analysis • Overview of the process of generating high throughput gene expression data using microarrays.
  • 22. Preprocessed data
    Genes   C1    C2    C3    T1    T2    T3
    G8521   6.89  7.18  6.60  7.40  7.15  7.40
    G8522   6.78  6.55  6.37  6.89  6.78  6.92
    G8523   6.52  6.61  6.72  6.51  6.59  6.46
    G8524   5.67  5.69  5.88  7.43  7.16  7.31
    G8525   5.64  5.91  5.61  7.41  7.49  7.41
    G8526   4.63  4.85  5.72  5.71  5.47  5.79
    G8527   8.28  7.88  7.84  8.12  7.99  7.97
    G8528   7.81  7.58  7.24  7.79  7.38  8.60
    G8529   4.26  4.20  4.82  3.11  4.94  3.08
    G8530   7.36  7.45  7.31  7.46  7.53  7.35
    G8531   5.30  5.36  5.70  5.41  5.73  5.77
    G8532   5.84  5.48  5.93  5.84  5.73  5.75
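A gene-by-gene comparison of the two groups is one simple way to read the table on slide 22. The sketch below computes a Welch two-sample t-statistic for four of the genes. It assumes that C1-C3 are control arrays and T1-T3 are treated arrays, which the slide does not state, and the helper name `welch_t` is only illustrative.

```python
import math
from statistics import mean, variance

# Values copied from the slide-22 table for four genes.
# ASSUMPTION: C1-C3 are controls and T1-T3 are treated samples.
data = {
    "G8521": ([6.89, 7.18, 6.60], [7.40, 7.15, 7.40]),
    "G8524": ([5.67, 5.69, 5.88], [7.43, 7.16, 7.31]),
    "G8525": ([5.64, 5.91, 5.61], [7.41, 7.49, 7.41]),
    "G8530": ([7.36, 7.45, 7.31], [7.46, 7.53, 7.35]),
}


def welch_t(control, treated):
    """Welch two-sample t-statistic (treated mean minus control mean)."""
    standard_error = math.sqrt(
        variance(control) / len(control) + variance(treated) / len(treated)
    )
    return (mean(treated) - mean(control)) / standard_error


t_stats = {gene: welch_t(c, t) for gene, (c, t) in data.items()}

# List the genes from the largest to the smallest absolute t-statistic.
for gene, t in sorted(t_stats.items(), key=lambda item: -abs(item[1])):
    print(f"{gene}: t = {t:.2f}")
```

With only three arrays per group the statistic is very unstable, which is one motivation for the modified t-tests listed under gene selection.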
  • 23. Applications • Drugs with high efficacy and few or no side effects • Personalized medicine • Gene-related diseases • Biological discovery – new and better molecular diagnostics – new molecular targets for therapy – finding and refining biological pathways • Molecular diagnosis of leukemia, breast cancer, etc. • Appropriate treatment for a genetic signature • Potential new drug targets
  • 24. Challenges • Mega data, difficult to visualize • Too few records (columns/samples), usually < 100 • Too many rows (genes), usually > 1,000 • Testing so many genes is likely to lead to false positives • For exploration, a large set of all relevant genes is desired • For diagnostics or identification of therapeutic targets, the smallest set of genes is needed • The model needs to be explainable to biologists
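The false-positive problem on slide 24 is commonly handled by adjusting the per-gene p-values for multiple testing. The sketch below shows the Benjamini-Hochberg adjustment as one possible approach; the slides do not name a specific correction, and the p-values used here are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values, smallest p-value first.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# HYPOTHETICAL raw p-values for four genes (not taken from the slides).
raw = [0.01, 0.04, 0.03, 0.005]
adj = benjamini_hochberg(raw)

# Genes kept when the false discovery rate is controlled at 5%.
selected = [i for i, q in enumerate(adj) if q <= 0.05]
print(adj, selected)
```

In a real microarray analysis `raw` would hold one p-value per gene, so thousands of values rather than four.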
  • 25. Microarray Data Analysis Types • Gene Selection – find genes for therapeutic targets • Classification (Supervised) – identify disease (biomarker study) – predict outcome / select best treatment • Clustering (Unsupervised) – find new biological classes / refine existing ones – understand regulatory relationships/pathways – exploration
  • 26. Gene Selection • Modified t-test • Significance Analysis of Microarrays (SAM) • Limma (Linear Models for Microarray Data) • Random forest • Lasso (least absolute shrinkage and selection operator) • Linear mixed model • Elastic net • Etc.
  • 27. Visualization • Dimensionality reduction • PCA (Principal Component Analysis) • Biplot • Multidimensional scaling • Etc.
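As a rough illustration of dimensionality reduction, the sketch below projects the six arrays onto their first principal component. It uses four genes from the slide-22 table and finds the leading eigenvector of the covariance matrix by power iteration. The choice of genes, the function name `first_component` and the use of power iteration are assumptions made for this example; in practice a statistics package would be used.

```python
import math

# Rows are arrays C1, C2, C3, T1, T2, T3; columns are the genes
# G8521, G8524, G8525, G8530 from the slide-22 table.
X = [
    [6.89, 5.67, 5.64, 7.36],
    [7.18, 5.69, 5.91, 7.45],
    [6.60, 5.88, 5.61, 7.31],
    [7.40, 7.43, 7.41, 7.46],
    [7.15, 7.16, 7.49, 7.53],
    [7.40, 7.31, 7.41, 7.35],
]

n, p = len(X), len(X[0])
col_means = [sum(row[j] for row in X) / n for j in range(p)]
centered = [[row[j] - col_means[j] for j in range(p)] for row in X]

# Sample covariance matrix of the genes (p x p).
cov = [
    [sum(r[a] * r[b] for r in centered) / (n - 1) for b in range(p)]
    for a in range(p)
]


def first_component(matrix, iterations=200):
    """Leading eigenvector of a symmetric matrix by power iteration."""
    size = len(matrix)
    v = [1.0 / math.sqrt(size)] * size
    for _ in range(iterations):
        w = [sum(matrix[a][b] * v[b] for b in range(size)) for a in range(size)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


pc1 = first_component(cov)
# One score per array: its coordinate along the first principal component.
scores = [sum(r[j] * pc1[j] for j in range(p)) for r in centered]
for name, s in zip(["C1", "C2", "C3", "T1", "T2", "T3"], scores):
    print(f"{name}: {s:.2f}")
```

For these four genes the C arrays and the T arrays fall on opposite sides of zero, so a single component already separates the two groups.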
  • 28. Clustering • Cluster the genes • Cluster the arrays/conditions • Cluster both simultaneously • K-means • Hierarchical • Biclustering algorithms
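K-means, one of the methods on slide 28, can be sketched in a few lines. The example clusters the six arrays into two groups using the expression of two genes (G8524 and G8525) from the slide-22 table. Taking the first and last array as starting centroids is an assumption made to keep the run deterministic; real analyses usually use several random starts.

```python
import math

# Expression of genes G8524 and G8525 in each array (slide-22 table).
arrays = {
    "C1": [5.67, 5.64], "C2": [5.69, 5.91], "C3": [5.88, 5.61],
    "T1": [7.43, 7.41], "T2": [7.16, 7.49], "T3": [7.31, 7.41],
}


def kmeans(points, initial_centroids, iterations=20):
    """Plain Lloyd's algorithm; returns (labels, centroids)."""
    centroids = [list(c) for c in initial_centroids]
    labels = []
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda k: math.dist(pt, centroids[k]))
            for pt in points
        ]
        # Update step: each centroid becomes the mean of its members.
        for k in range(len(centroids)):
            members = [pt for pt, lab in zip(points, labels) if lab == k]
            if members:  # keep the old centroid if a cluster is empty
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


points = list(arrays.values())
# ASSUMPTION: start from the first and last array as initial centroids.
labels, centroids = kmeans(points, [points[0], points[-1]])
for name, label in zip(arrays, labels):
    print(name, "-> cluster", label)
```

Clustering the genes instead of the arrays uses the same function with each gene's row of six values as a point.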
  • 29. Clustering • Cluster or classify genes according to tumors • Cluster tumors according to genes
  • 30. Biclustering • A biclustering method is an unsupervised learning method which looks for sub-matrices in a data matrix with a high similarity of elements. • Algorithms: Statistical based, AI, machine learning. • BiclustGUI: A User Friendly Interface for Biclustering Analysis
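One way to make "high similarity of elements" in a sub-matrix concrete is the mean squared residue score used in Cheng and Church's biclustering algorithm. The slides do not name this score, so it is used here only as an illustration: a sub-matrix whose rows differ by a constant shift scores zero, and larger values mean less coherence. The gene rows come from the slide-22 table.

```python
def mean_squared_residue(block):
    """Mean squared residue of a sub-matrix given as a list of rows."""
    n_rows, n_cols = len(block), len(block[0])
    row_means = [sum(row) / n_cols for row in block]
    col_means = [sum(row[j] for row in block) / n_rows for j in range(n_cols)]
    overall = sum(row_means) / n_rows
    total = 0.0
    for i, row in enumerate(block):
        for j, value in enumerate(row):
            # Deviation from a purely additive row-plus-column pattern.
            residue = value - row_means[i] - col_means[j] + overall
            total += residue ** 2
    return total / (n_rows * n_cols)


# Rows from the slide-22 table (columns C1, C2, C3, T1, T2, T3).
g8524 = [5.67, 5.69, 5.88, 7.43, 7.16, 7.31]
g8525 = [5.64, 5.91, 5.61, 7.41, 7.49, 7.41]
g8529 = [4.26, 4.20, 4.82, 3.11, 4.94, 3.08]

coherent = mean_squared_residue([g8524, g8525])
mixed = mean_squared_residue([g8524, g8525, g8529])
print(f"two similar genes: {coherent:.4f}")
print(f"with G8529 added:  {mixed:.4f}")
```

A biclustering algorithm of this type searches for subsets of rows and columns that keep the score below a chosen threshold.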
  • 32. Software/Statistical Packages • Minitab • SAS • SPSS • R • S-Plus • Matlab • Stata
  • 33. R is growing, especially in bioinformatics – Statistics, data analysis, machine learning – Free – High quality – Open source – Extensible (you can submit and publish your own package!) – Can be integrated with other languages (C/C++, Java, Python) – Large, active user community – Command-based (-)
  • 34. Summary • Statisticians can flexibly get involved in many fields. • Statistical methods are only tools; their applications range widely. • Biostatisticians have many opportunities in public health services (e.g., the Centers for Disease Control and Prevention, CDC), pharmaceutical companies, research institutions, etc. • Statistical bioinformatics: cutting-edge technology -> methods are growing -> many more developments in the future.
  • 35. Thank you for your attention... hafidztio@yahoo.com http://setiopramono.wordpress.com