Stat310 Inference

Hadley Wickham
Tuesday, 31 March 2009
1. Homework / Take home exam
               2. Recap
               3. Data vs. distributions
               4. Estimation
                         1. Maximum likelihood
                         2. Method of moments
               5. Feedback

Assessment
                   Short homework this week. (But you
                   have to do some reading)
                   Take home test will be available online
                   next Thursday.
                   Both take home and homework will be
                   due in class on Thursday April 9.
                   Will put up study guide asap.


Recap

                   What are the 5 parameters of the bivariate
                   normal?
                   If X and Y are bivariate normal, and their
                   correlation is zero, what does that imply
                   about X and Y? Is that usually true?




Data vs. Distributions
                   Random experiments produce data.
                   A repeatable random experiment has
                   some underlying distribution.
                   We want to go from the data to say
                   something about the underlying
                   distribution.



Coin tossing
                   Half the class generates 100 heads and tails
                   by flipping coins.
                   The other half generates 100 heads and tails
                   just by writing down what they think the
                   sequence would be.
                   Write up on the board.
                   I’ll come in and guess which group was
                   which.


Problem

                   Have some data
                   and a probability model, with unknown
                   parameters.
                   Want to estimate the value of those
                   parameters



Some definitions
                   Parameter space: set of all possible
                   parameter values
                   Estimator: process/function which takes
                   data and gives best guess for parameter
                   (usually many possible estimators for a
                   problem)
                   Point estimate: a single value produced by an estimator


Example

                   Data: 5.7 3.0 5.7 4.5 6.0 6.3 4.9 5.8 4.4 5.8
                   Model: Normal(?, 1)


                   What is the mean of the underlying
                   distribution? (5.2?)
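A quick check of the guess above, as a minimal sketch: with the variance taken as known, the sample mean is a natural first estimate of the unknown mean. The variable names are illustrative and not from the slides.

```python
from statistics import mean

# Data from the slide; the model is Normal(mu, 1) with mu unknown.
data = [5.7, 3.0, 5.7, 4.5, 6.0, 6.3, 4.9, 5.8, 4.4, 5.8]

# The sample mean serves as a point estimate of mu.
mu_hat = mean(data)
print(round(mu_hat, 2))
```

The ten values sum to 52.1, so the average is 5.21, in line with the 5.2 suggested on the slide.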



Uncertainty

                   Also want to be able to quantify how
                   certain/confident we are in our answer.
                   How close is our estimate to the true
                   mean?




Simulation
                   One approach to find the answer is to use
                   simulation, i.e., set up a case where we
                   know what the true answer is and see
                   what happens.
                   X ~ Normal(5, 1)
                   Draw 10 numbers from this distribution
                   and calculate their average.


3.1 3.4 5.1 4.9 2.2 4.4 4.2 3.9 5.6 4.9 4.2
      5.9 2.8 6.0 5.1 2.7 6.5 4.2 4.9 4.6 4.4 4.7
      5.0 5.3 5.3 5.1 5.4 4.7 4.7 4.4 5.9 4.2 5.0
      4.3 5.4 5.5 4.9 3.1 4.1 4.8 3.6 6.8 5.5 4.8
      3.8 6.1 3.8 5.2 5.7 5.2 3.2 5.2 5.3 2.3 4.6
      5.6 6.0 5.5 5.5 5.1 7.3 5.4 6.1 4.4 4.9 5.6




Repeat 1000 times
                   [Figure: histogram of the 1000 simulated averages
                   (x-axis: samp, 4.0 to 6.0; y-axis: count).
                   95% of values lie between 4.5 and 5.6.]
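The repeated experiment can be sketched in Python as follows. This is an illustration under stated assumptions, not the code used to make the slide: the seed, the helper name `sample_mean`, and the way the central 95% is read off the sorted values are all choices made here.

```python
import random
from statistics import mean

random.seed(310)  # arbitrary seed so the run is repeatable

def sample_mean(n=10, mu=5.0, sigma=1.0):
    """Draw n values from Normal(mu, sigma) and return their average."""
    return mean(random.gauss(mu, sigma) for _ in range(n))

# Repeat the experiment 1000 times and sort the resulting averages.
means = sorted(sample_mean() for _ in range(1000))

# Central 95% of the simulated averages: drop 25 values (2.5%) from each tail.
lower, upper = means[25], means[974]
print(f"95% of simulated means lie between {lower:.2f} and {upper:.2f}")
```

The exact endpoints change from run to run and from seed to seed, which is itself a useful reminder that the simulated interval is only an approximation.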
Theory

                   From Tuesday, we know what the
                   distribution of the average is. Write it
                   down.
                   Create a 95% confidence interval.
                   How does it compare to the simulation?
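For reference, a worked version of this calculation, assuming the standard result that the average of n independent normal draws with mean μ and variance σ² is itself normal (here μ = 5, σ = 1, n = 10):

```latex
\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i \sim \mathrm{Normal}\!\left(\mu, \frac{\sigma^2}{n}\right)
        = \mathrm{Normal}\!\left(5, \frac{1}{10}\right)

\mu \pm 1.96\,\frac{\sigma}{\sqrt{n}} = 5 \pm \frac{1.96}{\sqrt{10}} \approx 5 \pm 0.62 = (4.38,\ 5.62)
```

This is close to the range of 4.5 to 5.6 read off the simulation.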



Why the mean?

                   Why is the mean of the data a good
                   estimate of μ? Are there other estimators
                   that might be as good or better?
                   In general, how can we figure out an
                   estimator for a parameter of a
                   distribution?



Maximum likelihood
                   Method of moments



Maximum likelihood

                   Write down log-likelihood (i.e., given this
                   data, how likely is it that it was generated
                   from this parameter?)
                   Find the maximum (i.e., differentiate and
                   set to zero)




Example
                   X ~ Binomial(10, p?)
                   Here is some data drawn from that
                   random experiment: 4 5 1 5 3 2 4 2 2 4
                   We know the joint pdf because they are
                   independent. Can try out various values
                   of p and see which is most likely



Your turn

                   Write down the joint pdf for X1, X2, …, Xn
                   ~ Binomial(n, p)


                   Try evaluating it for x = (4 5 1 5 3 2 4 2 2
                   4), n = 10, p = 0.1
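One way to carry out the evaluation is sketched below. Because the observations are independent, the joint probability is the product of the individual binomial probabilities. The function name `joint_pmf` is made up for this example.

```python
from math import comb, prod

def joint_pmf(xs, n, p):
    """Joint probability of independent Binomial(n, p) observations xs."""
    return prod(comb(n, x) * p**x * (1 - p)**(n - x) for x in xs)

x = [4, 5, 1, 5, 3, 2, 4, 2, 2, 4]
print(joint_pmf(x, n=10, p=0.1))
```

Calling the same function with other values of p shows how the probability of the observed data changes with the parameter.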



Try 10 different values of p
                   [Figure: joint probability of the data (y-axis: prob)
                   plotted against the values of p tried (x-axis: p,
                   0.0 to 1.0).]
Try 100 different values of p
                   [Figure: joint probability of the data (y-axis: prob)
                   plotted against 100 values of p (x-axis: p, 0.0 to 1.0).
                   Annotation: True p is 0.3.]
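The grid search behind this kind of plot can be sketched as below. The grid of 100 evenly spaced values is an assumption made here; the slides do not say which values were tried.

```python
from math import comb, prod

x = [4, 5, 1, 5, 3, 2, 4, 2, 2, 4]
n = 10

def likelihood(p):
    """Joint probability of the data for a given p (x and n held fixed)."""
    return prod(comb(n, xi) * p**xi * (1 - p)**(n - xi) for xi in x)

# 100 evenly spaced candidate values: 0.01, 0.02, ..., 1.00 (an assumed grid).
grid = [i / 100 for i in range(1, 101)]

# Pick the candidate under which the observed data are most probable.
best_p = max(grid, key=likelihood)
print(best_p)
```

On this grid the most likely candidate is 0.32, near but not equal to the true p of 0.3, since the estimate depends on the particular sample drawn.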
Calculus
                   Can do the same analytically with calculus.
                   Want to find the maximum of the pdf with
                   respect to p. (How do we do this?)
                   Normally call this the likelihood when
                   we’re thinking of the x’s being fixed and
                   the parameters varying.
                   Usually easier to work with the log pdf
                   (why?)
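One practical reason, offered as an illustration rather than as the slide's intended answer: a joint pdf is a product of many small numbers, which can underflow to zero in floating point, while the log turns the product into a sum that stays well behaved (and sums are also easier to differentiate than products). The 200 identical factors of 0.01 below are made up for the demonstration.

```python
from math import log, prod

factors = [0.01] * 200  # hypothetical per-observation probabilities

# The true product is 1e-400, which is too small for a double-precision float.
product = prod(factors)

# The log of the product is the sum of the logs: 200 * log(0.01).
log_sum = sum(log(f) for f in factors)

print(product)  # underflows to 0.0
print(log_sum)
```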


Steps

                   Write out log-likelihood
                   (Discard constants)
                   Differentiate and set to 0
                   (Check second derivative is negative)




Analytically

                   Mean of x’s is 3.2
                   n = 10
                   Maximum likelihood estimate of p for this
                   example is 0.32
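A sketch of the derivation behind this estimate, writing m for the number of observations (a symbol introduced here to avoid clashing with the binomial n; in this example m = 10 and n = 10):

```latex
\ell(p) = \sum_{i=1}^{m} \left[ \log\binom{n}{x_i} + x_i \log p + (n - x_i)\log(1 - p) \right]

\frac{d\ell}{dp} = \frac{\sum_i x_i}{p} - \frac{mn - \sum_i x_i}{1 - p} = 0
\quad\Longrightarrow\quad
\hat{p} = \frac{\sum_i x_i}{mn} = \frac{\bar{x}}{n} = \frac{3.2}{10} = 0.32

\frac{d^2\ell}{dp^2} = -\frac{\sum_i x_i}{p^2} - \frac{mn - \sum_i x_i}{(1 - p)^2} < 0
```

The binomial coefficient term does not involve p, so it drops out on differentiation, and the negative second derivative confirms that the stationary point is a maximum.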




Method of moments
                   We know how to calculate sample
                   moments (e.g. mean and variance of data)
                   We know what the moments of the
                   distribution are in terms of the
                   parameters.
                   Why not just match them up?



Binomial
                   E(X) = np, Var(X) = np(1-p)
                   p = mean / n = 3.2 / 10 = 0.32
                   p(1-p) = var / n = 2 / 10 = 0.2
                   -p^2 + p - 0.2 = 0
                   p = (0.276, 0.724)
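The matching of moments above can be reproduced with a short Python sketch. The helper name `p_from_variance` is hypothetical, and the variance passed to it is the slide's rounded value of 2 rather than the exact sample variance.

```python
from math import sqrt
from statistics import mean, variance

x = [4, 5, 1, 5, 3, 2, 4, 2, 2, 4]
n = 10

# First moment: E(X) = n p, so p = mean / n.
p_from_mean = mean(x) / n

def p_from_variance(var, n):
    """Solve p(1 - p) = var / n, i.e. p^2 - p + var/n = 0, for both roots."""
    disc = sqrt(1 - 4 * var / n)
    return (1 - disc) / 2, (1 + disc) / 2

print(p_from_mean)              # approximately 0.32
print(variance(x))              # sample variance (n - 1 denominator), about 1.96
print(p_from_variance(2.0, n))  # roots when the variance is rounded to 2
```

The mean gives a single estimate while the variance gives two candidate roots, neither equal to 0.32, so the two moments do not agree exactly on this sample. Using the unrounded sample variance shifts the roots slightly.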



Your turn

                   What are the method of moments
                   estimators for the mean and variance of
                   the normal distribution?
                   What about the gamma distribution?




Feedback


