SlideShare uma empresa Scribd logo
1 de 109
Baixar para ler offline
Digging into the Dirichlet
Max Sklar
@maxsklar

New York Machine Learning Meetup
December 19th, 2013
Dedication

Meyer Marks
1925 - 2013
The Dirichlet Distribution
Let’s start with something simpler
Let’s start with something simpler
A Pie Chart!
Let’s start with something simpler
A Pie Chart!
AKA
Discrete Distribution
Multinomial Distribution
Let’s start with something simpler
A Pie Chart!
K = The number of
categories.
K=5
Examples of Multinomial Distributions
Examples of Multinomial Distributions
Examples of Multinomial Distributions
Examples of Multinomial Distributions
Examples of Multinomial Distributions
What does the raw data look like?
What does the raw data look like?
id

# dislikes

1

231

23

2

Counts!

# likes

81

40

3

67

9

4

121

14

5

9

31

6

18

0

7

1

1
What does the raw data look like?
More specifically:
- K columns of counts
- N rows of data

id

# likes

# dislikes

1

231

23

2

81

40

3

67

9

4

121

14

5

9

31

6

18

0

7

1

1
BUT...

Counts != Multinomial Distribution
BUT...
We can estimate the multinomial
distribution with the counts, using
the maximum likelihood estimate

366

181

203
BUT...
We can estimate the multinomial
distribution with the counts, using
the maximum likelihood estimate

Sum =
366 + 181 + 203 =
750

366

181

203
BUT...
We can estimate the multinomial
distribution with the counts, using
the maximum likelihood estimate

366 / 750
181 / 750
203 / 750

366

181

203
BUT...
We can estimate the multinomial
distribution with the counts, using
the maximum likelihood estimate

48.8%
24.1%
27.1%

366

181

203
BUT...
366

203

1

Uh Oh

181
2

1
BUT...
366

This column
will be all
Yellow
right?

181

203

1

2

1

0

1

0
BUT...
366

203

1

Panic!!!!

181
2

1

0

1

0

0

0

0
Bayesian Statistics to the Rescue
Bayesian Statistics to the Rescue
Still assume each row was
generated by a
multinomial distribution
Bayesian Statistics to the Rescue
Still assume each row was
generated by a
multinomial distribution
We just don’t know
which one!
The Dirichlet Distribution
Is a probability
distribution over all
possible multinomial
distributions, p.
The Dirichlet Distribution

?

Represents our
uncertainty over the
actual distribution
that created the row.

?
The Dirichlet Distribution

p: represents a multinomial distribution
alpha: the parameters of the dirichlet
K: the number of categories
Bayesian Updates
Bayesian Updates

Also a Dirichlet!
Bayesian Updates

Also a Dirichlet!
(Conjugate Prior)
Bayesian Updates
Bayesian Updates
Bayesian Updates

+1
Why Does
this Work?
Let’s look at it
again.
Entropy
Entropy
Information Content
Entropy
Information Content
Energy
Entropy
Information Content
Energy
Log Likelihood
Entropy
Information Content
Energy
Log Likelihood
The Dirichlet Distribution
The Dirichlet Distribution

Normalizing Constant
The Dirichlet Distribution

Normalizing Constant
The Dirichlet Distribution
The Dirichlet Distribution
The Dirichlet Distribution
The Dirichlet Distribution

Linear
The Dirichlet MACHINE

Prior

1.2

3.0

0.3
The Dirichlet MACHINE

Prior

1.2

3.0

0.3
The Dirichlet MACHINE

Update

2.2

3.0

0.3
The Dirichlet MACHINE

Prior

2.2

3.0

0.3
The Dirichlet MACHINE

Update

2.2

3.0

1.3
The Dirichlet MACHINE

Prior

2.2

3.0

1.3
The Dirichlet MACHINE

Update

2.2

3.0

2.3
Interpreting the Parameters
1

3

2

What does this alpha
vector really mean?
Interpreting the Parameters
1

normalized

1/6

3/6

3

2

sum

2/6

6
Interpreting the Parameters
1

Expected
Value

3

2

6
Interpreting the Parameters
1

Expected
Value

3

2

6

Weight
ANALOGY: Normal Distribution

Precision =
1 / variance
ANALOGY: Normal Distribution
High precision:
data is close to
the mean
Low precision:
far away from
the mean
Interpreting the Parameters
1

Expected
Value

3

2

6

Precision
High Weight Dirichlet
0.4

0.4

0.2
Low Weight Dirichlet
0.4

0.4

0.2
Urn Model
At each step,
pick a ball from
the urn..
Replace it, and
add another ball
of that color into
the urn
Urn Model
At each step,
pick a ball from
the urn..
Replace it, and
add another ball
of that color into
the urn
Urn Model
At each step,
pick a ball from
the urn..
Replace it, and
add another ball
of that color into
the urn
Urn Model
At each step,
pick a ball from
the urn..
Replace it, and
add another ball
of that color into
the urn
Urn Model
At each step,
pick a ball from
the urn..
Replace it, and
add another ball
of that color into
the urn
Urn Model
Rich get richer...
Urn Model
Rich get richer...
Urn Model
Finally yellow
catches a break
Urn Model
Finally yellow
catches a break
Urn Model
But it’s too late...
Urn Model
As the urn gets
more populated,
the distribution
gets “stuck” in
place.
Urn Model
Once lots of data
has been collected,
or the dirichlet has
high precision, it’s
hard to overturn that
with new data
Chinese Restaurant Process
When you find
the white ball,
throw a new
color into the
mix.
Chinese Restaurant Process
When you find
the white ball,
throw a new
color into the
mix.
Chinese Restaurant Process
When you find
the white ball,
throw a new
color into the
mix.
Chinese Restaurant Process
When you find
the white ball,
throw a new
color into the
mix.
Chinese Restaurant Process
When you find
the white ball,
throw a new
color into the
mix.
Chinese Restaurant Process
When you find
the white ball,
throw a new
color into the
mix.
Chinese Restaurant Process
When you find
the white ball,
throw a new
color into the
mix.
Chinese Restaurant Process
When you find
the white ball,
throw a new
color into the
mix.
Chinese Restaurant Process
The expected
infinite
distribution
(mean) is
exponential.
# of white balls
controls the
exponent
So coming back to the count data:
20

0

2

What Dirichlet
parameters explain
the data?

0
1

17

14

6

0

15

5

0

0

20

0

0

14

6
So coming back to the count data:
20

0

2

Newton’s Method:
Requires Gradient
+ Hessian

0
1

17

14

6

0

15

5

0

0

20

0

0

14

6
So coming back to the count data:
20

0

2

Reads all of the
data...

0
1

17

14

6

0

15

5

0

0

20

0

0

14

6
So coming back to the count data:
20

https://github.
com/maxsklar/Baye
sPy/tree/master/Con
jugatePriorTools

0

0

2

1

17

14

6

0

15

5

0

0

20

0

0

14

6
So coming back to the count data:
Compress the data
into a Matrix and a
Vector:
Works for lots of
sparsely populated
rows

20

0

0

2

1

17

14

6

0

15

5

0

0

20

0

0

14

6
The Compression MACHINE

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0
The Compression MACHINE
K=3
(the 4th row is a special, total row)

M=6
The maximum # samples per
input

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0
The Compression MACHINE
1

0

0

K=3
(the 4th row is a special, total row)

M=6
The maximum # samples per
input

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0
The Compression MACHINE
K=3
(the 4th row is a special, total row)

M=6
The maximum # samples per
input

1

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

1

0

0

0

0

0
The Compression MACHINE
1

3

2

K=3
(the 4th row is a special, total row)

M=6
The maximum # samples per
input

1

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

1

0

0

0

0

0
The Compression MACHINE
K=3
(the 4th row is a special, total row)

M=6
The maximum # samples per
input

2

0

0

0

0

0

1

1

1

0

0

0

1

1

0

0

0

0

2

1

1

1

1

1
The Compression MACHINE
0

6

0

K=3
(the 4th row is a special, total row)

M=6
The maximum # samples per
input

2

0

0

0

0

0

1

1

1

0

0

0

1

1

0

0

0

0

2

1

1

1

1

1
The Compression MACHINE
K=3
(the 4th row is a special, total row)

M=6
The maximum # samples per
input

2

0

0

0

0

0

2

2

2

1

1

1

1

1

0

0

0

0

3

2

2

2

2

2
The Compression MACHINE
2

1

1

K=3
(the 4th row is a special, total row)

M=6
The maximum # samples per
input

2

0

0

0

0

0

2

2

2

1

1

1

1

1

0

0

0

0

3

2

2

2

2

2
The Compression MACHINE
K=3
(the 4th row is a special, total row)

M=6
The maximum # samples per
input

3

1

0

0

0

0

3

2

2

1

1

1

2

1

0

0

0

0

4

3

3

3

2

2
DEMO
Our Popularity Prior
Dirichlet Mixture Models
Anything you can
go with a
Gaussian, you
can also do with a
Dirichlet
Dirichlet Mixture Models
Example:
Mixture of
Gaussians using
ExpectationMaximization
Dirichlet Mixture Models
Assume each row is
a mixture of
multinomials.
And the parameters
of that mixture are
pulled from a
Dirichlet.

?
Dirichlet Mixture Models
Latent Dirichlet
Allocation
Topic Model
Questions

Mais conteúdo relacionado

Mais procurados

3.7 outlier analysis
3.7 outlier analysis3.7 outlier analysis
3.7 outlier analysisKrish_ver2
 
Principal Component Analysis (PCA) and LDA PPT Slides
Principal Component Analysis (PCA) and LDA PPT SlidesPrincipal Component Analysis (PCA) and LDA PPT Slides
Principal Component Analysis (PCA) and LDA PPT SlidesAbhishekKumar4995
 
Principal component analysis and lda
Principal component analysis and ldaPrincipal component analysis and lda
Principal component analysis and ldaSuresh Pokharel
 
3.1 clustering
3.1 clustering3.1 clustering
3.1 clusteringKrish_ver2
 
Spectral clustering
Spectral clusteringSpectral clustering
Spectral clusteringSOYEON KIM
 
CLUSTER SILHOUETTES.pptx
CLUSTER SILHOUETTES.pptxCLUSTER SILHOUETTES.pptx
CLUSTER SILHOUETTES.pptxagniva pradhan
 
1 seaborn introduction
1 seaborn introduction 1 seaborn introduction
1 seaborn introduction YuleiLi3
 
Introduction to Generalized Linear Models
Introduction to Generalized Linear ModelsIntroduction to Generalized Linear Models
Introduction to Generalized Linear Modelsrichardchandler
 
Lect5 principal component analysis
Lect5 principal component analysisLect5 principal component analysis
Lect5 principal component analysishktripathy
 
Presentation on unsupervised learning
Presentation on unsupervised learning Presentation on unsupervised learning
Presentation on unsupervised learning ANKUSH PAL
 
Hierarchical clustering.pptx
Hierarchical clustering.pptxHierarchical clustering.pptx
Hierarchical clustering.pptxNTUConcepts1
 
Gentle Introduction to Dirichlet Processes
Gentle Introduction to Dirichlet ProcessesGentle Introduction to Dirichlet Processes
Gentle Introduction to Dirichlet ProcessesYap Wooi Hen
 
Introduction to R programming
Introduction to R programmingIntroduction to R programming
Introduction to R programmingAlberto Labarga
 
Introduction to ggplot2
Introduction to ggplot2Introduction to ggplot2
Introduction to ggplot2maikroeder
 
Intro to Classification: Logistic Regression & SVM
Intro to Classification: Logistic Regression & SVMIntro to Classification: Logistic Regression & SVM
Intro to Classification: Logistic Regression & SVMNYC Predictive Analytics
 

Mais procurados (20)

3.7 outlier analysis
3.7 outlier analysis3.7 outlier analysis
3.7 outlier analysis
 
Principal Component Analysis (PCA) and LDA PPT Slides
Principal Component Analysis (PCA) and LDA PPT SlidesPrincipal Component Analysis (PCA) and LDA PPT Slides
Principal Component Analysis (PCA) and LDA PPT Slides
 
Principal component analysis and lda
Principal component analysis and ldaPrincipal component analysis and lda
Principal component analysis and lda
 
3.1 clustering
3.1 clustering3.1 clustering
3.1 clustering
 
Spectral clustering
Spectral clusteringSpectral clustering
Spectral clustering
 
Pca analysis
Pca analysisPca analysis
Pca analysis
 
CLUSTER SILHOUETTES.pptx
CLUSTER SILHOUETTES.pptxCLUSTER SILHOUETTES.pptx
CLUSTER SILHOUETTES.pptx
 
1 seaborn introduction
1 seaborn introduction 1 seaborn introduction
1 seaborn introduction
 
Introduction to Generalized Linear Models
Introduction to Generalized Linear ModelsIntroduction to Generalized Linear Models
Introduction to Generalized Linear Models
 
Lect5 principal component analysis
Lect5 principal component analysisLect5 principal component analysis
Lect5 principal component analysis
 
Presentation on unsupervised learning
Presentation on unsupervised learning Presentation on unsupervised learning
Presentation on unsupervised learning
 
Topology for data science
Topology for data scienceTopology for data science
Topology for data science
 
Dbscan
DbscanDbscan
Dbscan
 
Hierarchical clustering.pptx
Hierarchical clustering.pptxHierarchical clustering.pptx
Hierarchical clustering.pptx
 
Gentle Introduction to Dirichlet Processes
Gentle Introduction to Dirichlet ProcessesGentle Introduction to Dirichlet Processes
Gentle Introduction to Dirichlet Processes
 
Logistic regression
Logistic regressionLogistic regression
Logistic regression
 
Introduction to R programming
Introduction to R programmingIntroduction to R programming
Introduction to R programming
 
Introduction to ggplot2
Introduction to ggplot2Introduction to ggplot2
Introduction to ggplot2
 
Intro to Classification: Logistic Regression & SVM
Intro to Classification: Logistic Regression & SVMIntro to Classification: Logistic Regression & SVM
Intro to Classification: Logistic Regression & SVM
 
Optics
OpticsOptics
Optics
 

Destaque

A Linked Data Dataset for Madrid Transport Authority's Datasets
A Linked Data Dataset for Madrid Transport Authority's DatasetsA Linked Data Dataset for Madrid Transport Authority's Datasets
A Linked Data Dataset for Madrid Transport Authority's DatasetsOscar Corcho
 
NIPS machine learning in computational biology presentation
NIPS machine learning in computational biology presentationNIPS machine learning in computational biology presentation
NIPS machine learning in computational biology presentationKieran Campbell
 
Piotr Mirowski - Review Autoencoders (Deep Learning) - CIUUK14
Piotr Mirowski - Review Autoencoders (Deep Learning) - CIUUK14Piotr Mirowski - Review Autoencoders (Deep Learning) - CIUUK14
Piotr Mirowski - Review Autoencoders (Deep Learning) - CIUUK14Daniel Lewis
 
אלעד גולדנברג: איקומרס טוב למדינה - 3 צעדים לשינוי צרכני אמיתי באינטרנט בישראל
 אלעד גולדנברג: איקומרס טוב למדינה - 3 צעדים לשינוי צרכני אמיתי באינטרנט בישראל אלעד גולדנברג: איקומרס טוב למדינה - 3 צעדים לשינוי צרכני אמיתי באינטרנט בישראל
אלעד גולדנברג: איקומרס טוב למדינה - 3 צעדים לשינוי צרכני אמיתי באינטרנט בישראלElad Goldenberg
 
온라인 서비스 개선을 데이터 활용법 - 김진영 (How We Use Data)
온라인 서비스 개선을 데이터 활용법  - 김진영 (How We Use Data)온라인 서비스 개선을 데이터 활용법  - 김진영 (How We Use Data)
온라인 서비스 개선을 데이터 활용법 - 김진영 (How We Use Data)Jin Young Kim
 
Image segmentation hj_cho
Image segmentation hj_choImage segmentation hj_cho
Image segmentation hj_choHyungjoo Cho
 
Normalization 방법
Normalization 방법 Normalization 방법
Normalization 방법 홍배 김
 
[225]yarn 기반의 deep learning application cluster 구축 김제민
[225]yarn 기반의 deep learning application cluster 구축 김제민[225]yarn 기반의 deep learning application cluster 구축 김제민
[225]yarn 기반의 deep learning application cluster 구축 김제민NAVER D2
 
Generative adversarial networks
Generative adversarial networksGenerative adversarial networks
Generative adversarial networks남주 김
 
Expectation Maximization and Gaussian Mixture Models
Expectation Maximization and Gaussian Mixture ModelsExpectation Maximization and Gaussian Mixture Models
Expectation Maximization and Gaussian Mixture Modelspetitegeek
 
Squeezing Deep Learning Into Mobile Phones
Squeezing Deep Learning Into Mobile PhonesSqueezing Deep Learning Into Mobile Phones
Squeezing Deep Learning Into Mobile PhonesAnirudh Koul
 

Destaque (11)

A Linked Data Dataset for Madrid Transport Authority's Datasets
A Linked Data Dataset for Madrid Transport Authority's DatasetsA Linked Data Dataset for Madrid Transport Authority's Datasets
A Linked Data Dataset for Madrid Transport Authority's Datasets
 
NIPS machine learning in computational biology presentation
NIPS machine learning in computational biology presentationNIPS machine learning in computational biology presentation
NIPS machine learning in computational biology presentation
 
Piotr Mirowski - Review Autoencoders (Deep Learning) - CIUUK14
Piotr Mirowski - Review Autoencoders (Deep Learning) - CIUUK14Piotr Mirowski - Review Autoencoders (Deep Learning) - CIUUK14
Piotr Mirowski - Review Autoencoders (Deep Learning) - CIUUK14
 
אלעד גולדנברג: איקומרס טוב למדינה - 3 צעדים לשינוי צרכני אמיתי באינטרנט בישראל
 אלעד גולדנברג: איקומרס טוב למדינה - 3 צעדים לשינוי צרכני אמיתי באינטרנט בישראל אלעד גולדנברג: איקומרס טוב למדינה - 3 צעדים לשינוי צרכני אמיתי באינטרנט בישראל
אלעד גולדנברג: איקומרס טוב למדינה - 3 צעדים לשינוי צרכני אמיתי באינטרנט בישראל
 
온라인 서비스 개선을 데이터 활용법 - 김진영 (How We Use Data)
온라인 서비스 개선을 데이터 활용법  - 김진영 (How We Use Data)온라인 서비스 개선을 데이터 활용법  - 김진영 (How We Use Data)
온라인 서비스 개선을 데이터 활용법 - 김진영 (How We Use Data)
 
Image segmentation hj_cho
Image segmentation hj_choImage segmentation hj_cho
Image segmentation hj_cho
 
Normalization 방법
Normalization 방법 Normalization 방법
Normalization 방법
 
[225]yarn 기반의 deep learning application cluster 구축 김제민
[225]yarn 기반의 deep learning application cluster 구축 김제민[225]yarn 기반의 deep learning application cluster 구축 김제민
[225]yarn 기반의 deep learning application cluster 구축 김제민
 
Generative adversarial networks
Generative adversarial networksGenerative adversarial networks
Generative adversarial networks
 
Expectation Maximization and Gaussian Mixture Models
Expectation Maximization and Gaussian Mixture ModelsExpectation Maximization and Gaussian Mixture Models
Expectation Maximization and Gaussian Mixture Models
 
Squeezing Deep Learning Into Mobile Phones
Squeezing Deep Learning Into Mobile PhonesSqueezing Deep Learning Into Mobile Phones
Squeezing Deep Learning Into Mobile Phones
 

Semelhante a Digging into the Dirichlet Distribution by Max Sklar

Number system
Number systemNumber system
Number systemAmit Shaw
 
Reliable multimedia transmission under noisy condition
Reliable multimedia transmission under noisy conditionReliable multimedia transmission under noisy condition
Reliable multimedia transmission under noisy conditionShahrukh Ali Khan
 
2012 2013 3rd 9 weeks midterm review
2012 2013 3rd 9 weeks midterm review2012 2013 3rd 9 weeks midterm review
2012 2013 3rd 9 weeks midterm reviewarinedge
 
Introduction to R Short course Fall 2016
Introduction to R Short course Fall 2016Introduction to R Short course Fall 2016
Introduction to R Short course Fall 2016Spencer Fox
 
Representation Of Data
Representation Of DataRepresentation Of Data
Representation Of Datagavhays
 
Ezgi Karaesmen - Data Cleaning and Manipulation with R
Ezgi Karaesmen - Data Cleaning and Manipulation with REzgi Karaesmen - Data Cleaning and Manipulation with R
Ezgi Karaesmen - Data Cleaning and Manipulation with RRehgan Avon
 
Dee 2034 chapter 1 number and code system (Baia)
Dee 2034 chapter 1 number and code system (Baia)Dee 2034 chapter 1 number and code system (Baia)
Dee 2034 chapter 1 number and code system (Baia)SITI SABARIAH SALIHIN
 
Grouping and Displaying Data to Convey Meaning: Tables & Graphs chapter_2 _fr...
Grouping and Displaying Data to Convey Meaning: Tables & Graphs chapter_2 _fr...Grouping and Displaying Data to Convey Meaning: Tables & Graphs chapter_2 _fr...
Grouping and Displaying Data to Convey Meaning: Tables & Graphs chapter_2 _fr...Prashant Borkar
 
Data repersentation.
Data repersentation.Data repersentation.
Data repersentation.Ritesh Saini
 
Chapter 1 number and code system sss
Chapter 1 number and code system sssChapter 1 number and code system sss
Chapter 1 number and code system sssBaia Salihin
 
Data Representation
Data RepresentationData Representation
Data RepresentationRick Jamil
 
An introduction to probabilistic data structures
An introduction to probabilistic data structuresAn introduction to probabilistic data structures
An introduction to probabilistic data structuresMiguel Ping
 
Beyond PFCount: Shrif Nada
Beyond PFCount: Shrif NadaBeyond PFCount: Shrif Nada
Beyond PFCount: Shrif NadaRedis Labs
 
Year 12 Maths A Textbook - Chapter 10
Year 12 Maths A Textbook - Chapter 10Year 12 Maths A Textbook - Chapter 10
Year 12 Maths A Textbook - Chapter 10westy67968
 
Whole Numbers, Fractions, Decimals, Ratios & Percents, Statistics, Real Numbe...
Whole Numbers, Fractions, Decimals, Ratios & Percents, Statistics, Real Numbe...Whole Numbers, Fractions, Decimals, Ratios & Percents, Statistics, Real Numbe...
Whole Numbers, Fractions, Decimals, Ratios & Percents, Statistics, Real Numbe...REYBETH RACELIS
 

Semelhante a Digging into the Dirichlet Distribution by Max Sklar (20)

Dynamic Itemset Counting
Dynamic Itemset CountingDynamic Itemset Counting
Dynamic Itemset Counting
 
Dynamic Itemset Counting
Dynamic Itemset CountingDynamic Itemset Counting
Dynamic Itemset Counting
 
Number system
Number systemNumber system
Number system
 
Cis435 week06
Cis435 week06Cis435 week06
Cis435 week06
 
Reliable multimedia transmission under noisy condition
Reliable multimedia transmission under noisy conditionReliable multimedia transmission under noisy condition
Reliable multimedia transmission under noisy condition
 
2012 2013 3rd 9 weeks midterm review
2012 2013 3rd 9 weeks midterm review2012 2013 3rd 9 weeks midterm review
2012 2013 3rd 9 weeks midterm review
 
Number system
Number systemNumber system
Number system
 
Introduction to R Short course Fall 2016
Introduction to R Short course Fall 2016Introduction to R Short course Fall 2016
Introduction to R Short course Fall 2016
 
Representation Of Data
Representation Of DataRepresentation Of Data
Representation Of Data
 
Ezgi Karaesmen - Data Cleaning and Manipulation with R
Ezgi Karaesmen - Data Cleaning and Manipulation with REzgi Karaesmen - Data Cleaning and Manipulation with R
Ezgi Karaesmen - Data Cleaning and Manipulation with R
 
Dee 2034 chapter 1 number and code system (Baia)
Dee 2034 chapter 1 number and code system (Baia)Dee 2034 chapter 1 number and code system (Baia)
Dee 2034 chapter 1 number and code system (Baia)
 
Grouping and Displaying Data to Convey Meaning: Tables & Graphs chapter_2 _fr...
Grouping and Displaying Data to Convey Meaning: Tables & Graphs chapter_2 _fr...Grouping and Displaying Data to Convey Meaning: Tables & Graphs chapter_2 _fr...
Grouping and Displaying Data to Convey Meaning: Tables & Graphs chapter_2 _fr...
 
Data repersentation.
Data repersentation.Data repersentation.
Data repersentation.
 
Chapter 1 number and code system sss
Chapter 1 number and code system sssChapter 1 number and code system sss
Chapter 1 number and code system sss
 
Data Representation
Data RepresentationData Representation
Data Representation
 
An introduction to probabilistic data structures
An introduction to probabilistic data structuresAn introduction to probabilistic data structures
An introduction to probabilistic data structures
 
Beyond PFCount: Shrif Nada
Beyond PFCount: Shrif NadaBeyond PFCount: Shrif Nada
Beyond PFCount: Shrif Nada
 
Year 12 Maths A Textbook - Chapter 10
Year 12 Maths A Textbook - Chapter 10Year 12 Maths A Textbook - Chapter 10
Year 12 Maths A Textbook - Chapter 10
 
Whole Numbers, Fractions, Decimals, Ratios & Percents, Statistics, Real Numbe...
Whole Numbers, Fractions, Decimals, Ratios & Percents, Statistics, Real Numbe...Whole Numbers, Fractions, Decimals, Ratios & Percents, Statistics, Real Numbe...
Whole Numbers, Fractions, Decimals, Ratios & Percents, Statistics, Real Numbe...
 
Number system
Number systemNumber system
Number system
 

Mais de Hakka Labs

Always Valid Inference (Ramesh Johari, Stanford)
Always Valid Inference (Ramesh Johari, Stanford)Always Valid Inference (Ramesh Johari, Stanford)
Always Valid Inference (Ramesh Johari, Stanford)Hakka Labs
 
DataEngConf SF16 - High cardinality time series search
DataEngConf SF16 - High cardinality time series searchDataEngConf SF16 - High cardinality time series search
DataEngConf SF16 - High cardinality time series searchHakka Labs
 
DataEngConf SF16 - Data Asserts: Defensive Data Science
DataEngConf SF16 - Data Asserts: Defensive Data ScienceDataEngConf SF16 - Data Asserts: Defensive Data Science
DataEngConf SF16 - Data Asserts: Defensive Data ScienceHakka Labs
 
DatEngConf SF16 - Apache Kudu: Fast Analytics on Fast Data
DatEngConf SF16 - Apache Kudu: Fast Analytics on Fast DataDatEngConf SF16 - Apache Kudu: Fast Analytics on Fast Data
DatEngConf SF16 - Apache Kudu: Fast Analytics on Fast DataHakka Labs
 
DataEngConf SF16 - Recommendations at Instacart
DataEngConf SF16 - Recommendations at InstacartDataEngConf SF16 - Recommendations at Instacart
DataEngConf SF16 - Recommendations at InstacartHakka Labs
 
DataEngConf SF16 - Running simulations at scale
DataEngConf SF16 - Running simulations at scaleDataEngConf SF16 - Running simulations at scale
DataEngConf SF16 - Running simulations at scaleHakka Labs
 
DataEngConf SF16 - Deriving Meaning from Wearable Sensor Data
DataEngConf SF16 - Deriving Meaning from Wearable Sensor DataDataEngConf SF16 - Deriving Meaning from Wearable Sensor Data
DataEngConf SF16 - Deriving Meaning from Wearable Sensor DataHakka Labs
 
DataEngConf SF16 - Collecting and Moving Data at Scale
DataEngConf SF16 - Collecting and Moving Data at Scale DataEngConf SF16 - Collecting and Moving Data at Scale
DataEngConf SF16 - Collecting and Moving Data at Scale Hakka Labs
 
DataEngConf SF16 - BYOMQ: Why We [re]Built IronMQ
DataEngConf SF16 - BYOMQ: Why We [re]Built IronMQDataEngConf SF16 - BYOMQ: Why We [re]Built IronMQ
DataEngConf SF16 - BYOMQ: Why We [re]Built IronMQHakka Labs
 
DataEngConf SF16 - Unifying Real Time and Historical Analytics with the Lambd...
DataEngConf SF16 - Unifying Real Time and Historical Analytics with the Lambd...DataEngConf SF16 - Unifying Real Time and Historical Analytics with the Lambd...
DataEngConf SF16 - Unifying Real Time and Historical Analytics with the Lambd...Hakka Labs
 
DataEngConf SF16 - Three lessons learned from building a production machine l...
DataEngConf SF16 - Three lessons learned from building a production machine l...DataEngConf SF16 - Three lessons learned from building a production machine l...
DataEngConf SF16 - Three lessons learned from building a production machine l...Hakka Labs
 
DataEngConf SF16 - Scalable and Reliable Logging at Pinterest
DataEngConf SF16 - Scalable and Reliable Logging at PinterestDataEngConf SF16 - Scalable and Reliable Logging at Pinterest
DataEngConf SF16 - Scalable and Reliable Logging at PinterestHakka Labs
 
DataEngConf SF16 - Bridging the gap between data science and data engineering
DataEngConf SF16 - Bridging the gap between data science and data engineeringDataEngConf SF16 - Bridging the gap between data science and data engineering
DataEngConf SF16 - Bridging the gap between data science and data engineeringHakka Labs
 
DataEngConf SF16 - Multi-temporal Data Structures
DataEngConf SF16 - Multi-temporal Data StructuresDataEngConf SF16 - Multi-temporal Data Structures
DataEngConf SF16 - Multi-temporal Data StructuresHakka Labs
 
DataEngConf SF16 - Entity Resolution in Data Pipelines Using Spark
DataEngConf SF16 - Entity Resolution in Data Pipelines Using SparkDataEngConf SF16 - Entity Resolution in Data Pipelines Using Spark
DataEngConf SF16 - Entity Resolution in Data Pipelines Using SparkHakka Labs
 
DataEngConf SF16 - Beginning with Ourselves
DataEngConf SF16 - Beginning with OurselvesDataEngConf SF16 - Beginning with Ourselves
DataEngConf SF16 - Beginning with OurselvesHakka Labs
 
DataEngConf SF16 - Routing Billions of Analytics Events with High Deliverability
DataEngConf SF16 - Routing Billions of Analytics Events with High DeliverabilityDataEngConf SF16 - Routing Billions of Analytics Events with High Deliverability
DataEngConf SF16 - Routing Billions of Analytics Events with High DeliverabilityHakka Labs
 
DataEngConf SF16 - Tales from the other side - What a hiring manager wish you...
DataEngConf SF16 - Tales from the other side - What a hiring manager wish you...DataEngConf SF16 - Tales from the other side - What a hiring manager wish you...
DataEngConf SF16 - Tales from the other side - What a hiring manager wish you...Hakka Labs
 
DataEngConf SF16 - Methods for Content Relevance at LinkedIn
DataEngConf SF16 - Methods for Content Relevance at LinkedInDataEngConf SF16 - Methods for Content Relevance at LinkedIn
DataEngConf SF16 - Methods for Content Relevance at LinkedInHakka Labs
 
DataEngConf SF16 - Spark SQL Workshop
DataEngConf SF16 - Spark SQL WorkshopDataEngConf SF16 - Spark SQL Workshop
DataEngConf SF16 - Spark SQL WorkshopHakka Labs
 

Mais de Hakka Labs (20)

Always Valid Inference (Ramesh Johari, Stanford)
Always Valid Inference (Ramesh Johari, Stanford)Always Valid Inference (Ramesh Johari, Stanford)
Always Valid Inference (Ramesh Johari, Stanford)
 
DataEngConf SF16 - High cardinality time series search
DataEngConf SF16 - High cardinality time series searchDataEngConf SF16 - High cardinality time series search
DataEngConf SF16 - High cardinality time series search
 
DataEngConf SF16 - Data Asserts: Defensive Data Science
DataEngConf SF16 - Data Asserts: Defensive Data ScienceDataEngConf SF16 - Data Asserts: Defensive Data Science
DataEngConf SF16 - Data Asserts: Defensive Data Science
 
DatEngConf SF16 - Apache Kudu: Fast Analytics on Fast Data
DatEngConf SF16 - Apache Kudu: Fast Analytics on Fast DataDatEngConf SF16 - Apache Kudu: Fast Analytics on Fast Data
DatEngConf SF16 - Apache Kudu: Fast Analytics on Fast Data
 
DataEngConf SF16 - Recommendations at Instacart
DataEngConf SF16 - Recommendations at InstacartDataEngConf SF16 - Recommendations at Instacart
DataEngConf SF16 - Recommendations at Instacart
 
DataEngConf SF16 - Running simulations at scale
DataEngConf SF16 - Running simulations at scaleDataEngConf SF16 - Running simulations at scale
DataEngConf SF16 - Running simulations at scale
 
DataEngConf SF16 - Deriving Meaning from Wearable Sensor Data
DataEngConf SF16 - Deriving Meaning from Wearable Sensor DataDataEngConf SF16 - Deriving Meaning from Wearable Sensor Data
DataEngConf SF16 - Deriving Meaning from Wearable Sensor Data
 
DataEngConf SF16 - Collecting and Moving Data at Scale
DataEngConf SF16 - Collecting and Moving Data at Scale DataEngConf SF16 - Collecting and Moving Data at Scale
DataEngConf SF16 - Collecting and Moving Data at Scale
 
DataEngConf SF16 - BYOMQ: Why We [re]Built IronMQ
DataEngConf SF16 - BYOMQ: Why We [re]Built IronMQDataEngConf SF16 - BYOMQ: Why We [re]Built IronMQ
DataEngConf SF16 - BYOMQ: Why We [re]Built IronMQ
 
DataEngConf SF16 - Unifying Real Time and Historical Analytics with the Lambd...
DataEngConf SF16 - Unifying Real Time and Historical Analytics with the Lambd...DataEngConf SF16 - Unifying Real Time and Historical Analytics with the Lambd...
DataEngConf SF16 - Unifying Real Time and Historical Analytics with the Lambd...
 
DataEngConf SF16 - Three lessons learned from building a production machine l...
DataEngConf SF16 - Three lessons learned from building a production machine l...DataEngConf SF16 - Three lessons learned from building a production machine l...
DataEngConf SF16 - Three lessons learned from building a production machine l...
 
DataEngConf SF16 - Scalable and Reliable Logging at Pinterest
DataEngConf SF16 - Scalable and Reliable Logging at PinterestDataEngConf SF16 - Scalable and Reliable Logging at Pinterest
DataEngConf SF16 - Scalable and Reliable Logging at Pinterest
 
DataEngConf SF16 - Bridging the gap between data science and data engineering
DataEngConf SF16 - Bridging the gap between data science and data engineeringDataEngConf SF16 - Bridging the gap between data science and data engineering
DataEngConf SF16 - Bridging the gap between data science and data engineering
 
DataEngConf SF16 - Multi-temporal Data Structures
DataEngConf SF16 - Multi-temporal Data StructuresDataEngConf SF16 - Multi-temporal Data Structures
DataEngConf SF16 - Multi-temporal Data Structures
 
DataEngConf SF16 - Entity Resolution in Data Pipelines Using Spark
DataEngConf SF16 - Entity Resolution in Data Pipelines Using SparkDataEngConf SF16 - Entity Resolution in Data Pipelines Using Spark
DataEngConf SF16 - Entity Resolution in Data Pipelines Using Spark
 
DataEngConf SF16 - Beginning with Ourselves
DataEngConf SF16 - Beginning with OurselvesDataEngConf SF16 - Beginning with Ourselves
DataEngConf SF16 - Beginning with Ourselves
 
DataEngConf SF16 - Routing Billions of Analytics Events with High Deliverability
DataEngConf SF16 - Routing Billions of Analytics Events with High DeliverabilityDataEngConf SF16 - Routing Billions of Analytics Events with High Deliverability
DataEngConf SF16 - Routing Billions of Analytics Events with High Deliverability
 
DataEngConf SF16 - Tales from the other side - What a hiring manager wish you...
DataEngConf SF16 - Tales from the other side - What a hiring manager wish you...DataEngConf SF16 - Tales from the other side - What a hiring manager wish you...
DataEngConf SF16 - Tales from the other side - What a hiring manager wish you...
 
DataEngConf SF16 - Methods for Content Relevance at LinkedIn
DataEngConf SF16 - Methods for Content Relevance at LinkedInDataEngConf SF16 - Methods for Content Relevance at LinkedIn
DataEngConf SF16 - Methods for Content Relevance at LinkedIn
 
DataEngConf SF16 - Spark SQL Workshop
DataEngConf SF16 - Spark SQL WorkshopDataEngConf SF16 - Spark SQL Workshop
DataEngConf SF16 - Spark SQL Workshop
 

Último

EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWERMadyBayot
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century educationjfdjdjcjdnsjd
 
Vector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxVector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxRemote DBA Services
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingEdi Saputra
 
Six Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal OntologySix Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal Ontologyjohnbeverley2021
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FMESafe Software
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAndrey Devyatkin
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDropbox
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Jeffrey Haguewood
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...apidays
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Victor Rentea
 
[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdfSandro Moreira
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodJuan lago vázquez
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...apidays
 
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...apidays
 
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfOrbitshub
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...Zilliz
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...apidays
 

Último (20)

EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Vector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxVector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptx
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
Six Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal OntologySix Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal Ontology
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
 
[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
 
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
 

Digging into the Dirichlet Distribution by Max Sklar