Applied Statistics
Prof. Dr. Ir. O. Thas

Group Project
Ameril, Camar (01305028)
Herzallah, Mohammed (01303232)
Ludevese, Christine (01307342)
Nicomel, Nina Ricci (01302593)
Rifai, Ridwan (01302922)

2013 – 2014
1. Statistical Methodology
Coffee is one of the best-known beverages and one of the most important traded commodities, and it
holds an established place in the global market. Taste is the primary indicator of consumer
preference, so companies that market coffee are interested in the many factors that may affect the
taste of the coffee.
In one case study, a company organized a tasting session in which 40 women were each invited to
taste one cup of coffee with distinct characteristics. Each participant gave the cup a taste score
on a scale of 1 to 20 and reported her age (in years) and her average coffee consumption per day
(in cups). The researchers also collected data on the coffee samples: the caffeine content of each
cup (in milligrams), the origin of the coffee beans (Bolivia, Brazil, Colombia or Peru), the
roasting time of the beans (in minutes), and the species of the coffee plant (Coffea arabica or
Coffea robusta). Three questions are of primary interest: (1) whether the origin of the coffee
beans and the age of the participants influence the taste score; (2) whether the caffeine content
differs between the C. arabica and C. robusta plant species; and (3) whether age affects the
participants' average coffee consumption per day.
Prior to any statistical analysis, the normality of the residuals or observations and the equality
of group variances were assessed using the Kolmogorov-Smirnov (KS) test (supported by QQ plots) and
the Modified Levene test, respectively. Two-way Analysis of Variance (ANOVA) was used for Question 1
because the influence of two independent variables (origin of the coffee beans and age of the
participants) on the taste score needs to be examined. The interaction between the two independent
variables was then assessed. To locate the differences within each factor, the Tukey test was used
with a family-wise error rate of 5%. For Question 2, the dataset was first split by species of
coffee plant so that the assumptions could be assessed for each species separately. Because the
normality assumption was not fulfilled while the data were homoscedastic, the Wilcoxon rank-sum
test was used. One-way ANOVA was used for Question 3 because the influence of only one independent
variable (age) on the average coffee consumption per day of the participants needs to be determined.
All tests were done at a 5% level of significance using TIBCO Spotfire S+ software.
2. Results and Comments
2.1. Question 1
Data points on the QQ plot of the residuals (Figure 1) fall alternately below and above the line,
which may indicate some degree of skewness. For verification, the KS test was performed and it
confirmed non-normality with a p-value of 0.0081 (Table 1). However, since the number of
observations is greater than 30, the Central Limit Theorem may apply and approximate normality can
be assumed. As for the homoscedasticity assumption, the Modified Levene test gave no evidence
against the equality of the variances, with a p-value of 0.7993 (Table 2). Since the assumptions
were considered fulfilled, ANOVA was then performed to answer Question 1.
Table 3 shows that both the origin of the coffee beans and the age of the participants influence
the taste score, since both p-values are much less than 0.05. The interaction term is also
significant, with a p-value of 0.0486. This is supported by the interaction plot (Figure 2), in
which the lines are not parallel. The influence of the origin of the coffee beans on the mean taste
score therefore depends on the age of the participant, and vice versa. Because of the interaction,
each independent variable must be assessed separately within the levels of the other to find the
differences. In Tables 4, 5 and 6, the influence of origin on the mean taste score was assessed
within each of the three age classes. For people younger than 40 years (age class A), there is no
significant effect of origin on the mean taste score. For people aged 40 or older but younger than
60 years (age class B), significant differences were observed only between (1) Bolivia and Colombia
and (2) Brazil and Colombia. For people aged 60 years or older (age class C), significant
differences were observed between (1) Bolivia and Brazil, (2) Brazil and Colombia and (3) Brazil
and Peru. In Tables 7, 8, 9 and 10, the influence of age class on the mean taste score was assessed
within each of the four origins. For Bolivia, for which the output in Table 7 compares only age
classes B and C, there is a significant difference between these two classes. For Brazil,
significant differences were observed between (1) age class A and age class B and (2) age class A
and age class C. For Colombia, the only significant difference was between age class A and age
class C. For Peru, there is no significant effect of age class on the mean taste score. These
results suggest that the researchers should source their coffee beans from Peru, because no
significant difference in taste score between age classes was detected for beans of that origin.
2.2. Question 2
The normality of the observations for C. arabica and C. robusta was assessed separately. The QQ
plots (Figure 3) and box plots (Figure 4) for both plant species suggest that the observations for
C. robusta are approximately normally distributed while those for C. arabica are not. This is
supported by the KS tests (Tables 11 and 12), which gave p-values of 0.5 and 0 for C. robusta and
C. arabica, respectively. The normality assumption was therefore not fulfilled. As for the
homoscedasticity assumption, the Modified Levene test gave no evidence against equal group
variances, with a p-value of 0.6697864 (Table 13). The Wilcoxon rank-sum test was therefore
performed, and it showed that H0 must be rejected at a 5% level of significance (Table 14). Based
on this sample, the mean caffeine content of C. arabica is greater than the mean caffeine content
of C. robusta. However, with a p-value of 0.0338, the evidence is not strong, so further analyses
should be done before suggesting which species of coffee should be marketed.
2.3. Question 3
The QQ plot of the residuals (Figure 5) clearly shows that the residuals do not follow a normal
distribution. For verification, the KS test was performed and it confirmed non-normality with a
p-value of 0.0188 (Table 15). However, since the number of observations is greater than 30, the
Central Limit Theorem may apply and approximate normality can be assumed. As for the
homoscedasticity assumption, the Modified Levene test gave no evidence against the equality of the
variances, with a p-value of 0.6697 (Table 16). Since the assumptions were considered fulfilled,
ANOVA was then performed to answer Question 3.
From Table 17, it can be strongly inferred that at least one age class has a different average
frequency of coffee consumption per day. At a 5% family-wise level of significance, there is a
significant difference in the mean frequency of coffee consumption per day between (1) age class B
and age class C and (2) age class A and age class B (Table 18). We are 95% family-wise confident
that the mean frequency of coffee consumption per day in age class B is between 1.38 and 3.53 cups
higher than in age class C. The interval for age class A and age class B is interpreted in the same
way. Moreover, the estimates in Table 18 indicate that the most interesting target age group is B,
followed by C, then by A.
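The simultaneous intervals discussed here follow the form estimate ± critical point × standard error. The Python sketch below redoes that arithmetic with the values printed in the S+ Tukey output for Question 3 (Table 18). The function name and the short labels A-B, A-C and B-C are illustrative and not part of the S+ output; small differences from the printed bounds are expected because the inputs are rounded.

```python
def simultaneous_interval(estimate, std_error, critical_point):
    """Return (lower, upper) = estimate -/+ critical_point * std_error."""
    half_width = critical_point * std_error
    return estimate - half_width, estimate + half_width

# Values copied from the Tukey output (Table 18); labels are shorthand for the age classes.
critical_point = 2.4415
contrasts = {
    "A-B": (-3.440, 0.544),
    "A-C": (-0.989, 0.581),
    "B-C": (2.450, 0.441),
}

intervals = {
    name: simultaneous_interval(est, se, critical_point)
    for name, (est, se) in contrasts.items()
}

for name, (lower, upper) in intervals.items():
    # An interval that excludes 0 corresponds to the '****' flag in the S+ output.
    flag = "****" if lower > 0 or upper < 0 else ""
    print(f"{name}: [{lower:.2f}, {upper:.2f}] {flag}")
```

Under these inputs the A-B and B-C intervals exclude zero while the A-C interval does not, which matches the flags in the S+ output.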
3. Executive Summary
In general, the statistical analyses indicate that the origin of the coffee beans and the age of
the participants influence the taste of the coffee. The younger age group (participants younger
than 40 years) does not discriminate between origins of the coffee beans. However, for the middle
age group (participants aged 40 or older but younger than 60 years) and the older age group
(participants aged 60 or older), the origin of the coffee beans is a significant factor that
affects the taste of the coffee. Peru is further suggested as the preferred source of coffee beans.
The mean caffeine content of the more expensive C. arabica is higher than the mean caffeine content
of the cheaper C. robusta. Yet, because the evidence at the 5% level of significance is not strong,
further analyses should be done before recommending which species is best for manufacturing coffee
beverages.
The age group of the consumer is important to consider for the marketability of the coffee
product. The middle age group (40 to 59 years old) is the most interesting target consumer since it
has the highest frequency of consumption. The second most interesting is the older age group (60
years old and above) and the third is the younger age group (39 years old and below), which have
the second highest and the lowest frequency of coffee consumption, respectively.
4. Appendices
4.1. Appendix 1: S+ outputs for Question 1
4.1.1. Assessment of normality assumption
To assess the normality assumption, the Kolmogorov-Smirnov test was used. In this test, the
residuals were considered since the number of observations for each origin is 10, which is a relatively
small number. Additionally, the QQ plot of the residuals was obtained and considered.
H0: Residuals follow a normal distribution.
H1: Residuals do not follow a normal distribution.
Table 1. S+ output of Kolmogorov-Smirnov test.
One sample Kolmogorov-Smirnov Test of Composite Normality
data: residuals in coffee.data
ks = 0.1644, p-value = 0.0081
alternative hypothesis: True cdf is not the normal distn. with estimated parameters
sample estimates:
 mean of x   standard deviation of x
         0                  2.213208
[QQ plot: Residuals (vertical axis) against Quantiles of Standard Normal (horizontal axis).]
Figure 1. QQ plot of the residuals.
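For readers unfamiliar with the statistic reported as ks in Table 1, the sketch below computes a Kolmogorov-Smirnov distance between the empirical distribution of a sample and a normal distribution whose mean and standard deviation are estimated from the same sample. It is a minimal Python illustration on hypothetical residuals, not the study data, and it does not attempt to reproduce the p-value calculation that S+ uses for composite normality. The function names are illustrative.

```python
import math
from statistics import mean, stdev

def normal_cdf(x, mu, sigma):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def ks_statistic_normal(sample):
    """Largest gap between the empirical CDF and a normal CDF with estimated parameters."""
    xs = sorted(sample)
    n = len(xs)
    mu, sigma = mean(xs), stdev(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = normal_cdf(x, mu, sigma)
        # The empirical CDF jumps from (i - 1) / n to i / n at x; check both sides of the jump.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d

# Hypothetical residuals for illustration only (not taken from coffee.data).
hypothetical_residuals = [-2.1, -1.4, -0.9, -0.3, 0.0, 0.2, 0.5, 1.1, 1.3, 1.6]
d = ks_statistic_normal(hypothetical_residuals)
print(f"KS distance = {d:.4f}")
```

A larger distance indicates a larger departure from the fitted normal distribution; turning it into a p-value requires tables or approximations specific to the case of estimated parameters.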
4.1.2. Assessment of homoscedasticity assumption
To assess the homoscedasticity assumption, Modified Levene test was used.
H0: Variances of all groups are equal.
H1: Variances of all groups are not equal.
Table 2. S+ output of Modified Levene test.
*** Modified Levene test ***
           Df  Sum of Sq   Mean Sq   F Value      Pr(F)
groep      10     22.675  2.267500  0.601624  0.7993811
Residuals  29    109.300  3.768966
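The output above has the layout of a one-way ANOVA table. Assuming that "Modified Levene" refers to the median-based (Brown-Forsythe) variant, the test amounts to a one-way ANOVA on the absolute deviations of each observation from its group median. The Python sketch below illustrates that idea on hypothetical groups; it is not the S+ routine, the data are not from the study, and the function name is illustrative.

```python
from statistics import mean, median

def modified_levene_f(groups):
    """Return (F, df_between, df_within) for a median-based Levene-type test."""
    # Step 1: absolute deviation of each observation from its own group median.
    deviations = [[abs(x - median(g)) for x in g] for g in groups]
    # Step 2: ordinary one-way ANOVA F ratio computed on those deviations.
    all_dev = [d for g in deviations for d in g]
    grand_mean = mean(all_dev)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in deviations)
    ss_within = sum((d - mean(g)) ** 2 for g in deviations for d in g)
    df_between = len(groups) - 1
    df_within = len(all_dev) - len(groups)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within

# Hypothetical scores for three groups (illustration only, not the study data).
example_groups = [
    [12, 14, 15, 13, 16],
    [9, 11, 10, 12, 8],
    [10, 20, 15, 25, 5],
]
f_value, df1, df2 = modified_levene_f(example_groups)
print(f"F = {f_value:.4f} on {df1} and {df2} degrees of freedom")
```

The F value would then be compared with an F distribution on the two degrees of freedom to obtain the Pr(F) column; that step is omitted here.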
4.1.3. Assessment of interaction effects
H0: There is no interaction between origin and age for average taste score.
H1: There is an interaction between origin and age for average taste score.
Table 3. S+ output of the two-way ANOVA for the interaction assessment.
*** Analysis of Variance Model ***
Short Output:
Call:
aov(formula = taste ~ origin * ageclass, data = coffee.data, na.action = na.exclude)
Terms:
                   origin  ageclass  origin:ageclass  Residuals
Sum of Squares   384.0750  194.2043          84.4623   191.0333
Deg. of Freedom         3         2                5         29
Residual standard error: 2.566585
1 out of 12 effects not estimable
Estimated effects may be unbalanced

Type III Sum of Squares
                 Df  Sum of Sq   Mean Sq   F Value       Pr(F)
origin            3   309.5743  103.1914  15.66508  0.00000303
ageclass          2   190.2823   95.1412  14.44300  0.00004441
origin:ageclass   5    84.4623   16.8925   2.56438  0.04868222
Residuals        29   191.0333    6.5874

[Interaction plot: mean of taste (vertical axis) against origin (Bolivia, Brazil, Columbia, Peru),
with one line per ageclass: 21.99+ thru 39, 39.00+ thru 59, 59.00+ thru 70.]
Figure 2. Interaction plot for origin and age class.
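As a check on Table 3, each F value in the Type III table is the mean square of the term divided by the residual mean square. The Python sketch below redoes that arithmetic with the printed values; the variable and function names are illustrative, and agreement with the S+ output is only up to the rounding of the printed mean squares.

```python
# Mean squares copied from the Type III table in Table 3.
mean_squares = {
    "origin": 103.1914,
    "ageclass": 95.1412,
    "origin:ageclass": 16.8925,
}
residual_mean_square = 6.5874  # residual sum of squares 191.0333 on 29 degrees of freedom

def f_ratio(mean_square, residual_ms):
    """F value = mean square of the term / residual mean square."""
    return mean_square / residual_ms

f_values = {
    term: f_ratio(ms, residual_mean_square) for term, ms in mean_squares.items()
}

for term, f in f_values.items():
    print(f"{term}: F = {f:.3f}")
```

The interaction term has by far the smallest F value of the three, which is consistent with its p-value lying just below 0.05 while the two main effects are highly significant.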
4.1.4. Effects of origin of the coffee beans on the taste score of participants younger than 40
years
H0: There is no effect of the origin of the coffee beans on the taste score of participants younger than
40 years.
H1: There is an effect of the origin of the coffee beans on the taste score of participants younger than
40 years.
Table 4. S+ output of the one-way ANOVA for effects of origin of the coffee beans on the taste
score of participants younger than 40 years.
*** Analysis of Variance Model ***
Short Output:
Call:
aov(formula = taste ~ origin, data = coffee.data.21.99..thru.39, na.action = na.exclude)
Terms:
                   origin  Residuals
Sum of Squares   54.85714   32.00000
Deg. of Freedom         2          4
Residual standard error: 2.828427
Estimated effects may be unbalanced
Type III Sum of Squares
           Df  Sum of Sq   Mean Sq   F Value      Pr(F)
origin      2   54.85714  27.42857  3.428571  0.1357341
Residuals   4   32.00000   8.00000
95 % simultaneous confidence intervals for specified
linear combinations, by the Tukey method
critical point: 3.5638
response variable: taste
intervals excluding 0 are flagged by '****'
                  Estimate  Std.Error  Lower Bound  Upper Bound
Brazil-Columbia  8.00e+000       3.27        -3.64        19.60
Brazil-Peru      8.00e+000       3.27        -3.64        19.60
Columbia-Peru    8.44e-015       2.31        -8.23         8.23
4.1.5. Effects of origin of the coffee beans on the taste score of participants aged 40 or older but
younger than 60 years
H0: There is no effect of the origin of the coffee beans on the taste score of participants aged 40 or
older but younger than 60 years.
H1: There is an effect of the origin of the coffee beans on the taste score of participants aged 40 or
older but younger than 60 years
Table 5. S+ output of the one-way ANOVA for effects of origin of the coffee beans on the taste
score of participants aged 40 or older but younger than 60 years.
*** Analysis of Variance Model ***
Short Output:
Call:
aov(formula = taste ~ origin, data = coffee.data.39.00..thru.59, na.action = na.exclude)
Terms:
                   origin  Residuals
Sum of Squares   216.3333   120.6167
Deg. of Freedom         3         16
Residual standard error: 2.74564
Estimated effects may be unbalanced
Type III Sum of Squares
           Df  Sum of Sq   Mean Sq   F Value         Pr(F)
origin      3   216.3333  72.11111  9.565658  0.0007429687
Residuals  16   120.6167   7.53854
Estimated Coefficients:
(Intercept)  originBrazil  originColumbia  originPeru
       15.5          -0.9       -9.166667       -4.75
95 % simultaneous confidence intervals for specified
linear combinations, by the Tukey method
critical point: 2.861
response variable: taste
intervals excluding 0 are flagged by '****'
                  Estimate  Std.Error  Lower Bound  Upper Bound
Bolivia-Brazil        0.90       1.57      -3.5800         5.38
Bolivia-Columbia      9.17       1.86       3.8500        14.50  ****
Bolivia-Peru          4.75       1.68      -0.0603         9.56
Brazil-Columbia       8.27       2.01       2.5300        14.00  ****
Brazil-Peru           3.85       1.84      -1.4200         9.12
Columbia-Peru        -4.42       2.10     -10.4000         1.58
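The Tukey estimates in Table 5 can be related to the estimated coefficients printed above them. The Python sketch below assumes that S+ used treatment contrasts with Bolivia as the reference level, so that the intercept is the Bolivia mean and each coefficient is an offset from it. That coding is an assumption inferred from the output rather than something the output states; the variable names are illustrative.

```python
from itertools import combinations

# Estimated coefficients copied from Table 5.
intercept = 15.5  # assumed to be the mean taste score for Bolivia (reference level)
offsets = {
    "Bolivia": 0.0,
    "Brazil": -0.9,
    "Columbia": -9.166667,
    "Peru": -4.75,
}

# Under treatment contrasts, each group mean is the intercept plus the group's offset.
group_means = {origin: intercept + offset for origin, offset in offsets.items()}

# Pairwise differences in the same order as the rows of the Tukey output.
differences = {
    f"{a}-{b}": group_means[a] - group_means[b]
    for a, b in combinations(group_means, 2)
}

for pair, diff in differences.items():
    print(f"{pair}: {diff:.2f}")
```

Under this assumption the differences agree with the Estimate column of the Tukey output to the printed precision, which supports reading the coefficients this way.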
4.1.6. Effects of origin of the coffee beans on the taste score of participants aged 60 or older
H0: There is no effect of the origin of the coffee beans on the taste score of participants aged 60 or
older.
H1: There is an effect of the origin of the coffee beans on the taste score of participants aged 60 or
older.
Table 6. S+ output of the one-way ANOVA for effects of origin of the coffee beans on the taste
score of participants aged 60 or older.
*** Analysis of Variance Model ***
Short Output:
Call:
aov(formula = taste ~ origin, data = coffee.data.59.00..thru.70, na.action = na.exclude)
Terms:
                   origin  Residuals
Sum of Squares   222.6603    38.4167
Deg. of Freedom         3          9
Residual standard error: 2.066039
Estimated effects may be unbalanced
Type III Sum of Squares
           Df  Sum of Sq   Mean Sq   F Value         Pr(F)
origin      3   222.6603  74.22009  17.38779  0.0004362847
Residuals   9    38.4167   4.26852
Estimated Coefficients:
(Intercept)  originBrazil  originColumbia  originPeru
          7           7.5           -2.75   0.6666667
95 % simultaneous confidence intervals for specified
linear combinations, by the Tukey method
critical point: 3.1219
response variable: taste
intervals excluding 0 are flagged by '****'
                  Estimate  Std.Error  Lower Bound  Upper Bound
Bolivia-Brazil      -7.500       1.79       -13.10        -1.91  ****
Bolivia-Columbia     2.750       1.79        -2.84         8.34
Bolivia-Peru        -0.667       1.89        -6.55         5.22
Brazil-Columbia     10.300       1.46         5.69        14.80  ****
Brazil-Peru          6.830       1.58         1.91        11.80  ****
Columbia-Peru       -3.420       1.58        -8.34         1.51
4.1.7. Effects of the age class of the participants on the taste score of coffee from Bolivia
H0: There is no effect of the age class of the participants on the taste score of coffee from Bolivia.
H1: There is an effect of the age class of the participants on the taste score of coffee from Bolivia.
Table 7. S+ output of the one-way ANOVA for effects of the age class of the participants on the
taste score of coffee from Bolivia.
*** Analysis of Variance Model ***
Short Output:
Call:
aov(formula = taste ~ ageclass, data = coffee.data.Bolivia, na.action = na.exclude)
Terms:
                 ageclass  Residuals
Sum of Squares      115.6       74.0
Deg. of Freedom         1          8
Residual standard error: 3.041381
Estimated effects may be unbalanced
Type III Sum of Squares
           Df  Sum of Sq  Mean Sq  F Value        Pr(F)
ageclass    1      115.6   115.60  12.4973  0.007674012
Residuals   8       74.0     9.25
95 % non-simultaneous confidence intervals for specified
linear combinations, by the Fisher LSD method
critical point: 2.306
response variable: taste
intervals excluding 0 are flagged by '****'
                                Estimate  Std.Error  Lower Bound  Upper Bound
39.00+ thru 59-59.00+ thru 70        8.5        2.4         2.96           14  ****
4.1.8. Effects of the age class of the participants on the taste score of coffee from Brazil
H0: There is no effect of the age class of the participants on the taste score of coffee from Brazil.
H1: There is an effect of the age class of the participants on the taste score of coffee from Brazil.
Table 8. S+ output of the one-way ANOVA for effects of the age class of the participants on the
taste score of coffee from Brazil.
*** Analysis of Variance Model ***
Short Output:
Call:
aov(formula = taste ~ ageclass, data = coffee.data.Brazil, na.action = na.exclude)
Terms:
                 ageclass  Residuals
Sum of Squares       26.7       12.2
Deg. of Freedom         2          7
Residual standard error: 1.320173
Estimated effects may be unbalanced
Type III Sum of Squares
           Df  Sum of Sq   Mean Sq   F Value       Pr(F)
ageclass    2       26.7  13.35000  7.659836  0.01727571
Residuals   7       12.2   1.74286
95 % simultaneous confidence intervals for specified
linear combinations, by the Tukey method
critical point: 2.9451
response variable: taste
intervals excluding 0 are flagged by '****'
                                Estimate  Std.Error  Lower Bound  Upper Bound
21.99+ thru 39-39.00+ thru 59        5.4      1.450         1.14         9.66  ****
21.99+ thru 39-59.00+ thru 70        5.5      1.480         1.15         9.85  ****
39.00+ thru 59-59.00+ thru 70        0.1      0.886        -2.51         2.71
4.1.9. Effects of the age class of the participants on the taste score of coffee from Colombia
H0: There is no effect of the age class of the participants on the taste score of coffee from Colombia.
H1: There is an effect of the age class of the participants on the taste score of coffee from Colombia.
Table 9. S+ output of the one-way ANOVA for effects of the age class of the participants on the
taste score of coffee from Colombia.
*** Analysis of Variance Model ***
Short Output:
Call:
aov(formula = taste ~ ageclass, data = coffee.data.Columbia, na.action = na.exclude)
Terms:
                 ageclass  Residuals
Sum of Squares   106.1833    69.4167
Deg. of Freedom         2          7
Residual standard error: 3.149074
Estimated effects may be unbalanced
Type III Sum of Squares
           Df  Sum of Sq   Mean Sq   F Value       Pr(F)
ageclass    2   106.1833  53.09167  5.353782  0.03884073
Residuals   7    69.4167   9.91667
95 % simultaneous confidence intervals for specified
linear combinations, by the Tukey method
critical point: 2.9451
response variable: taste
intervals excluding 0 are flagged by '****'
                                Estimate  Std.Error  Lower Bound  Upper Bound
21.99+ thru 39-39.00+ thru 59       5.67       2.57       -1.910        13.20
21.99+ thru 39-59.00+ thru 70       7.75       2.41        0.667        14.80  ****
39.00+ thru 59-59.00+ thru 70       2.08       2.41       -5.000         9.17
4.1.10. Effects of the age class of the participants on the taste score of coffee from Peru
H0: There is no effect of the age class of the participants on the taste score of coffee from Peru.
H1: There is an effect of the age class of the participants on the taste score of coffee from Peru.
Table 10. S+ output of the one-way ANOVA for effects of the age class of the participants on
the taste score of coffee from Peru.
*** Analysis of Variance Model ***
Short Output:
Call:
aov(formula = taste ~ ageclass, data = coffee.data.Peru, na.action = na.exclude)
Terms:
                 ageclass  Residuals
Sum of Squares   30.18333   35.41667
Deg. of Freedom         2          7
Residual standard error: 2.249339
Estimated effects may be unbalanced
Type III Sum of Squares
           Df  Sum of Sq   Mean Sq   F Value      Pr(F)
ageclass    2   30.18333  15.09167  2.982824  0.1156281
Residuals   7   35.41667   5.05952
95 % simultaneous confidence intervals for specified
linear combinations, by the Tukey method
critical point: 2.9451
response variable: taste
intervals excluding 0 are flagged by '****'
                                Estimate  Std.Error  Lower Bound  Upper Bound
21.99+ thru 39-39.00+ thru 59       1.25       1.72        -3.81         6.31
21.99+ thru 39-59.00+ thru 70       4.33       1.84        -1.08         9.74
39.00+ thru 59-59.00+ thru 70       3.08       1.72        -1.98         8.14
4.2. Appendix 2: S+ outputs for Question 2
4.2.1. Assessment of normality assumption
There are two samples with 20 observations each. The dataset was split first according to species (i.e.
C. arabica and C. robusta) then the normality of both samples was assessed separately.
H0: The observations for C. arabica were obtained from a normal distribution.
H1: The observations for C. arabica were not obtained from a normal distribution.
Table 11. S+ output of Kolmogorov-Smirnov test for C. arabica.
One sample Kolmogorov-Smirnov Test of Composite Normality
data: caffein in coffee.data.arabica
ks = 0.305, p-value = 0
alternative hypothesis: True cdf is not the normal distn. with estimated parameters
sample estimates:
 mean of x   standard deviation of x
  284.7324                  133.3316

H0: The observations for C. robusta were obtained from a normal distribution.
H1: The observations for C. robusta were not obtained from a normal distribution.
Table 12. S+ output of Kolmogorov-Smirnov test for C. robusta.
One sample Kolmogorov-Smirnov Test of Composite Normality
data: caffein in coffee.data.robusta
ks = 0.125, p-value = 0.5
alternative hypothesis: True cdf is not the normal distn. with estimated parameters
sample estimates:
 mean of x   standard deviation of x
  207.8488                  133.0805
[QQ plots, panels (a) and (b): caffein (vertical axis) against Normal Distribution (horizontal axis).]
Figure 3. QQ plots of the caffeine content of (a) C. arabica and (b) C. robusta.

[Box plots: caffein (vertical axis) by species (arabica, robusta).]
Figure 4. Box plots of the caffeine content of C. arabica and C. robusta.
4.2.2. Assessment of homoscedasticity assumption
H0: Variances of all groups are equal.
H1: Variances of all groups are not equal.
Table 13. S+ output of Modified Levene test.
*** Modified Levene test ***
           Df  Sum of Sq   Mean Sq    F Value      Pr(F)
groep       1     1756.0  1756.021  0.1847069  0.6697864
Residuals  38   361268.6  9507.069
4.2.3. Assessment of the caffeine contents of C. arabica and C. robusta
H0: The mean caffeine content of C. arabica is equal to the mean caffeine content of C. robusta.
H1: The mean caffeine content of C. arabica is greater than the mean caffeine content of C. robusta.
Table 14. S+ output of Exact Wilcoxon rank-sum test.
Exact Wilcoxon rank-sum test
data: x: caffein with species = arabica, and y: caffein with species = robusta
rank-sum statistic W = 478, n = 20, m = 20, p-value = 0.0338
alternative hypothesis: mu is greater than 0
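The exact p-value in Table 14 can be cross-checked with the usual large-sample normal approximation to the rank-sum statistic. The Python sketch below assumes that W is the sum of the ranks of the arabica sample, that there are no ties, and that a continuity correction of 0.5 is applied; none of this is stated in the S+ output, and the function name is illustrative. It is an approximation, not the exact method S+ reports.

```python
import math

def rank_sum_upper_p(w, n, m):
    """Upper-tail normal approximation for a rank-sum statistic (no tie correction)."""
    mean_w = n * (n + m + 1) / 2.0       # expected rank sum under H0
    var_w = n * m * (n + m + 1) / 12.0   # variance of the rank sum under H0
    z = (w - 0.5 - mean_w) / math.sqrt(var_w)  # 0.5 is the continuity correction
    p = 0.5 * math.erfc(z / math.sqrt(2.0))    # P(Z >= z) for a standard normal Z
    return z, p

# Values copied from Table 14.
z, p = rank_sum_upper_p(478, 20, 20)
print(f"z = {z:.3f}, approximate one-sided p = {p:.4f}")
```

Under these assumptions the approximate p-value lies close to the exact value of 0.0338, so the conclusion at the 5% level does not depend on which of the two calculations is used.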
4.3. Appendix 3: S+ outputs for Question 3
4.3.1. Assessment of normality assumption
H0: Residuals follow a normal distribution.
H1: Residuals do not follow a normal distribution.
Table 15. S+ output of Kolmogorov-Smirnov test.
One sample Kolmogorov-Smirnov Test of Composite Normality
data: residuals in coffee.data
ks = 0.1533, p-value = 0.0188
alternative hypothesis: True cdf is not the normal distn. with estimated parameters
sample estimates:
     mean of x   standard deviation of x
 5.551115e-018                  1.206281

[QQ plot: Residuals (vertical axis) against Quantiles of Standard Normal (horizontal axis).]
Figure 5. QQ plot of the residuals.
4.3.2. Assessment of homoscedasticity assumption
H0: Variances of all groups are equal.
H1: Variances of all groups are not equal.
Table 16. S+ output of Modified Levene test.
*** Modified Levene test ***
           Df  Sum of Sq   Mean Sq    F Value      Pr(F)
groep       1     1756.0  1756.021  0.1847069  0.6697864
Residuals  38   361268.6  9507.069
4.3.3. Assessment of the differences in coffee consumption among the three age classes
H0: All three age classes result in the same average frequency of coffee consumption per day.
H1: At least one age class results in a different average frequency of coffee consumption per day.
Table 17. S+ output of the one-way ANOVA for the determination if differences in coffee
consumption among the three age classes exist.
*** Analysis of Variance Model ***
Short Output:
Call:
aov(formula = frequency ~ ageclass, data = coffee.data, na.action = na.exclude)
Terms:
                 ageclass  Residuals
Sum of Squares   82.85055   56.74945
Deg. of Freedom         2         37
Residual standard error: 1.238454
Estimated effects may be unbalanced
Type III Sum of Squares
           Df  Sum of Sq   Mean Sq   F Value          Pr(F)
ageclass    2   82.85055  41.42527  27.00881  5.860174e-008
Residuals  37   56.74945   1.53377

Table 18. S+ output of the Tukey method.
95 % simultaneous confidence intervals for specified
linear combinations, by the Tukey method
critical point: 2.4415
response variable: frequency
intervals excluding 0 are flagged by '****'
                                Estimate  Std.Error  Lower Bound  Upper Bound
21.99+ thru 39-39.00+ thru 59     -3.440      0.544        -4.77       -2.110  ****
21.99+ thru 39-59.00+ thru 70     -0.989      0.581        -2.41        0.429
39.00+ thru 59-59.00+ thru 70      2.450      0.441         1.38        3.530  ****

 
sensoryanalysis_poster
sensoryanalysis_postersensoryanalysis_poster
sensoryanalysis_poster
 
Difference-from-Control test.pptx
Difference-from-Control test.pptxDifference-from-Control test.pptx
Difference-from-Control test.pptx
 
Final Data Mining_Elizabeth Ortega
Final Data Mining_Elizabeth OrtegaFinal Data Mining_Elizabeth Ortega
Final Data Mining_Elizabeth Ortega
 
Coffee and Cancer_Benefit-Risk Evaluation_Coughlin and Nehlig_ASIC Costa Rica...
Coffee and Cancer_Benefit-Risk Evaluation_Coughlin and Nehlig_ASIC Costa Rica...Coffee and Cancer_Benefit-Risk Evaluation_Coughlin and Nehlig_ASIC Costa Rica...
Coffee and Cancer_Benefit-Risk Evaluation_Coughlin and Nehlig_ASIC Costa Rica...
 
Effects of coffee intake in different types of cancer- Carla Figueroa
Effects of coffee intake in different types of cancer- Carla FigueroaEffects of coffee intake in different types of cancer- Carla Figueroa
Effects of coffee intake in different types of cancer- Carla Figueroa
 

Mais de Mohamed Herzallah

Mais de Mohamed Herzallah (10)

Aartselaar Aquafin water , Environmental water plant .
Aartselaar Aquafin water , Environmental water plant .Aartselaar Aquafin water , Environmental water plant .
Aartselaar Aquafin water , Environmental water plant .
 
EIA for Al Tafila wind farm
EIA for Al Tafila wind farm EIA for Al Tafila wind farm
EIA for Al Tafila wind farm
 
Report Tidal power .
Report Tidal power . Report Tidal power .
Report Tidal power .
 
Tidal power - Renewable energy .
Tidal power - Renewable energy  . Tidal power - Renewable energy  .
Tidal power - Renewable energy .
 
Mohammed herzallah doroteya
Mohammed herzallah   doroteyaMohammed herzallah   doroteya
Mohammed herzallah doroteya
 
Diabetes in the Arabs world .
Diabetes in the Arabs world .Diabetes in the Arabs world .
Diabetes in the Arabs world .
 
Household waste in PALESTINE .
Household waste in PALESTINE .Household waste in PALESTINE .
Household waste in PALESTINE .
 
Asia .
Asia . Asia .
Asia .
 
Microbiology .
Microbiology . Microbiology .
Microbiology .
 
Ecological Risk Assessment
Ecological Risk Assessment Ecological Risk Assessment
Ecological Risk Assessment
 

Último

Introduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsIntroduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsTechSoup
 
The basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxThe basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxheathfieldcps1
 
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in DelhiRussian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhikauryashika82
 
social pharmacy d-pharm 1st year by Pragati K. Mahajan
social pharmacy d-pharm 1st year by Pragati K. Mahajansocial pharmacy d-pharm 1st year by Pragati K. Mahajan
social pharmacy d-pharm 1st year by Pragati K. Mahajanpragatimahajan3
 
Interactive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationInteractive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationnomboosow
 
Disha NEET Physics Guide for classes 11 and 12.pdf
Disha NEET Physics Guide for classes 11 and 12.pdfDisha NEET Physics Guide for classes 11 and 12.pdf
Disha NEET Physics Guide for classes 11 and 12.pdfchloefrazer622
 
Measures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeMeasures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeThiyagu K
 
BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...
BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...
BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...Sapna Thakur
 
Arihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdfArihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdfchloefrazer622
 
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...Krashi Coaching
 
1029 - Danh muc Sach Giao Khoa 10 . pdf
1029 -  Danh muc Sach Giao Khoa 10 . pdf1029 -  Danh muc Sach Giao Khoa 10 . pdf
1029 - Danh muc Sach Giao Khoa 10 . pdfQucHHunhnh
 
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...christianmathematics
 
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111Sapana Sha
 
APM Welcome, APM North West Network Conference, Synergies Across Sectors
APM Welcome, APM North West Network Conference, Synergies Across SectorsAPM Welcome, APM North West Network Conference, Synergies Across Sectors
APM Welcome, APM North West Network Conference, Synergies Across SectorsAssociation for Project Management
 
Student login on Anyboli platform.helpin
Student login on Anyboli platform.helpinStudent login on Anyboli platform.helpin
Student login on Anyboli platform.helpinRaunakKeshri1
 
Class 11th Physics NEET formula sheet pdf
Class 11th Physics NEET formula sheet pdfClass 11th Physics NEET formula sheet pdf
Class 11th Physics NEET formula sheet pdfAyushMahapatra5
 
Measures of Dispersion and Variability: Range, QD, AD and SD
Measures of Dispersion and Variability: Range, QD, AD and SDMeasures of Dispersion and Variability: Range, QD, AD and SD
Measures of Dispersion and Variability: Range, QD, AD and SDThiyagu K
 

Último (20)

Introduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsIntroduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The Basics
 
Advance Mobile Application Development class 07
Advance Mobile Application Development class 07Advance Mobile Application Development class 07
Advance Mobile Application Development class 07
 
The basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxThe basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptx
 
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in DelhiRussian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
 
social pharmacy d-pharm 1st year by Pragati K. Mahajan
social pharmacy d-pharm 1st year by Pragati K. Mahajansocial pharmacy d-pharm 1st year by Pragati K. Mahajan
social pharmacy d-pharm 1st year by Pragati K. Mahajan
 
Interactive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationInteractive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communication
 
Disha NEET Physics Guide for classes 11 and 12.pdf
Disha NEET Physics Guide for classes 11 and 12.pdfDisha NEET Physics Guide for classes 11 and 12.pdf
Disha NEET Physics Guide for classes 11 and 12.pdf
 
Mattingly "AI & Prompt Design: The Basics of Prompt Design"
Mattingly "AI & Prompt Design: The Basics of Prompt Design"Mattingly "AI & Prompt Design: The Basics of Prompt Design"
Mattingly "AI & Prompt Design: The Basics of Prompt Design"
 
Measures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeMeasures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and Mode
 
BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...
BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...
BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...
 
Arihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdfArihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdf
 
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
 
1029 - Danh muc Sach Giao Khoa 10 . pdf
1029 -  Danh muc Sach Giao Khoa 10 . pdf1029 -  Danh muc Sach Giao Khoa 10 . pdf
1029 - Danh muc Sach Giao Khoa 10 . pdf
 
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
 
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
 
APM Welcome, APM North West Network Conference, Synergies Across Sectors
APM Welcome, APM North West Network Conference, Synergies Across SectorsAPM Welcome, APM North West Network Conference, Synergies Across Sectors
APM Welcome, APM North West Network Conference, Synergies Across Sectors
 
Student login on Anyboli platform.helpin
Student login on Anyboli platform.helpinStudent login on Anyboli platform.helpin
Student login on Anyboli platform.helpin
 
Class 11th Physics NEET formula sheet pdf
Class 11th Physics NEET formula sheet pdfClass 11th Physics NEET formula sheet pdf
Class 11th Physics NEET formula sheet pdf
 
Measures of Dispersion and Variability: Range, QD, AD and SD
Measures of Dispersion and Variability: Range, QD, AD and SDMeasures of Dispersion and Variability: Range, QD, AD and SD
Measures of Dispersion and Variability: Range, QD, AD and SD
 
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptxINDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
 

Ameril herzallah ludevese_nicomel_rifai (applied statistics project)

Applied Statistics
Prof. Dr. Ir. O. Thas

Group Project
Ameril, Camar (01305028)
Herzallah, Mohammed (01303232)
Ludevese, Christine (01307342)
Nicomel, Nina Ricci (01302593)
Rifai, Ridwan (01302922)

2013 – 2014
1. Statistical Methodology

Coffee is one of the best-known beverages and one of the most important commodities in the global market. Its taste is the primary indicator of consumer preference, so companies that market coffee are interested in the many factors that may affect it.

In one case study, a company organized a tasting session in which 40 women were each invited to taste one cup of coffee with distinct characteristics. Each participant gave the cup a taste score on a scale of 1 to 20 and reported her age (in years) and her average coffee consumption per day (in cups). The researchers also collected data on the coffee samples: the caffeine content of each cup (in milligrams), the origin of the coffee beans (Bolivia, Brazil, Columbia or Peru), the roasting time of the beans (in minutes), and the species of the coffee plant (Coffea arabica or Coffea robusta). The primary questions are: (1) whether the origin of the coffee beans and the age of the participants influence the taste score; (2) whether the caffeine content differs between C. arabica and C. robusta; and (3) whether age affects the participants' average coffee consumption per day.

Before any statistical analysis, the assumptions of normality of the residuals or observations and equality of group variances were assessed using the Kolmogorov-Smirnov (KS) test (with supporting QQ plots) and the Modified Levene test, respectively. Two-way Analysis of Variance (ANOVA) was used for Question 1, since the influence of two independent variables (origin of the coffee beans and age of the participants) on the taste score needs to be examined.
Subsequently, the interaction between the two independent variables was assessed. To locate the differences within each factor, the Tukey test was used with a family-wise error rate of 5%. For Question 2, the dataset was first split by species to assess the assumptions, since two species of coffee plant are involved. Because the normality assumption was not fulfilled while the data are homoscedastic, the Wilcoxon rank-sum test was used for evaluation. One-way ANOVA was used for Question 3, since the influence of only one independent variable (age) on the participants' average coffee consumption per day needs to be determined. All tests were done at a 5% level of significance using TIBCO Spotfire S+ software.

2. Results and Comments

2.1. Question 1

The data points on the QQ plot of the residuals (Figure 1) fall alternately below and above the line, which may indicate some degree of skewness. For verification, the KS test was performed, and it confirmed non-normality with a p-value of 0.0081 (Table 1). However, since the number of observations is greater than 30, the Central Limit Theorem may apply and normality can be assumed. For the homoscedasticity assumption, the Modified Levene test did not reject the equality of the variances, with a p-value of 0.7993 (Table 2). With the assumptions taken as fulfilled, ANOVA was performed to answer Question 1.

Table 3 shows that both the origin of the coffee beans and the age of the participants influence the taste score, since both p-values are much less than 0.05. The interaction term is also significant, with a p-value of 0.0486. The interaction plot (Figure 2) supports this, since the plotted lines are not parallel to each other. This means that the influence of the origin of the coffee beans on the mean taste score depends on the age of the participant, and vice versa.
Because the interaction term is significant, each independent variable must be assessed separately within the levels of the other to find the differences. In Tables 4, 5 and 6, the influence of origin on the mean taste score was assessed within each of the three age classes. For participants younger than 40 years (age class A), there is no significant effect of origin on the mean taste score. For participants aged 40 or older but younger than 60 years (age class B), significant differences were observed only between (1) Bolivia and Columbia and (2) Brazil and Columbia. For participants aged 60 years or older (age class C), significant differences were observed between (1) Bolivia and Brazil, (2) Brazil and Columbia and (3) Brazil and Peru.

In Tables 7, 8, 9 and 10, the influence of age class on the mean taste score was assessed within each of the four origins. For Bolivia, there is a significant difference between age class B and age class C. For Brazil, significant differences were observed between (1) age class A and age class B and (2) age class A and age class C. For Columbia, the only significant difference is between age class A and age class C. For Peru, there is no significant effect of age class on the mean taste score.

From these results, it can be deduced that the researchers should source their coffee beans from Peru: for coffee from Peru, no significant difference in taste score between age classes was detected, so consumers of any age are expected to score it similarly.

2.2. Question 2

The normality of the observations for C. arabica and C. robusta was assessed separately. The QQ plots (Figure 3) and box plots (Figure 4) for both species suggest that the observations for C. robusta are consistent with a normal distribution while those for C. arabica are not. This is supported by the KS tests (Tables 11 and 12), which gave p-values of 0.5 and 0 (as reported by S+) for C. robusta and C. arabica, respectively. The normality assumption was therefore not fulfilled. For the homoscedasticity assumption, the Modified Levene test indicated that the variances of the groups are equal, with a p-value of 0.6697864 (Table 13). The Wilcoxon rank-sum test was therefore performed, and it showed that H0 must be rejected at a 5% level of significance. Based on this sample, the mean caffeine content of C. arabica is greater than the mean caffeine content of C. robusta. However, with a p-value of 0.0338, the conclusion is not strong, so further analyses should be done before suggesting which species of coffee should be marketed.

2.3. Question 3

The QQ plot of the residuals (Figure 5) clearly shows that the residuals do not follow a normal distribution. For verification, the KS test was performed, and it confirmed non-normality with a p-value of 0.0188 (Table 15).
However, since the number of observations is greater than 30, the Central Limit Theorem may apply and normality can be assumed. For the homoscedasticity assumption, the Modified Levene test did not reject the equality of the variances, with a p-value of 0.6697 (Table 16). With the assumptions taken as fulfilled, ANOVA was performed to answer Question 3. From Table 17, it can be strongly inferred that at least one age class has a different mean frequency of coffee consumption per day. At a 5% family-wise level of significance, there is a significant difference between the mean frequency of coffee consumption per day of (1) age class B and age class C and (2) age class A and age class B. We are 95% family-wise confident that the mean frequency of coffee consumption per day in age class B is between 1.38 and 3.53 cups higher than in age class C. The same interpretation applies to age class A and age class B. Moreover, the estimates in Table 17 show that the most interesting target age group is B, followed by C, then A.

3. Executive Summary

The statistical analyses lead to the conclusion that both the origin of the coffee beans and the age of the participants influence the taste score of the coffee. The younger age group (participants younger than 40 years) does not discriminate between origins. However, for the middle age group (participants aged 40 or older but younger than 60 years) and the older age group (participants aged 60 or older), the origin of the coffee beans significantly affects the taste score. Peru is suggested as the preferred source of coffee beans.

The mean caffeine content of the more expensive C. arabica is higher than that of the cheaper C. robusta. Yet, at a 5% level of significance, the evidence is not strong, and further analyses should be done before recommending which species is best for manufacturing the coffee beverage.

The age group of the consumer is important to consider for the marketability of the coffee product. The middle age group (40 to 59 years old) is the most interesting target, since it has the highest frequency of consumption. The older age group (60 years old and above) is the second most interesting target and the younger age group (39 years old and below) the third, having the second-highest and the lowest frequency of coffee consumption, respectively.
4. Appendices

4.1. Appendix 1: S+ outputs for Question 1

4.1.1. Assessment of normality assumption

To assess the normality assumption, the Kolmogorov-Smirnov test was used. In this test, the residuals were considered since the number of observations for each origin is 10, which is a relatively small number. Additionally, the QQ plot of the residuals was obtained and considered.

H0: Residuals follow a normal distribution.
H1: Residuals do not follow a normal distribution.

Table 1. S+ output of Kolmogorov-Smirnov test.

    One sample Kolmogorov-Smirnov Test of Composite Normality

    data: residuals in coffee.data
    ks = 0.1644, p-value = 0.0081
    alternative hypothesis: True cdf is not the normal distn. with estimated parameters
    sample estimates:
     mean of x  standard deviation of x
             0                 2.213208

Figure 1. QQ plot of the residuals (x-axis: Quantiles of Standard Normal; y-axis: Residuals).

4.1.2. Assessment of homoscedasticity assumption

To assess the homoscedasticity assumption, Modified Levene test was used.

H0: Variances of all groups are equal.
H1: Variances of all groups are not equal.

Table 2. S+ output of Modified Levene test.

    *** Modified Levene test ***
               Df  Sum of Sq   Mean Sq   F Value      Pr(F)
        groep  10     22.675  2.267500  0.601624  0.7993811
    Residuals  29    109.300  3.768966
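The report does not spell out how the Modified Levene statistic in Table 2 is obtained. A common definition, assumed here, is a one-way ANOVA on the absolute deviations of each observation from its group median (the Brown-Forsythe variant). The sketch below follows that definition in plain Python; it is not the S+ routine used by the authors, and the function name and the example groups are made up for illustration (they are not the coffee data).

```python
from statistics import mean, median

def modified_levene_f(groups):
    """Median-based Levene statistic: one-way ANOVA F on |x - group median|.

    Returns (F, df_between, df_within). Hypothetical helper, not an S+ function.
    """
    # Absolute deviation of every observation from its own group median.
    devs = [[abs(x - median(g)) for x in g] for g in groups]
    n_total = sum(len(d) for d in devs)
    grand_mean = sum(sum(d) for d in devs) / n_total

    # Between-group and within-group sums of squares of the deviations.
    ss_between = sum(len(d) * (mean(d) - grand_mean) ** 2 for d in devs)
    ss_within = sum((z - mean(d)) ** 2 for d in devs for z in d)

    df_between = len(devs) - 1
    df_within = n_total - len(devs)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within

# Made-up example: the second group is twice as spread out as the first.
example_groups = [[1, 2, 3], [2, 4, 6]]
f_value, df_b, df_w = modified_levene_f(example_groups)
print(f"F = {f_value:.3f} on {df_b} and {df_w} degrees of freedom")
```

The p-value would then come from the F distribution with the returned degrees of freedom, as in the Pr(F) column of Table 2; that step is left out here because it needs a statistics library.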
4.1.3. Assessment of interaction effects

H0: There is no interaction between origin and age for average taste score.
H1: There is an interaction between origin and age for average taste score.

Table 3. S+ output of the two-way ANOVA for the interaction assessment.

    *** Analysis of Variance Model ***

    Short Output:
    Call:
       aov(formula = taste ~ origin * ageclass, data = coffee.data, na.action = na.exclude)

    Terms:
                       origin  ageclass  origin:ageclass  Residuals
     Sum of Squares  384.0750  194.2043          84.4623   191.0333
    Deg. of Freedom         3         2                5         29

    Residual standard error: 2.566585
    1 out of 12 effects not estimable
    Estimated effects may be unbalanced

    Type III Sum of Squares
                     Df  Sum of Sq   Mean Sq   F Value       Pr(F)
             origin   3   309.5743  103.1914  15.66508  0.00000303
           ageclass   2   190.2823   95.1412  14.44300  0.00004441
    origin:ageclass   5    84.4623   16.8925   2.56438  0.04868222
          Residuals  29   191.0333    6.5874

Figure 2. Interaction plot for origin and age class (x-axis: origin, with levels Bolivia, Brazil, Columbia and Peru; y-axis: mean of taste; one line per age class: 21.99+ thru 39, 39.00+ thru 59, 59.00+ thru 70).
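As a check on how the columns of Table 3 relate, each Mean Sq is the Type III sum of squares divided by its degrees of freedom, and each F Value is that mean square divided by the residual mean square. The short Python sketch below recomputes these quantities from the numbers printed in Table 3; the variable names are chosen for illustration, and the p-values are not recomputed.

```python
# Type III sums of squares and degrees of freedom copied from Table 3.
table3 = {
    "origin": (309.5743, 3),
    "ageclass": (190.2823, 2),
    "origin:ageclass": (84.4623, 5),
}
ss_resid, df_resid = 191.0333, 29

# Residual mean square: the denominator of every F ratio.
ms_resid = ss_resid / df_resid

f_values = {}
for term, (ss, df) in table3.items():
    mean_sq = ss / df                    # Mean Sq = Sum of Sq / Df
    f_values[term] = mean_sq / ms_resid  # F = Mean Sq / residual Mean Sq
    print(f"{term:16s} Mean Sq = {mean_sq:9.4f}   F = {f_values[term]:8.4f}")
```

The recomputed values agree with the S+ output up to rounding of the printed sums of squares.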
4.1.4. Effects of origin of the coffee beans on the taste score of participants younger than 40 years

H0: There is no effect of the origin of the coffee beans on the taste score of participants younger than 40 years.
H1: There is an effect of the origin of the coffee beans on the taste score of participants younger than 40 years.

Table 4. S+ output of the one-way ANOVA for effects of origin of the coffee beans on the taste score of participants younger than 40 years.

    *** Analysis of Variance Model ***

    Short Output:
    Call:
       aov(formula = taste ~ origin, data = coffee.data.21.99..thru.39, na.action = na.exclude)

    Terms:
                       origin  Residuals
     Sum of Squares  54.85714   32.00000
    Deg. of Freedom         2          4

    Residual standard error: 2.828427
    Estimated effects may be unbalanced

    Type III Sum of Squares
               Df  Sum of Sq   Mean Sq   F Value      Pr(F)
       origin   2   54.85714  27.42857  3.428571  0.1357341
    Residuals   4   32.00000   8.00000

    95 % simultaneous confidence intervals for specified
    linear combinations, by the Tukey method

    critical point: 3.5638
    response variable: taste

    intervals excluding 0 are flagged by '****'

                      Estimate  Std.Error  Lower Bound  Upper Bound
    Brazil-Columbia  8.00e+000       3.27        -3.64        19.60
        Brazil-Peru  8.00e+000       3.27        -3.64        19.60
      Columbia-Peru  8.44e-015       2.31        -8.23         8.23
4.1.5. Effects of origin of the coffee beans on the taste score of participants aged 40 or older but younger than 60 years

H0: There is no effect of the origin of the coffee beans on the taste score of participants aged 40 or older but younger than 60 years.
H1: There is an effect of the origin of the coffee beans on the taste score of participants aged 40 or older but younger than 60 years.

Table 5. S+ output of the one-way ANOVA for effects of origin of the coffee beans on the taste score of participants aged 40 or older but younger than 60 years.

    *** Analysis of Variance Model ***

    Short Output:
    Call:
       aov(formula = taste ~ origin, data = coffee.data.39.00..thru.59, na.action = na.exclude)

    Terms:
                       origin  Residuals
     Sum of Squares  216.3333   120.6167
    Deg. of Freedom         3         16

    Residual standard error: 2.74564
    Estimated effects may be unbalanced

    Type III Sum of Squares
               Df  Sum of Sq   Mean Sq   F Value         Pr(F)
       origin   3   216.3333  72.11111  9.565658  0.0007429687
    Residuals  16   120.6167   7.53854

    Estimated Coefficients:
     (Intercept)  originBrazil  originColumbia  originPeru
            15.5          -0.9       -9.166667       -4.75

    95 % simultaneous confidence intervals for specified
    linear combinations, by the Tukey method

    critical point: 2.861
    response variable: taste

    intervals excluding 0 are flagged by '****'

                      Estimate  Std.Error  Lower Bound  Upper Bound
      Bolivia-Brazil      0.90       1.57      -3.5800         5.38
    Bolivia-Columbia      9.17       1.86       3.8500        14.50  ****
        Bolivia-Peru      4.75       1.68      -0.0603         9.56
     Brazil-Columbia      8.27       2.01       2.5300        14.00  ****
         Brazil-Peru      3.85       1.84      -1.4200         9.12
       Columbia-Peru     -4.42       2.10     -10.4000         1.58
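The bounds in the Tukey output appear to follow the usual form estimate ± critical point × standard error, with a contrast flagged when its interval excludes 0. The Python sketch below applies that form to three rows of Table 5, using the rounded estimates and standard errors as printed; the function name is an assumption made for illustration. Because S+ works from unrounded values, the recomputed bounds can differ from the table in the second decimal place.

```python
def simultaneous_interval(estimate, std_error, critical_point):
    """Return (lower, upper) = estimate -/+ critical_point * std_error."""
    half_width = critical_point * std_error
    return estimate - half_width, estimate + half_width

critical_point = 2.861  # Tukey critical point printed in Table 5

# (Estimate, Std.Error) pairs copied from Table 5, as rounded in the output.
contrasts = {
    "Bolivia-Columbia": (9.17, 1.86),
    "Bolivia-Peru": (4.75, 1.68),
    "Brazil-Columbia": (8.27, 2.01),
}

intervals = {}
for name, (estimate, std_error) in contrasts.items():
    lower, upper = simultaneous_interval(estimate, std_error, critical_point)
    intervals[name] = (lower, upper)
    # An interval that excludes 0 marks a significant pairwise difference.
    flag = "****" if lower > 0 or upper < 0 else ""
    print(f"{name:17s} [{lower:7.3f}, {upper:7.3f}] {flag}")
```

The same arithmetic applies to the Fisher LSD interval in Table 7, with the critical point 2.306 printed there.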
4.1.6. Effects of origin of the coffee beans on the taste score of participants aged 60 or older

H0: There is no effect of the origin of the coffee beans on the taste score of participants aged 60 or older.
H1: There is an effect of the origin of the coffee beans on the taste score of participants aged 60 or older.

Table 6. S+ output of the one-way ANOVA for effects of origin of the coffee beans on the taste score of participants aged 60 or older.

    *** Analysis of Variance Model ***

    Short Output:
    Call:
       aov(formula = taste ~ origin, data = coffee.data.59.00..thru.70, na.action = na.exclude)

    Terms:
                       origin  Residuals
     Sum of Squares  222.6603    38.4167
    Deg. of Freedom         3          9

    Residual standard error: 2.066039
    Estimated effects may be unbalanced

    Type III Sum of Squares
               Df  Sum of Sq   Mean Sq   F Value         Pr(F)
       origin   3   222.6603  74.22009  17.38779  0.0004362847
    Residuals   9    38.4167   4.26852

    Estimated Coefficients:
     (Intercept)  originBrazil  originColumbia  originPeru
               7           7.5           -2.75   0.6666667

    95 % simultaneous confidence intervals for specified
    linear combinations, by the Tukey method

    critical point: 3.1219
    response variable: taste

    intervals excluding 0 are flagged by '****'

                      Estimate  Std.Error  Lower Bound  Upper Bound
      Bolivia-Brazil    -7.500       1.79       -13.10        -1.91  ****
    Bolivia-Columbia     2.750       1.79        -2.84         8.34
        Bolivia-Peru    -0.667       1.89        -6.55         5.22
     Brazil-Columbia    10.300       1.46         5.69        14.80  ****
         Brazil-Peru     6.830       1.58         1.91        11.80  ****
       Columbia-Peru    -3.420       1.58        -8.34         1.51
4.1.7. Effects of the age class of the participants on the taste score of coffee from Bolivia

H0: There is no effect of the age class of the participants on the taste score of coffee from Bolivia.
H1: There is an effect of the age class of the participants on the taste score of coffee from Bolivia.

Table 7. S+ output of the one-way ANOVA for effects of the age class of the participants on the taste score of coffee from Bolivia.

    *** Analysis of Variance Model ***

    Short Output:
    Call:
       aov(formula = taste ~ ageclass, data = coffee.data.Bolivia, na.action = na.exclude)

    Terms:
                     ageclass  Residuals
     Sum of Squares     115.6       74.0
    Deg. of Freedom         1          8

    Residual standard error: 3.041381
    Estimated effects may be unbalanced

    Type III Sum of Squares
               Df  Sum of Sq  Mean Sq  F Value        Pr(F)
     ageclass   1      115.6   115.60  12.4973  0.007674012
    Residuals   8       74.0     9.25

    95 % non-simultaneous confidence intervals for specified
    linear combinations, by the Fisher LSD method

    critical point: 2.306
    response variable: taste

    intervals excluding 0 are flagged by '****'

                                   Estimate  Std.Error  Lower Bound  Upper Bound
    39.00+ thru 59-59.00+ thru 70       8.5        2.4         2.96           14  ****
4.1.8. Effects of the age class of the participants on the taste score of coffee from Brazil

H0: There is no effect of the age class of the participants on the taste score of coffee from Brazil.
H1: There is an effect of the age class of the participants on the taste score of coffee from Brazil.

Table 8. S+ output of the one-way ANOVA for effects of the age class of the participants on the taste score of coffee from Brazil.

    *** Analysis of Variance Model ***

    Short Output:
    Call:
       aov(formula = taste ~ ageclass, data = coffee.data.Brazil, na.action = na.exclude)

    Terms:
                     ageclass  Residuals
     Sum of Squares      26.7       12.2
    Deg. of Freedom         2          7

    Residual standard error: 1.320173
    Estimated effects may be unbalanced

    Type III Sum of Squares
               Df  Sum of Sq   Mean Sq   F Value       Pr(F)
     ageclass   2       26.7  13.35000  7.659836  0.01727571
    Residuals   7       12.2   1.74286

    95 % simultaneous confidence intervals for specified
    linear combinations, by the Tukey method

    critical point: 2.9451
    response variable: taste

    intervals excluding 0 are flagged by '****'

                                   Estimate  Std.Error  Lower Bound  Upper Bound
    21.99+ thru 39-39.00+ thru 59       5.4      1.450         1.14         9.66  ****
    21.99+ thru 39-59.00+ thru 70       5.5      1.480         1.15         9.85  ****
    39.00+ thru 59-59.00+ thru 70       0.1      0.886        -2.51         2.71
4.1.9. Effects of the age class of the participants on the taste score of coffee from Columbia

H0: There is no effect of the age class of the participants on the taste score of coffee from Columbia.
H1: There is an effect of the age class of the participants on the taste score of coffee from Columbia.

Table 9. S+ output of the one-way ANOVA for effects of the age class of the participants on the taste score of coffee from Columbia.

    *** Analysis of Variance Model ***

    Short Output:
    Call:
       aov(formula = taste ~ ageclass, data = coffee.data.Columbia, na.action = na.exclude)

    Terms:
                     ageclass  Residuals
     Sum of Squares  106.1833    69.4167
    Deg. of Freedom         2          7

    Residual standard error: 3.149074
    Estimated effects may be unbalanced

    Type III Sum of Squares
               Df  Sum of Sq   Mean Sq   F Value       Pr(F)
     ageclass   2   106.1833  53.09167  5.353782  0.03884073
    Residuals   7    69.4167   9.91667

    95 % simultaneous confidence intervals for specified
    linear combinations, by the Tukey method

    critical point: 2.9451
    response variable: taste

    intervals excluding 0 are flagged by '****'

                                   Estimate  Std.Error  Lower Bound  Upper Bound
    21.99+ thru 39-39.00+ thru 59      5.67       2.57       -1.910        13.20
    21.99+ thru 39-59.00+ thru 70      7.75       2.41        0.667        14.80  ****
    39.00+ thru 59-59.00+ thru 70      2.08       2.41       -5.000         9.17
  • 12. 4.1.10. Effects of the age class of the participants on the taste score of coffee from Peru
    H0: There is no effect of the age class of the participants on the taste score of coffee from Peru.
    H1: There is an effect of the age class of the participants on the taste score of coffee from Peru.
    Table 10. S+ output of the one-way ANOVA for effects of the age class of the participants on the taste score of coffee from Peru.
    *** Analysis of Variance Model ***
    Short Output:
    Call: aov(formula = taste ~ ageclass, data = coffee.data.Peru, na.action = na.exclude)
    Terms:
                     ageclass  Residuals
    Sum of Squares   30.18333   35.41667
    Deg. of Freedom         2          7
    Residual standard error: 2.249339
    Estimated effects may be unbalanced
    Type III Sum of Squares
               Df  Sum of Sq   Mean Sq   F Value      Pr(F)
    ageclass    2   30.18333  15.09167  2.982824  0.1156281
    Residuals   7   35.41667   5.05952
    95 % simultaneous confidence intervals for specified linear combinations, by the Tukey method
    critical point: 2.9451
    response variable: taste
    intervals excluding 0 are flagged by '****'
                                   Estimate  Std.Error  Lower Bound  Upper Bound
    21.99+ thru 39-39.00+ thru 59      1.25       1.72        -3.81         6.31
    21.99+ thru 39-59.00+ thru 70      4.33       1.84        -1.08         9.74
    39.00+ thru 59-59.00+ thru 70      3.08       1.72        -1.98         8.14
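Each Tukey interval in Tables 8 to 10 is the estimate plus or minus the critical point times the standard error, and a comparison is flagged with '****' when the interval excludes 0. The sketch below redoes that arithmetic for the Peru comparisons in Table 10. The function name `tukey_interval` and the dictionary layout are assumptions made for illustration, and small differences from the printed bounds are expected because the reported standard errors are rounded.

```python
def tukey_interval(estimate, std_error, critical_point):
    """Return (lower, upper, excludes_zero) for one pairwise comparison.

    Hypothetical helper: applies estimate +/- critical_point * std_error.
    """
    half_width = critical_point * std_error
    lower = estimate - half_width
    upper = estimate + half_width
    excludes_zero = lower > 0 or upper < 0
    return lower, upper, excludes_zero


CRITICAL_POINT = 2.9451  # critical point reported in Tables 8 to 10

# (estimate, std. error) pairs as reported in Table 10 (Peru).
peru = {
    "21.99+ thru 39 - 39.00+ thru 59": (1.25, 1.72),
    "21.99+ thru 39 - 59.00+ thru 70": (4.33, 1.84),
    "39.00+ thru 59 - 59.00+ thru 70": (3.08, 1.72),
}

peru_intervals = {
    label: tukey_interval(est, se, CRITICAL_POINT)
    for label, (est, se) in peru.items()
}

for label, (lower, upper, flagged) in peru_intervals.items():
    marker = "****" if flagged else ""
    print(f"{label}: [{lower:.2f}, {upper:.2f}] {marker}")
```

None of the Peru intervals excludes 0, which agrees with the non-significant F test in Table 10.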
  • 13. 4.2. Appendix 2: S+ outputs for Question 2
    4.2.1. Assessment of normality assumption
    There are two samples with 20 observations each. The dataset was first split by species (i.e. C. arabica and C. robusta), and the normality of each sample was then assessed separately.
    H0: The observations for C. arabica were obtained from a normal distribution.
    H1: The observations for C. arabica were not obtained from a normal distribution.
    Table 11. S+ output of Kolmogorov-Smirnov test for C. arabica.
    One sample Kolmogorov-Smirnov Test of Composite Normality
    data: caffein in coffee.data.arabica
    ks = 0.305, p-value = 0
    alternative hypothesis: True cdf is not the normal distn. with estimated parameters
    sample estimates:
    mean of x  standard deviation of x
     284.7324                 133.3316
    H0: The observations for C. robusta were obtained from a normal distribution.
    H1: The observations for C. robusta were not obtained from a normal distribution.
    Table 12. S+ output of Kolmogorov-Smirnov test for C. robusta.
    One sample Kolmogorov-Smirnov Test of Composite Normality
    data: caffein in coffee.data.robusta
    ks = 0.125, p-value = 0.5
    alternative hypothesis: True cdf is not the normal distn. with estimated parameters
    sample estimates:
    mean of x  standard deviation of x
     207.8488                 133.0805
  • 14. Figure 3. QQ plots of the caffeine content of (a) C. arabica and (b) C. robusta (vertical axis: caffein; horizontal axis: Normal Distribution).
    Figure 4. Box plots of the caffeine content of C. arabica and C. robusta (vertical axis: caffein; horizontal axis: species).
  • 15. 4.2.2. Assessment of homoscedasticity assumption
    H0: Variances of all groups are equal.
    H1: Variances of all groups are not equal.
    Table 13. S+ output of Modified Levene test.
    *** Modified Levene test ***
               Df  Sum of Sq   Mean Sq    F Value      Pr(F)
    groep       1     1756.0  1756.021  0.1847069  0.6697864
    Residuals  38   361268.6  9507.069
    4.2.3. Assessment of the caffeine contents of C. arabica and C. robusta
    H0: The mean caffeine content of C. arabica is equal to the mean caffeine content of C. robusta.
    H1: The mean caffeine content of C. arabica is greater than the mean caffeine content of C. robusta.
    Table 14. S+ output of Exact Wilcoxon rank-sum test.
    Exact Wilcoxon rank-sum test
    data: x: caffein with species = arabica, and y: caffein with species = robusta
    rank-sum statistic W = 478, n = 20, m = 20, p-value = 0.0338
    alternative hypothesis: mu is greater than 0
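As a rough plausibility check on Table 14, the exact one-sided p-value can be compared with the large-sample normal approximation of the rank-sum statistic. Under H0 and without ties, W has mean n(n + m + 1)/2 and variance nm(n + m + 1)/12. The sketch below is not the exact test that S+ ran: it assumes no ties, applies a continuity correction of 0.5, and uses the hypothetical function name `rank_sum_normal_approx`.

```python
import math


def rank_sum_normal_approx(w, n, m):
    """Approximate one-sided (greater) p-value for a rank-sum statistic.

    Hypothetical helper. w is the sum of ranks of the sample of size n;
    ties are assumed absent, so no tie correction is applied.
    """
    mean_w = n * (n + m + 1) / 2
    sd_w = math.sqrt(n * m * (n + m + 1) / 12)
    z = (w - 0.5 - mean_w) / sd_w                    # continuity-corrected
    p_greater = 0.5 * math.erfc(z / math.sqrt(2))    # upper normal tail
    return mean_w, sd_w, z, p_greater


# W, n and m as reported in Table 14.
mean_w, sd_w, z, p_greater = rank_sum_normal_approx(w=478, n=20, m=20)
print(f"E[W] = {mean_w:.1f}, SD[W] = {sd_w:.2f}, z = {z:.3f}, p = {p_greater:.4f}")
```

The approximate p-value lands close to the exact value of 0.0338, which suggests that the reported statistic and sample sizes are mutually consistent.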
  • 16. 4.3. Appendix 3: S+ outputs for Question 3
    4.3.1. Assessment of normality assumption
    H0: Residuals follow a normal distribution.
    H1: Residuals do not follow a normal distribution.
    Table 15. S+ output of Kolmogorov-Smirnov test.
    One sample Kolmogorov-Smirnov Test of Composite Normality
    data: residuals in coffee.data
    ks = 0.1533, p-value = 0.0188
    alternative hypothesis: True cdf is not the normal distn. with estimated parameters
    sample estimates:
        mean of x  standard deviation of x
    5.551115e-018                 1.206281
    Figure 5. QQ plot of the residuals (vertical axis: Residuals; horizontal axis: Quantiles of Standard Normal).
    4.3.2. Assessment of homoscedasticity assumption
    H0: Variances of all groups are equal.
    H1: Variances of all groups are not equal.
    Table 16. S+ output of Modified Levene test.
    *** Modified Levene test ***
               Df  Sum of Sq   Mean Sq    F Value      Pr(F)
    groep       1     1756.0  1756.021  0.1847069  0.6697864
    Residuals  38   361268.6  9507.069
  • 17. 4.3.3. Assessment of the differences in coffee consumption among the three age classes
    H0: All three age classes result in the same average frequency of coffee consumption per day.
    H1: At least one age class results in a different average frequency of coffee consumption per day.
    Table 17. S+ output of the one-way ANOVA to determine whether coffee consumption differs among the three age classes.
    *** Analysis of Variance Model ***
    Short Output:
    Call: aov(formula = frequency ~ ageclass, data = coffee.data, na.action = na.exclude)
    Terms:
                     ageclass  Residuals
    Sum of Squares   82.85055   56.74945
    Deg. of Freedom         2         37
    Residual standard error: 1.238454
    Estimated effects may be unbalanced
    Type III Sum of Squares
               Df  Sum of Sq   Mean Sq   F Value          Pr(F)
    ageclass    2   82.85055  41.42527  27.00881  5.860174e-008
    Residuals  37   56.74945   1.53377
    Table 18. S+ output of the Tukey method.
    95 % simultaneous confidence intervals for specified linear combinations, by the Tukey method
    critical point: 2.4415
    response variable: frequency
    intervals excluding 0 are flagged by '****'
                                   Estimate  Std.Error  Lower Bound  Upper Bound
    21.99+ thru 39-39.00+ thru 59    -3.440      0.544        -4.77       -2.110  ****
    21.99+ thru 39-59.00+ thru 70    -0.989      0.581        -2.41        0.429
    39.00+ thru 59-59.00+ thru 70     2.450      0.441         1.38        3.530  ****
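Because the age class factor has three levels, every F test on it has 2 numerator degrees of freedom. In that special case the upper tail of the F distribution has a closed form: Pr(F > f) = (1 + 2f/v)^(-v/2), where v is the residual degrees of freedom. The sketch below uses that identity to recompute the Pr(F) values from the reported F statistics. The function name `f_upper_tail_df1_2` is hypothetical, and the formula is valid only for 2 numerator degrees of freedom.

```python
def f_upper_tail_df1_2(f_value, df_resid):
    """Upper-tail probability of an F(2, df_resid) distribution.

    Hypothetical helper; the closed form below holds only when the
    numerator degrees of freedom equal 2.
    """
    return (1.0 + 2.0 * f_value / df_resid) ** (-df_resid / 2.0)


# F values and residual degrees of freedom as reported in the tables.
p_table17 = f_upper_tail_df1_2(27.00881, 37)  # coffee consumption, Table 17
p_table8 = f_upper_tail_df1_2(7.659836, 7)    # taste score, Brazil, Table 8

print(f"Table 17: Pr(F) = {p_table17:.6e}")
print(f"Table 8:  Pr(F) = {p_table8:.6f}")
```

Both values reproduce the Pr(F) entries in the S+ output to within the rounding of the printed F statistics.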