1. 2013/05/23
STATISTICS
2. X-Kit Textbook
Chapter 13
3. Precalculus Textbook
Appendix B: Concepts in Statistics
Par B.3
Information 23 May 2013
• Find information on Edulink on:
• Calculation of Semester Marks.
• Consultation Time during the June Exam.
• Exam Info.
• Report any problems with marks to Ms Durandt
as soon as possible, CR523.
• Semester Tests available at the Collection Facility
at the Mathematics Department.
• Find all memos on Edulink.
• Second Semester?
CONTENT
Dependent & Independent Variables
Scatter Diagrams
Correlation
Regression
TABLE: MARKS ACHIEVED VERSUS TIME STUDIED

STUDENT   TIME STUDIED (X) IN HOURS   MARKS ACHIEVED (Y) IN %
A         2                           60
B         5                           85
C         1                           30
D         4                           70
E         2                           40
SCATTER DIAGRAM
[Figure: Scatter diagram of marks achieved versus time studied. x-axis: Time Studied (Hours); y-axis: Marks Achieved (%). Plotted points: (2, 60), (5, 85), (1, 30), (4, 70), (2, 40).]
INTERPRETING A SCATTER DIAGRAM
[Figure: Scatter diagram with a linear trendline through the points. x-axis: Time Studied (Hours); y-axis: Marks Achieved (%).]
IS THE RELATIONSHIP BETWEEN VARIABLES STRONG OR WEAK?
[Figure: two scatter plots, titled POSITIVE CORRELATION and NEGATIVE CORRELATION.]
IS THE RELATIONSHIP BETWEEN VARIABLES STRONG OR WEAK?
[Figure: two scatter plots, titled NO CORRELATION and NON-LINEAR CORRELATION.]
SUMMING UP CORRELATION IN A NUMBER
• The correlation coefficient 𝑟 is a number that
tells us exactly how strong or weak the
correlation between two variables is.
• Calculate 𝑟 by using the formula:
\[
r = \frac{\sum xy - n\bar{x}\,\bar{y}}{\sqrt{\left(\sum x^{2} - n\bar{x}^{2}\right)\left(\sum y^{2} - n\bar{y}^{2}\right)}}
\]
• Calculate 𝑟 by using your calculator.
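For anyone who wants to check a calculator result, the formula for r can be sketched in a few lines of Python. This is only an illustration: the function name `correlation_coefficient` and the variable names are assumptions, and the data are the five (hours, marks) pairs from the table in these slides.

```python
from math import sqrt

def correlation_coefficient(xs, ys):
    """Correlation coefficient r from the sums-of-products formula."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = sum_xy - n * mean_x * mean_y
    denominator = sqrt((sum_x2 - n * mean_x ** 2) * (sum_y2 - n * mean_y ** 2))
    return numerator / denominator

# Data from the table: time studied (hours) and marks achieved (%)
hours = [2, 5, 1, 4, 2]
marks = [60, 85, 30, 70, 40]

r = correlation_coefficient(hours, marks)
print(round(r, 3))  # 0.937
```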
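THE MEANING OF r

r        Interpretation
-1.00    Perfect negative correlation
-0.75    Strong negative correlation
-0.50    Moderate negative correlation
-0.25    Weak negative correlation
 0.00    No linear correlation
+0.25    Weak positive correlation
+0.50    Moderate positive correlation
+0.75    Strong positive correlation
+1.00    Perfect positive correlation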
CALCULATE THE CORRELATION COEFFICIENT: MARKS ACHIEVED VERSUS TIME STUDIED

STUDENT   TIME STUDIED (X) IN HOURS   MARKS ACHIEVED (Y) IN %
A         2                           60
B         5                           85
C         1                           30
D         4                           70
E         2                           40
SOLUTION
r = 0.937
Very Strong Positive Linear Relationship
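The value can be checked by hand with the formula for r. A worked sketch, using the five data pairs from the table (n = 5):

```latex
\begin{align*}
\sum x &= 14, \quad \bar{x} = 2.8, \qquad \sum y = 285, \quad \bar{y} = 57 \\
\sum xy &= 935, \qquad \sum x^{2} = 50, \qquad \sum y^{2} = 18225 \\
r &= \frac{935 - 5(2.8)(57)}{\sqrt{\left(50 - 5(2.8)^{2}\right)\left(18225 - 5(57)^{2}\right)}}
   = \frac{137}{\sqrt{(10.8)(1980)}} \approx 0.937
\end{align*}
```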
MAKE PREDICTIONS
Follow the plan:
• Find the LINE OF BEST FIT.
• Decide how “well” it fits.
• From a good fit we can make predictions.
LINE OF BEST FIT (REGRESSION LINE)
• Formula: \( y = a + bx \)
• Formula for the regression coefficients:
\[
b = \frac{\sum xy - n\bar{x}\,\bar{y}}{\sum x^{2} - n\bar{x}^{2}}, \qquad a = \bar{y} - b\bar{x}
\]
• Use your calculator to calculate the regression
coefficients.
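As a cross-check on the calculator, the two regression coefficients can also be computed directly from the formulas. The sketch below is illustrative only: the function name `regression_coefficients` is an assumption, and the data are the (hours, marks) pairs used throughout these slides.

```python
def regression_coefficients(xs, ys):
    """Return (a, b) for the line of best fit y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    b = (sum_xy - n * mean_x * mean_y) / (sum_x2 - n * mean_x ** 2)  # slope
    a = mean_y - b * mean_x                                          # intercept
    return a, b

# Data from the table: time studied (hours) and marks achieved (%)
hours = [2, 5, 1, 4, 2]
marks = [60, 85, 30, 70, 40]

a, b = regression_coefficients(hours, marks)
print(round(a, 3), round(b, 3))  # 21.481 12.685
```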
CALCULATE THE REGRESSION COEFFICIENT: MARKS ACHIEVED VERSUS TIME STUDIED

STUDENT   TIME STUDIED (X) IN HOURS   MARKS ACHIEVED (Y) IN %
A         2                           60
B         5                           85
C         1                           30
D         4                           70
E         2                           40
LINE OF BEST FIT
y = 21.5 + 12.7x
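The coefficients can be verified by hand from the regression formulas. A worked sketch with the table data (n = 5, \(\bar{x} = 2.8\), \(\bar{y} = 57\), \(\sum xy = 935\), \(\sum x^{2} = 50\)), rounded to one decimal place at the end:

```latex
\begin{align*}
b &= \frac{935 - 5(2.8)(57)}{50 - 5(2.8)^{2}} = \frac{137}{10.8} \approx 12.685 \approx 12.7 \\
a &= \bar{y} - b\bar{x} = 57 - (12.685)(2.8) \approx 21.481 \approx 21.5
\end{align*}
```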
INTERPRETING A SCATTER DIAGRAM
[Figure: Scatter diagram of marks achieved (%) versus time studied (hours) with linear trendline y = 12.685x + 21.481, R² = 0.8777.]
MEASURING HOW WELL THE LINE FITS
• How well does our line fit the real data? How accurate is
our model?
• 𝑟 , the CORRELATION COEFFICIENT tells us how STRONG
the relationship is between two variables, or how closely
the data fits our line.
• 𝑟², the COEFFICIENT OF DETERMINATION, measures the
ACCURACY of our predictions. For a perfect fit 𝑟² = 1;
values closer to zero indicate a poorer fit.
• The coefficient of determination tells us that 87.8% of the
variation in students’ marks is linked to the amount of
time they spend studying. The other 12.2% is due to
other factors, like intelligence levels.
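The 87.8% figure can be reproduced numerically. The slides give only the value, so the sketch below assumes one common definition of the coefficient of determination, 1 − SS_res / SS_tot, which for a least-squares straight line equals r². All variable names are illustrative.

```python
# Data from the table: time studied (hours) and marks achieved (%)
hours = [2, 5, 1, 4, 2]
marks = [60, 85, 30, 70, 40]

n = len(hours)
mean_x = sum(hours) / n
mean_y = sum(marks) / n

# Least-squares slope b and intercept a of the line y = a + b*x
sum_xy = sum(x * y for x, y in zip(hours, marks))
sum_x2 = sum(x * x for x in hours)
b = (sum_xy - n * mean_x * mean_y) / (sum_x2 - n * mean_x ** 2)
a = mean_y - b * mean_x

# Variation left unexplained by the line vs. total variation in the marks
ss_res = sum((y - (a + b * x)) ** 2 for x, y in zip(hours, marks))
ss_tot = sum((y - mean_y) ** 2 for y in marks)

r_squared = 1 - ss_res / ss_tot
print(round(r_squared, 4))  # 0.8777
```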
PREDICTING FROM THE LINE OF “BEST FIT”
If your friend studies for only 2.5 hours,
will he pass the test?
y = 21.5 + 12.7x
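A prediction is made by substituting x = 2.5 into the line of best fit. The sketch below does this in Python; the function name `predict_mark` is illustrative, and the pass mark of 50% is an assumption, since the slides do not state one.

```python
PASS_MARK = 50  # assumption: the pass mark is not given in the slides

def predict_mark(hours_studied, a=21.5, b=12.7):
    """Predicted mark (%) from the line of best fit y = a + b*x."""
    return a + b * hours_studied

predicted = predict_mark(2.5)
print(round(predicted, 2))     # 53.25
print(predicted >= PASS_MARK)  # True under the assumed 50% pass mark
```

Note that 2.5 hours lies inside the range of study times in the data (1 to 5 hours), which is where a fitted line is most trustworthy.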
HOMEWORK
• Example: X-Kit textbook, pages 310 – 311.
• Practise for your exams: page 312, numbers
1, 2, 3, 4, 5 and 6.
• Par B.3 (page B14): all odd-numbered
questions.