Final session in a series of four seminars presented to University of North Texas librarians. This presentation brings together some best practices for gathering, organizing, analyzing, and presenting statistics and data.
2. Know what you know and what you don’t know
Have a comparison group
Use validated measures
Have a Data Entry Plan
Get to know your data
If it doesn’t fit, change it
Place your bets before you collect the data
Use the best methods of analysis for your question & your data
Go beyond the p-value
BEST PRACTICES
3. What is Statistics?
•Study of Data
•Collecting
•Organizing
•Summarizing
•Analyzing
•Presenting
•Storing & Sharing
Why is it Important?
•Make sense of the data
•Explain what happens and (possibly) why
•Make sound decisions
•Know how close we are to the truth
6. How do users differ when (searching, finding, selecting) (articles, books, Web sites)?
What are the effects of ___________ on ____________?
Which is better at improving _________?
How are people (finding, selecting, using) _______?
What are factors associated with ___________?
STARTING WITH YOUR RESEARCH QUESTION
8. Nominal
•Counts by category
•No meaning between the categories (Blue is not better than Red)
Ordinal
•Ranks
•Scales
•Space between ranks is subjective
Interval
•Integers
•No baseline
•Space between values is equal and objective, but discrete
Ratio
•Interval data with a baseline
•Space between is continuous
LEVELS OF MEASUREMENT (NOIR)
12. WAYS OF COMPARING…
Time Periods
Other Libraries
National Surveys
Patron Types
Material Types
13. Expected ranks or ratios
•Qualitative
•Comparison
Two variables
•Quantitative
•Correlations
Samples or Groups
•Quantitative or Qualitative
•Paired or Not Paired
KINDS OF COMPARISON
16. USE A TOOL WITH ESTABLISHED VALIDITY
Approaches and Study Skills Inventory for Students (ASSIST)
User Engagement Scale (UES)
17. ESTABLISH VALIDITY OF MEASURES
Reliability
•Consistency
Content or Face Validity
•Common sense
Construct Validity
•Based on theory
Criterion Validity
•Comparison with other valid measures
24. Mean
• Average
• For Quantitative data
• Excel function: =Average(range)
Median
• Middle
• For Quantitative or Rank data
• Excel function: =Median(range)
Mode
• Most common
• Primarily for Qualitative data
• Excel function: =Mode(range)
MEASURES OF CENTRAL TENDENCY
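The three measures can also be computed outside Excel. The following is a minimal sketch using Python's standard `statistics` module. The data values are hypothetical and exist only for illustration.

```python
import statistics

# Hypothetical data: reference-desk questions per day over nine days.
# One unusually busy day (48) pulls the mean upward.
questions_per_day = [12, 15, 11, 15, 48, 14, 13, 15, 16]

mean_value = statistics.mean(questions_per_day)      # comparable to =Average(range)
median_value = statistics.median(questions_per_day)  # comparable to =Median(range)
mode_value = statistics.mode(questions_per_day)      # comparable to =Mode(range)

# The mode is the only one of the three that applies to qualitative data.
patron_types = ["undergrad", "faculty", "undergrad", "grad", "undergrad"]
most_common_patron = statistics.mode(patron_types)

print(f"mean={mean_value:.2f} median={median_value} mode={mode_value}")
print(f"most common patron type: {most_common_patron}")
```

In this made-up example the mean sits above the median because of the single busy day, which is one reason to report the median alongside the mean for skewed data.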
26. DISTRIBUTION OR SPREAD OF QUALITATIVE DATA
Tables
•Counts
•Percentages/Ratios
•Averages of Counts
Excel
•Pivot Tables
27. PIVOT TABLES IN EXCEL
Select Data
•Highlight table
•Insert->Pivot Table
Select Variables
•Categories (Row Labels)
•Values
Change Settings
•Percentage of Grand Total
•Average
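For readers working outside Excel, the same kind of summary (counts per category, percentage of the grand total, and an average per category) can be sketched with Python's standard library. The records and field meanings below are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical checkout records: (patron type, material type, loan days).
records = [
    ("Undergraduate", "Book", 14),
    ("Undergraduate", "DVD", 3),
    ("Graduate", "Book", 28),
    ("Faculty", "Book", 60),
    ("Graduate", "Book", 30),
    ("Undergraduate", "Book", 10),
]

# Counts by category (the pivot table's row labels).
counts = Counter(patron for patron, _, _ in records)

# Each category's share of all records (percentage of grand total).
total = sum(counts.values())
percent_of_total = {patron: 100 * n / total for patron, n in counts.items()}

# Average loan length per category.
loan_days = defaultdict(list)
for patron, _, days in records:
    loan_days[patron].append(days)
average_days = {patron: sum(d) / len(d) for patron, d in loan_days.items()}

for patron in counts:
    print(f"{patron:15s} n={counts[patron]} "
          f"{percent_of_total[patron]:.1f}% avg={average_days[patron]:.1f} days")
```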
29. GRAPH & CHART RULES OF THUMB
Trends
Connection across the X-axis
Categorical Comparisons
Grouped
Stacked
Relative Stacked
Categorical
Few Categories
Differences are Wide
31. John W. Tukey
Exploratory Data Analysis
Examining your data visually.
Stem & Leaf
Hinges
Box plots
Scatter plots, etc.
EXPLORATORY DATA ANALYSIS
45. DEMONSTRATION OF DISTRIBUTIONS
Distribution of the Population: The “Truth”
N is the number of samples
n is the number of items in each sample
Watch the cumulative means & medians slowly converge on the population values
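The demonstration can be approximated with a short simulation. This is a sketch under stated assumptions: the population is a hypothetical skewed (exponential) distribution, N and n follow the definitions above, and only the cumulative mean is tracked, not the median.

```python
import random
import statistics

random.seed(42)  # fixed seed so the run is repeatable

# Hypothetical skewed population with a true mean near 10.
population = [random.expovariate(1 / 10) for _ in range(100_000)]
population_mean = statistics.fmean(population)

N = 2000  # number of samples drawn
n = 25    # number of items in each sample

running_total = 0.0
cumulative_means = []
for i in range(1, N + 1):
    sample = random.sample(population, n)
    running_total += statistics.fmean(sample)
    # Mean of all sample means seen so far.
    cumulative_means.append(running_total / i)

print(f"population mean:          {population_mean:.3f}")
print(f"after 10 samples:         {cumulative_means[9]:.3f}")
print(f"after {N} samples:       {cumulative_means[-1]:.3f}")
```

The early cumulative means wander, while the later ones settle close to the population mean even though the population itself is not normally distributed.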
49. HOW TO BECOME NORMAL
Evaluate the distribution of raw data
Select a transformation method
Transform the data
Normally Distributed?
Statistically Test Transformed Data
Express the result in the terms of the transformation
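The steps above can be sketched in Python. The assumptions are loud ones: the data are hypothetical, the natural log is just one possible transformation, and moment-based skewness is used as a rough symmetry check, not as a formal normality test.

```python
import math
import statistics

# Hypothetical right-skewed data: minutes spent per database session.
session_minutes = [2, 3, 3, 4, 5, 6, 8, 12, 20, 45, 90]

def skewness(values):
    """Moment-based skewness; values near 0 suggest a symmetric distribution."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)

# Steps 1-3: evaluate the raw data, pick a transformation, apply it.
log_minutes = [math.log(v) for v in session_minutes]

# Step 4: check whether the transformed data look more symmetric.
raw_skew = skewness(session_minutes)
log_skew = skewness(log_minutes)

# Final step: analyze on the log scale, then report the result while
# stating the transformation (the back-transformed mean of logs is the
# geometric mean, not the arithmetic mean).
geometric_mean = math.exp(statistics.fmean(log_minutes))

print(f"skewness raw={raw_skew:.2f} log={log_skew:.2f}")
print(f"geometric mean = {geometric_mean:.1f} minutes")
```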
53. EXAMPLE HYPOTHESIS
UNT Libraries provides access to…
>=75%* or <75%*
*…of journal articles cited by UNT PACS faculty in journal articles published between 2008-2011.
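One possible way to test a hypothesis like this is an exact one-sided binomial test against the 75% threshold. The sketch below is an assumption on several counts: the choice of test, the treatment of 75% as the null value, and the counts themselves, which are hypothetical and not results from the study.

```python
from math import comb

def binomial_lower_tail(successes, trials, p):
    """P(X <= successes) when X follows Binomial(trials, p)."""
    return sum(
        comb(trials, k) * p**k * (1 - p) ** (trials - k)
        for k in range(successes + 1)
    )

# Hypothetical counts: cited articles sampled, and how many were accessible.
cited_articles = 200
accessible = 140

# Probability of seeing this few accessible articles (or fewer)
# if the true access rate were exactly 75%.
p_value = binomial_lower_tail(accessible, cited_articles, 0.75)
print(f"observed rate = {accessible / cited_articles:.0%}, p = {p_value:.3f}")
```

A small p-value would count as evidence against the "at least 75%" claim. The threshold for "small" should be fixed before the data are collected.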
58. Variable Type
What is being compared
Independence of units
Underlying variance in the population
Distribution
Sample size
Number of comparison groups
FACTORS ASSOCIATED WITH CHOICE OF STATISTICAL METHOD
63. Correlations
•Cohen’s guidelines for Pearson’s r
Differences from the mean
•Standardized
•Weighted against the standard deviation
•Cohen’s d
$$d = \frac{\bar{x}_1 - \bar{x}_2}{s}$$
EFFECT SIZES OF QUANTITATIVE DATA
Effect Size   r >
Small         .10
Medium        .30
Large         .50
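Cohen's d can be computed directly from two groups of scores. In this sketch the pooled standard deviation is used for s, which is a common choice but an assumption here, since the slide does not say which standard deviation is meant. The scores are hypothetical.

```python
import math
import statistics

def cohens_d(group_1, group_2):
    """Cohen's d: difference in means divided by the pooled standard deviation."""
    n1, n2 = len(group_1), len(group_2)
    var1 = statistics.variance(group_1)  # sample variance (n - 1 denominator)
    var2 = statistics.variance(group_2)
    pooled_sd = math.sqrt(((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2))
    return (statistics.fmean(group_1) - statistics.fmean(group_2)) / pooled_sd

# Hypothetical quiz scores for students with and without training.
trained = [85, 90, 78, 92, 88]
untrained = [80, 85, 73, 87, 83]

d = cohens_d(trained, untrained)
print(f"Cohen's d = {d:.2f}")
```

Because d is expressed in standard deviation units, it can be compared across studies that used different scales.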
64. Based on Contingency table
Odds ratio
• Odds of event A divided by odds of event B
• Case-control studies
Relative risk
• Uses probabilities rather than odds
• Experiments, RCTs
EFFECT SIZES OF QUALITATIVE DATA
Test A/B Yes No Total
Yes 10 15 25
No 50 25 75
Totals 60 40 100
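Both effect sizes can be worked out from the counts in the table above. The sketch assumes that the rows are the two groups being compared and the columns are the outcome (Yes / No). The slide does not label them, so this reading is an assumption.

```python
# Counts from the contingency table (rows assumed to be groups,
# columns assumed to be the outcome).
a, b = 10, 15  # first row:  outcome Yes, outcome No
c, d = 50, 25  # second row: outcome Yes, outcome No

# Odds ratio: odds of "Yes" in row 1 divided by odds of "Yes" in row 2.
odds_row_1 = a / b
odds_row_2 = c / d
odds_ratio = odds_row_1 / odds_row_2

# Relative risk: probability of "Yes" in row 1 divided by that in row 2.
risk_row_1 = a / (a + b)
risk_row_2 = c / (c + d)
relative_risk = risk_row_1 / risk_row_2

print(f"odds ratio    = {odds_ratio:.2f}")
print(f"relative risk = {relative_risk:.2f}")
```

The two measures differ because odds compare Yes to No within a group, while risk compares Yes to the group total.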
65. Point estimates
•Single value
•Mean
Intervals
•Degree of uncertainty
•Range of certainty around the point estimate
Based on
•Point estimate (e.g. mean)
•Confidence level (usually .95)
•Standard deviation
Expressed as:
•The mean score of the students who had the IL training was 83.5, with a 95% CI of 78.3 to 89.4.
CONFIDENCE INTERVALS
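A confidence interval for a mean can be sketched from the three ingredients listed above. This example uses the normal approximation (a z value from `statistics.NormalDist`). That is an assumption made for simplicity, and small samples are usually better served by a t distribution. The scores are hypothetical.

```python
import math
import statistics

def mean_confidence_interval(values, confidence=0.95):
    """Return (mean, lower, upper) using a normal-approximation interval."""
    mean = statistics.fmean(values)
    # Standard error: sample standard deviation scaled by sqrt(sample size).
    standard_error = statistics.stdev(values) / math.sqrt(len(values))
    # Two-sided critical value, about 1.96 for a .95 confidence level.
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    return mean, mean - z * standard_error, mean + z * standard_error

# Hypothetical test scores.
scores = [72, 85, 90, 78, 88, 95, 81, 79, 86, 84]

mean, lower, upper = mean_confidence_interval(scores)
print(f"Mean {mean:.1f}, 95% CI {lower:.1f} to {upper:.1f}")
```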
67. Know what you know and what you don’t know
Have a comparison group
Use validated measures
Have a Data Entry Plan
Get to know your data
If it doesn’t fit, change it
Place your bets before you collect the data
Use the best methods of analysis for your question & your data
Go beyond the p-value
BEST PRACTICES
68. RESOURCES
Rice Virtual Lab in Statistics
Excel Tutorials for Statistical Analysis
Khan Academy - videos
Basic Research Methods for Librarians
Descriptive Statistical Techniques for Librarians