Excel tutorial for frequency distribution


Excel tutorial explaining and giving step-wise instructions for preparing a Frequency Distribution Table from Master Chart


  1. Frequency Distribution of Numeric Data: Step-wise Tutorial using Excel
  2. Frequency Distribution Table: displays the number of occurrences (frequencies) of the various outcomes in a sample or a population.

     | Class    | f   | %    | Cumulative f | Cumulative % |
     |----------|-----|------|--------------|--------------|
     | 10 - 20  | 492 | 8.9  | 492          | 8.9          |
     | 20 - 30  | 602 | 10.9 | 1094         | 19.8         |
     | 30 - 40  | 632 | 11.4 | 1726         | 31.2         |
     | 40 - 50  | 670 | 12.1 | 2396         | 43.3         |
     | 50 - 60  | 620 | 11.2 | 3016         | 54.5         |
     | 60 - 70  | 619 | 11.2 | 3635         | 65.7         |
     | 70 - 80  | 631 | 11.4 | 4266         | 77.1         |
     | 80 - 90  | 600 | 10.8 | 4866         | 88           |
     | 90 - 100 | 665 | 12   | 5531         | 100          |

  3. Let us start with a set of data. To illustrate how easy this is with Excel, a fictitious data set of 5531 hypertension patients treated with an antihypertensive drug is presented. The data set resides in Sheet1, which has been renamed “Data Table” to make it easy to remember.
  4. Rename Sheet2 as “FreqDist” to accommodate the Frequency Distribution Table.
  5. Field Titles: Titles should be descriptive, i.e. they should indicate the type of data contained in the field. Units of measurement should be mentioned where needed, e.g. HeightCm, WeightKg. Many workers use an underscore to separate the field name and the units, e.g. Height_cm, Weight_kg. In this presentation, underscores have been dispensed with and the first letter of the units has been capitalized for convenience, e.g. HeightCm instead of Height_cm.
  6. In Excel, data is referred to by the addresses of the cells in which it resides. It is impractical to remember the cell addresses for Age, Height and all the other numeric fields. We can instead give a name to a range of data and use the Range Name, e.g. Age, in place of the cell address (C2:C5532).
  7. Select the entire data and instruct Excel to give the name in the top cell of each column to all the data below it, e.g. C2:C5532 would be named Age, E2:E5532 would be named HeightCm, and so on.
  8. Select all data (Ctrl-A), then Formulas → Defined Names → Create from Selection → Top Row → OK.
  9. For a Frequency Distribution Table, you need to determine: the number of observations (n), the range of the data (DataRange), the number of classes (c) and the class interval (i).
  10. Subsequent Steps: construction of classes; determination of frequencies (f); other parameters (%, cumulative f, cumulative % etc.).
  11. Prepare the area for the Frequency Distribution Table in the FreqDist sheet. You can use field names to refer to the data in a field; cell addresses also work but are cumbersome. Copy the field names from Data Table to the FreqDist sheet so that you do not have to go back to Data Table for the field names or their spellings.
  12. Field names copied to the FreqDist sheet by a method of the user's choice.
  13. Prepare this blank table. It contains the parameters required for the Frequency Distribution Table.
  14. We will now give the name “Field” to B3, “n” to B4 and so on, i.e. the contents of cells A3 to A9 will be used as names for cells B3 to B9. This can be done in one go by giving B3:B9 the names from the left column, as shown in the next slides.
  15. Select A3:B9 (coloured cells), then click Formulas → Defined Names → Create from Selection.
  16. Click Left Column to give the contents of column A as names to the adjoining cells in column B, then click OK.
  17. Time to fill the blank table
  18. Formula (slide image): Put here the name of the field you will use for the Frequency Distribution Table in the next few steps.
  19. Give the name RawData to the field of interest. Select the Age column in the Data Table sheet by clicking C1 (Age) and then pressing Ctrl+Shift+↓ to select all data-containing cells in the column. Go to the Name Box and type RawData to give this alias to the field Age.
  20. Why use a single alias? You could use cell references or the range names of the different fields to create separate Frequency Distribution Tables. Using a single alias for all fields, turn by turn, has the advantage that you only change the column reference of RawData and it starts representing the new field. This saves plenty of time and energy.
  21. Frequency Distribution of Age
  22. Determination of “n”: This is easily done with Excel's COUNT function. Click cell B6 and enter “=count(RawData)” without the quotes. Cell B6 has the name “n”, so you can access this value by that name.
  23. Determination of “n” (contd.): “=” tells Excel that what follows is a formula and not merely text. The COUNT function counts all cells that contain numeric data, including zero, i.e. it gives “n”. It does not count cells that are blank or contain text.
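
The counting rule described for COUNT can be sketched outside Excel. The following Python snippet is only an illustration: the function name `excel_like_count` and the sample cells are hypothetical, and excluding booleans is an assumption about how logical values in a range are treated.

```python
def excel_like_count(cells):
    """Count cells holding numeric data, as the slides describe for COUNT.

    Blank cells (None or "") and text are skipped; zero is counted.
    Booleans are excluded here (assumption for this sketch).
    """
    return sum(
        1
        for c in cells
        if isinstance(c, (int, float)) and not isinstance(c, bool)
    )


# Hypothetical column: four numeric cells, one blank, one empty string, one text
raw_data = [45, 0, None, "n/a", 61.5, "", 38]
n = excel_like_count(raw_data)
print(n)
```

If `n` is smaller than the number of rows you expected, some cells are probably blank or were entered as text.
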
  24. Formula (slide image): “n”.
  25. A correct “n” reassures you that all cells will be used in the subsequent steps. “n” will also be used for calculating percentages.
  26. Determination of Minimum and Maximum Values: For the minimum value, enter “=min(RawData)” in B7. For the maximum value, enter “=max(RawData)” in B8.
  27. Formula (slide image): determines the minimum value.
  28. Formula (slide image): determines the maximum value.
  29. Determination of Data Range: Range = Maximum - Minimum + 1. Range can also be taken as Maximum - Minimum. Whichever you choose, be consistent.
  30. Formula (slide image): Range = Max. - Min. + 1.
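
As a rough cross-check of the minimum, maximum and range steps, the same quantities can be computed in a few lines of Python. The values in `raw_data` are made up for illustration, and the variable names are not taken from the workbook.

```python
# Made-up ages standing in for the RawData range (illustrative only)
raw_data = [12, 47, 35, 99, 64, 28]

n = len(raw_data)        # plays the role of =count(RawData)
min_val = min(raw_data)  # plays the role of =min(RawData)
max_val = max(raw_data)  # plays the role of =max(RawData)

# The two range conventions mentioned in the slides; pick one and stay with it
data_range_plus_one = max_val - min_val + 1
data_range_plain = max_val - min_val

print(n, min_val, max_val, data_range_plus_one, data_range_plain)
```
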
  31. No. of Classes (c): Several formulas are available to calculate c. It is best to follow the conventions in your area of work.
  32. Class Interval (i), General: “i” should be an odd number below 8 (1, 3, 5, 7), or 2 or 10. Larger and smaller intervals can be multiples or factors of these (2.5, 7.5, 15, 25, 50, 75, 100, 125, 200, 250 etc.).
  33. Class Interval (i), Calculation: i = Range/c. Fractions are avoided by modifying the formula to i = roundup(Range/c, 0), which rounds the answer up to the next whole number (0 decimal places). In the given worksheet the user has to enter “i” manually, keeping the principles on this and the previous slide in mind.
  34. Class interval decided and entered by the user.
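
The roundup step can be illustrated with a small Python sketch. It assumes a positive range and class count, in which case rounding up to 0 decimal places is the same as taking the ceiling; the function name `class_interval` and the numbers are hypothetical.

```python
import math


def class_interval(data_range, num_classes):
    """Raw class interval, rounded up to the next whole number.

    For positive inputs this matches roundup(Range/c, 0) as described
    in the slides. The result is only a starting point: the user still
    picks a convenient interval such as 5 or 10.
    """
    return math.ceil(data_range / num_classes)


print(class_interval(88, 9))   # 88 / 9 is about 9.78
print(class_interval(90, 9))   # exact division, nothing to round
```
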
  35. Lower Limit of Lowest Class (LL1): LL1 is the key calculation in frequency distribution. LL1 must be a multiple of i. It should be less than or equal to the minimum value so that the lowest class contains the minimum value.
  36. LL1 (contd.): In the formula “=int(MinVal/EntClassInt) * EntClassInt”, int(MinVal/EntClassInt) gives the integer quotient of the division (the whole number, ignoring the remainder). Multiplying it by the class interval (EntClassInt) gives LL1. Here, MinVal = 12 and i = 10, so LL1 = int(12/10) * 10 = int(1.2) * 10 = 1 * 10 = 10. Hence the lowest class (StartClass) should begin with 10.
  37. Formula (slide image): lower limit of the lowest class (LL1).
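
The LL1 calculation translates directly into Python. In this sketch `math.floor` stands in for Excel's INT (both round down), and the function name `lowest_class_lower_limit` is invented for illustration.

```python
import math


def lowest_class_lower_limit(min_val, class_interval):
    """Largest multiple of class_interval that does not exceed min_val."""
    return math.floor(min_val / class_interval) * class_interval


# Worked example from the slides: MinVal = 12, i = 10
print(lowest_class_lower_limit(12, 10))
```

With an interval of 15 and the same minimum, the function returns 0, so the lowest class would start at 0 and still contain the value 12.
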
  38. Construction of Classes
  39. Construction of Classes, General Principles: All classes should be equal and continuous (no gaps, even if the frequency of a class is 0). Open-ended classes are not provided for in this presentation. Classes with zero data are not allowed at the top or bottom.
  40. Skeleton Table for Frequency Distribution: Prepare a skeleton Frequency Distribution Table as shown in the next slide. It will be used as a template for showing the Frequency Distribution of different fields, one field at a time. It provides for up to 20 classes. If fewer classes are used for a field, the remaining rows stay blank.
  41. Template for Frequency Distribution Table (slide image; includes a row for the Total).
  42. Preparing the Lowest Class
  43. Formula (slide image): Lower Limit of Lowest Class (LL1) = StartClass.
  44. If(E5 = “”, “”, ……………..): “=If(E5 = “”, “”, ………..)” in the next slide means that if the “From” cell (E5 here) is blank, this cell is left blank too. This ensures blank rows wherever the “From” cell of a row has no data. The formula in the next slide adds “i” to LL1 to get UL1.
  45. Formula (slide image): Upper Limit of Lowest Class (UL1).
  46. Concatenation Operation: The formula in the next slide concatenates (joins fragments of text) the numbers in the “From” and “To” columns, separated by a hyphen. This column is not required for mathematical operations but is very useful for showing the classes when you prepare an observation table, graph or chart from the Frequency Distribution Table.
  47. Formula (slide image): Class for Observation Table or Graph.
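
The three cells of the lowest class row (From, To and the concatenated class label) can be mimicked in Python as follows. This is a sketch under assumptions: the function name `first_class_row` is hypothetical, and the empty string stands in for a blank cell, mirroring the If(E5 = “”, “”, …) guard.

```python
def first_class_row(start_class, class_interval):
    """Return (From, To, label) for the lowest class.

    If the From value is blank (""), every cell in the row stays blank.
    """
    if start_class == "":
        return "", "", ""
    upper = start_class + class_interval    # UL1 = LL1 + i
    label = f"{start_class} - {upper}"      # From and To joined by a hyphen
    return start_class, upper, label


print(first_class_row(10, 10))
print(first_class_row("", 10))
```
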
  48. Using COUNTIF to Count Frequencies: COUNT simply counts cells containing numeric data, irrespective of their values. COUNTIF, on the other hand, counts cells whose values meet pre-defined criteria, e.g. < 10, > 20, ≥ 30, <> 40 (not equal to 40) and so on. COUNTIF will be used to count the cells that contain data belonging to a specific class, turn by turn.
  49. Two Methods of Determining Frequencies: (1) Frequency (f) for the 30-40 class = count of cells containing values ≥ 30 and < 40. (2) f for the 30-40 class = cumulative f for < 40 minus cumulative f for < 30. In this presentation, the second method has been used.
  50. Need for “From” and “Upto” Columns: Now we shall ask Excel to read an “Upto” value from a cell (e.g. F5) and count the cells in the range in question (RawData) that contain values below it. For this reason we need separate “From (≥)” and “Upto (<)” columns. The mathematical symbols also indicate that values of 30 or more but less than 40 are placed in the 30-40 class, whereas values of 40 or more but less than 50 go in the 40-50 class.
  51. Formula (slide image): count of cells containing values less than UL1 (F5, i.e. 20).
  52. Formula (slide image): percentage rounded to one decimal place.
  53. Formula (slide image): for the lowest class, f = cumulative f.
  54. Formula (slide image): percentage rounded to one decimal place; running totals of f and %.
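
To see how the percentage and running-total columns relate to the frequencies, the short Python sketch below recomputes them from the f column of the example table on slide 2. The variable names are illustrative; only the frequencies come from the slides.

```python
from itertools import accumulate

# f column of the example Frequency Distribution Table (slide 2)
frequencies = [492, 602, 632, 670, 620, 619, 631, 600, 665]

n = sum(frequencies)                   # total number of observations
cum_f = list(accumulate(frequencies))  # running total of f

# Percentages rounded to one decimal place, as in the slides
pct = [round(100 * f / n, 1) for f in frequencies]
cum_pct = [round(100 * c / n, 1) for c in cum_f]

for f, p, cf, cp in zip(frequencies, pct, cum_f, cum_pct):
    print(f, p, cf, cp)
```

The cumulative percentage is computed from the cumulative frequency rather than by adding up the rounded percentages, which avoids accumulating rounding error.
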
  55. Preparing the Second Class
  56. Copy the first row to the second and change the formulas of two cells (next slide).
  57. Formula (slide image): The first part ensures that if MaxVal has already been crossed, a blank row is produced; otherwise “i” is added to LL1. (Do NOT enter LL2 = UL1, as you may sometimes want a gap, as discussed later.)
  58. Formula (slide image): “f” for this class is calculated as “Cum f” for this class minus “Cum f” of the preceding class.
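
The two ways of obtaining a class frequency described in the slides (a direct count within the class, or a difference of two "less than" counts) give the same answer. The Python sketch below checks this on made-up data; the function names are hypothetical.

```python
def count_below(values, limit):
    """Number of values strictly below limit (a COUNTIF-style '< limit' count)."""
    return sum(1 for v in values if v < limit)


def freq_direct(values, lower, upper):
    """Method 1: count values with lower <= v < upper."""
    return sum(1 for v in values if lower <= v < upper)


def freq_by_difference(values, lower, upper):
    """Method 2: cumulative count below upper minus cumulative count below lower."""
    return count_below(values, upper) - count_below(values, lower)


# Made-up data; 30 falls in the 30-40 class and 40 in the 40-50 class
raw_data = [12, 29, 30, 35, 39, 40, 41, 58]
print(freq_direct(raw_data, 30, 40), freq_by_difference(raw_data, 30, 40))
```
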
  59. Preparing Higher Classes: Piece of Cake
  60. Copy the 2nd row to all the other blue rows.
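
Copying the second row down the template amounts to repeating "add i to the previous lower limit, or leave the row blank once the maximum has been passed". The Python sketch below imitates that behaviour. The exact blank-row condition in the workbook formula is not visible in the transcript, so the test used here (the next lower limit exceeds MaxVal) is an assumption, as are the function and variable names.

```python
MAX_ROWS = 20  # the template provides for up to 20 classes


def build_class_rows(start_class, class_interval, max_val, max_rows=MAX_ROWS):
    """Return max_rows (From, To) pairs; rows beyond the data are ("", "")."""
    rows = []
    lower = start_class
    for _ in range(max_rows):
        if lower == "" or lower > max_val:
            # Assumed rule: once MaxVal has been crossed, this and all
            # later rows stay blank
            lower = ""
            rows.append(("", ""))
        else:
            rows.append((lower, lower + class_interval))
            lower = lower + class_interval
    return rows


rows = build_class_rows(start_class=10, class_interval=10, max_val=99)
used = [r for r in rows if r != ("", "")]
print(used)
```
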
  61. The Frequency Distribution Table is ready! For totals, use the SUM function in the Total row.
  62. The Frequency Distribution Table is ready! Get totals by using the SUM function in the Total row. Check the total by selecting the data in the “f” column; the sum shows up in the status bar for as long as the data stays selected.
  63. Frequency Distribution Chart/Graph
  64. Select the columns that contain the classes and f, along with the column headings (G4:H13).
  65. Insert → Recommended Charts → select a chart → OK.
  66. Format the chart as required.
  67. Frequency Distribution with other class intervals: “I want data in classes of 15 each.”
  68. All you have to do is change the EntClassInt value that you entered earlier. Let us see the effect of changing the class interval from 10 to 15.
  69. Class Interval = 15: instantaneous change in the table and graph.
  70. Comparison with the automatic Data Analysis tool
  71. Data → Data Analysis → Histogram
  72. Input Range, Bin Range, Output Range, Type of Output
  73. Frequencies do NOT match.
  74. Works well for discrete classes.
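
The slides do not spell out why the frequencies differ. One likely contributor is the boundary convention: the Histogram tool counts a value into a bin when it is less than or equal to the bin value, whereas the classes in this tutorial use a strict "less than" upper limit. The Python sketch below contrasts the two conventions on made-up data; the function names are hypothetical.

```python
def freq_upper_exclusive(values, lower, upper):
    """Tutorial convention: lower <= v < upper."""
    return sum(1 for v in values if lower <= v < upper)


def freq_upper_inclusive(values, lower, upper):
    """Bin-style convention: lower < v <= upper."""
    return sum(1 for v in values if lower < v <= upper)


# Made-up data with values sitting exactly on the class boundaries
raw_data = [12, 20, 20, 25, 30]

for lower, upper in [(10, 20), (20, 30)]:
    print(
        f"{lower} - {upper}:",
        freq_upper_exclusive(raw_data, lower, upper),
        freq_upper_inclusive(raw_data, lower, upper),
    )
```

Values that fall exactly on a boundary (20 and 30 here) move to a different class depending on the convention. With discrete classes such as 10-19 and 20-29, no observation sits on a shared boundary, which is consistent with the remark that the tool works well for discrete classes.
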
  75. Caution: Understand the working of tools before you use them.
  76. Save this Workbook for Future Use: Getting the Frequency Distribution for the first time is a little laborious. Save this table. After this comes the easy part.
  77. To get the frequency distribution of other fields, turn by turn, all you have to do is change the cell reference of the RawData range. To get the frequency distribution of HeightCm, change the cell reference of RawData to that of HeightCm, i.e. from $C$2:$C$5532 to $E$2:$E$5532. If your data is in a rectangular table, just change the two column references from C to E, without disturbing the row numbers.
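
The idea of a single alias can be pictured in Python as one variable that is re-pointed at a different column while the downstream calculation stays untouched. The table contents, the `summarise` helper and the variable names below are all made up for illustration.

```python
# Made-up miniature "Data Table" with two numeric fields
data_table = {
    "Age": [34, 51, 47],
    "HeightCm": [162.0, 175.5, 158.2],
}


def summarise(raw_data):
    """Downstream calculations that only ever look at the alias."""
    return {"n": len(raw_data), "min": min(raw_data), "max": max(raw_data)}


# Point the alias at Age, then re-point it at HeightCm; summarise() is unchanged
raw_data = data_table["Age"]
age_summary = summarise(raw_data)

raw_data = data_table["HeightCm"]
height_summary = summarise(raw_data)

print(age_summary)
print(height_summary)
```
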
  78. Frequency Distribution of other numeric fields, e.g. HeightCm
  79. Formulas → Defined Names → Name Manager → RawData → Edit
  80. Change the column from C to E at both places.
  81. Frequency Distribution of HeightCm by merely changing the column reference at two places. Use =E1 to get the new field name. Change the class interval, if required.
  82. In this way you can change the columns in RawData to those of any other numeric field to get the frequency distribution of that field. You can also change the graph type and its formatting as desired.
  83. For discontinuous data, use discontinuous classes, e.g. 10-19, 20-29 etc.
  84. Change the upper limit of the starting class (UL1) only; the others will adjust. Note that this is NOT “live”.
  85. For true (actual) class limits, subtract half a unit from the lower limit and add half a unit to the upper limit, e.g. for 10-19 take 9.5-19.5 into account, for 20-29 take 19.5-29.5, and so on.
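
The half-unit adjustment for true class limits is simple arithmetic, sketched here in Python. The function name `true_class_limits` and the `unit` parameter are illustrative; the unit is assumed to be 1 (whole numbers), as in the slide's example.

```python
def true_class_limits(lower, upper, unit=1):
    """Widen a discontinuous class by half a unit on each side."""
    half = unit / 2
    return lower - half, upper + half


print(true_class_limits(10, 19))
print(true_class_limits(20, 29))
```

The upper true limit of one class equals the lower true limit of the next, so the adjusted classes are continuous again.
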
  86. Reduce by half a unit.
  87. Formula (slide image): automatic adjustment.
  88. Round LL1 upwards and UL1 downwards: new class.
  89. No change required in subsequent rows.
  • NehadElhemali

    Mar. 29, 2020
