Guidelines for Creating Histograms

  • Use 5 to 20 class intervals. More class intervals are appropriate for larger samples.
  • Each observation should lie within exactly one class interval.
    A common way to ensure this is to construct your class boundaries so that writing them requires one more decimal place than your data. For example, if your data looked like 23, 23, 24, 25, 25, 27, ... one of your class boundaries might be 24.5. The use of ".5" is common in this context. Similarly, if your data looked like 45.32, 45.67, 45.87, 46.88, ... then one of your class boundaries might be 45.875. No observation can then fall exactly on a boundary.
  • The class intervals should have equal width.
  • One should include all classes in the depiction of the histogram, even if there are no data points in one or more of these classes.
    That is to say, you should have bars of "zero-height" for classes with no data associated with them.
  • One should use common sense, and pick class limits and boundaries that are reasonable.
    Use nice "round" numbers for your class limits/marks as long as there is not a compelling reason to avoid doing so. It will make your histogram easier to read. For example, if your data starts with 43, 46, 48, 48, 52, 57, 58, ... you might pick a lower class limit of 40 and a class width of 5 (provided that this results in a reasonable number of classes).
  • The goal of making a histogram is to "see" the distribution of the variable.
    Experiment with different choices for boundaries, subject to the above restrictions, to find out which graphical properties (modality, skewness or symmetry, outliers, etc.) persist and which are just spurious effects of a particular choice of boundaries. Then use the boundaries that best reveal these persistent properties.
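
The guidelines above can be tried out with a short Python sketch. It is only one possible way to do this, not a prescribed method. The function names class_boundaries and class_counts, and the "unit" parameter (the precision of the data), are made up for this illustration. The data begins with the values from the example above; the final value 71 is an invented observation, added only so that some classes come out empty.

```python
from bisect import bisect_right

def class_boundaries(data, lower_limit, width, unit=1):
    """Return equally spaced class boundaries that cover all of `data`.

    `unit` is the precision of the data (1 for whole numbers, 0.01 for
    two decimal places). The boundaries are shifted down by half a unit,
    so no observation can fall exactly on a boundary.
    """
    start = lower_limit - unit / 2
    if min(data) < start:
        raise ValueError("lower_limit is above the smallest observation")
    n_classes = 1
    # Add classes until the last upper boundary exceeds the largest value.
    while start + n_classes * width <= max(data):
        n_classes += 1
    return [start + i * width for i in range(n_classes + 1)]

def class_counts(data, boundaries):
    """Count observations in each class; empty classes keep a count of 0."""
    counts = [0] * (len(boundaries) - 1)
    for x in data:
        # Index of the class whose boundaries enclose x.
        counts[bisect_right(boundaries, x) - 1] += 1
    return counts

# First seven values are from the text; 71 is an invented outlier.
data = [43, 46, 48, 48, 52, 57, 58, 71]
bounds = class_boundaries(data, lower_limit=40, width=5)
counts = class_counts(data, bounds)

# Crude text histogram: one row per class, including zero-height classes.
for lo, hi, c in zip(bounds, bounds[1:], counts):
    print(f"{lo:5.1f} to {hi:5.1f} | {'*' * c}")
```

Rerunning the sketch with a different lower_limit or width is a quick way to see which features of the picture persist and which depend on the particular boundaries chosen.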