In An Introduction to Bar Charts, we saw how bar charts could be used to display measures relating to 'discrete' variables on a chart. In this post, we'll consider how a histogram can be used to plot 'continuous' data using discrete 'bars' to represent the data frequency over a particular range of values.
Pulse-rates of a group of 100 students (from the OpenLearn Unit More working with charts, graphs and tables).
Read Discrete and continuous variables (from the OpenLearn unit More working with charts, graphs and tables).
What continuous and discrete variables did you identify in the final exercise?
Now read both of these OpenLearn sections on histograms: Histograms (from Working with charts, graphs and tables) and Histograms (from More working with charts, graphs and tables).
What are the defining characteristics of a histogram? Write down at least three ways in which a histogram differs from a simple bar chart.
In a histogram, the range of values covered by each bar is often referred to as a bin. One thing to check when reading a histogram is the bin size used for each of the bars; in most situations, the bins should all be the same width. The choice of "bin size", or "bin width", can influence the shape of the histogram, as described here: The Histogram (from NetMBA).
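To see how bin width can change the apparent shape of a histogram, here is a minimal Python sketch. The pulse-rate values are made up purely for illustration (they are not the OpenLearn data), and the function name `histogram` and its parameters are hypothetical choices for this example.

```python
def histogram(values, bin_start, bin_width, n_bins):
    """Return a list of counts, one per equal-width bin.

    Bin i covers the range [bin_start + i * bin_width,
                            bin_start + (i + 1) * bin_width).
    """
    counts = [0] * n_bins
    for v in values:
        index = int((v - bin_start) // bin_width)
        if 0 <= index < n_bins:  # ignore values outside the binned range
            counts[index] += 1
    return counts

# Made-up pulse rates (beats per minute), for illustration only.
pulse_rates = [62, 65, 68, 70, 71, 72, 74, 75, 78, 80, 83, 91]

# The same data, binned two different ways.
narrow = histogram(pulse_rates, bin_start=60, bin_width=5, n_bins=8)
wide = histogram(pulse_rates, bin_start=60, bin_width=20, n_bins=2)

print("width 5: ", narrow)
print("width 20:", wide)
```

With the narrow bins, the counts show a peak in the low 70s and an empty bin between 85 and 89; with the wide bins, the same data collapses into just two bars and that detail is lost.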
In contrast to a bar chart, where each bar represents a value of a discrete variable, the height of each bar in a histogram represents a frequency count: how many of the data values fall within the bin corresponding to that bar.
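The idea of a frequency count per bin can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the pulse-rate values are invented, the bins are assumed to be 10 beats per minute wide starting at 60, and `bin_counts` is a hypothetical helper name.

```python
from collections import Counter

def bin_counts(values, bin_start, bin_width):
    """Count how many values fall in each bin [lower, lower + bin_width)."""
    counts = Counter()
    for v in values:
        index = int((v - bin_start) // bin_width)
        lower = bin_start + index * bin_width  # lower edge of the value's bin
        counts[lower] += 1
    return dict(sorted(counts.items()))

# Made-up pulse rates (beats per minute), for illustration only.
pulse_rates = [62, 65, 68, 70, 71, 72, 74, 75, 78, 80, 83, 91]

counts = bin_counts(pulse_rates, bin_start=60, bin_width=10)

# Draw a rough text histogram: one '#' per value in the bin.
for lower, count in counts.items():
    print(f"{lower}-{lower + 9}: {'#' * count}")
```

Each row of `#` characters plays the role of a bar: its length is the number of values that landed in that bin, not the value of any single observation.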
A histogram thus shows the distribution of data values across a particular range. The human eye is very good at recognising the shapes of different distributions, so a histogram is one way of detecting potential anomalies in the distribution of data.
In the next post, we will consider alternative ways of displaying continuous data.