There’s a link in today’s links dump to a post from Pictures of Numbers, a rarely updated blog on the visual presentation of data (via Swans On Tea, I think). There’s some really good stuff there about how to make graphs that are easy to read and interpret.
I would like to dissent mildly from one of their points in the Better Axes post: specifically, the advice about not starting the axis at zero. In many cases this is good advice, but like most rules of thumb, it shouldn’t be followed too rigidly.
Take, for example, this plot from one of my metastable xenon papers:
A strict application of the presentation rules promoted by people like Edward Tufte (which are generally quite similar to the things Pictures of Numbers says) would say that there’s too much white space in this graph, particularly at the bottom. The variation in the data would be much more obvious if the vertical axis started at 0.5 instead of 0.
The problem is, that would be a deceptive presentation of the data. The whole point of this graph, in the context of the research, is that there’s actually very little variation in the data. The points show the relative collision rates for various isotopes of xenon, and they’re all more or less the same. Blowing up the axis would make what differences there are more obvious, but it would also distort the point of the graph.
This comes up a lot in intro labs, in which we frequently ask students to make null measurements– to measure how the period of a pendulum depends on the mass, for example, or how the period of a mass oscillating on a spring depends on the amplitude. Students will carefully measure a half-dozen points, pop them into Excel, and make a graph that is auto-scaled so that the data points span the full vertical range of the plot. Then they’ll fit a trend line to the data, and declare that the period increases linearly with the mass. And if you look at the plot, it looks for all the world as if the trend line slopes impressively from one corner to the other.
Of course, the period of a pendulum doesn’t depend on the mass, so what they’re fitting is just noise. The dramatic sloping trend lines are dramatic only because the scale is blown up so much. If the vertical scale went all the way to zero, it would be clear that there’s no variation worth mentioning.
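To make the effect concrete, here’s a minimal sketch in Python with matplotlib, using invented pendulum numbers rather than any real measurements. The only difference between the two panels is whether the vertical axis is allowed to auto-scale:

```python
import numpy as np
import matplotlib.pyplot as plt

# Invented pendulum data: the period is really constant (~1.42 s),
# with a little random scatter standing in for measurement noise.
mass = np.array([50, 100, 150, 200, 250, 300])  # grams
period = 1.42 + 0.01 * np.random.default_rng(0).standard_normal(mass.size)  # seconds

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(8, 3))

# Left: auto-scaled axes, the spreadsheet default. The scatter fills
# the whole vertical range, so a fitted line looks impressively sloped.
ax1.plot(mass, period, "o")
ax1.set_title("Auto-scaled")

# Right: the same points with the vertical axis forced to start at zero.
# The "trend" flattens into what it really is: no variation worth mentioning.
ax2.plot(mass, period, "o")
ax2.set_ylim(0, 2)
ax2.set_title("Axis from zero")

for ax in (ax1, ax2):
    ax.set_xlabel("Mass (g)")
    ax.set_ylabel("Period (s)")

plt.tight_layout()
plt.show()
```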
I’m not saying that you should totally abandon the principles Tufte and Pictures of Numbers suggest– for most cases, their advice is good advice. It’s important to remember that all these rules have exceptions, though, and to think carefully about what you’re trying to show before you plot your data.
Hey Chad – Are you sure you’re interpreting Tufte correctly? My impression from his discussion of “graphical integrity” is that he would consider a plot with a zoomed-in vertical scale (massively overemphasizing minute variations) to be an example of a significant “lie factor”. That is, a true 1% variation in the data would show up as, e.g., a 50% variation in the vertical position of the curve in the graph.
I’m going off my recollection of The Visual Display of Quantitative Information (the book is in my screaming-infant-free office at work), which goes on at some length about the need to minimize white space by choosing the axis scale to maximize the spread of points, and so on.
The “lie factor” thing I’ve mostly seen applied to bar graphs and the like, not scatter plots.
One thing this discussion highlights is how you can manipulate your plots to emphasize what you want. In choosing a scale, keep in mind what is considered large or small variability in *your* field. If you are showing the stability of numerical methods, where “no variation” ought to mean something near machine precision, but your axis spans something like 10^-3, your graph is misleading. In this case, how big is the difference between a rate of 1 and 2? I think that should be more of a guide to how to scale than the amount of white space.
Why not plot both? First an absolute scale that includes zero and all of the data, which instantly shows the variation in proportion to the central tendencies, and then a relative scale which details the variation?
Label the one ‘full scale’ and the other ‘detail’, and people will quickly get used to the scheme and start liking the pairing.
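As a rough sketch of that pairing (illustrative numbers only, and matplotlib assumed, since the actual data isn’t shown here):

```python
import numpy as np
import matplotlib.pyplot as plt

# Illustrative data: a ratio hovering near 1 with small scatter.
x = np.arange(1, 7)
y = 1.0 + 0.02 * np.random.default_rng(1).standard_normal(x.size)

fig, (full, detail) = plt.subplots(1, 2, figsize=(8, 3))

# "Full scale" panel: includes zero, so the variation is seen
# in proportion to the overall size of the values.
full.plot(x, y, "o")
full.set_ylim(0, 1.2)
full.set_title("Full scale")

# "Detail" panel: zoomed vertical scale to show what structure
# the variation has, such as it is.
detail.plot(x, y, "o")
detail.set_ylim(0.9, 1.1)
detail.set_title("Detail")

plt.tight_layout()
plt.show()
```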
In this case, how big is the difference between a rate of 1 and 2?
Pretty big. What’s plotted is the ratio of the collision rate for each sample to the collision rate for a reference sample. A factor of two difference between isotopes is possible, but would be highly noteworthy.
Why not plot both? First an absolute scale that includes zero and all of the data, which instantly shows the variation in proportion to the central tendencies, and then a relative scale which details the variation?
In a lot of cases, there are tight space restrictions– Physical Review Letters requires all papers to fit within four journal pages, including the figures. You can’t really justify putting in two figures if you can get the essential information into one.
You see a lot of inset plots in PRL for this reason. People who want to do the full-scale/detail thing will put one figure inside the frame of the other.
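A minimal sketch of that inset arrangement, again with matplotlib and invented numbers:

```python
import numpy as np
import matplotlib.pyplot as plt

# Illustrative data, as in the sketch above.
x = np.arange(1, 7)
y = 1.0 + 0.02 * np.random.default_rng(2).standard_normal(x.size)

fig, ax = plt.subplots(figsize=(5, 4))

# Main frame: full scale, with the vertical axis anchored at zero.
ax.plot(x, y, "o")
ax.set_ylim(0, 1.2)
ax.set_xlabel("Isotope (arbitrary index)")
ax.set_ylabel("Relative rate")

# Inset: the zoomed "detail" view, tucked inside the main frame so the
# pair costs only one figure's worth of journal space.
inset = ax.inset_axes([0.55, 0.1, 0.4, 0.35])  # x, y, width, height in axes fractions
inset.plot(x, y, "o", markersize=3)
inset.set_ylim(0.9, 1.1)

plt.show()
```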
I would include zero if I wanted to show that the points are away from zero, but in this case zero sounds like an arbitrary value.
What I would do (or at least try) would be to make the points stand out more by making them larger: the lines are all very heavy. I would also play around with the vertical size: you could simply chop out the 0 to 0.5 part, making the whole graph smaller. It makes the journal’s production staff happier, too!
Hmm, actually I think this plot would still show little variation if you kept the same sized figure but started at 0.5. The key is that you’ve got error bars, so they’re filling the plot up: they’re showing that the data is dominated by noise.
Could always plot variation from the theoretically predicted value?
(Gah. I’ve spent too much time with engineers, if plotting the residual is becoming my default)
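For what it’s worth, a residual plot along those lines might look something like this (matplotlib again, with invented numbers and a flat theoretical prediction of 1.0 standing in for the real theory):

```python
import numpy as np
import matplotlib.pyplot as plt

# Invented data and a hypothetical theory that predicts exactly 1.0
# for every isotope (the null result).
x = np.arange(1, 7)
y = 1.0 + 0.02 * np.random.default_rng(3).standard_normal(x.size)
theory = np.ones_like(y)

fig, ax = plt.subplots(figsize=(5, 3))

# Plot data minus theory. Zero is now a natural, non-arbitrary
# reference line, and the axis can be as tight as the error bars
# demand without exaggerating anything.
ax.axhline(0, linestyle="--", color="gray")
ax.errorbar(x, y - theory, yerr=0.03, fmt="o")
ax.set_xlabel("Isotope (arbitrary index)")
ax.set_ylabel("Data - theory")

plt.tight_layout()
plt.show()
```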
Mike from Pictures of Numbers here.
I think starting that graph from zero is just fine. I don’t think there’s too much white space, and I don’t think Tufte would think so either. You’re quite right that trimming the white space would unbalance the graph and magnify the noise.
If I were to make any changes, I’d reconsider the assortment of symbols you’re using in the key. Perhaps arrange the five isotopes in numerical order, and use a sequence of symbols that changes systematically in size or shape? If they were less random, it would be easier to look for patterns.
Also I would probably gray out the dashed line a little, label it directly with the word “Theory”, and get the key out of the way: it’s encroaching on the data a bit.
Anyway, thanks for taking the time to discuss this.