A few days back, Matthew Beckler added the Kindle edition to his sales rank tracker for How to Teach Physics to Your Dog. Given my well-known love for playing with graphs of data, it was inevitable that I would plot both of these in a variety of ways.
So, what do we learn from this? Well, we learn that people in the Albany, NY area don’t own Kindles:
OK, maybe that’s not obvious to everybody…
When you look at that graph, the blue line is the Amazon sales rank of the physical book edition, while the red line is the Amazon sales rank of the Kindle edition. The two track each other pretty well for a while, but diverge dramatically after about 48 hours, with the physical book sales rank shooting into the triple digits, while the Kindle sales rank stayed around 3,500. So, what happened at 48 hours?
The only significant development that I’m aware of that took place around then (which was about 1pm ET Sunday) is that both the Albany Times Union and the Schenectady Gazette ran articles about the book (the Times Union had a piece about the book written for them by a freelance writer, the Gazette went with the AP review, plus a notice of this weekend’s signing). Nothing else happened around that time that I know of.
So, this tells us that notices in the local papers were enough to drive up the sales rank of the physical book, but not the Kindle edition. So, people in the Albany area don’t own Kindles. Or, to be more precise, people in the Albany area who read print newspapers (neither the Gazette nor the Times Union put the book on their web sites) don’t own Kindles.
Amazing what you can learn from looking at graphs.
The other obvious thing that you can do with these data is to look at what relationship, if any, exists between the book sales rank and the Kindle sales rank. The easiest way to get at this is to plot one on the vertical axis and the other on the horizontal axis:
I’ve divided the data into two sets for this graph. The purple points are the first 48 hours of the data set, and the green points are the last 48 hours. For the first 48 hours, the two ranks track each other pretty well: a straight line drawn through the purple points would come close to most of them, and has a slope close to 1 (1.27, to be precise). It’s not perfect, but it’s plenty good enough for social science.
The green points are way off that line, for the most part, but there’s a big clump of them over in the upper left, that would fit reasonably well to a line with a slope of a bit less than 3 (making a rough cut of that group gives a slope of 2.93). Those points are the stretch from Sunday afternoon through Monday night, when the physical book rank was at its highest point.
There’s also a sort of a tail connecting the Sunday-Monday group, as the book rank drifts back up to more or less where it was before the dramatic spike. At the time of this writing, the book rank is back up to 2500, which is about where it’s been since the AP review ran.
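For anybody who wants to play along at home, here is a minimal sketch of the split-and-fit procedure described above, assuming an ordinary least-squares fit with the Kindle rank on the vertical axis and the book rank on the horizontal axis (the post doesn’t say which fitting method was used). The function names `ols_slope` and `split_slopes` are hypothetical, and the numbers are made up for illustration; they are not the tracker data.

```python
from statistics import mean


def ols_slope(x, y):
    """Least-squares slope of y against x."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def split_slopes(hours, book_rank, kindle_rank, cutoff=48):
    """Fit Kindle rank against book rank separately before and after the cutoff hour."""
    rows = list(zip(hours, book_rank, kindle_rank))
    early = [(b, k) for h, b, k in rows if h < cutoff]
    late = [(b, k) for h, b, k in rows if h >= cutoff]

    def fit(points):
        return ols_slope([p[0] for p in points], [p[1] for p in points])

    return fit(early), fit(late)


# Made-up numbers for illustration only, NOT the actual sales rank data.
hours = [0, 12, 24, 36, 48, 60, 72, 84]
book = [2000, 2400, 2800, 3200, 900, 700, 800, 1000]
kindle = [3000, 3500, 4000, 4500, 3600, 3000, 3300, 3900]

early_slope, late_slope = split_slopes(hours, book, kindle)
print(early_slope, late_slope)
```

The 48-hour cutoff is the only number here taken from the post; everything else is a placeholder to swap out for real data.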
So that’s this week’s thrilling installment of Playing With Graphs…
Sales rank is probably not the right data to plot here; sales volume would be more fitting (you might recover it by assuming that it follows a Pareto law).
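A rough sketch of that conversion, assuming sales fall off as a simple power law in rank, sales = scale × rank^(−exponent). Both `scale` and `exponent` are hypothetical placeholders, not values fitted to Amazon data, and the rank of 500 below is just a stand-in for “triple digits”.

```python
def estimated_sales(rank, scale=1.0, exponent=1.0):
    """Relative sales volume for a given rank under an assumed power law.

    scale and exponent are hypothetical placeholders, not fitted values.
    """
    if rank < 1:
        raise ValueError("rank must be at least 1")
    return scale * rank ** (-exponent)


def sales_ratio(rank_a, rank_b, exponent=1.0):
    """How many times more copies rank_a sells than rank_b; the scale cancels."""
    return estimated_sales(rank_a, exponent=exponent) / estimated_sales(
        rank_b, exponent=exponent
    )


# A stand-in triple-digit book rank against a Kindle rank of about 3,500.
ratio = sales_ratio(500, 3500)
print(ratio)
```

Because only the ratio of two ranks is used, the unknown scale drops out, but the answer still depends entirely on the assumed exponent.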
It’s not perfect, but it’s plenty good enough for social science.
I LOL’ed.
What an irresponsible analysis.
You’re supposed to be in a position of scientific authority here, yet you post a couple charts then unapologetically, without qualification, jump to some obscure conclusion because it makes for a sensationalist headline? This is a terrible example to be setting for your readers.
You want to conclude from these two charts that “people in the Albany area who read print newspapers don’t own Kindles.”?
There are plenty of other explanations. Let me try a few:
1. Something else you don’t know about (god forbid) caused a few more people to buy print versions.
2. A larger chunk of the population (which includes disproportionately fewer Kindle owners) buys books on Sundays.
3. People who own Kindles are less likely than non-Kindle owners to be convinced by a stupid article they read in the local newspaper to buy the book.
4. People who read the newspaper _and_ own a Kindle still prefer to buy some books in print.
5. Random variation.
You’re not being a scientist. You’re being a sensationalist. Then we wonder why science is so misrepresented in the media…
Well I thought it was funny…
6. Scientists should never, ever be funny.
Ok, then you’re not a very good blogger for failing to recognize that sarcasm doesn’t work on the internet. Clearly the first commenter thought this was a serious analysis.
If I were a social scientist, I might examine the comments of this post and come to the unequivocally true conclusion that readers of science-themed blogs are incapable of recognizing hyperbole and sarcasm, even when the blogger is laying it on awful thick.
@1 and especially @3: Oh come on! He was drawing absolute conclusions from a scatter plot. It is blatantly obvious that his conclusions are meant to be comical and an attempt to poke at the soft sciences.
For variables like these, without any obvious cause-effect relationship, you should actually be calculating the slope for a reduced-major-axis regression, rather than simple Least Squares. The RMA slope is always higher.
But the slopes are irrelevant here anyway; it’s the correlation coefficients we want.
Yes, I’m having fun.
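For the curious, a small sketch comparing the two slopes and the correlation coefficient. It relies on the standard relations: the least-squares slope is r·(s_y/s_x), while the reduced-major-axis slope is sign(r)·(s_y/s_x), so the RMA slope is at least as steep as the least-squares one, with equality only for perfectly correlated data. The function name `regression_summary` and the data are made up for illustration.

```python
from math import sqrt
from statistics import mean


def regression_summary(x, y):
    """Return Pearson r, the least-squares slope, and the RMA slope of y on x."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    r = sxy / sqrt(sxx * syy)
    ols = sxy / sxx                       # equals r * s_y / s_x
    sign = 1 if r >= 0 else -1
    rma = sign * sqrt(syy / sxx)          # equals sign(r) * s_y / s_x
    return {"r": r, "ols": ols, "rma": rma}


# Made-up points for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]
summary = regression_summary(x, y)
print(summary)
```

The ratio of the two slopes is just |r|, so the noisier the scatter, the bigger the gap between them.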
Ah! I’ll repeat what idlemind wrote: I thought it was funny… Though maybe a little unfair to social scientists. They would at least run a covariance matrix…
Or maybe you’re not a good enough blog reader? Or not a regular reader of this particular blog? Ever heard of inside jokes?
And you left out one of the more plausible explanations. “Or, to be more precise, people in the Albany area who read print newspapers […] don’t own Kindles.”: there may be a link between usage of technology for reading newspapers and usage of technology for reading in general. Or maybe those who buy for immediate reading have different habits or buy at different hours of the day, or maybe it has to do with part of the population of the country waking up at another time, or people seeing the print edition of the newspaper liked the black and white picture of the book better than the color one, etc.
Or maybe I’m reading it backwards, since sales rank runs inversely to the number of copies sold: the more copies, the higher the rank. 1000 copies = 3000th rank; 3000 copies = 1000th rank. According to this, Mr. Orzel is misreading the “data”, if we can call it that, since the spike in rank at 48 hours means the exact opposite: it “fell” to the three-digit area. And the Kindle version “rose”.
So everybody in Albany who reads print papers DOES OWN a Kindle.
And the second graph is also interesting : it clearly shows that the book (both versions) now sell more than in the beginning. And that, when more books are bought, well, more books are bought. That’s a good lesson.
Or maybe I am the one misreading the graph…
*runs hiding under his bed*
Sarcasm doesn’t work on the internet? Oh really. Fantastic. Thanks so much for telling me.
…
Seems to be working just fine. Great post, btw.