Cross-Linking and Statistics

Posted on February 8, 2008 by Chad Orzel

Henry Farrell is doing some substantive political science blogging over at the Monkey Cage, looking at a paper by his co-blogger Eszter Hargittai and her colleagues. They did a really impressive amount of work to look at the linking habits of liberal and conservative bloggers, and Henry zeroes in on one of the findings in particular:

Straw-man arguments account for 43% of the 42 links from conservative blogs to liberals in our sample, and 54% of the 63 links from liberal blogs to conservatives in our group of entries that include cross-ideological linkages. …Posts that concretely address the content of a blog entry from an ideological opponent represent about a quarter (26%) of all conservative and about one fifth (21%) of all liberal posts with cross-ideological links. Substantive disagreement accounted for 12% of links from conservative to liberal blogs and 16% of links from liberals to conservatives, while substantive agreement accounted for 14% of links from conservatives to liberals and 5% of pointers from liberals to conservatives.

Henry offers four possible explanations, two of which might be characterized as straw man arguments on behalf of the different sides, which I guess reveals him as a liberal… I’d like to suggest a fifth explanation:

5) You’re overthinking this.

The sample size here is ridiculously small– a total of 63 liberal-to-conservative links, and 42 conservative-to-liberal links. The 11% difference between the linking habits of the two groups looks big, but you’d expect an uncertainty of about 13% in the L-to-C number, and 15% in the C-to-L number. That’s toward the outside of the “statistical fluctuation” range, but it’s not a difference I would trust in any way.
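For anyone who wants to check the arithmetic, here is a minimal Python sketch. It assumes the 13% and 15% figures come from the usual 1/sqrt(N) counting rule of thumb (the post does not say so explicitly), and it adds a binomial standard-error comparison that is not in the post. The function names are made up for illustration.

```python
import math

# Counts quoted in the post: (straw-man links, total cross-ideological links)
links = {"L-to-C": (34, 63), "C-to-L": (18, 42)}

def counting_uncertainty(n):
    """Rough relative uncertainty on a count of n events: sqrt(n)/n = 1/sqrt(n)."""
    return 1.0 / math.sqrt(n)

def binomial_se(k, n):
    """Standard error of the proportion k/n under a simple binomial model."""
    p = k / n
    return math.sqrt(p * (1.0 - p) / n)

def difference_in_standard_errors(a, b):
    """Gap between two proportions, in units of their combined standard error."""
    (ka, na), (kb, nb) = a, b
    combined_se = math.sqrt(binomial_se(ka, na) ** 2 + binomial_se(kb, nb) ** 2)
    return (ka / na - kb / nb) / combined_se

for name, (k, n) in links.items():
    print(f"{name}: {k}/{n} = {k / n:.0%}, 1/sqrt(N) = {counting_uncertainty(n):.0%}")

z = difference_in_standard_errors(links["L-to-C"], links["C-to-L"])
print(f"gap in combined standard errors: {z:.2f}")
```

Under these assumptions the gap between the two straw-man fractions comes out at a bit over one combined standard error, which fits with treating it as a possible fluctuation rather than a solid effect.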

It looks even less believable to me looking at the raw data. They tallied links from twenty conservative and twenty liberal blogs for three weeks, and then read and characterized the cross-links between them. The straw man fractions for the individual weeks look like this:

June 2004 C-to-L: 2/6 L-to-C: 3/13

October 2004 C-to-L: 11/22 L-to-C: 17/23

March 2005 C-to-L: 5/14 L-to-C: 14/27

Total C-to-L: 18/42 L-to-C: 34/63

These are really small numbers. More importantly, one of the three weeks of data accounts for better than 50% of the difference between the total numbers of “straw man” posts. There’s a big ideological difference in the total number of posts that week, as well, with 27 cross-linking posts by liberals, and only 14 by conservatives.
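The week-by-week share of the gap can be tallied with a short Python sketch. The counts are the ones listed above; the dictionary layout and function names are just illustrative choices, not anything from the paper.

```python
# (straw-man links, total cross-ideological links) for each sampled week
weeks = {
    "June 2004":    {"C-to-L": (2, 6),   "L-to-C": (3, 13)},
    "October 2004": {"C-to-L": (11, 22), "L-to-C": (17, 23)},
    "March 2005":   {"C-to-L": (5, 14),  "L-to-C": (14, 27)},
}

def totals(direction):
    """Sum straw-man links and all cross-links over the three weeks."""
    straw = sum(w[direction][0] for w in weeks.values())
    total = sum(w[direction][1] for w in weeks.values())
    return straw, total

def gap_shares():
    """Fraction of the overall liberal-minus-conservative gap in
    straw-man counts that each week contributes."""
    gaps = {name: w["L-to-C"][0] - w["C-to-L"][0] for name, w in weeks.items()}
    total_gap = sum(gaps.values())
    return {name: gap / total_gap for name, gap in gaps.items()}

print("C-to-L totals:", totals("C-to-L"))
print("L-to-C totals:", totals("L-to-C"))
for name, share in gap_shares().items():
    print(f"{name}: {share:.0%} of the gap in straw-man counts")
```

With these counts, March 2005 supplies 9 of the 16 extra liberal straw-man links, which is the "better than 50%" figure mentioned above.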

While the week of March 13-19, 2005 is described as “a week without any special political events” in the paper, it sure looks like something happened that week that got the liberal blogosphere riled up. The total number of links regardless of ideology is higher for the liberal blogs as well– 173 to 150 for that week– which further suggests some agitation on the liberal side.

In short, I’m not really convinced it’s necessary to go to any great effort to explain this discrepancy. It’s interesting trivia, but looking at the numbers, I’m not convinced it isn’t just noise.

Of course, the methods described in the paper sound like so much work that I doubt we’ll see this repeated any time soon.

5 thoughts on “Cross-Linking and Statistics”

justawriter says:

February 8, 2008 at 1:15 pm

While the week of March 13-19, 2005 is described as “a week without any special political events” in the paper, it sure looks like something happened that week that got the liberal blogosphere riled up.

Heh. I recall Eschaton has its archives sorted by week, so I popped over to see what he wrote about that week. There were important votes on Social Security, Medicare and drilling in ANWR. That was also the week Sen. Frist made his long distance diagnosis of Terri Schiavo. I would be interested in knowing their definition of “significant political events.”
Henry says:

February 8, 2008 at 2:03 pm

I did make the proviso though that this was all predicated “[o]n the assumption that this captures a broader phenomenon” by which I meant to say (although I should have been _much_ clearer) that I didn’t know whether it actually captured something real, but was operating, for the purpose of argument, as if it did. And as a half-arsed Bayesian, I think that this is a fairish starting point in the absence of better data (an absence which as you say is likely to continue because of the immense amounts of work involved in generating it) – while we certainly don’t know what is going on, our priors should have shifted a bit in the direction of believing that there are significant differences in linking practices (only a bit, mind you, but it’s the only data we have apart from purely idiosyncratic personal assessments).
Chad Orzel says:

February 8, 2008 at 2:39 pm

I would be interested in knowing their definition of “significant political events.”

They said special political events, meaning elections and conventions and suchlike. The other two weeks were chosen based on their relation to the party conventions and election of 2004.

And as a half-arsed Bayesian, I think that this is a fairish starting point in the absence of better data (an absence which as you say is likely to continue because of the immense amounts of work involved in generating it) – while we certainly don’t know what is going on, our priors should have shifted a bit in the direction of believing that there are significant differences in linking practices (only a bit, mind you, but it’s the only data we have apart from purely idiosyncratic personal assessments).

Bayesian statistics is one of those things that I wish I understood well enough to discuss sensibly.

I would say that the October numbers suggest there might be something there, but the June figures are useless, and the March ones suspect, given the large disparity in the number of posts. I agree that it’s probably worth another look (or would be, if it were less hassle), but at the moment, I wouldn’t want to put too much effort into theorizing about the cause of the effect.

Of course, I just put probably too much effort into poking at the numbers in order to say that it’s not worth theorizing about…
Eric Lund says:

February 8, 2008 at 3:20 pm

In addition to the small-number statistics, there is also the potential for selection bias. The paper itself is behind a paywall, but I read Henry’s post. Henry didn’t disclose which blogs were surveyed, so I don’t know how representative the sites are. As justawriter points out, there are good reasons for thinking that events of the week of 13-19 March 2005 would get the left more riled up than the right, so absent further evidence we cannot consider that week representative. There is also the issue that whether an argument is a straw man or substantive may be very much in the eye of the beholder, a point Henry implicitly concedes in his quote from the Daily Kos.

I’d also say that you’re being too kind to Henry on his four explanations. All four of them could be reasonably interpreted as straw man arguments: the first two are clearly so, the third represents how conservatives want to be seen (with no evidence that it is actually true), and the fourth is an oversimplified representation of blogospheric dynamics.

There is an important asymmetry to consider, which is the reason why the left felt the need to develop a “netroots” organization. As Bob Somerby of the Daily Howler and many others have thoroughly documented, the “mainstream media” have been systematically distorting political discourse in the Republican Party’s favor for at least fifteen years now. The pundits and editorialists are still almost entirely pro-Bush (Keith Olbermann is the only significant exception I can name offhand) even though more than 2/3 of the people in this country think we are on the wrong track. That’s the need the netroots is trying to address. If there is no equivalent on the right (a point which I would dispute; the right had a large web presence even before the left did), it is because the right’s talking points are regurgitated on the pages of the WaPo and NYT as well as cable talking heads.
Harlan says:

February 8, 2008 at 7:13 pm

I read a fair number of liberal political blogs, and it is very, very rare for a liberal political blog to cite a conservative political blog. There is just no dialog between them. Liberal blogs quote the traditional media all the time, including traditional media they disagree with in order to discredit it (usually), and of course blogs link to each other.

This study is silly. It’s measuring something that almost never happens.
