On Hugo Voting Slates and Clustering

This Hugo nomination scandal continues to rage on, and much of what’s going on is just a giant sucking vortex of stupid. Standing out from this, though, is the guest post by Bruce Schneier at Making Light, which cuts through the bullshit to get to what’s really important, namely using this as an excuse to do some math.

One of the many terrible ideas being floated is to use some analysis of the clustering of ballots to identify “slate voters,” and having done that… something. Target their addresses with orbital lasers, maybe, or just sternly “Tsk tsk” in their general direction. This depends, obviously, on having clearly identifiable “slate voters” who stand out from the norm. Which clearly depends on how much clustering you would normally expect.

Without access to the original ballots, of course, you can’t answer this perfectly, but you can use the publicly accessible data to put some limits on it. You can’t say whether there was significant clustering of ballots in the real set of nominations, but you can say something about the maximum and minimum possible influence of “slates,” defined for this purpose as groups of individual ballots that overlap with each other to some degree.

How to do this? Well, the data files on the Hugo site give the total number of nominating ballots for each category, and the number of votes for each of the top N works in that category. So, for example, in 2009, there were 633 nominating ballots, and the top vote-getters were:

  1. Little Brother, 120
  2. Anathem, 93
  3. The Graveyard Book, 82
  4. Saturn’s Children, 74
  5. Zoe’s Tale, 54

So, what can we do with this? Well, first of all, we can say that the minimum number of “slate” voters is zero– the votes for the eventual finalists come to just 423, well under the total of 633. So these results might’ve come from a set of ballots where no individual voter nominated more than one of the eventual finalists.

What’s the maximum influence of slate voting, then? Well, clearly, you can’t have more five-member slates than the vote total for the last of the finalists, so a maximum of 54. We can round out the rest of the vote totals by adding partial “slates,” due, say, to imperfect coordination of votes. That would give you a bloc of 20 people who voted for 4/5 finalists, then 8 with 3/5, 11 with 2/5, and 27 who only voted for Little Brother. So we can say that, in 2009, “slate voting” could involve at most 54/633 ballots, or about 8.5% of the nominators.
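The bookkeeping here is simple enough to sketch in a few lines of Python, using the 2009 Best Novel totals quoted above:

```python
# 2009 Best Novel finalists' nomination totals, most votes to fewest.
votes = [120, 93, 82, 74, 54]  # Little Brother ... Zoe's Tale
total_ballots = 633

# No hypothetical "slate" covering all five finalists can be larger
# than the total for the least-nominated finalist.
full_slates = min(votes)

# Round out the remaining votes with partial slates: the bloc that
# nominated exactly k of the 5 finalists is the gap between adjacent
# vote totals, working up from the bottom of the list.
blocs = {}
prev = 0
for k, v in zip(range(len(votes), 0, -1), sorted(votes)):
    blocs[k] = v - prev
    prev = v

print(full_slates)                        # 54 possible full slates
print(blocs)                              # {5: 54, 4: 20, 3: 8, 2: 11, 1: 27}
print(100 * full_slates / total_ballots)  # about 8.5% of nominators
```

The blocs sum to 120, the top vote-getter’s total, which is the sanity check: every vote for a finalist is accounted for by exactly one bloc.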

How typical was 2009, anyway? Well, I picked it because it stands out in my mind as a year where the final result was skewed by the personal popularity of one of the finalists (that is, I thought Anathem was head and shoulders better than the field, but The Graveyard Book won because Neil Gaiman). I also did the same thing for the last three years’ worth of stats. The following list shows the year, the size of the biggest “slate” you could make including all of the finalists, and the total number of ballots:

  • 2012: 71/958 ballots
  • 2013: 118/1113 ballots
  • 2014: 98/1595 ballots

Together with the 2009 results, that’s an average maximum “slate” contribution of about 8.2% of the total nominations for the Best Novel category. So, yeah, a bloc of a couple hundred people all voting exactly the same way would stand out really clearly.

The other question you might ask would be whether Best Novel, the most heavily nominated category, is somehow anomalous. I pulled the same numbers for Best Short Story, and got the following:

  • 2009: 31/448 ballots
  • 2012: 36/611 ballots
  • 2013: 34/662 ballots
  • 2014: 43/865 ballots

(Note that in both 2013 and 2014, there were fewer than 5 finalists because of the “5% rule” that any finalist must get at least 5% of the total nominations. If you wanted to include a “full slate” of 5 that would encompass the unsuccessful next nominee or two, the 2013 total is 28, and the 2014 total is 38.)

“Best Short Story” is generally the most scattered category, as you would expect from the failure to produce a full slate of finalists the last two years, so the “slate” contribution here is smaller, just barely over the 5% threshold– about 5.7% on average for the raw figures, or 5.4% if you look at the “full slate” cases including the would-be finalists excluded by the 5% rule.
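The same averaging for Best Short Story, using both the raw last-finalist totals and the “full slate” totals from the parenthetical above:

```python
# Best Short Story, by year: (max "slate" from the raw finalist totals,
# max "full slate" of 5 including would-be finalists, total ballots).
short_story = {
    2009: (31, 31, 448),
    2012: (36, 36, 611),
    2013: (34, 28, 662),
    2014: (43, 38, 865),
}

raw = [s / n for s, _, n in short_story.values()]
full = [f / n for _, f, n in short_story.values()]
print(round(100 * sum(raw) / len(raw), 1))    # raw average, percent
print(round(100 * sum(full) / len(full), 1))  # "full slate" average, percent
```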

So, that’s about as much information as I think you can get out of what’s readily available. It’ll be interesting to see how this compares to the analogous numbers this year, and also whether there’s a big drop from the last finalist in the Puppy-dominated categories to the best of the rest– some numbers I’ve seen suggest the Puppy contingent was around a couple hundred nominators, which wouldn’t be all that large compared to the typical Best Novel nomination pool, but would blow away the typical Best Short Story pool by a huge margin.

And, you know, having an excuse to play with numbers is an infinitesimal bright spot to go with this giant pile of awful.

2 comments

  1. Those numbers are already available for this year: https://www.flickr.com/photos/coalescent/17038229312/

    The biggest slate you could make including all the novels is 256, and including all the short stories it’s 151. That’s a very substantial increase from the 2009 figures you quoted. On the other hand, there were more than three times as many ballots, 2,122 (and that increase isn’t just slate voters; the pool has been growing steadily for years, although this year is a bit higher than the trend would predict).

  2. If you want to try to identify slate voters from those numbers, I would guess that the right number is probably more like the 145 for the Novella category, as I doubt there were a lot of non-slate voters going for whichever John C. Wright was last on the list…
