Thoughts on Gender-Blind Reviews in Astronomy

There’s been a good deal of buzz about the results of a recently completed round of proposals for Hubble telescope observations, which used a double-blind process where the reviewers got science proposals without information that would identify who made the proposal. The headline result was that the approval rate for proposals with female PI’s (8.7%) was essentially equal to that for proposals with male PI’s (8.0%). This is in stark contrast to the previous round, where those fractions were 13% and 24%, respectively. This is a striking enough change that even political pundits took note.

The basics of this have been well covered by people actually in astronomy, but here are some miscellaneous thoughts as an outsider to the process:

— First, the obligatory note that these acceptance rates are sucktastic– parity at an 8% funding rate is only like a cheer-and-a-half scenario. There was something weird about the timing of this round, though, and the number of submitted proposals was way higher than in the previous one, so maybe that’s a one-time downward fluctuation.

— Second, while the acceptance rates were equal, the raw numbers were still very unequal– 12 approved proposals with female PI’s vs. 28 with male PI’s. This reflects a big difference in the number of submissions, which in turn reflects the demographics of the field as a whole.

— An aspect of this that I haven’t seen discussed is this bit from the description of what they did:

In response to the key concern raised by the community, adding a (non-anonymized) “Team Expertise and Background” section, to be made available after the final ranking, to allow the TAC to assess whether the proposal team would be capable of executing the proposed investigation. If there were clear, sufficient deficiencies, the panel could flag a proposal for disqualification; panels would not have the option of recommending replacement proposals.

I find this aspect interesting, for reasons I’ll get into in the next point, but there isn’t any discussion in the presentation of results about whether it came into play at all. That would be interesting information to have, to see whether there were proposals whose ideas seemed good but were deemed infeasible for the particular PI’s, and whether there were any patterns to those.

(You could easily imagine a gender difference in either direction for background-based decisions. Unblinding might lead a biased committee to rate some female PI’s as less able to carry out the work, or it might reveal some overconfident male PI’s who proposed stuff they really don’t have the background for. I suspect the “no replacement” clause keeps the number of rejections on this basis low, though, possibly even zero.)

— I find the “Expertise and Background” part interesting, because when I think about whether double-blind review could work in my own field, that’s the biggest sticking point. This, I think, reflects a big difference in the way research is done in astronomy versus experimental AMO physics. I suspect expertise and background loom larger in experimental AMO.

From the outside, this looks very much like a “user facility” sort of deal– that is, the proposals in question are requests for acquisition of and access to data from a common instrument. The proposers don’t need to know directly how to work the Hubble, just how to do image processing and analysis with a fairly standard set of tools. In this case, past experience will play a smallish role– it’s probably not trivial for someone with a background in, say, radio astronomy to move to analyzing Hubble images, but someone who has only worked with images from Kitt Peak in the past probably won’t have much of an adjustment.

The corner of physics I work in, on the other hand, deals with unique apparatus– labs are local things, where the PI and students build a system custom-designed to study a particular problem. Because of that, past experience is going to be a bigger factor. I once heard Wolfgang Ketterle asked whether it was difficult to make a Bose-Einstein Condensate, to which he replied (paraphrasing slightly) “That depends on whether you’ve made one before.” There are a lot of tricky bits to the process, and it’s easy to get hung up along the way. Having done it once, though, you’ll know where the biggest pitfalls are, and the second time will go more smoothly.

A proposal for an experiment that calls for making a BEC and hitting it with a pulsed laser, then, will read as much more feasible if the PI has past experience working in labs that have made BECs and used pulsed lasers. Someone who has done neither of those things might have a great idea for an experiment, but it’s not at all clear that they’ll be able to pull it off in a timely manner.

So, when I think about the possibility of double-blind review for experimental AMO physics, the need to assess the past experience of the PI’s is a big issue. This two-stage process seems like a work-around with some promise, though: assess the quality of the scientific idea first, then go back and look at the backgrounds to check for feasibility after you’ve identified the most promising of the ideas. But, again, I’d be curious to see how much this came into play in the astronomy case, where background seems a less significant factor.

— That said, double-blinding seems more immediately applicable to physics fields that already work in a “user facility” sort of mode, such as particle and nuclear and some parts of condensed matter. I don’t know a great deal about how those fields allocate their resources, but I hope people who do take a good look at this, and strongly consider implementing something similar.

— I’ve wondered in the past about the possibility of doing gender-blinded faculty searches, and will file this away as something to use the next time we need to make a case for this kind of thing. (Which I hope to God won’t be for a few years yet, because I’ve been on more than enough faculty searches already…)