To Automate or Not To Automate?

The Female Science Professor has a nice post about high- and low-tech data acquisition:

An MS student has repeatedly questioned why he/she has to use a low-tech method to acquire, somewhat tediously, some data that could be acquired more rapidly with a higher-tech method. I say ‘more rapidly’ because the actual acquisition time once the machine is on and ready for analysis can be fairly rapid, but this technique becomes much less rapid when the substantial (and tedious) preparation time is considered. In any case, with the low-tech method, you can get data any time you want, and the amount of data one gets is limited only by your time.

Without knowing more about her research field, it’s sort of tough to develop a specific opinion, but this is a pretty universal question. When I was at NIST, I had several lunchtime conversations with a guy down the hall who maintained that it was always a good idea to spend a day or two automating everything in the data collection system when you first got an experimental signal. I was generally more of the “Woo-hoo! Data!” school, and tended to just plunge ahead using tedious, non-automated methods until that became completely intolerable.

In one experiment, this took the form of the world’s most expensive laser stabilization system: we set up a spectrum analyzer to measure the frequency of the laser we were trying to control, and then somebody stood next to the laser control box and tweaked the frequency if it started to drift too much. We referred to it as the “biological lock,” and as somebody else in the lab pointed out, we were using an NRC post-doc to do the job of a fifteen-dollar box of electronics. It saved us a few days of building and debugging an actual lock circuit, though, and this was supposed to be a one-afternoon experiment (which eventually took three months, but, hey, we were getting data all that time…).
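Just to make concrete what the fifteen-dollar box (or the post-doc) was actually doing, here’s a minimal Python sketch of that kind of lock loop. To be clear, this isn’t what we built; read_frequency() and adjust_laser() are hypothetical stand-ins for the spectrum analyzer and laser control box, and all the numbers are made up:

```python
# Illustrative sketch only: a crude software version of the "biological lock."
# read_frequency() and adjust_laser() are placeholders, not any real instrument API.
import time

TARGET_HZ = 100e6      # desired beat-note frequency (made-up value)
TOLERANCE_HZ = 50e3    # how far we let it drift before correcting
GAIN = 0.5             # fraction of the error to correct each pass

def read_frequency():
    """Placeholder for querying the spectrum analyzer."""
    raise NotImplementedError

def adjust_laser(correction_hz):
    """Placeholder for tweaking the laser control box."""
    raise NotImplementedError

def biological_lock():
    """Watch the frequency and nudge the laser when it drifts too far --
    exactly what the post-doc was doing by hand."""
    while True:
        error = read_frequency() - TARGET_HZ
        if abs(error) > TOLERANCE_HZ:
            adjust_laser(-GAIN * error)
        time.sleep(1.0)  # check about once a second
```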

The advantage of automation, of course, is that it allows you to easily collect vast amounts of data. At NIST, we had a couple of three-ring binders full of graphs of data that we took using an automated system (we set up a LabView program to scan the laser frequency over a wide range and record the signals we were looking for). We never did explain the phenomenon we were investigating in one of those sets of experiments (the theory turned out to be extremely difficult), but we were able to exhaustively explore variations of all the parameters, in a way that wouldn’t have been possible if we hadn’t automated the data collection.
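For the non-lab-rats, here’s roughly what that kind of scan-and-record automation amounts to, sketched in Python rather than LabView. The set_laser_frequency() and read_signal() functions are hypothetical placeholders for the real instrument calls, not anything from the actual program:

```python
# A rough sketch of a scan-and-record loop: step the laser across a frequency
# range, log (frequency, signal) pairs to a file. Purely illustrative.
import csv
import numpy as np

def set_laser_frequency(freq_hz):
    """Placeholder for commanding the laser to a given frequency."""
    raise NotImplementedError

def read_signal():
    """Placeholder for reading the detector signal."""
    raise NotImplementedError

def scan_and_record(start_hz, stop_hz, n_points, outfile="scan.csv"):
    """Sweep the laser over [start_hz, stop_hz] and record the signal at each step."""
    with open(outfile, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["frequency_hz", "signal"])
        for freq in np.linspace(start_hz, stop_hz, n_points):
            set_laser_frequency(freq)
            writer.writerow([freq, read_signal()])
```

Once something like this runs unattended, filling a few three-ring binders with parameter scans is mostly a matter of leaving it on overnight.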

On the other hand, non-automated methods have their advantages as well. The showpiece graph for the first paper I was an author on (visible on this page that I made back in 1997) came from the very first day we got a useful signal, when the other grad student on the project and I sat in the lab watching the LED display on a digital counter and writing numbers down on paper. We got tons of similar graphs later on, after we automated the system, and in some respects the experimental conditions for those later runs were better, but in the end the data taken by hand made for a cleaner graph: we were able to exercise some judgement about anomalous points while they were being acquired, whereas the automated system just took down everything, and produced noisier data.
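If you wanted an automated system to approximate some of that judgement, one crude approach would be to flag points that sit far from their immediate neighbors and look them over afterwards. This is purely illustrative (it’s not what we did, and the window and threshold are arbitrary):

```python
# Illustrative only: flag points that deviate from the local median of their
# neighbors by more than a few "sigma" -- a rough stand-in for the judgement
# a human applies while taking data by hand.
import numpy as np

def flag_outliers(values, window=5, n_sigma=3.0):
    """Return a boolean mask marking points far from their local neighborhood."""
    values = np.asarray(values, dtype=float)
    mask = np.zeros(len(values), dtype=bool)
    for i in range(len(values)):
        lo, hi = max(0, i - window), min(len(values), i + window + 1)
        neighbors = np.delete(values[lo:hi], i - lo)  # exclude the point itself
        if neighbors.size == 0:
            continue
        center = np.median(neighbors)
        spread = np.std(neighbors) or 1e-12  # avoid divide-by-zero
        mask[i] = abs(values[i] - center) > n_sigma * spread
    return mask
```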

It’s kind of a tough call, though, and really comes down to a trade-off between preparation time and acquisition time. On balance, I suspect my colleague from NIST was right: spending time automating things at the beginning of an experiment is probably a net win in terms of the total amount of time spent on data collection. Psychologically, though, the concentrated nature of the up-front prep work seems much more unpleasant than the more spread-out time spent taking data, which means it can be awfully attractive to just plunge ahead into data collection even though some more prep time would pay dividends in the end.