Active Engagement Works: “Improved Learning in a Large-Enrollment Physics Class”

Physics is a notoriously difficult and unpopular subject, which is probably why there is a large and active Physics Education Research community within physics departments in the US. This normally generates a lot of material in the Physical Review Special Topics journal, but last week, a PER paper appeared in Science, which is unusual enough to deserve the ResearchBlogging treatment.

OK, what’s this paper about? Well, with the exceptional originality that physicists bring to all things, the title pretty much says it all. They demonstrated that a different style of teaching applied to a large lecture class produced better attendance, more student engagement, and better learning as compared to a control section of the same course taught at the same time.

So, they showed that there are better methods than the traditional lecture. Haven’t we known that for decades? How does that get into Science? Well, this is an exceptionally clean test, with all the sorts of controls you would want for good science. They took two sections of a huge introductory class, about 270 students each, and for a one-week period, they had one section taught by the regular professor (a highly regarded lecturer) and one section taught by a post-doc trained in a new teaching method. They covered the same material, using many of the same in-class examples and “clicker” questions, and at the end of the week gave both sections a short exam on the material just covered.

And the results were impressive? Very. The students from the experimental section got an average score of 74% on the test, compared to 41% for the control section. The two distributions were really dramatically different:

[Figure: distributions of test scores for the control and experimental sections]

Yeah, that’s a pretty dramatic difference. Are you sure they got the same test? It says that they did– I’m not in British Columbia, so I can’t confirm it. Interestingly, it notes that the experimental section only covered 11 of the 12 topics on the test due to time constraints, so they were starting with a slight handicap.

So what’s the brilliant new method? Basically, making the class more participatory.

The experimental method asks the students to do most of the fact-based learning outside of class– reading some explanatory material whose nature is unclear– and spends the in-class time answering questions in small groups. They’re posed a question, given a couple of minutes to discuss it with their partners and submit answers via the clicker system, then given some feedback on the answers. At intervals, they’re given more involved “group tasks,” asking them to figure out something more complicated. There are also some demonstrations, the nature of which was unclear.

You used the word “unclear” twice. Unclear, how? Well, in annoying glamour-journal fashion, they push most of the good stuff off into the “supplementary online material.” This includes the twelve test questions and all the clicker questions and group tasks. It doesn’t explain what demonstrations were done (though a demonstration is mentioned in the text), nor does it identify what they were supposed to read. It says “students were assigned a three- or four-page reading, and they completed a short true-false online quiz on the reading,” but doesn’t say whether that was reading from a textbook or something written especially for the class. I suspect it was a textbook section or so for each class, but there’s no way to tell.

So, all of these gains come from doing problems in class? From having the students do problems in class, apparently. Though they’re not all that involved, as problems go– the questions were all multiple-choice, as you would expect for a 270-student class.

Can you give an example? Sure. The material covered was on Maxwell’s equations and electromagnetic waves, and one of the clicker questions was:

Which of the following is true?
a) For EM waves to exist, they must propagate in a medium with atoms. With no
atoms present, the field cannot have any effect on the system and therefore can’t
exist.
b) An EM wave can propagate through a vacuum.
c) An EM wave is like a wave travelling along a rope in that it needs atoms to move
up and down.
d) An EM wave can only propagate in a vacuum since any medium would get in the
way of its propagation.
e) More than one of the above is true.

They would get this, discuss with a partner for a few minutes, enter their answers, and then the instructor would give feedback to the class.

OK, so that’s a clicker question. What’s a group task look like? From the same lecture, we have:

A friend of yours reminds you that an EM wave consists of both an E and B field.
She asks you if the following electric field
E(x,t)=100x²t Volts/m
could be that of an EM wave. Can you help? Be quantitative in your answer.

That’s a little subtle. That’s the idea. They would get a longer time to work these out, with some feedback along the way.
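
For anyone who wants to see where the subtlety lies, here is a minimal sketch of one way to check the proposed field. It assumes the intended test is the one-dimensional vacuum wave equation (which follows from Maxwell’s equations) and treats E as a single field component depending only on x and t. Both are my assumptions, not something taken from the paper’s supplementary material.

```latex
% Vacuum wave equation for one field component E(x,t):
\frac{\partial^2 E}{\partial x^2} = \frac{1}{c^2}\,\frac{\partial^2 E}{\partial t^2}

% Proposed field: E(x,t) = 100\,x^2 t
\frac{\partial^2 E}{\partial x^2} = 200\,t,
\qquad
\frac{\partial^2 E}{\partial t^2} = 0

% The two sides agree only if 200\,t = 0, that is, only at the instant t = 0.
```

Under those assumptions the field fails the wave equation at every time except t = 0, so it could not be the electric field of an EM wave in vacuum.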

So, the whole class is just this stuff? That’s the idea, yes. The control group was a more traditional sort of lecture, though it used “clicker questions” as well.

It seems kind of surprising that this would lead to such a big improvement. Are you sure that this isn’t some other effect? A more enthusiastic instructor, or some such? While the instructors for the experimental section were authors on the paper, and thus presumably deeply committed to the project, they’re also post-docs with minimal prior teaching experience. While it’s hard to rule out an instructor effect, it’s unlikely to be just that.

Could this just be an effect of novelty? You know, anything you do to liven the class up improves performance? The technical name for the “any change you make improves things” phenomenon is the “Hawthorne Effect,” after an experiment around 1930. The authors vehemently deny that this is what’s going on, citing a number of sources claiming the effect doesn’t really exist (including this old paper from Science, which is several kinds of appalling). Interestingly, though, they sort of implicitly claim a Hawthorne-ish effect to explain one of their results, namely the vastly improved attendance in the experimental section, which rose from 57% the weeks before the trial to 75% during the experiment. They suggest that the novelty of the trial got students to come see what was happening, and the new methods got them to stay. Make of that what you will.

So, what are the limitations of this? Well, basically, it was a one-week test of a new and different method of teaching, followed immediately by a short test that was essentially identical to some of the in-class material used during the experiment. It doesn’t tell you whether the effects would hold up for a full semester (though previous studies have compared entire courses taught with new methods, and suggest substantial gains), or how well the material would be retained. It would’ve been interesting, for example, to see if the experimental section scored substantially better on final exam questions covering the material from this part of the course.

That’s a good point. I wonder why they didn’t do that? Probably because it would’ve complicated what was otherwise a very clean test. Also, I suspect either logistical (sorting out the relevant questions from the final) or ethical (there might be trouble getting permission to use student test scores as part of a research paper, unlike using a voluntary separate assessment that did not affect the final grade) issues may have come into play. They don’t mention it at all, though.

Anyway, it looks pretty impressive. Are you going to implement this? It’s a fairly compelling argument in favor of their methods, but I’m not sure. The “get them to read the book ahead of time” thing is problematic at best, and our more compressed schedule makes it harder to do more time-intensive methods of instruction (their experiment ran in week 12 of a semester; our entire course has to fit a 10-week trimester). Also, there’s the problem of being the one person trying a new technique (I got killed on my student evaluations last term because I did a couple of things differently than the colleagues teaching the other sections), and the fact that the supplementary material includes the sentence: “We estimate that under normal circumstances a moderately experienced instructor would require about 5hrs of preparation time per one hour class in this format.” That’s a little daunting.

It’s definitely something I’ll think about for the fall, though, as I’ll be the only person teaching intro mechanics that term, giving me a little more flexibility in terms of how I run the course.

Deslauriers, L., Schelew, E., & Wieman, C. (2011). Improved Learning in a Large-Enrollment Physics Class. Science, 332(6031), 862-864. DOI: 10.1126/science.1201783

18 comments

becca says:

May 16, 2011 at 11:04 am

As far as evaluations go- it may be worth remembering that the majority of the students said they wished the whole course had been delivered like that.

As a random aside… the traditionally taught students produced a lovely bell curve of results on the multiple choice test (that was probably developed to give such a lovely bell curve in such students).
If you taught everyone with the new method, and you wanted to maintain a bell curve result, you’d have to change the test (in most of the 260 person classes I’ve taken, there is a general attitude that a test producing bell-shaped curves is more valid than others, so that valid statistical assumptions are easier, I presume)
So it’s important to remember that the students probably won’t notice they are learning more with this method if they are all exposed to it and you adjust the exam as well.
ADD says:

May 16, 2011 at 11:46 am

Wow, 57% attendance is pretty awful. I went to a big state school as an undergrad, but I don’t recall attendance being so low in my large introductory physics classes. Did the paper mention what fraction showed up for the traditional lecture during the trial?
Chad Orzel says:

May 16, 2011 at 12:10 pm

So it’s important to remember that the students probably won’t notice they are learning more with this method if they are all exposed to it and you adjust the exam as well.

Yeah.
That’s why I didn’t mention the student opinion questions in the write-up. I tend to discount those results because of the novelty factor– the one week where they did something radically different was probably the most interesting week of the term precisely because they did something radically different. If the whole term was taught that way, though, the novelty factor wouldn’t be there, and they might find just as many things to complain about as in the traditional format.

Did the paper mention what fraction showed up for the traditional lecture during the trial?

Control section attendance was 55% pre-experiment and 53% during. The table caption says this is the “Average value of multiple measurements carried out in a 2-week interval before the experiment.” There’s a comment somewhere in the text that suggests there may have been students going in and out to some degree (showing up to turn in/ collect homework, then leaving, or rolling in significantly late, etc.).
ScentOfViolets says:

May 16, 2011 at 12:55 pm

The “get them to read the book ahead of time” thing is problematic at best, and our more compressed schedule makes it harder to do more time-intensive methods of instruction (their experiment ran in week 12 of a semester; our entire course has to fit a 10-week trimester).

You mean that students will do better on a standardized test if they actually (like I imagine every teacher tells them to do at the beginning of the semester) read the material before coming to class?

Say it ain’t so, Joe!

I don’t think you can make a comparison between traditional and “new” methods until you isolate this very important variable and the extent of its effect first. Unless possibly you could make the case that the nontraditional method in and of itself encourages proper behaviour.
Wilson says:

May 16, 2011 at 1:52 pm

Regarding both attendance levels and reading the material…
--bill says:

May 16, 2011 at 2:04 pm

Was the test at the end of the week part of their course grade? If not, that might explain much.

It sounds like the difference between the two methods is the amount of time students spent outside of class on the material. In large lecture classes, I usually took notes and then ignored the material until it was time to study for the test or do homework. Here it sounds like the students were evaluated on their readings *every day* (the short on-line quiz) and then had to perform in class. The daily assessment might be the important part, rather than the in-class work.

In my own teaching, I’ve found that the more often I quiz, the better students do. Being graded seems to be the strongest student motivator.
Sherri says:

May 16, 2011 at 2:27 pm

Is there a way to avoid 250+ student classes anymore? (She asks as a parent of a high school sophomore looking at colleges…)
Scott Long says:

May 16, 2011 at 5:34 pm

This is tangential, but to simply improve class attendance, I found that having unannounced quizzes on the previous day’s material worked wonders. My attendance in chemistry survey and organic chemistry was usually around 90% or better, much better than the 50% attendance otherwise seen. I also liked to include occasional challenging questions in my “lectures,” but unfortunately my students almost invariably got annoyed and frustrated in those situations.
thomas says:

May 16, 2011 at 6:51 pm

We’ve tried clickers in a second-year statistics class, without the rest of the package, and got similarly impressive increases in attendance, and some improvement in performance on the final exam. We didn’t find any clear benefit (or harm) from the pre-reading and group tasks, in a different class, although the students liked it. We only had historical controls, not parallel controls, though.

In our context we worried not so much about a Hawthorne effect as a Red Queen effect. The students liked the clickers, which may well explain the increase in attendance, and the increase in attendance could easily explain the exam differences. The effect may wear off as clickers become familiar and boring. A related possibility is an arms race: we’re competing against other courses for student attention, and the novelty makes students spend more of their effort on this course. If everyone did it, the benefits might go away. The duration of the benefit is important because reorganising a course this way takes quite a lot of work, which you’d want to amortize over several years.

@ScentOfViolets: Unless possibly you could make the case that the nontraditional method in and of itself encourages proper behaviour. Well, yes. That’s exactly the case they are trying to make. The argument is that not presenting material in lectures encourages reading it, and that doing problems in lectures encourages concentration on the problems, and that these are both important behaviours. Having the material explained in lectures and having students read the material in advance would presumably be even better, if you had a strategy that could make it happen.
CCPhysicist says:

May 16, 2011 at 9:53 pm

Of course attendance and reading have a significant effect, and one benefit of innovative approaches (I classify this as one of many forms of active engagement) is an improvement in both. However, reading the book only works if the book is actually meant to be read. That is why I also wonder what text those students were reading.

To Sherri @7:
You can avoid those colleges by asking that question. You might, for example, be able to use on-line registration tools to see what the class sizes are for relevant courses AND who is teaching them.
Lord says:

May 16, 2011 at 10:24 pm

It should increase both reading the material and working on problems, which many students ignore; they don’t realize they don’t understand something until confronted with not knowing how to apply it. The groups could help in both directions: weaker students pick up how to use the concepts, and stronger students learn by having to explain themselves.
Chad Orzel says:

May 17, 2011 at 10:00 am

Was the test at the end of the week part of their course grade? If not, that might explain much.

No. The test whose scores are plotted in the graph above was part of the study, but not included in the final average. At least, that’s how I read it.

Is there a way to avoid 250+ student classes anymore?

At the risk of sounding self-serving: Small liberal arts colleges. It doesn’t completely eliminate large lectures– I know of a few classes that fill a big auditorium for the weekly lectures– but most of our classes are smaller, and in Physics, our intro course is taught in sections that are capped at 18 (but lots of sections).
Joe Barsugli says:

May 17, 2011 at 12:37 pm

Chemistry has had http://www.pogil.org/ since the 1990’s.

I know someone who implemented POGIL in community college organic chemistry. According to him, the students did amazingly well on the ACS national test and it was a pleasure to teach because the students were engaged in learning in the classroom. It’s not simply a matter of adding clickers and a few group exercises — there is a lot more to making the technique effective, hence the term “guided inquiry”. But at the time, the big 4-year Tier I research University refused to even try this teaching method…..
Drivebyposter says:

May 18, 2011 at 2:06 am

I think I would kill to have more classes that involve actual engagement like this.

Is there a way to avoid 250+ student classes anymore? (She asks as a parent of a high school sophomore looking at colleges…)

I second Prof. Orzel’s recommendation of small liberal arts colleges. I think my largest class size so far has been 30 or so with the average close to 20. I just finished one that had 9 or 10 students in it.
Melissa Holcomb says:

May 18, 2011 at 8:19 am

I chortled when I read the description of the “new” teaching approach. I’ve taught the didactic component of a junior-level course in the discipline I teach (nursing) using problem-based learning since 2007. Yes, the students hated it. Yes, it means they have to read the textbook. Yes, my evaluations were kinda sucky there for a while.

On the other hand– mean scores on a nationally standardized test of the content I teach have gone up every semester and are consistently higher for my content area than for the other content areas tested on the same cohorts of students in the same semesters.

It did require a significant investment of time from me to adapt all my lectures to case studies that would incorporate the same content, but it’s been worthwhile. My textbook is going to a new edition this year and we’ve started enrolling slightly larger cohorts, which means I’ll have to revise everything about the course this summer. That’s the job, right?
Robert Evans says:

July 9, 2011 at 2:13 pm

Did you notice how much multi-tasking occurs during the course of their new methodology? It has been demonstrated that multi-tasking has long-term negative consequences, so what are the potential long-term negative consequences of this new study?

The control group was crap. An appropriate control group would have been to take a class taught in the new method from the beginning (which they say they have at this school) and convert it to standard lecture format in parallel with the experimental conversion of a normally taught class to the new methodology.

No comments were made on students who dropped out of or did poorly in the new-method section, and whether these students would have likely dropped or done poorly in conventional lecture courses. Given pre-experiment and post-experiment quizzes, as well as the existence of entire courses taught with the new methodology, there’s no excuse for this lack. These studies need to be done before studies like the current one are lauded. The unforeseen negative effects of these avant-garde teaching (I almost wrote “learning”) strategies need to be discovered and addressed.

I submitted an e-letter on this paper to Science, which declined it. The letter covers even more points (such as the study they reference to dismiss the Hawthorne effect explicitly stating it is not applicable to the kind of study performed here). They never responded to my points.

I speak as a person who has been an undergraduate for about 9 years (cumulative) over the last 16 years. I speak as a person who prefers minimal group studying and maximal individual or one-on-one studying (and prefers the use of resource centers where such one-on-one studying can take place over any group-based study). I hate group-based study, yet it’s been made all the rage in the last twenty years. Please don’t continue to mess people like me over in the name of what’s good for the majority. There is plenty of room for the educational diversity that can address everyone’s needs, but too often the new-wave makes a clean sweep of everything (Charles Eliot, papers instead of textbooks, PLTL to the exclusion of tutoring, etc…).

Robert Evans
In Hell's Kitchen (NYC) says:

August 10, 2011 at 4:04 pm

Is it possible that the “experimental” section taught the test?
A biology prof says:

August 19, 2011 at 3:08 pm

At my University, the Administration is highly supportive of these kinds of approaches, and they seem to be very aware that one result of the implementation will be negative student evaluations. They have let us know in advance that they are expecting negative student evaluations, and that the retention/tenure/promotion process is not going to punish faculty using engagement methods for these evaluations. I think administrators wanting to encourage these innovations NEED to be aware of this if they want faculty to try more active approaches.
