November 13, 2012

36-402, Spring 2012: Self-Assessment and Lessons Learned (Advanced Data Analysis from an Elementary Point of View)

Attention conservation notice: Complacent navel-gazing about teaching statistics to undergraduates at a very un-representative university. Written back in May, not posted due to laziness.

Overall, I think this went tolerably but not perfectly. The content was improved and is near (but not at) a local optimum. I've also got a better sense than last year of how much they're learning, and I think the homework is helping them learn. But the course could definitely do more to move them towards doing data analysis, as opposed to answering homework questions.

It was a substantially bigger class than last time (88 students vs. 63), and this led to some real issues. The lecture room was fine, but office hours were over-crowded, the stream of e-mail from students seemed unending, and, worst of all, there simply wasn't enough TA time available for grading. (88 weekly assignments of serious data analysis is a lot to grade.) Everything did get graded eventually, but not as fast as it should have been. From the number of students registered for the fall course in regression which is ADA's pre-requisite, I can expect about as many again in 2013, or a few fewer. This is about twice as many students as I'd like to see in it.

Speaking of grading, I had to completely re-do the assignments, since solutions from last year were circulating[1] --- and I did not put the new solutions on-line. This was a big chunk of work I was not anticipating, but at least now I have two sets of problems for the future.

In terms of actual content, I cut nothing major, but, by eliminating exam review sessions, etc., squeezed in some extra stuff: a reminder lecture on multivariate distributions, smooth tests and relative densities, time series, and more on estimating causal effects. I am relatively happy with the over-all content, though less so with the organization. In particular I wonder about devoting the very first part of the course to regression with one input variable, as a really elementary way to introduce fundamental concepts (smoothing, kernels, splines, generalization, cross-validation, bootstrap, specification checking), and only then go to regression with multiple inputs. I worry, however, that this would be coddling them.

The "quality control samples" were eye-opening. Some students found them nerve-wracking, which was counterproductive; but lots of them also gave every sign of being candid. What I found most surprising was how many of them (a third? more?) were taking ADA because they had been funneled into it by majors other than statistics, often one of CMU's several larval-quant programs. (I'm not sure how those took my plugging Red Plenty in class while explaining Lagrange multipliers.) Their self-de-motivating question was not "Is this going to be on the exam?", but rather, "Is this going to come up in interviews?". If I were a better teacher I'd work more at reaching these students.

Week-to-week, the quality control samples were very helpful for figuring out what students actually grasped, and, even more, how their thinking went astray (when it did). In particular, there turned out to be much more variation than I'd anticipated in readiness to apply material they had already learned in a new context — the "but that's from chapter 7 and this is chapter 12" problem, or even the "but that's from introduction to statistical inference, or matrices and linear transformations, which I took a year ago" problem. I am not sure if I should provide more cues, or if, again, that would be coddling.

These facts about motivation and preparation don't please me, exactly, but at least for right now these seem like constraints I have to (and can) work within.

I will certainly use the quality-control samples again in 2013, but I may take fewer each week, and make it clearer to the students that they're not being graded.

Some specifics:

Finally, if students again complain that the class makes it harder for them to take seriously what they hear in economics or psychology or the like, remember not to laugh maniacally, or use expressions like "Victory is mine!" Rather, maintain a sober, thoughtful, benevolent, and above all sane expression, and talk solemnly about the great prospects for using good statistical methods to solve real-world problems.

[1]: One advantage to putting bad jokes in your solutions is that it becomes pretty obvious when someone is parroting them back to you.

Posted at November 13, 2012 18:42

Three-Toed Sloth