The Bactra Review: Occasional and eclectic book reviews by Cosma Shalizi 137

Relative Distribution Methods in the Social Sciences

by Mark S. Handcock and Martina Morris

Statistics for Social Science and Public Policy series

Springer-Verlag, 1999

Beyond Mean and Deviance

This is a simple, yet fundamentally brilliant, trick for comparing the whole distribution of some trait across groups, or across time. Take one population as the reference group. Then, for each individual in the other population, ask "at what quantile in the reference group's distribution would this individual fall?" If the two distributions are the same, then this "relative data" will be uniformly distributed; and if not, not. More than that, the relative distribution's departures from uniformity tell us how the populations differ.
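To make the construction concrete, here is a minimal sketch in Python of computing relative data and checking it against uniformity. It is an illustration only, not the authors' software: the function names `relative_data` and `decile_shares`, the simulated Gaussian samples, and the choice of ten bins are all assumptions made for the example.

```python
from bisect import bisect_right
import random


def relative_data(reference, comparison):
    """Map each comparison value to its quantile in the reference sample.

    The quantile is the empirical CDF of the reference sample evaluated at
    the comparison value: the fraction of reference values <= that value.
    """
    ref_sorted = sorted(reference)
    n = len(ref_sorted)
    return [bisect_right(ref_sorted, y) / n for y in comparison]


def decile_shares(rel):
    """Share of the relative data falling in each tenth of [0, 1]."""
    counts = [0] * 10
    for r in rel:
        # r == 1.0 would index bin 10, so clamp it into the last bin.
        counts[min(int(r * 10), 9)] += 1
    return [c / len(rel) for c in counts]


# Simulated samples (hypothetical data, purely for illustration).
rng = random.Random(0)
reference = [rng.gauss(0.0, 1.0) for _ in range(5000)]
same = [rng.gauss(0.0, 1.0) for _ in range(5000)]     # same distribution
shifted = [rng.gauss(0.5, 1.0) for _ in range(5000)]  # shifted upward

# Roughly 0.1 in every decile when the two distributions agree.
shares_same = decile_shares(relative_data(reference, same))
# Mass piles up in the upper deciles when the comparison group is shifted up.
shares_shifted = decile_shares(relative_data(reference, shifted))
```

When the two samples come from the same distribution, each decile share should sit near 0.1; for the shifted sample, the shares rise toward the top deciles, which is the kind of departure from uniformity that shows how the populations differ.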

This is vastly more informative than the usual routine of just looking at means (or medians) and variances. The graphical displays it leads to are actually illuminating. Also, you can still try to account for associations with covariates, and there are some very natural non-parametric ways to do so; these lead to assessments of the importance of covariates in information-theoretic terms (conditional relative entropies). You can also do proper statistical inference on a whole range of summary measures, going far beyond the usual deal of just looking at shifts in location or (for the really daring) the Gini coefficient. There are some situations where the data are so limited, or so bad, that relative distributions become unreliable, and old-fashioned mean-variance comparisons will be better than nothing, but in general relative distributions are far more informative. (In fact, they are sufficient statistics for comparison, without assuming Gaussian distributions.)
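As a rough illustration of an information-theoretic summary, the sketch below computes a crude plug-in estimate of the relative entropy (Kullback-Leibler divergence) between the binned relative distribution and the uniform distribution. This is an assumption-laden toy, not the estimator or the conditional, covariate-adjusted analysis from the book: the function name `relative_entropy_from_uniform`, the histogram binning, and the simulated data are all hypothetical.

```python
from bisect import bisect_right
from math import log
import random


def relative_entropy_from_uniform(reference, comparison, bins=10):
    """Plug-in KL divergence of the binned relative data from uniform.

    Returns sum_i p_i * log(p_i * bins), where p_i is the share of the
    relative data in bin i. The value is 0 when the shares are exactly
    uniform and grows as the two distributions diverge.
    """
    ref_sorted = sorted(reference)
    n_ref = len(ref_sorted)
    counts = [0] * bins
    for y in comparison:
        # Quantile of y in the reference sample (empirical CDF).
        r = bisect_right(ref_sorted, y) / n_ref
        counts[min(int(r * bins), bins - 1)] += 1
    total = len(comparison)
    kl = 0.0
    for c in counts:
        if c > 0:  # empty bins contribute nothing (0 * log 0 = 0)
            p = c / total
            kl += p * log(p * bins)
    return kl


# Simulated samples (hypothetical data, purely for illustration).
rng = random.Random(1)
reference = [rng.gauss(0.0, 1.0) for _ in range(5000)]
same = [rng.gauss(0.0, 1.0) for _ in range(5000)]
shifted = [rng.gauss(0.5, 1.0) for _ in range(5000)]

kl_same = relative_entropy_from_uniform(reference, same)
kl_shifted = relative_entropy_from_uniform(reference, shifted)
```

The estimate for the identically distributed sample should be close to zero, and noticeably larger for the shifted one. A histogram plug-in like this is biased upward in small samples and depends on the number of bins, so treat it as a way to see the idea rather than as a method for real inference.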

Handcock and Morris explain all this starting from the very basics, including a refresher on distribution functions and densities, and illustrate it with case studies drawn from their work on changing patterns of American income and work. Software (in R, of course) is available from the authors' site for the book, which also has errata.

I read (and referee) too many social scientists and biologists who seem to think that "statistical comparisons" means "t-tests and the analysis of variance". I have taught classes where those were the only systematic methods of comparison. There has been no excuse for any of this since this book was published.

Disclaimer: I know Handcock and Morris slightly, at the reciprocal-talk-invitations level; also, Morris and I are both on the Santa Fe Institute external faculty.

265 pp., many graphs, bibliography, index

Economics / Probability and Statistics / Sociology

Currently in print as a hardback, US$98.95, ISBN 0387987789

10 August 2007