October 04, 2005

A Thought I Have No Time to Pursue

Start with your favorite large Erdős-Rényi random graph. Color all of the nodes, in such a manner that the number of nodes of a given color follows a strongly skewed distribution, perhaps a power law. (Exponential growth easily gives power-law size distributions.) Now form the aggregated graph, with one node for each color, and an edge between colors if any two disaggregated nodes of those colors are linked. Query: What is the degree distribution of the aggregated graph? (Inspired by thinking, while walking home, about attempts to model the structure of the Internet at the autonomous system level. Why I was doing that, I have no idea.)
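For anyone who does have the time, the construction is easy to simulate. What follows is only a minimal sketch using the Python standard library, not a claim about what the answer looks like. The function name `aggregated_degrees`, every parameter value, and the choice of a Zipf-like rank-size rule for assigning colors are illustrative assumptions; so is the decision to ignore links between two nodes of the same color, which would otherwise become self-loops in the aggregated graph.

```python
import random
from collections import Counter


def aggregated_degrees(n=1000, p=0.005, n_colors=100, alpha=1.5, seed=0):
    """Return {color: (size, degree)} for the color-aggregated graph.

    All names and default values are hypothetical choices for illustration.
    """
    rng = random.Random(seed)

    # Assumption: color k is picked with weight proportional to (k+1)**(-alpha),
    # which gives strongly skewed (Zipf-like) color-class sizes.
    weights = [(k + 1) ** (-alpha) for k in range(n_colors)]
    color = rng.choices(range(n_colors), weights=weights, k=n)

    # Only colors that actually received at least one node become aggregate nodes.
    neighbors = {c: set() for c in set(color)}

    # Erdos-Renyi G(n, p): each unordered pair is linked independently with prob. p.
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                ci, cj = color[i], color[j]
                if ci != cj:  # assumption: drop within-color links (no self-loops)
                    neighbors[ci].add(cj)
                    neighbors[cj].add(ci)

    sizes = Counter(color)
    return {c: (sizes[c], len(neighbors[c])) for c in neighbors}
```

Tabulating `Counter(deg for _, deg in aggregated_degrees().values())` gives the empirical degree distribution of the aggregated graph for one run. The pair loop is quadratic in `n`, so a sparse-graph generator would be the sensible swap for a genuinely large graph.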

Update, later that night: Aaron Clauset writes to point me to this paper:

M. Fayed, P. Krapivsky, J.W. Byers, M. Crovella, D. Finkel and S. Redner, "On the emergence of highly variable distributions in the autonomous system topology", ACM SIGCOMM Computer Communication Review 33 (2003): 41--49 [PDF reprint via Prof. Redner]
Abstract (omitting references): Recent studies observe that vertex degree in the autonomous systems (AS) graph exhibits a highly variable distribution. The most prominent explanatory model for this phenomenon is the Barabasi-Albert (B-A) model. A central feature of the B-A model is preferential connectivity --- meaning that the likelihood a new node in a growing graph will connect to an existing node is proportional to the existing node's degree. In this paper we ask whether a more general explanation than the B-A model, and absent the assumption of preferential connectivity, is consistent with empirical data. We are motivated by two observations: first, AS degree and AS size are highly correlated; and second, highly variable AS size can arise simply through exponential growth. We construct a model incorporating exponential growth in the size of the Internet and in the number of ASes, and show that it yields a size distribution exhibiting a power-law tail. In such a model, if an AS's link formation is roughly proportional to its size, then AS out-degree will also show high variability. Moreover, our approach is more flexible than previous work, since the choice of which AS to connect to does not impact high variability, thus can be freely specified. We instantiate such a model with empirically derived estimates of historical growth rates and show that the resulting degree distribution is in good agreement with that of real AS graphs.

This isn't exactly the model I had in mind; it's more realistic, for the Internet, than aggregating a static random graph. (I'm pleased to see that people who know what they're doing also thought to employ the idea that exponential growth leads to a power-law size distribution; presumably a re-invention, since they don't cite Reed and Hughes.) I remain a bit curious about the effects of aggregating a random network, but now will definitely not pursue it.
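The idea that exponential growth leads to a power-law size distribution can be sketched in a few lines. The assumptions here are simplifications made for illustration, not taken from either paper: a unit's size grows deterministically as $e^{\mu T}$ with age $T$, and ages are exponentially distributed with rate $\nu$ (roughly what one gets when the number of units itself grows exponentially at rate $\nu$). The symbols $X$, $T$, $\mu$ and $\nu$ are introduced only for this sketch.

```latex
% Assumptions (illustrative): X = e^{\mu T}, with T ~ Exponential(\nu), \mu > 0, \nu > 0.
\begin{align*}
\Pr(X > x) &= \Pr\left(e^{\mu T} > x\right)
            = \Pr\left(T > \frac{\ln x}{\mu}\right) \\
           &= \exp\left(-\nu \, \frac{\ln x}{\mu}\right)
            = x^{-\nu/\mu}, \qquad x \ge 1 .
\end{align*}
```

Under these assumptions the size distribution is exactly Pareto with tail exponent $\nu/\mu$, so the tail is heavier the faster individual units grow relative to the rate at which new units appear.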

Update, 7 October: Aaron was too well-bred to point out his own papers on why many (in fact, almost all) networks seem to have power-law link distributions, when you probe them the wrong way. Fortunately, someone reminded me.

Update, 21 October: This looks relevant, if anyone's interested.

Enigmas of Chance; Networks; Power Laws

Posted at October 04, 2005 21:46

Three-Toed Sloth