"Network Comparisons Using Sample Splitting"
My fifth Ph.D. student is defending his thesis towards the end of the month:
- Lawrence Wang, Network Comparisons Using Sample Splitting
- Abstract: Many scientific questions about networks are actually
network comparison problems: Could two networks have reasonably come from a
common source? Are there specific differences? We outline a procedure that
tests the hypothesis that multiple networks were drawn from the same
probabilistic source. In addition, when the networks are indeed different, our
procedure may characterize the differences between the sources.
- We first address the case where the two networks being compared share
exactly the same nodes. We wish to use common parametric network models and the
standard likelihood ratio test (LRT), but the infeasibility of computing the
maximum likelihood estimate in our selected families of models complicates
matters. However, we take advantage of the fact that the standard likelihood
ratio test has a simple asymptotic distribution under a specific restriction of
the model family. In addition, we show that a sample splitting approach is
applicable: We can use part of the network data to choose an appropriate model
space, and use the remaining network data to compute the LRT statistic and
appeal to its asymptotic null distribution to obtain an appropriate p-value.
Moreover, we show that while a single sample split results in a random p-value,
we can choose to do multiple sample splits and aggregate the resulting
individual p-values. Sample splitting is a more general framework: nothing
is particularly special about the specific hypothesis we decide to test. We
illustrate a couple of extensions of the framework which also provide different
ways to characterize differences in network models.
- We also address the more general case where the two networks being compared
no longer share the same set of nodes. The main difficulty in this case is
that there might not be an implicit alignment of the nodes in the two
networks. Our procedure relies on the graphon model family, which can handle
networks of any size but, more importantly, can be put in an aligned form which
makes it comparable. We show that the framework for alignment can be
generalized, which allows this method to handle a larger class of models.
- Time and place: 3:30 pm on Monday, 25 April 2016 in Porter Hall A22
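
For readers who want a concrete picture of the sample-splitting recipe in the abstract, here is a toy sketch. It is an illustration under loudly stated assumptions, not the procedure from the thesis: the networks are undirected graphs on the same nodes stored as sets of dyads, the "model selection" step is a crude two-block split by pooled degree, dyads are taken to be independent given the blocks, and the per-split p-values are combined by the conservative "twice the median" rule. All function names (`split_p_value`, `aggregated_p_value`, and so on) are hypothetical.

```python
import math
import random
from statistics import median


def chi2_sf(x, df):
    """Upper tail probability of a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    h = x / 2
    # Start from Q(1/2, h) or Q(1, h), then use Q(a+1, h) = Q(a, h) + h^a e^-h / Gamma(a+1).
    if df % 2:
        a, q = 0.5, math.erfc(math.sqrt(h))
    else:
        a, q = 1.0, math.exp(-h)
    while a < df / 2:
        q += h ** a * math.exp(-h) / math.gamma(a + 1)
        a += 1
    return q


def bernoulli_loglik(edges, dyads):
    """Maximized Bernoulli log-likelihood for `edges` successes out of `dyads`."""
    if edges == 0 or edges == dyads:
        return 0.0
    p = edges / dyads
    return edges * math.log(p) + (dyads - edges) * math.log(1 - p)


def split_p_value(g1, g2, n, rng, frac=0.5):
    """One random split: choose blocks on one part of the dyads, test on the rest."""
    dyads = [(i, j) for i in range(n) for j in range(i + 1, n)]
    rng.shuffle(dyads)
    cut = int(frac * len(dyads))
    select, test = dyads[:cut], dyads[cut:]

    # Model selection (assumed, crude): split nodes at the median pooled degree.
    deg = [0] * n
    for i, j in select:
        w = ((i, j) in g1) + ((i, j) in g2)
        deg[i] += w
        deg[j] += w
    cutoff = median(deg)
    block = [int(d > cutoff) for d in deg]

    # Per block pair: [edges in g1, edges in g2, number of test dyads].
    counts = {}
    for i, j in test:
        key = tuple(sorted((block[i], block[j])))
        c = counts.setdefault(key, [0, 0, 0])
        c[0] += (i, j) in g1
        c[1] += (i, j) in g2
        c[2] += 1

    # LRT: separate edge probabilities per network vs. one shared probability.
    stat = 0.0
    for e1, e2, m in counts.values():
        alt = bernoulli_loglik(e1, m) + bernoulli_loglik(e2, m)
        null = bernoulli_loglik(e1 + e2, 2 * m)
        stat += 2 * (alt - null)
    return chi2_sf(stat, len(counts)) if counts else 1.0


def aggregated_p_value(g1, g2, n, splits=20, seed=0):
    """Combine p-values from several splits with the 'twice the median' rule."""
    rng = random.Random(seed)
    ps = [split_p_value(g1, g2, n, rng) for _ in range(splits)]
    return min(1.0, 2 * median(ps))


def random_graph(n, p, rng):
    """Erdos-Renyi graph on n nodes, stored as a set of dyads (i, j) with i < j."""
    return {(i, j) for i in range(n) for j in range(i + 1, n) if rng.random() < p}


if __name__ == "__main__":
    rng = random.Random(1)
    sparse, dense = random_graph(40, 0.1, rng), random_graph(40, 0.6, rng)
    print(aggregated_p_value(sparse, dense, 40))
```

Each split on its own gives a p-value that depends on the random partition of the dyads; repeating the split and aggregating removes most of that arbitrariness, at the price of a conservative combination rule.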
Enigmas of Chance; Kith and Kin
Posted at April 15, 2016 12:00