Notebooks

Analysis of Network Data

17 Jul 2022 15:08

That is, of data in the form of networks --- I don't (as such) care about packet flow or other aspects of computer networks...

Things I wish I knew how to do: bootstrap a network, non-parametrically. (The model with a fixed degree sequence is a start, but what's the equivalent of the block bootstraps used for time series, which preserve dependence? [Update, 2017: see my paper with Alden Green.]) Cross-validation on networks. (You could say that link prediction is leave-one-out CV, but how about k-fold CV? [Update, 2016: see the papers by Chen and Lei, and (especially) by Dabbs and Junker.]) Estimate a distribution over networks by somehow smoothing an adjacency matrix. [Update, 2016: see Lawrence Wang's thesis.] Compare networks to say if they came from the same distribution [Update, 2015: See my paper with Dena Asta]. --- These may or may not be aspects of a single problem.
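
For concreteness, here is a minimal sketch of the "fixed degree sequence" starting point mentioned above: randomize an observed graph by repeated double-edge swaps, so that every node keeps its degree while everything else is scrambled. The function names (`degree_preserving_shuffle`, `degrees`) and the retry cap are illustrative assumptions, not taken from any of the papers listed here. The sketch assumes a simple undirected graph with sortable node labels, and makes no claim to sample exactly uniformly from graphs with that degree sequence.

```python
import random
from collections import Counter


def degrees(edges):
    """Count how many edges touch each node."""
    counts = Counter()
    for u, v in edges:
        counts[u] += 1
        counts[v] += 1
    return counts


def degree_preserving_shuffle(edges, n_swaps, seed=None):
    """Randomize a simple undirected graph while keeping every node's degree.

    Assumes `edges` has no self-loops and no repeated edges (hypothetical
    input format: an iterable of (u, v) pairs with sortable labels).
    """
    rng = random.Random(seed)
    edge_list = [tuple(sorted(e)) for e in edges]
    edge_set = set(edge_list)
    if len(edge_list) < 2:
        return edge_list

    done, attempts = 0, 0
    max_attempts = 100 * n_swaps  # arbitrary cap so the loop always ends
    while done < n_swaps and attempts < max_attempts:
        attempts += 1
        i, j = rng.sample(range(len(edge_list)), 2)
        a, b = edge_list[i]
        c, d = edge_list[j]
        if rng.random() < 0.5:
            c, d = d, c  # consider both ways of re-pairing the endpoints
        # Proposed rewiring: (a, b), (c, d) -> (a, d), (c, b)
        if a == d or c == b:
            continue  # would create a self-loop
        new1 = tuple(sorted((a, d)))
        new2 = tuple(sorted((c, b)))
        if new1 in edge_set or new2 in edge_set:
            continue  # would create a repeated edge
        edge_set.discard(edge_list[i])
        edge_set.discard(edge_list[j])
        edge_set.add(new1)
        edge_set.add(new2)
        edge_list[i], edge_list[j] = new1, new2
        done += 1
    return edge_list
```

Calling `degree_preserving_shuffle(edges, n_swaps=10 * len(edges), seed=0)` many times with different seeds gives a crude collection of surrogate networks. What it does not do is preserve any dependence beyond the degrees, which is exactly the gap the block-bootstrap question is about.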

Community discovery is an important sub-topic, and I like exponential family random graph models, stochastic block models and graph limits enough to give them their own notebooks. Many of the entries under graphical models are about figuring out the network of interaction between random variables from patterns of dependence across those variables.

--- Although many of the relevant papers appear in the journal Social Networks, published by Elsevier, a company known to also publish advertising disguised as peer-reviewed scientific journals (e.g., The Australasian Journal of Bone and Joint Medicine), I know of no particular reason to believe that their findings are actually meretricious propaganda on behalf of a paying client. It would, however, be better if the community would shift to a journal whose publisher did not pollute the process of scientific communication whenever it was profitable to do so.

See also: Biochemical Network Evolution; Complex networks; Community discovery; Embedding Networks in Hyperbolic Spaces; Exponential families of random graph models; Gene Expression Data Analysis; Graph Limits and Exchangeable Random Graphs; Graph Theory; Graph Sampling Algorithms; Graph Spectra; Homophily vs. influence; Inferring networks from non-network data; Joint modeling of texts and networks; Network comparison; Political networks; Power laws (for questions about "scale-free" networks); Projectivity in Statistical Models; Relational learning; Social networks; Statistics in general; Statistics of structured data; Visualizing network data

      Recommended, big picture:
    • Stephen E. Fienberg, "A Brief History of Statistical Models for Network Analysis and Open Challenges", Journal of Computational and Graphical Statistics 21 (2012): 825--839
    • Anna Goldenberg, Alice X. Zheng, Stephen E. Fienberg, Edoardo M. Airoldi, "A survey of statistical network models", Foundations and Trends in Machine Learning 2 (2009): 1--117 = arxiv:0912.5410
    • Abigail Z. Jacobs, Aaron Clauset, "A unified view of generative models for networks: models, methods, opportunities, and challenges", arxiv:1411.4070
    • Eric D. Kolaczyk, Statistical Analysis of Network Data: Methods and Models [Best available up-to-date textbook on the subject. Mini-review.]
      Recommended, close-ups:
    • Nesreen K. Ahmed, Jennifer Neville, Ramana Kompella, "Reconsidering the Foundations of Network Sampling" [PDF preprint]
    • Edo Airoldi, David M. Blei, Stephen E. Fienberg, Anna Goldenberg, Eric P. Xing and Alice X. Zheng (eds.), Statistical Network Analysis: Models, Issues, and New Directions [Disclaimer: contains one of my papers.]
    • Edoardo M. Airoldi, David M. Blei, Stephen E. Fienberg, Eric P. Xing, "Mixed Membership Stochastic Blockmodels", Journal of Machine Learning Research 9 (2008): 1981--2014
    • Sharmodeep Bhattacharyya, Peter J. Bickel, "Subsampling Bootstrap of Count Features of Networks", arxiv:1312.2645
    • Peter J. Bickel, Aiyou Chen, and Elizaveta Levina, "The method of moments and degree distributions for network models", Annals of Statistics 39 (2011): 38--59
    • Stephen P. Borgatti, Kathleen M. Carley and David Krackhardt, "On the robustness of centrality measures under conditions of imperfect data", Social Networks 28 (2006): 124--136
    • Peter J. Carrington, John Scott and Stanley Wasserman (eds.), Models and Methods in Social Network Analysis [Best thought of as a supplement to Wasserman and Faust, bringing it more up to date. Blurb]
    • Sourav Chatterjee, Persi Diaconis and Allan Sly, "Random graphs with a given degree sequence", Annals of Applied Probability 21 (2011): 1400--1435, arxiv:1005.1136 [Interesting application of the new technology of graph limits to a classic model. May not be terribly practical yet but definitely promising.]
    • Kehui Chen, Jing Lei, "Network Cross-Validation for Determining the Number of Communities in Network Data", arxiv:1411.1715
    • Aaron Clauset and Cristopher Moore, "Accuracy and Scaling Phenomena in Internet Mapping", Physical Review Letters 94 (2005): 018701, cond-mat/0410059
    • Aaron Clauset, Cristopher Moore and M. E. J. Newman
      • "Structural Inference of Hierarchies in Networks", physics/0610051
      • "Hierarchical Structure and the Prediction of Missing Links in Networks", Nature 453 (2008): 98--101, arxiv:0811.0484
    • Eduardo Corona, Terran Lane, Curtis Storlie, Joshua Neil, "Using Laplacian Methods, RKHS Smoothing Splines and Bayesian Estimation as a framework for Regression on Graph and Graph Related Domains" [Technical report, University of New Mexico Computer Science, 2008-06, PDF]
    • Beau Dabbs and Brian Junker, "Comparison of Cross-Validation Methods for Stochastic Block Models", arxiv:1605.03000
    • Hoda Eldardiry and Jennifer Neville, "A Resampling Technique for Relational Data Graphs", SNA-KDD 2008 [PDF reprint via Prof. Neville]
    • Katherine Faust and John Skvoretz, "Comparing Networks Across Space and Time, Size and Species", Sociological Methodology 32 (2002): 267--299 [Though see some remarks in my paper with Ale Rinaldo on ERGMs, linked below]
    • Dennis M. Feehan, Matthew J. Salganik, "Generalizing the Network Scale-Up Method: A New Estimator for the Size of Hidden Populations", arxiv:1404.4009
    • Linton C. Freeman and Douglas R. White (1993), "Using Galois Lattices to Represent Network Data", Sociological Methodology 23: 127--146 [PDF reprint]
    • Wenjie Fu, Le Song, Eric P. Xing, "A State-Space Mixed Membership Blockmodel for Dynamic Network Tomography", arxiv:0901.0135
    • Krista Gile and Mark S. Handcock, "Model-based Assessment of the Impact of Missing Data on Inference for Networks" [Working Paper 66, Center for Statistics and the Social Sciences, University of Washington (2006). PDF preprint]
    • Steven M. Goodreau, James A. Kitts and Martina Morris, "Birds of a Feather, Or Friend of a Friend?: Using Exponential Random Graph Models to Investigate Adolescent Social Networks", Demography 46 (2009): 103--125 [In addition to the substantive findings, this is a great introduction to the "exponential-family random graph model" (ERGM) approach to modeling complex networks.]
    • Mark S. Handcock and Krista J. Gile, "Modeling social networks from sampled data", Annals of Applied Statistics 4 (2010): 5--25, arxiv:1010.0891
    • Mark S. Handcock, David R. Hunter, Carter T. Butts, Steven M. Goodreau, and Martina Morris (eds.), "Statistical Modeling of Social Networks with 'statnet'", special volume (24) of the Journal of Statistical Software (2008) [Introduction to a whole issue on the ERGM approach]
    • J. A. Henderson and P. A. Robinson, "Geometric Effects on Complex Network Structure in the Cortex", Physical Review Letters 107 (2011): 018102
    • Peter Hoff, "Modeling homophily and stochastic equivalence in symmetric relational data", NIPS 2007, arxiv:0711.1146
    • Peter D. Hoff, Adrian E. Raftery and Mark S. Handcock, "Latent Space Approaches to Social Network Analysis", Journal of the American Statistical Association 97 (2002): 1090--1098 [PDF preprint]
    • Jake Hofman, "Large-scale social media analysis with Hadoop" [Tutorial at ICWSM 2010. The content is not really specific to social media...]
    • David R. Hunter, Steven M. Goodreau and Mark S. Handcock, "Goodness of Fit of Social Network Models", Journal of the American Statistical Association 103 (2008): 248--258 [PDF]
    • Eric D. Kolaczyk and Gábor Csárdi, Statistical Analysis of Network Data with R
    • Eric D. Kolaczyk and Pavel N. Krivitsky, "On the question of effective sample size in network modeling", arxiv:1112.0840
    • Gueorgi Kossinets, "Effects of Missing Data in Social Networks", Social Networks 28 (2006): 247--268, arxiv:cond-mat/0306335
    • Sang Hoon Lee, Pan-Jun Kim, and Hawoong Jeong, "Statistical properties of sampled networks", cond-mat/0505232
    • Youjin Lee, Elizabeth L. Ogburn, "Testing for Network and Spatial Autocorrelation", arxiv:1710.03296
    • James Robert Lloyd, Peter Orbanz, Zoubin Ghahramani and Daniel M. Roy, "Random function priors for exchangeable arrays with applications to graphs and relational data", NIPS 2012
    • Mahendra Mariadassou, Stéphane Robin, Corinne Vacher, "Uncovering latent structure in valued graphs: A variational approach", Annals of Applied Statistics 4 (2010): 715--742, arxiv:1011.1813
    • Manuel Middendorf, Etay Ziv and Chris Wiggins, "Inferring Network Mechanisms: The Drosophila melanogaster Protein Interaction Network", q-bio.QM/0408010 [Machine learning meets complex networks: specifically, learning decision trees to accurately classify networks by the process which grew them. Neat.]
    • Sebastian Moreno, Sergey Kirshner, Jennifer Neville, S.V.N. Vishwanathan, "Tied Kronecker Product Graph Models to Capture Variance in Network Populations" [PDF reprint]
    • M. E. J. Newman, "Analysis of weighted networks", Physical Review E 70 (2004): 056131, arxiv:cond-mat/0407503
    • M. E. J. Newman, Steven H. Strogatz and Duncan J. Watts, "Random graphs with arbitrary degree distributions and their applications", Physical Review E 64 (2001): 026118, cond-mat/0007235 [Though they don't quite put it this way, these methods are very naturally employed to generate surrogate network data, which keeps the degree distribution of the original but is otherwise randomized.]
    • Benjamin P. Olding, Patrick J. Wolfe, "Inference for graphs and networks: Extending classical tools to modern data", arxiv:0906.4980
    • Art B. Owen and Dean G. Eckles, "Bootstrapping data arrays of arbitrary order", Annals of Applied Statistics 6 (2012): 895--927, arxiv:1106.2125
    • Henry Pao, Glen A. Coppersmith and Carey E. Priebe, "Statistical Inference on Random Graphs: Comparative Power Analysis", Journal of Computational and Graphical Statistics 20 (2011): 395--416
    • Patrick O. Perry, Patrick J. Wolfe, "Null models for network data", arxiv:1201.5871
    • Gail E. Potter and Niel Hens, "A penalized likelihood approach to estimate within-household contact networks from egocentric data", Journal of the Royal Statistical Society C 62 (2013): 629--648
    • Jörg Reichardt and Douglas R. White, "Role models for complex networks", arxiv:0708.0958
    • Martin Rosvall and Carl T. Bergstrom, "Mapping Change in Large Networks", PLoS One 5 (2010): e8694
    • Camille Roth, "Measuring Generalized Preferential Attachment in Dynamic Social Networks", nlin.AO/0507021 [Applies more generally than to social networks]
    • Purnamrita Sarkar and Andrew W. Moore, "Dynamic Social Network Analysis using Latent Space Models", forthcoming in Advances in Neural Information Processing Systems 18 (NIPS 2005) [Abstract, link to PDF]
    • Grant Schoenebeck, "Potential Networks, Contagious Communities, and Understanding Social Network Structure", WWW 2013, arxiv:1304.1845
    • Jesse Shore and Benjamin Lubin, "Spectral goodness of fit for network models", Social Networks 43 (2015): 16--27, arxiv:1407.7247
    • Laura M. Smith, Kristina Lerman, Cristina Garcia-Cardona, Allon G. Percus, Rumi Ghosh, "Spectral Clustering with Epidemic Diffusion", arxiv:1303.2663
    • Michael P. H. Stumpf and Carsten Wiuf, "Sampling properties of random graphs: the degree distribution", Physical Review E 72 (2005): 036118, cond-mat/0507345
    • Michael P. H. Stumpf, Carsten Wiuf and Robert M. May, "Subnets of scale-free networks are not scale-free: Sampling properties of networks", PNAS 102 (2005): 4221--4224
    • Daniel L. Sussman, Minh Tang, Carey E. Priebe, "Universally Consistent Latent Position Estimation and Vertex Classification for Random Dot Product Graphs", arxiv:1207.6745
    • Andrew C. Thomas, "Censoring Out-Degree Compromises Inferences of Social Network Contagion and Autocorrelation", arxiv:1008.1636
    • Andrew C. Thomas and Joseph K. Blitzstein, "Valued Ties Tell Fewer Lies: Why Not To Dichotomize Network Edges With Thresholds", arxiv:1101.0788
    • Yu-Xiang Wang, James Sharpnack, Alex Smola, Ryan J. Tibshirani, "Trend Filtering on Graphs", arxiv:1410.7690
    • S. Wasserman and K. Faust, Social Network Analysis [This was, for a long time, the Bible of the field. Like the Bible, it is not without value, especially if approached as a historical document, but at the same time, much of it is over-detailed, boring, and filled with prescriptions that no longer make much sense.]
    • Carsten Wiuf, Markus Brameier, Oskar Hagberg and Michael P. H. Stumpf, "A likelihood approach to analysis of network data", Proceedings of the National Academy of Sciences (USA) 103 (2006): 7566--7570 [My comments. Shorter: A nice piece of work, though limited to "duplication attachment" models, a limitation which is not really made clear by the abstract.]
    • Douglas R. White and Vincent Duquenne, eds. (1996), special issue on "Social Network and Discrete Structure Analysis", Social Networks 18: 169--318
    • Rongjing Xiang and Jennifer Neville, "Relational Learning with One Network: An Asymptotic Analysis", AI Stats 2011 [PDF reprint]
    • Yang Yang, Ira M. Longini Jr, M. Elizabeth Halloran, "A resampling-based test to detect person-to-person transmission of infectious disease", Annals of Applied Statistics 1 (2007): 211--228, arxiv:0709.0406 [Though the null they are comparing it to is one of IID disease onset times, which is, I think, only appropriate when there is no assortative mixing in the social network for traits which influence onset times for a non-contagious disease.]
    • Yaonan Zhang, Eric D. Kolaczyk and Bruce D. Spencer, "Estimating Network Degree Distributions Under Sampling: An Inverse Problem, with Applications to Monitoring Social Media Networks", Annals of Applied Statistics 9 (2015): 166--199, arxiv:1305.4977
    • Yaojia Zhu, Xiaoran Yan, Lise Getoor, Cristopher Moore, "Scalable Text and Link Analysis with Mixed-Topic Link Models", arxiv:1303.7264
      Sort-of recommended:
    • Sen Pei and Hernán A. Makse, "Spreading dynamics in complex networks", Journal of Statistical Mechanics: Theory and Experiment (2013): P12002 [Only half-recommended because, despite the title and abstract, the way they define "influence" is just the number of nodes reachable from a given node. (They're using a directed graph [LiveJournal] so that's not just the size of the connected component the node is in.) No actual spreading dynamics are analyzed. (To be really precise, they define an edge between blogs 1 and 2 iff blog 1 has cited a specific post of blog 2 at least once. A chain of such links does not necessarily indicate information spreading.) It's interesting to see how this correlates with different centrality measures, but it's not quite what was promised.]
      Pride compels me to recommend:
    • Dena Marie Asta, Geometric Approaches to Inference: Non-Euclidean Data and Networks [Ph.D. thesis, CMU Departments of Statistics and of Engineering & Public Policy, 2015]
    • Justin Gross, Cues and Heuristics on Capitol Hill: Relational Decision-Making in the United States Senate [Ph.D. thesis, CMU Department of Statistics, 2010]
    • Neil A. Spencer, Networks, Point Processes, and Networks of Point Processes [Ph.D. thesis, CMU Departments of Statistics and of Machine Learning, 2020]
    • Lawrence Wang, Network Comparisons using Sample Splitting [Ph.D. thesis, CMU Department of Statistics, 2016]
      To read:
    • Alexandre H. Abdo and A. P. S. de Moura, "Clustering as a measure of the local topology of networks", physics/0605235
    • Aaron B. Adcock, Blair D. Sullivan, Michael W. Mahoney, "Tree decompositions and social graphs", arxiv:1411.1546
    • Nesreen K. Ahmed, Jennifer Neville, Ramana Kompella, "Network Sampling: From Static to Streaming Graphs", arxiv:1211.3412
    • Elizabeth S. Allman, Catherine Matias, John A. Rhodes, "Parameter identifiability in a class of random graph mixture models", arxiv:1006.0826
    • Gerrit Ansmann and Klaus Lehnertz, "Constrained randomization of weighted networks", Physical Review E 84 (2011): 026103
    • Peter M. Aronow, Forrest W. Crawford, "Nonparametric Identification for Respondent-Driven Sampling", arxiv:1504.03574
    • Tomaso Aste, Ruggero Gramatica, T. Di Matteo, "Exploring complex networks via topological embedding on surfaces", arxiv:1107.3456
    • James Atwood, Don Towsley, Krista Gile, David Jensen, "Learning to Generate Networks", arxiv:1405.5868
    • Pierre Baldi et al., Modeling the Internet and the Web: Probabilistic Methods and Algorithms
    • Anirban Banerjee, "Structural distance and evolutionary relationship of networks", arxiv:0807.3185
    • Kim Baskerville and Maya Paczuski, "Subgraph ensembles and motif discovery using an alternative heuristic for graph isomorphism", Physical Review E 74 (2006): 051903
    • Christian Bauckhage, Kristian Kersting and Fabian Hadiji, "Parameterizing the Distance Distribution of Undirected Networks", UAI 2015
    • Yakir Berchenko, Jonathan Rosenblatt, Simon D. W. Frost, "Modeling and Analysing Respondent Driven Sampling as a Counting Process", arxiv:1304.3505
    • Etienne Birmele
    • Hendrik Blockeel, Robert Brijder, "Non-Confluent NLC Graph Grammar Inference by Compressing Disjoint Subgraphs", arxiv:0901.4876
    • Cristiano Bocci, Luca Chiantini, Fabio Rapallo, "Max-plus objects to study the complexity of graphs", arxiv:1111.1352
    • Andrea Capocci, G. Caldarelli and P. De Los Rios, "Quantitative description and modeling of real networks," cond-mat/0206336
    • Arnaud Casteigts, Paola Flocchini, Walter Quattrociocchi, Nicola Santoro, "Time-Varying Graphs and Dynamic Networks", arxiv:1012.0009
    • Yoon-Sik Cho, Aram Galstyan, Jeff Brantingham, George Tita, "Latent Point Process Models for Spatial-Temporal Networks", arxiv:1302.2671
    • Federica Cerina, Vincenzo De Leo, Marc Barthelemy, Alessandro Chessa, "Spatial correlations in attribute communities", arxiv:1112.3308
    • Ines Chami, Sami Abu-El-Haija, Bryan Perozzi, Christopher Ré, Kevin Murphy, "Machine Learning on Graphs: A Model and Comprehensive Taxonomy", Journal of Machine Learning Research 23 (2022): 89
    • Munmun De Choudhury, Winter A. Mason, Jake M. Hofman and Duncan J. Watts, "Inferring Relevant Social Networks from Interpersonal Communication", WWW 2010, pp. 301--310 [On the importance of picking the right threshold to define a binary network tie]
    • Vittoria Colizza, Alessandro Flammini, M. Angeles Serrano, Alessandro Vespignani, "Detecting rich-club ordering in complex networks", physics/0602134
    • Forrest W. Crawford, "The graphical structure of respondent-driven sampling", arxiv:1406.0721
    • Forrest W. Crawford, Jiacheng Wu, Robert Heimer, "Hidden population size estimation from respondent-driven sampling: a network approach", arxiv:1504.08349
    • Mihai Cucuringu, Vincent D. Blondel and Paul Van Dooren, "Extracting spatial information from networks with low-order eigenvectors", Physical Review E 87 (2013): 032803
    • Luciano da F. Costa, Francisco A. Rodrigues, Gonzalo Travieso and P. R. Villas Boas, "Characterization of complex networks: A survey of measurements", cond-mat/0505185
    • Leon Danon, Ashley P. Ford, Thomas House, Chris P. Jewell, Matt J. Keeling, Gareth O. Roberts, Joshua V. Ross, Matthew C. Vernon, "Networks and the Epidemiology of Infectious Disease", arxiv:1011.5950
    • Xiaowen Dong, Pascal Frossard, Pierre Vandergheynst, Nikolai Nefedov, "Clustering on Multi-Layer Graphs via Subspace Analysis on Grassmann Manifolds", arxiv:1303.2221
    • Anton Dries, Siegfried Nijssen, "Mining Patterns in Networks using Homomorphism", arxiv:1110.3225
    • Daniel M. Dunlavy, Tamara G. Kolda, Evrim Acar, "Temporal Link Prediction using Matrix and Tensor Factorizations", arxiv:1005.4006
    • Thibault Espinasse, Jean-Michel Loubes, "A Kriging procedure for processes indexed by graphs", arxiv:1406.6592
    • Ernesto Estrada, "Quantifying network heterogeneity", Physical Review E 82 (2010): 066102
    • Jacob G. Foster, David V. Foster, Peter Grassberger and Maya Paczuski, "Link likelihoods in random networks with fixed and partially fixed degree sequence", cond-mat/0610446
    • Francois Fouss, Marco Saerens and Masashi Shimbo, Algorithms and Models for Network Data and Link Analysis
    • Kevin Françoisse, Ilkka Kivimäki, Amin Mantrach, Fabrice Rossi, Marco Saerens, "A bag-of-paths framework for network data analysis", arxiv:1302.6766
    • Birgitte Freiesleben de Blasio, Taral Guldahl Seierstad, Odd O. Aalen, "Frailty effects in networks: comparison and identification of individual heterogeneity versus preferential attachment in evolving networks", Journal of the Royal Statistical Society C forthcoming (2011)
    • Rumi Ghosh and Kristina Lerman, "Parameterized centrality metric for network analysis", Physical Review E 83 (2011): 066118
    • Gourab Ghoshal, Vinko Zlatic, Guido Caldarelli, M. E. J. Newman, "Random hypergraphs and their applications", Physical Review E 79 (2009): 066118, arxiv:0903.0419
    • Krista J. Gile, Lisa G. Johnston, Matthew J. Salganik, "Diagnostics for Respondent-driven Sampling", arxiv:1209.6254
    • Reid Ginoza and Andrew Mugler, "Network motifs come in sets: Correlations in the randomization process", Physical Review E 82 (2010): 011921
    • Benjamin Golub and Matthew O. Jackson, "Using selection bias to explain the observed structure of Internet diffusions", Proceedings of the National Academy of Sciences (USA) 107 (2010): 10833--10836
    • Neil Zhenqiang Gong, Ameet Talwalkar, Lester Mackey, Ling Huang, Eui Chul Richard Shin, Emil Stefanov, Elaine (Runting) Shi, Dawn Song, "Predicting Links and Inferring Attributes using a Social-Attribute Network (SAN)", arxiv:1112.3265
    • Neil Zhenqiang Gong, Wenchang Xu, Ling Huang, Prateek Mittal, Emil Stefanov, Vyas Sekar, Dawn Song, "Evolution of Social-Attribute Networks: Measurements, Modeling, and Implications using Google+", arxiv:1209.0835
    • Palash Goyal, Emilio Ferrara, "Graph Embedding Techniques, Applications, and Performance: A Survey", Knowledge Based Systems 151 (2018): 78--94, arxiv:1705.02801
    • Daniel Grady, Christian Thiemann, Dirk Brockmann, "Parameter-free identification of salient features in complex networks", arxiv:1110.3864
    • Roger Guimera and Marta Sales-Pardo, "Missing and spurious interactions and the reconstruction of complex networks", Proceedings of the National Academy of Sciences (USA) 106 (2009): 22073--22078
    • Eric C. Hall, Rebecca M. Willett, "Tracking Dynamic Point Processes on Networks", IEEE Transactions on Information Theory 62 (2016), arxiv:1409.0031
    • Mark S. Handcock, Krista J. Gile, "On the Concept of Snowball Sampling", arxiv:1108.0301
    • Robert A. Hanneman and Mark Riddle, Introduction to Social Network Methods [Online textbook, looks decent.]
    • Nicholas A. Heard, David J. Weston, Kiriaki Platanioti, David J. Hand, "Bayesian anomaly detection methods for social networks", Annals of Applied Statistics 4 (2010): 645--662, arxiv:1011.1788
    • Petter Holme, "Local symmetries in complex networks", cond-mat/0608695
    • Petter Holme, Jari Saramäki, "Temporal Networks", arxiv:1108.1780
    • Rui Jiang, Zhidong Tu, Ting Chen and Fengzhu Sun, "Network motif identification in stochastic networks", Proceedings of the National Academy of Sciences (USA) 103 (2006): 9404--9409
    • Natallia Katenka and Eric D. Kolaczyk, "Inference and characterization of multi-attribute networks with application to computational biology", Annals of Applied Statistics 6 (2012): 1068--1094
    • Eric D. Kolaczyk, David B. Chua, Marc Barthelemy, "Co-Betweenness: A Pairwise Notion of Centrality", arxiv:0709.3420
    • Gueorgi Kossinets and Duncan J. Watts
    • Vassilis Kostakos, Eamonn O'Neill, Alan Penn, "Brief encounter networks", arxiv:0709.0223 [Networks defined by brief transactions, rather than persistent ties.]
    • Arne Kovac, Andrew D.A.C. Smith, "Regression on a Graph", Journal of Computational and Graphical Statistics 20 (2011): 432--447, arxiv:0911.1928
    • Jerome Kunegis, Ernesto W. De Luca, Sahin Albayrak, "The Link Prediction Problem in Bipartite Networks", arxiv:1006.5367
    • Aapo Kyrola, Guy Blelloch and Carlos Guestrin, "GraphChi: Large-Scale Graph Computation on Just a PC"
    • Matthieu Latapy and Clemence Magnien, "Measuring Fundamental Properties of Real-World Complex Networks", cs.NI/0609115 [How asymptotic are we?]
    • Claire Lemercier, "Formal network methods in history: why and how?", halshs-00521527
    • Jure Leskovec, Deepayan Chakrabarti, Jon Kleinberg, Christos Faloutsos, Zoubin Ghahramani, "Kronecker Graphs: An Approach to Modeling Networks", arxiv:0812.4905
    • Hairong Liu, Longin Jan Latecki, Shuicheng Yan, "Revealing Cluster Structure of Graph by Path Following Replicator Dynamic", arxiv:1303.2643
    • Han Liu, Xi Chen, John Lafferty and Larry Wasserman, "Graph-Valued Regression", NIPS 23 (2010) [PDF], arxiv:1006.3972
    • Linyuan Lu, Tao Zhou, "Link Prediction in Complex Networks: A Survey", arxiv:1010.0725
    • David Lusseau, Hal Whitehead, Shane Gero, "Incorporating uncertainty into the study of animal social networks", arxiv:0903.1519 [From a quick look, nothing in this depends on animals]
    • Sofus A. Macskassy, Foster Provost, "Classification in Networked Data: A Toolkit and a Univariate Case Study", Journal of Machine Learning Research 8 (2007): 935--983
    • Yoshiharu Maeno, Yukio Ohsawa, "Node discovery problem for a social network", arxiv:0710.4975
    • Michael W. Mahoney, Lorenzo Orecchia, Nisheeth K. Vishnoi, "A Local Spectral Method for Graphs: With Applications to Improving Graph Partitions and Exploring Data Graphs Locally", Journal of Machine Learning Research 13 (2012): 2339--2365, arxiv:0912.0681
    • Arun S. Maiya, Tanya Y. Berger-Wolf, "Benefits of Bias: Towards Better Characterization of Network Sampling", arxiv:1109.3911
    • Catherine Matias and Stéphane Robin, "Modeling heterogeneity in random graphs: a selective review", arxiv:1402.4296
    • Sunil K. Narang, Akshay Gadde, Eduard Sanou, Antonio Ortega, "Localized Iterative Methods for Interpolation in Graph Structured Data", arxiv:1310.2646
    • Sergiy Nesterko, Joseph Blitzstein, "Bias-Variance and Breadth-Depth Tradeoffs in Respondent-Driven Sampling", arxiv:1210.6059
    • Jennifer Neville, Brian Gallagher, Tina Eliassi-Rad and Tao Wang, "Correcting evaluation bias of relational classifiers with network cross validation", Knowledge and Information Systems online before print (2011) [Open access]
    • M. E. J. Newman, "Estimating network structure from unreliable measurements", Physical Review E 98 (2018): 062321, arxiv:1803.02427
    • Sarah Ouadah, Stéphane Robin, Pierre Latouche, "A degree-based goodness-of-fit test for heterogeneous random graph models", arxiv:1507.08140
    • Patrick O. Perry, Patrick J. Wolfe, "Point process modeling for directed interaction networks", arxiv:1011.1703
    • Leonid Peshkin, "Structure induction by lossless graph compression", cs.DS/0703132
    • Pedro C. Pinto, Patrick Thiran, Martin Vetterli, "Locating the Source of Diffusion in Large-Scale Networks", Physical Review Letters 109 (2012): 068702, arxiv:1208.2534
    • Art F. Y. Poon, Kimberly C. Brouwer, Steffanie A. Strathdee, Michelle Firestone-Cruz, Remedios M. Lozada, Sergei L. Kosakovsky Pond, Douglas D. Heckathorn, Simon D. W. Frost, "Parsing Social Network Survey Data from Hidden Populations Using Stochastic Context-Free Grammars", PLoS One 4 (2009): e6777
    • Karthikeyan Rajendran, Ioannis G. Kevrekidis, "Analysis of data in the form of graphs", arxiv:1306.3524
    • Mathias Raschke, Markus Schlapfer and Roberto Nibali, "Measuring degree-degree association in networks", arxiv:1003.1634
    • Mathias Raschke, Markus Schlapfer and Konstantinos Trantopoulos, "Copula-based generation of degree-associated networks", arxiv:1012.0201
    • Emile Richard, Pierre-Andre Savalle, Nicolas Vayatis, "Graph Prediction in a Low-Rank and Autoregressive Setting", arxiv:1205.1406
    • Sebastien Roch and Karl Rohe, "Generalized least squares can overcome the critical threshold in respondent-driven sampling", Proceedings of the National Academy of Sciences 115 (2018): 10299--10304
    • Daniel M. Romero, Chenhao Tan, Johan Ugander, "Social-Topical Affiliations: The Interplay between Structure and Popularity", arxiv:1112.1115
    • Mauricio Sadinle, "The Strength of Arcs and Edges in Interaction Networks: Elements of a Model-Based Approach", arxiv:1211.0312
    • Purnamrita Sarkar, Deepayan Chakrabarti, Michael Jordan
    • Areejit Samal, Olivier C. Martin, "Randomizing genome-scale metabolic networks", arxiv:1012.1473
    • Michael Schweinberger, "Statistical modelling of network panel data: Goodness of fit", British Journal of Mathematical and Statistical Psychology 65 (2012): 263--281
    • M. Angeles Serrano, Marian Boguna, Romualdo Pastor-Satorras, "Correlations in weighted networks", cond-mat/0609029
    • Nino Shervashidze, Pascal Schweitzer, Erik Jan van Leeuwen, Kurt Mehlhorn, Karsten M. Borgwardt, "Weisfeiler-Lehman Graph Kernels", Journal of Machine Learning Research 12 (2011): 2539--2561
    • Donghyuk Shin, Si Si, Inderjit S. Dhillon, "Multi-Scale Link Prediction", arxiv:1206.1891
    • Mile Sikic, Alen Lancic, Nino Antulov-Fantulin, Hrvoje Stefancic, "Epidemic centrality and the underestimated epidemic impact on network peripheral nodes", arxiv:1110.2558
    • Aarti Singh, Robert D. Nowak, Robert Calderbank, "Detecting Weak but Hierarchically-Structured Patterns in Networks", Journal of Machine Learning Research proceedings 9 (2010): 749--756, arxiv:1003.0205
    • Jeffrey A. Smith, "Macrostructure from Microstructure: Generating Whole Systems from Ego Networks", Sociological Methodology 42 (2012): 155--205
    • Kyungchul Song, "Measuring the Graph Concordance of Locally Dependent Observations", Review of Economics and Statistics 100 (2018): 535--549, arxiv:1504.03712
    • Michael P. H. Stumpf, P. J. Ingram, I. Nouvel and Carsten Wiuf, "Statistical model selection methods applied to biological networks", Transactions in Computational Systems Biology forthcoming (2005) = q-bio.MN/0506013
    • Andrew C. Thomas, Hierarchical Models for Relational Data [Ph.D. thesis, Harvard statistics dept., 2009; PDF]
    • Johan Ugander, Lars Backstrom, Jon Kleinberg, "Subgraph Frequencies: Mapping the Empirical and Extremal Geography of Large Graph Collections", arxiv:1304.1548
    • Mark J. van der Laan, "Causal Inference for Networks", UC Berkeley Biostatistics working paper no. 300 (2012)
    • Ashton M. Verdery, Ted Mouw, Shawn Bauldry, Peter J. Mucha, "Network Structure and Biased Variance Estimation in Respondent Driven Sampling", arxiv:1309.5109
    • S. V. N. Vishwanathan, Nicol N. Schraudolph, Risi Kondor, Karsten M. Borgwardt, "Graph Kernels", Journal of Machine Learning Research 11 (2010): 1201--1242 ["Graphs become ever so much easier to understand when you project them into a Hilbert space." (Not an actual quote.)]
    • Hal Whitehead, Analyzing Animal Societies: Quantitative Methods for Vertebrate Social Analysis
    • Yanghua Xiao, Ben D. MacArthur, Hui Wang, Momiao Xiong, and Wei Wang, "Network quotients: Structural skeletons of complex systems", Physical Review E 78 (2008): 046102
    • Ya Xu, Justin S. Dyer, Art B. Owen, "Empirical stationary correlations for semi-supervised learning on graphs", Annals of Applied Statistics 4 (2010): 589--614, arxiv:1011.1766
    • Jean-Gabriel Young, Alec Kirkley, M. E. J. Newman, "Clustering of heterogeneous populations of networks", arxiv:2107.07489
    • Hyokun Yun, S. V. N. Vishwanathan, "Quilting Stochastic Kronecker Product Graphs to Generate Multiplicative Attribute Graphs", arxiv:1110.5383
    • Hugo Zanghi, Franck Picard, Vincent Miele, and Christophe Ambroise, "Strategies for online inference of model-based clustering in large and growing networks", Annals of Applied Statistics 4 (2010): 687--714
    • An Zeng, Giulio Cimini, "Removing spurious interactions in complex networks", arxiv:1110.5186
    • Shuheng Zhou, John Lafferty, Larry Wasserman, "Time Varying Undirected Graphs", arxiv:0802.2758
    • Xinyi Zheng, Ryan A. Rossi, Nesreen Ahmed, Dominik Moritz, "Network Report: A Structured Description for Network Datasets", arxiv:2206.03635
    • Jun Zhu, "Max-Margin Nonparametric Latent Feature Models for Link Prediction", arxiv:1206.4659
      To write:
    • Dena Asta and CRS, "Nonparametric Network Modeling in Hyperbolic Space"
    • CRS, "Indirect Inference of Network Growth Models"
    • CRS and Shawn Mankad, "Statistical Properties of Aggregated Random Graphs"
    • Co-conspirators to be named later + CRS, "Smoothing Adjacency Matrices"
    • Co-conspirators to be named later + CRS, "Network Comparisons"
    • Co-conspirators to be named later + CRS, "Fractal Network Asymptotics"

