Resampling effects on significance analysis of network clustering and ranking
Umeå University, Faculty of Science and Technology, Department of Physics. (IceLab)
Umeå University, Faculty of Science and Technology, Department of Physics.
2013 (English) In: PLoS ONE, ISSN 1932-6203, Vol. 8, no 1, e53943. Article in journal (Refereed) Published
Abstract [en]

Community detection helps us simplify the complex configuration of networks, but communities are reliable only if they are statistically significant. To detect statistically significant communities, a common approach is to resample the original network and analyze the communities. But resampling assumes independence between samples, while the components of a network are inherently dependent. Therefore, we must understand how breaking dependencies between resampled components affects the results of the significance analysis. Here we use scientific communication as a model system to analyze this effect. Our dataset includes citations among articles published in journals in the years 1984–2010. We compare parametric resampling of citations with non-parametric article resampling. While citation resampling breaks link dependencies, article resampling maintains such dependencies. We find that citation resampling underestimates the variance of link weights. Moreover, this underestimation explains most of the differences in the significance analysis of ranking and clustering. Therefore, when only link weights are available and article resampling is not an option, we suggest a simple parametric resampling scheme that generates link-weight variances close to the link-weight variances of article resampling. Nevertheless, when we highlight and summarize important structural changes in science, the more dependencies we can maintain in the resampling scheme, the earlier we can predict structural change. 
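
The contrast between the two resampling schemes in the abstract can be illustrated with a minimal sketch. It rests on assumptions that are not taken from the record: the parametric citation scheme is modeled here as a Poisson draw around the observed link weight, the article scheme as a plain bootstrap over citing articles, and the per-article citation counts are made-up illustrative numbers. All function names are hypothetical.

```python
import math
import random
from statistics import pvariance


def poisson(lam, rng):
    """Draw a Poisson variate with Knuth's method (fine for small rates)."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def citation_resample(weight, rng):
    """Assumed parametric scheme: treat the link weight as a Poisson rate.

    Each citation is drawn independently, so citations that came from the
    same article are no longer tied together.
    """
    return poisson(weight, rng)


def article_resample(per_article, rng):
    """Non-parametric scheme: bootstrap whole articles with replacement.

    An article carries all of its citations along, so within-article
    dependencies are kept.
    """
    n = len(per_article)
    return sum(per_article[rng.randrange(n)] for _ in range(n))


def compare_variances(per_article, n_rep=2000, seed=1):
    """Return (citation-resampling variance, article-resampling variance)."""
    rng = random.Random(seed)
    weight = sum(per_article)
    cit = [citation_resample(weight, rng) for _ in range(n_rep)]
    art = [article_resample(per_article, rng) for _ in range(n_rep)]
    return pvariance(cit), pvariance(art)


# Hypothetical data: citations from ten articles in one journal to another.
# A few articles contribute most of the link weight.
per_article = [0, 0, 5, 0, 1, 8, 0, 0, 6, 0]
var_citation, var_article = compare_variances(per_article)
print(f"citation resampling variance: {var_citation:.1f}")
print(f"article resampling variance:  {var_article:.1f}")
```

In this toy setting the article bootstrap yields the larger link-weight variance because citations cluster within a few articles, which is in line with the underestimation the abstract reports for citation resampling. It is only an illustration, not the procedure used in the paper.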

Place, publisher, year, edition, pages
2013. Vol. 8, no 1, e53943.
National Category
Probability Theory and Statistics
URN: urn:nbn:se:umu:diva-64527
DOI: 10.1371/journal.pone.0053943
OAI: diva2:602137
Swedish Research Council, 2009-5344
Available from: 2013-02-04 Created: 2013-01-31 Last updated: 2013-08-29 Bibliographically approved
In thesis
1. Organization of information pathways in complex networks
2013 (English) Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

As human beings, we are continuously struggling to comprehend the mechanism of different natural systems. Many times, we face a complex system where the emergent properties of the system at a global level cannot be explained by a simple aggregation of the system's components at the micro-level. To better understand the macroscopic system effects, we try to model microscopic events and their interactions. In order to do so, we rely on specialized tools to connect local mechanisms with global phenomena. One such tool is network theory. Networks provide a powerful way of modeling and analyzing complex systems based on interacting elements. The interaction pattern links the elements of the system together and provides a structure that controls how information permeates throughout the system. For example, the passing of information about job opportunities in a society depends on how social ties are organized. The interaction pattern, therefore, often is essential for reconstructing and understanding the global-scale properties of the system.

In this thesis, I describe tools and models of network theory that we use and develop to analyze the organization of social or transportation systems. More specifically, we explore complex networks by asking two general questions: First, which mechanistic theoretical models can better explain network formation or spreading processes on networks? And second, what are the significant functional units of real networks? For modeling, for example, we introduce a simple agent-based model that considers interacting agents in dynamic networks that in the quest for information generate groups. With the model, we found that the network and the agents' perception are interchangeable; the global network structure and the local information pathways are so entangled that one can be recovered from the other one. For investigating significant functional units of a system, we detect, model, and analyze significant communities of the network. Previously introduced methods of significance analysis suffer from oversimplified sampling schemes. We have remedied their shortcomings by proposing two different approaches: first by introducing link prediction and second by using more data when they are available. With link prediction, we can detect statistically significant communities in large sparse networks. We test this method on real networks, the sparse network of the European Court of Justice case law, for example, to detect significant and insignificant areas of law. In the presence of large data, on the other hand, we can investigate how underlying assumptions of each method affect the results of the significance analysis. We used this approach to investigate different methods for detecting significant communities of time-evolving networks. We found that, when we highlight and summarize important structural changes in a network, the methods that maintain more dependencies in significance analysis can predict structural changes earlier.

In summary, we have tried to model the systems with as simple rules as possible to better understand the global properties of the system. We always found that maintaining information about the network structure is essential for explaining important phenomena on the global scale. We conclude that the interaction pattern between interconnected units, the network, is crucial for understanding the global behavior of complex systems because it keeps the system integrated. And remember, everything is connected, albeit not always directly.

Place, publisher, year, edition, pages
Umeå, Sweden: Umeå University, 2013. 55 p.
Complex systems, Complex Networks, Information, Communication, Community Detection, Significance Analysis, Resampling, Information Spreading.
National Category
Computer and Information Science
Research subject
Computer and Information Science
urn:nbn:se:umu:diva-79734 (URN)
978-91-7459-715-8 (ISBN)
Public defence
2013-09-20, Naturvetarhuset, N300, Umeå universitet, Umeå, 14:00
Available from: 2013-08-30 Created: 2013-08-29 Last updated: 2013-08-30 Bibliographically approved
By author/editor
Mirshahvalad, Atieh; Rosvall, Martin