Compares Community Detection Solutions Using Permutation
Source: R/community.compare.R
A permutation test to determine whether the community comparison measure is statistically significantly different from zero.
Usage
community.compare(
  base,
  comparison,
  method = c("vi", "nmi", "split.join", "rand", "adjusted.rand"),
  iter = 1000,
  shuffle.base = TRUE,
  verbose = TRUE,
  seed = NULL
)
Arguments
- base
Character or numeric vector. A vector of characters or numbers treated as the baseline communities
- comparison
Character or numeric vector (length = length(base)). A vector of characters or numbers treated as the communities to compare against the baseline
- method
Character (length = 1). Comparison metric from compare. Defaults to "adjusted.rand". Available options:
"vi": Variation of information (Meila, 2003)
"nmi": Normalized mutual information (Danon et al., 2005)
"split.join": Split-join distance (Dongen, 2000)
"rand": Rand index (Rand, 1971)
"adjusted.rand": Adjusted Rand index (Hubert & Arabie, 1985; Steinley, 2004)
- iter
Numeric (length = 1). Number of permutations to perform. Defaults to 1000 (recommended)
- shuffle.base
Boolean (length = 1). Whether the base cluster solution should be shuffled. Defaults to TRUE to remain consistent with the original implementation (Qannari et al., 2014); however, from a theoretical standpoint, it might make sense to shuffle only the comparison to determine whether it specifically differs from the recognized base
- verbose
Boolean (length = 1). Should progress be displayed? Defaults to TRUE. Set to FALSE to not display progress
- seed
Numeric (length = 1). Defaults to NULL, which gives random (non-reproducible) results. Set a value for reproducible results. See Reproducibility and PRNG for more details on random number generation in EGAnet
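How iter, shuffle.base, and seed interact can be illustrated with a minimal base R sketch of a permutation test on the adjusted Rand index. This is not the EGAnet implementation: the helper names adjusted_rand and permutation_sketch are hypothetical, and the two-sided p-value with a +1 correction is an assumption about how such a test might be computed.

```r
# Adjusted Rand index (Hubert & Arabie, 1985) computed from a contingency table.
# Hypothetical helper, not part of EGAnet.
adjusted_rand <- function(a, b) {
  tab <- table(a, b)
  n_pairs <- choose(sum(tab), 2)
  sum_cells <- sum(choose(as.vector(tab), 2))
  sum_rows <- sum(choose(rowSums(tab), 2))
  sum_cols <- sum(choose(colSums(tab), 2))
  expected <- sum_rows * sum_cols / n_pairs
  maximum <- (sum_rows + sum_cols) / 2
  # Undefined (NaN) when both solutions place every node in a single community
  (sum_cells - expected) / (maximum - expected)
}

# Permutation sketch: shuffle labels to build a null distribution.
# Assumption: two-sided p-value with a +1 correction.
permutation_sketch <- function(base, comparison, iter = 1000,
                               shuffle.base = TRUE, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)  # reproducible permutations when a seed is given
  empirical <- adjusted_rand(base, comparison)
  permuted <- replicate(iter, {
    # Optionally shuffle the base as well as the comparison
    shuffled_base <- if (shuffle.base) sample(base) else base
    adjusted_rand(shuffled_base, sample(comparison))
  })
  p_value <- (sum(abs(permuted) >= abs(empirical)) + 1) / (iter + 1)
  data.frame(Method = "Adjusted Rand Index",
             Empirical = empirical,
             p.value = p_value)
}

# Toy memberships: three communities, with three nodes reassigned in the comparison
base <- rep(c("A", "B", "C"), each = 6)
comparison <- base
comparison[c(1, 7, 13)] <- c("B", "C", "A")

result <- permutation_sketch(base, comparison, iter = 500, seed = 42)
print(result)
```

Setting shuffle.base = FALSE in this sketch holds the base fixed and permutes only the comparison, which mirrors the theoretical alternative described for that argument.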
Value
Returns a data frame containing the method used (Method), the empirical (observed) value (Empirical), and the p-value based on the permutation test (p.value)
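As a brief illustration of working with the returned data frame, the following sketch builds a stand-in object by hand, using the column names listed in this section and the values printed in the Examples output. The object name result and the 0.05 threshold are assumptions for illustration only.

```r
# Stand-in for the data frame returned by community.compare()
# (values copied from the Examples output; built by hand here)
result <- data.frame(
  Method = "Adjusted Rand Index",
  Empirical = 0.2724426,
  p.value = 0.005
)

# Extract the observed statistic and check it against an assumed 0.05 threshold
observed <- result$Empirical
significant <- result$p.value < 0.05
```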
References
Implementation of Permutation Test
Qannari, E. M., Courcoux, P., & Faye, P. (2014).
Significance test of the adjusted Rand index. Application to the free sorting task.
Food Quality and Preference, 32, 93–97.
Variation of Information
Meila, M. (2003, August).
Comparing clusterings by the variation of information.
In Learning Theory and Kernel Machines: 16th Annual Conference on Learning Theory and 7th Kernel Workshop,
COLT/Kernel 2003, Washington, DC, USA, August 24-27, 2003. Proceedings (pp. 173-187). Berlin, DE: Springer Berlin Heidelberg.
Normalized Mutual Information
Danon, L., Diaz-Guilera, A., Duch, J., & Arenas, A. (2005).
Comparing community structure identification.
Journal of Statistical Mechanics: Theory and Experiment, 2005(09), P09008.
Split-join Distance
Dongen, S. (2000).
Performance criteria for graph clustering and Markov cluster experiments.
CWI (Centre for Mathematics and Computer Science).
Rand Index
Rand, W. M. (1971).
Objective criteria for the evaluation of clustering methods.
Journal of the American Statistical Association, 66(336), 846-850.
Adjusted Rand Index
Hubert, L., & Arabie, P. (1985).
Comparing partitions.
Journal of Classification, 2, 193-218.
Steinley, D. (2004). Properties of the Hubert-Arabie adjusted rand index. Psychological Methods, 9(3), 386.
Author
Hudson Golino <hfg9s at virginia.edu> and Alexander P. Christensen <alexpaulchristensen@gmail.com>
Examples
# Load data
wmt <- wmt2[,7:24]
# Estimate network
network <- EBICglasso.qgraph(data = wmt)
# Compute Edge Betweenness
edge_between <- community.detection(network, algorithm = "edge_betweenness")
#> Warning: At vendor/cigraph/src/community/edge_betweenness.c:503 : Membership vector will be selected based on the highest modularity score.
# Compute Fast Greedy
fast_greedy <- community.detection(network, algorithm = "fast_greedy")
# Perform permutation test
community.compare(edge_between, fast_greedy)
#> Warning: This implementation of `community.compare` is experimental.
#>
#> The underlying function and/or output may change until the results have been appropriately vetted and validated.
#> Warning: Some values were NA. These indices were removed: 3
#> Argument 'seed' is set to `NULL`. Results will not be reproducible. Set 'seed' for reproducible results
#> Method Empirical p.value
#> 1 Adjusted Rand Index 0.2724426 0.005