Exploratory Graph Analysis

Estimates the number of communities (dimensions) in a dataset or correlation matrix using a network estimation method (Golino & Epskamp, 2017; Golino et al., 2020). A community detection algorithm is then applied to the estimated network (Christensen et al., 2023) for multidimensional data. A unidimensionality check is also applied based on findings from Golino et al. (2020) and Christensen (2023).

Usage

EGA(
  data,
  n = NULL,
  corr = c("auto", "cor_auto", "cosine", "pearson", "spearman"),
  na.data = c("pairwise", "listwise"),
  model = c("BGGM", "glasso", "TMFG"),
  algorithm = c("leiden", "louvain", "walktrap"),
  uni.method = c("expand", "LE", "louvain"),
  plot.EGA = TRUE,
  verbose = FALSE,
  ...
)

Arguments

data

Matrix or data frame. Should consist only of variables to be used in the analysis. Can be raw data or a correlation matrix

n

Numeric (length = 1). Sample size if data provided is a correlation matrix

corr

Character (length = 1). Method to compute correlations. Defaults to "auto". Available options:

"auto" — Automatically computes appropriate correlations for the data using Pearson's for continuous, polychoric for ordinal, tetrachoric for binary, and polyserial/biserial for ordinal/binary with continuous. To change the number of categories that are considered ordinal, use ordinal.categories (see polychoric.matrix for more details)
"cor_auto" — Uses cor_auto to compute correlations. Arguments can be passed along to the function
"cosine" — Uses cosine to compute cosine similarity
"pearson" — Pearson's correlation is computed for all variables regardless of categories
"spearman" — Spearman's rank-order correlation is computed for all variables regardless of categories

For other similarity measures, compute them first and input them into data with the sample size (n)
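
As a minimal sketch of the point above, assuming Kendall's tau is the similarity measure of interest (chosen here purely for illustration) and using simulated data, the matrix can be computed with base R and then passed to EGA together with the sample size. The EGA call is guarded so the block still runs when EGAnet is not installed.

```r
# Simulated data for illustration only (not shipped with the package)
set.seed(123)
n_cases <- 200
latent <- rnorm(n_cases)
sim <- sapply(1:6, function(i) latent + rnorm(n_cases))
colnames(sim) <- paste0("V", 1:6)

# Compute a similarity measure that `corr` does not offer (Kendall's tau)
kendall <- cor(sim, method = "kendall")

# Pass the matrix as `data` and supply the sample size through `n`
if (requireNamespace("EGAnet", quietly = TRUE)) {
  ega_kendall <- EGAnet::EGA(data = kendall, n = n_cases, plot.EGA = FALSE)
}
```

Because the input is already a similarity matrix, the sample size cannot be recovered from it, which is why n has to be supplied explicitly.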

na.data

Character (length = 1). How should missing data be handled? Defaults to "pairwise". Available options:

"pairwise" — Computes correlation for all available cases between two variables
"listwise" — Computes correlation for all complete cases in the dataset

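The difference between the two na.data options can be illustrated with base R's cor, whose use argument offers analogous behavior. This is only a sketch on made-up data; it is an assumption that EGA's internal handling matches these base R settings exactly.

```r
# Toy data with missing values (illustrative only)
toy <- data.frame(
  a = c(1, 2, 3, 4, 5, NA),
  b = c(2, 1, 4, 3, NA, 6),
  c = c(1, 3, 2, 5, 4, 6)
)

# Pairwise: each correlation uses every row where both variables are observed
pairwise <- cor(toy, use = "pairwise.complete.obs")

# Listwise: every correlation uses only the rows with no missing values
listwise <- cor(toy, use = "complete.obs")

# Number of cases behind each pairwise correlation
observed <- !is.na(as.matrix(toy))
storage.mode(observed) <- "numeric"
pairwise_n <- crossprod(observed)

# Number of cases behind every listwise correlation
listwise_n <- sum(complete.cases(toy))
```

Pairwise deletion keeps more information per correlation, but each entry of the matrix can rest on a different number of cases, whereas listwise deletion uses one common set of rows.
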
model

Character (length = 1). Defaults to "glasso". Available options:

"BGGM" — Computes the Bayesian Gaussian Graphical Model. Set argument ordinal.categories to determine levels allowed for a variable to be considered ordinal. See ?BGGM::estimate for more details
"glasso" — Computes the GLASSO with EBIC model selection. See EBICglasso.qgraph for more details
"TMFG" — Computes the TMFG method. See TMFG for more details

algorithm

Character or igraph cluster_* function (length = 1). Defaults to "walktrap". Three options are described below, but any igraph cluster_* function can be used (see community.detection for other options):

"leiden" — See cluster_leiden for more details
"louvain" — By default, "louvain" will implement the Louvain algorithm using the consensus clustering method (see community.consensus for more information). This function will implement consensus.method = "most_common" and consensus.iter = 1000 unless specified otherwise
"walktrap" — See cluster_walktrap for more details

uni.method

Character (length = 1). What unidimensionality method should be used? Defaults to "louvain". Available options:

"expand" — Expands the correlation matrix with four variables correlated 0.50. If the check returns 2 or fewer dimensions, then the data are unidimensional; otherwise, regular EGA with no matrix expansion is used. This method was used in Golino et al.'s (2020) Psychological Methods simulation
"LE" — Applies the Leading Eigenvector algorithm (cluster_leading_eigen) on the empirical correlation matrix. If the number of dimensions is 1, then the Leading Eigenvector solution is used; otherwise, regular EGA is used. This method was used in Christensen et al.'s (2023) Behavior Research Methods simulation
"louvain" — Applies the Louvain algorithm (cluster_louvain) on the empirical correlation matrix. If the number of dimensions is 1, then the Louvain solution is used; otherwise, regular EGA is used. This method was validated in Christensen's (2023) PsyArXiv simulation. Consensus clustering can be used by specifying either "consensus.method" or "consensus.iter"

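The "expand" check can be sketched in base R as follows. The helper names expand_correlation and is_unidimensional are hypothetical and are not part of EGAnet. The sketch also assumes that the four added variables correlate 0.50 with one another and 0 with every original variable, which the description above does not state explicitly.

```r
# Hypothetical helper: append auxiliary variables to a correlation matrix.
# Assumption: the auxiliary variables correlate `r` with one another and
# 0 with every original variable.
expand_correlation <- function(correlation, n_new = 4, r = 0.50) {
  p <- ncol(correlation)
  expanded <- diag(p + n_new)
  expanded[seq_len(p), seq_len(p)] <- correlation
  new_index <- p + seq_len(n_new)
  expanded[new_index, new_index] <- r
  diag(expanded) <- 1
  labels <- colnames(correlation)
  if (is.null(labels)) labels <- paste0("V", seq_len(p))
  labels <- c(labels, paste0("EXPAND", seq_len(n_new)))
  dimnames(expanded) <- list(labels, labels)
  expanded
}

# Decision rule from the description: two or fewer communities in the
# expanded network (the data plus the auxiliary block) means one dimension
is_unidimensional <- function(n_communities_expanded) {
  n_communities_expanded <= 2
}

# Toy 3-variable correlation matrix (illustrative values)
toy_cor <- matrix(0.4, nrow = 3, ncol = 3)
diag(toy_cor) <- 1
expanded <- expand_correlation(toy_cor)
```

The auxiliary block is expected to form a community of its own, so a unidimensional dataset should yield two communities in the expanded network: one for the data and one for the added variables.
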
plot.EGA

Boolean (length = 1). Defaults to TRUE. Whether the plot should be returned with the results. Set to FALSE for no plot

verbose

Boolean (length = 1). Whether messages and (insignificant) warnings should be output. Defaults to FALSE (silent calls). Set to TRUE to see all messages and warnings for every function call

...

Additional arguments to be passed on to auto.correlate, network.estimation, community.detection, community.consensus, and community.unidimensional

Value

Returns a list containing:

network: A matrix containing a network estimated using network.estimation
wc: A vector representing the community (dimension) membership of each node in the network. NA values mean that the node was disconnected from the network
n.dim: A scalar of how many total dimensions were identified in the network
correlation: The zero-order correlation matrix
n: Number of cases in data
dim.variables: An ordered matrix of item allocation
TEFI: tefi for the estimated structure
plot.EGA: Plot output if plot.EGA = TRUE
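
The wc element is a named vector that may contain NA values, so it helps to handle missing memberships explicitly when summarizing it. The sketch below uses a made-up vector shaped like wc rather than real EGA output. Counting the distinct non-missing labels is assumed here to mirror how n.dim is defined.

```r
# Hypothetical membership vector shaped like the `wc` element
wc <- c(item1 = 1, item2 = 1, item3 = 2, item4 = 2, item5 = NA)

# Count nodes per community, keeping disconnected (NA) nodes visible
community_sizes <- table(wc, useNA = "ifany")

# List the variables in each community (split drops the NA nodes)
community_members <- split(names(wc), wc)

# Names of nodes that were disconnected from the network
disconnected <- names(wc)[is.na(wc)]

# Number of dimensions, ignoring disconnected nodes
n_dim <- length(unique(wc[!is.na(wc)]))
```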

References

Original simulation and implementation of EGA
Golino, H. F., & Epskamp, S. (2017). Exploratory graph analysis: A new approach for estimating the number of dimensions in psychological research. PLoS ONE, 12, e0174035.

Current implementation of EGA, introduced unidimensional checks, continuous and dichotomous data
Golino, H., Shi, D., Christensen, A. P., Garrido, L. E., Nieto, M. D., Sadana, R., & Thiyagarajan, J. A. (2020). Investigating the performance of Exploratory Graph Analysis and traditional techniques to identify the number of latent factors: A simulation and tutorial. Psychological Methods, 25, 292-320.

Compared all igraph community detection algorithms, introduced Louvain algorithm, simulation with continuous and polytomous data
Also implements the Leading Eigenvector unidimensional method
Christensen, A. P., Garrido, L. E., Guerra-Pena, K., & Golino, H. (2023). Comparing community detection algorithms in psychometric networks: A Monte Carlo simulation. Behavior Research Methods.

Comprehensive unidimensionality simulation
Christensen, A. P. (2023). Unidimensional community detection: A Monte Carlo simulation, grid search, and comparison. PsyArXiv.

Author

Hudson Golino <hfg9s at virginia.edu>, Alexander P. Christensen <alexpaulchristensen at gmail.com>, Maria Dolores Nieto <acinodam at gmail.com> and Luis E. Garrido <garrido.luiseduardo at gmail.com>

Examples

# Obtain data
wmt <- wmt2[,7:24]

# Estimate EGA
ega.wmt <- EGA(
  data = wmt,
  plot.EGA = FALSE # No plot for CRAN checks
)

# Print results
print(ega.wmt)
#> Model: GLASSO (EBIC with gamma = 0.5)
#> Correlations: auto
#> Lambda: 0.0648639582532287 (n = 100, ratio = 0.1)
#> 
#> Number of nodes: 18
#> Number of edges: 96
#> Edge density: 0.627
#> 
#> Non-zero edge weights: 
#>      M    SD    Min   Max
#>  0.082 0.060 -0.013 0.363
#> 
#> ----
#> 
#> Algorithm:  Walktrap
#> 
#> Number of communities:  2
#> 
#>  wmt1  wmt2  wmt3  wmt4  wmt5  wmt6  wmt7  wmt8  wmt9 wmt10 wmt11 wmt12 wmt13 
#>     1     1     1     1     1     2     2     2     2     2     2     2     2 
#> wmt14 wmt15 wmt16 wmt17 wmt18 
#>     2     2     2     2     2 
#> 
#> ----
#> 
#> Unidimensional Method: Louvain
#> Unidimensional: No
#> 
#> ----
#> 
#> TEFI: -11.171

# Estimate EGA with the TMFG model
ega.wmt.tmfg <- EGA(
  data = wmt, model = "TMFG",
  plot.EGA = FALSE # No plot for CRAN checks
)

# Estimate EGA with Louvain algorithm
ega.wmt.louvain <- EGA(
  data = wmt, algorithm = "louvain",
  plot.EGA = FALSE # No plot for CRAN checks
)

# Estimate EGA with an {igraph} function (Fast-greedy)
ega.wmt.greedy <- EGA(
  data = wmt,
  algorithm = igraph::cluster_fast_greedy,
  plot.EGA = FALSE # No plot for CRAN checks
)