Compares Dynamic Network Structures Using Permutation — dynamic.network.compare • EGAnet

A permutation test to determine whether dynamic network structures differ significantly from one another

Usage

dynamic.network.compare(
  data,
  paired = FALSE,
  corr = c("auto", "cor_auto", "pearson", "spearman"),
  na.data = c("pairwise", "listwise"),
  model = c("BGGM", "glasso", "TMFG"),
  id = NULL,
  group = NULL,
  n.embed = 5,
  n.embed.optimize = FALSE,
  tau = 1,
  delta = 1,
  use.derivatives = 1,
  na.derivative = c("none", "kalman", "rowwise", "skipover"),
  zero.jitter = 0.001,
  iter = 1000,
  ncores,
  seed = NULL,
  verbose = TRUE,
  ...
)

Arguments

data

Matrix or data frame. Should consist only of variables to be used in the analysis as well as an ID column

paired

Boolean (length = 1). Whether groups are repeated measures representing paired samples. Defaults to FALSE

corr

Character (length = 1). Method to compute correlations. Defaults to "auto". Available options:

"auto" — Automatically computes appropriate correlations for the data using Pearson's for continuous, polychoric for ordinal, tetrachoric for binary, and polyserial/biserial for ordinal/binary with continuous. To change the number of categories that are considered ordinal, use ordinal.categories (see polychoric.matrix for more details)
"cor_auto" — Uses cor_auto to compute correlations. Arguments can be passed along to the function
"cosine" — Uses cosine to compute cosine similarity
"pearson" — Pearson's correlation is computed for all variables regardless of categories
"spearman" — Spearman's rank-order correlation is computed for all variables regardless of categories

For other similarity measures, compute them first and input them into data with the sample size (n)

na.data

Character (length = 1). How should missing data be handled? Defaults to "pairwise". Available options:

"pairwise" — Computes correlation for all available cases between two variables
"listwise" — Computes correlation for all complete cases in the dataset

model

Character (length = 1). Defaults to "glasso". Available options:

"BGGM" — Computes the Bayesian Gaussian Graphical Model. Set argument ordinal.categories to determine levels allowed for a variable to be considered ordinal. See ?BGGM::estimate for more details
"glasso" — Computes the GLASSO with EBIC model selection. See EBICglasso.qgraph for more details
"TMFG" — Computes the TMFG method. See TMFG for more details

id

Numeric or character (length = 1). Number or name of the column identifying each individual. Defaults to NULL

group

Numeric or character (length = 1). Number or name of the column identifying group membership. Defaults to NULL

n.embed

Numeric (length = 1). Defaults to 5. Number of embedded dimensions (the number of observations to be used in the Embed function). For example, n.embed = 5 uses five consecutive observations to estimate a single derivative
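
To make the role of n.embed and tau more concrete, the following base R sketch builds a time-delay embedding by hand. The helper embed_series is hypothetical and written only for illustration; it is not the Embed function that EGAnet calls, and its column ordering is an assumption.

```r
# Hypothetical helper: time-delay embedding of a single series
# (illustration only; not EGAnet's internal Embed function)
embed_series <- function(x, n.embed = 5, tau = 1) {

  # Number of windows that fit n.embed points spaced tau observations apart
  n_rows <- length(x) - (n.embed - 1) * tau
  stopifnot(n_rows >= 1)

  # Column j holds the series shifted forward by (j - 1) * tau observations
  embedded <- sapply(
    seq_len(n.embed),
    function(j) x[seq_len(n_rows) + (j - 1) * tau]
  )

  # sapply returns a plain vector when n_rows == 1, so force a matrix
  matrix(embedded, nrow = n_rows, ncol = n.embed)
}

# A series of 15 time points embedded with n.embed = 5
x <- seq_len(15)
embedded <- embed_series(x, n.embed = 5, tau = 1)

dim(embedded)  # 11 rows (windows) by 5 columns (observations per window)
embedded[1, ]  # first window: observations 1 through 5
```

Each row is one window of consecutive observations, so a series of length 15 yields 11 windows when n.embed = 5 and tau = 1. Larger n.embed values therefore leave fewer windows per individual.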

n.embed.optimize

Boolean (length = 1). If TRUE, performs optimization of n.embed for each individual, then constructs the population based on optimized derivatives. When TRUE, individual networks are considered of interest and will always be output. Defaults to FALSE

tau

Numeric (length = 1). Defaults to 1. Number of observations to offset successive embeddings in the Embed function. Generally recommended to leave "as is"

delta

Numeric (length = 1). Defaults to 1. The time between successive observations in the time series (i.e., lag). Generally recommended to leave "as is"

use.derivatives

Numeric (length = 1). Defaults to 1. The order of the derivative to be used in the analysis. Available options:

0 — No derivatives; consistent with moving average
1 — First-order derivatives; interpreted as "velocity" or rate of change over time
2 — Second-order derivatives; interpreted as "acceleration" or rate of the rate of change over time

Generally recommended to leave "as is"
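
As a rough illustration of what the derivative orders mean, the sketch below estimates them with generalized local linear approximation (GLLA), the approach described in the Boker et al. reference cited under na.derivative. The function glla_sketch and its argument max.order are hypothetical names; this is a minimal base R version under those assumptions, not the package's implementation.

```r
# Hypothetical sketch of generalized local linear approximation (GLLA);
# illustrative only, not EGAnet's internal implementation
glla_sketch <- function(x, n.embed = 5, tau = 1, delta = 1, max.order = 2) {

  # Time-delay embedding: each row is a window of n.embed observations
  n_rows <- length(x) - (n.embed - 1) * tau
  stopifnot(n_rows >= 1, n.embed > max.order)
  embedded <- matrix(
    sapply(seq_len(n.embed), function(j) x[seq_len(n_rows) + (j - 1) * tau]),
    nrow = n_rows, ncol = n.embed
  )

  # Time offsets of each window position, centered on the window midpoint
  offsets <- (seq_len(n.embed) - mean(seq_len(n.embed))) * tau * delta

  # Taylor-series design matrix: column k + 1 is offsets^k / k!
  orders <- 0:max.order
  L <- sapply(orders, function(k) offsets^k / factorial(k))

  # Least-squares weights that map a window onto derivative estimates
  W <- L %*% solve(crossprod(L))

  # One row per window; columns are the 0th, 1st, 2nd, ... order estimates
  derivatives <- embedded %*% W
  colnames(derivatives) <- paste0("order", orders)
  derivatives
}

# Quadratic series: position t^2, velocity 2t, acceleration 2
x <- (1:15)^2
derivatives <- glla_sketch(x, n.embed = 5)

# First window covers t = 1..5 and is centered on t = 3
derivatives[1, ]
```

For this quadratic series the first window recovers a level of 9, a velocity of 6, and an acceleration of 2 at t = 3, up to floating-point error. Under this sketch, use.derivatives = 1 would correspond to keeping only the "order1" column.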

na.derivative

Character (length = 1). How should missing data in the embeddings be handled? Available options (see Boker et al. (2018) in glla references for more details):

"none" (default) — does nothing and leaves NAs in data
"kalman" — uses Kalman smoothing (KalmanSmooth) with structural time series models (StructTS) to impute missing values. This approach models the underlying temporal dependencies (trend, seasonality, autocorrelation) to generate estimates for missing observations while preserving the original time scale. More computationally intensive than the other methods but typically provides the most accurate imputation by respecting the stochastic properties of the time series
"rowwise" — adjusts time interval with respect to each embedding ensuring time intervals are adaptive to the missing data (tends to be more accurate than "none")
"skipover" — "skips over" missing data and treats the non-missing points as continuous points in time (note that the time scale shifts to the "per mean time interval," which is different and larger than the original scale)

zero.jitter

Numeric (length = 1). Small amount of Gaussian noise added to zero-variance derivatives to prevent estimation failures. For more than one variable, noise is generated from a multivariate normal distribution to ensure that orthogonal noise is added. The jitter preserves the overall structure but avoids singular covariance matrices during network estimation. Defaults to 0.001
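
The idea behind zero.jitter can be sketched in base R as follows. Two assumptions are made here that the documentation does not confirm: zero.jitter is treated as the standard deviation of the noise, and independent normal draws stand in for the multivariate normal draw.

```r
set.seed(42)

# Assumption: zero.jitter is used as the noise standard deviation
zero.jitter <- 0.001

# Toy derivative matrix in which columns "b" and "c" never change
derivatives <- cbind(a = rnorm(10), b = rep(0, 10), c = rep(1, 10))

# Flag columns with zero variance
zero_variance <- apply(derivatives, 2, var) == 0

# Independent Gaussian noise, one column per flagged variable
noise <- matrix(
  rnorm(nrow(derivatives) * sum(zero_variance), mean = 0, sd = zero.jitter),
  nrow = nrow(derivatives)
)

# Add the noise to the flagged columns only
derivatives[, zero_variance] <- derivatives[, zero_variance] + noise

# Every column now has positive variance, so correlations are defined
apply(derivatives, 2, var) > 0
```

Because the noise is tiny relative to typical derivative values, the flagged columns stay close to their original constants while no longer producing a singular covariance matrix.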

iter

Numeric (length = 1). Number of permutations to perform. Defaults to 1000 (recommended)
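
For readers unfamiliar with permutation tests, the sketch below shows the general logic that iter controls: compute a statistic on the observed groups, recompute it many times with shuffled group labels, and compare. The statistic (a plain Frobenius norm of the difference between two correlation matrices), the helper frobenius_difference, and the p-value formula are assumptions chosen for illustration; they are not taken from EGAnet's source.

```r
set.seed(42)

# Toy data: 100 cases, 4 variables, two groups of 50
n <- 100
group <- rep(1:2, each = n / 2)
data <- matrix(rnorm(n * 4), ncol = 4)

# Hypothetical statistic: Frobenius norm of the difference
# between the two groups' correlation matrices
frobenius_difference <- function(data, group) {
  difference <- cor(data[group == 1, ]) - cor(data[group == 2, ])
  sqrt(sum(difference^2))
}

# Observed statistic on the real group labels
observed <- frobenius_difference(data, group)

# Null distribution: shuffle the group labels and recompute
iter <- 200
permuted <- replicate(iter, frobenius_difference(data, sample(group)))

# Proportion of permuted statistics at least as large as the observed one
# (the +1 terms count the observed statistic itself)
p_value <- (sum(permuted >= observed) + 1) / (iter + 1)
p_value
```

More permutations give a finer-grained p-value: with this formula the smallest attainable value is 1 / (iter + 1), which is one practical reason for keeping the default of 1000.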

ncores

Numeric (length = 1). Number of cores to use in computing results. Defaults to ceiling(parallel::detectCores() / 2) or half of your computer's processing power. Set to 1 to not use parallel computing

seed

Numeric (length = 1). Defaults to NULL, which gives random results. Set a value for reproducible results. See Reproducibility and PRNG for more details on random number generation in EGAnet

verbose

Boolean (length = 1). Should progress be displayed? Defaults to TRUE. Set to FALSE to not display progress

...

Additional arguments that can be passed on to auto.correlate, network.estimation, EGA, and jsd

Value

Returns a list:

network: Data frame with row names of each measure, empirical value (statistic), and p-value based on the permutation test (p.value)
edges: List containing matrices of empirical values (statistic), p-values (p.value), and Benjamini-Hochberg corrected p-values (p.adjusted)
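
The Benjamini-Hochberg correction reported in p.adjusted can be reproduced in spirit with base R's p.adjust. The toy p-values below are made up for illustration, and applying the correction over the unique (upper-triangle) edges is an assumption about how the edge matrix is handled.

```r
# Toy symmetric matrix of edge-wise p-values for 4 variables
p_values <- matrix(NA_real_, nrow = 4, ncol = 4)
p_values[upper.tri(p_values)] <- c(0.001, 0.02, 0.03, 0.20, 0.50, 0.04)
p_values[lower.tri(p_values)] <- t(p_values)[lower.tri(p_values)]

# Assumption: correct across the unique edges only (upper triangle)
p_adjusted <- matrix(NA_real_, nrow = 4, ncol = 4)
p_adjusted[upper.tri(p_adjusted)] <- p.adjust(
  p_values[upper.tri(p_values)], method = "BH"
)

# Mirror the corrected values into the lower triangle
p_adjusted[lower.tri(p_adjusted)] <- t(p_adjusted)[lower.tri(p_adjusted)]

# Edges that remain significant at 0.05 after correction
which(p_adjusted < 0.05 & upper.tri(p_adjusted), arr.ind = TRUE)
```

In this toy case four edges have raw p-values below 0.05, but only the edge with p = 0.001 stays below 0.05 after correction, which shows why p.adjusted is the safer column to read when many edges are tested at once.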

References

Frobenius Norm
Ulitzsch, E., Khanna, S., Rhemtulla, M., & Domingue, B. W. (2023). A graph theory based similarity metric enables comparison of subpopulation psychometric networks. Psychological Methods.

Jensen-Shannon Similarity (1 - Distance)
De Domenico, M., Nicosia, V., Arenas, A., & Latora, V. (2015). Structural reducibility of multilayer networks. Nature Communications, 6(1), 1–9.

Total Network Strength
van Borkulo, C. D., van Bork, R., Boschloo, L., Kossakowski, J. J., Tio, P., Schoevers, R. A., Borsboom, D., & Waldorp, L. J. (2023). Comparing network structures on three aspects: A permutation test. Psychological Methods, 28(6), 1273–1285.

Author

Hudson Golino <hfg9s at virginia.edu> and Alexander P. Christensen <alexpaulchristensen@gmail.com>

Examples

# Three similar groups

# Set seed
set.seed(42)

# Simulate dynamic data
participants <- lapply(
  seq_len(50), function(i){

    # Get output
    output <- simDFM(
      variab = 6, timep = 15,
      nfact = 2, error = 0.100,
      dfm = "DAFS", loadings = 0.60,
      autoreg = 0.80, crossreg = 0.10,
      var.shock = 0.36, cov.shock = 0.18,
      burnin = 2000
    )

    # Add ID and group
    df <- data.frame(
      ID = i,
      Group = rep(1:3, each = 5),
      output$data
    )

    # Return data
    return(df)

  }
)

# Put participants into a data frame
df <- do.call(rbind.data.frame, participants)

if (FALSE) { # \dontrun{
# Perform comparison
dynamic.network.compare(
  data = df, paired = TRUE,
  # EGA arguments
  corr = "auto", na.data = "pairwise", model = "glasso",
  # dynEGA arguments
  id = "ID", group = "Group", n.embed = 3,
  tau = 1, delta = 1, use.derivatives = 1,
  # Permutation arguments
  iter = 1000, ncores = 2, verbose = TRUE, seed = 42
)} # }

# Two similar groups and one different

# Simulate dynamic data
participants <- lapply(
  seq_len(50), function(i){

    # Get output
    output <- simDFM(
      variab = 4, timep = 5,
      nfact = 3, error = 0.100,
      dfm = "DAFS", loadings = 0.60,
      autoreg = 0.80, crossreg = 0.10,
      var.shock = 0.36, cov.shock = 0.18,
      burnin = 2000
    )

    # Add ID and group
    df <- data.frame(
      ID = i,
      Group = rep(3, each = 5),
      output$data
    )

    # Return data
    return(df)

  }
)

# Replace group 3
new_group <- do.call(rbind.data.frame, participants)
df[df$Group == 3,] <- new_group

if (FALSE) { # \dontrun{
# Perform comparison
dynamic.network.compare(
  data = df, paired = TRUE,
  # EGA arguments
  corr = "auto", na.data = "pairwise", model = "glasso",
  # dynEGA arguments
  id = "ID", group = "Group", n.embed = 3,
  tau = 1, delta = 1, use.derivatives = 1,
  # Permutation arguments
  iter = 1000, ncores = 2, verbose = TRUE, seed = 42
)} # }