Estimates dynamic communities in multivariate time series (e.g., panel data, longitudinal data, intensive longitudinal data) at multiple time scales and at different levels of analysis: individuals (intraindividual structure), groups, and population (interindividual structure).
Usage
dynEGA(
data,
id = NULL,
group = NULL,
n.embed = 5,
tau = 1,
delta = 1,
use.derivatives = 1,
level = c("individual", "group", "population"),
corr = c("auto", "cor_auto", "pearson", "spearman"),
na.data = c("pairwise", "listwise"),
model = c("BGGM", "glasso", "TMFG"),
algorithm = c("leiden", "louvain", "walktrap"),
uni.method = c("expand", "LE", "louvain"),
ncores,
verbose = TRUE,
...
)
Arguments
- data
Matrix or data frame. Participants and variables should be in long format such that row t represents observations for all variables at time point t for a participant. The next row, t + 1, represents the next measurement occasion for that same participant. The next participant's data should immediately follow, in the same pattern, after the previous participant.
data should have an ID variable labeled "ID"; otherwise, it is assumed that the data represent the population.
For groups, data should have a Group variable labeled "Group"; otherwise, it is assumed that there are no groups in data.
Arguments id and group can be specified to tell the function which column in data it should use as the ID and Group variable, respectively.
A measurement occasion variable is not necessary and should be removed from the data before proceeding with the analysis.
- id
Numeric or character (length = 1). Number or name of the column identifying each individual. Defaults to NULL
- group
Numeric or character (length = 1). Number or name of the column identifying group membership. Defaults to NULL
- n.embed
Numeric (length = 1). Defaults to 5. Number of embedded dimensions (the number of observations to be used in the Embed function). For example, n.embed = 5 will use five consecutive observations to estimate a single derivative
- tau
Numeric (length = 1). Defaults to 1. Number of observations to offset successive embeddings in the Embed function. Generally recommended to leave "as is"
- delta
Numeric (length = 1). Defaults to 1. The time between successive observations in the time series (i.e., lag). Generally recommended to leave "as is"
- use.derivatives
Numeric (length = 1). Defaults to 1. The order of the derivative to be used in the analysis. Available options:
0: No derivatives; consistent with moving average
1: First-order derivatives; interpreted as "velocity" or rate of change over time
2: Second-order derivatives; interpreted as "acceleration" or rate of the rate of change over time
Generally recommended to leave "as is"
- level
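Character vector (up to length of 3). A character vector indicating which level(s) to estimate:
"individual": Estimates the structure for each individual (intraindividual structure; requires an ID variable, see data)
"group": Estimates the structure for each group (requires a Group variable, see data)
"population": Estimates the structure across all individuals (interindividual structure)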
- corr
Character (length = 1). Method to compute correlations. Defaults to "auto". Available options:
"auto": Automatically computes appropriate correlations for the data using Pearson's for continuous, polychoric for ordinal, tetrachoric for binary, and polyserial/biserial for ordinal/binary with continuous. To change the number of categories that are considered ordinal, use ordinal.categories (see polychoric.matrix for more details)
"cor_auto": Uses cor_auto to compute correlations. Arguments can be passed along to the function
"pearson": Pearson's correlation is computed for all variables regardless of categories
"spearman": Spearman's rank-order correlation is computed for all variables regardless of categories
For other similarity measures, compute them first and input them into data with the sample size (n)
- na.data
Character (length = 1). How should missing data be handled? Defaults to "pairwise". Available options:
"pairwise": Computes correlation for all available cases between two variables
"listwise": Computes correlation for all complete cases in the dataset
- model
Character (length = 1). Defaults to "glasso". Available options:
"BGGM": Computes the Bayesian Gaussian Graphical Model. Set argument ordinal.categories to determine levels allowed for a variable to be considered ordinal. See ?BGGM::estimate for more details
"glasso": Computes the GLASSO with EBIC model selection. See EBICglasso.qgraph for more details
"TMFG": Computes the TMFG method. See TMFG for more details
- algorithm
Character or igraph cluster_* function (length = 1). Defaults to "walktrap". Three options are listed below but all are available (see community.detection for other options):
"leiden": See cluster_leiden for more details
"louvain": By default, "louvain" will implement the Louvain algorithm using the consensus clustering method (see community.consensus for more information). This function will implement consensus.method = "most_common" and consensus.iter = 1000 unless specified otherwise
"walktrap": See cluster_walktrap for more details
- uni.method
Character (length = 1). What unidimensionality method should be used? Defaults to "louvain". Available options:
"expand": Expands the correlation matrix with four variables correlated 0.50. If the check returns 2 or fewer dimensions, then the data are unidimensional; otherwise, regular EGA with no matrix expansion is used. This method was used in Golino et al.'s (2020) Psychological Methods simulation
"LE": Applies the Leading Eigenvector algorithm (cluster_leading_eigen) on the empirical correlation matrix. If the number of dimensions is 1, then the Leading Eigenvector solution is used; otherwise, regular EGA is used. This method was used in Christensen et al.'s (2023) Behavior Research Methods simulation
"louvain": Applies the Louvain algorithm (cluster_louvain) on the empirical correlation matrix. If the number of dimensions is 1, then the Louvain solution is used; otherwise, regular EGA is used. This method was validated in Christensen's (2022) PsyArXiv simulation. Consensus clustering can be used by specifying either "consensus.method" or "consensus.iter"
- ncores
Numeric (length = 1). Number of cores to use in computing results. Defaults to ceiling(parallel::detectCores() / 2) or half of your computer's processing power. Set to 1 to not use parallel computing
If you're unsure how many cores your computer has, then type:
parallel::detectCores()
- verbose
Boolean (length = 1). Should progress be displayed? Defaults to TRUE. Set to FALSE to not display progress
- ...
Additional arguments to be passed on to auto.correlate, network.estimation, community.detection, community.consensus, and EGA
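To illustrate the long format described for the data argument, here is a small base-R sketch. The data frame toy and its columns Time, V1, and V2 are hypothetical names used only for illustration; only the "ID" and "Group" labels come from the documentation above.

```r
# Hypothetical toy data: 2 participants, 4 measurement occasions each
toy <- data.frame(
  ID    = rep(1:2, each = 4),          # participant identifier
  Group = rep(1:2, each = 4),          # group membership
  Time  = rep(1:4, times = 2),         # measurement occasion (hypothetical column)
  V1    = c(0.1, 0.4, 0.3, 0.7, 1.2, 1.0, 0.8, 0.9),
  V2    = c(2.0, 1.8, 1.9, 1.5, 0.2, 0.5, 0.4, 0.6)
)

# Rows must be ordered by participant, then by occasion within participant
toy <- toy[order(toy$ID, toy$Time), ]

# The occasion variable is only needed for sorting; drop it before analysis
toy$Time <- NULL

str(toy)
```

This shows the layout only. Each participant needs at least (n.embed - 1) * tau + 1 observations to form even a single embedded row, so a series this short could not be analyzed with the default n.embed = 5.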
Value
A list containing:
- Derivatives
A list containing:
Estimates: A list the length of the unique IDs containing data frames of zero- to second-order derivatives for each ID in data
EstimatesDF: A data frame of derivatives across all IDs containing columns of the zero- to second-order derivatives as well as id and group variables (group is automatically set to 1 for all if no group is provided)
- dynEGA
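A list containing the EGA results for each level requested in level, stored in the elements population, group, and individual (see Examples)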
Details
Derivatives for each variable's time series for each participant are estimated using generalized local linear approximation (see glla). EGA is then applied to these derivatives to model how variables are changing together over time. Variables that change together over time are detected as communities.
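The derivative step described above can be sketched in base R. This is a minimal illustration under stated assumptions, not EGAnet's internal code: the helper names embed_series and glla_sketch are hypothetical, and the weights assume the least-squares formulation of generalized local linear approximation described by Boker et al. (2010).

```r
# Time-delay embedding: row i holds x[i], x[i + tau], ..., x[i + (n_embed - 1) * tau]
embed_series <- function(x, n_embed = 5, tau = 1) {
  n_rows <- length(x) - (n_embed - 1) * tau
  stopifnot(n_rows >= 1)
  idx <- outer(seq_len(n_rows), (seq_len(n_embed) - 1) * tau, "+")
  matrix(x[as.vector(idx)], nrow = n_rows)
}

# GLLA-style estimates up to `order` (0 = level, 1 = velocity, 2 = acceleration)
glla_sketch <- function(x, n_embed = 5, tau = 1, delta = 1, order = 2) {
  X <- embed_series(x, n_embed, tau)
  # Centered time offsets of each embedded column, in units of elapsed time
  offsets <- (seq_len(n_embed) - mean(seq_len(n_embed))) * tau * delta
  # Column k + 1 holds the Taylor-series term offsets^k / k!
  L <- sapply(0:order, function(k) offsets^k / factorial(k))
  # Least-squares weights mapping each embedded row onto derivative estimates
  W <- L %*% solve(crossprod(L))
  estimates <- X %*% W
  colnames(estimates) <- paste0("order", 0:order)
  estimates
}

# A straight line with slope 2: velocity should be 2 and acceleration 0
x <- 3 + 2 * (1:20)
d <- glla_sketch(x)
dim(d)
head(d)
```

With n_embed = 5 and tau = 1, a series of 20 observations yields 16 embedded rows, which is why short series leave few derivative estimates to correlate.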
References
Generalized local linear approximation
Boker, S. M., Deboeck, P. R., Edler, C., & Keel, P. K. (2010). Generalized local linear approximation of derivatives from time series. In S.-M. Chow, E. Ferrer, & F. Hsieh (Eds.), The Notre Dame series on quantitative methodology. Statistical methods for modeling human dynamics: An interdisciplinary dialogue (pp. 161-178). Routledge/Taylor & Francis Group.
Deboeck, P. R., Montpetit, M. A., Bergeman, C. S., & Boker, S. M. (2009). Using derivative estimates to describe intraindividual variability at multiple time scales. Psychological Methods, 14(4), 367-386.
Original dynamic EGA implementation
Golino, H., Christensen, A. P., Moulder, R. G., Kim, S., & Boker, S. M. (2021). Modeling latent topics in social media using Dynamic Exploratory Graph Analysis: The case of the right-wing and left-wing trolls in the 2016 US elections. Psychometrika.
Time delay embedding procedure
Savitzky, A., & Golay, M. J. (1964). Smoothing and differentiation of data by simplified least squares procedures. Analytical Chemistry, 36(8), 1627-1639.
See also
plot.EGAnet for plot usage in EGAnet
Author
Hudson Golino <hfg9s at virginia.edu> and Alexander P. Christensen <alexpaulchristensen@gmail.com>
Examples
# Population structure
simulated_population <- dynEGA(
data = sim.dynEGA, level = "population"
# uses simulated data in package
# useful to understand how data should be structured
)
# Group structure
simulated_group <- dynEGA(
data = sim.dynEGA, level = "group"
# uses simulated data in package
# useful to understand how data should be structured
)
if (FALSE) { # \dontrun{
# Individual structure
simulated_individual <- dynEGA(
data = sim.dynEGA, level = "individual",
ncores = 2, # use more for quicker results
verbose = TRUE # progress bar
)
# Population, group, and individual structure
simulated_all <- dynEGA(
data = sim.dynEGA,
level = c("individual", "group", "population"),
ncores = 2, # use more for quicker results
verbose = TRUE # progress bar
)
# Plot population
plot(simulated_all$dynEGA$population)
# Plot groups
plot(simulated_all$dynEGA$group)
# Plot individual
plot(simulated_all$dynEGA$individual, id = 1)
# Step through all plots
# Unless `id` is specified, 4 random IDs
# will be drawn from individuals
plot(simulated_all)
} # }