Skip to contents

General function to compute a network's predictive power on new data, following Haslbeck and Waldorp (2018) and Williams and Rodriguez (2022)

This implementation is different from the predictability in the mgm package (Haslbeck), which is based on (regularized) regression. This implementation uses the network directly, converting the partial correlations into an implied precision (inverse covariance) matrix. See Details for more information

Usage

network.predictability(network, original.data, newdata, ordinal.categories = 7)

Arguments

network

Matrix or data frame. A partial correlation network

original.data

Matrix or data frame. Must consist only of variables to be used to estimate the network. See Examples

newdata

Matrix or data frame. Must consist of the same variables in the same order as original.data. See Examples

ordinal.categories

Numeric (length = 1). Up to the number of categories before a variable is considered continuous. Defaults to 7 categories before 8 is considered continuous

Value

Returns a list containing:

predictions

Predicted values of newdata based on the network

betas

Beta coefficients derived from the network

results

Performance metrics for each variable in newdata

Details

This implementation of network predictability proceeds in several steps with important assumptions:

1. Network was estimated using (partial) correlations (not regression like the mgm package!)

2. Original data that was used to estimate the network in 1. is necessary to apply the original scaling to the new data

3. (Linear) regression-like coefficients are obtained by reserve engineering the inverse covariance matrix using the network's partial correlations (i.e., by setting the diagonal of the network to -1 and computing the inverse of the opposite signed partial correlation matrix; see EGAnet:::pcor2inv)

4. Predicted values are obtained by matrix multiplying the new data with these coefficients

5. Dichotomous and polytomous data are given categorical values based on the original data's thresholds and these thresholds are used to convert the continuous predicted values into their corresponding categorical values

6. Evaluation metrics:

  • dichotomous --- Accuracy or the percent correctly predicted for the 0s and 1s

  • polytomous --- Accuracy based on the correctly predicting the ordinal category exactly (i.e., 1 = 1, 2, = 2, etc.) and a weighted accuracy such that absolute distance of the predicted value from the actual value (e.g., |prediction - actual| = 1) is used as the power of 0.5. This weighted approach provides an overall distance in terms of accuracy where each predicted value away from the actual value is given a harsher penalty (absolute difference = accuracy value): 0 = 1.000, 1 = 0.500, 2 = 0.2500, 3 = 0.1250, 4 = 0.0625, etc.

  • continuous --- R-sqaured and root mean square error

References

Original Implementation of Node Predictability
Haslbeck, J. M., & Waldorp, L. J. (2018). How well do network models predict observations? On the importance of predictability in network models. Behavior Research Methods, 50(2), 853–861.

Derivation of Regression Coefficients Used (Formula 3)
Williams, D. R., & Rodriguez, J. E. (2022). Why overfitting is not (usually) a problem in partial correlation networks. Psychological Methods, 27(5), 822–840.

Author

Hudson Golino <hfg9s at virginia.edu> and Alexander P. Christensen <alexpaulchristensen@gmail.com>

Examples

# Load data
wmt <- wmt2[,7:24]

# Set seed (to reproduce results)
set.seed(42)

# Split data
training <- sample(
  1:nrow(wmt), round(nrow(wmt) * 0.80) # 80/20 split
)

# Set splits
wmt_train <- wmt[training,]
wmt_test <- wmt[-training,]

# EBICglasso (default for EGA functions)
glasso_network <- network.estimation(
  data = wmt_train, model = "glasso"
)

# Check predictability
network.predictability(
  network = glasso_network, original.data = wmt_train,
  newdata = wmt_test
)
#> Dichotomous
#> 
#>           wmt1  wmt2  wmt3 wmt4  wmt5  wmt6  wmt7  wmt8  wmt9 wmt10 wmt11 wmt12
#> Accuracy 0.747 0.865 0.806  0.7 0.751 0.743 0.709 0.679 0.768 0.722 0.726 0.705
#>          wmt13 wmt14 wmt15 wmt16 wmt17 wmt18
#> Accuracy 0.734 0.675 0.709 0.717 0.797 0.764