Skip to contents

A fast implementation of polychoric correlations in C. Uses the Beasley-Springer-Moro algorithm (Boro & Springer, 1977; Moro, 1995) to estimate the inverse univariate normal CDF, the Drezner-Wesolosky approximation (Drezner & Wesolosky, 1990) to estimate the bivariate normal CDF, and Brent's method (Brent, 2013) for optimization of rho

Usage

polychoric.matrix(
  data,
  na.data = c("pairwise", "listwise"),
  empty.method = c("none", "zero", "all"),
  empty.value = c("none", "point_five", "one_over"),
  ...
)

Arguments

data

Matrix or data frame. A dataset with all ordinal values (rows = cases, columns = variables). Data are required to be between 0 and 11. Proper adjustments should be made prior to analysis (e.g., scales from -3 to 3 in increments of 1 should be shifted by added 4 to all values)

na.data

Character (length = 1). How should missing data be handled? Defaults to "pairwise". Available options:

  • "pairwise" — Computes correlation for all available cases between two variables

  • "listwise" — Computes correlation for all complete cases in the dataset

empty.method

Character (length = 1). Method for empty cell correction. Available options:

  • "none" — Adds no value (empty.value = "none") to the empirical joint frequency table between two variables

  • "zero" — Adds empty.value to the cells with zero in the joint frequency table between two variables

  • "all" — Adds empty.value to all in the joint frequency table between two variables

empty.value

Character (length = 1). Value to add to the joint frequency table cells. Accepts numeric values between 0 and 1 or specific methods:

  • "none" — Adds no value (0) to the empirical joint frequency table between two variables

  • "point_five" — Adds 0.5 to the cells defined by empty.method

  • "one_over" — Adds 1 / n where n equals the number of cells based on empty.method. For empty.method = "zero", n equals the number of zero cells

...

Not used but made available for easier argument passing

Value

Returns a polychoric correlation matrix

References

Beasley-Moro-Springer algorithm
Beasley, J. D., & Springer, S. G. (1977). Algorithm AS 111: The percentage points of the normal distribution. Journal of the Royal Statistical Society. Series C (Applied Statistics), 26(1), 118-121.

Moro, B. (1995). The full monte. Risk 8 (February), 57-58.

Brent optimization
Brent, R. P. (2013). Algorithms for minimization without derivatives. Mineola, NY: Dover Publications, Inc.

Drezner-Wesolowsky bivariate normal approximation
Drezner, Z., & Wesolowsky, G. O. (1990). On the computation of the bivariate normal integral. Journal of Statistical Computation and Simulation, 35(1-2), 101-107.

Author

Alexander P. Christensen <alexpaulchristensen@gmail.com> with assistance from GPT-4

Examples

# Load data (ensure matrix for missing data example)
wmt <- as.matrix(wmt2[,7:24])

# Compute polychoric correlation matrix
correlations <- polychoric.matrix(wmt)

# Randomly assign missing data
wmt[sample(1:length(wmt), 1000)] <- NA

# Compute polychoric correlation matrix
# with pairwise missing
pairwise_correlations <- polychoric.matrix(
  wmt, na.data = "pairwise"
)

# Compute polychoric correlation matrix
# with listwise missing
pairwise_correlations <- polychoric.matrix(
  wmt, na.data = "listwise"
)