Fitting

Fits topic models using varimax-rotated principal component analysis (PCA), following the 'vintage factor analysis' approach of Rohe & Zeng (arXiv:2004.05387). Uses truncated PCA via 'irlba' on sparse matrices, which allows fast model fitting on large corpora. Includes an information-theoretic approach to vocabulary selection, 'broom'-compatible tidiers that extract the word-topic and topic-document matrices for use in tidy data workflows, and samplers for constructing simulated corpora for benchmarking and method evaluation.
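The core idea described above (truncated PCA on a sparse document-term matrix, followed by a varimax rotation) can be sketched with 'irlba', 'Matrix', and base R's varimax(). This is a minimal illustration on simulated counts, not the package's own implementation: the object names (dtm, k, word_topic, doc_topic), the toy data, and the choice to rotate the scores with the same rotation matrix as the loadings are all assumptions made for this example.

```r
library(Matrix)   # sparse matrix classes and rsparsematrix()
library(irlba)    # truncated SVD / PCA

set.seed(42)

# Hypothetical toy document-term matrix: 200 documents x 500 terms,
# about 5% of entries nonzero, with positive integer counts.
n_docs  <- 200
n_terms <- 500
dtm <- rsparsematrix(n_docs, n_terms, density = 0.05,
                     rand.x = function(n) rpois(n, 2) + 1)

k <- 5  # number of components ("topics") to extract; an arbitrary choice here

# Truncated PCA: passing `center` makes irlba() subtract the column means
# implicitly, so the sparse matrix is never converted to a dense one.
fit <- irlba(dtm, nv = k, center = colMeans(dtm))

# Unrotated loadings (terms x k), scaled by the singular values
loadings <- fit$v %*% diag(fit$d)

# Varimax rotation of the loadings (base R, stats package)
rot <- varimax(loadings, normalize = FALSE)

word_topic <- unclass(rot$loadings)   # terms x k rotated loadings
doc_topic  <- fit$u %*% rot$rotmat    # documents x k rotated scores
```

Because the rotation matrix is orthogonal, doc_topic %*% t(word_topic) reproduces the same rank-k approximation of the centered matrix as the unrotated fit; the rotation only changes how that approximation is split between the two factors.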

Author

Maintainer: D. Hicks <hicks.daniel.j@gmail.com> (ORCID)

Fitting "topic models" with PCA+varimax
