Fits topic models using varimax-rotated principal component analysis (PCA), following the 'vintage factor analysis' approach of Rohe & Zeng (2020) <arXiv:2004.05387>. Uses truncated PCA via 'irlba' on sparse matrices, which allows fast model fitting on large corpora. Includes an information-theoretic approach to vocabulary selection, 'broom'-compatible tidiers that extract word-topic and topic-document matrices for tidy data workflows, and samplers for constructing simulated corpora for benchmarking and method evaluation.
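As a rough illustration of the varimax-rotated PCA idea described above, the following NumPy sketch rotates the leading principal components of a small toy document-term matrix. It is not the package's implementation: the `varimax` helper, the simulated counts, and the column-centering step are all assumptions made for illustration, and a dense `numpy.linalg.svd` stands in for the truncated 'irlba' decomposition.

```python
import numpy as np

def varimax(loadings, max_iter=100, tol=1e-8):
    """Illustrative varimax rotation (hypothetical helper, not the package API).

    Returns the rotated loadings and the orthogonal rotation matrix.
    """
    n, k = loadings.shape
    rotation = np.eye(k)
    objective = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        # Gradient of the varimax criterion with respect to the rotation
        grad = loadings.T @ (rotated**3 - rotated * (rotated**2).sum(axis=0) / n)
        # Project the gradient back onto the orthogonal matrices via the SVD
        u, s, vt = np.linalg.svd(grad)
        rotation = u @ vt
        new_objective = s.sum()
        if new_objective < objective * (1 + tol):
            break
        objective = new_objective
    return loadings @ rotation, rotation

# Hypothetical toy corpus: 30 documents x 12 terms, three groups of documents,
# each favouring its own block of 4 terms.
rng = np.random.default_rng(0)
rows = []
for g in range(3):
    row = np.full(12, 0.5)
    row[4 * g:4 * g + 4] = 5.0
    rows.append(np.tile(row, (10, 1)))
counts = rng.poisson(np.vstack(rows))

# PCA via SVD of the column-centered matrix (assumed preprocessing),
# keeping the first k components.
k = 2
centered = counts - counts.mean(axis=0)
u, s, vt = np.linalg.svd(centered, full_matrices=False)
scores = u[:, :k]              # documents x k
loadings = vt[:k].T * s[:k]    # terms x k

# Rotate the term loadings, and apply the same rotation to the document scores.
loadings_rot, rotation = varimax(loadings)
scores_rot = scores @ rotation
```

Because the rotation is orthogonal, `scores_rot @ loadings_rot.T` reproduces the same rank-`k` approximation as the unrotated components; only the basis changes, toward loadings that are easier to read as topics.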
Author
Maintainer: D. Hicks <hicks.daniel.j@gmail.com>