Generators

Generate a simulated text corpus

peak_alpha()
Alpha parameter with a single peak
expected_entropy()
Expected entropy for samples from a Dirichlet distribution
rdirichlet()
Sample from the Dirichlet distribution
draw_a_word()
Draw a single word given topic and word distributions
draw_words()
Draw words for one document
draw_corpus()
Draw a collection of documents
journal_specific()
"Journal-specific" simulation scenario

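As a rough illustration of what "sample from the Dirichlet distribution" means, the sketch below draws one sample by normalizing independent Gamma draws, which is a standard construction. It is written in Python with only the standard library and is not the package's rdirichlet(); the name rdirichlet_sketch and its arguments are hypothetical.

```python
import random


def rdirichlet_sketch(alpha, rng):
    """Draw one sample from Dirichlet(alpha).

    Hypothetical helper, not the tmfast implementation: each component is an
    independent Gamma(alpha_i, 1) draw, and the draws are normalized to sum to 1.
    """
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


# A seeded generator keeps the example reproducible.
rng = random.Random(42)

# An alpha vector with one large entry tends to put most mass on that component.
theta = rdirichlet_sketch([5.0, 1.0, 1.0], rng)
print(theta)
```

In a simulated corpus, a vector like theta could serve as one document's topic distribution, from which individual words are then drawn.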
Information gain

Tools for information-theoretic vocabulary selection

ndH()
Information gain (uniform distribution)
ndR()
Information gain (length-proportional distribution)

tmfast

Fitting “topic models” with PCA+varimax

tmfast()
Fit a topic model using PCA+varimax
insert_topics()
Insert a topic model into a fitted tmfast
varimax_irlba()
Fit a varimax-rotated PCA using irlba
fit_varimax()
Given a (rank n) PCA fit, return a rank k < n varimax fit

Tidiers

Extract beta and gamma matrices from tmfast objects

tidy(<tmfast>)
Extract beta and gamma matrices from tmfast objects
tidy_all()
Extract gamma or beta matrices for all topics

Renormalization

Renormalize a distribution to match a desired expected entropy

solve_power()
Solve the equation to find the desired exponent
target_power()
Find target power for renormalization
renorm()
Renormalize tidied distributions
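The entries above describe finding an exponent and renormalizing to hit a target entropy. A minimal sketch of that idea for a single distribution follows, assuming the renormalization raises each probability to a power and rescales, and assuming entropy is measured in bits. The function names, the bisection solver, and the single-distribution target are illustrative assumptions; the package itself works on tidied distributions with an expected-entropy target, and its actual solver is not shown here.

```python
import math


def entropy_bits(p):
    """Shannon entropy in bits; zero-probability entries contribute nothing."""
    return -sum(x * math.log2(x) for x in p if x > 0)


def power_renorm(p, a):
    """Raise each probability to the power a and rescale to sum to 1.

    Dividing by the maximum first keeps the largest weight at exactly 1,
    so the total never underflows to zero for large exponents.
    """
    m = max(p)
    w = [(x / m) ** a if x > 0 else 0.0 for x in p]
    total = sum(w)
    return [x / total for x in w]


def solve_power_sketch(p, target, iterations=60):
    """Find an exponent whose renormalized entropy matches the target (bisection).

    Relies on the entropy of the powered distribution decreasing as the
    exponent grows. The target must lie between the entropy reached at very
    large exponents and log2 of the number of nonzero entries.
    """
    lo, hi = 0.0, 1.0
    # Grow the upper bracket until the entropy falls to or below the target.
    while entropy_bits(power_renorm(p, hi)) > target and hi < 1e6:
        hi *= 2.0
    for _ in range(iterations):
        mid = (lo + hi) / 2.0
        if entropy_bits(power_renorm(p, mid)) > target:
            lo = mid  # still too flat: need a larger exponent
        else:
            hi = mid  # too peaked: need a smaller exponent
    return (lo + hi) / 2.0


p = [0.5, 0.3, 0.15, 0.05]
a = solve_power_sketch(p, target=1.0)
q = power_renorm(p, a)
print(a, q, entropy_bits(q))
```

An exponent above 1 sharpens the distribution (lower entropy), and an exponent below 1 flattens it (higher entropy).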

Hellinger distances

Calculate Hellinger distances between distributions

hellinger(<Matrix>)
Hellinger distance for matrices
hellinger()
Hellinger distances
hellinger(<data.frame>)
Hellinger distance for dataframes
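For reference, the Hellinger distance between two discrete distributions p and q is sqrt(0.5 * sum((sqrt(p_i) - sqrt(q_i))^2)), which ranges from 0 (identical) to 1 (no shared support). The sketch below computes it for two plain lists in Python. It illustrates the definition only; hellinger_sketch is a hypothetical name, and the package's matrix and dataframe methods are not reproduced.

```python
import math


def hellinger_sketch(p, q):
    """Hellinger distance between two discrete distributions of equal length.

    Hypothetical helper for illustration; both inputs are assumed to be
    non-negative and to sum to 1.
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    squared = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    return math.sqrt(0.5 * squared)


same = hellinger_sketch([0.2, 0.8], [0.2, 0.8])      # identical distributions
disjoint = hellinger_sketch([1.0, 0.0], [0.0, 1.0])  # no overlap at all
partial = hellinger_sketch([0.5, 0.5], [0.9, 0.1])   # somewhere in between
print(same, disjoint, partial)
```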

Discursive space visualizations

tsne()
Discursive space using t-SNE
tsne(<data.frame>)
Discursive space using t-SNE
umap()
Discursive space using UMAP
umap(<STM>)
Discursive space with UMAP for structural topic models
umap(<matrix>)
Discursive space with UMAP given a distance matrix
umap(<tmfast>)
Discursive space with UMAP for tmfast topic models

Utilities

Utility functions

build_matrix()
Convert a long dataframe to a wide (sparse) matrix
entropy()
Entropy of a distribution
loadings()
Extract a PCA/varimax loadings matrix
scores()
Extract item scores from a fitted PCA/varimax model
rotation()
Extract varimax rotation
make_colnames()
Make colnames
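As a quick illustration of the "entropy of a distribution" entry above, the sketch below computes Shannon entropy for a list of probabilities in Python. The base-2 logarithm (bits) is an assumption made for this example, and entropy_sketch is a hypothetical name rather than the package's entropy() function.

```python
import math


def entropy_sketch(p):
    """Shannon entropy of a discrete distribution, in bits (base 2 assumed).

    Zero-probability entries are skipped, following the convention
    that 0 * log(0) is treated as 0.
    """
    return -sum(x * math.log2(x) for x in p if x > 0)


uniform = entropy_sketch([0.25, 0.25, 0.25, 0.25])  # maximally spread over 4 outcomes
point = entropy_sketch([1.0, 0.0, 0.0, 0.0])        # all mass on one outcome
skewed = entropy_sketch([0.7, 0.1, 0.1, 0.1])       # between the two extremes
print(uniform, point, skewed)
```

A uniform distribution over n outcomes gives the maximum, log2(n) bits, and a point mass gives 0.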