2-dimensional "discursive space" representation of relationships between documents using Hellinger distances and t-SNE

Usage

# S3 method for data.frame
tsne(gamma_df, k, doc_ids, perplexity = NULL, df = TRUE)

Arguments

k

Number of topics (required for tmfast objects)

doc_ids

Vector of document IDs (required for STM objects)

perplexity

Perplexity parameter for t-SNE. Defaults to the minimum of 30 and floor((ndocs - 1)/3) - 1, where ndocs is the number of documents.

df

If TRUE (default), return a data frame with columns document, x, and y; otherwise return the output of Rtsne.

tm

A fitted topic model
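The default for perplexity described above can be written out as a one-line helper. This is a minimal sketch for illustration only: `default_perplexity` is a hypothetical name, not a function exported by the package, and it only restates the formula given in the argument description.

```r
# Hypothetical helper (not part of the package): restates the documented
# default, min(30, floor((ndocs - 1)/3) - 1)
default_perplexity <- function(ndocs) {
    min(30, floor((ndocs - 1) / 3) - 1)
}

default_perplexity(20)    # 5: small corpus, limited by the number of documents
default_perplexity(1000)  # 30: large corpus, capped at 30
```

With this default, 3 * perplexity stays below ndocs - 1, which matters because the algorithm looks at the 3 * perplexity nearest neighbors of each document.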

Value

See the df argument: a data frame with columns document, x, and y, or the output of Rtsne.

Details

The algorithm checks distances to the 3*perplexity nearest neighbors. Rtsne loses rownames (document IDs); these are either extracted from the tmfast object or passed separately for an STM object. The default method (not exported) takes a tidied gamma (document-topic-gamma) matrix. Use set.seed() before calling this function for reproducibility.
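The steps described above can be sketched on simulated data. This is an illustration under stated assumptions, not the package's internal code: the gamma matrix is randomly generated, the variable names are made up for the example, and the Hellinger distance is computed with base R as the Euclidean distance between square-rooted rows divided by sqrt(2). The t-SNE step only runs if the Rtsne package is installed.

```r
set.seed(42)  # t-SNE is stochastic, so fix the seed first

# Simulated document-topic matrix (assumption): 40 documents, 4 topics,
# each row a probability distribution over topics
ndocs <- 40
k <- 4
gamma <- matrix(rexp(ndocs * k), nrow = ndocs)
gamma <- gamma / rowSums(gamma)
rownames(gamma) <- paste0("doc", seq_len(ndocs))

# Hellinger distance between rows: Euclidean distance on sqrt(gamma) / sqrt(2)
hellinger <- dist(sqrt(gamma)) / sqrt(2)

# Documented default perplexity
perplexity <- min(30, floor((ndocs - 1) / 3) - 1)

if (requireNamespace("Rtsne", quietly = TRUE)) {
    fit <- Rtsne::Rtsne(as.matrix(hellinger),
                        perplexity = perplexity,
                        is_distance = TRUE)
    # Rtsne returns coordinates in $Y without rownames, so reattach the IDs
    coords <- data.frame(document = rownames(gamma),
                         x = fit$Y[, 1],
                         y = fit$Y[, 2])
    head(coords)
}
```

Passing the distances with `is_distance = TRUE` tells Rtsne to treat the input as a precomputed distance matrix rather than as raw features, which is what lets a non-Euclidean measure such as the Hellinger distance drive the embedding.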

Examples

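## From the real books vignette
## `fitted_tmf` (fitted tmfast model) and `meta` (book metadata) are created there
library(dplyr)    # left_join()
library(ggplot2)  # ggplot(), aes(), geom_point()

set.seed(42)  # make the t-SNE layout reproducible
tsne(fitted_tmf, k = 4, df = TRUE) |>
    left_join(meta, by = c('document' = 'book')) |>
    ggplot(aes(x, y, color = author)) +
    geom_point()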
#> Error in eval(expr, envir, enclos): object 'fitted_tmf' not found