
2-dimensional "discursive space" representation of relationships between documents using Hellinger distances and t-SNE.

Usage

tsne(x, ...)

# S3 method for class 'data.frame'
tsne(x, doc_ids, perplexity = NULL, df = TRUE, ...)

# S3 method for class 'tmfast'
tsne(x, k, perplexity = NULL, df = TRUE, ...)

# S3 method for class 'STM'
tsne(x, doc_ids, perplexity = NULL, df = TRUE, ...)

Arguments

x

Fitted topic model (tmfast or STM) or a tidied gamma dataframe

...

Passed to methods

doc_ids

Vector of document IDs, in the same order as rows in x

perplexity

Perplexity parameter for t-SNE. If NULL (default), the minimum of 30 and floor((length(doc_ids) - 1)/3) - 1.

df

If TRUE (default), return a dataframe with columns document, x, and y; if FALSE, return the raw output of Rtsne.

k

Number of topics
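
The default perplexity rule described above can be sketched in a few lines of base R. The helper name default_perplexity is hypothetical and not part of the package; it only illustrates the arithmetic of the stated default, under the assumption that n is the number of documents.

```r
# Hypothetical helper (not exported by the package): the default
# perplexity described above, for n documents.
default_perplexity <- function(n) {
  min(30, floor((n - 1) / 3) - 1)
}

default_perplexity(50)   # small corpus: limited by the number of documents
default_perplexity(500)  # large corpus: capped at 30

# The rule keeps 3 * perplexity strictly below n - 1
3 * default_perplexity(50) < 50 - 1
```

For small corpora the second term governs the result, so the perplexity shrinks with the number of documents; for larger corpora the cap of 30 applies.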

Value

A dataframe or the raw Rtsne output, depending on df

Details

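The t-SNE algorithm checks distances to the 3*perplexity nearest neighbors of each document, so the perplexity must be small relative to the number of documents. Rtsne drops rownames (document IDs); the IDs are either extracted from the tmfast object or passed separately as doc_ids (STM and dataframe methods). Call set.seed() beforehand for reproducible output.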
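
As a rough illustration of the distance step, the following base R sketch computes pairwise Hellinger distances between the rows of a document-topic matrix. The helper hellinger_dist and the toy matrix gamma are hypothetical names introduced here; this is not necessarily how the package computes the distances internally.

```r
# Hypothetical helper (not part of the package): pairwise Hellinger
# distances between rows of a document-topic matrix whose rows sum to 1.
hellinger_dist <- function(gamma) {
  stopifnot(is.matrix(gamma), all(gamma >= 0))
  # H(p, q) = sqrt(sum((sqrt(p) - sqrt(q))^2)) / sqrt(2)
  stats::dist(sqrt(gamma)) / sqrt(2)
}

# Toy document-topic matrix: 3 documents, 2 topics
gamma <- rbind(doc_a = c(1, 0),
               doc_b = c(0, 1),
               doc_c = c(0.5, 0.5))

d <- hellinger_dist(gamma)

# Document IDs survive in the distance object as row and column labels
round(as.matrix(d), 3)
```

A distance object like d could in principle be passed to Rtsne::Rtsne() with is_distance = TRUE. The embedding it returns carries no rownames, which is why the document IDs have to be re-attached afterwards.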

Methods (by class)

  • tsne(data.frame): Method for tidied gamma dataframes

  • tsne(tmfast): Method for fitted tmfast objects

  • tsne(STM): Method for fitted STM objects

Examples

# \donttest{
set.seed(42)
theta = rdirichlet(50, 1, k = 3)
phi   = rdirichlet(3, 0.1, k = 30)
corpus = draw_corpus(rep(50L, 50), theta, phi)
fitted = tmfast(corpus, n = 3)
tsne(fitted, k = 3, df = TRUE)
#> # A tibble: 50 × 3
#>    document      x      y
#>    <chr>     <dbl>  <dbl>
#>  1 1         5.10  -1.45 
#>  2 2        -2.05   2.32 
#>  3 3         4.51   0.428
#>  4 4        -0.116  2.89 
#>  5 5        -3.22  -0.631
#>  6 6        -3.58   2.98 
#>  7 7         1.55  -3.49 
#>  8 8        -1.77   3.30 
#>  9 9        -3.11   0.349
#> 10 10       -3.47  -2.83 
#> # ℹ 40 more rows
# }