2-dimensional "discursive space" representation of relationships between documents using Hellinger distances and t-SNE.
Usage
tsne(x, ...)
# S3 method for class 'data.frame'
tsne(x, doc_ids, perplexity = NULL, df = TRUE, ...)
# S3 method for class 'tmfast'
tsne(x, k, perplexity = NULL, df = TRUE, ...)
# S3 method for class 'STM'
tsne(x, doc_ids, perplexity = NULL, df = TRUE, ...)
Arguments
- x
Fitted topic model (tmfast or STM)
- ...
Passed to methods
- doc_ids
Vector of document IDs, in the same order as rows in x
- perplexity
Perplexity parameter for t-SNE. By default, minimum of 30 and floor((length(doc_ids) - 1)/3) - 1.
- df
Return a dataframe with columns document, x, and y (default) or the raw output of Rtsne.
- k
Number of topics
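The default perplexity rule can be reproduced in base R. This is a minimal sketch rather than the package's internal code: the helper name default_perplexity is hypothetical, and n is assumed to be the number of documents (length(doc_ids)).

```r
# Hypothetical helper (not part of the package): reproduces the documented
# default, min(30, floor((n - 1) / 3) - 1), where n = length(doc_ids).
default_perplexity <- function(n) {
  min(30, floor((n - 1) / 3) - 1)
}

default_perplexity(50)   # 15
default_perplexity(200)  # 30
```

Under this rule, 3 * perplexity always stays below n - 1, so small corpora get a correspondingly smaller perplexity.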
Details
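The t-SNE algorithm checks distances to the 3*perplexity nearest neighbors,
which is why the default perplexity is capped relative to the number of
documents. Rtsne drops rownames (document IDs); these are either extracted
from the tmfast object or passed separately via doc_ids for an STM object.
Call set.seed() before calling tsne() for reproducible results.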
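To make the distance step concrete, here is a minimal base R sketch of Hellinger distances between the rows of a document-topic matrix, followed by re-attaching document IDs to an unnamed coordinate matrix. The toy matrix, the document IDs, and the placeholder coords are made up for illustration. Whether the package calls Rtsne exactly as the comment suggests is an assumption.

```r
# Toy document-topic matrix: 4 documents, 3 topics, each row sums to 1.
# Values and document IDs are invented for illustration.
theta <- matrix(
  c(0.7, 0.2, 0.1,
    0.1, 0.8, 0.1,
    0.3, 0.3, 0.4,
    0.7, 0.2, 0.1),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("doc_a", "doc_b", "doc_c", "doc_d"), NULL)
)

# Hellinger distance: H(p, q) = sqrt(sum((sqrt(p) - sqrt(q))^2)) / sqrt(2),
# i.e. the Euclidean distance between square-rooted rows, scaled by 1/sqrt(2).
hellinger <- as.matrix(dist(sqrt(theta))) / sqrt(2)

# Identical rows are at distance 0; all values lie in [0, 1].
hellinger["doc_a", "doc_d"]
range(hellinger)

# An embedding step such as Rtsne::Rtsne(hellinger, is_distance = TRUE, ...)
# returns an unnamed coordinate matrix, so the IDs are kept alongside and
# re-attached afterwards. `coords` is only a stand-in for that output.
doc_ids <- rownames(theta)
coords <- matrix(0, nrow = length(doc_ids), ncol = 2)
data.frame(document = doc_ids, x = coords[, 1], y = coords[, 2])
```

The final data frame mirrors the document, x, y layout returned when df = TRUE.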
Methods (by class)
- tsne(data.frame): Method for tidied gamma dataframes
- tsne(tmfast): Method for fitted tmfast objects
- tsne(STM): Method for fitted STM objects
Examples
# \donttest{
set.seed(42)
theta = rdirichlet(50, 1, k = 3)
phi = rdirichlet(3, 0.1, k = 30)
corpus = draw_corpus(rep(50L, 50), theta, phi)
fitted = tmfast(corpus, n = 3)
tsne(fitted, k = 3, df = TRUE)
#> # A tibble: 50 × 3
#>    document      x      y
#>    <chr>     <dbl>  <dbl>
#>  1 1         5.10  -1.45
#>  2 2        -2.05   2.32
#>  3 3         4.51   0.428
#>  4 4        -0.116  2.89
#>  5 5        -3.22  -0.631
#>  6 6        -3.58   2.98
#>  7 7         1.55  -3.49
#>  8 8        -1.77   3.30
#>  9 9        -3.11   0.349
#> 10 10       -3.47  -2.83
#> # ℹ 40 more rows
# }