Extract beta and gamma matrices from tmfast objects

Usage

# S3 method for class 'tmfast'
tidy(
  x,
  k,
  matrix = "beta",
  df = TRUE,
  exponent = NULL,
  keep_original = FALSE,
  rotation = NULL,
  ...
)

Arguments

x

tmfast object

k

Index of the fitted model to extract, given as its number of topics/factors

matrix

Desired matrix: either the word-topic distributions ("beta") or the topic-document distributions ("gamma")

df

If TRUE (default), return a long dataframe; if FALSE, return a wide matrix

exponent

Renormalize the probabilities using a given exponent. Applies only when df == TRUE

keep_original

If renormalizing, return original (pre-renormalized) probabilities?

rotation

Optional rotation matrix; see details

...

Not used; required for S3 method compatibility

Value

If df is TRUE (the default), a long dataframe with one row per word-topic or topic-doc combination; column names depend on the value of matrix. If df is FALSE, a wide matrix.

Details

If rotation is not NULL, loadings/scores will be rotated. This might be used to align the fitted topics with known true topics, as in the journal_specific simulation. Loadings are left-multiplied by the given rotation, while scores are right-multiplied by the transpose of the given rotation.
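
The multiplication rule above can be sketched in base R without calling tmfast at all. This is only an illustration under assumptions the page does not state: loadings are taken to be stored as a topics-by-vocabulary matrix and scores as a documents-by-topics matrix, and the rotation is taken to be orthogonal. All object names here are hypothetical. Under those assumptions, rotating the loadings and counter-rotating the scores leaves their product unchanged.

```r
# Illustrative sketch only: base R, no tmfast functions are called.
# ASSUMPTION: loadings are k x vocabulary, scores are documents x k.
set.seed(1)
k        = 3
loadings = matrix(rnorm(k * 5), nrow = k)   # hypothetical k x 5 loadings
scores   = matrix(rnorm(4 * k), ncol = k)   # hypothetical 4 x k scores

# ASSUMPTION: the rotation is orthogonal. Build one from a QR decomposition.
rotation = qr.Q(qr(matrix(rnorm(k * k), nrow = k)))

# Loadings are left-multiplied by the rotation ...
rotated_loadings = rotation %*% loadings
# ... and scores are right-multiplied by its transpose.
rotated_scores = scores %*% t(rotation)

# With an orthogonal rotation, t(rotation) %*% rotation is the identity,
# so the product of scores and loadings is unchanged.
stopifnot(isTRUE(all.equal(scores %*% loadings,
                           rotated_scores %*% rotated_loadings)))
```

The point of the sketch is that a rotation only relabels or re-mixes the topics; it does not change what the model says about the documents as a whole, which is why it can be used to line fitted topics up with known ones.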

Examples

# \donttest{
set.seed(42)
theta  = rdirichlet(50, 1, k = 3)
phi    = rdirichlet(3, 0.1, k = 20)
corpus = draw_corpus(rep(50L, 50), theta, phi)
model  = tmfast(corpus, n = 3)
tidy(model, k = 3, matrix = 'beta')
#> # A tibble: 45 × 3
#>    token topic      beta
#>    <chr> <chr>     <dbl>
#>  1 2     V1    0.000275 
#>  2 2     V2    0.000280 
#>  3 2     V3    0.000760 
#>  4 5     V1    0.00137  
#>  5 5     V2    0.00117  
#>  6 5     V3    0.00422  
#>  7 8     V1    0.985    
#>  8 8     V2    0.0000554
#>  9 8     V3    0.938    
#> 10 11    V1    0.000689 
#> # ℹ 35 more rows
tidy(model, k = 3, matrix = 'gamma')
#> # A tibble: 150 × 3
#>    document topic  gamma
#>    <chr>    <chr>  <dbl>
#>  1 1        V1    0.145 
#>  2 1        V2    0.692 
#>  3 1        V3    0.162 
#>  4 2        V1    0.201 
#>  5 2        V2    0.256 
#>  6 2        V3    0.543 
#>  7 3        V1    0.176 
#>  8 3        V2    0.481 
#>  9 3        V3    0.342 
#> 10 4        V1    0.0707
#> # ℹ 140 more rows
# }