Given a tidied dataframe of topic-document or word-topic distributions and an exponent, renormalizes the distributions.
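To make the idea concrete, here is a minimal base-R sketch of power renormalization. It assumes that "renormalizing" means raising each probability to the exponent and rescaling so that each group sums to 1 again; this page does not spell out the formula, so treat that as an assumption rather than a description of the package's internals. The helper `renorm_vec` and the `toy` data are hypothetical and exist only for illustration.

```r
# Hypothetical helper, not part of the package: power renormalization of one
# probability vector, ASSUMING the rule p^exponent / sum(p^exponent).
renorm_vec = function(p, exponent) {
    powered = p^exponent
    powered / sum(powered)
}

# Toy word-topic distributions for two topics (made-up numbers)
toy = data.frame(
    topic = rep(c('V1', 'V2'), each = 3),
    token = rep(c('a', 'b', 'c'), times = 2),
    beta  = c(0.7, 0.2, 0.1, 0.4, 0.4, 0.2)
)

# Renormalize within each topic; ave() applies the function group by group
toy$beta_rn = ave(toy$beta, toy$topic,
                  FUN = function(p) renorm_vec(p, exponent = 2))

# Each topic's renormalized probabilities should again sum to 1
tapply(toy$beta_rn, toy$topic, sum)
```

Under this assumed rule, an exponent above 1 concentrates probability on the most likely categories (lower entropy), an exponent between 0 and 1 flattens the distribution (higher entropy), and an exponent of 1 leaves it unchanged.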

Usage

renorm(tidy_df, group_col, p_col, exponent, keep_original = FALSE)

Arguments

tidy_df

The tidied distribution dataframe

group_col

Grouping column: the right-hand side (conditioning variable) of the conditional probability distribution, e.g., topics for word-topic distributions

p_col

Column containing the probability for each category (e.g., word) conditional on the group (e.g., topic)

exponent

Exponent to use in renormalization

keep_original

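Whether to keep the original probabilities. If TRUE, the renormalized values are added in a new column; if FALSE (the default), they replace the values in p_col.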

Value

A dataframe. If keep_original is TRUE, it has an added column of the form p_col_rn containing the renormalized probabilities; if keep_original is FALSE, the renormalized values replace the original values in p_col.

Examples

# \donttest{
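set.seed(42)
# Simulate a small corpus: topic-document (theta) and word-topic (phi) distributions
theta  = rdirichlet(50, 1, k = 3)
phi    = rdirichlet(3, 0.1, k = 20)
corpus = draw_corpus(rep(50L, 50), theta, phi)
# Fit the model and extract the tidied word-topic distributions
model  = tmfast(corpus, n = 3)
beta   = tidy(model, matrix = 'beta', k = 3)
# Choose an exponent for a target entropy of 2, then renormalize beta within each topic
pwr    = target_power(beta, topic, beta, target_entropy = 2)
renorm(beta, topic, beta, exponent = pwr)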
#> # A tibble: 45 × 3
#>    token topic    beta
#>    <chr> <chr>   <dbl>
#>  1 2     V1    0.0113 
#>  2 2     V2    0.0113 
#>  3 2     V3    0.0148 
#>  4 5     V1    0.0255 
#>  5 5     V2    0.0233 
#>  6 5     V3    0.0352 
#>  7 8     V1    0.702  
#>  8 8     V2    0.00499
#>  9 8     V3    0.537  
#> 10 11    V1    0.0180 
#> # ℹ 35 more rows
# }