Given a tidied dataframe of topic-doc or word-topic distributions and an exponent, renormalizes the distributions.
Arguments
- tidy_df
The tidied distribution dataframe
- group_col
Grouping column, i.e., the RHS of the conditional probability distribution (e.g., topics for word-topic distributions)
- p_col
Column containing the probability for each category (e.g., word) conditional on the group (e.g., topic)
- exponent
Exponent to use in renormalization
- keep_original
Keep the original probabilities in p_col? If TRUE, the renormalized values are added as a separate column.
Value
A dataframe. If keep_original is TRUE, it has an added column of the form p_col_rn containing the renormalized probabilities; if keep_original is FALSE, the renormalized values replace the original values in p_col.
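As a rough illustration of the arguments and return value described above, the sketch below assumes that renormalizing means raising each probability to the exponent and rescaling so that every group sums to 1 again. The helper renorm_sketch and the toy data are hypothetical and use base R only; this is not the package's implementation, and the package function takes unquoted column names where the sketch takes strings.

```r
# Hypothetical helper (an assumption, not the package's code): raise each
# probability to `exponent`, then rescale within each group to sum to 1.
renorm_sketch <- function(df, group_col, p_col, exponent,
                          keep_original = FALSE) {
  powered <- df[[p_col]]^exponent
  # ave() applies sum() within each level of the grouping column
  group_totals <- ave(powered, df[[group_col]], FUN = sum)
  # Mirror the documented behavior: new "<p_col>_rn" column or overwrite
  out_col <- if (keep_original) paste0(p_col, "_rn") else p_col
  df[[out_col]] <- powered / group_totals
  df
}

# Toy word-topic distributions; each topic's beta values sum to 1
toy <- data.frame(
  topic = c("V1", "V1", "V1", "V2", "V2"),
  token = c("a", "b", "c", "a", "b"),
  beta  = c(0.5, 0.3, 0.2, 0.9, 0.1)
)

sharpened <- renorm_sketch(toy, "topic", "beta", exponent = 2,
                           keep_original = TRUE)
# Check that each topic's renormalized probabilities sum to 1
tapply(sharpened$beta_rn, sharpened$topic, sum)
```

Under this assumption, an exponent greater than 1 concentrates mass on the most probable categories (lower entropy), an exponent between 0 and 1 flattens the distribution (higher entropy), and an exponent of 1 leaves already-normalized distributions unchanged.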
Examples
# \donttest{
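library(tmfast)
set.seed(42)
# Simulate true distributions: 50 documents over 3 topics, 3 topics over 20 words
theta = rdirichlet(50, 1, k = 3)
phi = rdirichlet(3, 0.1, k = 20)
# Draw a corpus of 50 documents from theta and phi, then fit a 3-topic model
corpus = draw_corpus(rep(50L, 50), theta, phi)
model = tmfast(corpus, n = 3)
# Tidied word-topic distributions (one row per token-topic pair)
beta = tidy(model, matrix = 'beta', k = 3)
# Find the exponent for a target entropy of 2, then renormalize with it
pwr = target_power(beta, topic, beta, target_entropy = 2)
renorm(beta, topic, beta, exponent = pwr)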
#> # A tibble: 45 × 3
#>    token topic    beta
#>    <chr> <chr>   <dbl>
#>  1 2     V1    0.0113
#>  2 2     V2    0.0113
#>  3 2     V3    0.0148
#>  4 5     V1    0.0255
#>  5 5     V2    0.0233
#>  6 5     V3    0.0352
#>  7 8     V1    0.702
#>  8 8     V2    0.00499
#>  9 8     V3    0.537
#> 10 11    V1    0.0180
#> # ℹ 35 more rows
# }