Monday, 15 February 2010

R: convert fractions to integer percentages adding up to 100



I have computed a vector of the frequencies of different events, represented as fractions and sorted in descending order. I need to interface to a tool that requires positive integer percentages that must sum to 100. I would like to generate the percentages in a fashion that best represents the input distribution. That is, I would like the relationships (ratios) among the percentages to best match the ones in the input fractions, despite the non-linearities that result from cutting off a long tail.

I have a function that generates these percentages, but I don't think it is optimal or elegant. In particular, I would like to do more of the work in numeric space before resorting to "stupid integer tricks".

Here is an illustration of a frequency vector:

fractionals <- 1 / (2 ^ c(2, 5:6, 8, rep(9,358)))
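A quick look at this vector shows why the long tail is the hard part. This is only an illustrative check, not part of the original question: the vector sums to 1, but 358 of its 362 entries are each below 0.2%, so none of them can round to a positive integer percentage on its own.

```r
fractionals <- 1 / (2 ^ c(2, 5:6, 8, rep(9, 358)))

# Number of events and total mass of the distribution.
n_events <- length(fractionals)
total <- sum(fractionals)

# The first five entries expressed as exact (non-integer) percentages.
head_pct <- head(fractionals * 100, 5)

print(n_events)
print(total)
print(head_pct)
```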

And here is the function:

# Convert a vector of fractions to integer percents summing to 100
percentize <- function(fractionals) {
  # fractionals is sorted descending and adds up to 1
  # drop elements that wouldn't round up to 1% vs. the running total
  pctofcum <- fractionals / cumsum(fractionals)
  fractionals <- fractionals[pctofcum > 0.005]
  # calculate initial percentages
  percentages <- round((fractionals / sum(fractionals)) * 100)
  # if the sum of percentages exceeds 100, remove proportionally
  i <- 1
  while (sum(percentages) > 100) {
    excess <- sum(percentages) - 100
    if (i > length(percentages)) {
      i <- 1
    }
    partialexcess <- max(1, round((excess * percentages[i]) / 100))
    percentages[i] <- percentages[i] - min(partialexcess, percentages[i] - 1)
    i <- i + 1
  }
  # if the sum of percentages falls short of 100, add proportionally
  i <- 1
  while (sum(percentages) < 100) {
    shortage <- 100 - sum(percentages)
    if (i > length(percentages)) {
      i <- 1
    }
    partialshortage <- max(1, round((shortage * percentages[i]) / 100))
    percentages[i] <- percentages[i] + partialshortage
    i <- i + 1
  }
  return(percentages)
}
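The tail-cutting step at the top of the function can be looked at in isolation. The sketch below applies the same `pctofcum > 0.005` rule to the example vector; the name `kept` is introduced here purely for illustration. An element survives only while it is more than 0.5% of the running total, so the tail is cut once the cumulative sum has grown enough.

```r
fractionals <- 1 / (2 ^ c(2, 5:6, 8, rep(9, 358)))

# Share of each element relative to the running total up to and including it.
pctofcum <- fractionals / cumsum(fractionals)

# Keep only elements that are more than 0.5% of the running total.
kept <- fractionals[pctofcum > 0.005]

print(length(kept))  # how many of the 362 events survive the cut
print(sum(kept))     # how much of the original mass they carry
```

On this vector the cut keeps 49 of the 362 elements, carrying roughly 39% of the original mass, which is where most of the distortion in the final ratios comes from.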

Any ideas?

How about this? It rescales the variables so that they should add up to 100, but if rounding leaves the sum at 99, it adds 1 to the largest frequency.

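fractionals <- 1 / (2 ^ c(2, 5:6, 8, rep(9, 358)))
# drop the tail elements that are at most 0.5% of the running total
pctofcum <- fractionals / cumsum(fractionals)
fractionals <- fractionals[pctofcum > 0.005]
# truncate to integer percentages and add 1 so that every entry is at least 1
bunnies <- as.integer(fractionals / sum(fractionals) * 100) + 1
# rescale the entries above 1 so that the total should come to 100
bunnies[bunnies > 1] <- round(bunnies[bunnies > 1] * (100 - sum(bunnies[bunnies == 1])) / sum(bunnies[bunnies > 1]))
# if rounding left the sum short of 100, add 1 to the largest frequency
if (sum(bunnies) < 100) bunnies[1] <- bunnies[1] + 1

> bunnies
[1] 45 6 3 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1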
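For comparison, here is a minimal sketch of a different technique, largest-remainder apportionment. It is not what the question or the answer above uses, and the function name `largest_remainder` and its `total` argument are hypothetical. It stays in numeric space until the last step: scale the fractions to sum to 100, take the integer parts, then hand the leftover units to the entries with the largest fractional parts. Entries that end up at 0 are dropped, since the tool needs positive percentages.

```r
# Largest-remainder apportionment (a swapped-in technique, shown as a sketch).
# `largest_remainder` and `total` are hypothetical names for illustration.
largest_remainder <- function(fractionals, total = 100) {
  # Scale so the values sum to `total`, still as non-integers.
  scaled <- fractionals / sum(fractionals) * total
  # Integer part of each share.
  result <- floor(scaled)
  # Units still to hand out after flooring.
  leftover <- as.integer(round(total - sum(result)))
  if (leftover > 0) {
    # Largest fractional parts first; ties keep their original order.
    remainder <- scaled - result
    ord <- order(-remainder, seq_along(remainder))
    winners <- ord[seq_len(leftover)]
    result[winners] <- result[winners] + 1
  }
  # Drop entries left at 0 so that every percentage is positive.
  as.integer(result[result > 0])
}

fractionals <- 1 / (2 ^ c(2, 5:6, 8, rep(9, 358)))
pcts <- largest_remainder(fractionals)
print(sum(pcts))
print(head(pcts, 4))
```

On the example vector this gives 25, 3, 2, 1 followed by 69 ones. The head of the distribution stays close to its true share, at the cost of keeping more tail entries than the 0.5% cut does. Whether that trade-off suits the downstream tool is a judgment call.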

r integer data-analysis frequency-distribution
