Friday, 15 July 2011

Replace NA with mean of values in previous and following row in R




I've got a data.frame full of NAs.

    date <- c("1","2","3","4","5","6","7","1","2","3","4","5","6","7")
    comp <- c("a", "a", "a", "a", "a", "a", "a", "b", "b", "b", "b", "b", "b", "b")
    bm <- c(12, 11, NA, 14, NA, 15, NA, 5, 5, NA, 6, NA, 8, 9)
    df <- data.frame(date, comp, bm, stringsAsFactors = FALSE)
    df
    #    date comp bm
    # 1     1    a 12
    # 2     2    a 11
    # 3     3    a NA
    # 4     4    a 14
    # 5     5    a NA
    # 6     6    a 15
    # 7     7    a NA
    # 8     1    b  5
    # 9     2    b  5
    # 10    3    b NA
    # 11    4    b  6
    # 12    5    b NA
    # 13    6    b  8
    # 14    7    b  9

I want to replace the NAs with the mean of the values in the previous and next row (only if it's the same company, of course). If the first row is NA, the next row's value should be taken; if the last row is NA, the second-to-last row's value should be taken.

The output should look like this:

    #    date comp   bm
    # 1     1    a   12
    # 2     2    a   11
    # 3     3    a 12.5
    # 4     4    a   14
    # 5     5    a 14.5
    # 6     6    a   15
    # 7     7    a   15
    # 8     1    b    5
    # 9     2    b    5
    # 10    3    b  5.5
    # 11    4    b    6
    # 12    5    b    7
    # 13    6    b    8
    # 14    7    b    9

Thank you!

That's a job for zoo::na.approx:

    library(plyr)
    library(zoo)
    # Interpolate bm within each company; rule = 2 extends the nearest
    # observed value to leading and trailing NAs.
    ddply(df, .(comp), transform, bm = na.approx(bm, rule = 2))
    #    date comp   bm
    # 1     1    a 12.0
    # 2     2    a 11.0
    # 3     3    a 12.5
    # 4     4    a 14.0
    # 5     5    a 14.5
    # 6     6    a 15.0
    # 7     7    a 15.0
    # 8     1    b  5.0
    # 9     2    b  5.0
    # 10    3    b  5.5
    # 11    4    b  6.0
    # 12    5    b  7.0
    # 13    6    b  8.0
    # 14    7    b  9.0
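For comparison, the rule as literally stated in the question (mean of the immediate neighbours, falling back to the single neighbour at either end) can be sketched in base R. This is only an illustration, not the answer's method: `fill_neighbor_mean` is a made-up helper name, and it assumes NAs are isolated as they are in the example data. In a run of consecutive NAs it can only use a neighbour that is itself observed, so values with no observed neighbour stay NA.

```r
# Hypothetical helper (not from zoo or plyr): replace each NA with the mean
# of the previous and next value, using whichever neighbour exists at the ends.
fill_neighbor_mean <- function(x) {
  n <- length(x)
  prev_val <- c(NA, x[-n])   # value in the previous row
  next_val <- c(x[-1], NA)   # value in the following row
  # na.rm = TRUE makes the mean fall back to the one available neighbour
  neighbor_mean <- rowMeans(cbind(prev_val, next_val), na.rm = TRUE)
  neighbor_mean[is.nan(neighbor_mean)] <- NA   # both neighbours missing
  missing <- is.na(x)
  x[missing] <- neighbor_mean[missing]
  x
}

# Same values as the question's data, filled separately per company with ave()
comp <- rep(c("a", "b"), each = 7)
bm <- c(12, 11, NA, 14, NA, 15, NA, 5, 5, NA, 6, NA, 8, 9)
filled <- ave(bm, comp, FUN = fill_neighbor_mean)
filled
```

For isolated NAs this gives the same values as linear interpolation. The two approaches differ once several NAs sit next to each other, which is where na.approx is the more robust choice.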

Edit:

In response to the comment: you need to handle the cases with only one non-NA value or with all NA values.

    my.na.approx <- function(x) {
      # All values missing: nothing to interpolate from
      if (sum(is.finite(x)) == 0L) return(x)
      # Exactly one observed value: carry it across the whole vector
      if (sum(is.finite(x)) == 1L) return(na.approx(x, rule = 2, method = "constant"))
      na.approx(x, rule = 2)
    }
    my.na.approx(c(NA, 1, NA, NA, 2, NA))
    #[1] 1.000000 1.000000 1.333333 1.666667 2.000000 2.000000
    my.na.approx(c(NA, NA, NA, NA, 2, NA))
    #[1] 2 2 2 2 2 2
    my.na.approx(c(NA, NA, NA, NA, NA, NA))
    #[1] NA NA NA NA NA NA
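If adding zoo as a dependency is not an option, the same three cases can be sketched with stats::approx() from base R. This is a rough counterpart under my own assumptions, not a drop-in replacement: `approx_fill` is a hypothetical name, and it interpolates over the positions in the vector just as the calls above do.

```r
# Hypothetical base-R counterpart of my.na.approx, using only stats::approx().
approx_fill <- function(x) {
  ok <- is.finite(x)
  n_ok <- sum(ok)
  if (n_ok == 0L) return(x)                      # nothing to interpolate from
  if (n_ok == 1L) return(rep(x[ok], length(x)))  # one value: use it everywhere
  # Linear interpolation over positions; rule = 2 extends the nearest
  # observed value to leading and trailing NAs.
  approx(which(ok), x[ok], xout = seq_along(x), rule = 2)$y
}

approx_fill(c(NA, 1, NA, NA, 2, NA))
approx_fill(c(NA, NA, NA, NA, 2, NA))
approx_fill(c(NA, NA, NA, NA, NA, NA))
```

The three calls should give the same results as the my.na.approx calls above. To apply it per company without plyr, something like `df$bm <- ave(df$bm, df$comp, FUN = approx_fill)` should work.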

r replace data.frame na
