Create new column using row-wise operations in Data.Table R -
i have 2 data.tables in r follows:
dt_a columns sid, date, value1, value2, sid primary key , date secondary key.
dt_b has columns sid, date1, date2, date3, ...., date12. here sid primary key , hence each row corresponds unique sid , other column names correspond secondary keys in dt_a.
i want add together column dt_a ith row contains corresponding element table dt_b[sid_i, date_i]
below illustration code , desired output:
require(data.table) dt_a <- data.table(sid = c(1,2,3,4,5,1,3), date = c("jan 2012", "feb 2012", "april 2012", "may 2012", "dec 2012", "feb 2012", "oct 2012"), value1 = rep("1", 7), value2 = rep("1", 7)) dt_b <- data.table(sid = as.character(c(1,2,3,4,5)), "jan 2012" = rep("1", 5), "feb 2012" = rep("2", 5), "march 2012" = rep("3", 5), "april 2012" = rep("4", 5), "may 2012" = rep("5", 5), "june 2012" = rep("6", 5), "july 2012" = rep("7", 5), "aug 2012" = rep("8", 5), "sept 2012" = rep("9", 5), "oct 2012" = rep("10", 5), "nov 2012" = rep("11", 5), "dec 2012" = rep("12", 5)) #set keys setkey(dt_a, sid, date) setkey(dt_b, sid) #define function fun1 fun1 <- function(x){ trycatch(dt_b[x[1], x[2], with=false], error = function(e) null) } #desired output dt_a$newcol <- sapply(apply(dt_a, 1, fun1),"[[",1) dt_a although current method works on little example, actual dt_a has 20 1000000 rows. method hangs there. can operation in more efficient way using data.table or other method?
melt sec data.table:
library(reshape2) dt_b.melted = melt(dt_b, id.vars = 'sid', variable.name = 'date') once it's melted, can set both same key, , join/assign/whatnot:
setkey(dt_b.melted, sid, date) setkey(dt_a, sid, date) dt_a[dt_b.melted, newcol := value] r data.table
No comments:
Post a Comment