Monday, 15 September 2014

Merging multiple text files in R with a constraint




I have 10 text files, each containing thousands of rows.

Example, first file:

    v1 v2
    1 2
    2 3
    10 20
    1 4
    .....

Second file:

    v1 v2
    1 2
    8 10
    .....

What I want is a final file containing 12 columns. The first 2 columns represent the relationship (the pair), and the next 10 columns represent the different files (1 if the pair is present in that file, 0 if it is not present). Example:

Final file:

    v1 v2 v3 v4 v5 v6 v7 v8 v9 v10 v11 v12
    1 2 1 1 0 0 1 0 1 1 0 1
    2 3 1 0 1 1 0 0 0 1 1 0

So far, I have sorted the pairs in every file so that the pairs whose first number is 1 appear on top, followed by the other numbers.

That is, for a particular file, I did this:

    v1 v2
    1 2
    1 3
    1 10
    1 5
    2 10
    2 15
    .......

Then I tried using the merge command; however, I know it's not the right one. I am not able to think of another method to do it.

Here's one way to do it. I'll assume you've read the files into a list of data.frames, e.g. with l <- lapply(filenames, read.table). I'll simulate l below.

    # Create dummy data
    l <- replicate(5, expand.grid(1:10, 1:10)[sample(100, 10), ], simplify=FALSE)

    # Add a column to each data.frame in l.
    # This will indicate presence of the pair when we merge.
    l <- lapply(seq_along(l), function(i) {
      l[[i]][, paste0('df', i)] <- 1
      l[[i]]
    })

    # Merge all the data.frames
    # (hat-tip @charles - http://stackoverflow.com/a/8097519/489704)
    l.merged <- Reduce(function(...) merge(..., all=TRUE), l)

    head(l.merged)
    #   Var1 Var2 df1 df2 df3 df4 df5
    # 1    1    2  NA  NA  NA   1  NA
    # 2    1    5   1  NA  NA  NA   1
    # 3    1    9  NA  NA  NA   1  NA
    # 4    1   10  NA  NA   1   1  NA
    # 5    2    5  NA  NA   1  NA  NA
    # 6    2    6  NA   1  NA   1  NA
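The simulated list stands in for real files. As a rough sketch of the reading step assumed at the start of this answer, the following writes two tiny files to a temporary directory and reads them back into a list. The directory name, the file names, and the header = TRUE setting are assumptions for illustration; the rows are taken from the question's examples.

```r
# Hypothetical setup: write two small files to a temporary directory
# so the example is self-contained. Names and paths are made up.
demo_dir <- file.path(tempdir(), "pairs_demo")
dir.create(demo_dir, showWarnings = FALSE)
writeLines(c("v1 v2", "1 2", "2 3", "10 20", "1 4"),
           file.path(demo_dir, "file1.txt"))
writeLines(c("v1 v2", "1 2", "8 10"),
           file.path(demo_dir, "file2.txt"))

# List the text files and read each one into a data.frame.
# header = TRUE assumes the first line of every file holds "v1 v2".
filenames <- list.files(demo_dir, pattern = "\\.txt$", full.names = TRUE)
l <- lapply(filenames, read.table, header = TRUE)

# Each element of l is now a data.frame with columns v1 and v2.
str(l[[1]])
```

If the real files have no header line, dropping header = TRUE gives columns named V1 and V2 instead, and the merge still works as long as every file uses the same column names.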

It's easy enough to convert the NA values to 0 if you want, e.g. with:

l.merged[is.na(l.merged)] <- 0
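To tie the steps together, here is a small end-to-end sketch on the rows from the question's two example files. The names f1, f2, merged, and the file1/file2 indicator columns are made up for illustration, and the output path is a temporary file rather than anything from the question.

```r
# Two small data.frames using rows from the question's example files.
f1 <- data.frame(v1 = c(1, 2, 10, 1), v2 = c(2, 3, 20, 4))
f2 <- data.frame(v1 = c(1, 8), v2 = c(2, 10))
l <- list(f1, f2)

# Tag each data.frame with its own indicator column (file1, file2, ...).
l <- lapply(seq_along(l), function(i) {
  l[[i]][[paste0("file", i)]] <- 1
  l[[i]]
})

# Full outer merge on the shared columns v1 and v2.
merged <- Reduce(function(x, y) merge(x, y, all = TRUE), l)

# Pairs missing from a file come back as NA; turn those into 0.
merged[is.na(merged)] <- 0
print(merged)

# Write the result out as a plain-text table (temporary path, for illustration).
out_file <- tempfile(fileext = ".txt")
write.table(merged, out_file, row.names = FALSE, quote = FALSE)
```

The pair (1, 2) occurs in both inputs, so it ends up with a 1 in both indicator columns, while every other pair has a 1 only in the column of the file it came from. With 10 files the same code yields the 12 columns asked for.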

Relevant post: Merge multiple data frames in a list simultaneously
