R: combinatorial string replacement
I am on the lookout for a gsub-based function that would enable me to do combinatorial string replacement, so that if I have an arbitrary number of string replacement rules
replrules=list("<x>"=c(3,5),"<alk>"=c("hept","oct","non"),"<end>"=c("ane","ene"))
and a target string
string="<x>-methyl<alk><end>"
it would give me a data frame with the final string name and the substitutions that were made, as in
name             x  alk   end
3-methylheptane  3  hept  ane
5-methylheptane  5  hept  ane
3-methyloctane   3  oct   ane
5-methyloctane   5  ...   ...
3-methylnonane   3
5-methylnonane   5
3-methylheptene  3
5-methylheptene  5
3-methyloctene   3
5-methyloctene   5
3-methylnonene   3
5-methylnonene   5
The target string would be of arbitrary structure, e.g. string="1-<alk>anol"
or each pattern could occur several times, as in string="<alk>anedioic acid, di<alk>yl ester"
What would be the most elegant way to do this kind of thing in R?
How about
d <- do.call(expand.grid, replrules)
d$name <- paste0(d$`<x>`, "-", "methyl", d$`<alk>`, d$`<end>`)
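To see what the grid contains before any names are pasted together, here is a small sketch. It only uses the rules from the question and base R's expand.grid; passing stringsAsFactors = FALSE is my own addition here, so that the columns stay plain vectors rather than factors.

```r
# Rules copied from the question.
replrules <- list("<x>"   = c(3, 5),
                  "<alk>" = c("hept", "oct", "non"),
                  "<end>" = c("ane", "ene"))

# One row per combination of rule values; the first rule varies fastest.
d <- do.call(expand.grid, c(replrules, list(stringsAsFactors = FALSE)))

n_combinations <- nrow(d)   # 2 * 3 * 2 rows
col_names <- names(d)       # rule names, angle brackets included
```

Because the column names keep the angle brackets, they have to be quoted with backticks (or renamed) before being used with `$`.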
Edit
This seems to work (substituting each of these strings into strsplit):
string = "<x>-methyl<alk><end>"
string2 = "<x>-ethyl<alk>acosane"
string3 = "1-<alk>anol"
Using Richard's regex:
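d <- do.call(expand.grid, list(replrules, stringsAsFactors = FALSE))
names(d) <- gsub("<|>", "", names(d))  # drop the angle brackets from the column names
# split the target into literal pieces and placeholder names
s <- strsplit(string3, "(<|>)", perl = TRUE)[[1]]
out <- list()
for (i in s) {
  # placeholder names become the matching column, everything else stays literal
  out[[i]] <- ifelse(i %in% names(d), d[i], i)
}
d$name <- do.call(paste0, unlist(out, recursive = FALSE))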
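For anyone wondering what the split step hands to the loop, a quick check of the tokens can help. This is only an illustration on two of the strings above; tokens1 and tokens2 are names made up for this sketch.

```r
# Splitting on "<" or ">" leaves the literal pieces and the bare rule names.
tokens1 <- strsplit("1-<alk>anol", "(<|>)", perl = TRUE)[[1]]
tokens2 <- strsplit("<x>-methyl<alk><end>", "(<|>)", perl = TRUE)[[1]]

print(tokens1)
print(tokens2)
```

One thing to keep in mind with this approach: a literal piece that happens to equal a rule name without its brackets (for instance a string such as "<alk>end") would be treated as a placeholder too, since the brackets are gone after splitting.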
Edit
This should work with repeated items:
d <- do.call(expand.grid, list(replrules, stringsAsFactors = FALSE))
names(d) <- gsub("<|>", "", names(d))
string4 = "<x>-methyl<alk><end>oate<alk>"
s <- strsplit(string4, "(<|>)", perl = TRUE)[[1]]
out <- list()
# index by position rather than by name so repeated placeholders are kept
for (i in seq_along(s)) {
  out[[i]] <- ifelse(s[i] %in% names(d), d[s[i]], s[i])
}
d$name <- do.call(paste0, unlist(out, recursive = FALSE))
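Since the question asked for something gsub-based, here is one possible sketch along those lines, not a polished function. combi_replace is a hypothetical name of my own; it assumes at least one rule occurs in the string, builds the grid only from those rules, and then substitutes one rule at a time with fixed = TRUE so the angle brackets are matched literally.

```r
# Hypothetical helper: expand all rule combinations and substitute with gsub().
combi_replace <- function(string, rules) {
  # Keep only the rules whose pattern actually occurs in the string.
  used <- rules[vapply(names(rules), grepl, logical(1), x = string, fixed = TRUE)]
  grid <- do.call(expand.grid, c(used, list(stringsAsFactors = FALSE)))
  name <- rep(string, nrow(grid))
  for (pat in names(used)) {
    # gsub() takes a single replacement, so apply it row by row via mapply().
    name <- mapply(gsub, pattern = pat,
                   replacement = as.character(grid[[pat]]), x = name,
                   MoreArgs = list(fixed = TRUE), USE.NAMES = FALSE)
  }
  names(grid) <- gsub("<|>", "", names(grid))
  data.frame(name = name, grid, stringsAsFactors = FALSE)
}

replrules <- list("<x>"   = c(3, 5),
                  "<alk>" = c("hept", "oct", "non"),
                  "<end>" = c("ane", "ene"))

res1 <- combi_replace("<x>-methyl<alk><end>", replrules)
res2 <- combi_replace("<alk>anedioic acid, di<alk>yl ester", replrules)
```

Dropping the unused rules is a design choice: without it, a string like "1-<alk>anol" would produce one row per combination of all three rules, with many duplicated names.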