R - Use strsplit starting at end of string
I've been using code to split the names of individual samples, change part of the sample name, and then rebind the strings together. The code works when the names are all the same length (i.e., the names are 8 characters long and it splits after the first 4 characters), but when the names are different lengths, the code is no longer effective.
Essentially, individual names are 7 or 8 characters long. The last 4 characters are what's important.
Example with 8 characters: samp003a
Example with 7 characters: sam003a
Is there a way to continue using strsplit to separate the names, but start at the end of the string rather than the beginning, and keep the last 4 characters (003a)?
Current code:
> rowlist <- as.list(rownames(df1))
> rowlistres <- strsplit(as.character(rowlist), "(?<=.{4})", perl = TRUE)
> rowlistres.df <- do.call(rbind, rowlistres)
> rowlistres.df[,1] <- "ly3d"
> dfnames <- apply(rowlistres.df, 1, paste, collapse="")
> rownames(df1) <- dfnames
It's line 2 that I'm having a hard time editing so that it splits according to the last 4 characters.
Any help is appreciated!
It looks like you're a bit mixed up about how to use look-around assertions. The pattern you're using, "(?<=.{4})", is a look-behind assertion that says "find me all inter-character spaces that are preceded by 4 characters of any kind", which is not what you want.
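To see concretely why the look-behind pattern breaks on the shorter names, here is a small sketch that runs it on the two example names from the question. The variable name `res` is only for illustration. Because the 4 characters are counted from the start of the string, the 7-character name gets cut one position too late:

```r
# The two example names from the question
x <- c("samp003a", "sam003a")

# Original pattern: split after the first 4 characters, counted from the start
res <- strsplit(x, "(?<=.{4})", perl = TRUE)
res
# [[1]]
# [1] "samp" "003a"
#
# [[2]]
# [1] "sam0" "03a"
```

The second element keeps only "03a" as its tail, which is why the rebuilt names come out wrong whenever the prefix is not exactly 4 characters long.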
The pattern you want, "(?=.{4}$)", is a look-ahead assertion that finds the single inter-character space that is followed by 4 characters of any kind, followed by the end of the string.
There is, unfortunately, an unpleasant twist. For reasons discussed in the answers to this question, strsplit() interacts oddly with look-ahead assertions; as a result, the pattern you'll actually need is "(?<=.)(?=.{4}$)". Here's what that looks like in action:
x <- c("samp003a", "sam003a")
strsplit(x, split="(?<=.)(?=.{4}$)", perl=TRUE)
# [[1]]
# [1] "samp" "003a"
#
# [[2]]
# [1] "sam"  "003a"
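For completeness, here is a sketch of how the combined pattern might slot into the rest of the code from the question. The data frame below is a made-up stand-in for df1, and the second name uses a different suffix (004b) than the question's example so that the rebuilt row names stay unique, since data frames reject duplicate row names. It also assumes every name is longer than 4 characters, so that each one splits into exactly two pieces.

```r
# Hypothetical stand-in for df1; only the row names matter here
df1 <- data.frame(value = c(1, 2), row.names = c("samp003a", "sam004b"))

# Split each row name just before its last 4 characters
parts <- strsplit(rownames(df1), "(?<=.)(?=.{4}$)", perl = TRUE)

# Each name yields two pieces, so rbind() gives a 2-column character matrix
parts_mat <- do.call(rbind, parts)

# Overwrite the variable-length prefix, then paste the pieces back together
parts_mat[, 1] <- "ly3d"
rownames(df1) <- apply(parts_mat, 1, paste, collapse = "")

rownames(df1)
# [1] "ly3d003a" "ly3d004b"
```

Note that as.list() and as.character() from the original code are not needed, because rownames() already returns a character vector.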
If you just want the final 4 characters of each entry, maybe use substr(), like this:
x <- c("samp003a", "sam003a")
substr(x, start=nchar(x)-3, stop=nchar(x))
# [1] "003a" "003a"
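Since the end goal in the question is to swap the prefix rather than only extract the suffix, the substr() idea can be taken one step further without any splitting at all. The sketch below shows two ways that should give the same result; the names `suffix`, `new_names`, and `new_names2` are illustrative, and the second sample name is changed to end in 004b just to make the output easier to tell apart.

```r
x <- c("samp003a", "sam004b")

# Option 1: take the last 4 characters and prepend the new prefix
suffix <- substr(x, start = nchar(x) - 3, stop = nchar(x))
new_names <- paste0("ly3d", suffix)
new_names
# [1] "ly3d003a" "ly3d004b"

# Option 2: replace everything before the last 4 characters in one step
new_names2 <- sub("^.*(?=.{4}$)", "ly3d", x, perl = TRUE)
new_names2
# [1] "ly3d003a" "ly3d004b"
```

Both options are vectorised, so they can be applied directly to rownames(df1) without the do.call(rbind, ...) and apply() steps.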