How to split a string conditionally in R?

- July 15, 2010

I want to split a string into multiple columns based on a number of conditions.

An example of the data:

    col1 <- c("01/05/2004 02:59", "01/05/2004 05:04", "01/06/2004 07:19",
              "01/07/2004 02:55", "01/07/2004 04:32", "01/07/2004 04:38",
              "01/07/2004 17:13", "01/07/2004 18:40", "01/07/2004 20:58",
              "01/07/2004 23:39", "01/09/2004 13:28")
    col2 <- c("wabamun #4 off line.", "keephills #2 on line.", "wabamun #1 on line.",
              "north red deer t217s bus lock out.  under investigation.",
              "t217s has blown cts on 778l",
              "t217s north red deer bus in service (778l out of service)",
              "keephills #2 off line.", "wabamun #4 on line.", "sundance #1 off line.",
              "keephills #2 on line",
              "homeland security event lowered yellow ( elevated)")
    df <- data.frame(col1, col2)

I would like to be able to split col2 conditionally, into this:

    col3 <- c("wabamun #4", "keephills #2", "wabamun #1", "general asset",
              "general asset", "general asset", "keephills #2", "wabamun #4",
              "sundance #1", "keephills #2", "general asset")
    col4 <- c("off line.", "on line.", "on line.",
              "north red deer t217s bus lock out.  under investigation.",
              "t217s has blown cts on 778l",
              "t217s north red deer bus in service (778l out of service)",
              "off line.", "on line.", "off line.", "on line",
              "homeland security event lowered yellow ( elevated)")

After that, I'm planning to find the times between when an asset goes down and when it comes back online. These are generator plants, and I will also be looking at the capacity of each plant. For example, keephills #2 has a capacity of 300 MW.

Thankfully, regular expressions are here to save the day.

    # this line prevents the character strings from turning into factors
    df <- data.frame(col1, col2, stringsAsFactors = FALSE)

    # this match works for the power plant names:
    # they're one or more letters followed by a space, a hash, and a single digit
    pwrmatch <- regexpr("^[[:alpha:]]+ #[[:digit:]]", df$col2)
    df$col3 <- "general asset"
    df$col3[grepl("^[[:alpha:]]+ #[[:digit:]]", df$col2)] <- regmatches(df$col2, pwrmatch)

col3 now looks like:

    c("wabamun #4", "keephills #2", "wabamun #1", "general asset",
      "general asset", "general asset", "keephills #2", "wabamun #4",
      "sundance #1", "keephills #2", "general asset")
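To see why the assignment is restricted to the matching rows, it can help to look at what `regexpr` and `regmatches` return on a tiny input. The following is only a minimal sketch: the vector `x` and the names `m`, `hits`, `found`, and `asset` are made up for illustration and are not part of the original answer.

```r
# Two sample strings: the first names a power plant, the second does not.
x <- c("wabamun #4 off line.", "t217s has blown cts on 778l")
pattern <- "^[[:alpha:]]+ #[[:digit:]]"

# regexpr() returns one entry per input string: the start position of the
# match, or -1 when there is no match.
m <- regexpr(pattern, x)

# regmatches() returns only the matched substrings, so the result can be
# shorter than the input.
hits <- regmatches(x, m)

# Illustrative variant: reuse the match positions instead of running the
# pattern a second time with grepl().
found <- as.integer(m) != -1L
asset <- rep("general asset", length(x))
asset[found] <- hits
```

Because `hits` holds one element per matching row only, it has to be assigned into the matching subset of the column rather than into the whole column.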

The other column is a similar matter, matching the cases of on/off line.

    linematch <- regexpr("(on|off) line", df$col2)
    df$col4 <- df$col2
    df$col4[grepl("(on|off) line", df$col2)] <- regmatches(df$col2, linematch)

col4 now looks like:

    c("off line", "on line", "on line",
      "north red deer t217s bus lock out.  under investigation.",
      "t217s has blown cts on 778l",
      "t217s north red deer bus in service (778l out of service)",
      "off line", "on line", "off line", "on line",
      "homeland security event lowered yellow ( elevated)")
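The follow-up step mentioned in the question, finding the time between an asset going off line and coming back on line, could be approached roughly as follows once the columns are split. This is a sketch under stated assumptions, not part of the original answer: the `events` data frame is a hand-copied slice of the example data, the helper `outage_hours` is hypothetical, and the timestamps are assumed to be month/day/year in UTC.

```r
# A small slice of the example data, already split into asset and status.
events <- data.frame(
  time   = c("01/05/2004 02:59", "01/05/2004 05:04", "01/07/2004 17:13",
             "01/07/2004 18:40", "01/07/2004 23:39"),
  asset  = c("wabamun #4", "keephills #2", "keephills #2",
             "wabamun #4", "keephills #2"),
  status = c("off line", "on line", "off line", "on line", "on line"),
  stringsAsFactors = FALSE
)

# Assumption: dates are month/day/year and the time zone is UTC.
events$time <- as.POSIXct(events$time, format = "%m/%d/%Y %H:%M", tz = "UTC")

# Hypothetical helper: within one asset's events, pair each "off line" row
# with the row that directly follows it, if that row is "on line", and
# return the outage lengths in hours.
outage_hours <- function(ev) {
  ev <- ev[order(ev$time), ]
  off_idx <- which(ev$status == "off line")
  off_idx <- off_idx[off_idx < nrow(ev)]                   # needs a following row
  off_idx <- off_idx[ev$status[off_idx + 1] == "on line"]  # that row must be "on line"
  as.numeric(difftime(ev$time[off_idx + 1], ev$time[off_idx], units = "hours"))
}

# One numeric vector of outage lengths per asset.
outages <- lapply(split(events, events$asset), outage_hours)
```

An "off line" event with no later "on line" event, such as sundance #1 in the full data, simply produces no outage under this pairing rule.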
