R - dplyr summarise: Equivalent of ".drop=FALSE" to keep groups with zero length in output
When using summarise with plyr's ddply function, empty categories are dropped by default. You can change this behavior by adding .drop = FALSE. However, this doesn't work when using summarise with dplyr. Is there another way to keep empty categories in the result?

Here's an example with fake data.

    library(dplyr)

    df = data.frame(a=rep(1:3,4), b=rep(1:2,6))

    # Add an extra level to df$b that has no corresponding value in df$a
    df$b = factor(df$b, levels=1:3)

    # Summarise with plyr, keeping categories with a count of 0
    plyr::ddply(df, "b", summarise, count_a=length(a), .drop=FALSE)
    #   b count_a
    # 1 1       6
    # 2 2       6
    # 3 3       0

    # Now try it with dplyr
    df %>%
      group_by(b) %>%
      summarise(count_a=length(a), .drop=FALSE)
    #   b count_a .drop
    # 1 1       6 FALSE
    # 2 2       6 FALSE

Not exactly what I was hoping for. Is there a dplyr method for achieving the same result as .drop=FALSE in plyr?
The issue is still open, but in the meantime, since your data are already factored, you can use complete from "tidyr" to get what you might be looking for:

    library(tidyr)

    df %>%
      group_by(b) %>%
      summarise(count_a=length(a)) %>%
      complete(b)
    # Source: local data frame [3 x 2]
    #
    #        b count_a
    #   (fctr)   (int)
    # 1      1       6
    # 2      2       6
    # 3      3      NA

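The factor part matters here: as far as I understand it, complete() expands a factor column to its full set of levels, but for a character column it only sees the values that are actually present. A minimal sketch to illustrate, assuming reasonably current dplyr and tidyr (the names counts, as_factor and as_character are made up for this example):

```r
library(dplyr)
library(tidyr)

# Summary table that is missing the unused level "3" of b
counts <- data.frame(b = factor(c(1, 2), levels = 1:3), count_a = c(6L, 6L))

# Factor column: complete() adds a row for every level, including the unused one
as_factor <- counts %>% complete(b)

# Character column: only observed values are known, so no row is added
as_character <- counts %>% mutate(b = as.character(b)) %>% complete(b)

nrow(as_factor)     # 3
nrow(as_character)  # 2
```

So if b were stored as character, you would have to convert it to a factor with the desired levels first.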
If you want the replacement value to be zero, you need to specify it with fill:

    df %>%
      group_by(b) %>%
      summarise(count_a=length(a)) %>%
      complete(b, fill = list(count_a = 0))
    # Source: local data frame [3 x 2]
    #
    #        b count_a
    #   (fctr)   (dbl)
    # 1      1       6
    # 2      2       6
    # 3      3       0

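For anyone on a later dplyr release: to my knowledge, dplyr 0.8.0 and newer accept a .drop argument in group_by() (and in count()), which keeps empty factor levels without needing tidyr. A sketch under that version assumption, reusing the fake data from the question (res is just an illustrative name):

```r
library(dplyr)

df <- data.frame(a = rep(1:3, 4), b = rep(1:2, 6))
df$b <- factor(df$b, levels = 1:3)

# .drop = FALSE keeps a group for every factor level, even those with no rows
# (assumes dplyr >= 0.8.0, where group_by() gained this argument)
res <- df %>%
  group_by(b, .drop = FALSE) %>%
  summarise(count_a = length(a))

res$count_a  # 6 6 0
```

Here the empty level gets a count of 0 directly, because length(a) is evaluated on a zero-row group, so no fill step is needed.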