    # Generate counts table
    library(ggplot2)  # provides the diamonds data set
    library(plyr)
    example <- data.frame(count(diamonds, c('color', 'cut')))
    example[1:3,]

    # Excerpt of table
      color       cut freq
    1     D      Fair  163
    2     D      Good  662
    3     D Very Good 1513

You can filter this table to the rows with freq > 1000 with:

    example[example$freq > 1000,]

I would like to generate a similar table, except that any values below some threshold, e.g. 1000, are combined into a single row labelled (Other).
Something similar happens when you have too many factor levels and call summary(example, maxsum=3):

         color          cut          freq
     D      : 5   Fair   : 7   Min.   : 119
     E      : 5   Good   : 7   1st Qu.: 592
     (Other):25   (Other):21   Median :1204
                               Mean   :1541
                               3rd Qu.:2334
                               Max.   :4884

Example of the ideal output:
Ideally I want to take example[example$color=='J',]:

    color       cut freq
        J      Fair  119
        J      Good  307
        J Very Good  678
        J   Premium  808
        J     Ideal  896

and produce this:

    color       cut freq
        J Very Good  678
        J   Premium  808
        J     Ideal  896
        J   (Other)  426

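The collapse shown above can be sketched in base R. This is only a minimal illustration, not code from the original post: the helper name collapse_small and its label argument are hypothetical, the threshold of 500 is inferred from the example (119 + 307 = 426), and the J rows are typed in by hand so the snippet runs without any packages.

```r
# J rows of the counts table, copied from the excerpt above
j_rows <- data.frame(
  color = "J",
  cut   = c("Fair", "Good", "Very Good", "Premium", "Ideal"),
  freq  = c(119, 307, 678, 808, 896),
  stringsAsFactors = FALSE
)

# Hypothetical helper: relabel every cut whose freq is below `threshold`,
# then re-sum freq within each color/cut pair.
collapse_small <- function(df, threshold, label = "(Other)") {
  df$cut <- ifelse(df$freq < threshold, label, as.character(df$cut))
  aggregate(freq ~ color + cut, data = df, FUN = sum)
}

collapsed <- collapse_small(j_rows, threshold = 500)
print(collapsed)
```

Note that aggregate sorts its result by the grouping columns, so the (Other) row will not necessarily come last as it does in the ideal output.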
Bonus: if this kind of filtering is possible within ggplot, so that the plot below is created with the filtering applied, that would be great too.

    ggplot(example, aes(x=color, y=freq)) +
      geom_bar(aes(fill=cut), stat = "identity")

Here is an alternative using dplyr, so that you can pipe the corrected data directly into the ggplot call.
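
    library(dplyr)
    library(ggplot2)

    example %>%
      # as.character(cut) keeps each row's own label;
      # levels(cut) would only recycle the five level names
      mutate(cut = ifelse(freq < 500, "Other", as.character(cut))) %>%
      group_by(color, cut) %>%
      summarise(freq = sum(freq)) %>%
      ggplot(aes(color, freq, fill = cut)) +
      geom_bar(stat = "identity")
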
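Be sure to detach plyr first, e.g. with detach("package:plyr"), otherwise the output of the dplyr call will be incorrect. plyr defines its own mutate and summarise, and if those mask dplyr's versions, summarise ignores the grouping.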
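One way to guard against the masking problem, sketched here under the assumption that dplyr 1.0 or later is installed (for the .groups argument), is to detach plyr only when it is attached and to namespace the dplyr verbs explicitly. The toy data frame is made up purely for illustration.

```r
library(dplyr)

# Detach plyr only if it is currently attached
if ("package:plyr" %in% search()) detach("package:plyr")

# Made-up toy data: two rows share the same color/cut pair
toy <- data.frame(
  color = c("D", "D", "E"),
  cut   = c("Other", "Other", "Ideal"),
  freq  = c(1, 2, 3)
)

# Namespaced calls always resolve to dplyr, whatever else is attached
totals <- toy %>%
  dplyr::group_by(color, cut) %>%
  dplyr::summarise(freq = sum(freq), .groups = "drop")

print(totals)
```

With the grouping respected, the two D/Other rows are summed into one, so totals has one row per color/cut pair rather than a single grand total.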