    # Generate counts table
    library(ggplot2)  # provides the diamonds dataset
    library(plyr)
    example <- data.frame(count(diamonds, c('color', 'cut')))
    example[1:3,]

    # Excerpt of table
      color       cut freq
    1     D      Fair  163
    2     D      Good  662
    3     D Very Good 1513

You can filter the table to freq > 1000 with: example[example$freq > 1000,]. I would like to generate a similar table, except that any values below some threshold (e.g. 1000) are combined into a single row labelled (Other). This is similar to what happens when you have too many factor levels and call summary(example, maxsum=3):

     color        cut          freq
     D      : 5   Fair   : 7   Min.   : 119
     E      : 5   Good   : 7   1st Qu.: 592
     (Other):25   (Other):21   Median :1204
                               Mean   :1541
                               3rd Qu.:2334
                               Max.   :4884

Example of ideal output:
Ideally I want to convert this, example[example$color=='J',]:

    color       cut freq
        J      Fair  119
        J      Good  307
        J Very Good  678
        J   Premium  808
        J     Ideal  896

and produce this:

    color       cut freq
        J Very Good  678
        J   Premium  808
        J     Ideal  896
        J   (Other)  426

Bonus: if this kind of filtering is possible within ggplot, so that the plot below could be created with the small groups already collapsed, that would be great also.

    ggplot(example, aes(x=color, y=freq)) +
      geom_bar(aes(fill=cut), stat = "identity")

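Here is an alternative using dplyr that pipes the corrected data directly into the ggplot call.

    library(dplyr)
    library(ggplot2)

    example %>%
      # as.character(cut) keeps each row's own label; levels(cut) only lines up
      # through recycling when the rows are sorted by color and then cut
      mutate(cut = ifelse(freq < 500, "other", as.character(cut))) %>%
      group_by(color, cut) %>%
      summarise(freq = sum(freq)) %>%
      ggplot(aes(color, freq, fill = cut)) +
      geom_bar(stat = "identity")

Be sure to detach plyr first, e.g. with detach("package:plyr", unload = TRUE); otherwise the output of the dplyr call can be incorrect. If plyr is attached after dplyr, plyr's summarise masks dplyr's version and ignores the grouping.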
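The pipeline above goes straight to the plot. For the table itself, a minimal base R sketch along the same lines might look like the following. It assumes a cutoff of 500 (matching the freq < 500 used above) and uses only the J rows from the question, typed in by hand so the snippet runs without ggplot2 or plyr. The names cut_levels, threshold and collapsed are illustrative and not part of the original code.

```r
# Rows for color J, copied from the question's excerpt
cut_levels <- c("Fair", "Good", "Very Good", "Premium", "Ideal")
j <- data.frame(
  color = "J",
  cut   = cut_levels,
  freq  = c(119, 307, 678, 808, 896),
  stringsAsFactors = FALSE
)

threshold <- 500  # assumed cutoff, matching freq < 500 in the answer

# Relabel every row below the cutoff as "(Other)"
j$cut <- ifelse(j$freq < threshold, "(Other)", j$cut)

# Turn cut into a factor with "(Other)" as the last level, so that
# aggregate() sorts the combined row to the bottom of the result
j$cut <- factor(j$cut, levels = c(cut_levels, "(Other)"))

# Sum the frequencies within each (color, cut) pair
collapsed <- aggregate(freq ~ color + cut, data = j, FUN = sum)
print(collapsed)
```

Fair (119) and Good (307) fall below the cutoff, so the (Other) row holds 119 + 307 = 426, which is the value in the question's ideal output. Because this version uses only base R, it also sidesteps the plyr/dplyr masking issue. On the full table, the same two steps (relabel, then aggregate) should apply unchanged once cut has been converted with as.character(example$cut).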

