twitter - Multi-class classification in R


I have tweets from a particular account and want to go through each tweet and categorize it with class labels such as business, music, sports, etc.

My approach to creating the training data is to assign a few keywords to each class label, for example:

  1. Keywords for “business”: entrepreneur, job, gdp…
  2. Keywords for “music”: songs, genre, album…

The .csv file of training data has 2 columns: 1. keywords 2. class

Is this the right way to go?

Thanks in advance!

It seems you are trying something similar to a dictionary method. It's pretty straightforward to apply a dictionary to a corpus of texts, and given that you are using tweets I'd recommend Kenneth Benoit's excellent quanteda package.

What's more, you can create a custom dictionary (an S3 class, I believe) from a list of terms.

https://cran.r-project.org/web/packages/quanteda/quanteda.pdf

Then apply the dictionary using applyDictionary. You'll get a nice table of texts and dictionary keys, like the following:

    docs    christmas opposition taxglob taxregex country
    text1           1          1       1        0       0
    text2           0          0       1        0       2
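
For the business/music case in the question, a minimal sketch might look like the following. It assumes a reasonably recent quanteda release, where (as far as I know) the lookup step is done with `dfm_lookup()` rather than `applyDictionary`. The keyword table, the example tweets, and the names `kw`, `tweets`, `topic_dict`, `counts` and `predicted` are all made up for illustration, and it assumes the .csv holds one keyword per row.

```r
library(quanteda)

# Hypothetical training data: one keyword per row, in the question's
# two-column layout (keywords, class). In practice this would come from
# read.csv() on the training file.
kw <- data.frame(
  keywords = c("entrepreneur", "job", "gdp", "songs", "genre", "album"),
  class    = c("business", "business", "business", "music", "music", "music"),
  stringsAsFactors = FALSE
)

# Made-up tweets, for illustration only.
tweets <- c(
  t1 = "Every entrepreneur I know says the job market is strong",
  t2 = "New album out today and the songs are great",
  t3 = "GDP numbers and the job report are due tomorrow"
)

# Named list (class -> keywords) turned into a quanteda dictionary.
topic_dict <- dictionary(split(kw$keywords, kw$class))

# Tokenize, build a document-feature matrix, then count hits per class.
toks   <- tokens(tweets, remove_punct = TRUE)
counts <- as.matrix(dfm_lookup(dfm(toks), dictionary = topic_dict))

# Assign the class with the most keyword hits (ties go to the first class);
# tweets that match no keyword are left unlabeled.
predicted <- colnames(counts)[max.col(counts, ties.method = "first")]
predicted[rowSums(counts) == 0] <- NA
```

Note that matching is exact unless you add wildcards, so "job" will not count "jobs"; a pattern such as "job*" would. Taking the class with the most hits is only one simple way to turn the counts into a label, and it will leave many tweets unlabeled if the keyword lists are short.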
