I have the tweets of a particular account and want to go through each tweet and categorize it with class labels such as business, music, sports, etc.
My approach to creating the training data is to assign a few keywords to each class label, for example:
- keywords for “business”: entrepreneur, job, gdp…
- keywords for “music”: songs, genre, album…
The .csv file with the training data has 2 columns: 1. keywords 2. class
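To make the layout concrete, here is a minimal Python sketch of such a two-column file being read into a class-to-keywords mapping. The CSV contents, the header names `keywords` and `class`, and the helper `load_keyword_dictionary` are assumptions made up for illustration. The rows only reuse the example keywords listed above.

```python
import csv
import io
from collections import defaultdict

# Hypothetical contents of the two-column training file (keywords, class).
# The rows reuse the example keywords from the post; a real file would be longer.
TRAINING_CSV = """keywords,class
entrepreneur,business
job,business
gdp,business
songs,music
genre,music
album,music
"""


def load_keyword_dictionary(csv_text):
    """Group keywords by class label, e.g. {"music": {"songs", "genre", "album"}}."""
    dictionary = defaultdict(set)
    for row in csv.DictReader(io.StringIO(csv_text)):
        label = row["class"].strip().lower()
        keyword = row["keywords"].strip().lower()
        dictionary[label].add(keyword)
    return dict(dictionary)


keyword_dictionary = load_keyword_dictionary(TRAINING_CSV)
```

Storing one keyword per row keeps the file easy to extend. The grouped mapping is the form a dictionary-style lookup usually needs.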
Is this the right way to go?
Thanks in advance!
It seems you are trying something similar to a dictionary method. It's pretty straightforward to apply a dictionary to a corpus of texts, and given that you are using tweets, I'd recommend Kenneth Benoit's excellent quanteda package.
What's more, you can create a custom dictionary (an S3 class, I believe) from a list of terms.
https://cran.r-project.org/web/packages/quanteda/quanteda.pdf
You can then apply the dictionary using applyDictionary (replaced by dfm_lookup in later quanteda releases). You'll get a nice table of texts and dictionary keys, like the following:
    docs   christmas opposition taxglob taxregex country
    text1          1          1       1        0       0
    text2          0          0       1        0       2
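For readers who want to see the idea without R, the following is a rough Python sketch of what applying a dictionary amounts to: counting, per text, how many tokens fall under each dictionary key. It is not quanteda's implementation or API. The dictionary, the example tweets, and the helpers `tokenize`, `apply_dictionary` and `pick_label` are all hypothetical, and it only does exact word matches (no glob or regex patterns).

```python
import re
from collections import Counter

# Hypothetical dictionary and tweets, for illustration only.
DICTIONARY = {
    "business": {"entrepreneur", "job", "gdp"},
    "music": {"songs", "genre", "album"},
}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def apply_dictionary(texts, dictionary):
    """Return one row per text with the number of matching tokens per key."""
    rows = []
    for text in texts:
        token_counts = Counter(tokenize(text))
        rows.append(
            {key: sum(token_counts[word] for word in words)
             for key, words in dictionary.items()}
        )
    return rows


def pick_label(row):
    """Pick the key with the most matches; None if nothing matched.

    Ties go to whichever key comes first in the dictionary.
    """
    best = max(row, key=row.get)
    return best if row[best] > 0 else None


tweets = [
    "New album out today, two songs leaked early",
    "GDP growth slows as job market cools",
    "Great weather this morning",
]
counts = apply_dictionary(tweets, DICTIONARY)
labels = [pick_label(row) for row in counts]
```

Each row in `counts` corresponds to one line of the table above. Taking the key with the highest count is one simple way to turn those counts into a single class label per tweet, and tweets with no matches stay unlabeled.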