I have two files: one holds a summary (many rows extracted from a CSV file), the other holds a list of words (in a row of a CSV file). I read both files and got an Array[String] from each one:
val summary: Array[String] = ...
val wordlist: Array[String] = ...
For each line in summary, I want to extract the list of words that exist in wordlist.
Sample data in summary:
hi how good.how you. have tea.
Sample data in wordlist:
good tea
Expected result:
you good,you like,tea
As has been pointed out, you don't need Spark for this:
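import scala.collection.mutable.ArrayBuffer

val results = summary.map(l => {
  // Collect every word from wordlist that occurs in this line
  val result = ArrayBuffer[String]()
  wordlist.foreach(w => { if (l.contains(w)) result.append(w) })
  // Join the matches for this line into one comma-separated string
  result.toArray.mkString(",")
}).filter(l => l.length > 0) // drop lines that matched nothing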
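The same idea can be written without a mutable buffer. The sketch below is only an illustration, not part of the original answer: the object name WordMatch and the helpers matchSubstrings and matchWholeWords are hypothetical. The first helper keeps the substring semantics of contains used above. The second assumes you want whole-word matches instead (so "tea" does not match "teaspoon") and that splitting on non-word characters is an acceptable way to tokenize a line.

```scala
object WordMatch {
  // Hypothetical helper: same behaviour as the ArrayBuffer version,
  // i.e. a word counts as a match if it occurs anywhere in the line.
  def matchSubstrings(summary: Array[String], wordlist: Array[String]): Array[String] =
    summary
      .map(line => wordlist.filter(w => line.contains(w)).mkString(","))
      .filter(_.nonEmpty) // drop lines that matched nothing

  // Hypothetical variant: a word counts only if it appears as a whole token.
  // Assumption: tokens are separated by non-word characters (regex \W+).
  def matchWholeWords(summary: Array[String], wordlist: Array[String]): Array[String] =
    summary
      .map { line =>
        val tokens = line.split("\\W+").toSet
        // Filtering wordlist (not tokens) keeps the words in wordlist order
        wordlist.filter(tokens.contains).mkString(",")
      }
      .filter(_.nonEmpty)
}
```

With the sample line from the question and the word list good, tea, both helpers return a single entry, good,tea. They differ on a line such as "a teaspoon": matchSubstrings reports tea, while matchWholeWords reports nothing.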