text mining - list of word frequencies using R

Question

Welcome To Ask or Share your Answers For Others

text mining - list of word frequencies using R

asked Oct 17, 2021 in Technique[技术] by 深蓝 (71.8m points)

text mining - list of word frequencies using R

I have been using the tm package to run some text analysis. My problem is with creating a list with words and their frequencies associated with the same

library(tm)
library(RWeka)

txt <- read.csv("HW.csv",header=T) 
df <- do.call("rbind", lapply(txt, as.data.frame))
names(df) <- "text"

myCorpus <- Corpus(VectorSource(df$text))
myStopwords <- c(stopwords('english'),"originally", "posted")
myCorpus <- tm_map(myCorpus, removeWords, myStopwords)

#building the TDM

btm <- function(x) NGramTokenizer(x, Weka_control(min = 3, max = 3))
myTdm <- TermDocumentMatrix(myCorpus, control = list(tokenize = btm))

I typically use the following code for generating list of words in a frequency range

frq1 <- findFreqTerms(myTdm, lowfreq=50)

Is there any way to automate this such that we get a dataframe with all words and their frequency?

The other problem that i face is with converting the term document matrix into a data frame. As i am working on large samples of data, I run into memory errors. Is there a simple solution for this?

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙；凝视深渊过久,深渊将回以凝视…

1 Answer

深蓝 · Answer 1 · 2021-10-17T00:29:12+0000

Try this

data("crude")
myTdm <- as.matrix(TermDocumentMatrix(crude))
FreqMat <- data.frame(ST = rownames(myTdm), 
                      Freq = rowSums(myTdm), 
                      row.names = NULL)
head(FreqMat, 10)
#            ST Freq
# 1       "(it)    1
# 2     "demand    1
# 3  "expansion    1
# 4        "for    1
# 5     "growth    1
# 6         "if    1
# 7         "is    2
# 8        "may    1
# 9       "none    2
# 10      "opec    2

Categories

text mining - list of word frequencies using R

text mining - list of word frequencies using R

Please log in or register to add a comment.

Please log in or register to answer this question.

1 Answer

Please log in or register to add a comment.

Just Browsing Browsing

Most popular tags