Removing words based on a predefined vector

data-mining r data-cleaning

2022-02-27 19:17:24

I have a dataset, test_stopword, and I want to remove some words from it based on a vector. How can I do this in R?

texts <- c("This is the first document.", 
       "Is this a text?", 
       "This is the second file.", 
       "This is the third text.", 
       "File is not this.") 

test_stopword <- as.data.frame(texts)
ordinal_stopwords  <- c("first","primary","second","secondary","third")

1 Answer

texts <- c("This is the first document.", 
       "Is this a text?", 
       "This is the second file.", 
       "This is the third text.", 
       "File is not this.") 

test_stopword <- as.data.frame(texts)
ordinal_stopwords  <- c("first","primary","second","secondary","third")

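# Join the stop words into a single alternation pattern: "first|primary|..."
pattern <- paste(ordinal_stopwords, collapse = "|")
# gsub() is vectorised, so it can be applied to the whole column at once
(newdata <- data.frame(texts = gsub(pattern, "", test_stopword$texts)))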

The output appears skewed when pasted into a code block (possibly a bug in SE). However, you will get the desired output.
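One caveat with a plain alternation pattern is that it also matches inside longer words and leaves a doubled space where a word was removed. The variant below is only a sketch of one way around that, assuming whole-word, case-insensitive matching is what is wanted; the names pattern, cleaned and texts_clean are made up for illustration and are not part of the original question.

```r
texts <- c("This is the first document.",
           "Is this a text?",
           "This is the second file.",
           "This is the third text.",
           "File is not this.")
test_stopword <- data.frame(texts = texts, stringsAsFactors = FALSE)
ordinal_stopwords <- c("first", "primary", "second", "secondary", "third")

# \\b anchors each alternative at word boundaries, so "second" is not
# cut out of a longer word such as "secondly".
pattern <- paste0("\\b(", paste(ordinal_stopwords, collapse = "|"), ")\\b")

# Remove the stop words (ignoring case; an assumption, not in the question).
cleaned <- gsub(pattern, "", test_stopword$texts, ignore.case = TRUE, perl = TRUE)

# Collapse the doubled spaces left behind and trim both ends.
cleaned <- trimws(gsub("\\s+", " ", cleaned))

# Keep the original column and store the result next to it
# (texts_clean is a hypothetical column name).
test_stopword$texts_clean <- cleaned
test_stopword
```

Writing to a new column rather than overwriting texts keeps the raw text available for comparison. If a stop word sits directly before punctuation, a stray space before the punctuation mark can remain and would need a further cleanup step.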
