Remove words based on a predefined vector

data-mining r data-cleaning
2022-02-27 19:17:24

I have a dataset, test_stopword, and I want to remove certain words from it based on a vector of words. How can I do this in R?

texts <- c("This is the first document.", 
       "Is this a text?", 
       "This is the second file.", 
       "This is the third text.", 
       "File is not this.") 

test_stopword <- as.data.frame(texts)
ordinal_stopwords  <- c("first","primary","second","secondary","third")
1 Answer
texts <- c("This is the first document.", 
       "Is this a text?", 
       "This is the second file.", 
       "This is the third text.", 
       "File is not this.") 

test_stopword <- as.data.frame(texts)
ordinal_stopwords  <- c("first","primary","second","secondary","third")

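# Match whole words only, so that "second" does not clip "secondary".
# gsub() is vectorized, so sapply() is not needed.
pattern <- paste0("\\b(", paste(ordinal_stopwords, collapse = "|"), ")\\b")
(newdata <- data.frame(texts = gsub(pattern, "", texts, perl = TRUE)))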

The output looks skewed when pasted into a code block (possibly a bug in SE). However, you will get the desired output.
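Removing a word with gsub() leaves a doubled space where the word used to be. A minimal sketch of one way to tidy that up is shown below, assuming whole-word, case-insensitive matching is wanted and that the stop words contain no regex metacharacters. The helper name remove_words and the column name cleaned are hypothetical and are not part of the original question or answer.

```r
texts <- c("This is the first document.",
           "Is this a text?",
           "This is the second file.",
           "This is the third text.",
           "File is not this.")
ordinal_stopwords <- c("first", "primary", "second", "secondary", "third")

# Hypothetical helper: removes whole-word matches of `words` from `x`.
remove_words <- function(x, words) {
  # \\b anchors restrict the match to whole words
  pattern <- paste0("\\b(", paste(words, collapse = "|"), ")\\b")
  out <- gsub(pattern, "", x, ignore.case = TRUE, perl = TRUE)
  # Collapse the runs of whitespace left behind, then trim the ends
  out <- gsub("\\s+", " ", out, perl = TRUE)
  trimws(out)
}

test_stopword <- data.frame(texts = texts)
# Store the result in a new column so the original text is preserved
test_stopword$cleaned <- remove_words(test_stopword$texts, ordinal_stopwords)
print(test_stopword$cleaned)
```

Keeping the original column next to the cleaned one makes it easy to check which rows changed. A stop word that sits directly before punctuation would still leave a space in front of that punctuation, which this sketch does not handle.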