I'm using spaCy for text tokenization and have gotten stuck:
import spacy
nlp = spacy.load("en_core_web_sm")
mytext = "This is some sentence that spacy will not appreciate"
doc = nlp(mytext)
for token in doc:
    print(token.text, token.lemma_, token.pos_, token.tag_, token.dep_, token.shape_, token.is_alpha, token.is_stop)
This returns what looks to me like a successful tokenization:
This this DET DT nsubj Xxxx True False
is be VERB VBZ ROOT xx True True
some some DET DT det xxxx True True
sentence sentence NOUN NN attr xxxx True False
that that ADP IN mark xxxx True True
spacy spacy NOUN NN nsubj xxxx True False
will will VERB MD aux xxxx True True
not not ADV RB neg xxx True True
appreciate appreciate VERB VB ccomp xxxx True False
On the other hand,
[token.text for token in doc[2].lefts]
returns an empty list. Is there a bug in lefts/rights?
I'm a beginner in natural language processing, so I hope I haven't fallen into a conceptual trap. I'm using spaCy v2.0.4.
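
To make the possible conceptual trap concrete: as far as I can tell from the spaCy documentation, `Token.lefts` yields a token's syntactic children that sit to its left in the dependency parse, not simply the tokens that precede it in the sentence. Below is a minimal pure-Python sketch of that reading, with no spaCy involved. The `heads` list is my own assumption, reconstructed from the dep labels printed above rather than taken from real parser output, and `lefts`/`rights` here are hypothetical helper functions, not the spaCy API.

```python
# Toy model of a dependency parse: each word stores the index of its head.
# ASSUMPTION: these head indices are guessed from the dep labels in the
# output above; they are not actual spaCy output.
words = ["This", "is", "some", "sentence", "that", "spacy", "will", "not", "appreciate"]
heads = [1, 1, 3, 1, 8, 8, 8, 8, 1]  # hypothetical; the ROOT "is" points at itself


def lefts(i):
    """Words whose head is word i and that appear before it in the sentence."""
    return [words[j] for j in range(i) if heads[j] == i]


def rights(i):
    """Words whose head is word i and that appear after it in the sentence."""
    return [words[j] for j in range(i + 1, len(words)) if heads[j] == i]


print(lefts(2))   # children to the left of "some"
print(lefts(3))   # children to the left of "sentence"
print(lefts(8))   # children to the left of "appreciate"
print(rights(1))  # children to the right of "is"
```

Under this reading, "some" is a determiner with no dependents of its own, so an empty list for `doc[2].lefts` would be expected behaviour rather than a bug, while `doc[3].lefts` would be the place to look for "some".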