After running topic modeling with LDA, I ended up with the following dataset.
      Document_No  Dominant_Topic  Topic_Perc_Contrib  Keywords                                           TextBookFindings                                    Disease/Drugs
0     0            3.0             0.7625              hypotension, bradycardia, mydriasis, hypersali     Hypotension,hypothermia,bradycardia                 NSAIDS Poisoning
1     1            5.0             0.6833              edema, cyanosis, cardiacarrest, lacrimation,       Hyperventilation,respiratoryalkalosis,edema–br...   NSAIDS Poisoning
2     2            0.0             0.8100              vomiting, nausea, diarrhea, abdominalpain,         Nausea,vomiting,diarrhea,abdominalpain–             NSAIDS Poisoning
3     3            0.0             0.2625              vomiting, nausea, diarrhea, abdominalpain,         GIbleeding,pancreatitis,hepaticinjury               NSAIDS Poisoning
4     4            1.0             0.4463              insomnia, drowsiness, irritability, neurotoxic     Headache,dizziness,encephalopathy,irritability      NSAIDS Poisoning
...   ...          ...             ...                 ...                                                ...                                                 ...
1446  1446         7.0             0.5250              weakness, muscle, coagulopathy, fasciculations...  metabolicacidosis,Elevatedlactateconcentration...   Neuroleptic malignant syndrome (NMS)
1447  1447         0.0             0.0500              vomiting, nausea, diarrhea, abdominalpain, pan...  hematologictoxicity                                 Neuroleptic malignant syndrome (NMS)
1448  1448         0.0             0.5250              vomiting, nausea, diarrhea, abdominalpain, pan...  Pancreatitis                                        NaN
1449  1449         0.0             0.0500              vomiting, nausea, diarrhea, abdominalpain, pan...  Hypersensitivity                                    Neuroleptic malignant syndrome (NMS)
1450  1450         0.0             0.0500              vomiting, nausea, diarrhea, abdominalpain, pan...  sensoryperipheralneuropathy                         NaN
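I want to build a prediction system that, given symptoms as input, shows the percentage match for each disease/drug. So I want to build a predictive model between the Keywords column and the Disease/Drugs column.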
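To make the goal concrete, here is a minimal sketch of the input and output I have in mind. It is not a proposed solution. It assumes a hand-rolled multinomial Naive Bayes with add-one smoothing as a placeholder model, a few toy rows shortened from the visible part of the table above, and hypothetical helper names (`tokenize`, `train`, `match_percentages`) that exist only for this illustration.

```python
import math
from collections import Counter, defaultdict

# Toy rows (keywords, label), shortened from the visible part of the table.
# Assumption: rows whose Disease/Drugs value is NaN are dropped beforehand.
ROWS = [
    ("hypotension, bradycardia, mydriasis", "NSAIDS Poisoning"),
    ("vomiting, nausea, diarrhea, abdominalpain", "NSAIDS Poisoning"),
    ("insomnia, drowsiness, irritability", "NSAIDS Poisoning"),
    ("weakness, muscle, coagulopathy", "Neuroleptic malignant syndrome (NMS)"),
    ("vomiting, nausea, diarrhea, abdominalpain", "Neuroleptic malignant syndrome (NMS)"),
]


def tokenize(text):
    """Split a comma-separated keyword string into lowercase tokens."""
    return [tok.strip().lower() for tok in text.split(",") if tok.strip()]


def train(rows):
    """Count documents per label and keyword occurrences per label."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for keywords, label in rows:
        label_counts[label] += 1
        for tok in tokenize(keywords):
            word_counts[label][tok] += 1
            vocab.add(tok)
    return label_counts, word_counts, vocab


def match_percentages(symptoms, model):
    """Return {label: percentage} for the given symptom string."""
    label_counts, word_counts, vocab = model
    total_docs = sum(label_counts.values())
    log_scores = {}
    for label, n_docs in label_counts.items():
        total_words = sum(word_counts[label].values())
        score = math.log(n_docs / total_docs)  # log prior
        for tok in tokenize(symptoms):
            if tok not in vocab:
                continue  # symptoms never seen in training are ignored
            # Add-one (Laplace) smoothed log likelihood of the keyword.
            score += math.log(
                (word_counts[label][tok] + 1) / (total_words + len(vocab))
            )
        log_scores[label] = score
    # Normalize the log scores into percentages that sum to 100.
    top = max(log_scores.values())
    weights = {label: math.exp(s - top) for label, s in log_scores.items()}
    norm = sum(weights.values())
    return {label: 100.0 * w / norm for label, w in weights.items()}


model = train(ROWS)
result = match_percentages("weakness, muscle", model)
for label, pct in sorted(result.items(), key=lambda kv: -kv[1]):
    print(f"{label}: {pct:.1f}%")
```

The part I care about is the shape of the result: a symptom string goes in, and one percentage per Disease/Drugs value comes out.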
Which algorithm would be the best fit, and do you have any suggestions on how to move forward with this?