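Citation:
ZHANG Lunlun, REN Gao, ZOU Beiji, LIU Qingping. Research on entity extraction of TCM medical cases based on terminology dictionary[J]. Journal of Hunan University of Chinese Medicine (English edition), 2024, 44(6): 1110-1116.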
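DOI: 10.3969/j.issn.1674-070X.2024.06.026
Received: December 15, 2023
Funding: Outstanding Youth Project of Scientific Research of the Hunan Provincial Department of Education (22B0385); 2022 Discipline Construction "Open Competition" (Jiebang Guashuai) Project (22JBZ051); Key Laboratory of Smart TCM Engineering Technology, Hunan Provincial Administration of Traditional Chinese Medicine.
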
Research on entity extraction of TCM medical cases based on terminology dictionary
ZHANG Lunlun, REN Gao, ZOU Beiji, LIU Qingping
(School of Informatics, Hunan University of Chinese Medicine, Changsha, Hunan 410208, China; School of Computer Science and Engineering, Central South University, Changsha, Hunan 410083, China)
Abstract:
Objective: To extract six categories of entities (symptoms, etiology and pathogenesis, treatment principles, medication, prescriptions, and acupoint selection) from traditional Chinese medicine (TCM) medical cases, laying a foundation for the construction of TCM medical case knowledge graphs and for intelligent assistance in TCM diagnosis and treatment. Methods: Based on the characteristics of TCM medical case texts, a dynamically updatable terminology dictionary method was proposed for word segmentation, and its effectiveness was validated on medical cases of TCM neurological disorders and on three publicly available datasets: ChineseBLUE/cEHRNER, ChineseBLUE/cMedQANER, and CBLUE/CMeEE. Results: The model using the terminology dictionary achieved higher accuracy, precision, recall, and F1 values than the model without it. The F1 values on the test set and the validation set were 92.07% and 93.04%, respectively. Conclusion: A model that integrates the dynamically updatable terminology dictionary segmentation method can better recognize domain-specific terms and new entities in the TCM field, improve the accuracy of key information identification in TCM medical cases, and promote the inheritance and development of TCM knowledge.
Key words: medical cases of Chinese medicine; neurological disorders; terminology dictionary; entity extraction; IDCNN-CRF model
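
To make the idea of a dynamically updatable terminology dictionary for word segmentation more concrete, the following is a minimal sketch. The abstract does not state which matching algorithm the authors use, so forward maximum matching is swapped in here purely as a common illustrative choice. The class name `TermDictionary`, its methods, and the sample terms are all hypothetical and not taken from the paper.

```python
class TermDictionary:
    """Hypothetical updatable terminology dictionary (illustrative only)."""

    def __init__(self, terms=()):
        self.terms = set(terms)
        # Longest term length bounds the matching window.
        self.max_len = max((len(t) for t in self.terms), default=1)

    def add(self, term):
        """Register a new term so later segment() calls can match it."""
        self.terms.add(term)
        self.max_len = max(self.max_len, len(term))

    def segment(self, text):
        """Forward maximum matching: at each position take the longest
        dictionary term; fall back to a single character otherwise."""
        tokens, i = [], 0
        while i < len(text):
            for size in range(min(self.max_len, len(text) - i), 0, -1):
                piece = text[i:i + size]
                if size == 1 or piece in self.terms:
                    tokens.append(piece)
                    i += size
                    break
        return tokens


# Assumed sample terms: two symptoms and, later, one prescription name.
dictionary = TermDictionary(["头痛", "眩晕"])
before = dictionary.segment("予天麻钩藤饮")  # prescription not yet known
dictionary.add("天麻钩藤饮")                 # dynamic update
after = dictionary.segment("予天麻钩藤饮")   # now kept as one token
print(before)
print(after)
```

In this sketch, a term added at run time is picked up by the next segmentation call without rebuilding anything, which is the property the abstract describes as "dynamically updatable". How the segmented tokens feed the IDCNN-CRF model is not covered here.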