High-Quality Answer Identification in Online Health Communities Based on a Time-Series Graph Convolutional Neural Network

Sun Wenjing¹, Ma Jie¹,², Hao Zhiyuan³

1. School of Business and Management, Jilin University, Changchun 130012
2. Information Resource Research Center, Jilin University, Changchun 130012
3. School of Public Administration, Nanjing University of Finance & Economics, Nanjing 210023
Abstract: Consultation modules on online health platforms are important channels through which users seek health advice and obtain health information under the "Internet + medical treatment" framework. Accurately identifying high-quality answers helps users make informed health decisions. This study takes consultation texts from the "Good Doctor Online" platform as its research object and proposes a high-quality answer identification model based on a time-series graph convolutional neural network. Drawing on dual-process theory and graph theory, the study designs a feature system for measuring answer quality and uses it to build an experimental dataset. The model uses GraphSAGE (graph sample and aggregate), a graph convolutional neural network, to integrate the refined indicators of the quality measurement feature system. It then incorporates a gated recurrent unit (GRU) into GraphSAGE and takes a constructed "doctor-question" network graph as input, yielding the GraphSAGE-GRU model, which predicts answer quality and identifies high-quality answers. Support vector machine (SVM), decision tree (DT), k-nearest neighbor (kNN), convolutional neural network (CNN), GRU, and graph convolutional network (GCN) models serve as baselines in the comparison experiments. The results show that the proposed model achieves an accuracy of 93.2%, the best performance among the compared methods in identifying high-quality answers, and that it identifies nearly 95% of the high-quality answer samples in the experimental data.
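To make the GraphSAGE-GRU idea in the abstract more concrete, the following is a minimal sketch, not the authors' implementation. It assumes that the "doctor-question" graph is observed as a sequence of snapshots, that GraphSAGE uses a mean aggregator, and that the GRU consumes one node embedding per snapshot. All names (`sage_layer`, `gru_step`, `predict_quality`, `init_params`), the toy features, and the random untrained weights are hypothetical and chosen only for illustration.

```python
import math
import random


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def add(*vectors):
    """Element-wise sum of equally sized vectors."""
    return [sum(parts) for parts in zip(*vectors)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def sage_layer(features, neighbors, W_self, W_neigh):
    """One GraphSAGE layer: ReLU(W_self x_v + W_neigh mean(x_u, u in N(v)))."""
    out = {}
    for v, x in features.items():
        nbrs = neighbors.get(v, [])
        if nbrs:
            mean = [sum(features[u][i] for u in nbrs) / len(nbrs)
                    for i in range(len(x))]
        else:
            mean = [0.0] * len(x)  # isolated node: no neighbor signal
        z = add(matvec(W_self, x), matvec(W_neigh, mean))
        out[v] = [max(0.0, zi) for zi in z]
    return out


def gru_step(x, h, P):
    """Standard GRU cell update (biases omitted to keep the sketch short)."""
    z = [sigmoid(a) for a in add(matvec(P["Wz"], x), matvec(P["Uz"], h))]
    r = [sigmoid(a) for a in add(matvec(P["Wr"], x), matvec(P["Ur"], h))]
    rh = [ri * hi for ri, hi in zip(r, h)]
    c = [math.tanh(a) for a in add(matvec(P["Wh"], x), matvec(P["Uh"], rh))]
    return [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, c)]


def init_params(in_dim, hidden, seed=0):
    """Random, untrained weights (assumption: embedding size == hidden size)."""
    rng = random.Random(seed)
    names = ("Wz", "Uz", "Wr", "Ur", "Wh", "Uh")
    return {
        "hidden": hidden,
        "W_self": rand_matrix(hidden, in_dim, rng),
        "W_neigh": rand_matrix(hidden, in_dim, rng),
        "gru": {n: rand_matrix(hidden, hidden, rng) for n in names},
        "w_out": [rng.uniform(-0.5, 0.5) for _ in range(hidden)],
    }


def predict_quality(snapshots, node, params):
    """Embed `node` in each snapshot with GraphSAGE, feed the sequence to a GRU,
    and map the final hidden state to a probability-like quality score."""
    h = [0.0] * params["hidden"]
    for features, neighbors in snapshots:
        emb = sage_layer(features, neighbors, params["W_self"], params["W_neigh"])
        h = gru_step(emb[node], h, params["gru"])
    return sigmoid(sum(w * hi for w, hi in zip(params["w_out"], h)))


if __name__ == "__main__":
    # Hypothetical toy graph: one doctor linked to two questions, two snapshots.
    neighbors = {"doctor_1": ["question_1", "question_2"],
                 "question_1": ["doctor_1"], "question_2": ["doctor_1"]}
    t1 = {"doctor_1": [0.9, 0.2, 0.1], "question_1": [0.3, 0.8, 0.5],
          "question_2": [0.1, 0.4, 0.7]}
    t2 = {"doctor_1": [0.8, 0.3, 0.2], "question_1": [0.4, 0.7, 0.6],
          "question_2": [0.2, 0.5, 0.6]}
    params = init_params(in_dim=3, hidden=4)
    print(predict_quality([(t1, neighbors), (t2, neighbors)], "question_1", params))
```

Because the weights are random, the printed score carries no meaning; in practice the parameters would be trained on labeled answers, and the feature vectors would come from the quality measurement feature system described in the abstract.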
Received: 01 September 2024