| Intelligence Theories and Methods |
|
|
|
|
332 |
Attribution Analysis of Research Collaboration Frequency Based on XGBoost-SHAP Framework |
|
 |
Peng Zhaoqi, Shi Bin, Yang Alex Jie, Deng Sanhong |
|
|
DOI: 10.3772/j.issn.1000-0135.2026.03.002 |
|
|
The frequency of scientific collaboration may exhibit complex characteristics, and it plays a key role in comprehensively understanding the intensity and patterns of researchers’ collaborative relationships. To reveal the key drivers and inherent patterns behind varying collaboration frequencies, this study selected 13,220,951 authentic collaborative relationships from the PubMed Knowledge Graph 2.0 (PKG 2.0) biomedical dataset as research samples. First, collaboration frequency was treated as a proxy variable for the strength of the collaborative relationship and was categorized into low, medium, and high levels based on its distribution characteristics. Second, ten variables across four dimensions (research similarity, achievement production models, academic capital, and individual attributes) were selected to form a feature system describing author collaboration intensity. The eXtreme Gradient Boosting (XGBoost) algorithm was then employed to capture the complex correlations among high-dimensional features. Finally, the SHapley Additive exPlanations (SHAP) framework was used to analyze the attribution of the prediction model and to evaluate the degree of influence and the interaction mechanisms of the features at different collaboration intensities. The study reveals that thematic similarity among authors plays a dominant role in the classification of collaboration frequency, followed by differences in the number of papers and citations (total and average). High thematic congruence is the core factor sustaining high-frequency collaboration, whereas low-frequency collaboration prioritizes cumulative output over individual paper impact. The effects of thematic similarity, paper count differences, and H-index variations on collaboration frequency followed distinct patterns: bimodal symmetry, threshold stability, and diminishing marginal effects, respectively. Furthermore, high-frequency collaborations demonstrated remarkable tolerance for heterogeneous factors such as knowledge structure or influence composition, often forming effective complementary division-of-labor mechanisms through coordinated efforts. |
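The attribution idea behind SHAP can be illustrated with a minimal sketch of exact Shapley values in plain Python. This is not the authors' pipeline: `toy_score` is a hypothetical stand-in for a trained XGBoost model, the three feature names and values are invented for illustration, and the SHAP library relies on faster tree-specific algorithms rather than the brute-force enumeration shown here.

```python
from itertools import permutations


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one sample.

    Averages each feature's marginal contribution over every ordering in
    which features are switched from the baseline value to the sample value.
    Cost grows factorially, so this is only practical for a few features.
    """
    n = len(x)
    phi = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        current = list(baseline)
        prev = predict(current)
        for i in order:
            current[i] = x[i]          # reveal feature i
            new = predict(current)
            phi[i] += new - prev       # marginal contribution of feature i
            prev = new
    return [p / len(orderings) for p in phi]


def toy_score(features):
    """Hypothetical model: one additive term per feature plus one interaction."""
    topic_sim, paper_diff, h_diff = features
    return 2.0 * topic_sim - 0.5 * paper_diff + 0.3 * topic_sim * h_diff


x = [0.9, 0.2, 0.5]           # assumed feature values for one author pair
baseline = [0.0, 0.0, 0.0]    # assumed reference point
phi = shapley_values(toy_score, x, baseline)

for name, value in zip(["topic_sim", "paper_diff", "h_diff"], phi):
    print(f"{name}: {value:+.4f}")
```

The attributions sum to the difference between the prediction for `x` and the prediction for the baseline, which is the additivity property that gives SHAP its name. The interaction term is split evenly between the two features involved.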
|
|
2026 Vol. 45 (3): 332-349
|
|
|
|
350 |
Research Topic Recommendation for Scholars Based on Heterogeneous Graph Neural Networks |
|
 |
Huo Chaoguang, Dan Tingting, Pang Zengyao |
|
|
DOI: 10.3772/j.issn.1000-0135.2026.03.003 |
|
|
Research topic selection is critical for academic career development. However, identifying suitable candidates from a large number of options remains a significant challenge. To address this issue, this study developed a personalized topic recommendation method to help scholars discover topics that are relevant to their research bases and interests. Given the limitations of existing recommendation methods in handling the heterogeneity of node and relationship types in scholarly networks, this study proposes a research topic recommendation model for scholars based on a heterogeneous graph neural network. Three feature aggregation modes for scholars and topics were constructed using the heterogeneous graph attention network (HAN), heterogeneous graph transformer (HGT), and heterogeneous graph contrastive learning (HGCL), with a focus on message passing and aggregation mechanisms, to learn the complex interaction patterns between scholars and topics and to capture the patterns of topic diffusion and scholar topic selection. Based on the aggregated features, classifiers such as logistic regression, random forest, and multilayer perceptron were used for recommendation discrimination. Empirical tests were conducted using Scopus data on 6,521 scholars at Renmin University of China and the more than 30,000 research topics they cover. The results showed that the recommendation model based on HGCL integrated with a multilayer perceptron had the highest precision (88.25%), a 9.29-percentage-point improvement over the baseline model SVD. Meanwhile, the model based on HAN integrated with a multilayer perceptron outperformed the baseline model in recall and F1-score, achieving 96.41% and 91.92%, improvements of 17.21 and 12.56 percentage points, respectively. This is the first study to construct a research topic recommendation model for scholars, focusing on the representation and learning of heterogeneity in scholar and topic nodes, as well as their relationships. This study provides a reference method for personalized topic selection by scholars and topic-level academic resource recommendations. |
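For readers less familiar with the reported metrics, the following minimal sketch shows how precision, recall, and F1-score are computed from confusion counts, and how a percentage-point gain differs from a relative gain. The counts and percentages are invented for illustration and are not taken from the study.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall, and F1 from true/false positive and false negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


def percentage_point_gain(model_pct, baseline_pct):
    """Absolute difference between two percentages, in percentage points."""
    return model_pct - baseline_pct


def relative_gain(model_pct, baseline_pct):
    """Relative improvement over the baseline, as a percentage of the baseline."""
    return 100.0 * (model_pct - baseline_pct) / baseline_pct


# Hypothetical counts: 80 correct recommendations, 20 wrong ones, 10 missed topics.
p, r, f1 = precision_recall_f1(tp=80, fp=20, fn=10)
print(f"precision={p:.4f} recall={r:.4f} f1={f1:.4f}")

# Hypothetical scores: moving from 80.0% to 88.0% is 8 percentage points,
# which is a 10% relative improvement.
pp = percentage_point_gain(88.0, 80.0)
rel = relative_gain(88.0, 80.0)
print(f"{pp:.1f} percentage points, {rel:.1f}% relative")
```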
|
|
2026 Vol. 45 (3): 350-362
|
|
|
|
376 |
Identification of Technical R&D Cooperation Partners Based on Heterogeneous Graph Embedding and Link Prediction |
|
 |
Pan Hong, Wang Ming, Zhao Kai, Zhai Liang, Liang Guoqiang, Zhai Dongsheng |
|
|
DOI: 10.3772/j.issn.1000-0135.2026.03.005 |
|
|
Technology research and development (R&D) partner identification is key to increasing innovation performance. To address the limited attribute portrayal of technology R&D subjects and the limited recognition accuracy caused by sparse cooperation matrices in existing methods, this study proposes a technology R&D partner identification model (technology development partner identification integrating heterogeneous graph embedding link prediction, HGE-LP-TDPI) that integrates heterogeneous graph embedding and link prediction. First, the ontology structure of the technology R&D partner identification domain was constructed, and semantic extraction of patent specifications was performed based on large language models (LLMs) to generate a heterogeneous graph containing multidimensional technology associations of technology R&D subjects. Second, meta-paths characterizing technology associations were designed, and semantic information was aggregated using a long short-term memory (LSTM) temporal encoder with a multilevel attention mechanism. Finally, partners were identified using a link prediction algorithm. Empirical studies in the field of electrochemical energy storage reveal the following: first, the HGE-LP-TDPI model significantly outperforms the benchmark model on key indicators such as the area under the curve (AUC), which reaches 95.62%, confirming the effectiveness of multidimensional technology linkage fusion and heterogeneous graph embedding in solving the problem of data sparsity; second, meta-path weighting analysis reveals that the technical problem drive is the core factor of cooperation formation, with the application domain and technical efficacy having the second highest influence; third, attribute ablation experiments show that technical-dimension and knowledge-dimension attributes contribute the most to the model results. |
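The final link prediction and AUC evaluation steps can be sketched in a few lines of plain Python. This is an illustrative sketch only: the two-dimensional embeddings, the organization names, and the dot-product scorer are assumptions, whereas the HGE-LP-TDPI model learns its embeddings from meta-paths with an LSTM encoder and attention.

```python
import math


def link_score(u, v):
    """Score a candidate link as the sigmoid of the embedding dot product."""
    dot = sum(a * b for a, b in zip(u, v))
    return 1.0 / (1.0 + math.exp(-dot))


def auc(pos_scores, neg_scores):
    """AUC as the probability that a true link outscores a non-link (ties count half)."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical embeddings for four R&D organizations.
emb = {
    "org_a": [1.0, 0.2],
    "org_b": [0.9, 0.1],
    "org_c": [-0.8, 0.5],
    "org_d": [-0.7, 0.6],
}
known_partners = [("org_a", "org_b"), ("org_c", "org_d")]   # assumed true links
non_partners = [("org_a", "org_c"), ("org_b", "org_d")]     # assumed negatives

pos = [link_score(emb[u], emb[v]) for u, v in known_partners]
neg = [link_score(emb[u], emb[v]) for u, v in non_partners]
toy_auc = auc(pos, neg)
print(f"AUC on the toy pairs: {toy_auc:.2f}")
```

Because AUC compares every positive pair against every negative pair, it does not depend on a decision threshold, which makes it a common choice when the cooperation matrix is sparse and true links are rare.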
|
|
2026 Vol. 45 (3): 376-392
|
|
|
|
393 |
Anchor-Line-Network Framework: Potential Release Path of Public-Data Elements Under Resource-Dependence Theory |
|
 |
Ma Haiqun, Zhang Bin |
|
|
DOI: 10.3772/j.issn.1000-0135.2026.03.006 |
|
|
To realize the potential release of public-data elements, this study focuses on structural barriers such as “data silos,” “lack of standards,” and “privacy and regulatory deficiencies” in the “sharing-open-authorized operation” stage. An “anchor-line-network” path framework under the resource-dependence theory is proposed to mitigate the effects of inter-departmental and inter-industry data flow chains and enhance the collaborative efficiency between institutional supply and market-driven allocation. The framework is constructed with “sharing,” “open,” and “authorized operation” as the “anchor,” “line,” and “network,” respectively. Following the logic of “dilemma deconstruction-path reconstruction,” this study identifies bottlenecks and suggests the following countermeasures: standardization and phantomization to link the starting point, tiered openness and privacy protection to extend the chain, and tiered authorization and transparent supervision to stabilize the network while improving operational performance through benefit distribution and process optimization. This study shows that the three levels exhibit hierarchical collaboration and increasingly interdependent relationships: The “anchor” breaks the silos via unified standards and phantomization, the “line” achieves highly trusted circulation through tiered openness and security mechanisms, and the “network” constructs a stable ecosystem through standardized authorization, enhanced supervision, and optimized allocation. Based on these findings, the authors suggest simultaneously advancing the unified standards and incentives, improving tiered openness and security, and strengthening the authorization operation and supervision system to systematically release the value of public-data elements. |
|
|
2026 Vol. 45 (3): 393-401
|
|
|
|
402 |
Cultural Heritage Smart Data Generation and Its Scenario-based Service Patterns Empowered by Collective Intelligence |
|
 |
Yang Simin, Wang Hao |
|
|
DOI: 10.3772/j.issn.1000-0135.2026.03.007 |
|
|
Driven by the emergence of human-AI intelligence, collective intelligence provides a new human-machine collaboration paradigm for the digital and intelligent transformation of cultural heritage. This study aims to achieve dynamic coupling between the generation of cultural heritage smart data and scenario-based services through collective intelligence empowerment, thereby promoting the synergistic upgrading of data and services. First, through a literature analysis, this study synthesizes the use of collective intelligence in smart data generation and scenario-based services in the cultural heritage field. Second, it investigates the “data-learning-intelligence” generation path of smart data under the empowerment of collective intelligence and analyzes the constituent elements of such data. Third, using the dimensions of demand, service, and technology, it elaborates on the logic of the scenario-based service model for cultural heritage smart data. Finally, using the preventive protection scenarios of traditional crafts under intangible cultural heritage as a case study, this study demonstrates the effectiveness of collective intelligence in early warning of craft endangerment, market quality risk assessment, and intelligent decision-making services. This study aims to demonstrate the efficacy of collective intelligence in “externalizing tacit knowledge, breaking through AI black boxes, and enhancing cross-subject collaboration,” to provide theoretical references for the intelligent development of cultural heritage in the digital-intelligent age. |
|
|
2026 Vol. 45 (3): 402-416
|
|
|
|
417 |
User Role Recognition of Social Media Public Opinion Based on Heterogeneous Hypernetwork Representation Learning from the Perspective of Symbolic Interaction Theory |
|
 |
Shen Wang, Sun Ke, Li He, Liu Jiayu |
|
|
DOI: 10.3772/j.issn.1000-0135.2026.03.008 |
|
|
The core challenge of social media public opinion analysis lies in its multi-dimensional and heterogeneous information interaction mechanism, which makes accurate identification of user roles the key to understanding the laws of information diffusion and guiding public opinion accurately. To address the current research limitations of overemphasizing high-impact user identification and inadequately integrating semantic information with network topology, this study proposes a user role identification method that integrates symbolic interaction theory and heterogeneous hypernetwork representation learning. Initially, grounded in symbolic interaction theory, we constructed a three-layer heterogeneous hypernetwork model comprising social behavior, information dissemination, and topic semantic layers to systematically characterize the multi-dimensional interaction patterns of users. Subsequently, a heterogeneous hypernetwork representation learning method comprising a multi-layer graph attention network and a cross-network feature fusion strategy was designed based on user-information-topic meta-paths to enhance node feature representation and generate semantically consistent fused user feature vectors. Finally, user roles were identified in an unsupervised manner using the k-means clustering algorithm, with t-distributed stochastic neighbor embedding (t-SNE) employed for visual validation. Experiments on real-world social media public opinion datasets demonstrated that the proposed model effectively integrates multi-source heterogeneous information, enhancing both the accuracy and interpretability of user role identification. Ablation studies further validated the effectiveness of each module. The proposed method not only offers a novel technical approach for public opinion analysis but also provides a scientific basis for implementing differentiated and precise governance strategies through its interpretable feature fusion mechanism. |
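The unsupervised role identification step can be sketched with a small k-means implementation in plain Python. The two-dimensional vectors below are invented stand-ins for the fused user feature vectors, and the deterministic initialization (the first `k` points) is an assumption made so the example is reproducible. The study's actual features, number of clusters, and initialization are not stated in the abstract.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two vectors of equal length."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iters=20):
    """Basic k-means (Lloyd's algorithm) with the first k points as initial centroids."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical fused feature vectors for six users.
user_vectors = [
    [0.0, 0.1], [5.0, 5.1], [0.2, 0.0],
    [5.2, 4.9], [0.1, 0.2], [4.9, 5.0],
]
labels, centroids = kmeans(user_vectors, k=2)
print(labels)
```

Each resulting cluster would then be interpreted as a user role by inspecting the features of its members, and a projection such as t-SNE can be used to check visually that the clusters are separated.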
|
|
2026 Vol. 45 (3): 417-432
|
|
|
| Intelligence Technology and Application |
|
|
|
|
447 |
Construction and Evaluation of Intelligent Medical Q&A System for Chronic Diseases: A Hybrid Modeling Approach Based on GraphRAG and MoE |
|
 |
Ma Xin, Wang Fang, Zhang Feng, Li Zhaochuan |
|
|
DOI: 10.3772/j.issn.1000-0135.2026.03.010 |
|
|
With chronic diseases becoming a core challenge in global public health, there is a growing demand for high-quality, personalized, and sustainable health information services for patients. Although large language models (LLMs) have recently demonstrated significant advantages in medical Q&A tasks, existing systems still face bottlenecks such as coarse knowledge scheduling granularity, weak semantic awareness, and poor response consistency in chronic disease management, a long-term, contextualized, and semantically complex scenario. To this end, this study follows a three-stage process of “structured knowledge modeling, hybrid-driven retrieval, and large model cooperative response” and proposes a chronic disease Q&A system that fuses graph retrieval-augmented generation (GraphRAG) with mixture of experts (MoE). First, a multi-source heterogeneous proprietary knowledge graph covering disease evolution, lifestyle intervention, and long-term management was constructed. Second, a three-channel hybrid graph retrieval mechanism combining keyword matching with semantic vector recall, multi-model collaborative answering, and multi-subgraph cue fusion was built to achieve dynamic knowledge scheduling for complex information requirements. Third, an initial screening of the legitimacy and normality of user queries was combined with the user’s implicit intent and the fused subgraphs, and an MoE dynamic gating network coordinated nine open-source large models at the command level, enhancing both semantic depth and generation precision. Finally, common sense and content integrity checks were introduced, along with formatting rules, to refine the preliminary generated text and ensure the accuracy, integrity, and linguistic readability of the output. The CdMedQA test set built for the experiment covered 118 chronic diseases and three types of management tasks. The performance of the system was validated using a combination of objective index evaluation and subjective satisfaction comparisons. The results show that the proposed system significantly outperformed multiple general-purpose and healthcare-specific large model baselines in terms of accuracy, clarity, personalization, and contextual adaptability. This provides not only a new path for the intelligent management of chronic diseases but also theoretical support and technical solutions for optimizing human-computer interaction and enhancing the credibility of generated content driven by multi-source knowledge. |
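The role of a gating network in a mixture of experts can be sketched as follows. The abstract does not describe the form of the gate, so the top-k softmax gate below is an assumption, and the logits and expert scores are invented numbers. In the described system each expert would be one of the nine open-source models producing text, not a number.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_gate(logits, k):
    """Keep the k experts with the highest gate logits and renormalize their weights."""
    ranked = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)
    chosen = ranked[:k]
    weights = softmax([logits[i] for i in chosen])
    return dict(zip(chosen, weights))


def moe_combine(expert_outputs, gate):
    """Weighted sum of the selected experts' outputs."""
    return sum(weight * expert_outputs[i] for i, weight in gate.items())


# Hypothetical gate logits for four experts given one user query.
gate_logits = [2.0, 0.5, 1.0, -1.0]
# Hypothetical numeric outputs from each expert (for example, a confidence score).
expert_outputs = [0.9, 0.2, 0.5, 0.1]

gate = top_k_gate(gate_logits, k=2)
combined = moe_combine(expert_outputs, gate)
print(gate)
print(f"combined output: {combined:.4f}")
```

Routing each query to a small subset of experts keeps inference cost bounded while still letting different models specialize, which is the usual motivation for sparse gating.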
|
|
2026 Vol. 45 (3): 447-462
|
|
|
|