Full Abstracts

2025 Vol. 44, No. 9
Published: 2025-09-24

Special Topics
1075 Innovation Intelligence: Historical Context, Conceptual Connotation, and Functional Role
Zeng Jianxun, Lu Chunjiang, Lin Xin, Yuan Wei
DOI: 10.3772/j.issn.1000-0135.2025.09.001
With China’s deepening construction as an innovative nation and the accelerated implementation of innovation-driven development strategies, innovation intelligence in scientific and technological information has attracted increasing attention. To clarify the fundamental theoretical issues in the research and practice of innovation intelligence, this study conducts a scholarly analysis of innovation intelligence from the perspectives of historical context, conceptual connotations, and functional roles. This study first examines the historical context of innovation intelligence development, including the transformation of research paradigms under the stage of scientific competition, breakthroughs in critical technologies amid major technological rivalries, and high-quality development of new productive forces. Based on this, the concept of innovation intelligence is defined by analyzing its conceptual scope and fundamental characteristics. Finally, the functional roles of innovation intelligence are summarized in four areas: computable information services aimed at innovation entities, decision-making information support for innovation governance, frontier innovation sensing and risk prevention for tracking and monitoring, and open information collaboration for innovation ecosystem development.
2025 Vol. 44 (9): 1075-1082
Intelligence Theories and Methods
1083 An Idiomatic Metaphorical Word-Formation Method Based on a Large Language Model and Its Application: Knowledge Reorganization, Backtracking, and Discovery
Zhang Wei, Wang Dongbo, Liu Liu
DOI: 10.3772/j.issn.1000-0135.2025.09.002
In the age of digital intelligence, generative artificial intelligence (GenAI) has provided new impetus to traditional humanistic knowledge organization, mining, and production. Using the artificial intelligence generated content (AIGC) paradigm to reshape the information behaviors of idiomatic excerpts, followings, and curing of ancient literature into an intelligent word-formation mode is of great significance for the structural reorganization, historical retrospection, and conceptual discovery of existing humanities knowledge systems. This study proposes a metaphorical word-formation method for idioms based on a large language model (LLM) from the perspective of cultural genes and word formation. First, a metaphorical word-formation knowledge system is defined based on the origins of idioms, including phrases, objects (source domains), and emotions (target domains). A QA dataset is constructed using the “source text and word formation system” corpus. Subsequently, a generative LLM is introduced for keyphrase extraction and metaphor recognition in multi-task learning of idiom word formation, with a focus on exploring the enhancement effect of instruction fine-tuning in the word-formation LLM under the injection of dependency syntactic knowledge. The trained LLM can effectively generate metaphorical word-formation structures for idiom source texts. The Xunzi model outperforms general LLMs such as qwen7b, llama3_8b, and GPT-4o in various indicators across different tasks. Dependency syntax knowledge can effectively stimulate the understanding ability of the LLM, increasing the accuracy of vocabulary structure, object label, and emotion label recognition to 86.11%, 87.82%, and 85.39%, respectively. Considering “Tang Dynasty Poetry” as an example, it can be observed that the recognition of idioms in poems can achieve a chain knowledge reorganization of idioms, poems, and poets. The time-series analysis of the results generated by the LLM allows the origin tracing of 130 idioms (up to more than 1000 years forward) and completes the knowledge discovery of large-scale new phrases under the inheritance of idiom metaphorical cultural genes, compiling a thematic vocabulary of imagery with practical value in the cultural industry.
2025 Vol. 44 (9): 1083-1098
1099 An Exploration of the Novelty Measurement Task of Scientific Literature Driven by a Large Language Model
Zhang Lin, Li Sijia, Shi Shunshun, Gou Zhenyu, Huang Ying
DOI: 10.3772/j.issn.1000-0135.2025.09.003
To analyze the usability of large language models in the novelty measurement task of scientific literature, a large language model-driven novelty measurement method for scientific literature is proposed in this paper; it is based on research questions, methods, and conclusions, as well as other knowledge units of scientific literature. In this study, a prompt template is designed for the task of extracting knowledge units from scientific literature, and the Qwen2-72B-Instruct open-source large language model is modified with supervised fine-tuning (SFT) and direct preference optimization (DPO) techniques to extract knowledge units of questions, methods, and conclusions from the literature. The semantic embedding of knowledge units is realized, and the average aggregation idea is introduced to realize the semantic embedding of knowledge unit combinations. Further, the novelty of the “new” papers is measured by comparing the semantic embedding vectors between the new papers and old reference paper collections. The experimental results show that the fine-tuned model performs better than the benchmark model in extracting knowledge units from scientific literature. Compared with existing methods for calculating the novelty of papers, the scientific literature novelty measurement model based on knowledge units proposed in this paper can capture more refined novelty differences at the semantic level of the knowledge unit combination. Overall, the novelty measurement method for scientific literature driven by the large language model can better complete the novelty measurement task of scientific literature and enrich the novelty measurement method for scientific papers. In this study, experiments are only carried out on an abstract collection of Chinese papers in computer science and technology, and usability in other fields must be discussed further. Moreover, human assistance is still needed to improve the interpretability and reliability of results when using large language models.
2025 Vol. 44 (9): 1099-1113
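The abstract above describes averaging the embeddings of a paper's knowledge units (question, method, conclusion) and comparing the result with older reference papers. The following is only a minimal sketch of that idea, not the authors' implementation: the function names, the toy three-dimensional vectors, and the scoring rule (one minus the highest cosine similarity to any reference paper) are all assumptions made for illustration, since the abstract does not state the exact comparison formula.

```python
import math

def mean_pool(vectors):
    """Average equal-length vectors into one combined embedding."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def novelty(new_units, reference_papers):
    """Assumed score: 1 minus the highest similarity to any reference paper."""
    new_vec = mean_pool(new_units)
    best = max(cosine(new_vec, mean_pool(units)) for units in reference_papers)
    return 1.0 - best

# Hypothetical toy embeddings for the question, method, and conclusion units.
new_paper = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
old_same = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
old_diff = [[1.0, 0.0, 0.0], [1.0, 0.0, 0.0], [1.0, 0.0, 0.0]]

score_same = novelty(new_paper, [old_same])
score_diff = novelty(new_paper, [old_diff])
```

Under these assumptions a paper identical to a reference scores close to zero, while a paper whose unit combination differs from every reference scores higher. In practice the vectors would come from a sentence-embedding model rather than hand-written lists.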
1114 Research on the Post-hoc Explanation Recommendation Model Based on Generative AI
Li Weiqing, Wang Weijun, Huang Yinghui, Huang Wei, Zhang Rui
DOI: 10.3772/j.issn.1000-0135.2025.09.004
We propose a post-hoc explanation recommendation model based on generative artificial intelligence (GenAI), integrating the theory of consumption values into the recommendation process. This approach enhances the recommendation effectiveness while generating personalized textual explanations for users. Using generative AI prompt engineering, we assess fine-grained consumption value tendencies—functional, symbolic, economic, and emotional—extracted from product reviews. Based on these assessments, a user-item (value) preference interaction matrix is constructed, enabling the implementation of a value-driven recommendation model. Finally, the recommendation results and corresponding consumption value tendencies are fed into a generative AI-based explanation bot to produce tailored and contextually relevant explanations. The results demonstrate that the generative AI-based scoring bot achieves high accuracy, consistency, and diversity in evaluating consumption value tendencies, providing critical support for both the recommendation model and explanation system. By incorporating consumption values, the proposed model significantly improves recommendation accuracy and diversity while excelling in cold start and data sparsity scenarios. This approach offers new insights into addressing issues such as the “filter bubble” and over-specialization in recommendation systems. Additionally, the explanations generated by the generative AI-based bot are fluent, logically coherent, and can effectively communicate the underlying recommendation mechanisms and value tendencies. Compared with traditional methods, this approach is more flexible, efficient, and personalized, offering a novel solution to enhance the recommendation system transparency and foster user trust.
2025 Vol. 44 (9): 1114-1127
1128 Network Characterization of Domain Knowledge Clusters Based on Heterogeneous Information Network
Yang Xinyi, Yang Jianlin, Ye Wenhao
DOI: 10.3772/j.issn.1000-0135.2025.09.005
Domain knowledge clusters with multi-entity participation can reveal the structure of domain knowledge from macro and micro content and topology perspectives, which is critical for understanding the domain knowledge system as a whole. This study uses a heterogeneous information network (HIN) to model multiple knowledge entities, such as authors, papers, and journals, and their relationships, forming an HIN of domain knowledge. In network clustering, a graph neural network framework is introduced to integrate network structural features and textual content features to learn node vector representations; the node representations are then used to recalculate the weights of the links. Moreover, along with the strategy of community splitting and merging, we use network community detection algorithms to identify domain knowledge clusters. Finally, we analyze the domain knowledge clusters in terms of textual content and network characteristics to better understand their composition. Using a dataset in the fields of database, data mining, and information retrieval (DBDMIR) as empirical evidence, our framework improves clustering results by identifying domain knowledge clusters with clear semantics and a strong community structure. The textual features of domain knowledge clusters represent research topics, while their topological features reflect the formation mechanism and development situation. In particular, the star clusters that are formed by the relationships between papers published in journals reveal the research focus of key journals in the field. In contrast, networked clusters that are formed by dense citation relationships represent a relatively mature direction. In addition, networked clusters with sparse citation relationships and connectivity based on heterogeneous author-paper relationships represent emerging research directions. Inter-cluster connectivity analysis shows that preferred connections between domain knowledge clusters divide the domain into subdomains, while heterogeneous connection preferences demonstrate how knowledge is exchanged between knowledge clusters. An analysis that combines textual and topological features provides an overview of domain knowledge content and development, implying that domain knowledge clusters containing multiple entities have the potential to predict emerging topics.
2025 Vol. 44 (9): 1128-1143
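The abstract above mentions that learned node representations are used to recalculate link weights before community detection. A rough sketch of one way such a step could look is given below. It is an assumption for illustration only: the abstract does not say which similarity measure is used, so cosine similarity clipped at zero is swapped in here, and the node names and two-dimensional embeddings are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def reweight_edges(edges, embeddings):
    """Map each edge to a non-negative weight from its endpoint embeddings."""
    return {
        (u, v): max(0.0, cosine(embeddings[u], embeddings[v]))
        for u, v in edges
    }

# Hypothetical heterogeneous nodes (author, paper, journal) with toy embeddings.
embeddings = {
    "author:A": [1.0, 0.0],
    "paper:P1": [1.0, 0.0],
    "journal:J": [0.0, 1.0],
}
edges = [("author:A", "paper:P1"), ("paper:P1", "journal:J")]
weights = reweight_edges(edges, embeddings)
```

The resulting weighted edge list could then be passed to a weighted community detection algorithm, so that links between semantically similar nodes count for more than links between dissimilar ones.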
1144 Elucidation of Driving Force Mechanism of Medical Data Value Release Based on Data Value Chain
Mu Dongmei, Zhang Meng
DOI: 10.3772/j.issn.1000-0135.2025.09.006
The purpose of this study is to analyze the key factors and potential driving forces of medical data value release, and provide strategic guidance for the efficient transformation of medical data resources and intrinsic value release. Based on the data value chain theory, the logical framework of medical data value release is constructed, and an evolutionary game analysis of the core players in the stages of data capitalization, commercialization, and capitalization is carried out. Moreover, the dynamic trend of medical data value release under different strategies is simulated by a system dynamics model. The active participation of medical institutions alone is not enough to effectively promote the process of data capitalization, commercialization, and capitalization. However, with the active participation of medical institutions and governments, as well as effective regulation of the data market, the release of value of medical data will be significantly enhanced. The increase in the amount of appropriate data resources, control of government regulatory costs, and optimization of market regulation benefits help promote the commercialization and capitalization of medical data, and thereby promote the rapid accumulation and release of value. In addition, both the increase and decrease in trust crisis have a negative impact on the release of medical data value, while the reduction in data integration risk and reasonable regulation and control of regulatory transaction risk have a positive impact on the release of medical data value.
2025 Vol. 44 (9): 1144-1158
Intelligence Technology and Application
1159 Large Language Model Driven Academic Text Mining: Parameter-Efficient Fine-Tuning Strategy from the Tuning End
Liu Yinpeng, Lu Wei, Shi Xiang, Liu Jiawei, Cheng Qikai, Huang Yong
DOI: 10.3772/j.issn.1000-0135.2025.09.007
The ability to deeply understand academic texts has become a crucial support in intelligence work, and large language models (LLMs) have shown great potential in this area. LLMs can enhance knowledge extraction and utilization capabilities from both the inference end and the tuning end. Currently, in academic text mining, various instruction engineering techniques at the inference end struggle to fully leverage the deep semantic understanding capabilities of LLMs. Therefore, adapting model parameters for domain-specific tasks using techniques such as parameter-efficient fine-tuning (PEFT) at the tuning end has become the key for LLMs to empower academic text mining. The performance and efficiency of applying different PEFT methods to LLMs have not yet been systematically explored. This study constructs a PEFT framework and evaluation system for academic text mining. It evaluates the performance metrics and cost-efficiency of seven instruction-tuned LLMs after applying seven PEFT methods, exploring the capability boundaries of PEFT strategies and instruction-tuned LLMs in academic text mining. The experiments demonstrate that, among the various tuning methods, fine-tuning achieves the best performance. However, its advantage is not pronounced. By contrast, quantized low-rank adaptation (QLoRA) incurs the lowest computational cost, making it the most efficient PEFT method in terms of overall benefits. The performance differences following tuning across LLMs of varying sizes and architectures are minimal. Mistral-7B-Instruct-v0.1, which is smaller in scale, can achieve performance metrics comparable to those of models with 70B parameters when tuned with QLoRA. The LLMs show substantial improvements in performance across tasks such as citation function identification, scientific entity extraction, and scientific text reasoning, surpassing their performance on the instruction end by a significant margin. Compared with traditional deep learning models, LLMs at the tuning end comprehensively outperform them in academic text reasoning tasks and perform similarly to smaller models in scientific entity extraction and citation function identification tasks. Therefore, LLMs perform better in tasks with higher difficulty, whereas small models are more beneficial for simpler sequence labeling and classification tasks.
2025 Vol. 44 (9): 1159-1172
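To give a sense of why low-rank adaptation is cheap, the arithmetic behind it can be sketched as follows. Low-rank adaptation freezes a weight matrix of shape d_out x d_in and trains two small factors of shapes d_out x r and r x d_in, so the trainable count is r * (d_out + d_in); QLoRA additionally keeps the frozen base weights in 4-bit quantized form. The layer size and rank below are illustrative assumptions, not values reported in the abstract above.

```python
def lora_trainable_params(d_out, d_in, rank):
    """Parameters in the two low-rank factors (d_out x rank and rank x d_in)."""
    return rank * (d_out + d_in)

def trainable_fraction(d_out, d_in, rank):
    """Share of a full d_out x d_in matrix that the low-rank factors represent."""
    return lora_trainable_params(d_out, d_in, rank) / (d_out * d_in)

# Hypothetical example: one 4096 x 4096 projection matrix adapted at rank 8.
adapter_params = lora_trainable_params(4096, 4096, 8)
fraction = trainable_fraction(4096, 4096, 8)
```

With these assumed numbers the adapter trains 65,536 parameters against 16,777,216 in the full matrix, which is under half a percent of the layer.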
1173 Research on Text Representation Based on Improved Word Mover's Embedding
Cen Yonghua, Li Wenjing, Liu Xianzu
DOI: 10.3772/j.issn.1000-0135.2025.09.008
High-quality text representation serves as the foundation and guarantee for downstream text-processing tasks such as sentiment analysis and text classification. In response to the insufficient semantic accuracy and limited context window of traditional models, recent models based on word mover's/rotator's distance (WMD/WRD) or word mover's embedding (WME) have drawn special attention. To further this endeavor, this study introduces an improved word mover's embedding method based on latent Dirichlet allocation (LDA), namely LDA-WFR-WME. This approach overcomes the semantic bias arising from the uniform topic distribution assumption of general WME by initializing text-embedding dimensions through LDA modeling and rectifies the distance distortion caused by the excessive semantic difference between documents using WFR text distance. Experiments on multi-group short text sentiment analysis, long text classification, and text clustering demonstrated the superiority of the proposed method in text embedding over competitive models such as Doc2Vec, attention-bidirectional long short-term memory (Attention-BiLSTM), bidirectional encoder representations from transformers (BERT), attention-bidirectional gated recurrent unit - convolutional neural network (Attention-BiGRU-CNN), and bidirectional graph attention network (BiGAT). The supervised topic-proxy document generation combined with WFR document distance enhanced the semantic embedding of text.
2025 Vol. 44 (9): 1173-1191
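As background for the word mover's distance family mentioned in the abstract above, the sketch below computes the relaxed word mover's distance, a well-known lower bound in which each word sends all of its weight to the nearest word of the other document. It is not the LDA-WFR-WME method proposed in the paper; the function names and the two-dimensional word vectors are hypothetical and chosen only to keep the example small.

```python
import math
from collections import Counter

def nbow(tokens):
    """Normalized bag-of-words weights for a token list."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}

def euclid(u, v):
    """Euclidean distance between two word vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def one_sided(src, dst, vectors):
    """Cost when every source word moves entirely to its nearest target word."""
    return sum(
        weight * min(euclid(vectors[word], vectors[other]) for other in dst)
        for word, weight in src.items()
    )

def relaxed_wmd(tokens_a, tokens_b, vectors):
    """Tighter of the two one-sided relaxations (a lower bound on the full WMD)."""
    a, b = nbow(tokens_a), nbow(tokens_b)
    return max(one_sided(a, b, vectors), one_sided(b, a, vectors))

# Hypothetical toy word vectors.
vectors = {
    "good": [0.0, 0.0],
    "great": [0.0, 1.0],
    "film": [3.0, 0.0],
    "movie": [3.0, 1.0],
}
distance = relaxed_wmd(["good", "film"], ["great", "movie"], vectors)
same_doc = relaxed_wmd(["good", "film"], ["good", "film"], vectors)
```

The full WMD solves an optimal transport problem over the same word-to-word distances; the relaxation is shown here because it needs no solver and still illustrates how documents with no shared words can be close when their words are close in embedding space.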
Intelligence Users and Behavior
1192 Construction of User-Trusted Explainable Models for Academic Information Recommendation
Chen Yunyi, Wu Dan, Xia Zishuo
DOI: 10.3772/j.issn.1000-0135.2025.09.009
With the advancement of artificial intelligence, its unexplainable “black box” is hindering people’s understanding of and trust in the system, limiting the cooperative relationship between humans and artificial intelligence. We propose and validate a phased, problem-oriented user-trusted explainable model for the application scenario of academic information recommendation with the aim of improving the interpretability of artificial intelligence. Starting from the process of human-computer interaction, the model divides the whole process of interaction into three phases: initial contact, starting interaction, and in-depth collaboration, and specifies which problems must be explained at each phase, so as to enhance user trust and promote human-computer interaction. Subsequently, the effectiveness and rationality of the model are verified through experiments. Based on grounded theory, the interview content is coded at three levels: open coding, axial coding, and selective coding. The coding results are used to interpret and improve the model and to propose practical optimization strategies for interpretability. The phased, problem-oriented interpretable model proposed in this study can enhance user trust at multiple levels, and a mapping relationship exists between its explanatory content, presentation form, and language style on the one hand and the system cognition, usage intention, and usage behavior of users on the other. Based on this, we propose specific guidance strategies for each interactive stage from the perspectives of explanatory content and presentation form, aiming to have a positive impact on the construction of interpretable artificial intelligence.
2025 Vol. 44 (9): 1192-1203
Intelligence Reviews and Comments
1204 Review on Division of Labor in Scientometrics
Tian Xuecan, Chen Liyue, Ding Jielan
DOI: 10.3772/j.issn.1000-0135.2025.09.010
Scientific collaboration is an important mode for researchers to conduct research activities. As the foundation of scientific collaboration, the division of labor (DOL) is significant for optimizing team organization, improving research efficiency, and refining the contribution evaluation mechanism. We present a systematic review of the research on the classification, identification, and application of DOL in scientometrics, providing a reference for subsequent studies. First, starting from the taxonomy of DOL, we systematically investigate and sort out the taxonomies proposed by academic organizations, academic journals, and scholars. Second, we sort out the classification methods of DOL, which are divided into two levels: manual classification methods based on small-scale data and automatic classification methods based on large-scale data. Third, we sort out the application research of DOL from three aspects: characteristic research, structural research, and utility research. Finally, we discuss the frontier trend of artificial intelligence as a new type of labor division subject. We show that the taxonomies of DOL are relatively scattered, mainly application-oriented, and lack systematic theoretical research support. The classification methods of DOL have gradually shifted from manual annotation to text mining and machine learning, but the use of large language models is relatively rare. Regarding the application research, most studies have focused on revealing the features of DOL, while few have examined the organizational model and effects of DOL. In addition, artificial intelligence, as a new subject of DOL, has prompted new thinking about scientific collaboration models.
2025 Vol. 44 (9): 1204-1216