情报学报

Full Abstracts

2025 Vol. 44, No. 7 Published: 2025-07-24

Special Topics

	797	Theoretical Interpretation and Implementation Path of Information Empowerment for National Science and Technology Security
		Li Gang, Sun Jie, Mao Jin
		DOI: 10.3772/j.issn.1000-0135.2025.07.001
		In the context of global technological competition, national science and technology security has become a strategic foundation for national sovereignty, security, and development interests. This study focuses on information empowerment for national science and technology security as its core proposition and systematically investigates the theoretical framework, methodological system, and practical pathways of information in the field of national science and technology security. First, based on the contemporary background and practical requirements, this study explains the connotation, characteristics, and risk challenges of national science and technology security, with emphasis on the theoretical implications of information empowerment in risk prevention, control, and strategic competition. Second, it constructs the logical framework for “Information Empowerment for National Science and Technology Security” (IE-NSTS). By analyzing this framework from the perspectives of its functional modules, enabling mechanisms, and system architecture, it proposes an intelligent information methodology. This methodology is driven by the fusion of data and knowledge, based on technologies such as a dynamic science and technology security knowledge graph and multimodal knowledge representation and reasoning. Finally, this study presents implementation pathways for information empowerment in national science and technology security through risk systems and early warning mechanisms, situation awareness and risk profiling, risk assessment and response strategies, and game simulation and competitive strategies. This activates and enables information to serve as “eyes and ears, pioneers, and advisors” in the field of national science and technology security, thereby empowering high-quality development and high-level security in China’s science and technology domain.
		2025 Vol. 44 (7): 797-807

Intelligence Theories and Methods

	808	Preprint Platforms and New Scholarly Communication Systems
		Chu Jingli, Liu Jingyu, Liu Jingyi, Ren Jiaohan, Yang Heng
		DOI: 10.3772/j.issn.1000-0135.2025.07.002
		An accurate understanding of the relationship between preprint platforms and scholarly communication systems is important for promoting new forms of preprint-based scholarly communication. Drawing on a literature review and on practical experience in building preprint platforms, and taking the perspective of information science, or “big intelligence,” this article systematically reviews the definition, emergence, and development of preprint platforms. It then explains the relationship between preprint platforms and scholarly communication systems along two dimensions: “the impact and significance of preprint platforms on the reconstruction of scholarly communication systems” and “the value of preprint platforms for different roles in scholarly communication systems.” The article argues that preprint platforms are, in essence, a new model of scholarly communication and offers suggestions from four aspects for promoting preprint-based scholarly communication in China.
		2025 Vol. 44 (7): 808-817

	818	Cross-Language Patent Text Representation Optimization Based on Supervised Fine-Tuning SimCSE Approach
		Wang Lijun, Li Haotian, Gao Yingfan, Wang Shujun
		DOI: 10.3772/j.issn.1000-0135.2025.07.003
		This paper proposes a method for optimizing cross-language patent text representations to enhance the semantic representation of Chinese and English patent texts. The method integrates the SimCSE contrastive learning algorithm with a supervised fine-tuning strategy and leverages parallel corpora of Chinese and English patent texts to obtain effective cross-language text representations. First, building on unsupervised SimCSE fine-tuning, the paper introduces a supervised SimCSE fine-tuning algorithm to improve the model's cross-language semantic understanding. Specifically, we propose a positive and negative sample-mining strategy in which a high-quality positive sample set is constructed by analyzing the citation relationships between patent texts, thereby enabling the model to capture cross-linguistic semantic similarities more accurately. In addition, we introduce the RetroMAE secondary pretraining model to optimize the mining of hard negative samples, further enhancing the discriminative ability and generalization performance of the model. Compared with traditional cross-language text representation methods, the proposed method demonstrates significant advantages in handling cross-language patent texts and overcomes the limitations of previous methods in semantic alignment and differentiation, thus providing a more precise and effective tool for cross-language patent analysis across multiple domains.
		2025 Vol. 44 (7): 818-829
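The supervised contrastive objective summarized in the abstract above can be illustrated with a minimal sketch. It assumes the common InfoNCE formulation used in supervised SimCSE, with in-batch negatives plus one mined hard negative per anchor; the function names, the temperature of 0.05, and the toy vectors are illustrative assumptions, not details taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def supervised_contrastive_loss(anchors, positives, hard_negatives, temperature=0.05):
    """Mean InfoNCE loss over a batch (hypothetical helper, not the paper's code).

    anchors[i] should score positives[i] above every other positive in the
    batch (in-batch negatives) and above every mined hard negative.
    """
    candidates = positives + hard_negatives
    losses = []
    for i, anchor in enumerate(anchors):
        logits = [cosine(anchor, c) / temperature for c in candidates]
        # Numerically stable log-sum-exp over all candidates.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(x - peak) for x in logits))
        losses.append(log_denominator - logits[i])
    return sum(losses) / len(losses)


# Toy embeddings (made up): think of anchors as Chinese patent texts and
# positives as their English counterparts.
anchors = [[1.0, 0.0], [0.0, 1.0]]
positives = [[0.9, 0.1], [0.1, 0.9]]
hard_negatives = [[0.5, 0.5], [0.5, -0.5]]

aligned_loss = supervised_contrastive_loss(anchors, positives, hard_negatives)
misaligned_loss = supervised_contrastive_loss(anchors, positives[::-1], hard_negatives)
```

In this toy batch the correctly paired positives yield a much lower loss than the swapped ones, which is the behavior fine-tuning is meant to reinforce.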

	830	Research on a Pseudo-Feedback Training Data Generation Method for Attributable Text Generation in Scientific Literature
		Ma Yongqiang, Liu Jiawei, Gao Yingfan
		DOI: 10.3772/j.issn.1000-0135.2025.07.004
		The insertion of appropriate citation identifiers into academic texts is a fundamental norm in academic writing that enables readers to verify and trace the origins of information. These identifiers serve as critical tools for attribution and significantly enhance content verifiability. In academic contexts, contemporary large language models (LLMs) generally lack built-in attribution mechanisms when generating scientific text, resulting in limited transparency and accountability of the produced content. Although optimizing these models using human-annotated datasets is a standard approach, this method faces significant challenges in addressing attribution capabilities. Training sets derived from human-authored academic texts have inherent limitations, including insufficient internal consistency and considerable variation in citation practices across different authors and disciplines. In addition, data synthesis methods that rely on LLMs encounter constraints in terms of data diversity. To address these issues, this paper introduces a citation identifier framework and evaluation method for highly attributable scientific texts, aimed at analyzing the attribution of LLM-generated scientific content. For training data construction, this paper proposes a two-stage pseudo-feedback training data synthesis approach that balances the characteristics of both LLM- and human-annotated texts, thereby generating high-quality and diverse training data for attributable scientific text generation. Experimental results demonstrate that small models trained on the synthesized data developed in this study significantly enhance the attribution metrics of LLM-generated scientific texts. Furthermore, it was found that optimizing the data distribution and task diversity through a second stage of pseudo-feedback contributed to improved model generalization.
		2025 Vol. 44 (7): 830-845

	846	A Patent Text Similarity Calculation Method Based on Expert Feedback Fine-Tuning
		Wang Shujun, Gao Yingfan, Yao Changqing, Yuan Ming
		DOI: 10.3772/j.issn.1000-0135.2025.07.005
		Patents are important carriers of innovative technology, and text similarity calculation, a widely applied natural language processing task, helps identify potentially valuable patents and supports patent retrieval. This study introduces a patent text similarity calculation method that leverages expert feedback for fine-tuning. Starting from a small expert evaluation dataset, the method employs a large model to regenerate abstract texts, thereby augmenting the negative examples. The pretrained model is then fine-tuned on the expert evaluation dataset, and patent similarity is recalculated on a large-scale dataset. The study continues training the bidirectional and auto-regressive transformers (BART) and Beijing Academy of Artificial Intelligence general embedding (BGE) models in the emerging fields of new materials and electronic information, respectively, and fine-tunes both models using the expert evaluation dataset. The experimental results show that the Spearman correlation coefficient of this method increases by 6.4% and 16.9% compared to the initial models. In the empirical section, enterprises in the electronic information field are selected to identify technological competitors, which verifies the advantages of the method.
		2025 Vol. 44 (7): 846-858
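The abstract above reports its gains as Spearman correlation coefficients. As a reminder of what that metric measures, here is a minimal sketch that ranks two score lists (averaging tied ranks) and takes the Pearson correlation of the ranks. The scores below are made up for illustration and are not data from the paper.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = shared_rank
        start = end + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return covariance / math.sqrt(var_x * var_y)


# Hypothetical example: model similarity scores vs. expert ratings for five patent pairs.
model_scores = [0.91, 0.40, 0.75, 0.10, 0.66]
expert_ratings = [5, 2, 4, 1, 4]
rho = spearman(model_scores, expert_ratings)
```

A value of `rho` close to 1 means the model orders patent pairs almost the same way the experts do, regardless of the absolute scale of the scores.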

	859	A Two-tower Model for Multistage Literature Recommendation Incorporating Comparative Learning
		Ye Guanghui, Tan Qitao, Wu Chuan, Song Xiaoying, Li Songye
		DOI: 10.3772/j.issn.1000-0135.2025.07.006
		Literature recommendation is an important research topic in literature information mining. Traditional literature recommendation methods struggle to address the long-tail problem and to capture high-dimensional literature features adequately, which leads to poor recommendation results. This study therefore proposes a multistage dual-tower model incorporating contrastive learning, in which a user tower and a literature tower learn user and literature features, respectively. To improve the effectiveness of high-dimensional representation learning, a recall layer and a refined ranking layer are designed for multistage recommendation: the recall layer filters out irrelevant literature, and the refined ranking layer produces user-personalized recommendations. Experimental results on the CiteULike-a dataset show that the proposed model helps alleviate the long-tail distribution problem in literature recommendation and outperforms the single-stage model on the literature recommendation task.
		2025 Vol. 44 (7): 859-868
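The recall-then-rank structure mentioned in the abstract above can be sketched in a few lines. This is only an illustration of the general two-stage pattern: it assumes each tower has already produced embeddings, uses plain dot products for scoring, and invents separate "coarse" and "fine" embeddings for the two stages. None of these names or choices come from the paper.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def recall_then_rank(user_coarse, user_fine, items_coarse, items_fine, k, n):
    """Two-stage recommendation sketch (hypothetical helper).

    Stage 1 (recall): keep the k items whose coarse embeddings score highest
    against the user's coarse embedding, filtering out irrelevant items.
    Stage 2 (ranking): re-score only those candidates with the fine
    embeddings and return the top n item ids.
    """
    candidates = sorted(
        items_coarse,
        key=lambda item_id: dot(user_coarse, items_coarse[item_id]),
        reverse=True,
    )[:k]
    ranked = sorted(
        candidates,
        key=lambda item_id: dot(user_fine, items_fine[item_id]),
        reverse=True,
    )
    return ranked[:n]


# Made-up embeddings for four papers and one user.
items_coarse = {"a": [1.0, 0.0], "b": [0.9, 0.1], "c": [0.0, 1.0], "d": [-1.0, 0.0]}
items_fine = {"a": [0.2, 0.8], "b": [0.9, 0.1], "c": [1.0, 0.0], "d": [1.0, 0.0]}
recommendations = recall_then_rank(
    user_coarse=[1.0, 0.0],
    user_fine=[1.0, 0.0],
    items_coarse=items_coarse,
    items_fine=items_fine,
    k=2,
    n=2,
)
```

Here papers "c" and "d" never reach the ranking stage even though their fine scores are high, which shows how the recall stage bounds the work done by the more expensive ranker.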

	869	SciBERT-based Measurement of Knowledge Intersection in Scientific Collaboration and Its Causal Effects on Sustained Research Output
		Feng Xiaodong, Huang Yuhang
		DOI: 10.3772/j.issn.1000-0135.2025.07.007
		With the increasing complexity and integration of scientific research problems, scientific collaboration and interdisciplinary research have intensified. Knowledge intersection represents a process of interaction, integration, and innovation across different fields, which can lead to significant scientific achievements. Moving beyond the discipline-level granularity and static perspective of existing interdisciplinary research, this study explores the measurement of fine-grained knowledge intersection between research subjects and collaborators with respect to the accumulation of dynamic knowledge, experience, and continuous scientific research output. We constructed a measurement model to quantify the intensity of knowledge intersection in research collaborations using text-mining approaches based on a bidirectional encoder representations from transformers (BERT) model trained on scientific text (SciBERT) to represent the knowledge concept or sentence in a manuscript. Using scientific publication data from the Information Systems discipline as a case study, we generated panel data and applied the Generalized Propensity Score Matching method to conduct a causal effect analysis. The differential impacts of knowledge intersection in research questions and in research methodology were further explored. The findings indicate that the overall knowledge intersection in research collaboration has an inverted U-shaped effect on subsequent research output. Notably, knowledge intersection in research methodologies exerts a more substantial influence on the continuous research outputs of subjects than knowledge intersection in research questions.
		2025 Vol. 44 (7): 869-891

Intelligence Technology and Application

	892	Research on n-Ary Technology Opportunity Discovery Based on Hyperlink Prediction
		Chen Wenjie, Qu Jiansheng
		DOI: 10.3772/j.issn.1000-0135.2025.07.008
		Exploring and analyzing technological opportunities in specific fields can provide new references and suggestions for enterprises pursuing original, “from 0 to 1” innovation. This study proposes an n-ary (multitechnology) opportunity discovery method based on hyperlink prediction. First, based on the multiple co-occurrence relationships between technologies, a technology relationship hypernetwork is constructed, and node feature vectors are generated using International Patent Classification (IPC) reference information and text information. Then, the hyperlink prediction model Hyper-SAGNN is extended to the technology relationship hypernetwork to predict the likelihood that the fusion of multiple technologies will form future technology opportunities. Finally, measurement indicators based on features such as novelty, centrality, and cross-domain relevance are constructed to identify potential, valuable, and diverse technological opportunities. A case study in the field of intelligent question answering technology verifies the soundness and effectiveness of the proposed method, which mines high-value ternary and quaternary technological opportunities and provides decision support for the technology strategy layout and innovation strategy of enterprises.
		2025 Vol. 44 (7): 892-902

	903	Identification of Key Core Technologies Based on Matrix Profile and Louvain Community Discovery Algorithms
		Wan Xiaoji, Lai Jing, Mou Yingxi, Zhu Zhiguo, Zhang Liping
		DOI: 10.3772/j.issn.1000-0135.2025.07.009
		Existing methods for identifying key core technologies give little consideration to the time factor and produce results that are difficult to interpret. To address these problems, this study proposes a key core technology identification method based on the matrix profile (MP) and Louvain community detection algorithms. The method first uses International Patent Classification (IPC) subclass weights and word frequency analysis to identify hot technology topics in the target domain. It then combines the time series of high-frequency IPC subclasses with the MP algorithm to construct a technology association network, and applies the Louvain algorithm and social network analysis to identify the initial key core technology topics in the target domain. These topics are screened based on their features, and the key core technologies in the target field are identified through a deep interpretation of the technology association subnetwork, original patent data, relevant policy documents, and journal literature. Applied to granted patents in the field of logistics from 2014 to 2023 in the incoPat patent database, the method effectively identifies the key core technologies in that field, which not only helps promote technological breakthroughs and innovation in the industry but also enhances the country’s position in the global industrial chain and value chain.
		2025 Vol. 44 (7): 903-914
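For readers unfamiliar with the matrix profile named in the abstract above, the following naive sketch computes a self-join matrix profile: for every length-m window of a series, the z-normalized Euclidean distance to its nearest non-overlapping neighbor. It is a quadratic-time illustration of the general technique only; how the paper applies MP across IPC subclass time series to build its association network is not specified here, and the window length, exclusion zone, and toy series are assumptions.

```python
import math


def z_normalize(window):
    """Scale a window to zero mean and unit variance (flat windows become zeros)."""
    n = len(window)
    mean = sum(window) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in window) / n)
    if std == 0:
        return [0.0] * n
    return [(x - mean) / std for x in window]


def matrix_profile(series, m):
    """Naive self-join matrix profile (illustrative, not an optimized algorithm).

    profile[i] is the distance from window i to its nearest neighbor, skipping
    windows closer than m // 2 positions to avoid trivial matches. A window
    with no eligible neighbor keeps the value math.inf.
    """
    windows = [z_normalize(series[i:i + m]) for i in range(len(series) - m + 1)]
    exclusion = max(1, m // 2)
    profile = []
    for i, window_i in enumerate(windows):
        best = math.inf
        for j, window_j in enumerate(windows):
            if abs(i - j) < exclusion:
                continue
            distance = math.sqrt(sum((a - b) ** 2 for a, b in zip(window_i, window_j)))
            best = min(best, distance)
        profile.append(best)
    return profile


# Made-up series: a perfectly repeating pattern, and the same pattern with a plateau inserted.
periodic = [0, 1] * 6
with_plateau = [0, 1, 0, 1, 0, 1, 3, 3, 3, 0, 1, 0, 1, 0, 1]
periodic_profile = matrix_profile(periodic, m=4)
plateau_profile = matrix_profile(with_plateau, m=4)
```

Low profile values mark windows whose shape recurs elsewhere in the series, while high values mark shapes that occur only once, such as the windows covering the plateau.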

Intelligence Reviews and Comments

	915	Chasing the Cause and Tracing the Effect: Application and Prospect of Causal Inference Methods in the Field of Information Resource Management
		Wu Jiang, Tao Chengxu, Ou Guiyan, Ding Yang
		DOI: 10.3772/j.issn.1000-0135.2025.07.010
		Inferring causality from data to reveal the essence of information resource management is a new-era proposition for the field, and discussing the application and prospects of causal inference methods in information resource management can provide a reference for related research. Compared with statistical inference, causal inference uses special designs and experiments to control confounding variables and selection bias, and can therefore describe causality more accurately. In terms of methodology, the potential outcome model is widely used in the social sciences because it can flexibly respond to the different characteristics of the data in a study and reveal the causes of complex social problems. Accordingly, this paper first expounds the origin of causal inference at the two levels of philosophy and science. Second, it defines the concept of causal inference, compares causal and statistical inference, and identifies the relevant terms. Third, it refines the basic process of applying causal inference methods in information resource management under the potential outcome model and analyzes how the various methods can be applied in the field, focusing on their applicable conditions. Finally, it discusses the current situation and shortcomings of the application of relevant research methods in specific fields and, on this basis, outlines prospects for applying causal inference methods in information resource management from three aspects: research methods, application processes, and application fields.
		2025 Vol. 44 (7): 915-932

Editorial Office: JCSSTI Editorial Office, No. 15 Fuxing Road, Haidian, Beijing 100038
Tel: +86(010)68598273; Fax: +86(010)68598285; E-mail: qbxb@istic.ac.cn
Copyright © 2015 by the Journal of The China Society for Scientific and Technical Information
ISSN: 1000-0135 CN: 11-2257 / G3