Full Abstracts

2021 Vol. 40, No. 8
Published: 2021-09-24

791 Social and Historical Reasons for the Return of Macroscopic Intelligence Idea Based on the Evolution of Research Methodology
Li Bowen, Zhang Chengzhi
DOI: 10.3772/j.issn.1000-0135.2021.08.001
This study aimed to identify the social and historical reasons for the return of the “Macroscopic Intelligence Idea,” based on the research methods, the evolution of research methodology, and the collective memory of practical experience of the Chinese Intelligence Studies community. First, two Macroscopic Intelligence Ideas were presented and discussed separately before the research questions were stated. Second, a total of 4,612 journal articles published in Journal of Library Science in China (JLSC) and Journal of the China Society for Scientific and Technical Information (JCSSTI) were retrieved. Coding schemas and coding instructions were created for research methods. The full text of each article was examined to identify the research methods applied; these were subjected to content analysis, and their consistency was tested. Third, we introduced the evolution of research methodology in Intelligence Studies. Fourth, we discussed the capabilities and collective memory of the Chinese Intelligence Studies community regarding the use of research methods. Finally, we illustrated the social and historical reasons, which indicate that the return of the “Macroscopic Intelligence Idea” occurred due to the awakening of the Chinese Intelligence Studies community during periods of dramatic change. A limitation is that this study did not account for the close relationship between intelligence studies and library and information science.
2021 Vol. 40 (8): 791-805
806 Analysis on Technology Gaps between China and US Based on Text Mining: Taking Space Technology as an Example
Guo Shijie, Chen Fang, Han Tao, Wang Xuezhao, Wang Yanpeng, Lyu Lucheng, Dong Lu
DOI: 10.3772/j.issn.1000-0135.2021.08.002
The study aimed to identify the differences and gaps between Chinese and US technology and to monitor risks in key core technologies. An information extraction method and an information synchronization matching method focusing on the elements of given documents were constructed on the basis of an analysis of the semantic characteristics of the documents. An empirical analysis was carried out using US documents in the field of space science and technology as an example. The analysis results for the technical products covered in the US documents may be divided into four categories: no gap; technical gap; technical layout difference; and technical level gap.
2021 Vol. 40 (8): 806-816
817 Research on Automatic Identification and Classification of Actionable Information in Emergencies
Wu Xuehua, Mao Jin, Chen Sijing, Xie Hao, Li Gang
DOI: 10.3772/j.issn.1000-0135.2021.08.003
Actionable information enables a timely response and flexible mobilization during emergencies. It plays a critical role in supporting rescue, resource allocation, and other response efforts, as well as in minimizing casualties and damage. This study explores how various types of actionable information can be obtained from social media. To this end, we clarify the concept, characteristics, and categories of actionable information and propose a two-stage framework that identifies and then classifies actionable information using machine learning techniques. Word vector representations, linguistic features, surface-based features, and user-based features are adopted in the proposed framework. Five machine learning algorithms, namely, Support Vector Machine, Logistic Regression, TextCNN, BERT, and a combination of BERT and TextCNN (BERT + TextCNN), were explored in our experiment on a manually annotated dataset. The performance of different classification strategies, algorithms, and features was evaluated. The experimental results indicate that the two-stage approach can provide actionable information at various granularities without sacrificing performance. BERT and BERT + TextCNN outperform the other models in both stages. A combination of linguistic, surface-based, and user-based features contributes little to the information identification task in the first stage, but it significantly improves the performance of the information classification task in the second stage. Our research helps better incorporate social media streams into emergency workflows. This can, to some extent, mitigate information overload during emergency response and improve response efficiency.
2021 Vol. 40 (8): 817-830
831 Method of Discovering Interdisciplinary Knowledge of the National Natural Science Foundation of China Based on Word Embedding: A Case Study on Artificial Intelligence and Information Management
Wang Weijun, Yao Chang, Qiao Ziyue, Cui Wenjuan, Du Yi, Zhou Yuanchun
DOI: 10.3772/j.issn.1000-0135.2021.08.004
Interdisciplinary research is an important way to promote the resolution of various complex scientific problems. In this paper, the keyword co-occurrence relationships in the disciplines of artificial intelligence and information management, taken from projects funded by the National Natural Science Foundation of China, are used to map the corresponding keywords to a low-dimensional vector space through a word2vec-related model. The keyword vectors are used to calculate the relationships between keywords and obtain a quantized keyword co-occurrence relationship. The PageRank algorithm is used to calculate the importance of keywords in the co-occurrence network. The DBSCAN clustering algorithm is used to analyze interdisciplinary keyword co-occurrences that did not appear in the projects, and textual information such as keyword importance is combined with visual information to analyze potential interdisciplinary knowledge. The experiment shows that the proposed model can extract potential interdisciplinary knowledge well and can filter and sort that knowledge using the interdisciplinary keyword co-occurrence relation. The results are interpretable and reasonable, and they provide new research ideas for discovering interdisciplinary knowledge and identifying its potential growth.
2021 Vol. 40 (8): 831-845
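The keyword-importance step in the abstract above can be illustrated with a minimal sketch. It assumes an undirected keyword graph whose edge weights are positive co-occurrence strengths; the function name `pagerank`, the damping factor of 0.85, and the sample keywords are illustrative assumptions, not details taken from the paper. The word2vec and DBSCAN steps are not shown.

```python
def pagerank(edges, damping=0.85, iterations=50):
    """Power-iteration PageRank on an undirected, weighted keyword graph.

    edges maps (keyword_a, keyword_b) -> positive co-occurrence weight.
    The damping value and iteration count are illustrative assumptions.
    """
    # Build a symmetric adjacency map from the edge list.
    neighbors = {}
    for (a, b), weight in edges.items():
        neighbors.setdefault(a, {})[b] = weight
        neighbors.setdefault(b, {})[a] = weight

    nodes = sorted(neighbors)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    # Total edge weight per node, used to normalize outgoing rank.
    strength = {node: sum(neighbors[node].values()) for node in nodes}

    for _ in range(iterations):
        new_rank = {}
        for node in nodes:
            incoming = sum(
                rank[other] * weight / strength[other]
                for other, weight in neighbors[node].items()
            )
            new_rank[node] = (1 - damping) / n + damping * incoming
        rank = new_rank
    return rank


# Hypothetical co-occurrence weights between four keywords.
sample_edges = {
    ("deep learning", "knowledge graph"): 3,
    ("deep learning", "information retrieval"): 2,
    ("knowledge graph", "information retrieval"): 1,
    ("information retrieval", "user behavior"): 1,
}
ranks = pagerank(sample_edges)
```

Keywords with more and stronger co-occurrence links receive higher scores, so sorting `ranks` by value gives one possible importance ordering.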
846 Calculation of Author Intimacy Based on Multi-Dimensional Fusion
Hou Xiang, Huang Jin, Sang Jun, Xia Xiaofeng
DOI: 10.3772/j.issn.1000-0135.2021.08.005
To facilitate the exchange and accurate promotion of academic achievements among authors and research teams, a reasonable and effective academic social network needs to be established. Because current Chinese academic database platforms have not established an author network based on literature citation for community interaction, this paper uses CNKI as the data source and takes a university professor of software engineering and information security (professor A) as an example. The literature data from 2014 to 2020 of 122 scholars who are professor A's co-authors or have citation relationships with professor A were used as the research object, and a citation network of academic authors was constructed. Combined with the characteristics of academic social networking, the author, paper, and citation data were mined, and an intimacy calculation method with four dimensions (co-authorship, subject topic, subject sensitivity of citations, and social graph) was used to compute a weighted comprehensive intimacy value among the authors in the network. In the experiment, the author relationship map was established by synthesizing the intimacy values, and each author's intimacy level in the network was obtained. Each author's network level (weight) was obtained by multiplying the intimacy value by the number of published papers. Authors and research teams with the same research level in the mapped network were found, which laid a good foundation for data promotion in the academic social network.
2021 Vol. 40 (8): 846-853
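The weighting described in the abstract above can be sketched as follows. The abstract states only that four dimensions are weighted into a comprehensive intimacy value and that the network level is the intimacy value multiplied by the number of published papers; the dimension keys, the equal weights, and the sample scores below are hypothetical.

```python
def combined_intimacy(scores, weights):
    """Weighted sum of per-dimension intimacy scores between two authors."""
    if set(scores) != set(weights):
        raise ValueError("scores and weights must cover the same dimensions")
    return sum(weights[dim] * scores[dim] for dim in scores)


def network_level(intimacy, paper_count):
    """Network level (weight): intimacy value times number of published papers."""
    return intimacy * paper_count


# Hypothetical equal weights over the four dimensions named in the abstract.
weights = {"co_author": 0.25, "topic": 0.25, "citation": 0.25, "social_graph": 0.25}
# Hypothetical per-dimension scores for one pair of authors.
scores = {"co_author": 0.8, "topic": 0.6, "citation": 0.4, "social_graph": 0.2}

intimacy = combined_intimacy(scores, weights)
level = network_level(intimacy, paper_count=12)
```

How the weights are actually chosen is not stated in the abstract, so any real use would need to substitute the paper's own values.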
854 Keyword Extraction from a Paper's Abstract Based on Semantic Text Graph
Wang Xiaoyu, Wang Fang
DOI: 10.3772/j.issn.1000-0135.2021.08.006
Considering the basic role of keywords in large-scale document retrieval and text content analysis, an unsupervised keyword extraction algorithm based on a semantic graph is proposed, which focuses on improving the method of graph construction and the index of word weighting. To ensure that the text graph retains more semantic and structural information, the algorithm generates a semantic text graph from the dependency relations among the words in a sentence, using four features: conceptual connection, equivalent membership, functional attributes, and modification. This operation eliminates the sliding window parameter of the traditional method and improves the usability of the algorithm. On this basis, a word-weighting method combining word position information, concept hierarchy, concept connection preference, and connection strength is proposed, and the words are ranked by importance. Finally, high-score nodes are selected to form the keyword set of a research paper's abstract. Experimental results on four open corpora show that this method outperforms three baseline algorithms, with the F1 value increasing to 0.570.
2021 Vol. 40 (8): 854-868
869 Towards an Appropriate Scale of Datasets for Domain Bibliometrics: Empirical Study under Multiple Tasks
Chen Guo, Wang Panting, Wang Yuefen
DOI: 10.3772/j.issn.1000-0135.2021.08.007
It is impossible to construct a complete dataset for domain bibliometrics owing to the centralized and decentralized distribution of literature, which raises an essential question about the appropriate scale of datasets. This problem should be resolved under specific bibliometric analysis tasks. In this study, we comprehensively consider typical task scenarios, including domain scale, elements for analysis (such as subject classification, country, institution, keyword, reference, author, and their co-occurrence relationships), top N values of elements, and whether or not to sort. Based on this, we designed a corresponding experimental scheme. We selected “artificial intelligence” as an example domain, constructed various subsets with different sampling scales, and obtained 4,800 indices showing how closely those subsets imitate the full dataset. The results show that, when analyzing subject classifications or countries, a small dataset is sufficient for a reliable result. Analysis tasks on authors should be conducted with as large a dataset as possible, because author analysis is quite sensitive to data scale. When analyzing institutions, keywords, or references, a certain scale that corresponds to the specific task scenario can also achieve reliable results. Additionally, co-occurrence analysis requires a larger dataset, as do tasks that involve more top elements or sorted elements.
2021 Vol. 40 (8): 869-878
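One way to picture the subset-versus-full-dataset comparison in the abstract above is a top-N overlap ratio. This is a minimal sketch under stated assumptions: the abstract does not define its 4,800 indices, so the overlap measure, the function names `top_n` and `top_n_overlap`, and the toy records are all hypothetical.

```python
import random
from collections import Counter


def top_n(records, n):
    """Return the set of the n most frequent elements across all records."""
    counts = Counter(item for record in records for item in record)
    return {item for item, _ in counts.most_common(n)}


def top_n_overlap(full, sample_fraction, n, seed=0):
    """Share of the full dataset's top-n elements that a random subset recovers.

    This overlap ratio is an assumed stand-in for the paper's indices.
    """
    rng = random.Random(seed)
    size = max(1, round(len(full) * sample_fraction))
    subset = rng.sample(full, size)
    reference = top_n(full, n)
    return len(reference & top_n(subset, n)) / len(reference)


# Hypothetical records: the countries listed on each paper.
papers = [
    ["China", "US"],
    ["China"],
    ["China", "UK"],
    ["US"],
    ["US", "China"],
    ["Germany"],
]
full_overlap = top_n_overlap(papers, sample_fraction=1.0, n=2)
half_overlap = top_n_overlap(papers, sample_fraction=0.5, n=2)
```

Repeating the call over several sampling fractions and seeds would show how quickly a given element type stabilizes as the subset grows.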
879 Research on Data Recommendation Based on Community Detection of Citation Network
Li Chengzan, Li Jianhui, Wang Xuezhi, Shen Zhihong, Du Yi
DOI: 10.3772/j.issn.1000-0135.2021.08.008
Scientific data is the input and output of scientific research activities and the core driving factor of scientific and technological innovation. Only through open sharing and wide distribution of scientific data can its value be brought into full play. However, the utilization rate and dissemination efficiency of current data publications are generally low. To accelerate the dissemination and reuse of scientific data and enhance the effectiveness of its open sharing, this paper proposes a data recommendation method based on community detection of a citation network. After the association network among data sets, papers, and authors is constructed, the Louvain algorithm is used for community detection under three association modes: co-authorship, co-citation, and coupling. The similarity between data sets and academic papers is then calculated by combining the TF-IDF algorithm and cosine similarity, and the connection between the data sets and the communities in which the papers are located is constructed for data recommendation. The experimental results show that the data recommendation method can effectively find papers or authors of potential interest in data sets. In addition, it is found that, in terms of contribution and stability of data recommendation, community detection based on the coupling relationship performs best, followed by the co-authorship relationship, whereas the citation relationship is greatly affected by publishing and citation times.
2021 Vol. 40 (8): 879-886
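The similarity step in the abstract above (TF-IDF combined with cosine similarity) can be sketched in plain Python. The whitespace tokenizer, the smoothed IDF variant, and the sample texts are assumptions made for illustration; the paper's exact weighting scheme is not given in the abstract, and Louvain community detection is not shown.

```python
import math
from collections import Counter


def tfidf_vectors(documents):
    """Return one sparse TF-IDF vector (term -> weight) per document.

    Uses a smoothed IDF, log((1 + N) / (1 + df)) + 1, chosen here as an
    assumption so that terms present in every document keep a nonzero weight.
    """
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({
            term: (count / len(tokens))
            * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in tf.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical texts: one data set description and two paper abstracts.
texts = [
    "soil moisture dataset china",
    "soil moisture retrieval from satellite data",
    "citation network community detection",
]
dataset_vec, paper_a_vec, paper_b_vec = tfidf_vectors(texts)
sim_a = cosine(dataset_vec, paper_a_vec)
sim_b = cosine(dataset_vec, paper_b_vec)
```

Here the data set shares terms only with the first paper, so `sim_a` is positive and `sim_b` is zero; a recommender built on this would link the data set to the community containing that first paper.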
887 An Empirical Study on Factors Influencing Chinese Researchers' Research Data Reuse Behavior: Using Bioscience as an Example
Zhang Xiaoyue, Song Xiufang, Liping Ku, Liu Jinya, Chen Xinlan
DOI: 10.3772/j.issn.1000-0135.2021.08.009
Understanding the influential factors and mechanisms behind researchers' research data reuse behavior can support the formation of a data reuse ecosystem, where open data (or data sharing) and data reuse promote one another. Through a literature review, this paper puts forward a theoretical model of research data reuse behavior based on an ecosystem perspective. We chose researchers from the bioscience departments of the Chinese Academy of Sciences as a sample and conducted convergent mixed-methods research in the following manner. PLS-SEM approaches were used to analyze the quantitative data from the questionnaire, and we collected supplementary qualitative data from open-ended questions in the questionnaire and from semi-structured interviews with 10 representative researchers. These two kinds of data were then combined. The results reveal that perceived community culture foundation and perceived reuse support usability positively influence researchers' research data reuse behavior. Our research suggests that stakeholders should pay attention to research data quality control and researchers' rights management, with the following specific recommendations: 1) promoting the internationalization of large domestic scientific data centers (repositories); 2) setting up a collaborative network to leverage the power of important groups and individuals; and 3) forming multiple strategies for research data reuse services.
2021 Vol. 40 (8): 887-902