Full Abstracts

2021 Vol. 40, No. 2
Published: 2021-02-24

115 Research on the Metadata of Scholar Identity Management Systems
Si Li, Chen Chen
DOI: 10.3772/j.issn.1000-0135.2021.02.001
Research on metadata in different identity management systems can clarify the functional goals of each system and the differences among them, support effective identity management and metadata enhancement strategies, and advance data sharing and reuse in the field of identity management. This study uses 100 Chinese scholars as a survey sample to analyze their identification in five identity management systems: Chinese Name Authority Joint Database Search System, Baidu Scholar, ORCID, ScopusID, and Publons. It also measures and analyzes differences in metadata structure and element values, combining word frequency statistics, co-occurrence analysis, and principal component analysis. The study finds that scholars present different identification and co-occurrence characteristics in different types of identity management systems; the distribution of elements within the same system is not balanced, and the metadata structure also differs between systems. The metadata elements of the systems are not always strongly correlated and fall into two distinct groups of components. Moreover, the contribution rate of each system's elements to each group of information is not consistent.
2021 Vol. 40 (2): 115-124
125 Impact Factor Based on Logarithmic Correction for the Papers' Citations and the Studies on Its Category Normalization
Liu Xueli, Guo Jia, Shen Lan, Wang Yan, Sheng Lina, Fang Hongling, Li Jianhua, Ding Jun
DOI: 10.3772/j.issn.1000-0135.2021.02.002
Taking SCI journals and papers from five disciplines as research objects, this study transforms the papers' citation counts toward a normal distribution using logarithms of different bases and calculates the logarithmically corrected impact factor (IFlog) of each journal. Category normalization of IFlog (cnIFlog) is performed by dividing the IFlog of each journal by the average IFlog of all journals in the field. The advantages of cnIFlog for evaluating academic journals across different fields are also demonstrated through empirical research. The results show that logarithms of different bases can convert the citations of journal papers into an approximately normal distribution. The IFlog values of journals in the five fields were normally distributed, and the IFlog1.5, IFln, IFlog5, and IFlog10 of the journals were perfectly positively correlated (r = 1.000, P = 0.000), both within a field and across fields. Compared with the impact factor (IF2018), average impact factor percentile (aJIFP), journal index of eight percentile rank classes (JIPR8), IFlog, relative impact factor (rIF2018), and other indicators, cnIFlog1.5 exhibited the smallest variation among journals in the five fields. Moreover, compared with aJIFP and JIPR8, it had the highest correlation and exhibited ideal discrimination and stability. It is concluded that cnIFlog1.5 is an ideal journal evaluation indicator for use both within a field and across fields.
2021 Vol. 40 (2): 125-134
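The category normalization described in the abstract above can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not give the exact formula for IFlog, so the sketch assumes it is the mean of log_b(c + 1) over a journal's papers (the +1 shift, the function names, and the citation counts are all hypothetical). Only the normalization step, dividing by the field-average IFlog, is taken from the abstract.

```python
import math
from statistics import mean


def if_log(citations, base=1.5):
    """Assumed form of IFlog: mean of log_base(c + 1) over a journal's papers."""
    return mean(math.log(c + 1, base) for c in citations)


def cn_if_log(journal_citations, base=1.5):
    """Category normalization: each journal's IFlog divided by the field-average IFlog."""
    scores = {name: if_log(c, base) for name, c in journal_citations.items()}
    field_mean = mean(scores.values())
    return {name: score / field_mean for name, score in scores.items()}


# Hypothetical per-paper citation counts for two journals in one field.
field = {
    "Journal A": [0, 1, 3, 7, 40],
    "Journal B": [0, 0, 2, 5, 9],
}
normalized = cn_if_log(field, base=1.5)
```

Under this assumed form, changing the base only rescales every IFlog by the same constant, because log_b(x) = ln(x) / ln(b). That is consistent with the perfect correlation reported between IFlog1.5, IFln, IFlog5, and IFlog10, and it means the normalized values in this sketch do not depend on the base.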
135 Link Prediction in Two-layer Knowledge Network Based on Network Representation Learning
Cao Zhipeng, Pan Ding, Pan Qiliang
DOI: 10.3772/j.issn.1000-0135.2021.02.003
In recent years, most link prediction algorithms have focused on the similarity of the knowledge network's topological characteristics, with less consideration of the authors' research fields, which leads to problems such as insufficient use of the available information. This paper proposes hypernet2vec, a link prediction model for a two-layer knowledge network. The two-layer knowledge network consists of the Co-author Network and the Academic Field Relationship Network. The nodes in the two-layer network are mapped to a low-dimensional vector space by network representation learning and then fed into a specially designed convolutional neural network that calculates and predicts future links. The empirical results of the evaluation on real-world networks demonstrate that the proposed algorithm achieves a higher AUC (area under curve) value, with an average increase of 11.28%, and outperforms other algorithms such as the RA, LP, and LRW indicators. This paper also explores the underlying mechanism that affects the model's occurrence, from the level of intelligence generation and structure of complex systems.
2021 Vol. 40 (2): 135-144
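For readers unfamiliar with the baselines named in the abstract above, the RA (resource allocation) indicator can be sketched as below. This illustrates only the baseline similarity index, not hypernet2vec itself. The toy co-author network and the function name are hypothetical.

```python
def resource_allocation(adj, x, y):
    """RA score of a node pair: sum of 1/degree(z) over the common neighbors z of x and y."""
    common = adj[x] & adj[y]
    return sum(1.0 / len(adj[z]) for z in common)


# Hypothetical undirected co-author network as an adjacency map of sets.
adj = {
    "a": {"b", "c"},
    "b": {"a", "c", "d"},
    "c": {"a", "b", "d"},
    "d": {"b", "c"},
}

# "a" and "d" are not yet linked; their common neighbors "b" and "c" each have degree 3.
score_ad = resource_allocation(adj, "a", "d")
```

A higher score suggests a more likely future link. Because the index uses topology alone, it ignores the research-field information that the two-layer model is designed to add.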
145 Study on Identification and Classification of Emergency News Based on the Combined Deep Learning Model
Song Yinghua, Lyu Long, Liu Dan
DOI: 10.3772/j.issn.1000-0135.2021.02.004
Considering the difference between keywords in emergency news and in general news text, and given that existing deep learning approaches to news text consider only a single kind of relationship (either between words or between words and categories), we propose a news text classification model based on dual-input combined deep learning. First, word vectors represent the relationships between words, and discrete vectors represent the relationships between words and categories. Second, drawing on the strength of the CNN (convolutional neural network) model in learning local spatial features, of the LSTM (long short-term memory) model in learning sequential features, and of the MLP (multilayer perceptron) model in learning the relationships between words and categories, we construct the DCLSTM-MLP combined deep learning model. Finally, the model takes as input 5,477 emergency news texts and 2,815 general news texts, encoded with both the relationships between words and the relationships between words and categories, and its performance is analyzed by experimental comparison. The results show that the accuracy, recall rate, and comprehensive value of the first-level emergency identification model all reached 99.55 percent. The accuracy of the second-level emergency classification combination model reached 94.82 percent, and the composite value of accuracy and recall rate increased by 6.06 percent, 2.36 percent, 2.47 percent, 1.14 percent, and 1.79 percent compared with the MLP, Text-CNN, Text-LSTM, CNN-MLP, and CLSTM models, respectively. Thus, the combined model can perform news text classification tasks more accurately.
2021 Vol. 40 (2): 145-151
152 Recognition of Lexical Functions in Academic Texts: Application in Automatic Keyword Extraction
Jiang Yi, Huang Yong, Xia Yikun, Li Pengcheng, Lu Wei
DOI: 10.3772/j.issn.1000-0135.2021.02.005
Traditional automatic keyword extraction often uses non-semantic information, such as the frequency and location of candidate keywords, to construct features without considering the specific semantic role of keywords in the academic text, that is, their lexical function. Our statistical analysis found that 67.99% of the keywords in our dataset represented research questions or methods. Therefore, we classified lexical functions into three categories: Research Questions, Research Methods, and Others. Then, building on word frequency and position features, we proposed a method that applies lexical functions to keyword extraction from computer science papers through a classification model and a ranking model. The results showed that our method outperformed the baseline that uses only the base features. The Acc and F of the classification model were improved to 0.840 and 0.666, with relative improvements of 24.63% and 25.19%, respectively. The MAP, NDCG@5, and P@5 of the ranking model improved by 168.32%, 189.50%, and 148.30%, reaching 0.813, 0.828, and 0.447, respectively. These improvements show that lexical functions play an important role in automatic keyword extraction.
2021 Vol. 40 (2): 152-162
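Two of the ranking metrics reported in the abstract above, P@5 and NDCG@5, can be sketched as follows. The sketch assumes binary relevance (a candidate either is or is not an author-assigned keyword); the abstract does not say how relevance was graded, and the example keywords are hypothetical.

```python
import math


def precision_at_k(ranked, gold, k=5):
    """Fraction of the top-k ranked candidates that are gold keywords."""
    return sum(1 for kw in ranked[:k] if kw in gold) / k


def ndcg_at_k(ranked, gold, k=5):
    """NDCG with binary gains: DCG of the ranking divided by the DCG of an ideal ranking."""
    dcg = sum(1.0 / math.log2(i + 2) for i, kw in enumerate(ranked[:k]) if kw in gold)
    ideal_hits = min(len(gold), k)
    idcg = sum(1.0 / math.log2(i + 2) for i in range(ideal_hits))
    return dcg / idcg if idcg else 0.0


# Hypothetical ranked candidates and gold keywords for one paper.
ranked = ["neural network", "paper", "keyword extraction", "result", "ranking"]
gold = {"keyword extraction", "neural network"}

p5 = precision_at_k(ranked, gold)
ndcg5 = ndcg_at_k(ranked, gold)
```

P@5 only counts hits in the top five, while NDCG@5 also rewards placing them earlier, so a ranking with the same hits in better positions scores higher on NDCG@5 but the same on P@5.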
163 Automatic Summarization of Book Reviews Based on Fine-Grained Review Mining
Zhang Chengzhi, Tong Tiantian, Zhou Qingqing
DOI: 10.3772/j.issn.1000-0135.2021.02.006
Mining book reviews can help users understand the content of books and help publishers optimize their marketing strategies. Book review summarization can greatly improve the efficiency of users' access to information, allowing them to quickly grasp the main content of reviews by reading a concise and accurate summary. Existing research on review summaries has mostly adopted methods based on sentence extraction, which neglect the fine-grained sentiment information in reviews. In addition, the content of reviews differs markedly among book review platforms, so it is difficult for users to fully understand books through review summaries based on a single platform. In this study, we propose a book review summary model that includes aspect and content information and design a review summary method based on fine-grained review mining. The empirical results show that the review summary generated using the proposed method can provide fine-grained and multi-dimensional book evaluation information.
2021 Vol. 40 (2): 163-172
173 Quality Evaluation of Forestry Open Government Data Based on Metadata
Wang Bo, Wen Jiwen
DOI: 10.3772/j.issn.1000-0135.2021.02.007
Quality evaluation of forestry open government data helps data providers manage data and gives users a basis for selecting datasets. Metadata, an important attribute that describes the origin and background of data resources, can be used as the basis for evaluating the quality of open government data. Based on the characteristics of forestry open government data resources, open government data life-cycle theory, and the principles of being scientific, comprehensive, pertinent, and easy to operate, this paper constructs a systematic and comprehensive quality evaluation model for forestry open government data. The model covers three aspects of data quality (form, content, and utility) and three phases of the open government data life cycle (generating, opening, and using), and it includes evaluation frameworks and evaluation indicators with quantitative methods. This paper also proposes future research directions that can serve as references for general open government data quality evaluation and can help improve the quality and value of open government data.
2021 Vol. 40 (2): 173-183
184 Potential Knowledge Flow Detection from an Integrated Perspective of Three-Dimensional Citations: A Case Study of Gene Editing
Wang Feifei, Wang Xiaohan, Xu Shuo, Lu Wanzhao, Song Yanhui
DOI: 10.3772/j.issn.1000-0135.2021.02.008
In the era of the knowledge economy, the value of knowledge flow in stimulating knowledge innovation and promoting scientific and technological development has become increasingly prominent. Based on the fusion of three citation associations (direct citation, co-citation, and coupling), this paper mines the potential knowledge flow in a domain at the subject association level. Link prediction indicators are used as feature values to construct a classifier and a regressor. The classifier is used to predict knowledge flow that is not yet present but is likely to occur in the future. The regressor is mainly used to predict current knowledge flow that has low link weights and has not attracted widespread attention but is expected to have high link weights in the future. The two-layer prediction method is comprehensive and complementary, and it can more fully detect research frontiers and emerging trends in the field. Applying this approach to the currently trending field of gene editing technology, we obtained the potential knowledge flow and research hotspots in this field, which can serve as a reference for researchers in choosing future research directions.
2021 Vol. 40 (2): 184-193
194 Study on Detection Model and Simulation of Internet Rumor Based on Blockchain
Wang Xiwei, Zhang Liu, Huang Bo, Wei Yanan
DOI: 10.3772/j.issn.1000-0135.2021.02.009
By constructing a blockchain-based detection model of internet rumors, a self-purifying and traceable mechanism for internet rumors can be formed, which helps public opinion supervision departments use blockchain technology to govern internet rumors and guide public opinion. Based on blockchain technology and Unified Modeling Language diagrams, and starting from an analysis of blockchain properties and working methods, this study builds a detection model of internet rumors that covers the outbreak period of public opinion (blockchain audit), the fermentation period of public opinion (blockchain filtering), and the proliferation of public opinion. A simulation experiment on the “plastic rice” internet rumor is then used to verify the validity of the model. The simulation results indicated that the blockchain-based detection model can ensure the security and traceability of public opinion information transmission, purify internet rumors, and ensure the integrity of public opinion information. As limitations, the similarity function of the discrimination model is only an approximate method and does not consider the upper limit of information storage; in addition, as mining difficulty continues to accumulate, the calculation of hash values can become time-consuming.
2021 Vol. 40 (2): 194-203
204 A Meta-Analysis of Factors Influencing Online Knowledge and Payment Intention
Yan Weiwei, Chen Ruoyu, Zhang Min
DOI: 10.3772/j.issn.1000-0135.2021.02.010
With the rapid development of the online payment industry in recent years, ample related research has been conducted, but different studies yield different conclusions. This study takes users’ intention to purchase online knowledge as the dependent variable and the influencing factors verified in existing studies as the independent variables. A quantitative meta-analysis is carried out on 29 relevant domestic and foreign empirical studies obtained after retrieval and filtering. According to the meta-analysis results, the seven factors included in the analysis are all significantly associated with users’ intention to pay for online knowledge; perceived value has the highest correlation, and perceived risk is negatively correlated with payment intention, with the weakest correlation. Meanwhile, as a moderating variable, platform type affects the relationship between payment intention and five independent variables: subjective norm, perceived value, perceived usefulness, trust, and perceived risk. Through systematic analysis and verification of the empirical research results on the factors influencing online knowledge payment intention, this study provides references for improving the online knowledge payment system and for follow-up research in related fields.
2021 Vol. 40 (2): 204-212
213 Detection of Hotspot in Scientific Fields Based on Emerging Pattern Analysis of Social Q&A Community Contents
Yu Jing
DOI: 10.3772/j.issn.1000-0135.2021.02.011
Hotspot identification in scientific fields is one of the key research issues in science and technology intelligence and bibliometrics. It can provide a reference and basis for policy-making by science and technology or education departments and for research decision-making by researchers. Methods of hotspot identification in existing studies are primarily based on bibliometrics and do not use the abundant data available on the Web. This study proposes a hotspot identification framework based on Emerging Pattern Recognition, which uses the question and answer content of a Social Q&A Community to identify research hotspots in a field. First, keywords in the question and answer content are extracted and clustered based on their co-occurrence. Second, a set of candidate hotspot patterns is constructed based on the clustering results, and the emerging pattern recognition method is used to identify hotspots and analyze their trends. Experimental results on a dataset from the “Machine Learning” topic of zhihu.com were analyzed using the chi-square test and compared with frontier research. The results indicated that this framework can effectively identify research hotspots. The framework is practical to implement because keyword clustering alleviates the high computational complexity of the Emerging Pattern Recognition method. Moreover, it has potential application value in online community hotspot identification and other related issues.
2021 Vol. 40 (2): 213-222
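The emerging pattern step in the abstract above can be illustrated with the usual growth-rate definition: a keyword set is "emerging" when its support grows sharply from an earlier period to a later one. The sketch below assumes that definition applies here; the abstract does not state the exact criterion or threshold, and the keyword sets and function names are hypothetical.

```python
def support(pattern, transactions):
    """Fraction of transactions (keyword sets of Q&A posts) that contain the whole pattern."""
    if not transactions:
        return 0.0
    return sum(1 for t in transactions if pattern <= t) / len(transactions)


def growth_rate(pattern, earlier, later):
    """Support in the later period divided by support in the earlier period."""
    s_early = support(pattern, earlier)
    s_late = support(pattern, later)
    if s_early == 0:
        # Absent before: infinite growth if it appears later, otherwise no growth.
        return float("inf") if s_late > 0 else 0.0
    return s_late / s_early


# Hypothetical keyword sets extracted from posts in two time windows.
earlier = [{"svm", "kernel"}, {"svm"}, {"cnn"}, {"bayes"}]
later = [{"cnn", "gan"}, {"cnn"}, {"gan", "cnn"}, {"svm"}]

rate_cnn = growth_rate({"cnn"}, earlier, later)
rate_svm = growth_rate({"svm"}, earlier, later)
rate_gan = growth_rate({"gan"}, earlier, later)
```

Evaluating only candidate patterns drawn from keyword clusters, as the abstract describes, keeps the number of `growth_rate` calls small compared with enumerating every possible keyword combination.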