Journal of the China Society for Scientific and Technical Information (情报学报)

Full Abstracts

2019 Vol. 38, No. 4 Published: 2019-04-28

	335	Research on Identification of Emerging Topics Based on Link Prediction with Weighted Networks
		Huang Lu, Zhu Yihe, Zhang Yi
		DOI: 10.3772/j.issn.1000-0135.2019.04.001
		The accelerating new round of technological revolution and industrial transformation has made the identification of emerging technologies a crucial issue for the future development strategies of countries and regions. We first predict the dynamic change of a co-word network using link prediction and neural network algorithms. Emerging topics are then detected by measuring their novelty and influence. An empirical study on applications of perovskite materials demonstrates the reliability of the proposed method.
		2019 Vol. 38 (4): 335-341
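
The abstract above does not say which link prediction index the authors use, so the following is only a minimal sketch of one common choice for weighted networks, the weighted common-neighbors index. The function name, the keywords, and the co-occurrence counts are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def weighted_common_neighbors(edges):
    """Score each unlinked keyword pair by the summed weights of the
    edges that connect both keywords to their shared neighbors."""
    # Build an undirected adjacency map: node -> {neighbor: weight}.
    adj = {}
    for (u, v), w in edges.items():
        adj.setdefault(u, {})[v] = w
        adj.setdefault(v, {})[u] = w

    scores = {}
    for x, y in combinations(sorted(adj), 2):
        if y in adj[x]:
            continue  # the pair is already linked, nothing to predict
        shared = adj[x].keys() & adj[y].keys()
        if shared:
            scores[(x, y)] = sum(adj[x][z] + adj[y][z] for z in shared)
    return scores


# Hypothetical co-word network: edge weight = keyword co-occurrence count.
edges = {
    ("perovskite", "solar cell"): 5,
    ("perovskite", "stability"): 3,
    ("solar cell", "efficiency"): 4,
    ("stability", "efficiency"): 1,
    ("perovskite", "LED"): 2,
}
scores = weighted_common_neighbors(edges)
for pair, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(pair, score)
```

Pairs with higher scores are the candidate links most likely to appear in the next time slice of the network under this index. The novelty and influence measures mentioned in the abstract are not covered by this sketch.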

	342	A “Multiple Indicator” View of Preprint Impact Evaluation in Information and Library Science
		Chen Yue, Wang Zhiqi, Liu Zeyuan, Song Cha
		DOI: 10.3772/j.issn.1000-0135.2019.04.002
		This paper focuses on those preprints in arXiv that are also published in three major journals in Library and Information Science (LIS) and further explores the broader impact of the preprints from several perspectives. In particular, the following four indicators are used to examine the 550 arXiv and 5782 non-arXiv papers: citations from the Web of Science Core Collection (CF-WoS), Scopus, and Google Scholar; usage counts in WoS (UC); Mendeley readers (MR); and Tweets (TM), which are considered proxies for social attention. The results show different citation trends for the two sets of papers, with preprints having an obvious citation advantage over the other documents. The development of altmetrics in research evaluation promotes the open access process of scientific resources. The impact advantage of arXiv papers can also be observed in MR but is hardly reflected by UC or TM. A linear regression analysis confirms that MR and UC correlate strongly with CF-WoS, which also holds for Scopus and Google Scholar citations, but MR is more suitable for assisting in the evaluation of the impact of preprints. The strong correlation between readers/usage and citations may be interpreted in the sense that arXiv papers gain broader attention than non-arXiv papers, not only from subscribers of the WoS. This study helps to reveal the role of preprints in LIS and provides inspiration for building a more complete evaluation index system suited to different methods of scientific communication from the “multiple indicator” view.
		2019 Vol. 38 (4): 342-353

	354	Design and Implementation of a Humanities and Social Sciences Data Sharing Model: A Case Study of Consortium Blockchain
		Gu Jun, Xu Xin
		DOI: 10.3772/j.issn.1000-0135.2019.04.003
		In this age of big data, research in the Humanities and Social Sciences has gradually transformed into a new research model that is driven by data, which in turn fuels the demand for data sharing. However, the poor traceability of sources and their use remains a challenge in traditional humanities and social science data sharing procedures. To solve this problem, a data recording mechanism of a blockchain was adopted. This study specifically selected the Hyperledger Fabric, which is a blockchain framework, as the basis of the consortium blockchain and rewrote the data storage mode of the block. The model of the consortium blockchain of data sharing in the humanities and social sciences consists of CA authentication, presubmission, feedback verification, block package broadcasts, and ledger database updates. Furthermore, we customized Dataverse, which is a type of open-source data management software, and established a consortium blockchain platform for humanities and social science data sharing. Results showed that the data sharing model for humanities and social science based on blockchain can both solve existing problems at a technical level, and promote the development of data sharing. The results also showed that our consortium blockchain mechanism based on Hyperledger Fabric achieves the expected operational efficiency.
		2019 Vol. 38 (4): 354-367

	368	Empirical Investigation on the Literature Obsolescence Based on Ye Model
		Chen Jinglian, Ye Zipiao
		DOI: 10.3772/j.issn.1000-0135.2019.04.004
		Detailed information from the literature, subject citation frequency, and the quantity of articles can be used to evaluate the development of an academic field. To investigate the growth pattern of subject citation frequency and the corresponding quantity of articles, this paper studies their time-response using the Ye model. The number of articles and the citation frequency of Physical Review D (IF=4.56) in 1985-1990, 1991-1996, 2000-2005 and 2006-2011, as well as the frequency of the subject “graphene” in 2005, 2008 and 2010, were retrieved from the Web of Science database. These data were fitted by the Ye, negative exponential, and logistic models. The results showed that the Ye model could simulate an acceptable time-response of the literature citation frequency. The fitted citation peak and maximum citation years were also very close to the observed values, with very high coefficients of determination. These results also revealed that the maximum citation period is continuously decreasing with the article quantity and citation frequency of the journal. Although the negative exponential model was not able to adequately fit the time-response curves of the literature citation frequency in 1985-1990 and 1991-1996, it could fit the curves for 2000-2005 and 2006-2011 well. In addition, when the logistic model was used to simulate the rise of the time-response curve in the four selected periods, the citation peak it produced was lower than the observed values. The Ye model could also fit the time-response curves of the literature citation frequency on the subject “graphene” in 2005, 2008 and 2010 well, and it was again found that the maximum citation period continually decreased. However, the time-response curves fitted by the logistic model differed greatly from the observed values.
		2019 Vol. 38 (4): 368-376

	377	Comparative Analysis of Altmetrics and Citation Measurement Based on the Scopus Database
		Qin Fen, Gao Jian
		DOI: 10.3772/j.issn.1000-0135.2019.04.005
		Using SPSS software and data on highly cited Library and Information Science (LIS) papers in the Essential Science Indicators (ESI), this paper examines the correlations and differences among the indicators collected from Altmetric.com, citations from the ESI database, and the Altmetrics indicators from the Scopus database. The analysis shows no significant correlation between Altmetrics and citations, which measure impact from different angles. Therefore, it is reasonable to combine Altmetrics and citations as an academic evaluation method.
		2019 Vol. 38 (4): 377-383

	384	Structural Recognition of PDF Academic Literature Based on Computer Vision
		Yu Fengchang, Lu Wei
		DOI: 10.3772/j.issn.1000-0135.2019.04.006
		Portable Document Format (PDF) documents play an important role in the publication of academic electronic literature. However, owing to the technical and structural complexities of PDF documents, they cannot be directly read by digital devices, which in turn can hinder research studies based on academic electronic literature. Hence, this paper proposes a method based on computer vision for the structural recognition of PDF documents. The proposed method, supplemented by a heuristic algorithm, maps graphic objects and text objects present in the PDF files of academic documents and thereby obtains geometric and text attributes of the file objects. The proposed algorithm can identify the category of a PDF object for determining the physical and logical structures of a PDF document. Conventional PDF analysis methods require a significant amount of artificial feature construction and large-scale lexical corpus training and cannot identify formulae and tables. The proposed method can overcome the aforementioned shortcomings and can successfully perform full-text extraction and structural recognition of ACL data collections.
		2019 Vol. 38 (4): 384-390

	391	Exploring Potential Collaboration Partners of Middle and Small-sized Enterprises Based on Heterogeneous Information Networks of Patent: Graphene as Example
		Fu Junying, Peng Zhe, Zheng Jia, Yuan Fang, Li Nong
		DOI: 10.3772/j.issn.1000-0135.2019.04.007
		The technological innovation capability of small- and medium-sized enterprises (SMEs) has become an important component of a country’s national innovation system. In the US, there have been very comprehensive support measures related to SMEs that encourage them to cooperate with external organizations, such as non-profit research institutions and large enterprises; the support, in turn, can help them to develop rapidly. This study seeks to measure the technical similarity of the patentees by measuring the similarity between their patents and builds a heterogeneous network based on graphene-related patents in the database on US-authorized patents. Further, based on the PathSelClus algorithm, we choose 7 scientific research institutions and 5 large enterprises to act as user guidance. As a result, we get 2 types of clusters in terms of 2 semantics. The first result is a distribution of the degree of similarity between the SMEs and the 7 scientific research institutions, and the other is a distribution of the degree of similarity between SMEs and the 5 large enterprises. Finally, this study analyzes the validity of clustering results according to the research and development direction of the patentees.
		2019 Vol. 38 (4): 391-401

	402	Patent Evaluation with a Machine Learning Approach
		Liu Xia, Huang Can, Yu Xiaofeng
		DOI: 10.3772/j.issn.1000-0135.2019.04.008
		As the number of patent applications in CNIPA (China National Intellectual Property Administration) increases, patent value is of great interest throughout industry, government, and academia. However, the existing statistical and econometric models cannot take advantage of huge samples of patent data for value prediction. Based on more than 850,000 patent applications filed in 2010 and 2011, this paper provides a machine learning approach to predict patent forward citations at an early stage by using multiple patent indicators that can be defined immediately after the relevant patents are made public. The developed model could predict whether the relevant patent would receive forward citations; however, it was weak in differentiating between high and low citations. Moreover, based on the Gini impurity, features of backward citations provide more information for value prediction. In other words, the prior art search process during the patent examination should be focused on. Finally, the paper discusses the limitations of the adopted model, as well as improvement methods for further studies.
		2019 Vol. 38 (4): 402-410
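
To illustrate what ranking features "based on the Gini impurity" means in the abstract above, here is a minimal sketch of the impurity decrease produced by splitting on one binary feature. It uses the standard definition of Gini impurity; the feature (`has_backward`) and the eight labels are hypothetical and do not come from the paper's data.

```python
from collections import Counter


def gini_impurity(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def gini_gain(labels, mask):
    """Impurity decrease when the samples are split by a boolean feature."""
    left = [y for y, m in zip(labels, mask) if m]
    right = [y for y, m in zip(labels, mask) if not m]
    n = len(labels)
    weighted = (len(left) / n) * gini_impurity(left) \
        + (len(right) / n) * gini_impurity(right)
    return gini_impurity(labels) - weighted


# Hypothetical sample: 1 = patent later received a forward citation.
cited = [1, 1, 1, 0, 0, 0, 0, 0]
# Hypothetical feature: does the patent list any backward citations?
has_backward = [True, True, True, True, False, False, False, False]

gain = gini_gain(cited, has_backward)
print(f"parent impurity = {gini_impurity(cited):.5f}, gain = {gain:.5f}")
```

Tree-based learners typically aggregate such impurity decreases over all splits that use a feature, so a feature with larger total decrease is reported as more informative.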

	411	Microblog Retrieval Model Combining User Interest and Mixed Estimation
		Wu Shufang, Zhang Xiongtao, Zhu Jie
		DOI: 10.3772/j.issn.1000-0135.2019.04.009
		With the further development of mobile internet technology, microblog retrieval has become an important part of microblog services. Considering the difference between microblog retrieval and traditional text retrieval, a new microblog retrieval model is put forward. The new model improves the prior probability and document language model estimation of the query likelihood model. To improve the document prior probability, the user’s interest blog library is obtained by quantifying the interest of users in blogs, and the prior probability of a microblog document is then computed based on this interest blog library. In addition, blog content and user interaction information are combined to obtain related blogs, which are used to smooth the original blog and achieve a mixed estimation of the document language model, effectively addressing the problem of data sparseness in short microblog texts. Experiments use real data crawled from Sina to verify the effectiveness of our model, and the results demonstrate that our model outperforms some state-of-the-art models on P@15, P@30, and MRR.
		2019 Vol. 38 (4): 411-419
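
The general shape of a query likelihood model with a document prior and a smoothed language model can be sketched as follows. This is an illustration under stated assumptions, not the authors' model: the linear interpolation form, the weights `lam` and `mu`, the prior values, and the toy posts are all hypothetical, and the abstract does not give the formulas used to obtain the interest-based prior or the related blogs.

```python
import math
from collections import Counter


def mixed_lm(doc, related, collection, lam=0.6, mu=0.3):
    """Return P(w | d) interpolated from the post itself, its related
    posts, and the whole collection (weights are arbitrary assumptions)."""
    d, r, c = Counter(doc), Counter(related), Counter(collection)
    nd, nr, nc = len(doc), len(related), len(collection)

    def prob(word):
        p_doc = d[word] / nd if nd else 0.0
        p_rel = r[word] / nr if nr else 0.0
        p_col = c[word] / nc if nc else 0.0
        return lam * p_doc + mu * p_rel + (1.0 - lam - mu) * p_col

    return prob


def score(query, doc, related, collection, prior):
    """log P(d) + sum of log P(w | d) over the query terms."""
    prob = mixed_lm(doc, related, collection)
    # A small floor avoids log(0) for words unseen in the collection.
    return math.log(prior) + sum(math.log(max(prob(w), 1e-12)) for w in query)


# Hypothetical tokenized posts and the posts related to each of them.
doc_a, related_a = ["new", "phone", "launch"], ["phone", "review"]
doc_b, related_b = ["weather", "today"], ["rain", "today"]
collection = doc_a + related_a + doc_b + related_b
query = ["phone"]

# Equal priors here; an interest-based prior would differ per post.
score_a = score(query, doc_a, related_a, collection, prior=0.5)
score_b = score(query, doc_b, related_b, collection, prior=0.5)
print(score_a, score_b)
```

Because short posts contain few terms, the related-post and collection components keep a query term from receiving zero probability when the post itself does not contain it, which is the data sparseness problem the abstract refers to.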

	420	Emotional Recognition of Visual-perception Oriented Images and Its Application in the Recommendation System
		Chen Fen, He Yuan, Tang Liping
		DOI: 10.3772/j.issn.1000-0135.2019.04.010
		Visual information is one of the most important sources of external information. Images are a major form of visual information. In this paper, the authors first optimize the algorithm of color histogram based on image segmentation, introduce the Itti visual attention model, extract the image saliency map according to the principle that different image regions arouse different degrees of attention, and calculate the weighted histogram based on the saliency map. Secondly, different types of visual emotional features are extracted, combining low-level color, texture, and shape, as well as high-level facial emotion features to generate a composite image sentiment feature description vector. Finally, the emotion recognition model is used to make emotion-based film recommendations, combined with movie posters and synopsis texts, and based on the combination of graphic and textual emotion recognition, to meet users’ emotional needs. In conclusion, this paper proposes a framework of image emotion recognition that combines visual-perception oriented features and facial emotions. This framework is efficient and exhibits good performance. To a certain extent, the framework proposed in this paper narrows the “semantic gap” in this area of research.
		2019 Vol. 38 (4): 420-431

	432	A Review of Query Suggestion
		Zhang Xiaojuan, Peng Lin, Li Qian
		DOI: 10.3772/j.issn.1000-0135.2019.04.011
		Query suggestion is an important technique for improving search efficiency, and its core task is to help users construct effective queries that accurately describe users’ information requirements. As a core technology of search engines, query suggestion has attracted wide attention in both academia and industry and has long been considered an important research topic in information retrieval. This paper summarizes the recent research progress in query suggestion using papers published in Chinese and international conferences and journals. On this basis, the mainstream methods (the simple occurrence information-based method, the graph-based method, and methods integrating multiple kinds of information) are reviewed in detail. Then, the related evaluation methods and metrics are summarized and discussed. Finally, possible future research directions are pointed out.
		2019 Vol. 38 (4): 432-446

Editorial Office: JCSSTI Editorial Office, No. 15 Fuxing Road, Haidian, Beijing 100038
Tel: +86(010)68598273; Fax: +86(010)68598285; E-mail: qbxb@istic.ac.cn
Copyright © 2015 by the Journal of The China Society for Scientific and Technical Information
ISSN: 1000-0135 CN: 11-2257 / G3