|
|
2021 Vol. 40, No. 8
Published: 2021-09-24 |
|
|
|
|
|
|
|
791 |
Social and Historical Reasons for the Return of “Macroscopic Intelligence Idea” Based on the Evolution of Research Methodology |
|
|
Li Bowen, Zhang Chengzhi |
|
|
DOI: 10.3772/j.issn.1000-0135.2021.08.001 |
|
|
This study aimed to identify the social and historical reasons for the return of the “Macroscopic Intelligence Idea,” based on the research methods, the evolution of research methodology, and the collective memory of practical experience of the Chinese Intelligence Studies community. First, two Macroscopic Intelligence Ideas were presented and discussed separately before the research questions were stated. Second, a total of 4,612 journal articles published in Journal of Library Science in China (JLSC) and Journal of the China Society for Scientific and Technical Information (JCSSTI) were retrieved. Coding schemas and coding instructions were created for research methods. The full text of each article was examined to identify the research methods applied; the results were subjected to content analysis, and their consistency was tested. Third, we introduced the evolution of research methodology in Intelligence Studies. Fourth, we discussed the capabilities and collective memory of the Chinese Intelligence Studies community regarding the use of research methods. Finally, we illustrated the social and historical reasons, which indicate that the return of the “Macroscopic Intelligence Idea” resulted from the awakening of the Chinese Intelligence Studies community during periods of dramatic change. A limitation is that this study did not account for the close relationship between intelligence studies and library and information science. |
|
|
2021 Vol. 40 (8): 791-805
|
|
|
|
817 |
Research on Automatic Identification and Classification of Actionable Information in Emergencies |
|
|
Wu Xuehua, Mao Jin, Chen Sijing, Xie Hao, Li Gang |
|
|
DOI: 10.3772/j.issn.1000-0135.2021.08.003 |
|
|
Actionable information enables a timely response and flexible mobilization during emergencies. It plays a critical role in supporting rescue, resource allocation, and other response efforts, as well as in minimizing casualties and damage. This study explores how various types of actionable information can be obtained from social media. To this end, we clarify the concept, characteristics, and categories of actionable information and propose a two-stage framework that relies on machine learning techniques to identify and classify actionable information. Word vector representations, linguistic features, surface-based features, and user-based features are adopted in the proposed framework. Five machine learning algorithms, namely Support Vector Machine, Logistic Regression, TextCNN, BERT, and a combination of BERT and TextCNN (BERT + TextCNN), were explored in our experiment on a manually annotated dataset. The performance of different classification strategies, algorithms, and features was evaluated. The experimental results indicate that the two-stage approach can provide actionable information at various granularities without sacrificing performance. BERT and BERT + TextCNN outperform the other models in both stages. A combination of linguistic, surface-based, and user-based features contributes little to the information identification task in the first stage, but it significantly improves the performance of the information classification task in the second stage. Our research helps to better incorporate social media streams into emergency workflows. This can, to some extent, mitigate information overload during emergency response and improve response efficiency. |
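The two-stage idea in this abstract can be pictured with a minimal sketch. It only illustrates the control flow (identify first, then classify what passed); the keyword rules and the category names `request` and `offer` are hypothetical stand-ins, not the trained models or the category scheme used in the paper.

```python
def two_stage_classify(text, is_actionable, categorize):
    """Stage 1 filters for actionable posts; stage 2 labels only those that pass."""
    if not is_actionable(text):
        return None  # Non-actionable posts never reach the second stage.
    return categorize(text)


# Hypothetical keyword rules standing in for trained classifiers (assumption).
def toy_is_actionable(text):
    return any(word in text.lower() for word in ("need", "offer", "send"))


# Hypothetical two-category scheme (assumption, not from the paper).
def toy_categorize(text):
    return "request" if "need" in text.lower() else "offer"


posts = [
    "We need drinking water at the north shelter",
    "Can offer two spare tents near the station",
    "Thoughts are with everyone affected tonight",
]
labels = [two_stage_classify(p, toy_is_actionable, toy_categorize) for p in posts]
```

Passing the two stages in as callables keeps the pipeline independent of the model behind each stage, so a stronger classifier could replace either toy rule without changing the flow.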
|
|
2021 Vol. 40 (8): 817-830
|
|
|
|
846 |
Calculation of Author Intimacy Based on Multi-Dimensional Fusion |
|
|
Hou Xiang, Huang Jin, Sang Jun, Xia Xiaofeng |
|
|
DOI: 10.3772/j.issn.1000-0135.2021.08.005 |
|
|
To facilitate the exchange and accurate promotion of academic achievements among authors of academic papers and research teams, a reasonable and effective academic social network needs to be established. Because current Chinese academic database platforms have not established an author network based on literature citation for community interaction, this paper uses CNKI as the data source and takes a professor of software engineering and information security at a university (professor A) as an example. The literature data of 122 scholars who are professor A's co-authors or have citation relationships with professor A from 2014 to 2020 were used as the research object, and a citation network of academic authors was constructed. Combined with the characteristics of academic social networking, the author, paper, and citation data were mined, and intimacy calculation methods along four dimensions (co-authorship, subject topic, subject-sensitive citation, and social graph) were weighted to obtain a comprehensive intimacy value among the authors in the network. In the experiment, the author relationship map was established by synthesizing the intimacy values, and each author's intimacy level in the network was obtained. Each author's network level (weight) was obtained by multiplying the intimacy value by the number of published papers. Authors and research teams with the same research level in the mapped network were identified, which lays a good foundation for data promotion in the academic social network. |
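The arithmetic described in this abstract can be sketched as follows: a weighted combination of four per-dimension scores, then a network level obtained by multiplying by the paper count. The equal weights, the dimension keys, and the sample scores are assumptions made for illustration; the abstract does not state the actual weights or score ranges.

```python
def combined_intimacy(scores, weights):
    """Weighted sum of the per-dimension intimacy scores."""
    return sum(weights[dim] * scores[dim] for dim in weights)


def network_level(intimacy, paper_count):
    """Network level (weight): intimacy value times number of published papers."""
    return intimacy * paper_count


# Hypothetical equal weights over the four dimensions (assumption).
weights = {"coauthor": 0.25, "topic": 0.25, "citation": 0.25, "social": 0.25}

# Hypothetical scores for one author relative to professor A (assumption).
scores = {"coauthor": 0.8, "topic": 0.6, "citation": 0.4, "social": 0.2}

intimacy = combined_intimacy(scores, weights)
level = network_level(intimacy, paper_count=10)
```

Authors could then be grouped by comparing their `level` values, which is one possible reading of finding authors "with the same research level".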
|
|
2021 Vol. 40 (8): 846-853
|
|
|
|
854 |
Keyword Extraction from a Paper's Abstract Based on Semantic Text Graph |
|
|
Wang Xiaoyu, Wang Fang |
|
|
DOI: 10.3772/j.issn.1000-0135.2021.08.006 |
|
|
Considering the basic role of keywords in large-scale document retrieval and text content analysis, an unsupervised keyword extraction algorithm based on a semantic graph is proposed, which focuses on improving the graph construction method and the word-weighting index. To ensure that the text graph retains more semantic and structural information, the algorithm generates a semantic text graph from the dependency relations among words in a sentence, using four features: conceptual connection, equivalent membership, functional attributes, and modification. This operation eliminates the sliding window parameter of the traditional method and improves the usability of the algorithm. On this basis, a word-weighting method combining word position information, concept hierarchy, concept connection preference, and connection strength is proposed, and the words are ranked by importance. Finally, high-scoring nodes are selected to form the keyword set for a research paper's abstract. Experimental results on four open corpora show that this method performs better than the three baseline algorithms, with the F1 value increasing to 0.570. |
|
|
2021 Vol. 40 (8): 854-868
|
|
|
|
869 |
Towards an Appropriate Scale of Datasets for Domain Bibliometrics: Empirical Study under Multiple Tasks |
|
|
Chen Guo, Wang Panting, Wang Yuefen |
|
|
DOI: 10.3772/j.issn.1000-0135.2021.08.007 |
|
|
It is impossible to construct a complete dataset for domain bibliometrics because literature is both concentrated and scattered in its distribution, which raises an essential question about the appropriate scale of datasets. This problem should be resolved in the context of specific bibliometric analysis tasks. In this study, we comprehensively consider typical task scenarios, including domain scale, elements for analysis (such as subject classification, country, institution, keyword, reference, author, and their co-occurrence relationships), top N values of elements, and whether sorting is required. Based on this, we designed a corresponding experimental scheme. We selected “artificial intelligence” as an example domain, constructed various subsets with different sampling scales, and obtained 4,800 indices showing how well those subsets approximate the full dataset. The results show that, in bibliometrics, a small dataset is sufficient for a reliable result when analyzing subject classifications or countries. Analysis tasks on authors should be conducted with as large a dataset as possible, because author analysis is quite sensitive to data scale. When analyzing institutions, keywords, or references, a dataset of a certain scale that corresponds to the specific task scenario can also achieve reliable results. Additionally, for co-occurrence analysis, more top elements (or sorted elements) and a larger dataset are necessary. |
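One simple way to picture the subset experiment in this abstract is to draw a random sample and check how many of the full dataset's top-N elements it recovers. The sketch below is illustrative only: the overlap ratio, the function names, and the synthetic keyword data are assumptions, and the paper's own indices may be defined differently.

```python
import random
from collections import Counter


def top_n(records, n):
    """Return the set of the n most frequent elements across all records."""
    counts = Counter(item for record in records for item in record)
    return {item for item, _ in counts.most_common(n)}


def top_n_overlap(full, fraction, n, seed=0):
    """Share of the full dataset's top-n elements recovered from a random subset."""
    rng = random.Random(seed)
    k = max(1, round(len(full) * fraction))
    subset = rng.sample(full, k)
    reference = top_n(full, n)
    return len(top_n(subset, n) & reference) / len(reference)


# Synthetic records (assumption): keyword kw0 occurs 100 times, kw1 95 times,
# and so on down to kw19 with 5 occurrences, so no two frequencies tie.
full = [[f"kw{i}"] for i in range(20) for _ in range((20 - i) * 5)]

overlap_full = top_n_overlap(full, fraction=1.0, n=5)
overlap_small = top_n_overlap(full, fraction=0.1, n=5)
```

Repeating the call over several fractions, values of `n`, and seeds would give a curve of overlap against sampling scale for each element type.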
|
|
2021 Vol. 40 (8): 869-878
|
|
|
|
879 |
Research on Data Recommendation Based on Community Detection of Citation Network |
|
|
Li Chengzan, Li Jianhui, Wang Xuezhi, Shen Zhihong, Du Yi |
|
|
DOI: 10.3772/j.issn.1000-0135.2021.08.008 |
|
|
Scientific data is the input and output of scientific research activities and the core driving factor of scientific and technological innovation. Only through open sharing and wide distribution can the value of scientific data be brought into full play. However, the utilization rate and dissemination efficiency of current data publications are generally low. To accelerate the dissemination and reuse of scientific data and enhance the effectiveness of open sharing, this paper proposes a data recommendation method based on community detection in a citation network. After an association network among data sets, papers, and authors is constructed, the Louvain algorithm is used for community detection under three association modes: co-authorship, co-citation, and coupling. The similarity between data sets and academic papers is then calculated by combining the TF-IDF algorithm with cosine similarity, and connections between the data sets and the communities in which the papers are located are constructed for data recommendation. The experimental results show that the data recommendation method can effectively find papers or authors with potential interest in the data sets. In addition, in terms of the contribution and stability of data recommendation, community detection based on the coupling relationship performs best, followed by the co-authorship relationship, whereas the citation relationship is greatly affected by publishing and citation times. |
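The similarity step named in this abstract (TF-IDF combined with cosine similarity) can be illustrated with a small standard-library sketch. The smoothed IDF formula, the whitespace tokenization, and the toy texts are assumptions for illustration; the paper's preprocessing and weighting details are not given in the abstract, and the Louvain community detection step is not shown.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one {term: weight} dict per tokenized document."""
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({
            # Smoothed IDF (assumption) keeps widely shared terms above zero.
            term: (count / len(doc)) * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in tf.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Hypothetical toy texts: one data set description and two paper abstracts.
dataset = "soil moisture observation data set for alpine grassland".split()
paper_a = "alpine grassland soil moisture response to climate warming".split()
paper_b = "keyword extraction from abstracts with a semantic text graph".split()

vec_dataset, vec_a, vec_b = tfidf_vectors([dataset, paper_a, paper_b])
sim_a = cosine(vec_dataset, vec_a)
sim_b = cosine(vec_dataset, vec_b)
```

In a recommendation setting, the data set would be linked to the community containing the highest-scoring papers; here `paper_a` shares vocabulary with the data set while `paper_b` does not.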
|
|
2021 Vol. 40 (8): 879-886
|
|
|
|