|
|
2022 Vol. 41, No. 10
Published: 2022-10-24 |
|
|
|
|
|
|
|
1024 |
Academic Output Distribution in Authors of Highly Cited Papers among Different City-University Clusters |
|
|
Zhang Guilan, Pan Yuntao, Zheng Chuhua, Wang Haiyan, Ma Zheng |
|
|
DOI: 10.3772/j.issn.1000-0135.2022.10.003 |
|
|
In an open scientific research ecosystem, researchers were to some extent self-selecting and self-organizing in their growth and development. The integrated development of universities and cities constituted the external ecological environment for researchers, which in turn affected their growth and development. Based on the economic level of cities and the academic level of universities, this study proposed city-university clusters at different levels and examined how researchers’ academic output was distributed among them. Authors of highly cited papers in artificial intelligence were used as the sample. Their basic and employment information, project data, paper output data, and patent output data were collected through data mining. Statistical analysis and propensity score matching (PSM) were used to explore the distribution and to examine the combined influence of city and university on academic output. The authors of highly cited papers were mainly concentrated in top universities, and the relationship between university ranking and number of authors followed a power-function distribution with a negative exponent a. Academic output differed among city-university clusters: output in higher-level clusters was significantly higher and more dispersed. Furthermore, university and city exerted a dual influence on authors’ academic output, with the influence of the university exceeding that of the city. A high-quality university platform could compensate for the effect of a city’s economic level on researchers’ academic output. |
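The power-function relationship mentioned in the abstract can be illustrated with a minimal sketch. It assumes the form y = c · x^a (x the university rank, y the number of authors) and estimates a by ordinary least squares on log-transformed values. The function name `fit_power_law` and the sample data are hypothetical; the abstract does not state how the authors fitted the curve.

```python
import math


def fit_power_law(ranks, counts):
    """Fit counts = c * ranks**a by least squares in log-log space.

    Returns (c, a). A negative a means counts fall as rank grows.
    """
    xs = [math.log(r) for r in ranks]
    ys = [math.log(c) for c in counts]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx                        # slope of the log-log line
    c = math.exp(mean_y - a * mean_x)    # intercept, mapped back from logs
    return c, a


# Hypothetical data generated from an exact power law with a = -0.5.
ranks = [1, 2, 4, 8]
counts = [100 * r ** -0.5 for r in ranks]
c, a = fit_power_law(ranks, counts)
print(f"c = {c:.2f}, a = {a:.2f}")
```

With real rank and author counts the fit will not be exact, and the sign of the estimated a is what corresponds to the abstract's claim.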
|
|
2022 Vol. 41 (10): 1024-1033 |
|
|
|
1034 |
Multi Granularity Knowledge Organization Model for User Generated Content |
|
|
Wang Zhongyi, Zheng Xin, Wang Keying |
|
|
DOI: 10.3772/j.issn.1000-0135.2022.10.004 |
|
|
As an important text resource among network information resources in the era of big data, user generated content (UGC) has received increasing attention from scholars in various fields. Compared with traditional texts, massive and fragmented UGC texts are more difficult to organize. To address the fragmentation of UGC text, this study proposes a multi-granularity knowledge organization model based on the knowledge element. By extracting knowledge elements from fragmented UGC and establishing multi-granularity associations and multi-granularity indexes, the model organizes fragmented UGC from point to surface and from part to whole. In the empirical part, fragmented UGC texts related to “retrieval” are used to carry out multi-granularity knowledge organization, and a user interface is provided to deliver the knowledge retrieval service. Comparative experiments then demonstrate the effectiveness and scientific soundness of the proposed model. |
|
|
2022 Vol. 41 (10): 1034-1043 |
|
|
|
1044 |
The Challenges of Webometrics and Altmetrics and the Evaluation of Robustmetric and Non-robustmetric in Societal Impact |
|
|
Liu Tingyuan, Liu Shuman |
|
|
DOI: 10.3772/j.issn.1000-0135.2022.10.005 |
|
|
As evidence of the impact of scientific achievements becomes increasingly extensive, the challenges facing webometrics and altmetrics and their evaluation of societal impact are growing. Web-altmetric data on societal impact commonly show a high proportion of zero values (left), multiple outliers (right), and an extremely right-skewed distribution. As a result, the authenticity and rationality of the data sets, and the bias-error resistance, reliability, and stability of the informetric methods and results, face many unique challenges. For the high proportion of zero values, this study uses the quartile zero-value scaling-down method for verification; the proposed exact calculation formula shows good consistency and bias-error resistance and provides an important basis for the reasonable correction of outliers and their robustmetric. The quartile zero-value rate is defined and derived from the interquartile range, and the actual risk rate at its maximum scaling-down is low, making it an ideal estimation point for the location parameter. For multiple outliers, a robust winsorizing method is used for correction; compared with the non-robust method, the corrected data set is more tolerant and reliable. For the extremely right-skewed distribution, a linear proportional method based on the tailed mean is adopted for nondimensionalization, so that the mapped and transformed results are more stable and consistent than those of the linear proportional method based on the mean. Weight coefficients are solved by integrating subjective weights into the objective weighting method: the weights from the G1 method (subjective), the objective G1 method, and the semi-objective G1 method are treated as a triangular fuzzy number and defuzzified, giving the weight values a dual subjective-objective basis and improving the reliability and stability of the comprehensive evaluation results. Compared with non-robustmetric evaluation, robustmetric evaluation greatly improves stability, reliability, and bias-error resistance, which is conducive to advancing informetrics and evaluation science toward a precise science of complexity. |
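The outlier correction and scaling steps described in the abstract can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: the winsorizing limits, the reading of "tailed mean" as a symmetric trimmed mean, the function names, and the sample counts are all hypothetical.

```python
from statistics import mean


def winsorize(values, lower_frac=0.0, upper_frac=0.1):
    """Clamp the lowest and highest fractions of values to the nearest retained value."""
    n = len(values)
    ordered = sorted(values)
    low = ordered[int(lower_frac * n)]
    high = ordered[n - int(upper_frac * n) - 1]
    return [min(max(v, low), high) for v in values]


def trimmed_mean(values, frac=0.1):
    """Mean after dropping the lowest and highest `frac` of the sorted values."""
    n = len(values)
    k = int(frac * n)
    ordered = sorted(values)
    return mean(ordered[k:n - k] if k else ordered)


def scale_by_trimmed_mean(values, frac=0.1):
    """Linear proportional scaling: divide each value by the trimmed mean."""
    centre = trimmed_mean(values, frac)
    if centre == 0:
        raise ValueError("trimmed mean is zero; scaling is undefined")
    return [v / centre for v in values]


# Hypothetical altmetric counts: many zeros and one extreme right-tail outlier.
counts = [0, 0, 0, 1, 2, 2, 3, 5, 8, 500]
robust = winsorize(counts, upper_frac=0.1)   # 500 is pulled down to 8
plain_mean = mean(counts)                    # dominated by the outlier
robust_centre = trimmed_mean(counts, 0.1)    # ignores the extreme value
scaled = scale_by_trimmed_mean(robust, 0.1)
```

In this toy data the plain mean is driven almost entirely by the single outlier, while the trimmed mean stays close to the bulk of the values, which is the stability property the abstract attributes to the tailed-mean scaling.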
|
|
2022 Vol. 41 (10): 1044-1058 |
|
|
|
1059 |
Imbalanced Classification of Emerging Technologies Identification: Based on Cost-sensitive Random Forest |
|
|
Lu Xiaobin, Zhang Yangyi, Yang Guancan, Xing Jiaxin |
|
|
DOI: 10.3772/j.issn.1000-0135.2022.10.006 |
|
|
Automated forward-looking forecasting based on large-scale patent data and patent characteristics has gradually become the research focus of emerging technologies identification. The introduction of machine learning has also drawn attention to the low probability of discovering emerging technologies among the massive number of technological inventions represented by patents, which constitutes a typical imbalanced classification problem. This study aims to reduce the classification bias toward the majority class caused by imbalanced datasets in emerging technologies identification, and proposes a comprehensive imbalanced classification optimization framework that integrates three levels: data, algorithm, and evaluation. The framework is verified on a binary classification task: whether patents in the cancer drug field can be authorized by the Food and Drug Administration as new drugs, which are treated as emerging technologies. The specific improvements are as follows: progressive resampling is verified at the data level; cost-sensitive learning is introduced and three methods of setting the cost matrix in the absence of expert experience are studied at the evaluation level; and a cost-sensitive random forest is constructed at the algorithm level. The results show that the cost-sensitive random forest based on 1∶2 undersampling and the ROC (receiver operating characteristic)-Youden index threshold cost matrix can predict 82.8% of the emerging technologies and 81.6% of the common technologies, which is significantly better than the control group and existing related results. The study offers a reference for future exploration of the nature of the imbalanced classification problem in emerging technologies identification. |
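The Youden index named in the abstract is J = sensitivity + specificity - 1, and the threshold that maximizes J is a common way to pick an operating point on the ROC curve. The sketch below only selects that threshold; how the authors convert it into a cost matrix is not described in the abstract and is not reproduced here. The function name `youden_threshold` and the sample scores and labels are hypothetical.

```python
def youden_threshold(scores, labels):
    """Return (threshold, J) that maximizes J = sensitivity + specificity - 1.

    A sample is predicted positive when its score is >= threshold.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both classes must be present")
    best_t, best_j = None, -1.0
    for t in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= t)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < t)
        j = tp / positives + tn / negatives - 1
        if j > best_j:          # ties keep the lowest threshold
            best_t, best_j = t, j
    return best_t, best_j


# Hypothetical classifier scores for an imbalanced sample (2 positives, 4 negatives).
scores = [0.1, 0.2, 0.3, 0.5, 0.6, 0.9]
labels = [0, 0, 0, 1, 0, 1]
threshold, j = youden_threshold(scores, labels)
print(threshold, j)
```

On imbalanced data the selected threshold typically differs from the default 0.5 cut-off, because J weights the error rates of both classes equally regardless of class size.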
|
|
2022 Vol. 41 (10): 1059-1070 |
|
|
|
1071 |
Framework of the Government-data Collaborative Governance Platform Based on the Alliance Blockchain: Considering the National Carbon Emissions Trading Market as an Example |
|
|
Zheng Rong, Gao Zhihao, Wei Mingzhu, Sun Yanfei |
|
|
DOI: 10.3772/j.issn.1000-0135.2022.10.007 |
|
|
Collaborative governance of government data under the national security concept has become an important part of social and national governance. Drawing on literature analysis and current research results and guided by synergy theory, this study uses alliance blockchain technology to build a government-data collaborative governance platform that maximizes the synergy of government data governance across multiple agents and multi-source data and ensures the symbiosis, sharing, co-governance, security, and stability of government data governance elements. The article focuses on the dilemma of collaborative governance of government data, adopts the research paradigm of “technical framework construction-platform model construction-analysis of operating mechanism,” and uses the national carbon emissions trading market as the application scenario to discuss the value of the platform in the collaborative governance of government data. The analysis shows that the platform can realize collaboration among government data governance entities and government data, break down data barriers, improve data security, credibility, and traceability, and clarify data standards and ownership rights in collaborative data governance. It thereby provides platform support and technical guarantees for data security and for adding value to government data. |
|
|
2022 Vol. 41 (10): 1071-1084 |
|
|
|
1100 |
Perspective of the Development of Intelligence Studies and Intelligence Service in the Era of Data Intelligence |
|
|
Xu Xin, Ye Dingling |
|
|
DOI: 10.3772/j.issn.1000-0135.2022.10.009 |
|
|
As the era of data intelligence transforms the core of intelligence studies and intelligence services, it has become necessary to examine systematically the changes within them, in order to provide reference and developmental guidance for realizing the common advantages of intelligence studies and services. Starting from the intelligence process, this study systematically analyzes the impact of big data, cloud computing, artificial intelligence, blockchain, and 5G technologies, and of their integration, on intelligence studies and services across the stages of demand and planning, retrieval and collection, integration and organization, analysis and condensation, and presentation and transmission. It also puts forward viewpoints on the relationship between data intelligence technology and intelligence studies, on the development of data intelligence technology and digital intelligence technology, and on several theories of intelligence studies. |
|
|
2022 Vol. 41 (10): 1100-1110 |
|
|
|