Named Entity Recognition of Local Chronicles Literature in Traditional Chinese Opera Based on Multi-dimensional Feature Analysis
Zhai Shanshan1,2, Yu Huajuan1, Chen Jianyao3, Xia Lixin1
1. School of Information Management, Central China Normal University, Wuhan 430079; 2. Intelligent Computing Laboratory for Cultural Heritage, Wuhan University, Wuhan 430072; 3. University of Wisconsin-Milwaukee, Milwaukee 53202
Abstract: Local chronicles are a unique and highly valuable form of regional documentation in China. Digitizing these records and mining the knowledge they contain is important for the inheritance and dissemination of traditional Chinese culture, as well as for the construction of a culturally strong nation. Named entity recognition (NER) is a fundamental technology for organizing and discovering knowledge within local chronicles. Although there has been some progress in NER for local chronicles, a systematic technical solution that adapts to the specific features of these texts and the characteristics of domain resources is still lacking. This study therefore proposes an approach to named entity recognition in local chronicles of traditional Chinese opera that integrates multi-dimensional features with a Bi-LSTM-CRF model. First, the distinctive traits of opera entities within local chronicles are analyzed by combining syntactic features with textual features such as symbols, suffixes, word structure, context, and negative examples. Next, the Bi-LSTM-CRF model, which performs well on long text structures, is applied together with the analyzed features of opera entities to improve the efficiency of entity recognition. Finally, an empirical study is conducted on the “Chu Opera Chronicles.” The results show that the proposed model outperforms the baseline model in named entity recognition, achieving an F1 score of 0.869.
Received: 18 October 2023
