Automatic Recommendation of Judgment Documents Based on Structural Content Features
Liang Zhu¹, Shen Si², Ye Wenhao³, Wang Dongbo¹
1. College of Information Management, Nanjing Agricultural University, Nanjing 210095
2. School of Economics & Management, Nanjing University of Science and Technology, Nanjing 210094
3. School of Information Management, Nanjing University, Nanjing 210023
Abstract Existing judgment document retrieval systems are difficult for non-legal professionals to use. Current research on intelligent search in the legal field concentrates on recommending and classifying legal provisions based on judgment documents, while the automatic recommendation of judgment documents themselves has largely gone unstudied. This study therefore proposes a method for recommending judgment documents from news texts, modeled on similar-news recommendation methods. Drawing on existing research, it summarizes the structural and content characteristics of judgment documents, uses the text of news articles to simulate the search queries of non-legal-professional users, and constructs a corpus of judgment documents with structural content features in order to recommend related judgment documents automatically. The results show that, once feature words are extracted from the court opinion section of judgment documents, text matching with the LambdaMART model performs well and outperforms traditional full-text retrieval.
Received: 23 May 2020