Research on Cross-media Correlation Analysis by Fusing Semantic Features and Distribution Features

doi:10.3772/j.issn.1000-0135.2021.05.004

Journal of the China Society for Scientific and Technical Information (情报学报)

2021, Vol. 40, Issue 5: 471-478

Intelligence Analysis Methods and Techniques

Research on Cross-media Correlation Analysis by Fusing Semantic Features and Distribution Features

Liu Zhongbao^1,2, Zhao Wenjuan^1,2

1.Institute of Language Intelligence, Beijing Language and Culture University, Beijing 100083
2.Key Laboratory of Cloud Computing and Internet-of-Things Technology, Quanzhou University of Information Engineering, Quanzhou 362000

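Abstract: Text, image, video, audio, and other media data are multi-source and heterogeneous, which gives rise to the "semantic gap" problem. Most existing methods address only text and image data and are difficult to extend to correlation analysis across more media types. This paper therefore studies cross-media correlation analysis by fusing the semantic features and distribution features of multiple media types, with the aim of obtaining a consistent representation of text, image, video, and audio data. First, the media data are vectorized and fed into the model. Second, a bidirectional long short-term memory (BiLSTM) network mines the contextual information of the input data to produce a feature vector for each media type. Finally, the semantic features and distribution features of these feature vectors are fused for cross-media correlation analysis, yielding a consistent cross-media representation. Comparative experiments on a self-built dataset show that the proposed method achieves higher correlation analysis accuracy than existing methods such as canonical correlation analysis (CCA), kernel canonical correlation analysis (KCCA), and deep semantic match (Deep-SM), which indicates that it can identify semantic correlations among different media data fairly accurately. We hope this study offers useful guidance and reference for research on cross-media correlation analysis.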
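The abstract does not specify how the semantic and distribution features are computed or fused. As a minimal sketch only, assume that the semantic term is the average cosine distance between paired feature vectors and that the distribution term is the squared distance between the two modality means (maximum mean discrepancy with a linear kernel). Under those assumptions, a fused objective might look like the following. The function names, the `weight` parameter, and the toy vectors are hypothetical and are not taken from the paper.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def mean_vector(rows):
    """Column-wise mean of a list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def semantic_term(feats_a, feats_b):
    """Assumed semantic term: average cosine distance over paired items
    (row i of feats_a describes the same item as row i of feats_b)."""
    pairs = list(zip(feats_a, feats_b))
    return sum(cosine_distance(u, v) for u, v in pairs) / len(pairs)


def distribution_term(feats_a, feats_b):
    """Assumed distribution term: squared Euclidean distance between the
    modality means, i.e. maximum mean discrepancy with a linear kernel."""
    mu_a, mu_b = mean_vector(feats_a), mean_vector(feats_b)
    return sum((a - b) ** 2 for a, b in zip(mu_a, mu_b))


def fused_objective(feats_a, feats_b, weight=0.5):
    """Hypothetical fusion: semantic term plus a weighted distribution term."""
    return semantic_term(feats_a, feats_b) + weight * distribution_term(feats_a, feats_b)


# Toy 2-D vectors standing in for BiLSTM output features (made-up values).
text_feats = [[1.0, 0.0], [0.0, 1.0]]
image_feats = [[2.0, 0.0], [0.0, 2.0]]

print(semantic_term(text_feats, image_feats))      # paired vectors point the same way
print(distribution_term(text_feats, image_feats))  # the modality means differ
print(fused_objective(text_feats, image_feats))
```

In this toy case the paired vectors are perfectly aligned in direction, so only the distribution term contributes. A lower fused value would indicate that two modalities agree both item by item and in their overall distribution.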

Keywords: cross-media data, correlation analysis, bidirectional long short-term memory network, semantic features, distribution features

Received: 2020-04-07

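Funding: National Social Science Fund of China, General Program, "Research on Cross-media Knowledge Services for Library Resources in the Big Data Environment" (19BTQ012).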

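About the authors: Liu Zhongbao, male, born in 1981, PhD, professor and doctoral supervisor, whose main research interests are knowledge organization, text mining, and natural language processing; E-mail: liuzb@nuc.edu.cn. Zhao Wenjuan, female, born in 1983, master's degree, associate professor, whose main research interests include information resource management.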

Cite this article:

Liu Zhongbao, Zhao Wenjuan. Research on Cross-media Correlation Analysis by Fusing Semantic Features and Distribution Features[J]. Journal of the China Society for Scientific and Technical Information (情报学报), 2021, 40(5): 471-478.

Link to this article:

http://qbxb.istic.ac.cn/CN/10.3772/j.issn.1000-0135.2021.05.004 or http://qbxb.istic.ac.cn/CN/Y2021/V40/I5/471