Study of Short Video Popularity Prediction Based on Network Representation Learning
Zhu Hengmin1,2, Xu Ning1, Wei Jing1, Shen Chao1
1. School of Management, Nanjing University of Posts and Telecommunications, Nanjing 210003; 2. Jiangsu University Philosophy and Social Science Key Research Base — Information Industry Integration Innovation and Emergency Management Research Center, Nanjing 210003
Abstract Predicting the popularity of short videos not only helps short-video platforms manage information efficiently but also plays an important role in monitoring public opinion. Unlike existing studies that rely only on the multimodal content features of short videos, we propose a popularity prediction model based on network representation learning that fuses content features with network structural features. First, from a dataset crawled from Douyin, a heterogeneous information network was constructed whose nodes are short videos, publishers, and commenters, and whose edges capture the relations among them. After mapping this network into two homogeneous networks, a short-video network and a publisher network, node2vec was used to embed the network structure, yielding a network modality. Second, the multimodal content features of the short videos were extracted and fused using low-rank multi-view embedding learning. Finally, a multilayer perceptron (MLP) regression model was built for short-video popularity prediction, and comparison and ablation experiments were conducted. The results show that fusing network structure features reduces the error of short-video popularity prediction. The modalities influence prediction in decreasing order of importance: textual, network, social, acoustic, and visual. Our method, which combines short-video content and network structure features, offers a new feature-engineering approach to short-video popularity prediction.
|
Received: 30 January 2024
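To make the network-modality step concrete, the following is a minimal sketch of node2vec-style biased second-order random walks over a toy homogeneous short-video network (videos linked by shared commenters). The graph, node names, and parameter values are illustrative assumptions, not the paper's data; in the full pipeline the resulting walks would be fed to a skip-gram model (e.g., gensim's Word2Vec) to produce the node embeddings used as the network modality.

```python
import random

def node2vec_walk(adj, start, length, p=1.0, q=0.5):
    """One biased second-order random walk (node2vec sampling).

    p (return parameter) controls the chance of stepping back to the
    previous node; q (in-out parameter) biases the walk toward (q < 1)
    or away from (q > 1) nodes farther from the previous node.
    """
    walk = [start]
    while len(walk) < length:
        cur = walk[-1]
        nbrs = adj[cur]
        if not nbrs:
            break
        if len(walk) == 1:
            # first step: uniform choice among neighbors
            walk.append(random.choice(nbrs))
            continue
        prev = walk[-2]
        # unnormalized transition weights per the node2vec scheme
        weights = []
        for x in nbrs:
            if x == prev:
                weights.append(1.0 / p)   # return to previous node
            elif x in adj[prev]:
                weights.append(1.0)       # stays at distance 1 from prev
            else:
                weights.append(1.0 / q)   # explore outward
        walk.append(random.choices(nbrs, weights=weights)[0])
    return walk

# toy short-video network: an edge means two videos share commenters
adj = {
    "v1": ["v2", "v3"],
    "v2": ["v1", "v3", "v4"],
    "v3": ["v1", "v2"],
    "v4": ["v2"],
}
walks = [node2vec_walk(adj, v, length=5) for v in adj for _ in range(10)]
```

Each walk is a node sequence respecting the graph's edges; training a skip-gram model on many such walks yields low-dimensional node vectors that preserve structural proximity.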
|
|
|
|
|
|
|