Attention-Based Multi-Layered Encoder-Decoder Model for Summarizing Non-Interactive User-Based Videos


Vasudha Tiwari, Charul Bhatnagar

Abstract

Video summarization extracts the relevant content from a video and presents it in a compact form. User-based video summarization tailors the summary to the requirements of the user. In this work, a non-interactive, perception-based video summarization technique is proposed that uses an attention mechanism to capture the user’s interest and extract relevant keyshots, in temporal sequence, from the video content. Video summarization is formulated as a sequence-to-sequence learning problem, and a supervised method is proposed to solve it. Adding layers to an existing network makes it deeper, enables a higher level of abstraction, and facilitates better feature extraction. The proposed model therefore uses a multi-layered, deep summarization encoder-decoder network (MLAVS) with an attention mechanism to select the final keyshots from the video. The contextual information of the video frames is encoded using a multi-layered Bidirectional Long Short-Term Memory (BiLSTM) network as the encoder. For decoding, a multi-layered attention-based Long Short-Term Memory (LSTM) network with a multiplicative score function is employed. Experiments are performed on the benchmark TVSum dataset, and the results are compared with recent works. The results show considerable improvement and demonstrate the efficacy of this methodology against most other available state-of-the-art methods.
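To make the multiplicative score function concrete, the following is a minimal sketch of one attention step, not the authors' implementation. It assumes the common "general" multiplicative form, score(s, h_i) = s^T W h_i, where s is a decoder state and h_i is an encoder output for frame i; the abstract does not specify the exact form used in MLAVS. The function names, the weight matrix, and the toy vectors are hypothetical and chosen only for illustration.

```python
import math


def multiplicative_scores(decoder_state, encoder_states, weight):
    """Return score_i = s^T W h_i for every encoder state h_i (assumed form)."""
    # Project the decoder state once: (s^T W) has the encoder dimensionality.
    projected = [
        sum(decoder_state[r] * weight[r][c] for r in range(len(decoder_state)))
        for c in range(len(weight[0]))
    ]
    # Dot the projected state with each encoder state.
    return [sum(p * h for p, h in zip(projected, h_i)) for h_i in encoder_states]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(decoder_state, encoder_states, weight):
    """Return attention weights and the resulting context vector."""
    weights = softmax(multiplicative_scores(decoder_state, encoder_states, weight))
    dim = len(encoder_states[0])
    # Context vector: attention-weighted sum of the encoder states.
    context = [
        sum(w * h[d] for w, h in zip(weights, encoder_states)) for d in range(dim)
    ]
    return weights, context


# Toy example: three hypothetical frame encodings and one decoder state.
frames = [[1.0, 0.0], [0.0, 1.0], [2.0, 0.0]]
state = [1.0, 0.0]
identity = [[1.0, 0.0], [0.0, 1.0]]  # stand-in for a learned weight matrix
weights, context = attend(state, frames, identity)
print(weights, context)
```

In this toy case the third frame aligns most strongly with the decoder state, so it receives the largest attention weight. In a trained model, W would be learned and the encoder states would come from the multi-layered BiLSTM.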

Author Biography

Vasudha Tiwari *

Department of CEA, GLA University, Mathura, India

vasudhatiwari1608@gmail.com

Charul Bhatnagar

Department of CEA, GLA University, Mathura, India

charul@gla.ac.in

Copyright © JES 2024 on-line : journal.esrgroups.org
