Escaping local minima in deep reinforcement learning for video summarization
Authors/Creators
- Aristotle University of Thessaloniki
Description
State-of-the-art unsupervised deep neural video summarization methods mostly fall under the adversarial reconstruction framework, which employs a Generative Adversarial Network (GAN) structure and Long Short-Term Memory (LSTM) autoencoders during training. The typical result is a selector LSTM that sequentially receives video frame representations and outputs corresponding scalar importance factors, which are then used to select key-frames. This basic approach has been augmented with an additional Deep Reinforcement Learning (DRL) agent, trained using the Discriminator's output as a reward, which learns to optimize the selector's outputs. However, local minima are a well-known problem in DRL. This paper therefore presents a novel regularizer for escaping local loss minima, in order to improve unsupervised key-frame extraction. It is an additive loss term, employed during a second training phase, that rewards the distance of the neural agent's parameters from those of a previously found good solution. It thus encourages the training process to explore the parameter space more aggressively, in order to discover a better local loss minimum. Evaluation on two public datasets shows considerable gains over the baseline and against the state-of-the-art.
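The idea described above can be sketched in a few lines: during the second training phase, the agent's loss is reduced in proportion to how far its current parameters have moved from a snapshot of the previously found good solution. The exact functional form and weighting are not given in this abstract, so the squared-L2 distance, the function name, and the weight `lam` below are illustrative assumptions, not the paper's definitive formulation.

```python
import numpy as np

def regularized_loss(base_loss, params, anchor_params, lam=0.1):
    """Additive regularizer that *rewards* distance from a previously
    found solution (assumed form: L' = L - lam * ||theta - theta*||^2).

    base_loss     -- scalar loss from the usual DRL objective
    params        -- list of current parameter arrays (theta)
    anchor_params -- snapshot of the earlier good solution (theta*)
    lam           -- illustrative regularization weight
    """
    # Squared L2 distance between current and anchor parameters;
    # subtracting it lowers the loss as the agent moves away from
    # the old local minimum, encouraging wider exploration.
    dist_sq = sum(float(np.sum((p - a) ** 2))
                  for p, a in zip(params, anchor_params))
    return base_loss - lam * dist_sq
```

For example, with parameters `[1.0, 2.0]` and an anchor at the origin, a base loss of 1.0 becomes `1.0 - 0.1 * 5.0 = 0.5`, so the further the agent drifts from the anchor, the lower this term drives the total loss.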
Files

| Name | Size | MD5 |
|---|---|---|
| EscapingLocalMinimaInDeepReinforcementLearningForVideoSummarization_ICMR23.pdf | 547.4 kB | 132d3d48db21302d414afca839aff7cf |