TY - GEN
T1 - Online reinforcement learning for multimedia buffer control
AU - Mastronarde, Nicholas
AU - Van Der Schaar, Mihaela
PY - 2010
Y1 - 2010
N2 - We formulate the multimedia buffer control problem as a Markov decision process. Because the application's rate-distortion-complexity behavior is unknown a priori, the optimal buffer control policy must be learned online. To this end, we adopt a low-complexity reinforcement learning algorithm called Q-learning to learn the optimal control policy at run-time. We propose an accelerated Q-learning algorithm that exploits partial knowledge about the system's dynamics to dramatically improve performance. In our experiments, we show that the proposed application-aware reinforcement learning algorithm performs significantly better than existing application-independent reinforcement learning algorithms.
AB - We formulate the multimedia buffer control problem as a Markov decision process. Because the application's rate-distortion-complexity behavior is unknown a priori, the optimal buffer control policy must be learned online. To this end, we adopt a low-complexity reinforcement learning algorithm called Q-learning to learn the optimal control policy at run-time. We propose an accelerated Q-learning algorithm that exploits partial knowledge about the system's dynamics to dramatically improve performance. In our experiments, we show that the proposed application-aware reinforcement learning algorithm performs significantly better than existing application-independent reinforcement learning algorithms.
KW - Dynamic voltage scaling
KW - Encoder complexity control
KW - Markov decision processes
KW - Multimedia buffer control
KW - Reinforcement learning
UR - https://www.scopus.com/pages/publications/78049412014
U2 - 10.1109/ICASSP.2010.5495293
DO - 10.1109/ICASSP.2010.5495293
M3 - Conference contribution
AN - SCOPUS:78049412014
SN - 9781424442966
T3 - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
SP - 1958
EP - 1961
BT - 2010 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2010 - Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2010 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2010
Y2 - 14 March 2010 through 19 March 2010
ER -