TY - GEN
T1 - Reinforcement learning for power management in wireless multimedia communications
AU - Mastronarde, Nicholas
AU - Van Der Schaar, Mihaela
PY - 2011
Y1 - 2011
N2 - We consider the problem of energy-efficient point-to-point transmission of delay-sensitive data (e.g., multimedia data) over a fading channel. We propose a rigorous and unified framework for simultaneously utilizing both physical-layer and system-level techniques to minimize energy consumption, under delay constraints, in the presence of stochastic and unknown traffic and channel conditions. We formulate the problem as a Markov decision process and solve it online using reinforcement learning. The advantages of the proposed online method are that (i) it does not require a priori knowledge of the traffic arrival and channel statistics to determine the jointly optimal physical-layer and system-level power management strategies; (ii) it exploits partial information about the system so that less information needs to be learned than when using conventional reinforcement learning algorithms; and (iii) it obviates the need for action exploration, which severely limits the adaptation speed and run-time performance of conventional reinforcement learning algorithms.
KW - adaptive modulation and coding
KW - dynamic power management
KW - energy-efficient wireless multimedia communication
KW - Markov decision process
KW - power-control
KW - reinforcement learning
UR - https://www.scopus.com/pages/publications/80155202760
U2 - 10.1109/ICME.2011.6012018
DO - 10.1109/ICME.2011.6012018
M3 - Conference contribution
AN - SCOPUS:80155202760
SN - 9781612843490
T3 - Proceedings - IEEE International Conference on Multimedia and Expo
BT - Electronic Proceedings of the 2011 IEEE International Conference on Multimedia and Expo, ICME 2011
T2 - 2011 12th IEEE International Conference on Multimedia and Expo, ICME 2011
Y2 - 11 July 2011 through 15 July 2011
ER -