No-Pain No-Gain: DRL Assisted Optimization in Energy-Constrained CR-NOMA Networks

Zhiguo Ding, Robert Schober, H. Vincent Poor

Research output: Contribution to journal › Article › peer-review


This paper applies machine learning to optimize the transmission policy of cognitive radio inspired non-orthogonal multiple access (CR-NOMA) networks, where time-division multiple access (TDMA) is used to serve multiple primary users and an energy-constrained secondary user is admitted to the primary users’ time slots via NOMA. During each time slot, the secondary user performs two tasks: data transmission and energy harvesting based on the signals received from the primary users. The goal of the paper is to maximize the secondary user’s long-term throughput by optimizing its transmit power and the time-sharing coefficient for its two tasks. The long-term throughput maximization problem is challenging due to the need for making decisions that yield long-term gains but might result in short-term losses. For example, when a primary user with large channel gains transmits in a given time slot, intuition suggests that the secondary user should not carry out data transmission, due to the strong interference from the primary user, but should perform energy harvesting only, which results in zero data rate for this time slot but yields potential long-term benefits. In this paper, a deep reinforcement learning (DRL) approach is applied to emulate this intuition, where the deep deterministic policy gradient (DDPG) algorithm is employed together with convex optimization. Our simulation results demonstrate that the proposed DRL assisted NOMA transmission scheme can yield significant performance gains over two benchmark schemes.
Original language: English
Journal: IEEE Transactions on Communications
Publication status: Accepted/In press - 3 Jun 2021

