Dynamic Task Allocation and Service Migration in Edge-Cloud IoT System based on Deep Reinforcement Learning

Edge computing extends the capabilities of cloud computing to the network edge to support diverse resource-sensitive and performance-sensitive IoT applications. However, due to the limited capacity of edge servers (ESs) and the dynamic nature of computing requirements, the system needs to update the task allocation policy dynamically according to real-time system states. Service migration is essential to ensure service continuity when implementing dynamic task allocation. Therefore, this paper investigates the long-term dynamic task allocation and service migration (DTASM) problem in edge-cloud IoT systems where users’ computing requirements and mobility change over time. The DTASM problem is formulated to minimize the long-term load forwarded to the cloud while fulfilling the seamless-migration constraint and the latency constraint each time the DTASM decision is implemented. First, the DTASM problem is divided into two sub-problems: the user-selection problem on each ES and the system task-allocation problem. Then, the DTASM problem is formulated as a Markov Decision Process (MDP), and an approach based on deep reinforcement learning (DRL) is proposed. To tackle the challenge of the vast discrete action space of DTASM task allocation in systems with a large number of IoT users, a training architecture based on the twin-delayed deep deterministic policy gradient (TD3) is employed. Meanwhile, each action is divided into a differentiable action for policy training and a mapped action for implementation in the IoT system. Simulation results demonstrate that the proposed DRL-based approach achieves the best long-term system performance among the compared benchmarks while satisfying seamless service migration.
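The action-splitting idea described above, where a continuous "differentiable" actor output is paired with a discrete "mapped" action for execution, can be illustrated with a minimal sketch. The function name, the score-matrix layout, and the argmax mapping rule below are assumptions for illustration, not the paper's exact procedure:

```python
import numpy as np

def proto_to_allocation(proto_action, num_users, num_servers):
    """Map a continuous 'proto' action produced by the actor network
    to a discrete task-allocation decision.

    The continuous vector is interpreted as a (num_users, num_servers)
    score matrix, and each user is assigned to the ES with the highest
    score. This nearest-valid-action style mapping is a hypothetical
    sketch of the differentiable-action / mapped-action split.
    """
    scores = np.asarray(proto_action, dtype=float).reshape(num_users, num_servers)
    # Discrete mapped action: one server index per user.
    return scores.argmax(axis=1)

# Example: 4 IoT users, 3 edge servers.
rng = np.random.default_rng(0)
proto = rng.uniform(-1.0, 1.0, size=4 * 3)  # continuous actor output in [-1, 1]
alloc = proto_to_allocation(proto, num_users=4, num_servers=3)
print(alloc)  # one ES index per user
```

During training, the critic would be evaluated on the continuous `proto` vector (keeping gradients well-defined for the TD3 updates), while the environment executes the discrete `alloc` decision.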