Browsing by Subject "reinforcement learning"
Now showing 1 - 5 of 5
Item: Blockchain-based Edge Resource Sharing for Metaverse (IEEE, 2022-10)
Wang, Zhilin; Hu, Qin; Xu, Minghui; Jiang, Honglu; Computer and Information Science, School of Science
Although the Metaverse has recently been widely studied, its practical application still faces many challenges. One severe challenge is the lack of sufficient computing and communication resources on local devices, which prevents users from accessing Metaverse services. To address this issue, this paper proposes a practical blockchain-based mobile edge computing (MEC) platform for resource sharing and optimal utilization to complete requested offloading tasks, given the heterogeneity of servers' available resources and of users' task requests. Specifically, we first elaborate on the design of our proposed system and then dive into the task allocation mechanism that assigns offloading tasks to suitable servers. To solve the multiple task allocation (MTA) problem in polynomial time, we devise a learning-based algorithm. Since the objective function and constraints of MTA are significantly affected by the servers uploading the tasks, we reformulate it as a reinforcement learning problem and calculate the rewards for each state and action while accounting for the influence of the servers. Finally, extensive experiments demonstrate the effectiveness and efficiency of our proposed system and algorithms.
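The abstract does not spell out the MTA state, action, or reward design, so the following is only a toy sketch of the general idea: offloading tasks are assigned to heterogeneous servers, and rewards reflect how well a task fits a server's remaining resources. All names here (`task_demand`, `server_capacity`, the fit-based reward shaping) are hypothetical, and the update is a contextual-bandit simplification of the paper's full reinforcement learning formulation.

```python
import numpy as np

rng = np.random.default_rng(0)

N_TASKS, N_SERVERS = 6, 3
task_demand = rng.uniform(1.0, 4.0, size=N_TASKS)   # resource units each task needs (hypothetical)
server_capacity = np.array([6.0, 9.0, 12.0])        # heterogeneous server resources (hypothetical)

def reward(task, server, remaining):
    """Penalize infeasible assignments; otherwise reward assignments
    that leave the server with more spare capacity."""
    if task_demand[task] > remaining[server]:
        return -1.0
    return 1.0 - task_demand[task] / remaining[server]

Q = np.zeros((N_TASKS, N_SERVERS))
alpha, eps = 0.1, 0.2

for episode in range(2000):
    remaining = server_capacity.copy()
    for t in range(N_TASKS):
        if rng.random() < eps:                       # epsilon-greedy exploration
            a = rng.integers(N_SERVERS)
        else:
            a = int(np.argmax(Q[t]))
        r = reward(t, a, remaining)
        if r >= 0:
            remaining[a] -= task_demand[t]
        Q[t, a] += alpha * (r - Q[t, a])             # bandit-style value update

print("learned allocation (task -> server):", Q.argmax(axis=1))
```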
Item: Intelligent Device Selection in Federated Edge Learning with Energy Efficiency (2021-12)
Peng, Cheng; Hu, Qin; Kang, Kyubyung; Zou, Xukai
Due to the increasing demand from mobile devices for real-time responses from cloud computing services, federated edge learning (FEL) has emerged as a new computing paradigm that uses edge devices to achieve efficient machine learning while protecting their data privacy. Implementing efficient FEL suffers from the challenges of devices' limited computing and communication resources, as well as unevenly distributed datasets, which has inspired several existing studies on device selection to optimize time consumption and data diversity. However, these studies fail to consider the energy consumption of edge devices given their limited power supply, which can seriously affect the cost-efficiency of FEL through unexpected device dropouts. To fill this gap, we propose a device selection model capturing both energy consumption and data diversity optimization, under constraints on time consumption and the amount of training data. We then solve the optimization problem by reformulating the original model and designing a novel algorithm, named E2DS, which greatly reduces the time complexity. By comparing with two classical FEL schemes, we validate the superiority of our proposed device selection mechanism with extensive experimental results. Furthermore, in a real FEL environment multiple tasks occupy each device's CPU at the same time, so the CPU frequency available for training fluctuates constantly, which may lead to large errors in computed energy consumption. To solve this problem, we deploy reinforcement learning to learn the frequency so that the estimate approaches its real value. In addition, rather than increasing data diversity, we consider a more direct way to improve convergence speed by using loss values, and we formulate an optimization problem that minimizes energy consumption and maximizes loss values to select the appropriate set of devices. After reformulating the problem, we design a new algorithm, FCE2DS, which achieves better convergence speed and accuracy. Finally, we compare the performance of the proposed scheme with the previous scheme and the traditional scheme to verify its improvement in multiple aspects.

Item: Multidisciplinary Optimization in Decentralized Reinforcement Learning (IEEE, 2017-12)
Nguyen, Thanh; Mukhopadhyay, Snehasis; Computer and Information Science, School of Science
Multidisciplinary Optimization (MDO) is one of the most popular techniques in aerospace engineering, where systems are complex and draw on knowledge from multiple fields. However, to the best of our knowledge, MDO has not been widely applied in decentralized reinforcement learning (RL) due to the 'unknown' nature of RL problems. In this work, we apply MDO in decentralized RL. In our MDO design, each learning agent uses system identification to closely approximate the environment and tackle the 'unknown' nature of RL. The agents then apply MDO principles to compute the control solution using Monte Carlo and Markov decision process techniques. We examined two MDO design options suitable for multi-agent learning: the multidisciplinary feasible option and the individual discipline feasible option. Our results show that the individual discipline feasible option can successfully learn to control the system, and that the MDO approach outperforms the fully decentralized and fully centralized approaches.

Item: Mutual Reinforcement Learning (2021-05)
Reid, Cameron; Mukhopadhyay, Snehasis; Mohler, George; Tuceryan, Mihran
Mutual learning is an emerging field in intelligent systems that takes inspiration from naturally intelligent agents and explores how agents can communicate and cooperate to share information and learn more quickly. While agents in many biological systems have little trouble learning from one another, it is not immediately obvious how artificial agents would achieve similar learning. In this thesis, I explore how agents learn to interact with complex systems, and how such learning agents can transfer knowledge to one another to improve their learning performance when they learn together and can communicate. While significant research has explored the problem of knowledge transfer, the existing literature is concerned either with supervised learning tasks or with relatively simple discrete reinforcement learning. The work presented here is, to my knowledge, the first that admits continuous state spaces and deep reinforcement learning techniques. The first contribution of this thesis, presented in Chapter 2, is a modified version of deep Q-learning that demonstrates improved learning performance due to the addition of a mutual learning term penalizing disagreement between mutually learning agents. The second contribution, in Chapter 3, describes effective communication between agents that use fundamentally different knowledge representations and learning systems (model-free deep Q-learning and model-based adaptive dynamic programming), and I discuss how the agents can mathematically negotiate their trust in one another to achieve superior learning performance. I conclude with a discussion of the promise shown by this area of research and of problems that I believe are exciting directions for future work.
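The thesis abstract above names the key ingredient of Chapter 2 (a mutual learning term that penalizes disagreement between agents) but not its exact form. A minimal PyTorch sketch of one plausible version follows: each agent minimizes a standard TD loss plus a penalty on the gap between its Q-values and its peer's. The network sizes, the penalty weight `lam`, and the MSE form of the disagreement term are all assumptions, not the thesis's actual formulation.

```python
import torch
import torch.nn as nn

# Two independently initialized Q-networks for two mutually learning agents.
def make_qnet(obs_dim=4, n_actions=2):
    return nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(), nn.Linear(64, n_actions))

q1, q2 = make_qnet(), make_qnet()
opt1 = torch.optim.Adam(q1.parameters(), lr=1e-3)
lam, gamma = 0.1, 0.99   # lam weights the hypothetical mutual-disagreement penalty

def mutual_dqn_loss(qnet, peer, batch):
    s, a, r, s2, done = batch
    q_sa = qnet(s).gather(1, a.unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        # Self-bootstrapped target (no separate target network; fine for a toy).
        target = r + gamma * qnet(s2).max(dim=1).values * (1 - done)
        peer_q = peer(s)                     # peer's opinion, not backpropagated
    td_loss = nn.functional.mse_loss(q_sa, target)
    disagreement = nn.functional.mse_loss(qnet(s), peer_q)
    return td_loss + lam * disagreement

# One update step on a random (placeholder) batch:
s = torch.randn(32, 4); a = torch.randint(0, 2, (32,))
r = torch.randn(32); s2 = torch.randn(32, 4); done = torch.zeros(32)
loss = mutual_dqn_loss(q1, q2, (s, a, r, s2, done))
opt1.zero_grad(); loss.backward(); opt1.step()
```

In practice each agent would take a turn as `qnet` with the other as `peer`, so the penalty pulls the two value functions toward agreement while each still fits its own TD targets.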
Item: Selective decentralization to improve reinforcement learning in unknown linear noisy systems (IEEE, 2017-11)
Nguyen, Thanh; Mukhopadhyay, Snehasis; Computer and Information Science, School of Science
In this paper, we answer the question of to what extent selective decentralization can enhance learning and control performance when the system is noisy and unknown. Compared with our previous work on selective decentralization, this paper adds system noise as another source of complexity in the learning and control problem; we therefore restrict our analysis to simple toy examples of noisy linear systems. For linear systems, the Hamilton-Jacobi-Bellman (HJB) equation reduces to a Riccati equation with a closed-form solution. Our previous framework for learning and controlling unknown systems follows the principle of approximating the system via identification and then applying a model-based solution. Accordingly, this paper explores learning and control performance in two respects: system identification error and system stabilization. Our results show that selective decentralization achieves better learning performance than centralization when the noise level is low.
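The abstract's observation that the HJB equation reduces to a Riccati equation for linear systems can be made concrete with a short SciPy sketch. The two-state dynamics, cost weights, and noise level below are invented for illustration; only the Riccati-based controller structure follows from the abstract.

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Toy noisy linear system  dx/dt = A x + B u + w  with quadratic cost (Q, R).
A = np.array([[0.0, 1.0], [-1.0, -0.5]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])

# For linear dynamics the HJB equation reduces to the algebraic Riccati
# equation, which has a closed-form (numerically exact) solution:
P = solve_continuous_are(A, B, Q, R)
K = np.linalg.solve(R, B.T @ P)          # optimal feedback gain, u = -K x

# Simulate the controlled system under additive noise (Euler discretization).
rng = np.random.default_rng(1)
x, dt = np.array([1.0, 0.0]), 0.01
for _ in range(2000):
    u = -K @ x
    w = 0.05 * rng.standard_normal(2)    # process noise; level is hypothetical
    x = x + dt * (A @ x + B @ u + w)
print("final state (should hover near the origin):", x)
```

In the paper's setting, A and B would not be known in advance; they would first be estimated by system identification, with the Riccati step then applied to the identified model.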