Browsing by Subject "Reinforcement Learning"
Item: Deep Reinforcement Learning of IoT System Dynamics for Optimal Orchestration and Boosted Efficiency (2023-08)
Authors: Shi, Haowei; Zhang, Qingxue; King, Brian; Fang, Shiaofen

This thesis targets the orchestration challenge of wearable Internet of Things (IoT) systems: finding optimal system configurations in terms of energy efficiency, computing, and data transmission activities. We first investigate reinforcement learning in simulated IoT environments to demonstrate its effectiveness, and then study the algorithm on real-world wearable motion data to show its practical promise. More specifically, the first challenge arises in complex massive-device orchestration: the many devices and the gateway/server must be configured and managed together. On the wearable IoT devices, the complexity lies in their diverse energy budgets, computing efficiency, and so on; on the phone or server side, it lies in how this global diversity can be analyzed and how the system configuration can be optimized. We therefore propose a new reinforcement learning architecture, called boosted deep deterministic policy gradient, with enhanced actor-critic co-learning and multi-view state transformation. The proposed actor-critic co-learning allows for enhanced dynamics abstraction through a shared neural network component. Evaluated on a simulated massive-device task, the proposed deep reinforcement learning framework achieves much more efficient system configurations, with enhanced computing capability and improved energy efficiency. Second, we leverage real-world motion data to demonstrate the potential of reinforcement learning to optimally configure the motion sensors. We use sequential data estimation to obtain estimated readings for some sensors, saving energy because those sensors no longer need to be activated during the estimation intervals. We then apply the Deep Deterministic Policy Gradient algorithm to learn when to perform this estimation. This study provides a real-world demonstration of maximizing the energy efficiency of wearable IoT applications while maintaining data accuracy. Overall, this thesis advances wearable IoT system orchestration toward optimal system configurations.

Item: Integrating Data-driven Control Methods with Motion Planning: A Deep Reinforcement Learning-based Approach (2023-12)
Authors: Prabu, Avinash; Li, Lingxi; Chen, Yaobin; King, Brian; Tian, Renran

Path-tracking control is an integral part of motion planning in autonomous vehicles: a control system issues acceleration and steering-angle commands so that the vehicle's longitudinal and lateral motion accurately tracks a pre-defined trajectory. Extensive research has been conducted to address the growing need for efficient algorithms in this area. In this dissertation, a scenario- and machine-learning-based data-driven control approach is proposed for a path-tracking controller. First, a deep reinforcement learning model is developed to control longitudinal speed, with the Deep Deterministic Policy Gradient algorithm as the primary training algorithm. The main objective of this model is to maintain a safe distance from a lead vehicle (if present) or to track a velocity set by the driver.
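Both the boosted deep deterministic policy gradient architecture above and this longitudinal speed controller build on the Deep Deterministic Policy Gradient (DDPG) actor-critic formulation. The sketch below is only a generic illustration of such an actor-critic pair, assuming a PyTorch implementation and made-up state/action dimensions; it is not the architecture used in either thesis.

```python
# Hypothetical DDPG actor-critic sketch (PyTorch); the dimensions and layer sizes
# are illustrative assumptions, not the networks described in the theses above.
import torch
import torch.nn as nn

STATE_DIM, ACTION_DIM = 4, 1   # e.g., gap, relative speed, ego speed, set speed -> acceleration

class Actor(nn.Module):
    """Deterministic policy: maps a state to a bounded continuous action."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(STATE_DIM, 64), nn.ReLU(),
            nn.Linear(64, 64), nn.ReLU(),
            nn.Linear(64, ACTION_DIM), nn.Tanh(),  # action scaled to [-1, 1]
        )

    def forward(self, state):
        return self.net(state)

class Critic(nn.Module):
    """Q-network: scores a (state, action) pair."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(STATE_DIM + ACTION_DIM, 64), nn.ReLU(),
            nn.Linear(64, 64), nn.ReLU(),
            nn.Linear(64, 1),
        )

    def forward(self, state, action):
        return self.net(torch.cat([state, action], dim=-1))

actor, critic = Actor(), Critic()
state = torch.zeros(1, STATE_DIM)
action = actor(state)            # e.g., an acceleration command in [-1, 1]
q_value = critic(state, action)  # estimated return for taking that action
```

In the IoT thesis the "co-learning" variant additionally shares a network component between actor and critic; that coupling is not shown in this generic sketch.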
Second, a lateral steering controller is developed using neural networks to control the steering angle of the vehicle, with the main goal of following a reference trajectory. A path-planning algorithm is then developed using a hybrid A* planner. Finally, the longitudinal and lateral control models are coupled to obtain a complete path-tracking controller that follows a path generated by the hybrid A* algorithm across a wide range of vehicle speeds. State-of-the-art path-tracking controllers based on Model Predictive Control and Stanley control are also built to benchmark the proposed model. The results show the effectiveness of both proposed models in the same scenarios in terms of velocity error, lateral yaw-angle error, and lateral distance error, and the simulation results show that the developed hybrid A* algorithm performs well compared with state-of-the-art path-planning algorithms.

Item: Learning-based Attack and Defense on Recommender Systems (2021-08)
Authors: Palanisamy Sundar, Agnideven; Zou, Xukai; Li, Feng; Hu, Qin

The internet is home to massive volumes of valuable data that are constantly being created, making it difficult for users to find information relevant to them. In recent times, online users have been relying on the recommendations made by websites to narrow down their options, and online reviews have become an increasingly important factor in a customer's final choice. Unfortunately, attackers have found ways to manipulate both reviews and recommendations to mislead users. A recommendation system is a special type of information filtering system adopted by online vendors to provide suggestions to their customers based on their requirements. Collaborative filtering is one of the most widely used recommendation techniques; unfortunately, it is prone to shilling/profile-injection attacks, which alter the recommendation process to promote or demote a particular product. On the other hand, many spammers write deceptive reviews to change the credibility of a product or service. This work addresses these issues by treating the review-manipulation and shilling-attack scenarios independently. For shilling attacks, we build an efficient reinforcement learning-based shilling attack method. This method reduces the uncertainty in the item selection process and finds the optimal items to enhance attack reach while treating the recommender system as a black box. Such practical online attacks open new avenues for research in building more robust recommender systems. For review manipulation, we introduce a deep structure embedding approach that preserves highly nonlinear structural information and the dynamic aspects of user reviews to identify and cluster spam users. It is worth mentioning that, in experiments with real datasets, our method captures about 92% of all spam reviewers using an unsupervised learning approach.

Item: Selectively decentralized reinforcement learning (2018-05)
Authors: Nguyen, Thanh Minh; Mukhopadhyay, Snehasis

The main contributions of this thesis are a selectively decentralized method for solving multi-agent reinforcement learning problems and a discretized Markov decision process (MDP) algorithm for computing a sub-optimal learning policy in completely unknown learning and control problems.
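The discretized-MDP contribution just mentioned amounts to planning on a discretized model of an otherwise unknown problem. As a minimal illustration of that planning step only, the sketch below runs plain value iteration on a tiny hand-made tabular MDP; the state/action counts, transition and reward tables, and discount factor are placeholders, not the systems studied in the thesis.

```python
# Minimal tabular value iteration; a stand-in for the discretized-MDP planning step,
# with made-up transition/reward tables (3 states, 2 actions).
import numpy as np

n_states, n_actions, gamma = 3, 2, 0.95
P = np.zeros((n_actions, n_states, n_states))   # P[a, s, s'] transition probabilities
P[0] = [[0.9, 0.1, 0.0], [0.1, 0.8, 0.1], [0.0, 0.1, 0.9]]
P[1] = [[0.2, 0.8, 0.0], [0.0, 0.2, 0.8], [0.0, 0.0, 1.0]]
R = np.array([[0.0, 1.0], [0.0, 1.0], [1.0, 0.0]])  # R[s, a] immediate reward

V = np.zeros(n_states)
for _ in range(500):                               # iterate until (near) convergence
    Q = R + gamma * np.einsum("asn,n->sa", P, V)   # Q[s, a] = R[s, a] + gamma * sum_s' P[a, s, s'] V[s']
    V_new = Q.max(axis=1)
    if np.max(np.abs(V_new - V)) < 1e-8:
        V = V_new
        break
    V = V_new

policy = Q.argmax(axis=1)   # greedy (sub-optimal w.r.t. the true system) policy on the discretized model
print("V:", np.round(V, 3), "policy:", policy)
```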
These contributions tackle several challenges in multi-agent reinforcement learning: the unknown and dynamic nature of the learning environment, the difficulty of computing a closed-form solution to the learning problem, slow learning in large-scale systems, and the questions of how, when, and with whom the learning agents should communicate. The selectively decentralized method, which evaluates all possible communication strategies, not only increases learning speed and achieves better learning goals, but also learns a communication policy for each learning agent. Compared with other state-of-the-art approaches, this thesis's contributions offer two advantages. First, the selectively decentralized method can incorporate a wide range of well-known single-agent reinforcement learning algorithms, including the discretized MDP algorithm, whereas state-of-the-art approaches can usually be applied to only one class of algorithms. Second, the discretized MDP algorithm can compute a sub-optimal learning policy when the environment is described in a general nonlinear form, whereas other state-of-the-art approaches often assume the environment has a restricted form, particularly a feedback-linearizable form. This thesis also discusses several alternative approaches to multi-agent learning, including Multidisciplinary Optimization, and shows how the selectively decentralized method can successfully solve several real-world problems, particularly in mechanical and biological systems.
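As a purely illustrative sketch of the "evaluate all possible communication strategies" idea above, the snippet below enumerates the ways a small set of agents could be grouped for communication and keeps the grouping with the best score. The agent names, the partition enumeration, and the random stand-in for the performance estimate are all assumptions made for illustration; this is not the thesis's method.

```python
# Illustrative-only sketch of selective decentralization: enumerate groupings of a
# small agent set (each block = agents that communicate) and keep the best-scoring one.
from itertools import combinations
import random

agents = ["a1", "a2", "a3"]   # hypothetical agent names

def groupings(items):
    """Yield all partitions of the agent list (each block = agents that share information)."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for k in range(len(rest) + 1):
        for block_rest in combinations(rest, k):
            block = [first, *block_rest]
            remaining = [x for x in rest if x not in block_rest]
            for tail in groupings(remaining):
                yield [block, *tail]

def evaluate(grouping):
    # Placeholder: in the thesis this would be the learning performance achieved
    # under a given communication structure; here it is just a random score.
    return random.random()

best = max(groupings(agents), key=evaluate)
print("selected communication groups:", best)
```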