The Internet of Things (IoT) has taken centre stage in the technology world as one of its fastest-growing markets: it has been predicted that there will be more than 30 billion connected devices by the end of 2020. Furthermore, the amount of data these devices produce is estimated at hundreds of trillions of gigabytes per year. In the near future, almost every device will be connected to the internet, from sensors, vehicles and wearable electronics to embedded systems such as refrigerators. This reliance on IoT devices creates a need for efficient ways to communicate with them and to charge them, particularly in the case of tiny IoT devices such as RFID tags or Bluetooth beacons. On the one hand, traditional batteries are not a viable option for minuscule devices. On the other hand, charging cables are not suitable either: purchasing one per device is expensive at scale, and cables are impractical in inaccessible areas. Hence, this project proposes deploying an unmanned ground vehicle (UGV) in designated areas to wirelessly charge, and collect data from, clusters of tiny IoT devices in an operation area, as shown below.
The objective is to explore several methods for planning the path of the unmanned ground vehicle: Mixed-Integer Non-Linear Programming (MINLP, used as a lower bound), Q-learning, and deep reinforcement learning (deep Q-learning). The planned path must allow the UGV to charge the devices while optimising both the energy it consumes and the total distance travelled. Results from these methods are included and compared extensively on the basis of their efficiency and speed, and ultimately the method that gives the best result in a real-world environment is chosen.
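To make the Q-learning approach concrete, the following is a minimal sketch of tabular Q-learning for UGV path planning. It assumes an illustrative 5x5 gridworld with a single device to charge, a per-step penalty standing in for energy consumption, and a terminal reward for reaching the device; the grid size, rewards and names are all assumptions for illustration, not the project's actual 40x40 environment.

```python
import random

# Illustrative toy setup: 5x5 grid, UGV starts at (0, 0),
# one IoT device to charge sits at (4, 4).
GRID = 5
START, GOAL = (0, 0), (4, 4)
ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0)]  # right, left, down, up

def step(state, action):
    """One UGV move; -1 per step models energy spent, +10 for reaching the device."""
    nx = min(max(state[0] + action[0], 0), GRID - 1)
    ny = min(max(state[1] + action[1], 0), GRID - 1)
    done = (nx, ny) == GOAL
    return (nx, ny), (10.0 if done else -1.0), done

Q = {}  # (state, action index) -> estimated value
alpha, gamma, eps = 0.5, 0.9, 0.1
random.seed(0)
for _ in range(500):
    s, done = START, False
    while not done:
        # epsilon-greedy action selection
        if random.random() < eps:
            a = random.randrange(len(ACTIONS))
        else:
            a = max(range(len(ACTIONS)), key=lambda i: Q.get((s, i), 0.0))
        s2, r, done = step(s, ACTIONS[a])
        # Q-learning update: move Q(s, a) towards r + gamma * max_a' Q(s', a')
        best = max(Q.get((s2, i), 0.0) for i in range(len(ACTIONS)))
        q = Q.get((s, a), 0.0)
        Q[(s, a)] = q + alpha * (r + gamma * best - q)
        s = s2

def greedy_path(limit=50):
    """Roll out the learned greedy policy from START."""
    s, path = START, [START]
    while s != GOAL and len(path) < limit:
        a = max(range(len(ACTIONS)), key=lambda i: Q.get((s, i), 0.0))
        s, _, _ = step(s, ACTIONS[a])
        path.append(s)
    return path

print(len(greedy_path()) - 1)  # shortest path from (0, 0) to (4, 4) takes 8 steps
```

Because each step costs -1 "energy", maximising return and minimising path length coincide here; with several devices or a distance-dependent charging cost, the reward would need extra terms.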
- Working in MATLAB, Python and TensorFlow; reading up on MINLP and reinforcement learning methods such as Q-learning, Dyna-Q and deep Q-learning
- November - December: development of a demo application
- Mid-January - February: applying deep reinforcement learning to a large state space (40x40 grid size)
- March - April: comparing the results obtained from the different approaches
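Dyna-Q, one of the methods listed above, extends plain Q-learning with model learning: every real transition is stored in a learned model and replayed as extra simulated "planning" updates, so fewer real interactions are needed. A minimal sketch on the same kind of illustrative toy gridworld (the grid size, rewards and planning-step count are assumptions, not the project's actual settings):

```python
import random

# Illustrative 5x5 gridworld: UGV at (0, 0), device to charge at (4, 4).
GRID, START, GOAL = 5, (0, 0), (4, 4)
ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0)]

def step(state, action):
    nx = min(max(state[0] + action[0], 0), GRID - 1)
    ny = min(max(state[1] + action[1], 0), GRID - 1)
    done = (nx, ny) == GOAL
    return (nx, ny), (10.0 if done else -1.0), done

Q = {}
model = {}  # learned model: (state, action index) -> (reward, next state)
alpha, gamma, eps, n_planning = 0.5, 0.9, 0.1, 20
random.seed(1)

def q_update(s, a, r, s2):
    best = max(Q.get((s2, i), 0.0) for i in range(len(ACTIONS)))
    q = Q.get((s, a), 0.0)
    Q[(s, a)] = q + alpha * (r + gamma * best - q)

for _ in range(100):
    s, done = START, False
    while not done:
        if random.random() < eps:
            a = random.randrange(len(ACTIONS))
        else:
            a = max(range(len(ACTIONS)), key=lambda i: Q.get((s, i), 0.0))
        s2, r, done = step(s, ACTIONS[a])
        q_update(s, a, r, s2)        # direct RL update from the real step
        model[(s, a)] = (r, s2)      # model learning: record the transition
        for _ in range(n_planning):  # planning: replay simulated experience
            (ps, pa), (pr, ps2) = random.choice(list(model.items()))
            q_update(ps, pa, pr, ps2)
        s = s2

def greedy_steps(limit=50):
    """Count steps taken by the learned greedy policy from START to GOAL."""
    s, n = START, 0
    while s != GOAL and n < limit:
        a = max(range(len(ACTIONS)), key=lambda i: Q.get((s, i), 0.0))
        s, _, _ = step(s, ACTIONS[a])
        n += 1
    return n

print(greedy_steps())  # recovers the 8-step shortest path
```

The 20 planning updates per real step are why Dyna-Q converges in far fewer episodes than the vanilla Q-learning loop, which matters when real interaction with the environment is expensive.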
UGV learning path in a simulation
The comparison of methods outlined in the final report showed that Q-learning should be used for robot path planning when the operation area is small and there is not enough time to train the UGV. Moreover, Dyna-Q should be combined with Q-learning when no information about the model is available, since Dyna-Q learns a model of the environment during training and uses it for additional planning updates. In all other cases, deep Q-learning should be used, because it supports a large grid size and is more efficient at reaching an optimal path with low energy consumption. Finally, the results from deep Q-learning are also suitable for real-world application, as they come quite close to the lower bound provided by MINLP. The deployment of such a UGV for wirelessly charging (and communicating with) IoT devices is therefore feasible. The algorithm presented in this project can also be used to create a simulated environment in which the UGV trains for many episodes without the additional cost of physical interaction with the environment; after sufficient training, the UGV will have the required knowledge before deployment. This will not only make cables obsolete but also play a big role in data collection and charging in sectors ranging from manufacturing to retail.
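For reference, the MINLP lower bound mentioned above can be thought of along the lines of a travelling-salesman-style tour over the device locations. The following is only an illustrative sketch of such a formulation (with MTZ subtour elimination), not necessarily the exact model used in the report:

```latex
% x_{ij} = 1 if the UGV travels from location i to j; d_{ij} is the
% corresponding distance/energy cost; node 0 is the UGV's start point.
\begin{align}
\min \quad & \sum_{i=0}^{n} \sum_{j \ne i} d_{ij}\, x_{ij} \\
\text{s.t.} \quad & \sum_{j \ne i} x_{ij} = 1 \quad \forall i,
\qquad \sum_{i \ne j} x_{ij} = 1 \quad \forall j, \\
& u_i - u_j + n\, x_{ij} \le n - 1 \quad \forall\, i, j \in \{1, \dots, n\},\ i \ne j, \\
& x_{ij} \in \{0, 1\}, \quad u_i \in \{1, \dots, n\}.
\end{align}
```

Nonlinear terms (e.g. charging time as a function of distance to a device) would be added to this objective or its constraints, which is what pushes the model from MILP into MINLP.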