Browsing by Subject "CUDA"
Now showing 1 - 3 of 3
Item Deep Learning with Go (2020-05)
Stinson, Derek L.; Ben Miled, Zina; King, Brian; Rizkalla, Maher

Current research in deep learning is primarily focused on using Python as a support language. Go, an emerging language with many benefits including native support for concurrency, has seen a rise in adoption over the past few years. However, it is not widely used to develop learning models due to the lack of supporting libraries and frameworks for model development. In this thesis, the use of Go for the development of neural network models in general, and convolutional neural networks in particular, is explored. The proposed study is based on a Go-CUDA implementation of neural network models called GoCuNets. This implementation is then compared to ConvNetGo, a Go-CPU deep learning implementation that takes advantage of Go's built-in concurrency. A comparison of these two implementations shows a significant performance gain when using GoCuNets.
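The thesis above attributes ConvNetGo's CPU performance to Go's built-in concurrency. As a rough illustration of that pattern (a minimal sketch, not ConvNetGo's actual API; the helper conv2DValid and the filter values are invented for this example), the snippet below computes each output channel of a small convolution in its own goroutine and joins them with a sync.WaitGroup:

```go
package main

import (
	"fmt"
	"sync"
)

// conv2DValid computes a single-channel "valid" 2D convolution of in with kernel.
func conv2DValid(in, kernel [][]float64) [][]float64 {
	kh, kw := len(kernel), len(kernel[0])
	oh, ow := len(in)-kh+1, len(in[0])-kw+1
	out := make([][]float64, oh)
	for y := 0; y < oh; y++ {
		out[y] = make([]float64, ow)
		for x := 0; x < ow; x++ {
			var sum float64
			for ky := 0; ky < kh; ky++ {
				for kx := 0; kx < kw; kx++ {
					sum += in[y+ky][x+kx] * kernel[ky][kx]
				}
			}
			out[y][x] = sum
		}
	}
	return out
}

func main() {
	input := [][]float64{
		{1, 2, 3, 4},
		{5, 6, 7, 8},
		{9, 10, 11, 12},
		{13, 14, 15, 16},
	}
	// One kernel per output channel; each channel is computed concurrently.
	kernels := [][][]float64{
		{{1, 0}, {0, -1}},            // edge-like filter
		{{0.25, 0.25}, {0.25, 0.25}}, // box blur
	}
	outputs := make([][][]float64, len(kernels))
	var wg sync.WaitGroup
	for i, k := range kernels {
		wg.Add(1)
		go func(i int, k [][]float64) { // one goroutine per output channel
			defer wg.Done()
			outputs[i] = conv2DValid(input, k)
		}(i, k)
	}
	wg.Wait()
	fmt.Println(outputs)
}
```

Fanning work out per output channel is one natural granularity for this pattern; a production library would typically also tile the spatial dimensions and reuse a worker pool rather than spawning goroutines on every call.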
Item GPU-OSDDA: A Bit-Vector GPU-based Deadlock Detection Algorithm for Single-Unit Resource Systems (Taylor & Francis, 2015-09)
Abell, Stephen; Nhan, Do; Lee, John J.; Department of Electrical and Computer Engineering

This article presents a GPU-based single-unit deadlock detection methodology and its algorithm, GPU-OSDDA. Our GPU-based design utilizes the parallel hardware of the GPU to perform computations, and is thus able to overcome the major limitation of prior hardware-based approaches: it can handle thousands of processes and resources while achieving real-world run-times. By utilizing a bit-vector technique for storing algorithm matrices and designing novel, efficient algorithmic methods, we not only reduce memory usage dramatically but also achieve two orders of magnitude speedup over CPU equivalents. Additionally, GPU-OSDDA acts as an interactive service to the CPU, because all of the aforementioned computations and matrix management techniques take place on the GPU, requiring minimal interaction with the CPU. GPU-OSDDA is implemented on three GPU cards: Tesla C2050, Tesla K20c, and Titan X. Our design shows overall speedups of 6-595X over CPU equivalents.

Item Parallel acceleration of deadlock detection and avoidance algorithms on GPUs (2013-08)
Abell, Stephen W.; Lee, Jaehwan John; King, Brian; Chien, Stanley

Current mainstream computing systems have become increasingly complex. Most of these have Central Processing Units (CPUs) that invoke multiple threads for their computing tasks. The growing issue with these systems is resource contention, and with resource contention comes the risk of encountering a deadlock in the system. Various software and hardware approaches exist that implement deadlock detection/avoidance techniques; however, they lack either the speed or the problem-size capability needed for real-time systems. The research conducted for this thesis aims to resolve the issues present in past approaches by converging the two platforms (software and hardware) by means of the Graphics Processing Unit (GPU). Presented in this thesis are two GPU-based deadlock detection algorithms and one GPU-based deadlock avoidance algorithm: (i) GPU-OSDDA, a GPU-based single-unit resource deadlock detection algorithm; (ii) GPU-LMDDA, a GPU-based multi-unit resource deadlock detection algorithm; and (iii) GPU-PBA, a GPU-based deadlock avoidance algorithm. Both GPU-OSDDA and GPU-LMDDA utilize the Resource Allocation Graph (RAG) to represent resource allocation status in the system. The RAG, however, is represented using integer-length bit-vectors. The advantages brought forth by this approach are many: (i) less memory is required for algorithm matrices, (ii) 32 computations are performed per instruction (in most cases), and (iii) the algorithms can handle large numbers of processes and resources. The deadlock detection algorithms also require minimal interaction with the CPU by implementing matrix storage and algorithm computations on the GPU, thus providing an interactive-service type of behavior. As a result of this approach, both algorithms were able to achieve speedups of over two orders of magnitude relative to their serial CPU implementations (3.17-317.42x for GPU-OSDDA and 37.17-812.50x for GPU-LMDDA). Lastly, GPU-PBA is the first parallel deadlock avoidance algorithm implemented on the GPU. While it does not achieve a two-orders-of-magnitude speedup over its CPU implementation, it does provide a platform for future deadlock avoidance research on the GPU.
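The integer-length bit-vector idea described in the two abstracts above is easy to sketch on the CPU side in Go (a hypothetical layout for illustration only; the actual algorithms store these words and run the computations on the GPU as CUDA kernels). Packing each RAG adjacency row into 32-bit words means a single bitwise instruction or population count touches 32 edges at once, which is where the memory savings and the "32 computations per instruction" figure come from:

```go
package main

import (
	"fmt"
	"math/bits"
)

// bitRow packs one row of a Resource Allocation Graph adjacency
// matrix into 32-bit words, so a single bitwise instruction
// inspects 32 edges at once.
type bitRow []uint32

func newBitRow(n int) bitRow { return make(bitRow, (n+31)/32) }

func (r bitRow) set(j int)   { r[j/32] |= 1 << (j % 32) }
func (r bitRow) clear(j int) { r[j/32] &^= 1 << (j % 32) }

// outDegree counts set bits word by word (32 edges per popcount).
func (r bitRow) outDegree() int {
	total := 0
	for _, w := range r {
		total += bits.OnesCount32(w)
	}
	return total
}

// isSink reports whether the node has no outgoing edges. In a
// standard graph-reduction style of deadlock detection, such nodes
// are removed repeatedly, and a deadlock exists if unremovable
// nodes remain.
func (r bitRow) isSink() bool {
	for _, w := range r {
		if w != 0 {
			return false
		}
	}
	return true
}

func main() {
	// Hypothetical request edges from one process to 100 resources.
	row := newBitRow(100)
	row.set(3)
	row.set(64)
	fmt.Println(row.isSink(), row.outDegree()) // false 2
	row.clear(3)
	row.clear(64)
	fmt.Println(row.isSink()) // true
}
```

For n nodes, each row shrinks from n individually addressed entries to n/32 words, which is the dramatic memory reduction both abstracts cite; on a GPU, each thread can then operate on whole words of the matrix in parallel.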