Browsing by Subject "FPGA"
Item 3D Image Segmentation Implementation on FPGA Using EM/MPM Algorithm (2010-12)
Sun, Yan; Christopher, Lauren; Rizkalla, Maher E.; Salama, Paul
In this thesis, 3D image segmentation is targeted to a Xilinx Field Programmable Gate Array (FPGA), and verified with extensive simulation. Segmentation is performed using the Expectation-Maximization with Maximization of the Posterior Marginals (EM/MPM) Bayesian algorithm. This algorithm segments the 3D image using neighboring pixels based on a Markov Random Field (MRF) model. This iterative algorithm is designed, synthesized and simulated for the Xilinx FPGA, and a speedup of more than 100 times over standard desktop computer hardware is achieved. Three new techniques were the key to achieving this speed: pipelined computational cores, sixteen parallel data paths and a novel memory interface for maximizing the external memory bandwidth. Seven MPM segmentation iterations are matched to the external memory bandwidth required for a single source file read and a single segmented file write, plus a small amount of latency.

Item ASIC implemented MicroBlaze-based Coprocessor for Data Stream Management Systems (2020-05)
Balasubramanian, Linknath Surya; Lee, John J.; Christopher, Lauren A; Rizkalla, Maher E.
The drastic increase in Internet usage creates a need to process data in real time with higher efficiency than ever before. The Symbiote Coprocessor Unit (SCU), developed by Dr. Pranav Vaidya, is a hardware accelerator which has the potential to provide a data processing speedup of up to 150x compared with traditional data stream processors. However, the SCU implementation is very complex, fixed, and uses an outdated host interface, which limits future improvement. Mr. Tareq S. Alqaisi, an MSECE graduate from IUPUI, worked on curbing these limitations. In his architecture, he used a Xilinx MicroBlaze microcontroller to reduce the complexity of the SCU, along with a few other modifications.
The objective of this study is to make the SCU suitable for mass production while reducing its power consumption and delay. To accomplish this, the execution unit of the SCU has been implemented as an application-specific integrated circuit (ASIC), and modules such as ACG/OCG, sequential comparator, and D-word multiplier/divider are integrated into the design. Furthermore, techniques such as operand isolation, buffer insertion, cell swapping, and cell resizing are also integrated into the system. As a result, the new design attains a dynamic power of 67.9435 µW, compared with 74.0012 µW before power optimization, along with a small increase in static power, and a clock period of 39.47 ns, as opposed to 52.26 ns before timing optimization.

Item An Energy Efficient Register File Architecture for VLIW Streaming Processors on FPGAs (2019-12)
Vaidya, Pranav S.; Yadav, Avinash; Surya, Linknath; Lee, John J.; Electrical and Computer Engineering, School of Engineering and Technology
The design of a register file with large scalability, high bandwidth, and energy efficiency is the major issue in the execution of streaming Very Long Instruction Word (VLIW) processors on Field Programmable Gate Arrays (FPGAs). This problem arises because accessing multi-ported register files that use optimized on-chip memory resources while enabling maximum sharing of register operands is difficult, given that FPGA on-chip memory resources support only up to two ports. To handle this issue, an Inverted Distributed Register File (IDRF) architecture is proposed in this article. This new IDRF is compared with the existing Central Register File (CRF) and Distributed Register File (DRF) architectures on parameters such as kernel performance, circuit area, access delay, dynamic power, and energy. Experimental results show that IDRF matches the kernel performance of the CRF architecture and achieves a 10.4% improvement in kernel performance over the DRF architecture.
Similar experimental results related to the circuit area, dynamic power, and energy are discussed in this article.

Item Injector Waveform Monitoring of a Diesel Engine in Real-Time on a Hardware in the Loop Bench (2011-12)
Farooqi, Quazi Mohammed Rushaed; Anwar, Sohel; Wasfy, Tamer; Lee, Jaehwan (John)
This thesis presents the development, experimentation and validation of a reliable and robust system to monitor the injector pulse generated by an Engine Control Module (ECM) and send the corresponding fueling quantity to the real-time computer in a closed-loop Hardware In the Loop (HIL) bench. The system can also be easily calibrated for different engine platforms. The fueling quantity injected by the injectors is a crucial variable for running closed-loop HIL simulation to carry out performance testing of the engine, aftertreatment and other components of the vehicle. This research utilized the Field Programmable Gate Array (FPGA) and Direct Memory Access (DMA) transfer capability offered by the National Instruments (NI) Compact Reconfigurable Input-Output (cRIO) to achieve high-speed data acquisition and delivery. The research was conducted in three stages. The first stage was to develop the HIL bench for the research. The second stage was to determine the performance of the system with different threshold methods and different sampling speeds necessary to satisfy the required accuracy of the fueling quantity being monitored. The third stage was to study the error, and its variability, in the injected fueling quantity from pulse to pulse, from injector to injector, and between real injector stators and cheaper inductor load cells emulating the injectors, over different operating conditions, using a full factorial design of experiments and mixed-model Analysis Of Variance (ANOVA).
Different thresholds were tested to find the best Start of Injection (SOI) and End of Injection (EOI) thresholds, those that captured the injector “ontime” with the best reliability and accuracy. Experiments were carried out at various data acquisition rates to find the optimum data sampling rate, trading it off against the accuracy of the fueling quantity. The experiments also determined the expected error of a cheaper system, so that, if a test application is not sensitive to error in fueling quantity, a cheaper solution with a lower sampling rate and inductors as load cells can be used. The statistical analysis was carried out at the highest available sampling rate on both injectors and inductors with the best threshold method found in previous studies. The result clearly shows the factors that affect the error and the variability in the standard deviations of the error; it also shows the relation with the fixed and random factors. The real-time application developed for the HIL bench is capable of monitoring the injector waveform, using any fueling ontime table corresponding to the platform being tested, and delivering the fueling quantity in real time. The test bench made for this research is also capable of studying injectors of different types with the automated test sequence, without occupying the resources of fully capable closed-loop test benches for testing the ECM functionality.

Item Region-based Convolutional Neural Network and Implementation of the Network Through Zedboard Zynq (2019-05)
Islam, Md Mahmudul; Christopher, Lauren; Salama, Paul; Rizkalla, Maher
In autonomous driving, medical diagnosis, unmanned vehicles and many other new technologies, neural networks and computer vision have become extremely popular and influential. In particular, for classifying objects, convolutional neural networks (CNNs) are very efficient and accurate. One version is the Region-based CNN (RCNN).
This is our selected network design for a new implementation in an FPGA. This network identifies stop signs in an image. We successfully designed and trained an RCNN network in MATLAB and implemented it in hardware for use in an embedded real-world application. The hardware implementation has been achieved with a maximum FPGA utilization of 220 18k BRAMs, 92 DSP48Es, 8156 FFs, and 11010 LUTs, with an on-chip power consumption of 2.235 watts. The execution time on the FPGA is 0.31 ms, versus MATLAB execution times of 153 ms (on the computer) and 46 ms (on GPU).
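
The timings quoted in this abstract imply a speedup factor that the abstract does not state outright. The following is a minimal sketch of that arithmetic, using only the three timings reported above; the function and variable names are illustrative and do not come from the thesis.

```python
# Timings quoted in the abstract, in milliseconds.
FPGA_MS = 0.31
MATLAB_CPU_MS = 153.0
MATLAB_GPU_MS = 46.0


def speedup(baseline_ms: float, accelerated_ms: float) -> float:
    """Return how many times faster the accelerated run is than the baseline."""
    if accelerated_ms <= 0:
        raise ValueError("accelerated_ms must be positive")
    return baseline_ms / accelerated_ms


# Hypothetical names for the two ratios; the abstract reports only raw timings.
cpu_speedup = speedup(MATLAB_CPU_MS, FPGA_MS)
gpu_speedup = speedup(MATLAB_GPU_MS, FPGA_MS)

print(f"FPGA vs. MATLAB on CPU: about {cpu_speedup:.0f}x")  # about 494x
print(f"FPGA vs. MATLAB on GPU: about {gpu_speedup:.0f}x")  # about 148x
```

These ratios compare execution time only. They say nothing about accuracy, and the abstract does not specify the computer or GPU used for the MATLAB runs.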