Browsing by Subject "Performance analysis"
Now showing 1 - 2 of 2
Item
Deep Transferable Intelligence for Wearable Big Data Pattern Detection (2021-08)
Gangadharan, Kiirthanaa; Zhang, Qingxue; King, Brian S.; Chien, Yung-Ping S.
Biomechanical Big Data is of great significance to precision health applications, among which we take special interest in Physical Activity Detection (PAD). In this study, we have performed extensive research on deep learning-based PAD from biomechanical big data, focusing on the challenges raised by the need for real-time edge inference. First, considering there are many places we can place the motion sensors, we have thoroughly compared and analyzed the location difference in terms of deep learning-based PAD performance. We have further compared the difference among six sensor channels (3-axis accelerometer and 3-axis gyroscope). Second, we have selected the optimal sensor and the optimal sensor channel, which can not only provide sensor usage suggestions but also enable ultra-low-power application on the edge. Third, we have investigated innovative methods to minimize the training effort of the deep learning model, leveraging the transfer learning strategy. More specifically, we propose to pre-train a transferable deep learning model using the data from other subjects and then fine-tune the model using limited data from the target user. In such a way, we have found that, for the single-channel case, transfer learning can effectively increase the deep model performance even when the fine-tuning effort is very small. This research, demonstrated by comprehensive experimental evaluation, has shown the potential of ultra-low-power PAD with minimized sensor stream and minimized training effort.

Item
A Visual Analytics System for Optimizing Communications in Massively Parallel Applications (IEEE, 2017)
Fujiwara, Takanori; Malakar, Preeti; Reda, Khairi; Vishwanath, Venkatram; Papka, Michael E.; Ma, Kwan-Liu
Current and future supercomputers have tens of thousands of compute nodes interconnected with high-dimensional networks and complex network topologies for improved performance. Application developers are required to write scalable parallel programs in order to achieve high throughput on these machines. Application performance is largely determined by efficient inter-process communication. A common way to analyze and optimize performance is through profiling parallel codes to identify communication bottlenecks. However, understanding gigabytes of profile data is not a trivial task. In this paper, we present a visual analytics system for identifying the scalability bottlenecks and improving the communication efficiency of massively parallel applications. Visualization methods used in this system are designed to comprehend large-scale and varied communication patterns on thousands of nodes in complex networks such as the 5D torus and the dragonfly. We also present efficient rerouting and remapping algorithms that can be coupled with our interactive visual analytics design for performance optimization. We demonstrate the utility of our system with several case studies using three benchmark applications on two leading supercomputers. The mapping suggestion from our system led to a 38% improvement in hop-bytes for the MiniAMR application on 4,096 MPI processes.
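
For readers unfamiliar with the hop-bytes metric cited in the abstract above: it is commonly defined as the sum, over all communicating process pairs, of the bytes sent multiplied by the number of network hops between the nodes the two processes are mapped to. The sketch below is illustrative only and is not taken from the paper. It assumes minimal-path routing on a torus, and the 2D 4x4 torus, the traffic table, and both mappings are hypothetical values chosen to keep the example small (the paper itself refers to 5D torus and dragonfly networks).

```python
from typing import Dict, Sequence, Tuple

Coord = Tuple[int, ...]


def torus_hops(a: Coord, b: Coord, dims: Sequence[int]) -> int:
    """Minimal hop count between two torus nodes, using wraparound links."""
    return sum(min(abs(x - y), d - abs(x - y)) for x, y, d in zip(a, b, dims))


def hop_bytes(traffic: Dict[Tuple[int, int], int],
              mapping: Dict[int, Coord],
              dims: Sequence[int]) -> int:
    """Sum of (bytes sent) * (hops travelled) over all communicating rank pairs."""
    return sum(nbytes * torus_hops(mapping[src], mapping[dst], dims)
               for (src, dst), nbytes in traffic.items())


# Hypothetical inputs, for illustration only.
dims = (4, 4)                                        # 2D torus, 4 nodes per dimension
traffic = {(0, 1): 1000, (1, 2): 500, (0, 3): 200}   # (src rank, dst rank) -> bytes
naive = {0: (0, 0), 1: (2, 2), 2: (0, 1), 3: (3, 3)}
remapped = {0: (0, 0), 1: (0, 1), 2: (0, 2), 3: (3, 0)}

before = hop_bytes(traffic, naive, dims)
after = hop_bytes(traffic, remapped, dims)
print(f"hop-bytes before: {before}, after: {after}")
```

Placing heavily communicating ranks on neighbouring nodes lowers the total, which is the kind of effect a remapping suggestion aims for. The numbers here say nothing about the 38% figure reported in the abstract.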