Browsing by Subject "Neural network"
Item Comparison of Supervised Machine Learning and Probabilistic Approaches for Record Linkage (AMIA Informatics summit 2019 Conference Proceedings., 2020-03-25) McNutt, Andrew T.; Grannis, Shaun J.; Bo, Na; Xu, Huiping; Kasthurirathne, Suranga N.
Record linkage is vital to prevent fragmentation of patient data. Machine learning approaches present considerable potential for record linkage. We compared the performance of three machine learning algorithms to an established probabilistic record linkage technique. Machine learning approaches exhibited results that were comparable or statistically superior to the established probabilistic approach. It is unclear whether the cost of manually reviewing datasets for supervised learning is justified by the performance improvements they yield.

Item Direct prediction of profiles of sequences compatible to a protein structure by neural networks with fragment-based local and energy-based nonlocal profiles (Wiley Online Library, 2014-10) Li, Zhixiu; Yang, Yuedong; Faraggi, Eshel; Zhou, Jian; Zhou, Yaoqi; Department of BioHealth Informatics, IU School of Informatics and Computing
Locating sequences compatible with a protein structural fold is the well-known inverse protein-folding problem. While significant progress has been made, the success rate of protein design remains low. As a result, a library of designed sequences or profile of sequences is currently employed for guiding experimental screening or directed evolution. Sequence profiles can be computationally predicted by iterative mutations of a random sequence to produce energy-optimized sequences, or by combining sequences of structurally similar fragments in a template library. The latter approach is computationally more efficient but yields less accurate profiles than the former because it lacks tertiary structural information.
Here we present a method called SPIN that predicts Sequence Profiles by Integrated Neural network based on fragment-derived sequence profiles and structure-derived energy profiles. SPIN improves over the fragment-derived profile by 6.7% (from 23.6 to 30.3%) in sequence identity between predicted and wild-type sequences. The method also reduces the number of residues in low-complexity regions by 15.7% and has a significantly better balance of hydrophilic and hydrophobic residues at the protein surface. The accuracy of the sequence profiles obtained is comparable to that of profiles generated by the protein design program RosettaDesign 3.5. This highly efficient method for predicting sequence profiles from structures will be useful as a single-body scoring term for improving scoring functions used in protein design and fold recognition. It also complements protein design programs in guiding experimental design of the sequence library for screening and directed evolution of designed sequences. The SPIN server is available at http://sparks-lab.org.

Item Enhancing Precision of Object Detectors: Bridging Classification and Localization Gaps for 2D and 3D Models (2024-05) Ravi, Niranjan; El-Sharkawy, Mohamed; Rizkalla, Maher E.; Li, Lingxi; King, Brian S.
Artificial Intelligence (AI) has revolutionized and accelerated significant advancements in various fields such as healthcare, finance, education, agriculture and the development of autonomous vehicles. We are rapidly approaching Level 5 Autonomy due to recent developments in autonomous technology, including self-driving cars, robot navigation, smart traffic monitoring systems, and dynamic routing. This success has been made possible by Deep Learning technologies and advanced Computer Vision (CV) algorithms. With the help of perception sensors such as Camera, LiDAR and RADAR, CV algorithms enable a self-driving vehicle to interact with the environment and make intelligent decisions.
Object detection lays the foundations for various applications, such as collision and obstacle avoidance, lane detection, pedestrian and vehicular safety, and object tracking. Object detection has two significant components: image classification and object localization. In recent years, enhancing the performance of 2D and 3D object detectors has sparked interest in the research community. This research aims to resolve the drawbacks associated with localization loss estimation of 2D and 3D object detectors by addressing the bounding box regression problem, addressing the class imbalance issue affecting the confidence loss estimation, and finally proposing a dynamic cross-model 3D hybrid object detector with enhanced localization and confidence loss estimation. This research aims to address challenges in object detectors through four key contributions. In the first part, we aim to address the problems associated with the image classification component of 2D object detectors. Class imbalance is a common problem associated with supervised training. Common causes are noisy data, a scene with a tiny object surrounded by background pixels, or a dense scene with too many objects. These scenarios can produce many negative samples compared to positive ones, affecting the network learning and reducing the overall performance. We examined these drawbacks and proposed an Enhanced Hard Negative Mining (EHNM) approach, which utilizes anchor boxes with 20% to 50% overlap and positive and negative samples to boost performance. The efficiency of the proposed EHNM was evaluated using the Single Shot Multibox Detector (SSD) architecture on the PASCAL VOC dataset, indicating that the detection accuracy of tiny objects increased by 3.9% and 4% and the overall accuracy improved by 0.9%. To address localization loss, our second approach investigates drawbacks associated with existing bounding box regression problems, such as poor convergence and incorrect regression.
We analyzed various cases, such as when one object encloses another, when two objects share the same centre, and when two objects share the same centre and have similar aspect ratios. During our analysis, we observed that the existing Intersection over Union (IoU) loss and its variants fail to address them. We proposed two new loss functions, Improved Intersection Over Union (IIoU) and Balanced Intersection Over Union (BIoU), to enhance performance and minimize computational efforts. Two variants of the YOLOv5 model, YOLOv5n6 and YOLOv5s, were utilized to demonstrate the superior performance of IIoU on PASCAL VOC and CGMU datasets. With the help of ROS and NVIDIA devices, inference speed was observed in real time. Extensive experiments were performed to evaluate the performance of BIoU on object detectors. The evaluation results indicated that the MASK_RCNN network trained on the COCO dataset, the YOLOv5n6 network trained on SKU-110K, and YOLOv5x trained on the custom e-scooter dataset demonstrated increases of 3.70% on small objects, 6.20% on 55% overlap, and 9.03% on 80% overlap. In the earlier parts, we primarily focused on 2D object detectors. Owing to this success, we extended the scope of our research to 3D object detectors in the later parts. The third portion of our research aims to solve bounding box problems associated with 3D rotated objects. Existing axis-aligned loss functions suffer a performance gap if the objects are rotated. We enhanced the earlier proposed IIoU loss by considering two additional parameters: the objects’ Z-axis and rotation angle. These two parameters aid in localizing the object in 3D space. Evaluation was performed on LiDAR and Fusion methods on 3D KITTI and nuScenes datasets. Once we addressed the drawbacks associated with confidence and localization loss, we further explored ways to increase the performance of cross-model 3D object detectors.
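As an illustration of the enclosed-box case mentioned above, the following minimal sketch computes the standard axis-aligned IoU. It is not the IIoU or BIoU loss, whose formulas the abstract does not give; the function name `iou` and the `(x_min, y_min, x_max, y_max)` box layout are assumptions made for this example.

```python
def iou(box_a, box_b):
    """Standard IoU for axis-aligned boxes given as (x_min, y_min, x_max, y_max).

    Illustrative only: this is plain IoU, not the IIoU/BIoU losses
    named in the abstract.
    """
    # Corners of the intersection rectangle.
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])

    # Width and height are clamped at zero when the boxes do not overlap.
    inter_w = max(0.0, ix_max - ix_min)
    inter_h = max(0.0, iy_max - iy_min)
    inter = inter_w * inter_h

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


outer = (0.0, 0.0, 4.0, 4.0)
centred = (1.0, 1.0, 3.0, 3.0)  # 2x2 box sharing the outer box's centre
corner = (0.0, 0.0, 2.0, 2.0)   # same 2x2 box pushed into a corner

print(iou(outer, centred), iou(outer, corner))
```

Both placements of the inner box give the same value, so a loss of the form 1 - IoU cannot tell a well-centred prediction from a badly placed one when one box encloses the other. Similarly, any two non-overlapping boxes score 0 regardless of how far apart they are.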
We discovered from previous studies that perception sensors are vulnerable to harsh environmental conditions, sunlight, and motion blur. In the final portion of our research, we propose a hybrid 3D cross-model detection network (MAEGNN) equipped with Masked Autoencoders (MAE) and Graph Neural Networks (GNN) along with the earlier proposed IIoU and EHNM. The performance evaluation of MAEGNN on the KITTI validation dataset and KITTI test set yielded detection accuracies of 69.15%, 63.99%, 58.46% and 40.85%, 37.37% on 3D pedestrians with an overlap of 50%. This hybrid detector overcomes the challenges of localization error and confidence estimation and outperforms many state-of-the-art 3D object detectors for autonomous platforms.

Item GENN: A GEneral Neural Network for Learning Tabulated Data with Examples from Protein Structure Prediction (Springer, 2015) Faraggi, Eshel; Kloczkowski, Andrzej; Biochemistry and Molecular Biology, School of Medicine
We present a GEneral Neural Network (GENN) for learning trends from existing data and making predictions of unknown information. The main novelty of GENN is in its generality, simplicity of use, and its specific handling of windowed input/output. Its main strength is its efficient handling of the input data, enabling learning from large datasets. GENN is built on a two-layered neural network and has the option to use separate input–output pairs or window-based data using data structures to efficiently represent input–output pairs. The program was tested on predicting the accessible surface area of globular proteins, scoring proteins according to similarity to native, and predicting protein disorder, and has performed remarkably well. In this paper we describe the program and its use. Specifically, we give as an example the construction of a similarity-to-native protein scoring function using GENN.
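To make the idea of windowed input concrete, the sketch below builds one flattened input vector per sequence position from that position and its neighbours. This is a generic illustration and not GENN's own data structures or file format; the function name `windowed_inputs`, the `half_width` parameter, and the zero padding at the ends are all assumptions.

```python
def windowed_inputs(features, half_width, pad=0.0):
    """Build one flat input vector per position from a sliding window.

    features:   list of per-position feature lists (e.g. one row per residue)
    half_width: number of neighbours taken on each side of the position
    pad:        value used for positions that fall outside the sequence

    Hypothetical helper for illustration; not part of GENN.
    """
    n_feat = len(features[0])
    pad_row = [pad] * n_feat
    # Pad both ends so every position has a full window.
    padded = [pad_row] * half_width + list(features) + [pad_row] * half_width

    windows = []
    for i in range(len(features)):
        window = padded[i : i + 2 * half_width + 1]
        # Flatten the window rows into a single input vector.
        windows.append([value for row in window for value in row])
    return windows


# Three positions with one feature each, window of one neighbour per side.
print(windowed_inputs([[1.0], [2.0], [3.0]], half_width=1))
```

Each output vector has `(2 * half_width + 1) * n_feat` entries, so the input-layer size of a network trained on such data is fixed by the window width rather than by the sequence length.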
The source code and Linux executables for GENN are available from Research and Information Systems at http://mamiris.com and from the Battelle Center for Mathematical Medicine at http://mathmed.org. Bugs and problems with the GENN program should be reported to EF.

Item Short- and Long-Term Prediction of the Post-Pubertal Mandibular Length and Y-Axis in Females Utilizing Machine Learning (MDPI, 2023-08-22) Parrish, Matthew; O’Connell, Ella; Eckert, George; Hughes, Jay; Badirli, Sarkhan; Turkkahraman, Hakan; Orthodontics and Oral Facial Genetics, School of Dentistry
The aim of this study was to create a novel machine learning (ML) algorithm for predicting the post-pubertal mandibular length and Y-axis in females. Cephalometric data from 176 females with Angle Class I occlusion were used to train and test seven ML algorithms. For all ML methods tested, the mean absolute errors (MAEs) of mandibular length and Y-axis for the 2-year prediction ranged from 2.78 to 5.40 mm and 0.88 to 1.48 degrees, respectively. For the 4-year prediction, MAEs of mandibular length and Y-axis ranged from 3.21 to 4.00 mm and 1.19 to 5.12 degrees, respectively. The most predictive factors for post-pubertal mandibular length were mandibular length at previous timepoints, age, sagittal positions of the maxillary and mandibular skeletal bases, mandibular plane angle, and anterior and posterior face heights. The most predictive factors for post-pubertal Y-axis were Y-axis at previous timepoints, mandibular plane angle, and sagittal positions of the maxillary and mandibular skeletal bases. ML methods were identified as capable of predicting mandibular length within 3 mm and Y-axis within 1 degree.
Compared to each other, all of the ML algorithms were similarly accurate, with the exception of the multilayer perceptron regressor.

Item Training Machine Learning Potentials for Reactive Systems: A Colab Tutorial on Basic Models (Wiley, 2024) Pan, Xiaoliang; Snyder, Ryan; Wang, Jia-Ning; Lander, Chance; Wickizer, Carly; Van, Richard; Chesney, Andrew; Xue, Yuanfei; Mao, Yuezhi; Mei, Ye; Pu, Jingzhi; Shao, Yihan; Chemistry and Chemical Biology, School of Science
In the last several years, there has been a surge in the development of machine learning potential (MLP) models for describing molecular systems. We are interested in a particular area of this field - the training of system-specific MLPs for reactive systems - with the goal of using these MLPs to accelerate free energy simulations of chemical and enzyme reactions. To help new members in our labs become familiar with the basic techniques, we have put together a self-guided Colab tutorial (https://cc-ats.github.io/mlp_tutorial/), which we expect will also be useful to other young researchers in the community. Our tutorial begins with the introduction of simple feedforward neural network (FNN) and kernel-based (using Gaussian process regression, GPR) models by fitting the two-dimensional Müller-Brown potential. Subsequently, two simple descriptors are presented for extracting features of molecular systems: symmetry functions (including the ANI variant) and embedding neural networks (such as DeepPot-SE). Lastly, these features will be fed into FNN and GPR models to reproduce the energies and forces for the molecular configurations in a Claisen rearrangement reaction.

Item Visual Analytics and Interactive Machine Learning for Human Brain Data (2019-08) Li, Huang; Fang, Shiaofen; Shen, Li; Mukhopadhyay, Snehasis
This study mainly focuses on applying visualization techniques to human brain data for data exploration, quality control, and hypothesis discovery.
It consists of two main parts: multi-modal data visualization and interactive machine learning. For multi-modal data visualization, a major challenge is how to integrate structural, functional and connectivity data to form a comprehensive visual context. We develop a new integrated visualization solution for brain imaging data by combining scientific and information visualization techniques within the context of the same anatomic structure. For interactive machine learning, we propose a new visual analytics approach. In this approach, multi-dimensional data visualization techniques are employed to facilitate user interactions with the machine learning process. This allows dynamic user feedback in different forms, such as data selection, data labeling, and data correction, to enhance the efficiency of model building.