Classifying the unknown: Insect identification with deep hierarchical Bayesian learning

Date
2023
Language
American English
Embargo Lift Date
Department
Committee Members
Degree
Degree Year
Department
Grantor
Journal Title
Journal ISSN
Volume Title
Found At
Wiley
Abstract
  1. Classifying insect species involves a tedious process of identifying distinctive morphological insect characters by taxonomic experts. Machine learning can harness the power of computers to potentially create an accurate and efficient method for performing this task at scale, given that its analytical processing can be more sensitive to subtle physical differences in insects, which experts may not perceive. However, existing machine learning methods are designed to only classify insect samples into described species, thus failing to identify samples from undescribed species.

  2. We propose a novel deep hierarchical Bayesian model for insect classification, given the taxonomic hierarchy inherent in insects. This model can classify samples of both described and undescribed species; described samples are assigned a species while undescribed samples are assigned a genus, which is a pivotal advancement over just identifying them as outliers. We demonstrated this proof of concept on a new database containing paired insect image and DNA barcode data from four insect orders, including 1040 species, which far exceeds the number of species used in existing work. A quarter of the species were excluded from the training set to simulate undescribed species.

  3. With the proposed classification framework using combined image and DNA data in the model, species classification accuracy for described species was 96.66% and genus classification accuracy for undescribed species was 81.39%. Including both data sources in the model resulted in significant improvement over including image data only (39.11% accuracy for described species and 35.88% genus accuracy for undescribed species), and modest improvement over including DNA data only (73.39% genus accuracy for undescribed species).

  4. Unlike current machine learning methods, the proposed deep hierarchical Bayesian learning approach can simultaneously classify samples of both described and undescribed species, a functionality that could become instrumental in biodiversity monitoring across the globe. This framework can be customized for any taxonomic classification problem for which image and DNA data can be obtained, thus making it relevant for use across all biological kingdoms.

Description
item.page.description.tableofcontents
item.page.relation.haspart
Cite As
Badirli, S., Picard, C. J., Mohler, G., Richert, F., Akata, Z., & Dundar, M. (2023). Classifying the unknown: Insect identification with deep hierarchical Bayesian learning. Methods in Ecology and Evolution, 14(6), 1515–1530. https://doi.org/10.1111/2041-210X.14104
ISSN
Publisher
Series/Report
Sponsorship
Major
Extent
Identifier
Relation
Journal
Source
Alternative Title
Type
Article
Number
Volume
Conference Dates
Conference Host
Conference Location
Conference Name
Conference Panel
Conference Secretariat Location
Version
Full Text Available at
This item is under embargo {{howLong}}
Collections