Flexible and Scalable Annotation Tool to Develop Scene Understanding Datasets
dc.contributor.author | Elahi, Md Fazle | |
dc.contributor.author | Tian, Renran | |
dc.contributor.author | Luo, Xiao | |
dc.contributor.department | Electrical and Computer Engineering, Purdue School of Engineering and Technology | |
dc.date.accessioned | 2024-04-29T11:42:26Z | |
dc.date.available | 2024-04-29T11:42:26Z | |
dc.date.issued | 2022 | |
dc.description.abstract | Recent progress in data-driven vision- and language-based tasks demands training datasets enriched with multiple modalities representing human intelligence. The link between text and image data is one of the crucial modalities for developing AI models. Developing such datasets in the video domain requires considerable effort from researchers and annotators (experts and non-experts). Researchers redesign annotation tools to extract knowledge from annotators to answer new research questions, and the whole process repeats for each new question, which is time-consuming. Yet over the last decade there has been little change in how researchers and annotators interact with the annotation process. We revisit the annotation workflow and propose the concept of an adaptable and scalable annotation tool. The concept emphasizes user interactivity to make designing the annotation process seamless and efficient. Researchers can conveniently add new modalities to, or augment, existing datasets using the tool, and annotators can efficiently link free-form text to image objects. For conducting human-subject experiments at any scale, the tool supports data collection for attaining group ground truth. We conducted a case study using a prototype tool between two groups with 74 non-expert participants. We find that interactive linking of free-form text to image objects feels intuitive and evokes a thought process that results in high-quality annotation. The new design shows an approximately 35% improvement in data annotation quality. In a UX evaluation, we received above-average positive feedback from 25 people regarding convenience, UI assistance, usability, and satisfaction. | |
dc.eprint.version | Final published version | |
dc.identifier.citation | Md Fazle Elahi, Renran Tian, and Xiao Luo. 2022. Flexible and Scalable Annotation Tool to Develop Scene Understanding Datasets. In Workshop on Human-In-the-Loop Data Analytics (HILDA '22), June 12, 2022, Philadelphia, PA, USA. ACM, New York, NY, USA, 7 pages. https://doi.org/10.1145/3546930.3547499 | |
dc.identifier.uri | https://hdl.handle.net/1805/40310 | |
dc.language.iso | en_US | |
dc.publisher | ACM | |
dc.relation.isversionof | 10.1145/3546930.3547499 | |
dc.relation.journal | HILDA '22: Proceedings of the Workshop on Human-In-the-Loop Data Analytics | |
dc.rights | Publisher Policy | |
dc.source | Author | |
dc.subject | Vision and Language | |
dc.subject | Scene Understanding | |
dc.subject | Data Annotation | |
dc.title | Flexible and Scalable Annotation Tool to Develop Scene Understanding Datasets | |
dc.type | Article |