Lessons Learned for Identifying and Annotating Permissions in Clinical Consent Forms
Abstract
Background: The lack of machine-interpretable representations of consent permissions precludes the development of tools that can act on permissions across information ecosystems at scale.
Objectives: To report the process, results, and lessons learned while annotating permissions in clinical consent forms.
Methods: We conducted a retrospective analysis of clinical consent forms. We developed an annotation scheme following the MAMA (Model-Annotate-Model-Annotate) cycle and evaluated inter-annotator agreement (IAA) using observed agreement (A_o), weighted kappa (κ_w), and Krippendorff's α.
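For readers who want to reproduce agreement statistics of this kind, the sketch below shows one way to compute average pairwise observed agreement (A_o) and nominal Krippendorff's α for binary permission labels from three annotators. It is a minimal illustration, not the thesis's actual analysis code: the function names, the assumption of complete (no missing) annotations, and the toy labels are all hypothetical, and weighted kappa (κ_w) is omitted for brevity.

```python
from collections import Counter
from itertools import combinations

def observed_agreement(labels_by_annotator):
    """Average pairwise observed agreement (A_o) over all annotator pairs."""
    n_items = len(labels_by_annotator[0])
    pair_scores = [
        sum(x == y for x, y in zip(a, b)) / n_items
        for a, b in combinations(labels_by_annotator, 2)
    ]
    return sum(pair_scores) / len(pair_scores)

def krippendorff_alpha_nominal(labels_by_annotator):
    """Krippendorff's alpha for nominal labels, assuming no missing annotations."""
    n_annotators = len(labels_by_annotator)
    units = list(zip(*labels_by_annotator))   # one tuple of labels per sentence
    n_total = n_annotators * len(units)       # total number of pairable values
    marginals = Counter(v for unit in units for v in unit)
    # Observed disagreement: differing label pairs within each sentence.
    d_o = sum(
        sum(c[x] * c[y] for x in c for y in c if x != y) / (n_annotators - 1)
        for c in (Counter(unit) for unit in units)
    ) / n_total
    # Expected disagreement from the pooled label distribution.
    d_e = sum(
        marginals[x] * marginals[y] for x in marginals for y in marginals if x != y
    ) / (n_total * (n_total - 1))
    return 1.0 - d_o / d_e

# Toy data: three annotators label five sentences (1 = permission-sentence).
ann1 = [1, 0, 0, 1, 0]
ann2 = [1, 0, 0, 0, 0]
ann3 = [1, 0, 0, 1, 0]
print(f"A_o   = {observed_agreement([ann1, ann2, ann3]):.3f}")
print(f"alpha = {krippendorff_alpha_nominal([ann1, ann2, ann3]):.3f}")
```

Because non-permission sentences heavily outnumber permission-sentences, raw agreement can be high even when the chance-corrected α is only moderate, which is the pattern visible in the reported A_o = 0.944 versus α = 0.599.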
Results: The final dataset included 6,399 sentences from 134 clinical consent forms. Complete agreement was achieved for 5,871 sentences, comprising 211 positively identified and 5,660 negatively identified as permission-sentences by all three annotators (A_o = 0.944, Krippendorff's α = 0.599). These values reflect moderate to substantial IAA. Although permission-sentences share a common vocabulary and structure, disagreements between annotators are largely explained by lexical variability and ambiguity in sentence meaning.
Conclusion: Our findings point to the complexity of identifying permission-sentences within clinical consent forms. We present our results in light of lessons learned, which may serve as a starting point for developing tools for automated permission extraction.