Functional filter for whole-genome sequencing data identifies HHT and stress-associated non-coding SMAD4 polyadenylation site variants >5 kb from coding DNA

dc.contributor.author: Xiao, Sihao
dc.contributor.author: Kai, Zhentian
dc.contributor.author: Murphy, Daniel
dc.contributor.author: Li, Dongyang
dc.contributor.author: Patel, Dilip
dc.contributor.author: Bielowka, Adrianna M.
dc.contributor.author: Bernabeu-Herrero, Maria E.
dc.contributor.author: Abdulmogith, Awatif
dc.contributor.author: Mumford, Andrew D.
dc.contributor.author: Westbury, Sarah K.
dc.contributor.author: Aldred, Micheala A.
dc.contributor.author: Vargesson, Neil
dc.contributor.author: Caulfield, Mark J.
dc.contributor.author: Genomics England Research Consortium
dc.contributor.author: Shovlin, Claire L.
dc.contributor.department: Medicine, School of Medicine
dc.date.accessioned: 2024-04-10T12:22:57Z
dc.date.available: 2024-04-10T12:22:57Z
dc.date.issued: 2023
dc.description.abstract: Despite whole-genome sequencing (WGS), many cases of single-gene disorders remain unsolved, impeding diagnosis and preventative care for people whose disease-causing variants escape detection. Since early WGS data analytic steps prioritize protein-coding sequences, to simultaneously prioritize variants in non-coding regions rich in transcribed and critical regulatory sequences, we developed GROFFFY, an analytic tool that integrates coordinates for regions with experimental evidence of functionality. Applied to WGS data from solved and unsolved hereditary hemorrhagic telangiectasia (HHT) recruits to the 100,000 Genomes Project, GROFFFY-based filtration reduced the mean number of variants/DNA from 4,867,167 to 21,486, without deleting disease-causal variants. In three unsolved cases (two related), GROFFFY identified ultra-rare deletions within the 3' untranslated region (UTR) of the tumor suppressor SMAD4, where germline loss-of-function alleles cause combined HHT and colonic polyposis (MIM: 175050). Sited >5.4 kb distal to coding DNA, the deletions did not modify or generate microRNA binding sites, but instead disrupted the sequence context of the final cleavage and polyadenylation site necessary for protein production: By iFoldRNA, an AAUAAA-adjacent 16-nucleotide deletion brought the cleavage site into inaccessible neighboring secondary structures, while a 4-nucleotide deletion unfolded the downstream RNA polymerase II roadblock. SMAD4 RNA expression differed to control-derived RNA from resting and cycloheximide-stressed peripheral blood mononuclear cells. Patterns predicted the mutational site for an unrelated HHT/polyposis-affected individual, where a complex insertion was subsequently identified. In conclusion, we describe a functional rare variant type that impacts regulatory systems based on RNA polyadenylation. Extension of coding sequence-focused gene panels is required to capture these variants.
dc.eprint.version: Final published version
dc.identifier.citation: Xiao S, Kai Z, Murphy D, et al. Functional filter for whole-genome sequencing data identifies HHT and stress-associated non-coding SMAD4 polyadenylation site variants >5 kb from coding DNA. Am J Hum Genet. 2023;110(11):1903-1918. doi:10.1016/j.ajhg.2023.09.005
dc.identifier.uri: https://hdl.handle.net/1805/39862
dc.language.iso: en_US
dc.publisher: Elsevier
dc.relation.isversionof: 10.1016/j.ajhg.2023.09.005
dc.relation.journal: American Journal of Human Genetics
dc.rights: Attribution-NonCommercial-NoDerivatives 4.0 International
dc.rights.uri: http://creativecommons.org/licenses/by-nc-nd/4.0/
dc.source: PMC
dc.subject: 3′ untranslated region
dc.subject: Alternate exon use
dc.subject: CADD score
dc.subject: Combined annotation-dependent depletion score
dc.subject: CPA site
dc.subject: Cleavage and polyadenylation site
dc.subject: Cycloheximide
dc.subject: Hereditary hemorrhagic telangiectasia
dc.subject: Peripheral blood mononuclear cells
dc.subject: PBMCs
dc.subject: Rare variant
dc.title: Functional filter for whole-genome sequencing data identifies HHT and stress-associated non-coding SMAD4 polyadenylation site variants >5 kb from coding DNA
dc.type: Article
Files
Original bundle
Name: main.pdf
Size: 4.61 MB
Format: Adobe Portable Document Format