PseudoFuN: Deriving functional potentials of pseudogenes from integrative relationships with genes and microRNAs across 32 cancers

dc.contributor.author: Johnson, Travis S.
dc.contributor.author: Li, Sihong
dc.contributor.author: Franz, Eric
dc.contributor.author: Huang, Zhi
dc.contributor.author: Dan Li, Shuyu
dc.contributor.author: Campbell, Moray J.
dc.contributor.author: Huang, Kun
dc.contributor.author: Zhang, Yan
dc.contributor.department: Medicine, School of Medicine [en_US]
dc.date.accessioned: 2019-08-21T15:43:15Z
dc.date.available: 2019-08-21T15:43:15Z
dc.date.issued: 2019-05
dc.description.abstract: BACKGROUND: Long thought to be "relics" of evolution, pseudogenes have only recently become of medical interest for their regulatory roles in cancer. Often, these regulatory roles are a direct by-product of their close sequence homology to protein-coding genes. Novel pseudogene-gene (PGG) functional associations can be identified through the integration of biomedical data, such as sequence homology, functional pathways, gene expression, pseudogene expression, and microRNA expression. However, not all of this information has been integrated, and almost all previous pseudogene studies relied on 1:1 pseudogene-parent gene relationships without leveraging other homologous genes/pseudogenes.
RESULTS: We produce PGG families that expand beyond the current 1:1 paradigm. First, we construct expansive PGG databases by (i) CUDAlign graphics processing unit (GPU) accelerated local alignment of all pseudogenes to gene families (totaling 1.6 billion individual local alignments and >40,000 GPU hours) and (ii) BLAST-based assignment of pseudogenes to gene families. Second, we create an open-source web application (PseudoFuN [Pseudogene Functional Networks]) to search for integrative functional relationships of sequence homology, microRNA expression, gene expression, pseudogene expression, and gene ontology. We produce four "flavors" of CUDAlign-based databases (>462,000,000 PGG pairwise alignments and 133,770 PGG families) that can be queried and downloaded using PseudoFuN. These databases are consistent with previous 1:1 PGG annotation and are also much more powerful, including millions of de novo PGG associations. For example, we find multiple known (e.g., miR-20a-PTEN-PTENP1) and novel (e.g., miR-375-SOX15-PPP4R1L) microRNA-gene-pseudogene associations in prostate cancer. PseudoFuN provides a "one stop shop" for identifying and visualizing thousands of potential regulatory relationships related to pseudogenes in The Cancer Genome Atlas cancers.
CONCLUSIONS: Thousands of new PGG associations can be explored in the context of microRNA-gene-pseudogene co-expression and differential expression with a simple-to-use online tool by bioinformaticians and oncologists alike. [en_US]
dc.identifier.citation: Johnson, T. S., Li, S., Franz, E., Huang, Z., Dan Li, S., Campbell, M. J., … Zhang, Y. (2019). PseudoFuN: Deriving functional potentials of pseudogenes from integrative relationships with genes and microRNAs across 32 cancers. GigaScience, 8(5), giz046. doi:10.1093/gigascience/giz046 [en_US]
dc.identifier.uri: https://hdl.handle.net/1805/20459
dc.language.iso: en_US [en_US]
dc.publisher: Oxford University Press [en_US]
dc.relation.isversionof: 10.1093/gigascience/giz046 [en_US]
dc.relation.journal: GigaScience [en_US]
dc.rights: Attribution-NonCommercial-NoDerivs 3.0 United States [*]
dc.rights.uri: http://creativecommons.org/licenses/by-nc-nd/3.0/us/ [*]
dc.source: PMC [en_US]
dc.subject: Competing endogenous RNA [en_US]
dc.subject: Database [en_US]
dc.subject: Functional prediction [en_US]
dc.subject: Gene regulation [en_US]
dc.subject: Graphics processing unit [en_US]
dc.subject: High-performance computing [en_US]
dc.subject: Network analysis [en_US]
dc.subject: Pseudogenes [en_US]
dc.title: PseudoFuN: Deriving functional potentials of pseudogenes from integrative relationships with genes and microRNAs across 32 cancers [en_US]
dc.type: Article [en_US]
Files
Original bundle
Name: giz046.pdf
Size: 2.5 MB
Format: Adobe Portable Document Format
License bundle
Name: license.txt
Size: 1.99 KB
Format: Item-specific license agreed upon to submission