Session Introduction

Imaging genomics is an emerging research field, where integrative analysis of imaging and omics data is performed to provide new insights into the phenotypic characteristics and genetic mechanisms of normal and/or disordered biological structures and functions, and to impact the development of new diagnostic, therapeutic and preventive approaches. The Imaging Genomics Session at PSB 2017 aims to encourage discussion on fundamental concepts, new methods and innovative applications in this young and rapidly evolving field.


Introduction
Imaging genomics [1][2][3][4][5][6][7][8][9] is an emerging research field that has arisen from recent advances in acquiring high-throughput omics data and multimodal imaging data. Its major task is to perform integrative analysis of genomics data and structural, functional and molecular imaging data. Bridging imaging and genomic factors and exploring their connections has the potential to provide important new insights into the phenotypic characteristics and genetic mechanisms of normal and/or disordered biological structures and functions, which in turn will impact the development of new diagnostic, therapeutic and preventive approaches.
Bioinformatics strategies for imaging genomics, a relatively young field [1][2][3][4], have been evolving rapidly. Early studies started with the simplest strategy: examining pairwise univariate associations [10][11] between genetic markers and imaging phenotypes. To identify more flexible associations involving multiple genetic markers and multiple imaging phenotypes, recent studies employed multiple regression and multivariate models [12], sometimes coupled with powerful machine learning approaches [13] and valuable prior knowledge [14], to discover relevant imaging and genomic features. To increase statistical power and reduce false positives, meta-analysis studies [15][16] were performed to quantitatively synthesize imaging genomic findings from multiple independent analyses. To hunt for "missing heritability", epistatic studies [17] were performed to examine genetic interaction effects on imaging phenotypes. To identify biologically meaningful findings with increased statistical power, imaging genetic enrichment analysis [18] was proposed to mine set-level associations in both imaging and genomic domains.
The topic of imaging genomics has recently been addressed in several medical imaging and bioinformatics conferences. The most focused one is the International Imaging Genetics Conference (IIGC, http://www.imaginggenetics.uci.edu/), an annual meeting held at UC Irvine since 2005. The MICCAI Workshop on Imaging Genetics (MICGen, http://micgen.csail.mit.edu/) has been held twice, in 2014 and 2015, in conjunction with MICCAI, the major medical image computing conference. An educational course on "Introduction to Imaging Genetics" has been offered at the annual meeting of the Organization for Human Brain Mapping (OHBM) since 2009. The topic of imaging genomics has also been covered in the following two events in the bioinformatics field: (1) ACM BCB 2015 Workshop on The Computational Pathology: Linking Tissue Phenotypes with Genomics and Clinical Outcomes, and (2) ICIBM 2015 Tutorial in Bioimage Informatics and Integrative Genomics.
As the field of imaging genomics contains a significant genomics (or omics in general) component in addition to biomedical imaging, we feel that it is timely for a major bioinformatics conference such as PSB to address this important, relevant and emerging topic. We believe that PSB offers an ideal opportunity to bring together people with different expertise and shared interests in this rapidly evolving field. Specifically, the computational biology and bioinformatics expertise of the PSB and ISCB communities can provide important new perspectives, complementary to the expertise of the IIGC, MICCAI, OHBM, ACM BCB and ICIBM communities, and can thus help contribute new concepts, methods, and applications to the analysis of emerging imaging and genomic data.
The scale and complexity of multidimensional imaging and omics data offer unprecedented opportunities to enhance mechanistic understanding of complex disorders such as neurological diseases [19][20][21] and cancers [22][23], which can benefit public health outcomes by facilitating diagnostic and therapeutic progress. However, due to the extremely high dimensionality and complex structure of these data sets, the field faces major computational and bioinformatics challenges. Technological advances in this field are urgently needed and have the potential to contribute significantly to multiple national health priority areas, including the Precision Medicine Initiative [24], the Brain Initiative [25], and the Big Data to Knowledge Initiative [26]. The objective of this Imaging Genomics Session at PSB 2017 is to encourage discussion on fundamental concepts, novel methods and innovative applications. We hope that this session will become a forum for researchers to exchange ideas, data, and software, in order to speed up the development of innovative technologies for hypothesis testing and data-driven discovery in imaging genomics.

Session Summary
This session includes an invited lecture and five accepted presentations with peer-reviewed papers. Three presentations will be delivered as platform talks and the other two as posters.

Invited Talk
Our invited lecture will be given by Dr. Paul Thompson, a world-renowned pioneer in imaging genomics. Dr. Thompson is from the University of Southern California (USC). At USC, he is a Professor of Neurology, Psychiatry, Radiology, Pediatrics, Engineering, and Ophthalmology, the director of the USC Imaging Genetics Center, and the director of the ENIGMA Center for Worldwide Medicine, Imaging & Genomics, an $11M NIH Center of Excellence in Big Data Computing. Dr. Thompson's major contributions to the field of imaging genomics and to science in general are summarized by the following text quoted from http://keck.usc.edu/faculty/paul-m-thompson/: Paul Thompson directs the ENIGMA Consortium, a global alliance of 307 scientists in 33 countries who conduct the largest studies of 10 major brain diseases, ranging from schizophrenia, depression, ADHD, bipolar illness and OCD, to HIV and addictions on the brain. ENIGMA's genomic screens of over 31,000 people's brain scans and genome-wide data (published in Nature Genetics, 2012; Nature, 2015) have brought together experts from 185 institutions to unearth genetic variants that affect brain structure, disease risk, and brain connectivity. Collaborating with imaging labs around the world, Dr. Thompson and his students have published over 1,300 publications (h-index: 116) describing novel mathematical and computational strategies for analyzing brain image databases, for detecting pathology in individual patients and groups, and for creating disease-specific atlases of the human brain.

Papers
In Integrative analysis for lung adenocarcinoma predicts morphological features associated with genetic variations, Wang et al. analyzed an imaging genomic data set downloaded from the TCGA portal, containing 201 patients with lung adenocarcinoma (LUAD). The data include clinical information, mRNA expression profiles, and histopathologic whole slide images of the patients. On the imaging end, the authors calculated 283 morphological features from histopathologic images and identified features strongly correlated with patient survival outcome. On the genomic end, the authors constructed a gene co-expression network and extracted gene co-expression clusters. To relate imaging with genomics, the authors regressed each outcome-relevant morphological feature on multiple co-expressed gene clusters using Lasso. The study identified gene clusters highly associated with DNA copy number variations. These observations may lead to new insights into lung cancer development, suggesting biological pathways that run from genetic variations through gene transcription and cancer morphology to survival outcome.
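As a rough illustration of the Lasso step mentioned above, the sketch below regresses a single morphological feature on a few gene-cluster summaries using cyclic coordinate descent. It is not the authors' pipeline: the toy data, the names `soft_threshold`, `lasso_cd`, `cluster_summaries` and `morph_feature`, and the penalty value are all hypothetical, and the columns are assumed to be centred so that no intercept is needed.

```python
def soft_threshold(z, lam):
    """Shrink z toward zero by lam (the proximal step of the L1 penalty)."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_cd(X, y, lam, n_sweeps=100):
    """Minimise (1/2n)*||y - X b||^2 + lam*||b||_1 by cyclic coordinate descent.

    X is a list of n rows with p centred columns; no intercept is fitted.
    """
    n, p = len(X), len(X[0])
    b = [0.0] * p
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_sweeps):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # a constant column carries no information
            # Correlation of column j with the residual that ignores column j.
            rho = 0.0
            for i in range(n):
                pred_without_j = sum(X[i][k] * b[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
            rho /= n
            b[j] = soft_threshold(rho, lam) / col_sq[j]
    return b


# Hypothetical toy data: 5 patients, 3 centred gene-cluster summaries.
cluster_summaries = [
    [-2.0, 1.0, 1.0],
    [-1.0, -1.0, -2.0],
    [0.0, 0.0, 2.0],
    [1.0, -1.0, -2.0],
    [2.0, 1.0, 1.0],
]
# In this toy example the feature depends on the first cluster only.
morph_feature = [2.0 * row[0] for row in cluster_summaries]

coef = lasso_cd(cluster_summaries, morph_feature, lam=0.1)
print(coef)
```

In this toy setting only the first coefficient is nonzero, and it is shrunk slightly below the true value of 2 by the penalty. That is the behaviour that makes Lasso useful for selecting a few relevant gene clusters among many candidates.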
In Identification of discriminative imaging proteomics associations in Alzheimer's disease via a novel sparse canonical correlation model, Yan et al. analyzed an imaging proteomic data set downloaded from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database. Participants include 42 healthy controls, 67 patients with mild cognitive impairment (MCI), and 67 patients with Alzheimer's disease (AD). The data include clinical information, magnetic resonance imaging (MRI) scans, and expression data of 229 proteomic analytes (83 from cerebrospinal fluid and 146 from plasma). The authors developed a novel machine learning model, called discriminative sparse canonical correlation analysis (DSCCA), and applied it to the joint analysis of imaging, proteomic and diagnostic data. This analysis yielded a strong imaging proteomic association in which the identified imaging and proteomic components also had high discriminative power. Such an outcome-relevant imaging proteomic pattern has the potential to improve mechanistic understanding of the disease.
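To give a sense of what a sparse canonical correlation model computes, the sketch below runs a generic soft-thresholded power iteration on the cross-covariance matrix of two data blocks. This is a common simplification of sparse CCA and is not the DSCCA model of Yan et al.: it ignores the diagnostic (discriminative) term entirely, and the function names, toy data and penalty values are hypothetical.

```python
import math


def soft(vec, lam):
    """Element-wise soft-thresholding: shrink each entry toward zero by lam."""
    return [math.copysign(max(abs(v) - lam, 0.0), v) for v in vec]


def unit(vec):
    """Scale to unit L2 norm (an all-zero vector is returned unchanged)."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else vec


def sparse_cca(X, Y, lam_u, lam_v, n_iter=50):
    """Return sparse weight vectors (u, v) for column-centred blocks X and Y."""
    n, p, q = len(X), len(X[0]), len(Y[0])
    # Cross-covariance C = X^T Y / n.
    C = [[sum(X[i][a] * Y[i][b] for i in range(n)) / n for b in range(q)]
         for a in range(p)]
    u = [0.0] * p
    v = unit([1.0] * q)
    for _ in range(n_iter):
        # Alternate: update u given v, then v given u, thresholding each time.
        u = unit(soft([sum(C[a][b] * v[b] for b in range(q)) for a in range(p)], lam_u))
        v = unit(soft([sum(C[a][b] * u[a] for a in range(p)) for b in range(q)], lam_v))
    return u, v


# Hypothetical toy data for 4 subjects, built from orthogonal centred patterns.
h1 = [1.0, 1.0, -1.0, -1.0]
h2 = [1.0, -1.0, 1.0, -1.0]
h3 = [1.0, -1.0, -1.0, 1.0]
# Imaging block: the first measure carries the strong shared pattern h1.
imaging = [list(row) for row in zip([2 * x for x in h1], h2, h3)]
# Proteomic block: the second analyte carries h1; the others are weak.
proteins = [list(row) for row in zip([0.1 * x for x in h2],
                                     [2 * x for x in h1],
                                     [0.1 * x for x in h3])]

img_weights, prot_weights = sparse_cca(imaging, proteins, lam_u=0.5, lam_v=0.5)
print(img_weights, prot_weights)
```

With these toy inputs the weights concentrate on the first imaging measure and the second analyte, while the weakly related variables are thresholded to zero. A discriminative variant would additionally require the selected components to separate the diagnostic groups.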
In Enforcing co-expression in multimodal regression framework, Zille et al. analyzed an imaging genomic data set collected by the Mind Clinical Imaging Consortium (MCIC). Participants include 116 controls and 92 schizophrenia patients. The data include clinical information, functional MRI (fMRI) scans, and genotyping data. The authors developed a new machine learning model, called MT-CoReg, by combining sparse regression with canonical correlation analysis, and applied it to the analysis of the MCIC data. The analysis identified imaging and genomic markers that not only induce a strong imaging genomic association but can also jointly predict the outcome.
In Adaptive testing of SNP-brain functional connectivity association via a modular network analysis, Gao et al. analyzed an imaging genomic data set downloaded from the ADNI database. Participants include 162 ADNI subjects: 73 with no APOE E4 allele, 67 with one copy of the APOE E4 allele, and 22 with two copies of the APOE E4 allele. The authors analyzed the resting-state fMRI data to identify modular structures in brain functional networks, using a weighted gene co-expression network analysis (WGCNA) framework, coupled with topological overlap matrix (TOM) elements in hierarchical clustering. After that, they employed an adaptive association test based on the proportional odds model to identify distinct modular structures in brain functional networks in relation to different APOE E4 groups.
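The topological overlap measure used in WGCNA-style analyses can be sketched as follows. The code assumes a symmetric weighted adjacency matrix with entries in [0, 1] and a zero diagonal, and applies the standard unsigned TOM formula. The four-region adjacency matrix and all names are hypothetical, and the sketch does not reproduce the authors' processing of the fMRI data or their association test.

```python
def topological_overlap(adj):
    """Topological overlap matrix for a symmetric adjacency with zero diagonal.

    TOM[i][j] = (sum_u a_iu * a_uj + a_ij) / (min(k_i, k_j) + 1 - a_ij),
    where u runs over nodes other than i and j, and k_i is the connectivity
    (row sum) of node i. Diagonal entries are set to 1.
    """
    n = len(adj)
    k = [sum(adj[i][u] for u in range(n) if u != i) for i in range(n)]
    tom = [[1.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            shared = sum(adj[i][u] * adj[u][j] for u in range(n) if u not in (i, j))
            value = (shared + adj[i][j]) / (min(k[i], k[j]) + 1.0 - adj[i][j])
            tom[i][j] = tom[j][i] = value
    return tom


# Hypothetical connectivity among 4 brain regions: regions 0-2 form a tight
# module, while region 3 is only weakly connected to the others.
adjacency = [
    [0.0, 0.8, 0.8, 0.1],
    [0.8, 0.0, 0.8, 0.1],
    [0.8, 0.8, 0.0, 0.1],
    [0.1, 0.1, 0.1, 0.0],
]

tom = topological_overlap(adjacency)
# 1 - TOM is the dissimilarity usually passed to hierarchical clustering.
dissimilarity = [[1.0 - t for t in row] for row in tom]
print(tom[0][1], tom[0][3])
```

Regions within the tight module receive a much higher overlap than pairs involving the weakly connected region, so a hierarchical clustering of `dissimilarity` would group regions 0 to 2 together before merging region 3.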
In Exploring brain transcriptomic patterns: a topological analysis using spatial expression networks, Kuncheva et al. analyzed whole genome whole brain gene expression data downloaded from the Allen Human Brain Atlas (AHBA). Participants include six AHBA donors. The authors focused on 16,906 genes selected based on a previous study, and 105 brain regions where at least one measurement was available in all six brains. A Spatial Expression Network (SEN) was extracted for each gene to quantify co-expression patterns amongst several anatomical locations. After that, network similarity measures were computed and used to quantify the topological resemblance between pairs of SENs and identify naturally occurring clusters. The analysis identified three stable clusters, including one with genes specifically involved in the nervous system, and the other two representing immunity, transcription and translation.
Several of these papers shared a common theme: examining the relationship among three levels (i.e., omics features, imaging phenotypes, and clinical outcomes). This suggests a promising future direction to integrate imaging genomics with systems biology, which attempts to model complex and interactive multilevel biological systems using multimodal imaging and multidimensional omics data sets.