Background:
Tremendous efforts have been made to elucidate the molecular basis of the initiation and progression of ovarian cancer. However, most existing studies have focused on individual genes or a single type of data. Such studies may lack the power to detect the complex mechanism of cancer formation because they overlook the interactions between different genetic and epigenetic factors.
This research proposes an integrative framework to identify genetic and epigenetic features related to ovarian cancer and to quantify the causal relationships among these features, using a probabilistic graphical model based on The Cancer Genome Atlas (TCGA) data. The framework identifies potentially important genetic and epigenetic features related to this complex disease. The constructed Bayesian network revealed some new genetic/epigenetic pathways, which may shed new light on the molecular mechanism of ovarian cancer.
- They first defined a set of seed genes comprising 48 candidate tumor suppressors or oncogenes and an additional 20 reported ovarian cancer-related genes.
- The seed genes were then fed into a stepwise correlation-based selector to identify 271 additional features: 177 genes, 82 copy number variation sites, 11 methylation sites and 1 somatic mutation.
- They built a Bayesian network model with a logit link function to quantify the causal relationships among these features and discovered a set of 13 hub genes including ARID1A, C19orf53 and COL5A2.
- The directed graph revealed many potential genetic pathways, some of which confirmed existing results.
- Clustering analysis further suggested four gene clusters, three of which correspond to well-defined cellular processes: cell division, tumor invasion and the mitochondrial system.
- In addition, two genes related to glycoprotein synthesis, PSG11 and GALNT10, were found to be highly predictive of the overall survival time of ovarian cancer patients.
Ovarian cancer, one of the most malignant gynecologic cancers, is the fifth leading cause of cancer-related deaths among women in the United States. Studies have implicated well-known oncogenes and tumor suppressors, including TP53, PIK3C, BRCA1 and BRCA2.
A systems biology approach that combines multiple genetic and epigenetic profiles in an integrative analysis provides a new direction for studying the regulatory network associated with ovarian cancer.
Data for this research were taken from the TCGA project.
The Bayesian network (BN) approach allows rigorous statistical inference of causality between genetic and epigenetic features. Combining different types of complex data for causal inference in a BN poses a major challenge: the most relevant subset of features must be identified, and irrelevant or redundant features must be removed.
Result
They consider four types of molecular data including gene expression, DNA copy number variation, promoter methylation and somatic mutation.
In this paper they assume that the cancer phenotype is directly associated with gene expression, which can potentially be driven by genetic and epigenetic changes. They first identify a set of tumor suppressors and oncogenes by differential expression analysis between the cancer and control groups. These genes form the set of seed genes. The seed genes are then fed into their proposed stepwise correlation-based selector (SCBS) to select other features.
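The stepwise idea can be sketched in a few lines of Python. This is only an illustration, not the authors' SCBS implementation: the use of absolute Pearson correlation, the 0.8 threshold, the stopping rule, the `max_rounds` parameter and all feature names are assumptions made for the example.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant feature carries no correlation signal
    return sxy / sqrt(sxx * syy)

def stepwise_select(data, seeds, threshold=0.8, max_rounds=None):
    """Grow the feature set round by round, starting from the seeds.

    In each round, a feature is added if it is strongly correlated
    (|r| >= threshold) with a feature added in the previous round.
    Selection stops when a round adds nothing (assumed stopping rule).
    """
    selected = list(seeds)
    frontier = list(seeds)
    rounds = 0
    while frontier and (max_rounds is None or rounds < max_rounds):
        added = [
            cand for cand in data
            if cand not in selected
            and any(abs(pearson(data[cand], data[f])) >= threshold for f in frontier)
        ]
        selected.extend(added)
        frontier = added
        rounds += 1
    return selected

# Hypothetical toy profiles over six samples (names are made up).
toy = {
    "seed_gene": [1, 2, 3, 4, 5, 6],
    "gene_A":    [1, 2, 3, 4, 6, 5],  # strongly correlated with seed_gene
    "cnv_B":     [1, 2, 4, 5, 6, 3],  # strongly correlated with gene_A only
    "gene_C":    [3, 6, 1, 5, 2, 4],  # weakly correlated with everything
}

print(stepwise_select(toy, ["seed_gene"]))                # ['seed_gene', 'gene_A', 'cnv_B']
print(stepwise_select(toy, ["seed_gene"], max_rounds=1))  # ['seed_gene', 'gene_A']
```

In this toy data, a single round starting from the seed misses `cnv_B`, which is reached only through `gene_A`. That is the kind of indirect feature a stepwise procedure can pick up and a single-round filter cannot.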
Figure 2 illustrates the workflow of the proposed framework. The first step was to define a set of seed genes out of 12,000 genes that have a recorded expression level and a record for at least one of the three epigenetic factors. This procedure resulted in 48 potential tumor suppressors or oncogenes.
Figure 3 shows the predicted network, which contains 698 edges; the direction of each edge indicates that the downstream feature is regulated by the upstream one. They found that copy number variations are the major factor accounting for differential gene expression, which suggests that many amplified genes may act as cancer drivers and confirms a finding from a breast cancer study. The network also confirmed many previously reported gene-gene interactions.
These results suggest that the proposed pipeline is capable of revealing important genetic pathways that underlie the complex cancer phenotype.
They identified 13 nodes with significantly larger outdegrees. These hub genes all have known functions and have causal effects on at least seven neighboring genes, suggesting that they play important roles in driving the corresponding local sub-networks.
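Finding hubs by outdegree in a directed edge list can be sketched as follows. The summary does not say how "significantly larger" was tested, so the fixed cut-off of seven children and all node names are assumptions for illustration only.

```python
from collections import Counter

def find_hubs(edges, min_outdegree=7):
    """Return nodes whose outdegree is at least min_outdegree.

    edges is an iterable of (parent, child) pairs; duplicate edges
    are counted once.
    """
    outdegree = Counter(parent for parent, _child in set(edges))
    return sorted(node for node, deg in outdegree.items() if deg >= min_outdegree)

# Hypothetical toy network: HUB1 regulates eight targets.
edges = [("HUB1", f"target_{i}") for i in range(8)]
edges += [("geneX", "target_0"), ("geneX", "HUB1"), ("geneY", "geneX")]

print(find_hubs(edges))  # ['HUB1']
```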
Figure 5 shows that the 13 hub genes clearly distinguish the cancer samples from the normal samples. It is a multi-dimensional scaling plot based on the correlation dissimilarity. This suggests that the 13 hub genes may represent the major difference between cancer and normal samples.
In the gene clustering analysis, k-means clustering grouped 245 genes into four major clusters corresponding to distinct functions.
- Cluster 1 contains 18 genes, mainly related to cell division, mitosis, spindle formation etc.
- Cluster 2 contains 23 genes, most of which are functionally related to growth factor, cell shape, cell motility, tumor invasion.
- Cluster 3 contains 20 genes, mostly related to mitochondrial system, membrane process.
- Cluster 4 is the largest and most complicated cluster, harboring 184 genes. This large cluster communicates with the other three clusters, which are independent of each other.
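For readers unfamiliar with k-means, a minimal version of the algorithm is sketched here. The summary does not state which gene features or distance measure were used for clustering, so the 2-D toy points, the squared Euclidean distance and the fixed initial centroids (chosen for reproducibility) are all assumptions.

```python
def kmeans(points, centroids, max_iter=100):
    """Plain k-means with squared Euclidean distance.

    points:    list of equal-length numeric tuples
    centroids: initial centroids, one per cluster
    Returns (labels, centroids) once assignments stop changing.
    """
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(
                range(len(centroids)),
                key=lambda k: sum((a - b) ** 2 for a, b in zip(p, centroids[k])),
            )
            for p in points
        ]
        if new_labels == labels:
            break  # converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                centroids[k] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centroids

# Hypothetical toy data: two well-separated groups of points.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels, centers = kmeans(points, centroids=[(0, 0), (10, 10)])
print(labels)  # [0, 0, 0, 1, 1, 1]
```

In practice the result depends on the initial centroids and on the chosen number of clusters, so several restarts are usually compared.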
These findings may point to important molecular pathways, whether or not they have been identified before, that drive the development of ovarian cancer. Figure 7
The inferred Bayesian network identified two genes, PSG11 and GALNT10, that may be directly associated with the overall survival time of ovarian cancer patients. Both genes are functionally related to glycoprotein synthesis. This indicates that the biological pathway related to glycoprotein synthesis may be indicative of the death risk of ovarian cancer patients. Several tumor-associated glycoproteins have been found on the surface of many cancer cells, including ovarian, breast, colon and pancreatic cells, and they may play a role in the early detection of cancers. A well-known example is CA-125, the primary protein used to measure serous cancer tumor load. Certain glycoproteins are closely associated with cancers in women, such as ovarian cancer and breast cancer, and affect the death risk, chemotherapy resistance and prognosis of ovarian cancer patients.
Discussion
This research proposed an integrative Bayesian network approach for causal inference between genetic and epigenetic features in complex cancer data.
First, they showed that the stepwise correlation-based selection approach is more effective than a simple single-round selection method in identifying important features in the genetic/epigenetic pathways. The proposed method relies on the correlation strength among connected nodes and may fail when the connections are weak.
Second, they proposed a model for the causal relationships between features of different types in a Bayesian network through a logit link function.
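One common way a logit link connects features of different types is to model the probability of a binary node (for example, a mutation present or absent) as a logistic function of its parents' values. The sketch below illustrates that general idea only; the summary does not give the paper's exact parameterization, and the function name, weights and parent values are hypothetical.

```python
from math import exp

def logit_link_probability(parent_values, weights, intercept=0.0):
    """P(child = 1 | parents) under a logit link.

    The linear predictor eta = intercept + sum(w_i * x_i) is mapped
    to a probability with the logistic function, computed in a
    numerically stable way for large |eta|.
    """
    eta = intercept + sum(w * x for w, x in zip(weights, parent_values))
    if eta >= 0:
        return 1.0 / (1.0 + exp(-eta))
    z = exp(eta)
    return z / (1.0 + z)

# Hypothetical binary node with two continuous parents
# (e.g. an expression level and a copy number value).
p_neutral = logit_link_probability([0.0, 0.0], weights=[1.5, -0.8])
p_shifted = logit_link_probability([1.2, -0.5], weights=[1.5, -0.8])
print(p_neutral)  # 0.5
print(p_shifted > p_neutral)  # True
```

Because the linear predictor accepts both continuous and 0/1-coded parents, the same form can mix continuous and discrete features in one network.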
They also found that pathways related to glycoprotein synthesis and to the hematopoietic and immune systems correlate with the survival rate of ovarian cancer patients. In particular, two genes related to glycoprotein synthesis, PSG11 and GALNT10, can significantly affect the overall survival time of ovarian cancer patients.
Conclusion
Understanding the biological mechanism of ovarian cancer has significant practical importance for clinical diagnosis and treatment. In this research they propose a new integrative approach that presents two innovations: a stepwise feature selection procedure and a Bayesian network model that incorporates both continuous and discrete features for causal inference. Clustering analysis suggested four gene clusters corresponding to distinct biological processes, including cell division, tumor invasion and the mitochondrial system. In addition, they found that genes related to glycoprotein synthesis and to the hematopoietic and immune systems could be highly predictive of the overall survival time of ovarian cancer patients.
The electronic version of this article is the complete one and can be found online at: http://www.biomedcentral.com/1752-0509/8/1338
© 2014 Zhang et al.; licensee BioMed Central.



A cross-talk mechanism has previously been identified between EGFR and the IGF1R signaling pathway.
