Med One. 2017; 1: e170003.


Identification and Validation Novel Risk Genes for Type 1 Diabetes – A Meta-Analysis

Puqiang Yang 1, Ashley Chorath 2, Wenjuan Jiang 3 *

1 General Internal Medicine, Tongling Hospital of Traditional Chinese Medicine, Tongling, Anhui 244000, P.R. China

2 Radiology and Imaging Sciences, National Institutes of Health Clinical Center, Bethesda, MD 20892, USA

3 Department of Pharmaceutic Science, School of Pharmacy, University of Maryland, Baltimore, MD 21201, USA

*Correspondence to: Dr. Wenjuan Jiang, Department of Pharmaceutic Science, School of Pharmacy, University of Maryland, 20 Penn Street, Baltimore, MD 21201, USA. Tel: 3157085712.

Published: 25 February 2017


Background: Type 1 diabetes (T1D) cannot be explained by environmental factors alone, and gene mutations in T1D should also be taken seriously. The aim of this study was to investigate novel genetic risk factors for T1D by systematically reviewing the published literature and performing a meta-analysis.

Methods: A comprehensive search of electronic databases was completed using Illumina BioEngine. Seven T1D case/control bio-sets from four different studies were selected, including 286 T1D cases and 162 controls. The selected top T1D risk genes were further analyzed by integrating an online open source T1D genetic database. Pathway enrichment analysis (PEA) and network connectivity analysis (NCA) were conducted to identify potential functional associations between target genes and T1D.

Results: Six genes were identified through the meta-analysis as top target genes for T1D: HLA-DQB1, OAS1, CMKLR1, NFE2, MNDA and GPR56. These genes play roles within multiple T1D genetic pathways and demonstrate solid connections with known T1D target genes. NCA results also revealed strong functional association between these genes and T1D.

Conclusion: This study identified known as well as novel T1D target genes and their functional pathways that influence T1D pathogenesis. Our results may provide new insights into the genetic mechanisms of T1D.


Type 1 diabetes (T1D) is perceived as a progressive immune-mediated disease that accounts for approximately 5% of all existing cases of diabetes in the USA [1]. There has been a considerable worldwide increase in incidence over the past few decades [2]. Environmental factors have been implicated in the pathogenesis of T1D as both triggers and potentiators of beta-cell destruction [3, 4]. However, gene mutations should also be considered to account for the heterogeneity of T1D, which cannot be explained purely by environmental factors.

Numerous genetic studies of T1D have been conducted to explore candidate genes for the disease, using both case-control and family-based designs [5, 6]. In the 1970s, association and affected-sib pair linkage studies established the role of HLA genes in T1D predisposition [7]. Recently, multi-modality genetic data from peripheral blood monocytes have been employed in continuing efforts to identify T1D genetic determinants [8-10]. These previous studies built a solid background for T1D genetic research, which could be leveraged for the discovery and evaluation of novel risk genes.

However, the risk estimates from individual studies often lack statistical power due to limited sample sizes and sample specificities in terms of phenotype characteristics. It is also difficult to come to a consistent conclusion as results are spread over a large number of independent studies. Therefore, a meta-analysis of multiple studies could provide a higher power assessment of the genetic risk factors of T1D.

In this study, a meta-analysis was performed based on four recent studies (2009-2014). The top genes from the study were further analyzed by integrating a curated T1D genetic database (T1D_GD). The T1D_GD database was constructed using a large-scale literature knowledge database, the Pathway Studio (PS) ResNet database. In recent years, the PS ResNet database has been widely used to study modeled relationships between proteins, genes, complexes, cells, tissues and diseases. Our study identified novel T1D genes and evaluated the effectiveness of integrating meta-analysis and the PS ResNet database to identify and evaluate novel T1D risk genes.


2.1 Genetic data selection

A systematic search of electronic databases was conducted using the Illumina BaseSpace Correlation Engine. Fig. 1 presents the diagram for the data selection. A search with the keyword ‘Type 1 Diabetes’ identified 421 T1D studies. Further filter criteria were: (1) the data type is RNA expression; (2) the data organism is Homo sapiens; (3) the study is recent (within the last 10 years); and (4) the study is a T1D case vs. healthy control study (or includes case/control bio-sets). In total, seven bio-sets (T1D case/control comparisons) from four studies satisfied the study selection criteria and were included in this systematic review and meta-analysis.
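The four selection criteria can be read as a simple conjunction of filters over study metadata. The following is a minimal sketch only: the `Study` record, its field names, the sample records and the `search_year` value are all hypothetical and are not taken from the BaseSpace Correlation Engine or from the studies reviewed here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Study:
    """Hypothetical metadata record for one search hit (field names are assumptions)."""
    study_id: str
    data_type: str          # e.g. "RNA expression"
    organism: str           # e.g. "Homo sapiens"
    year: int               # publication year
    has_case_control: bool  # includes T1D case vs. healthy control bio-sets


def is_eligible(study: Study, search_year: int, max_age_years: int = 10) -> bool:
    """Return True only if the study meets all four criteria listed in the text."""
    return (
        study.data_type == "RNA expression"
        and study.organism == "Homo sapiens"
        and search_year - study.year <= max_age_years
        and study.has_case_control
    )


def select_studies(studies, search_year):
    """Keep the studies that pass every filter, preserving input order."""
    return [s for s in studies if is_eligible(s, search_year)]


# Made-up records for illustration; they do not correspond to real studies.
hits = [
    Study("S1", "RNA expression", "Homo sapiens", 2012, True),
    Study("S2", "RNA expression", "Mus musculus", 2013, True),
    Study("S3", "DNA methylation", "Homo sapiens", 2014, True),
    Study("S4", "RNA expression", "Homo sapiens", 2001, True),
    Study("S5", "RNA expression", "Homo sapiens", 2010, False),
]
selected = select_studies(hits, search_year=2016)  # search year is an assumption
print([s.study_id for s in selected])
```

Each rejected record in the toy list fails exactly one criterion, which makes it easy to check that the filters behave independently.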

Fig. 1 Workflow diagram for meta-analysis data selection

2.2 Genetic database T1D_GD

The T1D_GD is a T1D-targeted knowledge database available online at the ‘Bioinformatics Database’. The database is updated monthly or upon request. The current version of T1D_GD is composed of 1165 T1D target genes (T1D_GD Related Genes), 646 small molecules/drugs (T1D_GD Related Drugs), 120 pathways (T1D_GD Related Pathways), and 96 related diseases (T1D_GD Related Diseases). The database also provides supporting references for each T1D-Gene and T1D-Drug relation, including the titles and the sentences where the relation has been identified (see T1D_GD Ref for Related Genes and T1D_GD Ref for Related Drugs, respectively). This information could be used to locate a detailed description of how a candidate gene/drug is related to T1D.

Using T1D_GD, further analyses of the T1D target genes from the meta-analysis were conducted, including identifying their related T1D pathways (T1D_GD Related Pathways), genes (T1D_GD Related Genes), and drugs (T1D_GD Related Drugs). Here we defined two genes as functionally related if they play roles within the same genetic pathway. Pathway enrichment analysis (PEA) was conducted using Pathway Studio to identify genetic pathways potentially linked to T1D [11]. The gene-drug relationships were identified using the network building module of Pathway Studio.
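The text does not state which statistic Pathway Studio uses for PEA. As a rough illustration of the general idea behind over-representation analysis, the sketch below computes a one-sided hypergeometric p-value for the overlap between a target gene list and one pathway. The function name and its parameters are hypothetical, and this should not be taken as the procedure actually applied in this study.

```python
from math import comb


def hypergeom_enrichment_p(n_background, n_pathway, n_targets, n_overlap):
    """One-sided over-representation p-value, P(X >= n_overlap).

    Models drawing n_targets genes without replacement from a background of
    n_background genes, of which n_pathway belong to the pathway of interest.
    """
    upper = min(n_pathway, n_targets)        # largest possible overlap
    total = comb(n_background, n_targets)    # number of possible target sets
    tail = sum(
        comb(n_pathway, k) * comb(n_background - n_pathway, n_targets - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Toy numbers only: 6 target genes, a 5-gene pathway, a 20-gene background,
# and 3 target genes that fall inside the pathway.
p_value = hypergeom_enrichment_p(n_background=20, n_pathway=5, n_targets=6, n_overlap=3)
print(f"enrichment p-value: {p_value:.4f}")
```

A small p-value would indicate that the target list overlaps the pathway more than expected by chance; in practice such values are usually corrected for testing many pathways at once.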


3.1 Selected Datasets

After screening against the selection criteria, seven T1D case/control comparison bio-sets from four independent studies were retrieved and assessed for eligibility. One of the identified datasets contained four separate case/control studies [10]: (1) children genetically predisposed to T1D vs. normal adults; (2) long-standing T1D patients vs. normal adults; (3) newly diagnosed T1D patients vs. normal adults; and (4) T1D vs. normal adults. The data for this dataset are available online. Each of the other three datasets contained one case/control study [8, 9, 12]. These three datasets were available at NCBI GEO (ID: GSE10586, GSE33440 and GSE55098, respectively). Statistics of the included bio-sets are presented in Table 1.

Table 1. Characteristics of the selected studies ordered by publication date

3.2 Meta-analysis results

The meta-analysis results were deposited into the ‘Bioinformatics Database’ under the name T1D_Meta. The top 6 genes (Score ≥ 85) from the meta-analysis appear in Table 2, with more detailed statistics presented in T1D_Meta → Top 6 Genes. The full meta-analysis results are presented in T1D_Meta → Full Gene List. A gene's score is defined by the Illumina BaseSpace Correlation Engine and is based on the statistical significance and consistency of the gene across the queried bio-sets. The higher the score, the greater the importance of the gene for the case/control comparison.

Table 2. The top 6 genes of the included studies excluded from T1D-gene relation data

Score: A gene's score is based on the statistical significance and consistency of the gene across the queried bio-sets. Specificity: A gene's specificity is the number of bio-sets in which the direction of a gene's regulation matches the selected filter. Associated Pathway: The known T1D related Pathways (T1D_GD → Related Pathways) that contain the gene. GO ID is provided if any. Gene Connectivity: The number of known T1D related genes (T1D_GD → Related Genes) that connect with the target gene.
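The specificity definition and the Score ≥ 85 cut-off can be sketched as follows. The score itself is computed by the BaseSpace Correlation Engine and its formula is not given in the text, so the sketch treats scores as precomputed inputs. The function names, the direction labels and the `GENE_A`/`GENE_B`/`GENE_C` records are hypothetical placeholders, not results from this study.

```python
def specificity(directions, wanted):
    """Count the bio-sets in which a gene's regulation direction matches `wanted`.

    `directions` holds one entry per bio-set: "up", "down", or None when the
    gene was not significantly regulated in that bio-set.
    """
    return sum(1 for d in directions if d == wanted)


def top_genes(scores, threshold=85):
    """Return genes whose score is at least `threshold`, highest score first."""
    kept = [gene for gene, score in scores.items() if score >= threshold]
    return sorted(kept, key=lambda gene: scores[gene], reverse=True)


# Placeholder values for illustration only.
toy_scores = {"GENE_A": 92.0, "GENE_B": 85.0, "GENE_C": 61.5}
toy_directions = ["up", "up", "down", None, "up", "up", "up"]  # seven bio-sets

print(top_genes(toy_scores))
print(specificity(toy_directions, "up"))
```

Note that the threshold comparison is inclusive, matching the "Score ≥ 85" wording used for Table 2.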

Out of the six genes listed in Table 2, two were included in T1D_GD (HLA-DQB1 and OAS1), while the other four were not (CMKLR1, NFE2, GPR56 and MNDA). Further study using the T1D_GD showed that these six genes were enriched within multiple T1D target pathways and were connected to many other genes that were linked to T1D (see Table 2). Fig. 2 (a) presents the functional connections among these six genes. Fig. 2 (b) shows the 18 T1D pathways that include these six genes. Notably, five of these 18 pathways were among the top 10 T1D pathways (T1D_GD Related Pathways), including external side of plasma membrane, cell surface, immune response, innate immune response and blood coagulation, as shown in Fig. 2.

Fig. 2 Six T1D genes and the 18 T1D pathways in which the six genes are enriched. (a) The top six T1D target genes from the meta-analysis. The weight for a two-node edge is the number of pathways shared by the two genes. The weight for a single-node edge represents the number of pathways that include the gene. (b) The 18 T1D pathways that include these six genes. The weight for a two-node edge is the number of genes shared by the two pathways. The weight for a single-node edge represents the number of genes included in the pathway.
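The edge weights described in the Fig. 2 legend follow directly from a gene-to-pathway membership table. Below is a minimal sketch assuming such a table is available as a dictionary of sets; the function names and the toy gene and pathway labels are hypothetical and do not reproduce the data behind Fig. 2.

```python
from itertools import combinations


def network_weights(membership):
    """Compute single-node and two-node weights from a membership mapping.

    `membership` maps each node to the set of items it belongs to (or contains).
    Single-node weight: number of items for that node.
    Two-node weight: number of items shared by the pair (pairs sharing none are omitted).
    """
    node_weight = {node: len(items) for node, items in membership.items()}
    edge_weight = {}
    for a, b in combinations(sorted(membership), 2):
        shared = len(membership[a] & membership[b])
        if shared:
            edge_weight[(a, b)] = shared
    return node_weight, edge_weight


def invert(membership):
    """Turn a gene -> pathways mapping into a pathway -> genes mapping."""
    inverted = {}
    for node, items in membership.items():
        for item in items:
            inverted.setdefault(item, set()).add(node)
    return inverted


# Toy membership table; labels are placeholders.
gene_to_pathways = {
    "GENE_A": {"P1", "P2"},
    "GENE_B": {"P2", "P3"},
    "GENE_C": {"P4"},
}

# Panel (a) style: nodes are genes, weights count pathways.
gene_nodes, gene_edges = network_weights(gene_to_pathways)
# Panel (b) style: nodes are pathways, weights count genes.
pathway_nodes, pathway_edges = network_weights(invert(gene_to_pathways))

print(gene_nodes, gene_edges)
print(pathway_nodes, pathway_edges)
```

Under the definition used in Section 2.2, two genes are functionally related exactly when they appear as a key in the gene-level edge dictionary, since that requires at least one shared pathway.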

3.3 Network analysis

Additional functional network connectivity analysis (NCA) using PS showed that the four novel genes from this meta-analysis (CMKLR1, NFE2, GPR56 and MNDA) present strong functional associations with T1D. These genes influence the pathogenic development of T1D through multiple pathways, as shown in Fig. 3. Each relation (arrow) in Fig. 3 is supported by one or more references that provide a detailed description of the relation (see T1D_Meta CMKLR1, T1D_Meta NFE2, T1D_Meta GPR56 and T1D_Meta MNDA).

Fig. 3 Network connectivity analysis between four target genes and T1D.
(a) CMKLR1-T1D. (b) NFE2-T1D. (c) GPR56-T1D. (d) MNDA-T1D. The networks were generated using the ‘network building’ module of Pathway Studio. For the definition of the entity types and relation types in the figure please refer to


Meta-analysis is a statistical approach that combines the results of multiple scientific studies. Although many genetic studies have been conducted to discover genetic risk factors for T1D, combining the results from these separate studies by using meta-analysis could lead to higher statistical power and more robust point estimates. In this study, meta-analysis was performed on seven T1D case/control bio-sets extracted from four recent studies. T1D target genes from the meta-analysis were sorted by gene score, which is based on the statistical significance and consistency of the gene across the queried bio-sets. The meta-analysis results suggested six top risk genes for T1D (Score ≥ 85), four of which are novel according to the recently updated database T1D_GD. Further analyses were conducted to study the possible correlation between T1D and these six genes, with more effort focused on the four novel ones.

Analysis using T1D_GD showed that the two known T1D target genes, HLA-DQB1 and OAS1, are among the top T1D_GD genes, with support from multiple independent studies (see T1D_GD Related Genes). PEA results showed that both of these genes and the other four genes (CMKLR1, NFE2, GPR56 and MNDA) are enriched within multiple T1D pathways (T1D_GD Related Pathways) and linked to dozens or even hundreds of other T1D genes (see Table 2 and Fig. 2). These results support the possible linkage between these genes and T1D.

Additional network connectivity analysis (NCA) revealed multiple possible functional associations between T1D and the four novel genes (see Fig. 3). It has been shown that activation of CMKLR1 could depress expression of INSR [13], while INSR is a target gene for the treatment of T1D [14]. This suggests that CMKLR1 may play a role in the development of T1D through a CMKLR1-INSR-T1D pathway. Previous studies have also shown that MNDA is an inhibitor of PARP [15]. PARP has been suggested to play a role in the development of T1D [16]. Therefore, MNDA may serve as a protector for T1D patients. NFE2 regulates the expression of HMOX1 [17], while overexpression of HMOX1 has been shown to delay T1D [18, 19]. Thus, NFE2 may influence T1D pathogenesis by regulating the expression of HMOX1. Furthermore, several studies show that GPR56 can induce the secretion of IL6 [20], which has been shown to play a significant role in T1D etiopathogenesis [21]. These observations suggest a possible GPR56-IL6-T1D regulatory pathway. More potential regulation pathways could be identified from the T1D_Meta database (see T1D_Meta CMKLR1, T1D_Meta NFE2, T1D_Meta GPR56 and T1D_Meta MNDA), which has been deposited into the open source ‘Bioinformatics Database’.

In summary, this meta-analysis supported the correlation between two genes (HLA-DQB1 and OAS1) and T1D, and revealed four novel potential risk genes for the disease (CMKLR1, NFE2, GPR56 and MNDA). Further network analysis supported the meta-analysis results and identified possible functional pathways and mechanisms through which these genes exert influence on T1D. The findings of this study may provide new insights for the field of T1D genetics.


The authors declare no conflicts of interest.

How to Cite This Article

Yang P, Chorath A, Jiang W. Identification and Validation Novel Risk Genes for Type 1 Diabetes – A Meta-Analysis. Med One. 2017 Feb 25; 1: e170003.
