Identification of five genetic variants as novel determinants of type 2 diabetes mellitus in Japanese by exome-wide association studies

We performed exome-wide association studies to identify single nucleotide polymorphisms that either influence fasting plasma glucose level or blood hemoglobin A1c content or confer susceptibility to type 2 diabetes mellitus in Japanese. Exome-wide association studies were performed with the use of Illumina Human Exome-12 DNA Analysis or Infinium Exome-24 BeadChip arrays and with 11,729 or 8635 subjects for fasting plasma glucose level or blood hemoglobin A1c content, respectively, or with 14,023 subjects for type 2 diabetes mellitus (3573 cases, 10,450 controls). The relation of genotypes of 41,265 polymorphisms to fasting plasma glucose level or blood hemoglobin A1c content was examined by linear regression analysis. After Bonferroni's correction, 41 and 17 polymorphisms were significantly (P < 1.21 × 10−6) associated with fasting plasma glucose level or blood hemoglobin A1c content, respectively, with two polymorphisms (rs139421991, rs189305583) being associated with both. Examination of the relation of allele frequencies to type 2 diabetes mellitus with Fisher's exact test revealed that 87 polymorphisms were significantly (P < 1.21 × 10−6) associated with type 2 diabetes mellitus. Subsequent multivariable logistic regression analysis with adjustment for age and sex showed that four polymorphisms (rs138313632, rs76974938, rs139012426, rs147317864) were significantly (P < 1.44 × 10−4) associated with type 2 diabetes mellitus, with rs138313632 and rs139012426 also being associated with fasting plasma glucose and rs76974938 with blood hemoglobin A1c. Five polymorphisms—rs139421991 of CAT, rs189305583 of PDCL2, rs138313632 of RUFY1, rs139012426 of LOC100505549, and rs76974938 of C21orf59—may be novel determinants of type 2 diabetes mellitus.


INTRODUCTION
Type 2 diabetes mellitus (DM) is a major cause of nephropathy, retinopathy, and neuropathy as well as cardiovascular disease and stroke [1,2]. The heritability of type 2 DM has been estimated to be 50% to 60% [3]. Genome-wide association studies (GWASs) and meta-analyses thereof have identified >80 susceptibility loci for type 2 DM in individuals of European [4][5][6][7][8][9] or African [10] ancestry, in East Asians [11], or in multiple ethnic groups [12]. Genetic variants identified in these previous studies typically have a minor allele frequency (MAF) of ≥5% and a small individual effect size. Given that these common variants explain only a fraction of the heritability of type 2 DM, low-frequency (0.5% ≤ MAF < 5%) or rare (MAF < 0.5%) variants with larger effect sizes are also thought to contribute to the genetic architecture of this condition [13]. Among Japanese, GWASs have identified KCNQ1 [14,15], UBE2E2, C2CD4A-B [16], ANK1 [17], MIR129-LEP, GPSM1, and SLC16A13 [18] as susceptibility genes for type 2 DM, and a recent meta-analysis identified an additional seven susceptibility loci [19]. However, genetic variants, including low-frequency and rare variants, that influence fasting plasma glucose (FPG) levels and blood glycosylated hemoglobin (HbA1c) content or that contribute to predisposition to type 2 DM in Japanese remain to be identified definitively.
We have now performed exome-wide association studies (EWASs) with the use of exome array-based genotyping methods to identify single nucleotide polymorphisms (SNPs), especially low-frequency or rare coding variants with moderate to large effect sizes, that influence FPG levels and blood HbA1c content or that confer susceptibility to type 2 DM in Japanese. We used Illumina arrays that provide coverage of functional SNPs in entire exons including low-frequency and rare variants.

Characteristics of subjects
The characteristics of subjects are shown in Table 1. Age, the frequency of men, body mass index, and the prevalence of hypertension, dyslipidemia, chronic kidney disease, and hyperuricemia as well as systolic and diastolic blood pressure, serum concentrations of triglycerides, creatinine, and uric acid, FPG level, and blood HbA1c content were greater, whereas estimated glomerular filtration rate and the serum concentration of high density lipoprotein (HDL)-cholesterol were lower, in subjects with type 2 DM than in controls.

EWAS for FPG concentration
We examined the relation of genotypes of 41,265 SNPs that passed quality control to FPG levels in 11,729 subjects by linear regression analysis. A Manhattan plot for the EWAS is shown in Supplementary Figure 1A. After Bonferroni's correction, 41 SNPs were significantly [P < 1.21 × 10−6 (0.05/41,265)] associated with FPG concentration (Table 2).
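The significance thresholds quoted in this article follow from dividing 0.05 by the number of tests. The short sketch below reproduces that arithmetic; the function name is illustrative and is not part of the original analysis, which was run in JMP Genomics.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance level under Bonferroni's correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be a positive integer")
    return alpha / n_tests


# 41,265 SNPs passed quality control (number taken from the text).
ewas_threshold = bonferroni_threshold(0.05, 41_265)
print(f"{ewas_threshold:.2e}")  # 1.21e-06
```

The same function gives the thresholds used in the follow-up analyses when called with the corresponding number of tests, for example 348 for the multivariable logistic regression.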

EWAS for blood HbA1c content
We examined the relation of genotypes of 41,265 SNPs to blood HbA1c content in 8635 subjects by linear regression analysis. A Manhattan plot for the EWAS is shown in Supplementary Figure 1B. After Bonferroni's correction, 17 SNPs were significantly (P < 1.21 × 10−6) associated with blood HbA1c content (Table 3). SNPs rs139421991 [G/A (R320Q)] of CAT and rs189305583 [C/T (V69I)] of PDCL2 were significantly associated with both FPG level and blood HbA1c content.

EWAS for type 2 DM
The EWAS for type 2 DM was performed with 14,023 subjects (3573 individuals with type 2 DM, 10,450 controls). We examined the relation of allele frequencies of 41,265 SNPs to type 2 DM with Fisher's exact test. A Manhattan plot for the EWAS is shown in Supplementary Figure 1C. After Bonferroni's correction, 87 SNPs were significantly (P < 1.21 × 10−6) associated with type 2 DM (Supplementary Table 1). The genotype distributions of these SNPs were in Hardy-Weinberg equilibrium (P > 0.001) among controls (Supplementary Table 2).
The relation of these 87 SNPs to type 2 DM was examined further by multivariable logistic regression analysis with adjustment for age and sex (Supplementary Table 3). Four SNPs, rs138313632 [T/G (S705A)] of RUFY1, rs76974938 [C/T (D67N)] of C21orf59, rs139012426 [G/C (S1242T)] of LOC100505549, and rs147317864 [C/T (A262T)] of TRABD2B, were significantly [P < 1.44 × 10−4 (0.05/348)] associated with type 2 DM (Table 4). The minor G, T, and C alleles of rs138313632, rs76974938, and rs139012426, respectively, were protective against type 2 DM, whereas the minor T allele of rs147317864 was a risk factor for this condition. SNPs rs138313632 of RUFY1 and rs139012426 of LOC100505549 were significantly associated with both FPG level and type 2 DM, whereas rs76974938 of C21orf59 was significantly associated with both blood HbA1c content and type 2 DM.
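The allele-frequency comparison used in the EWAS for type 2 DM can be illustrated with a small, self-contained sketch: allele counts are derived from genotype counts by gene counting, and a two-sided Fisher's exact test is applied to the resulting 2 × 2 table. This is not the authors' code (the study used JMP Genomics); the function names and the genotype counts below are hypothetical, and the two-sided definition (summing all tables no more probable than the observed one) is one common convention assumed here.

```python
import math


def allele_counts(n_major_hom: int, n_het: int, n_minor_hom: int) -> tuple[int, int]:
    """Gene counting: each subject carries two alleles. Returns (major, minor)."""
    major = 2 * n_major_hom + n_het
    minor = n_het + 2 * n_minor_hom
    return major, minor


def _log_comb(n: int, k: int) -> float:
    """Natural log of the binomial coefficient C(n, k); safe for large counts."""
    return math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact P value for the 2 x 2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more probable than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    log_denom = _log_comb(row1 + row2, col1)

    def log_prob(x: int) -> float:
        return _log_comb(row1, x) + _log_comb(row2, col1 - x) - log_denom

    log_observed = log_prob(a)
    p_value = 0.0
    for x in range(max(0, col1 - row2), min(row1, col1) + 1):
        lp = log_prob(x)
        if lp <= log_observed + 1e-9:  # tolerance for floating-point ties
            p_value += math.exp(lp)
    return min(1.0, p_value)


# Hypothetical genotype counts (major homozygote, heterozygote, minor homozygote).
case_major, case_minor = allele_counts(3540, 32, 1)
ctrl_major, ctrl_minor = allele_counts(10_300, 148, 2)
p = fisher_exact_two_sided(case_minor, case_major, ctrl_minor, ctrl_major)
print(f"P = {p:.3g}")
```

Working in log space keeps the binomial coefficients finite for tables built from tens of thousands of alleles. A P value obtained in this way would then be compared with the Bonferroni threshold of 1.21 × 10−6.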

Relation of SNPs to FPG level or blood HbA1c content
We examined the relation of genotypes of identified SNPs to the FPG level or blood HbA1c content by one-way analysis of variance (ANOVA). The 41 SNPs identified in the EWAS for FPG level, including the two SNPs (rs138313632, rs139012426) also found to be associated with type 2 DM, were all significantly [P < 0.0012 (0.05/43)] associated with FPG level. The remaining two SNPs associated with type 2 DM (rs76974938, rs147317864) were not significantly related to the FPG level (Table 5).
The 17 SNPs identified in the EWAS for blood HbA1c content, including the one SNP (rs76974938) also found to be associated with type 2 DM, were all significantly [P < 0.0025 (0.05/20)] associated with blood HbA1c content by one-way ANOVA. The remaining three SNPs associated with type 2 DM (rs138313632, rs139012426, rs147317864) were not significantly related to blood HbA1c content (Table 6).
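As an illustration of the genotype comparison described above, the following sketch computes the one-way ANOVA F statistic for a trait grouped by genotype. It is a minimal example under stated assumptions rather than the authors' implementation: the function name and the FPG values are made up, and the P value (which requires the F distribution, for example via a statistics package) is deliberately left out.

```python
from statistics import fmean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for samples grouped by genotype."""
    groups = [list(g) for g in groups if len(g) > 0]
    k = len(groups)
    n = sum(len(g) for g in groups)
    if k < 2 or n <= k:
        raise ValueError("need at least two groups and more observations than groups")

    grand_mean = fmean([x for g in groups for x in g])
    group_means = [fmean(g) for g in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    # Variation of the observations around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)
    if ss_within == 0:
        raise ValueError("no within-group variation; F is undefined")

    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical FPG values (mmol/L) for three genotypes of one SNP.
fpg_by_genotype = {
    "GG": [5.4, 5.6, 5.1, 5.8, 5.5],
    "GA": [6.3, 6.8, 6.1],
    "AA": [7.0, 7.4],
}
f_stat, df1, df2 = one_way_anova_f(fpg_by_genotype.values())
print(f"F({df1}, {df2}) = {f_stat:.2f}")
```

For rare variants the minor homozygote group is often empty; the function simply drops empty groups, so the comparison then reduces to two genotypes.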

Relation of identified SNPs to phenotypes examined in previous GWASs
We examined the relation of the genes, chromosomal loci, and SNPs identified in the present study to DM-related phenotypes examined in previous GWASs deposited in a public database [GWAS Catalog (http://www.ebi.ac.uk/gwas)].
LGR5 has been previously shown to be related to type 2 DM, whereas TNXB has been previously associated with type 1 DM and PTCHD3 with fasting insulin-related traits (Supplementary Table 4). The remaining 54 SNPs identified in the present study have not been previously found to be related to DM-related phenotypes.

Network analysis of the genes identified in the present study
With the use of Cytoscape version 3.4.0 software (http://www.cytoscape.org/), we performed network analysis of the ten highest-scoring genes associated with type 2 DM in the DisGeNET database (http://www.disgenet.org/web/DisGeNET) together with four genes (CAT, PDCL2, RUFY1, and C21orf59) identified in the present study. Given that the LOC100505549 protein has not been characterized, it could not be examined. The network analysis showed that CAT, PDCL2, RUFY1, and C21orf59 have potential indirect interactions with several genes previously shown to be associated with type 2 DM (Supplementary Figure 2).

We have now shown that two SNPs, rs139421991 [G/A (R320Q)] of CAT and rs189305583 [C/T (V69I)] of PDCL2, were significantly associated with both FPG levels and blood HbA1c content; that two SNPs, rs138313632 [T/G (S705A)] of RUFY1 and rs139012426 [G/C (S1242T)] of LOC100505549, were significantly associated with both FPG levels and type 2 DM; and that one SNP, rs76974938 [C/T (D67N)] of C21orf59, was significantly associated with both blood HbA1c content and type 2 DM. These five SNPs may thus be novel determinants of type 2 DM. Given that FPG levels are affected by meals of the day before examination, we selected the SNPs and genes that were associated with both FPG levels and blood HbA1c content or type 2 DM.
The catalase gene (CAT) is located at chromosomal region 11p13 (NCBI Gene, https://www.ncbi.nlm.nih.gov/gene) and is expressed in various tissues and organs including the pancreas (The Human Protein Atlas, http://www.proteinatlas.org). Catalase catalyzes the breakdown of hydrogen peroxide into oxygen and water. Inherited catalase deficiency has been associated with an increased prevalence of type 2 DM in Hungarians [20][21][22]. The frequency of various CAT mutations has been found to be increased in individuals with DM, especially in females with type 2 DM, and such inherited catalase deficiency is associated with an early onset of type 2 DM [20]. We have now shown that rs139421991 [G/A (R320Q)] of CAT was significantly associated with both FPG levels and blood HbA1c content, with the minor A allele being related to an increase in these parameters. This association of CAT with type 2 DM may be attributable to the role of the encoded protein in the metabolism of hydrogen peroxide and oxidative stress, although the underlying molecular mechanism remains to be determined.
The phosducin like 2 gene (PDCL2) is located at chromosome 4q12 (NCBI Gene) and is expressed at a high level in testis (The Human Protein Atlas). The PDCL2 protein belongs to the phosducin family [23]. PDCL1 has been shown to be essential for G protein signaling as a result of its role in folding and assembly of the Gβγ dimer. PDCL2 and PDCL3 likely assist in the folding of actin, tubulin, and proteins that activate cell cycle progression [24]. We have now shown that rs189305583 [C/T (V69I)] of PDCL2 was significantly associated with both the FPG concentration and blood HbA1c content, with the minor T allele being related to increases in these parameters. Given that G protein-coupled receptor signaling promotes insulin secretion and the proliferation of pancreatic β cells [25], the association of PDCL2 with FPG levels and blood HbA1c content may be attributable to an effect of PDCL2 on such signaling, although the molecular mechanism remains unclear.
The RUN and FYVE domain containing 1 gene (RUFY1) is located at chromosomal region 5q35.3 (NCBI Gene) and is expressed in various tissues and organs including the pancreas (The Human Protein Atlas). The RUFY1 protein binds to phosphatidylinositol 3-phosphate and promotes early endosomal trafficking including the tethering and fusion of vesicles through interactions with small GTPases such as Rab4, Rab5, and Rab14 [26].
We have now shown that rs138313632 [T/G (S705A)] of RUFY1 was significantly associated with both FPG concentration and type 2 DM, with the minor G allele being related to a decreased FPG level and a reduced risk for type 2 DM. Given that small GTPases enhance insulin granule exocytosis [27], the association of RUFY1 with both FPG levels and type 2 DM may be attributable to an effect of the encoded protein on insulin secretion.
The uncharacterized LOC100505549 gene is located at chromosome 18q21.31 (NCBI Gene). The function of LOC100505549 remains unknown. We have now shown that rs139012426 [G/C (S1242T)] of LOC100505549 was significantly associated with both FPG levels and type 2 DM, with the minor C allele being related to a decreased FPG concentration and a reduced risk of type 2 DM, although the functional relevance of this association remains unknown.
The chromosome 21 open reading frame 59 gene (C21orf59) is located at chromosome 21q22.11 (NCBI Gene) and is expressed in various tissues and organs including the pancreas (The Human Protein Atlas). The C21orf59 protein promotes dynein arm assembly in motile cilia, and mutations in C21orf59 cause ciliary dyskinesia [28]. Ciliopathies are associated with pancreatic defects that manifest mostly as cysts originating from ductal cells. Ciliary proteins have been suggested to influence insulin secretion and energy regulation [29,30]. We have now shown that rs76974938 [C/T (D67N)] of C21orf59 was significantly associated with both blood HbA1c content and type 2 DM, with the minor T allele being related to a decreased blood HbA1c content and a reduced risk for type 2 DM. Given that C21orf59 may activate ciliary function and that cilia influence insulin secretion [29,30], the association of C21orf59 with blood HbA1c content and type 2 DM might reflect an effect of this gene on insulin secretion, although the underlying molecular mechanism remains unclear.
In previous GWASs of type 2 DM in the Japanese population [14][15][16][17][18][19], the MAF of identified SNPs ranged from 2% to 48% and the odds ratio (OR) from 0.38 to 1.70. In a meta-analysis of GWASs for type 2 DM in East Asian populations [11] and in a trans-ancestry meta-analysis of GWASs for type 2 DM [12], the OR ranged from 1.06 to 1.10 or from 1.08 to 1.13, respectively. In our study, we identified four SNPs associated with type 2 DM, with the MAF and OR in a dominant model of logistic regression analysis for rs138313632, rs76974938, rs139012426, and rs147317864 being 0.5% and 0.20, 2.4% and 0.33, 0.4% and 0.22, and 0.2% and 1.21 × 10^8, respectively. Both rs138313632 and rs76974938 were thus low-frequency variants with a moderate effect size, whereas rs139012426 and rs147317864 were rare variants with a moderate to large effect size.
Two SNPs (MAF and differences in FPG level or blood HbA1c content between genotypes, respectively, shown in parentheses) associated with both FPG concentration and blood HbA1c content, rs139421991 of CAT (0.3%, 17.0%, 11.3%) and rs189305583 of PDCL2 (0.1%, 24.8%, 15.7%), were rare variants with a large effect size; two SNPs associated with both FPG levels and type 2 DM, rs138313632 of RUFY1 (0.5%, 21.3%, 22.4%) and rs139012426 of LOC100505549 (0.4%, 20.5%, 28.0%), were rare or low-frequency variants with a large effect size; and one SNP associated with both blood HbA1c content and type 2 DM, rs76974938 of C21orf59 (2.4%, 4.3%, 9.4%), was a low-frequency variant with a moderate effect size. Among the remaining 37 SNPs associated with FPG levels, 25 SNPs were rare variants with a large effect size and 12 SNPs were low-frequency variants with a moderate to large effect size. Among the remaining 14 SNPs associated with blood HbA1c content, two SNPs were rare variants with a large effect size, three SNPs were low-frequency variants with a moderate to large effect size, and nine SNPs were common variants with a small to moderate effect size (Supplementary Table 5).
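The frequency classes used in this discussion follow the MAF cutoffs given in the Introduction (rare, MAF < 0.5%; low-frequency, 0.5% ≤ MAF < 5%; common, MAF ≥ 5%). The sketch below applies those cutoffs to the MAF values reported above for the five highlighted SNPs; the function name is illustrative. Because the reported MAFs are rounded to one decimal place, a value printed as 0.5% sits on the boundary, and its class depends on the unrounded frequency.

```python
def classify_variant(maf: float) -> str:
    """Frequency class for a minor allele frequency given as a proportion."""
    if not 0.0 <= maf <= 0.5:
        raise ValueError("MAF must be a proportion between 0 and 0.5")
    if maf < 0.005:
        return "rare"
    if maf < 0.05:
        return "low-frequency"
    return "common"


# Rounded MAF values as reported in the text, expressed as proportions.
reported_maf = {
    "rs139421991 (CAT)": 0.003,
    "rs189305583 (PDCL2)": 0.001,
    "rs138313632 (RUFY1)": 0.005,
    "rs139012426 (LOC100505549)": 0.004,
    "rs76974938 (C21orf59)": 0.024,
}
for snp, maf in reported_maf.items():
    print(f"{snp}: {classify_variant(maf)}")
```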
There are several limitations to the present study. (i) Given that our results were not replicated, they will require validation in other subject panels or in other ethnic groups. (ii) There is a possibility that some control individuals are prediabetic. (iii) It is possible that SNPs identified in the present study are in linkage disequilibrium with other polymorphisms in other nearby genes that are actually determinants of FPG levels, blood HbA1c content, or the development of type 2 DM. (iv) One SNP associated with type 2 DM was not significantly related to FPG level or blood HbA1c content, a discrepancy that may be attributable to the effects of medical treatment. (v) The biological or functional evidence of the association of the identified SNPs with FPG level, blood HbA1c content, or type 2 DM remains to be determined. Because functional analyses were not performed, the association of the SNPs identified in the present study with type 2 DM, FPG levels, or blood HbA1c content should be interpreted carefully.
In conclusion, we have identified five SNPs, rs139421991 [G/A (R320Q)] of CAT, rs189305583 [C/T (V69I)] of PDCL2, rs138313632 [T/G (S705A)] of RUFY1, rs139012426 [G/C (S1242T)] of LOC100505549, and rs76974938 [C/T (D67N)] of C21orf59, as novel determinants of type 2 DM. We also identified 37, 14, and one SNP as candidate determinants of FPG levels, blood HbA1c content, and type 2 DM, respectively. Determination of genotypes for these SNPs may prove informative for assessment of the genetic risk for type 2 DM in Japanese.

Study subjects
A total of 14,023 individuals were examined. The subjects were recruited as described previously [31].
Type 2 DM was defined according to the criteria of the World Health Organization as described previously [32][33][34]. Subjects with type 2 DM had an FPG level of ≥6.93 mmol/L (126 mg/dL) or a blood HbA1c content of ≥6.5% or were taking antidiabetes medication. We thus examined 3573 subjects with type 2 DM and 10,450 controls. Individuals with type 1 DM, maturity-onset diabetes of the young, DM associated with mitochondrial diseases or single-gene disorders, pancreatic diseases, or other metabolic or endocrinologic diseases were excluded from the study. Those taking medications that may cause secondary DM were also excluded. The control subjects had an FPG level of <6.05 mmol/L (110 mg/dL), a blood HbA1c content of <6.2%, and no history of DM or of having taken antidiabetes medication. Autopsy cases were excluded from controls.
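The case and control definitions above can be written as a simple decision rule, sketched below with the cutoffs quoted in the text. This is an illustration only: the function and parameter names are hypothetical, both measurements are assumed to be available, and the exclusion criteria (type 1 DM, secondary DM, autopsy cases, and so on) as well as the handling of missing measurements are not modelled because the text does not specify them in enough detail.

```python
def classify_subject(
    fpg_mmol_l: float,
    hba1c_percent: float,
    on_antidiabetes_medication: bool,
    history_of_dm: bool = False,
) -> str:
    """Apply the case/control cutoffs quoted in the text to one subject."""
    # Case: any one of the three criteria is sufficient.
    if fpg_mmol_l >= 6.93 or hba1c_percent >= 6.5 or on_antidiabetes_medication:
        return "type 2 DM"
    # Control: all criteria must hold.
    if fpg_mmol_l < 6.05 and hba1c_percent < 6.2 and not history_of_dm:
        return "control"
    # Intermediate values satisfy neither definition.
    return "unclassified"


print(classify_subject(7.2, 6.0, False))  # type 2 DM
print(classify_subject(5.2, 5.5, False))  # control
print(classify_subject(6.5, 6.3, False))  # unclassified
```

The third call shows that the two definitions leave a gap: a subject with intermediate FPG or HbA1c values meets neither the case nor the control criteria under this rule.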
The study protocol complied with the Declaration of Helsinki and was approved by the Committees on the Ethics of Human Research of Mie University Graduate School of Medicine, Hirosaki University Graduate School of Medicine, Tokyo Metropolitan Institute of Gerontology, and participating hospitals. Written informed consent was obtained from all subjects or families of the deceased subjects.

EWASs
Methods for sample collection and extraction of genomic DNA have been described previously [31]. EWASs for FPG concentration and blood HbA1c content included 11,729 and 8635 subjects, respectively, whereas that for type 2 DM included 14,023 individuals (3573 subjects with type 2 DM, 10,450 controls). Data for FPG levels were obtained from subjects who had fasted overnight. Data for blood HbA1c content were obtained from subjects with type 2 DM or impaired glucose tolerance or from those who had an annual health checkup. The EWASs were performed with the use of a HumanExome-12 v1.1 or v1.2 DNA Analysis BeadChip or Infinium Exome-24 v1.0 BeadChip (Illumina, San Diego, CA, USA). Detailed information on the exome arrays and methods of quality control have been described previously [31]. Genotype data were examined for population stratification by principal components analysis [35] (Supplementary Figure 3). A total of 41,265 SNPs passed quality control and were subjected to analysis.

Statistical analysis
The relation of genotypes of SNPs to FPG level or blood HbA1c content in the EWASs was examined by linear regression analysis. For analysis of characteristics of the study subjects, quantitative and categorical data were compared between individuals with type 2 DM and controls with the unpaired Student's t test or Fisher's exact test, respectively. Allele frequencies were estimated by the gene counting method, and Fisher's exact test was used to identify departure from Hardy-Weinberg equilibrium. The relation of allele frequencies of SNPs to type 2 DM in the EWAS was examined with Fisher's exact test. To compensate for multiple comparisons of genotypes with FPG level or blood HbA1c content or of allele frequencies with type 2 DM, we applied Bonferroni's correction for statistical significance of association. Given that 41,265 SNPs were analyzed, the significance level was set at P < 1.21 × 10−6 (0.05/41,265) for the EWASs. Quantile-quantile plots for P values of genotypes in the EWASs for FPG level or blood HbA1c content or for those of allele frequencies in the EWAS for type 2 DM are shown in Supplementary Figure 4. The inflation factor (λ) was 1.02 for FPG level, 1.03 for blood HbA1c content, and 1.26 for type 2 DM. Multivariable logistic regression analysis was performed with type 2 DM as a dependent variable and independent variables including age, sex (0, woman; 1, man), and genotype of each SNP. A detailed method of analysis has been described previously [31]. The relation of genotypes of identified SNPs to FPG level or blood HbA1c content was examined by one-way ANOVA. Bonferroni's correction was also applied to other statistical analyses as indicated to compensate for multiple comparisons. Statistical tests were performed with JMP Genomics version 6.0 software (SAS Institute, Cary, NC, USA).
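The text does not state how the inflation factor (λ) was calculated. A widely used definition, assumed here for illustration, converts each P value to a chi-square statistic with 1 degree of freedom and divides the median observed statistic by the median expected under the null hypothesis (about 0.455). The sketch below implements that definition with the Python standard library; the function name is illustrative.

```python
from statistics import NormalDist, median

_STD_NORMAL = NormalDist()
# Median of the chi-square distribution with 1 degree of freedom (about 0.455).
CHI2_1DF_MEDIAN = _STD_NORMAL.inv_cdf(0.75) ** 2


def genomic_inflation_factor(p_values) -> float:
    """Median observed 1-df chi-square statistic divided by its null median."""
    chi2_stats = []
    for p in p_values:
        if not 0.0 < p <= 1.0:
            raise ValueError("P values must lie in the interval (0, 1]")
        # A two-sided P value p corresponds to z = inv_cdf(p / 2); chi2 = z ** 2.
        chi2_stats.append(_STD_NORMAL.inv_cdf(p / 2.0) ** 2)
    if not chi2_stats:
        raise ValueError("at least one P value is required")
    return median(chi2_stats) / CHI2_1DF_MEDIAN


# Evenly spaced P values mimic the null hypothesis, so lambda is close to 1.
null_like = [(i - 0.5) / 1001 for i in range(1, 1002)]
print(round(genomic_inflation_factor(null_like), 3))  # 1.0
```

Under this definition, a value near 1 indicates that the bulk of the test statistics follows the null distribution, whereas larger values indicate a systematic excess of small P values.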

Author contributions
Y. Yamada contributed to conception and design of the study; to acquisition, analysis, and interpretation of the data; and to drafting of the manuscript. J. Sakuma, I. Takeuchi, and Y. Yasukochi contributed to analysis and interpretation of the data as well as to revision of the manuscript. K. Kato, M. Oguri, T. Fujimaki, H. Horibe, M. Muramatsu, M. Sawabe, Y. Fujiwara, Y. Taniguchi, S. Obuchi, H. Kawai, S. Shinkai, S. Mori, and T. Arai contributed to acquisition of the data and to revision of the manuscript. M. Tanaka contributed to acquisition, analysis, and interpretation of the data as well as to revision of the manuscript. All authors approved submission of the final version of the article for publication.