Prevalence and spectrum of BRCA germline variants in mainland Chinese familial breast and ovarian cancer patients

Germline mutations in BRCA1 and BRCA2 are the most highly penetrant genetic predispositions for breast and ovarian cancer, and their presence is largely ethnic-specific. Comprehensive information about the prevalence and spectrum of BRCA mutations has been collected in European and North American populations. However, similar information is lacking in other populations, including the mainland Chinese population, despite its size of 1.4 billion, which accounts for one fifth of the world's population. Herein, we performed an extensive literature analysis to collect BRCA variants identified from mainland Chinese familial breast and ovarian cancer patients. We observed 137 distinct BRCA1 variants in 409 of 3,844 and 80 distinct BRCA2 variants in 157 of 3,024 mainland Chinese patients, with an estimated prevalence of 10.6% for BRCA1 and 5.2% for BRCA2. Of these variants, only 40.3% in BRCA1 and 42.5% in BRCA2 are listed in the current Breast Cancer Information Core database. We observed more frequent variation in BRCA1 exons 11A, 11C, 11D, and 24 and BRCA2 exon 10 in Chinese patients than in patients of other populations. The most common pathogenic variant in BRCA1 was c.981_982delAT in exon 11A, and in BRCA2 c.3195_3198delTAAT in exon 11B and c.5576_5579delTTAA in exon 11E; the most common novel variant in BRCA1 was c.919A>G in exon 10A, and in BRCA2 c.7142delC in exon 14. None of the variants overlap with the founder mutations in other populations. Our analysis indicates that the prevalence of BRCA variation in mainland Chinese familial breast and ovarian cancer patients is similar to that of other populations, but the spectrum is substantially different.


INTRODUCTION
BRCA1 and BRCA2 (BRCA) are rapidly evolving genes with high levels of variation across primate species [1][2][3]. Germline mutations in BRCA predispose individuals to breast and ovarian cancer [4][5]. Extensive efforts have been made to determine the prevalence and spectrum of germline mutations in both genes to aid clinical diagnosis and prevention of the disease [6][7][8].
BRCA mutations have been extensively analyzed in European and North American populations, but much less is known about them in Asian, African, and Latin American populations, although these account for most of the total human population. Using data from Western populations to interpret BRCA mutations in non-Western patients can be inaccurate and lead to misdiagnoses. Therefore, knowledge of ethnic-specific BRCA mutations is urgently needed and will be highly beneficial for patients.
Mainland China has a population of nearly 1.4 billion, accounting for one fifth of the human population worldwide. However, limited information about BRCA mutations in this large population is available in current BRCA variation databases. For example, only 13 of the 1,791 BRCA1 variants and three of the 2,000 BRCA2 variants in the Breast Cancer Information Core (BIC) database were derived exclusively from mainland Chinese patients [15]. We hypothesized that 1) BRCA variation may be common in this population, and 2) many variants representing potential mutations may have already been identified, but this information remains unknown outside the Chinese scientific community because many Chinese scientists publish in Chinese rather than in English and most Chinese medical and health science journals are not included in international journal databases [16]. To test our hypothesis, we performed an extensive survey of Chinese and English scientific literature to collect BRCA variant data derived solely from mainland Chinese familial breast and ovarian cancer patients (Figure 1).

Identification of publications
We identified 32 Chinese publications, including 24 peer-reviewed papers and eight graduate theses (Supplementary Table 1), and 11 peer-reviewed English papers. In total, these 43 publications, published between 2003 and 2015, reported BRCA variants from mainland Chinese familial breast and ovarian cancer patients.
From these publications, we identified familial breast cancer cases using the inclusion criteria described in each publication: at least one first-degree relative with breast cancer irrespective of age; breast cancer diagnosed before the age of 35 years with a family history of breast and/or ovarian cancer; at least one or two first- or second-degree relatives diagnosed with breast cancer at any age; at least three relatives affected by breast cancer or breast and ovarian cancer; triple-negative breast cancer patients diagnosed before the age of 45 years; bilateral breast cancer diagnosed before the age of 50 years; one or more primary breast/ovarian cancers in first- or second-degree relatives; and at least one relative with cancer other than breast and ovarian cancer that is known to be BRCA1-related. From these publications, we also collected the pedigree and genotype information available for family members, although most publications analyzed only the proband and did not provide such information (Supplementary Table 2).
We identified a total of 3,844 familial breast and ovarian cancer cases from the original studies. All of these were analyzed for BRCA1 (3,129 covered all exons), and 3,024 were analyzed for BRCA2 (2,854 covered all exons); 92% of the 3,844 cases were Han Chinese and the rest were from other ethnic groups (Hui, Mongol, Uyghur, Kazakh, and Russian) (Table 1). These studies were performed in 15 provinces or cities in mainland China, mostly in the densely populated, economically advanced eastern coast area, with the exception of the Xinjiang and Ningxia regions (Figure 2). This distribution highlights the need to analyze the population in regions not yet covered in order to fully determine the prevalence and spectrum of BRCA mutations in the entire mainland Chinese population.
Multiple assays, including heteroduplex formation, single-strand conformation polymorphism (SSCP), denaturing high-performance liquid chromatography (DHPLC), and Sanger sequencing, were used in the original studies. All BRCA1 and BRCA2 variants collected in our current study were identified either by direct Sanger sequencing or by Sanger sequencing validation of the results from other assays (Table 1).

BRCA variants identified from publications
By mining the variant data from the 3,844 cases, we identified a total of 137 distinct BRCA1 variants in 409 cases, and 80 distinct BRCA2 variants in 157 cases (Table 2, Supplementary Table 3; Table 3, Supplementary Table 4). Of the 137 BRCA1 variants, 33 (24.6%) were detected

Prevalence assessment
The prevalence of variant carriers was 10.6% (409/3,844) for BRCA1 and 5.2% (157/3,024) for BRCA2 (all 3,844 cases were analyzed for BRCA1, whereas 3,024 were analyzed for BRCA2). The number of cases in which all exons were analyzed was 3,129 for BRCA1 and 2,854 for BRCA2; thus, the BRCA2 group amounted to 91.2% of the BRCA1 group (2,854/3,129). Therefore, the difference in prevalence between BRCA1 and BRCA2 variation is unlikely to be caused by the analysis of different cases in each group; instead, it reflects a higher prevalence of BRCA1 than BRCA2 variation in the Chinese population. This pattern differs from that in the neighboring Korean population, which has a much higher prevalence of BRCA2 variation than BRCA1 variation [14]. The variation types included frameshift, nonsense, missense, and splicing changes. With few exceptions, the variants have no frequency information in genome databases, indicating that they are mostly rare in the human population (Supplementary Table 3, Supplementary Table 4).
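The arithmetic behind these figures can be reproduced with a minimal sketch such as the one below. The counts are those reported in the text; the dictionary layout and the function name `prevalence` are illustrative choices, not part of the original analysis.

```python
# Carrier counts and cohort sizes as reported in the text.
cohorts = {
    "BRCA1": {"carriers": 409, "tested": 3844, "all_exons": 3129},
    "BRCA2": {"carriers": 157, "tested": 3024, "all_exons": 2854},
}


def prevalence(carriers, tested):
    """Return carrier prevalence as a percentage of tested cases."""
    return 100.0 * carriers / tested


for gene, counts in cohorts.items():
    pct = prevalence(counts["carriers"], counts["tested"])
    print(f"{gene}: {pct:.1f}% ({counts['carriers']}/{counts['tested']})")

# Size of the BRCA2 all-exon group relative to the BRCA1 all-exon group.
coverage_ratio = 100.0 * cohorts["BRCA2"]["all_exons"] / cohorts["BRCA1"]["all_exons"]
print(f"BRCA2/BRCA1 all-exon cases: {coverage_ratio:.1f}%")
```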

Exon distribution of BRCA variants between Chinese and other patient populations
We compared the exon distribution frequencies of BRCA variations between mainland Chinese patients and the other patient populations represented in the BIC database. For each exon, we compared the ratio calculated as: number of variation cases in the exon / total number of variation cases in the data set. The total number of variation cases (entries) in the BIC dataset was 15,311 for BRCA1 [61] and 14,914 for BRCA2 [62]; the total number of variation cases in this study was 409 for BRCA1 and 157 for BRCA2. The results showed that the distribution frequencies in 13 of 24 BRCA1 exons were significantly different between mainland Chinese and BIC populations (Figure 3A). Variants in mainland Chinese patients were notably less frequent in exons 2 and 20 but more frequent in exons 11A, 11C, and 11D (exon 11 is divided into 11A to 11D by the BIC database because of its large size) and exon 24 than in other populations. The variants in BRCA1 exons 11A, 11C, 11D, and 24 occurred in 299 of the 409 (73.1%) Chinese BRCA1-variation cases. In BRCA2, the differences were smaller, with only 6 of 27 exons showing a significant difference between mainland Chinese and BIC populations. Exon 10 had the highest frequency in mainland Chinese patients, with 44 of the 157 (28%) Chinese BRCA2-variation cases (Figure 3B). Therefore, BRCA1 exons 11A, 11C, 11D, and 24 and BRCA2 exon 10 are the variation hot spots in mainland Chinese patients.
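As an illustration of this per-exon comparison, the sketch below computes the exon ratio and a Pearson chi-square test on a 2x2 table (cases in the exon versus cases elsewhere, Chinese versus BIC). It is a simplified stand-in, not the original analysis: it omits the Fisher exact test and any continuity correction, and the BIC exon 10 count `bic_exon10` is a hypothetical placeholder because the per-exon BIC counts are not given in the text.

```python
import math


def exon_ratio(cases_in_exon, total_cases):
    """Fraction of all variation cases that fall in one exon."""
    return cases_in_exon / total_cases


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns (statistic, p_value) for 1 degree of freedom.
    All four marginal totals must be non-zero.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Reported in the text: BRCA2 exon 10 in 44 of 157 Chinese variation cases,
# and 14,914 BRCA2 variation cases in total in BIC.
chinese_exon10, chinese_total = 44, 157
bic_total = 14914
bic_exon10 = 1500  # HYPOTHETICAL placeholder, not taken from the paper

stat, p_value = chi_square_2x2(
    chinese_exon10, chinese_total - chinese_exon10,
    bic_exon10, bic_total - bic_exon10,
)
print(f"Chinese ratio: {exon_ratio(chinese_exon10, chinese_total):.3f}")
print(f"BIC ratio (placeholder count): {exon_ratio(bic_exon10, bic_total):.3f}")
print(f"chi-square = {stat:.2f}, p = {p_value:.3g}")
```

For exons with small expected counts, an exact test would be more appropriate than this approximation.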

BIC-matched variants
Fifty-six (40.3%) BRCA1 and 34 (42.5%) BRCA2 variants exist in the BIC database (Figure 4). Of these, 27 BRCA1 and 23 BRCA2 variants are classified by BIC as Class 5 (Pathogenic), 27 BRCA1 and 9 BRCA2 variants as Pending [most were variants of unknown significance (VUS)], and two BRCA1 and two BRCA2 variants as Class 1 (Benign). The most common pathogenic BRCA1 variant was c.981_982delAT (p.Cys328*) in exon 11A (n = 18), confirming the previous observation in a smaller group of patients [30]. The frequency of this variant was substantially higher in mainland Chinese than in non-   3A). The most common Pathogenic BRCA2 variants were c.3195_3198delTAAT (p.Asn1066Leufs*10) in exon 11B (n = 5) and c.5576_5579delTTAA (p.Ile1859Lysfs*3) in exon 11E (n = 5), and the most common Pending variant was c.865A>C (p.Asn289His) in exon 10 (n = 13; frequency in 1000 Genomes: 0.0737). Except for the BRCA1 c.981_982delAT variant, the other known pathogenic and Pending variants in either BRCA1 or BRCA2 are unlikely to be founder mutation candidates among mainland Chinese patients because of their lower prevalence or their higher frequency in the normal population.
In conclusion, our study indicates that BRCA variations are common in mainland Chinese familial breast and ovarian cancer patients. The absence of such information in current international BRCA databases appears largely to reflect poor communication between the Western and Chinese scientific communities. Our study also indicates that, while the prevalence of BRCA variation is similar to that of other populations, the spectrum of BRCA variation in Chinese patients differs substantially, with hot spots in BRCA1 exons 11A, 11C, 11D, and 24 and BRCA2 exon 10. Except for c.981_982delAT in BRCA1 exon 11A, there is no strong evidence for the presence of common founder BRCA mutations in mainland Chinese patients, although such a possibility may exist in certain subpopulations of specific geographic regions or ethnic groups in mainland China.

Information sources
We searched two major Chinese scientific databases, China National Knowledge Infrastructure (CNKI) [63] and WanFang [16], which comprehensively collect information from Chinese academic journals, dissertations, conference proceedings, and patents, using the key words "breast cancer", "BRCA1 mutation", and "BRCA2 mutation" in Chinese characters. From the identified publications, we excluded those on sporadic breast cancer, on animals, on male patients, and on non-mainland Chinese patients, as well as those on patients marked with "early diagnosis", "triple-negative", or "bilateral" but without age indication. Using similar approaches but in English, we also searched the PubMed database to identify non-Chinese publications reporting BRCA mutations from mainland Chinese patients (Figure 1).
We applied multiple steps to ensure the reliability of the identified variants: 1) only including variants detected or validated by Sanger sequencing; 2) re-annotating all variants following HGVS nomenclature using the reference sequences U14680 for BRCA1 and U43746 for BRCA2, regardless of the original annotation; 3) using the BIC database (13-Mar-2015 version) as a reference to classify variants as known variants with BIC designation or novel variants without BIC designation; 4) excluding synonymous and un-interpretable variants from the analysis; and 5) annotating novel variants by referring to their effects on coding changes in BRCA1 and BRCA2. We used U14680 and U43746 as the reference sequences because they were used as the standard references by most of the cited publications and by the BIC database. However, different BRCA databases may use different BRCA reference sequences, which can generate differences for certain variants. For example, the ClinVar database uses NM_007294 and NM_000059 as the references for BRCA1 and BRCA2 [64]. To facilitate comparison with BRCA variants annotated by ClinVar, we also included the variants annotated using these two references (Supplementary Tables 3, 4).
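The filtering and classification steps (1, 3, and 4 above) can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure actually used: the `Variant` record, the `curate` function, the effect labels, and the placeholder entries are all hypothetical, and the `bic_known` set stands in for a lookup against the BIC database.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    gene: str               # "BRCA1" or "BRCA2"
    hgvs: str               # HGVS annotation on U14680 / U43746
    effect: str             # illustrative label, e.g. "frameshift", "missense"
    sanger_confirmed: bool  # detected or validated by Sanger sequencing


# Step 4: effect categories excluded from the analysis.
EXCLUDED_EFFECTS = {"synonymous", "uninterpretable"}


def curate(variants, bic_known):
    """Split reliable variants into BIC-known and novel lists.

    `bic_known` is a set of (gene, hgvs) pairs standing in for the BIC database.
    """
    known, novel = [], []
    for v in variants:
        if not v.sanger_confirmed:        # step 1
            continue
        if v.effect in EXCLUDED_EFFECTS:  # step 4
            continue
        if (v.gene, v.hgvs) in bic_known:  # step 3
            known.append(v)
        else:
            novel.append(v)
    return known, novel


# Example records: the first two variants are named in the text (effect labels
# are illustrative); the last two are hypothetical placeholders.
records = [
    Variant("BRCA1", "c.981_982delAT", "frameshift", True),
    Variant("BRCA1", "c.919A>G", "missense", True),
    Variant("BRCA1", "placeholder_synonymous", "synonymous", True),
    Variant("BRCA2", "placeholder_unvalidated", "missense", False),
]
bic_known = {("BRCA1", "c.981_982delAT")}

known, novel = curate(records, bic_known)
print("BIC-known:", [v.hgvs for v in known])
print("Novel:", [v.hgvs for v in novel])
```

Keying the lookup on the (gene, annotation) pair rather than on the annotation alone avoids collisions between BRCA1 and BRCA2 variants that share the same coding position.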

Figure 1: Outline of the study. It shows the steps taken to extract information about BRCA variation in mainland Chinese familial breast and ovarian cancer patients.

Figure 2: Geographic locations of the original studies. The original studies were performed in 15 provinces and cities in mainland China. Of these, 13 were in the east coast area, populated mainly by Han Chinese, and two were in Xinjiang and Ningxia, home to other ethnic groups.

Figure 3: Comparison of exon distribution frequencies of BRCA variation between mainland Chinese and BIC populations. Relative ratios between these two datasets were used for the comparison (see text for details). Chi-square (χ²) and Fisher exact tests were used for statistical analysis. "*" refers to p < 0.05 (actual P values are listed in Supplementary Table 5). A. Variant distribution in BRCA1. B. Variant distribution in BRCA2.

Figure 4: Matching BRCA variants to the BIC database. The 137 BRCA1 and 80 BRCA2 distinct variants from mainland Chinese patients were compared with the 1,781 BRCA1 and 2,000 BRCA2 distinct variants in the BIC database. Of the Chinese variants, 56 BRCA1 and 34 BRCA2 variants were matched, whereas 82 BRCA1 and 46 BRCA2 variants were not.

Table 2: Examples of BRCA1 variants identified in mainland Chinese familial breast and ovarian cancer patients*
Columns: Class (BIC); Exon; HGVS annotation; Variation type; Total case; Carrier
* The table lists the variants detected in at least two cases in each class.