Sequence conservation of mitochondrial (mt)DNA during expansion of clonal mammary epithelial populations suggests a common mtDNA template in CzechII mice

One major foundation of cancer etiology is the process of clonal expansion. The mechanisms by which a single cell gives rise to a clonally dominant tumor are poorly understood. Our study analyzes mitochondrial DNA (mtDNA) for somatic single nucleotide polymorphism (SNP) variants to determine whether they are conserved throughout clonal expansion in mammary tissues and tumors. To test this hypothesis, we took advantage of a mouse mammary tumor virus (MMTV)-infected mouse model (CzechII). CzechII mouse mtDNA was extracted from snap-frozen normal, hyperplastic, and tumor mammary epithelial outgrowth fragments. Next generation deep sequencing was used to determine whether mtDNA “de novo” SNP variants are conserved during serial transplantation of both normal and neoplastic mammary clones. Our results support the conclusion that mtDNA “de novo” SNP variants are selected for and maintained during serial passaging of clonal, phenotypically heterogeneous normal cellular populations, neoplastic cellular populations, and metastatic clonal cellular populations, and in individual tumor transplants grown from the original metastatic tumor. In one case, a mammary tumor arising from a single cell within a clonal hyperplastic outgrowth contained only mtDNA copies harboring a deleterious “de novo” SNP variant, suggesting that a single mtDNA copy may act as the template for all mtDNA copies regardless of cell phenotype. This process has been attributed to “heteroplasmic shifting”, which is thought to result from selective pressure and may be responsible for pathogenic mutated mtDNA copies becoming homogeneous in clonally dominant oncogenic tissues.


INTRODUCTION
Since the advent of rapid sequencing technology, it has become increasingly practical for researchers to investigate the evolutionary history of individual tumors. Assessing tumor clonality in heterogeneous populations requires stable, reliable biomarkers that remain conserved throughout clonal expansion. Early methods for assessing tumor clonality used X chromosome-linked markers such as glucose 6 phosphate dehydrogenase (G6PD) [1], phosphoglycerate kinase (PGK) [2], and the human androgen receptor (HUMARA) [3]. Because these methods rely on the random inactivation of the X chromosome, a process known as lyonization, they are accurate and useful only for assessment in females. Another method analyzed and compared loss of heterozygosity (LOH) in microsatellite regions of chromosomes [4]. More recently, analysis of mitochondrial DNA as another molecular genomic marker has attracted interest. Several earlier studies focused on the D-loop (displacement loop) in the control region of the mtDNA, specifically on a mononucleotide cytosine repeat in hypervariable region 2 (HVR2) known as D310. The D310 repeat has shown promise in assessing clonality in a variety of cancers, including lung cancers [5], head and neck cancers [5], and several other solid tumors. However, because D310 lies in a hypervariable region, where increased variation persists, its value as a stable biomarker remains inconclusive.
Our study analyzes mtDNA sequence conservation or alteration in the clonal progeny of a single cell, which was identified by retroviral genomic insertions and followed through various stages of carcinogenic progression. The CzechII mouse represents a unique model system for determining the role of MMTV in promoting mammary tumorigenesis [6], because MMTV sequences are absent from CzechII mouse genomic DNA. The CzechII colony is the only mouse line, inbred or otherwise, that is negative for MMTV DNA sequences in its germline DNA. As a result, all MMTV DNA insertion events can be mapped within the mammary somatic genome, whether or not oncogenic transformation results. Thus, it was possible to include non-tumorigenic "normal" MMTV-infected mammary clones, as well as premalignant and malignant mammary clones, in our study.

Isolation of mitochondrial DNA from snap frozen tissue, utilizing Qiagen Qproteome and DNeasy isolation kits
A schematic shows our method of using two Qiagen isolation kits, Qproteome and DNeasy, to isolate intact mitochondria and mitochondrial DNA from snap frozen starting material (Supplementary Figure 1A, 1B). After isolation, we validated the presence of mitochondrial DNA by PCR. Primer sets were designed to target 4 arbitrary regions in the mouse mitochondrial genome: Trn-Pro, Co3, Rnr1 and D-Loop (Figure 1A). On a 1% agarose gel, the amplified DNA fragments migrated at the expected lengths: Trn-Pro (950 bp), Co3 (805 bp), Rnr1 (619 bp) and D-Loop (513 bp) (Figure 1B). To ensure that the mitochondrial DNA was not fragmented, we performed a Genomic DNA ScreenTape assay to measure mitochondrial DNA integrity. The results show that the mitochondrial DNA is approximately 16,000 bp in length, with a DNA integrity number (DIN) of 7.3 (Figure 1C).
Mitochondrial DNA bioinformatic analysis and phylogenetic mapping reveal the distance of the CzechII mouse from the mtDNA of 39 common mouse strains
Bioinformatic analysis of CzechII mtDNA enabled phylogenetic mapping. CzechII mtDNA was compared and genotyped against 39 common mouse strains (e.g., C57 black and Balb/C) using data from the Sanger Mouse Genome Project. Phylogenetic network construction, based on mtDNA analysis and genotyping, revealed a clear separation between CzechII mtDNA and that of the common mouse strains (Figure 2A, 2B).
Next generation deep sequencing identifies 2 "de novo" SNP variants conserved in R12 MT1 tumor transplants and lung metastasis
Next generation sequencing, with a mitochondrial genome coverage of ≥50× at ≈98% (Supplementary Table 1), was performed on normal mammary generational outgrowths of R12 and L12. SNP variant calling identified CzechII mt-SNPs, which were filtered to reveal minor somatic SNP variants throughout CzechII normal mammary outgrowths. R12 normal outgrowths revealed no significant change in the mitochondrial genome compared to the controls, mtDNA isolated from snap frozen liver and lactating mammary gland. The same conservation pattern, i.e., wild type, was observed in the mitochondria extracted and sequenced from multiple (n = 5) L12 second transplant generation lactating outgrowths, in agreement with the results obtained with mtDNA isolated from lactating R12 serial transplants (data not shown).
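The coverage figure quoted above (≥50× at ≈98%) is a breadth-of-coverage statement: the fraction of mitochondrial genome positions whose read depth reaches a threshold. The following is a minimal sketch of that calculation, assuming per-base depths are already available as a list; the function name and the toy depths are hypothetical and are not study data.

```python
def breadth_of_coverage(depths, min_depth=50):
    """Return the fraction of positions whose read depth is at least min_depth."""
    if not depths:
        raise ValueError("depths must not be empty")
    covered = sum(1 for depth in depths if depth >= min_depth)
    return covered / len(depths)


# Hypothetical per-base depths for a toy 10-position genome (not study data).
toy_depths = [120, 95, 60, 50, 49, 300, 210, 75, 51, 88]
fraction = breadth_of_coverage(toy_depths, min_depth=50)
print(f"{fraction:.0%} of positions at >=50x")
```

In practice the depth list would come from a per-base depth report produced by an alignment tool; that step is outside this sketch.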
A 3D line graph (Figure 3) shows the appearance and conservation of two "de novo" SNP variants, mt-ND1 3695 AC>A and mt-ND5 12871 G>A, which were not present in normal R12 or controls but appeared in subsequent R12 tumor clone transplants. Next generation sequencing data reveal the mt-ND1 SNP variant, 3695 AC>A, and the mt-ND5 SNP variant, 12871 G>A, in a primary R12 tumor, each at a frequency of approximately 17%. In R12 tumor transplants, the mt-ND1 SNP variant, 3695 AC>A, increased in frequency to approximately 45% to 55% and remained conserved throughout the samples. In contrast, the mt-ND5 SNP variant, 12871 G>A, remained at a frequency of 17% in all the tumor transplants. Neither variant was found in mtDNA isolated from an unrelated CzechII mammary tumor arising in the same mouse. The R12 lung metastatic tumor transplants show a similar pattern of increased frequency for the mt-ND1 SNP variant, 3695 AC>A: compared to control, it increased to approximately 45% to 75% and was conserved throughout the samples. As in the preceding tumor transplants, the mt-ND5 SNP variant, 12871 G>A, remained conserved at the same frequency (17%) (Figure 4).
In silico analysis of mt-ND1 (3695 AC>A) and mt-ND5 (12871 G>A) in CzechII R12 tumor transplants examined SNP variant consequence, SNP variant impact, protein position, amino acid changes, and the SIFT prediction of the impact of the resulting amino acid changes on protein function. The conserved mt-ND1 SNP variant, 3695 AC>A, was shown to be a deleterious frameshift mutation with a high impact on the DNA sequence, leading to a truncated or non-functional ND1 protein. The mt-ND5 SNP variant, 12871 G>A, was shown to be a tolerated missense mutation and was predicted to have a moderate impact on the DNA sequence that may affect ND5 protein function (Table 1).
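The frequencies reported for the R12 variants are variant allele (heteroplasmy) fractions: reads carrying the variant divided by all reads covering that position. A minimal sketch of that calculation and of a simple "conserved across samples" check follows. The thresholds `min_freq` and `max_spread` and the read counts are illustrative assumptions, not values or criteria taken from the study.

```python
def variant_frequency(alt_reads, total_reads):
    """Fraction of reads at a position that carry the variant allele."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return alt_reads / total_reads


def is_conserved(frequencies, min_freq=0.01, max_spread=0.10):
    """True if the variant is present in every sample and its frequency is stable.

    min_freq and max_spread are illustrative thresholds, not values from the study.
    """
    if not frequencies:
        raise ValueError("frequencies must not be empty")
    present = min(frequencies) >= min_freq
    stable = (max(frequencies) - min(frequencies)) <= max_spread
    return present and stable


# Hypothetical (alt_reads, total_reads) pairs for one variant in three samples.
toy_counts = [(170, 1000), (165, 1000), (180, 1000)]
toy_freqs = [variant_frequency(alt, total) for alt, total in toy_counts]
print([round(f, 3) for f in toy_freqs], is_conserved(toy_freqs))
```

A variant that stays near one frequency across transplants passes this check, whereas one that rises from a low to a high frequency does not, which is the distinction the text draws between the ND5 and ND1 variants.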

Figure 4: CzechII mammary R12 metastatic tumor mtDNA SNP variant calling via Next Generation sequencing.
Next generation sequencing was performed on R12 mammary tumor and 5 CzechII mammary R12 serially transplanted metastatic tumor fragments from R12 tumor. SNP variant calling was performed to analyze common somatic SNPs that were conserved across R12 metastatic tumor fragments in comparison to CzechII lactating mammary gland (negative control) and CzechII primary R12 (positive control).

Figure 3: Next generation sequencing was performed on R12 mammary tumor and the tumors from 7 serially transplanted CzechII mammary R12 tumor fragments. SNP variant calling was performed to analyze common somatic SNPs that were conserved across R12 tumor fragments in comparison to CzechII lactating mammary gland (negative control) and CzechII primary R12 (positive control). Samples were run in duplicate.

Next generation deep sequencing identifies 2 "de novo" SNPs conserved in CZN5 hyperplasia to tumorigenesis
MtDNA was isolated from CZN5, a premalignant CzechII clonally derived outgrowth line. Next generation deep sequencing, with a mitochondrial genome coverage of ≥50× at ≈98% (Supplementary Table 1), was performed to analyze somatic SNP variants that were conserved in the CZN5 hyperplasia and in the mammary tumors that stochastically developed in this population during its serial passage. A 3D line graph (Figure 5) illustrates two "de novo" SNP variants, mt-ND1 (3274 T>TA) and mt-Co1 (1017 G>T), that developed in CZN5 mammary tumors and did not appear in the lactating mammary gland control. The mt-ND1 SNP variant (3274 T>TA) was at 0% frequency in the antecedent CZN5 hyperplasia but became fixed at 100% in the succeeding CZN5 tumor 1. The second SNP variant, mt-Co1 (1017 G>T), remained at 0% frequency throughout CZN5 hyperplasia and CZN5 tumor 1.
The two SNP variants were also examined in CZN5 tumor 2. Here, the mt-ND1 SNP variant (3274 T>TA) displayed 0% frequency in the antecedent hyperplasia and remained at that level throughout tumor development. Interestingly, the mt-Co1 SNP variant (1017 G>T) appeared at 0.8% frequency in CZN5 hyperplasia and later increased to 14% frequency in CZN5 tumor 2 (Figure 6).

Figure 5: CzechII mammary CZN5 tumor 1 mtDNA SNP variant calling via Next Generation sequencing.
Next Generation sequencing was performed on CZN5 tumor 1 that arose from a CzechII CZN5 hyperplasia. SNP variant calling was performed to analyze common somatic SNPs that were conserved across CZN5 hyperplasia and tumor outgrowth, in comparison to CzechII lactating mammary gland control.
In silico analysis of mt-ND1 (3274 T>TA) and mt-Co1 (1017 G>T) in the CzechII CZN5 tumor 1 samples identified SNP variant consequence, SNP variant impact, protein position, amino acid changes, and the SIFT prediction of the impact of the resulting amino acid changes on protein function. The conserved mt-ND1 variant (3274 T>TA) was shown to be a deleterious frameshift mutation with a high impact on the DNA sequence that may lead to a non-functional ND1 protein. Mt-Co1 (1017 G>T) was classified as a modifier because of its position outside the coding region; no functional consequence was predicted for this SNP variant (Table 1).
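Both mt-ND1 variants are written as single-base length changes (3695 AC>A and 3274 T>TA), which is why they are called frameshifts: an insertion or deletion whose length is not a multiple of three shifts the reading frame. The sketch below shows only that length rule. It assumes the variant lies inside a protein-coding sequence and does not reproduce the annotation or SIFT tooling used in the study; the function name is hypothetical.

```python
def classify_length_change(ref, alt):
    """Classify a variant by the length difference between ref and alt alleles.

    Assumes the variant lies within a protein-coding sequence.
    """
    delta = len(alt) - len(ref)
    if delta == 0:
        return "substitution"
    kind = "insertion" if delta > 0 else "deletion"
    # A length change that is a multiple of 3 keeps the reading frame intact.
    frame = "in-frame" if delta % 3 == 0 else "frameshift"
    return f"{frame} {kind}"


for ref, alt in [("AC", "A"), ("T", "TA"), ("G", "A")]:
    print(f"{ref}>{alt}: {classify_length_change(ref, alt)}")
```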
R12 and CZN5 tumors, harboring "de novo" SNP variants (3695 AC>A) and (3274 T>TA) respectively, reveal a significant decrease in mt-ND1 gene expression
Next generation sequencing identified two different deleterious "de novo" mt-ND1 SNP variants in R12 and CZN5 tumors. We investigated the deleterious effects of these SNPs on ND1 to help elucidate the potential need for the mutation in mammary tumorigenesis. In silico analysis revealed these SNP variants to be deleterious frameshift mutations that highly impact the mt-ND1 sequence and were predicted to decrease mt-ND1 gene expression. To validate the predicted impact of the SNP variants on the mt-ND1 sequence, ddPCR analysis was performed on the R12 and CZN5 tumors harboring the two "de novo" SNP variants. Statistical analysis of the ddPCR mt-ND1 gene expression results revealed a 2.5-fold decrease in CZN5 tumor 1 and a 4-fold decrease in CzechII R12, compared to CzechII liver (control), Bonferroni-corrected p < .05 (Figure 7A).
R12 and CZN5 tumors, harboring "de novo" SNP variants (3695 AC>A) and (3274 T>TA) respectively, reveal a significant decrease in mt-ND1 protein expression
Next generation sequencing identified two different "de novo" mt-ND1 SNP mutations in R12 and CZN5 tumors. In silico analysis revealed these SNP mutations to be deleterious frameshift mutations that highly impacted the mt-ND1 sequence. To validate the impact of the mutations on ND1 protein expression, a western blot was performed on R12 and CZN5 tumors harboring the two "de novo" SNP mutations. β-actin was used as an endogenous control. Lane 1 shows the control, CzechII liver, with normal expression of ND1. In the R12 tumor, by comparison, the absence of a fluorescent band indicates complete ablation of ND1 protein expression. In addition, attenuation of band fluorescence in CZN5 indicates that ND1 protein expression is significantly reduced (Figure 7B).

DISCUSSION
The potential of mtDNA as a molecular biomarker for determining cancer clonality in breast cancer, as in other malignant tumors, has been widely investigated [9][10][11][12][13][14]. The attractiveness of mtDNA as a clonal molecular biomarker may derive from its forensic applications [15] and its uses in phylogenetic and evolutionary mapping [16]. Its diverse applications are due, in part, to mitochondrial evolution, which produced sequences that remain tightly conserved among species (e.g., 12S mt-rDNA) and between species (e.g., 16S mt-rDNA and mt-Cyt b) [17]. In addition, mtDNA possesses highly varied sequences in the control region, known as hypervariable regions 1 and 2, which vary greatly between species. Several early studies investigated the control and hypervariable regions, probably because of the higher frequency of non-functional mutations that do not impact the coding region and therefore create no selective pressure to degrade the mutated mtDNA [18]. However, in doing so, the mechanism that may underlie "selective pressure" and mitochondrial homoplasmy, and thereby engender sub clonal dominant clones, would be ignored.

Figure 6: Next Generation sequencing was performed on CZN5 tumor 2 that arose from a CzechII CZN5 hyperplasia. SNP variant calling was performed to analyze common somatic SNPs that were conserved across CZN5 hyperplasia and tumor outgrowth, in comparison to CzechII lactating mammary gland control.

To test whether mtDNA harbored somatic "de novo" SNP variants that could be detected in clonal populations during clonal expansion, we isolated and extracted mtDNA and performed next generation deep sequencing and bioinformatic variant calling.
An important aspect of our current analyses of mtDNA is that all tissues were clonal. In other words, R12 and L12, both apparently normal mammary outgrowths, were shown to be derived from a single cell by Southern blot analyses for MMTV DNA insertions, which are known to be random [7,8]. All the MMTV-induced mammary premalignant outgrowths (HOG) lines were also shown to be derived from a single antecedent by virtue of Southern blots showing identical MMTV DNA insertions at each passage [6]. The mammary tumors arising within the premalignant outgrowths arise from single cells and share all the MMTV DNA insertions found in each individual premalignant outgrowth line [19]. Therefore, all of the mammary mtDNAs, except the non-clonal lactating mammary gland and liver, are derived from the progeny of a single cellular antecedent. This is important because, in effect, mtDNA in the progeny from a single cell is analyzed from normality through progression to full malignancy.
In the CZN5 hyperplasia (HOG), SNP variant mt-ND1 (3274 T>TA) was undetected. However, in the tumor that arose from it, CZN5 tumor 1, the mt-ND1 (3274 T>TA) mutation frequency rose to 100%, demonstrating that all mtDNA copies from this tumor tested by NGS possessed the mt-ND1 (3274 T>TA) SNP variant at the same position in the mitochondrial coding region. Coller and colleagues [20] showed, through computer simulation, that mtDNA mutations in a tumor progenitor cell could lead to a homoplasmic state in human tumors by clonal cell proliferation.
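To illustrate how a mutant mtDNA fraction can drift toward 0% or 100% within a single cell lineage, the sketch below resamples a fixed pool of mtDNA copies at each division. This is a generic random-sampling toy model under stated assumptions (fixed copy number, no selection, one lineage followed); it is not the simulation of reference [20], and all parameter values are hypothetical.

```python
import random


def simulate_lineage(n_copies, start_mutant, generations, seed=0):
    """Follow the mutant mtDNA fraction along one cell lineage.

    At each division the daughter's n_copies genomes are drawn at random,
    with replacement, from the parent's pool (no selection is applied).
    """
    if not 0 <= start_mutant <= n_copies:
        raise ValueError("start_mutant must be between 0 and n_copies")
    rng = random.Random(seed)
    mutant = start_mutant
    history = [mutant / n_copies]
    for _generation in range(generations):
        p = mutant / n_copies
        # Each new copy is templated from a mutant genome with probability p.
        mutant = sum(1 for _copy in range(n_copies) if rng.random() < p)
        history.append(mutant / n_copies)
    return history


# Hypothetical parameters: 100 copies per cell, 17 mutant at the start.
trajectory = simulate_lineage(n_copies=100, start_mutant=17, generations=200, seed=1)
print(f"start={trajectory[0]:.2f} end={trajectory[-1]:.2f}")
```

In this model the states 0% and 100% are absorbing: once a lineage reaches either, it stays there, which is one way a clone can become homoplasmic without any selective advantage.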
Investigating the impact of mt-ND1 (3274 T>TA) on ND1 function further elucidates the potential need for the mutation in the progression of CZN5 hyperplasia to tumorigenesis. Our bioinformatic result suggests that ND1 (3274 T>TA) is a frameshift mutation that generates a premature stop site in the mtDNA coding sequence for ND1. Digital droplet PCR (ddPCR) confirmed a 2.5-fold decrease in ND1 mRNA expression compared to its control, CzechII liver mitochondria. This was verified by western blot, which showed attenuation of the ND1 protein band intensity. ND1 dysfunction in complex I of the electron transport chain (ETC) of the mitochondria has been documented to be associated with tumor growth [21][22][23]. It is important to recognize that this tumor population developed from a single "tumor progenitor" cell in the CZN5 hyperplastic population. Therefore, although ND1 (3274 T>TA) may be important in CZN5 tumor 1 through positive selection, it may have arisen randomly during clonal expansion due to "heteroplasmic shifting".

Figure 7: [...] polymerase chain reaction (ddPCR) on CzechII R12 tumor and CZN5 tumor 1. Mt-ATP5f1 was used as endogenous control. CzechII R12 tumor 1 and CZN5 tumor 1 Nd1 relative gene expression was compared to CzechII liver control. Statistical analysis was performed using ANOVA and post-hoc Student's t-test, Bonferroni correction P < .05. Samples were run in triplicate. Protein expression of mt-Nd1 in Czech tumor samples. Mt-ND1 protein expression was measured by performing a western blot on CzechII R12 tumor and CZN5 tumor 1 (A). β-actin was used as endogenous control. CzechII R12 tumor 1 and CZN5 tumor 1 ND1 protein expression was compared to CzechII liver (B).

The CZN5 hyperplasia tested harbored a SNP variant in its mtDNA, Cox1 (1017 G>T), which was detected at 0.7%. However, in CZN5 tumor 2, that SNP frequency increased to 14.0%. This is indicative of "heteroplasmic shifting". Alternatively, since the tumor is composed largely of epithelium whereas the HOG is a mixture of stroma and epithelium, it may simply reflect the increased mtDNA contribution of the epithelium. Additional CZN5-derived mammary tumor mtDNAs that were analyzed were devoid of any SNP found in mtDNA from the control CzechII tissues (data not shown). This supports randomness in clonal expansion, evident in each "tumor progenitor" cell within the premalignant population.
The normal R12 outgrowth gave rise to a single tumor (R12 MT1). The mtDNA from this tumor possessed two "de novo" SNP variants, ND1 (3695 AC>A) and ND5 (12781 G>A). Each was present in approximately 17% of the mtDNA sequences analyzed in the primary tumor. However, in the R12 tumor transplants and lung metastases, the frequency of the ND1 SNP variant increased dramatically, whereas the ND5 SNP variant remained at approximately the same frequency as in MT1. Therefore, these SNPs appear to reside in separate mtDNA genomes. Our bioinformatic analysis indicates that ND1 (3695 AC>A) is a frameshift mutation that introduces a premature stop site in the mtDNA coding sequence. The SNP variant in ND5 was predicted to be a tolerated missense mutation, which may not impact protein function but may be involved in tumor progression [24]. A clear explanation for two separate mtDNA mutations in a clonally derived tumor is challenging unless one considers that one SNP preceded the second during the clonal expansion of the tumor population. This could result in two independent tumor sub clones, each characterized by a different mtDNA SNP, i.e., ND1 (3695 AC>A) or ND5 (12781 G>A). In the lung metastases and tumor transplants, the sub clone bearing the ND1 SNP became the larger contributor of mtDNA copies. Whether this is due to a selective advantage for growth in distal sites is unclear. However, ND1 (3695 AC>A) is a SNP of interest because of its classification as a frameshift mutation and its deleterious impact on ND1 protein function, which may suggest a selective advantage for cells that harbor the mutation and later proliferate into a sub clonal dominant tumor.
Digital Droplet PCR (ddPCR) showed a 4.2-fold decrease in ND1 mRNA compared to its control, CzechII liver mtND1 mRNA. The deleterious impact of ND1 (3695 AC>A) on the mitochondrial sequence, leading to dysfunction in the electron transport chain (ETC), may be important in this sub clone for its proliferation in distal sites.
There are currently three leading theories of the mechanism of mtDNA inheritance: (1) variation in heteroplasmy is due to unequal segregation of mtDNA during cell division, (2) variation in heteroplasmy is due to unequal segregation of mtDNA nucleoids during cell division, and (3) variation in heteroplasmy is due to the selective replication of a specific sub-population of mtDNA [25]. Our results argue that mitochondrial DNA synthesis must be very precise, because one seldom detects a specific SNP in the mtDNA sequence. Only the tumors arising within clonal outgrowths, which are clones themselves, possess "de novo" SNPs. These are detected at identical positions in their mtDNA sequence, and the nucleotide changed is always the same. This argues that, for these mtDNAs, there is extreme fidelity in their replication. There are two distinct SNPs in MT1 of the R12 outgrowth mtDNAs: one is found in the coding region of the ND1 gene and the other in the coding region of the ND5 gene. This suggests the presence of two distinct cellular sub clones containing mutated mtDNA within the MT1 tumor: one sub clone with only the ND1 SNP in its mitochondrial DNA and one sub clone with the ND5 SNP. The increase in the frequency of the ND1 SNP in the mtDNAs isolated from the lung metastases and from the tumor transplants of MT1 can be explained by the selective increase of the first sub clone. The best example of the fidelity of mtDNA replication was found in CZN5 MT1, where all the analyzed mtDNA sequences contained a SNP at the identical position in the ND1 coding gene. These data suggest that a single mutated mtDNA genome acts as the template for all others in any given clone during its expansion. We realize that this hypothesis for mtDNA replication and inheritance is contrary to the generally accepted view. However, our data are best explained by this concept, which is supported by the observation that "normal", premalignant and malignant cell populations may develop from a single retrovirally marked antecedent.

Mice
CzechII mice infected with mouse mammary tumor virus (MMTV) [26] were used as donors and hosts of the mammary epithelial transplants. The mice were held in a closed colony, maintained on a 12-h light/ dark cycle under controlled temperature and humidity, and were given laboratory chow supplemented with birdseed and water ad libitum. Bedding was hardwood chip and was autoclaved prior to use. All animal care and treatment were conducted strictly according to the rules and procedures defined by the USPHS and the NIH and were approved by the NCI Animal Care and Use Committee.

Mammary tissue transplantation
The surgical procedures for clearing the mammary epithelium from the inguinal fat pads of 3-week-old female mice and the method of implanting either tissue fragments or cell suspensions have been described in detail in earlier publications [27][28][29][30][31]. The surgical procedures required to remove the host epithelium from the fat pads were performed immediately prior to insertion of the transplant or inoculation of cultured cells. The development and characterization of the clonally derived outgrowths and their serial transplantation into epithelium-cleared mammary fat pads have been described in detail [7,8]. The implanted glands as well as intact host glands were taken 1 day postpartum.

Intact mitochondrial isolation
All tissue samples (N = 34) were snap frozen in liquid N2 and stored at −80°C. A portion of the frozen tissue was thawed on ice and washed using 1 mL of 0.9% (w/v) sodium chloride solution. If necessary, the tissue was cut into ~2 mm³ pieces and placed into a 2 mL reaction tube. Lysis Buffer (500 µL, supplemented with Protease Inhibitor Solution) was added to each reaction tube. A Dissertator rotor-stator homogenizer, set at the lowest speed for 10 s, was used to homogenize the tissue sample. After disruption, the solution was incubated on an end-over-end shaker for 10 min at 4°C. The homogenate was centrifuged at 1000 × g for 10 min at 4°C, and the supernatant was carefully removed. The pellet was resuspended in 1.5 mL ice-cold Disruption Buffer. The lysate was drawn into a 1.0 cc syringe equipped with a 25-gauge needle and ejected with one stroke, 10 times. The lysate was then centrifuged at 1000 × g for 10 min at 4°C. The supernatant was carefully transferred to a clean 1.5 mL tube. Supernatants from each extraction were combined and centrifuged at 6000 × g for 10 min at 4°C. The mitochondrial pellet was washed with 1 mL Mitochondria Storage Buffer, and this suspension was centrifuged at 6000 × g for 20 min at 4°C (Qproteome Mitochondria Isolation Kit, Qiagen).

Mitochondrial DNA preparation
Qiagen DNeasy kit was used to extract mtDNA. This was done according to the manufacturer's protocol. Proteinase K (20 µL) and PBS (200 µL) were added to the mtDNA pellet for resuspension.

Genomic DNA screentape assay
The Agilent 4200 TapeStation system (G2991AA), Genomic DNA ScreenTape (5067-5365) and Genomic DNA Reagents (5067-5366), all obtained from Agilent Technologies, were used for quantification, sizing and integrity analysis of CzechII mtDNA. The extracted mtDNA was analyzed using the Genomic DNA ScreenTape assay. The samples were prepared by mixing 1 µL of DNA sample with 10 µL of Genomic DNA Sample Buffer. A 3 µL amount of Genomic DNA Ladder was placed in the first tube of an 8-way strip, followed by the samples. The prepared strip was vortexed on high speed for 5 seconds, centrifuged and placed in the TapeStation instrument. Each extracted sample was analyzed in triplicate.
Next generation sequencing of CzechII mtDNA from serially transplanted normal, tumor, and metastatic CzechII mammary tissue
A DNA library was prepared using the Nextera XT library preparation kit (Illumina). The concentrations of the indexed libraries were analyzed on the Agilent 4200 TapeStation (Agilent Technologies) using the D1000 Kit (Agilent Technologies). For CzechII L12 serially transplanted normal lactating mammary tissue mtDNA, equimolar amounts of the 5 indexed libraries were pooled to obtain a 4 nM library mixture. For CzechII R12 serially transplanted normal mammary tissue mtDNA, equimolar amounts of the 4 indexed libraries were pooled to obtain a 4 nM library mixture. For CzechII R12 serially transplanted tumor mammary tissue mtDNA, equimolar amounts of the 15 indexed libraries were pooled to obtain a 4 nM library mixture. For CzechII R12 serially transplanted metastatic mammary tissue mtDNA, equimolar amounts of the 5 indexed libraries were pooled to obtain a 4 nM library mixture. After denaturing and further diluting, the final 1.3 pM library was loaded into an Illumina cartridge. Sequencing was performed using the Illumina NextSeq 500/550 High Output Kit v2 (300 Cycles) on the Illumina NextSeq 500 instrument following the manufacturer's instructions (Illumina). For CzechII CZN5 serially transplanted hyperplastic and tumor mammary tissue mtDNA, equimolar amounts of the 5 indexed libraries were pooled to obtain a 4 nM library mixture, which was denatured, diluted to 1.3 pM, loaded and sequenced in the same way.
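The pooling and dilution steps above rest on simple concentration arithmetic. The sketch below shows the standard conversion of a double-stranded DNA library from ng/µL to nM (using an average of 660 g/mol per base pair), the volume of each library needed to reach a 4 nM working concentration, and the overall dilution factor from 4 nM to 1.3 pM. The function names and the example concentration and fragment size are hypothetical; the study does not report these intermediate values.

```python
def library_molarity_nM(conc_ng_per_ul, mean_fragment_bp):
    """Convert a dsDNA library concentration from ng/uL to nM.

    Uses the conventional average mass of 660 g/mol per base pair.
    """
    if mean_fragment_bp <= 0:
        raise ValueError("mean_fragment_bp must be positive")
    return conc_ng_per_ul * 1e6 / (660.0 * mean_fragment_bp)


def stock_volume_ul(stock_nM, target_nM, final_volume_ul):
    """Volume of stock needed so that final_volume_ul is at target_nM (C1*V1 = C2*V2)."""
    if target_nM > stock_nM:
        raise ValueError("stock is more dilute than the target concentration")
    return target_nM * final_volume_ul / stock_nM


def dilution_factor(stock_nM, final_pM):
    """Fold dilution required to go from a stock in nM to a loading concentration in pM."""
    return stock_nM * 1000.0 / final_pM


# Hypothetical library: 2.0 ng/uL with a mean fragment size of 500 bp.
molarity = library_molarity_nM(2.0, 500)
print(f"library: {molarity:.2f} nM")
print(f"stock for 20 uL at 4 nM: {stock_volume_ul(molarity, 4.0, 20.0):.2f} uL")
print(f"4 nM to 1.3 pM: about {dilution_factor(4.0, 1.3):.0f}-fold")
```

Normalizing each indexed library to the same molarity before combining equal volumes is what makes the pool equimolar.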

Bioinformatic phylogenetic network of CzechII mouse strain with other known mouse strains via mtDNA analysis
To examine the relationship of our CzechII mice with other known strains, a germline phylogenetic network was generated using all published mouse strain mtDNA genomes available through the Sanger Mouse Genome Project (39 total strains; https://www.sanger.ac.uk/science/data/mouse-genomes-project). To homogenize the CzechII samples with the Sanger mouse strain genomes, BAM files for each mouse strain were downloaded, and reads were extracted for each genome using the bamtofastq tool in bedtools v2.27.2 (https://bedtools.readthedocs.io/en/latest/) and remapped with the above pipeline. All samples were then joint genotyped using the HaplotypeCaller from the Genome Analysis Toolkit v.3.5 (GATK, Broad Institute, Cambridge, MA, USA), following the GATK Best Practices [33,34]. Uncorrected genetic distances were generated using plink v1.07 [36], and a phylogenetic network was generated using phylip v3.697 (http://evolution.genetics.washington.edu/phylip.html).
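An uncorrected genetic distance (p-distance) between two genomes is the proportion of compared sites at which they differ, with no correction for multiple substitutions. The sketch below computes it for aligned sequences of equal length, skipping sites with a missing call. It illustrates the quantity only and is not how plink computes it; the strain names and sequences are invented for the example.

```python
from itertools import combinations


def p_distance(seq_a, seq_b, missing="N-"):
    """Proportion of differing sites among sites called in both sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a, seq_b):
        if a in missing or b in missing:
            continue  # skip sites with a missing call in either sequence
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


# Hypothetical aligned fragments (not real strain sequences).
toy_alignment = {
    "strain_A": "ACGTACGT",
    "strain_B": "ACGTACGA",
    "strain_C": "ACGAACNA",
}
for (name_a, seq_a), (name_b, seq_b) in combinations(toy_alignment.items(), 2):
    print(name_a, name_b, round(p_distance(seq_a, seq_b), 3))
```

A matrix of such pairwise distances is the kind of input that distance-based tree and network methods take.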

Statistical analysis
Data are presented as mean ± standard deviation (SD). We performed ANOVA to compare mt-ND1 gene copy number among Czech liver, R12 tumor and CZN5 tumor 1. Post-hoc Bonferroni adjustment was applied to the ANOVA. Paired Student's t-tests were used to determine the difference in mt-ND1 gene copy number, measured by ddPCR, between Czech liver and R12 tumor and between Czech liver and CZN5 tumor 1, respectively. A p-value of <0.05, after Bonferroni correction, was considered statistically significant. Statistical analyses were conducted using Microsoft Excel version 16.25.
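The statistical steps named above can be sketched in a few lines: a one-way ANOVA F statistic across groups, a Bonferroni adjustment of post-hoc p-values, and the fold decrease relative to control. This is an illustrative sketch only. The analysis in the study was run in Microsoft Excel, the function names are hypothetical, and the numbers below are invented rather than study measurements.

```python
from statistics import mean


def one_way_anova_f(groups):
    """F statistic for a one-way ANOVA over a list of groups of measurements."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    if k < 2 or n <= k:
        raise ValueError("need at least two groups and more observations than groups")
    grand_mean = mean(x for g in groups for x in g)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def bonferroni(p_values):
    """Bonferroni-adjusted p-values: each p multiplied by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def fold_decrease(control_mean, sample_mean):
    """How many times lower the sample mean is than the control mean."""
    if sample_mean <= 0:
        raise ValueError("sample_mean must be positive")
    return control_mean / sample_mean


# Invented triplicate values for three groups (not study data).
toy_groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [3.0, 4.0, 5.0]]
print("F =", one_way_anova_f(toy_groups))
print("adjusted p =", bonferroni([0.01, 0.04]))
print("fold decrease =", fold_decrease(100.0, 40.0))
```

Converting the F statistic or a t statistic to a p-value requires the corresponding distribution function, which is left to a statistics package.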

Summary statement
MtDNA sequence is conserved in heterogeneous populations derived from a single cell during clonal expansion. This suggests that a single mtDNA copy may act as a template for others within a clonal cell population and that "heteroplasmic shifting" may act as a selective pressure for certain mtDNA mutations that produce a metabolic advantage for individual clonal dominant mammary tumors.

CONCLUSIONS
We conclude that mtDNA sequences are conserved during clonal expansion and may be selected via "heteroplasmic shifting" to form clonal dominant tumors. We propose that the conservation mechanism by which mtDNA sequences are maintained is achieved through mtDNA replication, which is remarkably faithful. This conclusion is based upon the observation that mtDNA sequence variation, or the lack thereof, is present in phenotypically heterogeneous cellular populations composed of the progeny of a single cellular antecedent. Further studies, such as direct replacement of mitochondria carrying a marked genome in a clonogenic cell and examination of the mtDNA from a sub-clonal population developed from such a cell, are required for final proof.