Development of a sequential workflow based on LC-PRM for the verification of endometrial cancer protein biomarkers in uterine aspirate samples

About 30% of endometrial cancer (EC) patients are diagnosed at an advanced stage of the disease, which is associated with a drastic decrease in the 5-year survival rate. The identification of biomarkers in uterine aspirate samples, which are collected by a minimally invasive procedure, would improve early diagnosis of EC. We present a sequential workflow to select from a list of potential EC biomarkers, those which are the most promising to enter a validation study. After the elimination of confounding contributions by residual blood proteins, 52 potential biomarkers were analyzed in uterine aspirates from 20 EC patients and 18 non-EC controls by a high-resolution accurate mass spectrometer operated in parallel reaction monitoring mode. The differential abundance of 26 biomarkers was observed, and among them ten proteins showed a high sensitivity and specificity (AUC > 0.9). The study demonstrates that uterine aspirates are valuable samples for EC protein biomarkers screening. It also illustrates the importance of a biomarker verification phase to fill the gap between discovery and validation studies and highlights the benefits of high resolution mass spectrometry for this purpose. The proteins verified in this study have an increased likelihood to become a clinical assay after a subsequent validation phase.


INTRODUCTION
Endometrial cancer (EC) is the most frequently observed invasive tumor of the female genital tract and the fourth most common cancer in women in developed countries, accounting for 60,050 diagnosed cases and 10,470 estimated deaths in 2016 in the United States [1]. Postmenopausal women represent 86% of diagnosed EC cases, with a median age of 63 years [2]. Nowadays, 70% of the EC cases are diagnosed at early stages of the disease where the tumor is still localized within the endometrium and is associated with an overall 5-year survival rate of 96%. However, 30% of EC patients are diagnosed at an advanced stage of the disease associated with a drastic decrease in the 5-year survival rate, which is reduced to 68% when myometrial invasion and/or lymph node affectation is already present, and to 17% in cases of distant metastasis [1]. Improving early diagnosis is hence a major issue to appropriately manage EC and decrease mortality associated with the disease.
Early detection of EC patients is favored by the presence of symptoms like abnormal vaginal bleeding, present in 93% of women diagnosed with EC. However, many other benign disorders generate similar symptoms [3]. Discrimination of patients with benign endometrial pathologies and with EC is only achieved after a tedious diagnostic process consisting of a pelvic examination and a transvaginal ultrasonography followed by a confirmatory histopathological examination of an endometrial biopsy. The preferable biopsy used in this procedure is named uterine aspirate or pipelle biopsy and is obtained by a minimally invasive aspiration of endometrial fluid from inside the uterine cavity using a Cornier pipelle. Using this sampling, this process has unfortunately a diagnostic failure and an inadequate sampling rate of 8% and 15%, respectively; which is increased in postmenopausal women up to 12% and 22% [4]. In those undiagnosed cases, a biopsy guided by hysteroscopy needs to be performed, but this invasive technique presents more complications, including uterine perforation, hemorrhage and possible damage to other organs [5]. Implementation of biomarkers in early stages of the diagnostic process would improve detection of EC.
From a biological point of view, proteins are key players of many cellular processes and variations of their abundance can be associated with pathologies such as cancer. Proteins are detectable in biofluids and thus are valuable disease indicators for the development of noninvasive diagnostic tests. In this regard, uterine aspirates are specially promising as a source of EC biomarkers thanks to the direct contact of this fluid with the endometrium. Nevertheless, the search of protein EC biomarkers has been mostly based on tissue analyses [6][7][8], hampering the translation into clinical practice. Only few proteomic studies have been performed on uterine aspirates without a focus on EC biomarkers [9][10][11].
The ideal diagnostic biomarker pipeline consists of sequential phases of discovery, verification and validation. The discovery produces large lists of differentially abundant proteins (i.e. 100 s-1000 s) between simplified biological conditions using a limited number of samples, mostly tissue specimens. In contrast, the validation phase requires a precise and accurate quantification of the most promising biomarker candidates (typically a dozen), in a large set of samples, which is normally a preferred biofluid [12]. Up to date in EC, the vast majority of biomarker studies cover either discovery phases that generates large lists of biomarker candidates [13,14], most of which have never been further validated; or validation studies focus on a specific protein [15][16][17][18], with an increased risk of not generating concrete application and hampering the search of biomarker panels that improve the diagnostic performance of individual proteins. The intermediate verification phase is crucial for the prioritization of biomarker candidates to enter a validation phase in order to increase the likelihood of identifying clinically relevant biomarkers [19]. The lack of methods to guide the prioritization of candidates in the verification phase has been identified as one of the factors of poor translation of biomarker discovery into clinics [20,21]. The liquid chromatography mass spectrometry (LC-MS) platform, operated in targeted acquisition mode, is ideal to achieve this task as proteins can be reliably quantified in a highly multiplexed fashion and at a fast throughput.
In this study we aimed to i) demonstrate the efficiency of a stepwise verification workflow that prioritizes, from a list of potential biomarkers derived from published discovery studies, the most promising to enter into a further validation phase; ii) evaluate the performance of the parallel reaction monitoring (PRM), a targeted acquisition method employed on a high resolution accurate mass spectrometer, in clinical samples of uterine aspirates; and iii) assess the potential of the soluble fraction of uterine aspirates as a source of protein EC biomarkers.

LC-PRM method development: Selection of the biomarker candidates
A targeted MS-based approach was selected for the verification of potential EC biomarkers in uterine aspirates as it enables the quantification of multiple peptides within a single analysis. The LC-PRM is a hypothesis driven methodology that differs from the unsupervised MS-based approaches (e.g. data dependent and data independent acquisition) as the proteins must be selected prior the actual MS acquisition. With LC-PRM, the number of targets is limited to approximately 100-150 peptides per analysis, but in return, the mass accuracy and the high resolving power of the orbitrap analyzer, in conjunction with the use of isotope labeled peptides as internal standard, allows for systematic quantitative measurements in all samples achieved with a high degree of selectivity and precision. Therefore, a preselection of the protein candidates to be measured is required. Starting from 506 protein candidates found in the literature from previous studies mainly using EC tissue samples as biological material, we proposed a workflow to reduce step by step this number down to 52 biomarker candidates, leading to 98 pairs of light/heavy peptides that can be measured by a single LC-PRM method. We verified those candidates in uterine aspirates from a cohort of 20 EC patients and 18 controls by LC-PRM ( Figure 1).
The first step of this study was to select proteins indicated as potential biomarkers for EC from an extensive literature review performed in the PUBMED bibliographic database. The search included articles published from 1990 to 2014, combined with the text words"endometrial cancer" and "biomarker". We obtained a first list of 506 proteins associated with EC, which were mostly derived from studies performed in endometrial tissue samples. The second step of selection consisted in the assessment of the LC-MS detectability of those 506 potential biomarkers in four samples of uterine aspirates by repeated DDA analysis. The main goal of this step was to reduce the list of protein candidates initially described in tissue samples to those that can be effectively detected in uterine aspirate samples. From a total of 1,086 proteins identified in the four uterine aspirates, 158 proteins out of the initial 506 potential biomarkers list were detected (Supplementary Table 1). This first screening indicated that one third of the potential biomarkers could be easily detected by LC-MS techniques in uterine aspirates samples, thus confirming the potential of uterine aspirates as a source of protein EC biomarkers.
Blood contamination of biological samples is a recurrent problem in bioanalyses, particularly in the field of biomarker research in some biofluids [22,23]. Understanding that uterine aspirates display a variable amount of blood between samples, we introduced a third step of selection to evaluate the interference of blood components during LC-MS detection of the potential biomarkers in uterine aspirates. To do that, the uterine aspirates of two patients (one control and one EC patient) were split into four equal-volume aliquots and spiked with increasing volumes of full blood: 0, 10, 20, 40% (v/v). All samples were digested and analyzed by LC-MS/MS in duplicate. We excluded those proteins whose peptides displayed an increasing profile with an increasing concentration of spiked-in blood and maintained those proteins showing no effect or diminished levels ( Figure 2). This criterion was used in order to discriminate protein biomarkers belonging to the endometrial tissue contained in uterine aspirates rather than proteins contained in the blood proteome. Moreover, by excluding abundant proteins of blood, we reduced analytical problems related to variable blood contamination among the samples. As a result of this analysis, 32 proteins were likely to be derived from the blood contamination of the uterine aspirates and were excluded from further analysis (Supplementary Figure 1).
The remaining 97 uterine aspirate specific candidates, and seven additional proteins detected in uterine aspirate samples but not in full blood (data not shown), were scaled down to 52 proteins based on their consistency in literature. The 52 candidates had undergone at least one level of additional validation using a different technology or biospecimen type, or an independent cohort of cases and controls whether in the context of the same publication or in an independent report. A total of two peptides per each of these 52 proteins (104 peptides) were selected according to their uniqueness, detection and chromatographic behavior.

Quality control of the LC-PRM data
The 52 proteins of interest were verified in the soluble fraction of uterine aspirates by targeted MS. Uterine aspirates from 20 EC patients and 18 non-EC controls were digested in duplicate and analyzed by a quadrupoleorbitrap MS operated in PRM mode using a mix of the stable isotopes labeled (SIL) peptides of the 104 peptides (i.e., heavy peptides) as internal standards. Four of these SIL peptides could not be synthesized, leading to a final list of 100 monitored peptides in the method (Supplementary Table 2). The signals of the five most intense product ions for each precursor were extracted from the MS2 spectra to generate elution profiles (i.e. Extracted Ion Chromatograms (XICs) of selected product ions) (Supplementary Table 3). The identity of the peptides, as well as the potential interferences on the PRM traces, were evaluated by a similarity score based on the cosine of the spectral contrast angle (cos θ) calculated with the top five fragment ions of each precursor. This score was calculated against a reference LC-PRM analysis of the isotopically labeled peptides without biological matrix ( Figure 3A). The signal of a peptide was accepted if the cos θ was higher than 0.98 for both the endogenous and the stable isotope labeled standard  [24]. Values lower than 0.98 due to an interfered PRM XIC were replaced by the next most intense available XIC of product ion. Six peptides were monitored by XICs of four product ions due to the absence of a clean fifth product ion (Supplementary Table 2). Following this, a positive spectral matching was achieved for 95.1% of a total of 7,350 pairs (ratio light/heavy). The unmatched 4.9% pairs were due to two conditions: i) measurements below the limits of detection (4.7%), which were replaced with an estimation of the background value. In this account, peptides below the limit of detection in more than 50% of the samples, only two peptides -VHITSLLPTPEDNLEIVLHR and VTILELFR-fulfilled this condition, were removed from the study; and ii) measurements for which less than four clean XICs of product ions were detected (0.2%). These 0.2% were due to data very close to, but below, cos θ = 0.98 and only one replicate was affected in all cases; thus the value of the accepted replicates was kept. These results illustrate the efficiency of the PRM acquisition in complex clinical samples. The use of internal standards and the availability of all XICs of product ions guarantee the correct identification of each peptide, reduce interferences and fasten the detection and exclusion of potential interferences in large datasets. The mean between duplicates in the cleansed dataset was calculated (Supplementary Table 4), as well as the correspondent coefficient of variation (CV%). The CV% of the duplicated sample preparation for each uterine aspirate sample was below 15% for 99% of the detected peptides, with an averaged CV of 3.6%. This confirmed the high reproducibility level of the full process ( Figure 3B). Finally, the correlation between the peptides derived from the same protein was evaluated by a Pearson correlation coefficient ( Figure 3C) and 39 out of the 46 (85%) proteins monitored with two peptides showed a very high correlation, with a R coefficient over 0.95. Only 3 proteins -ROA2, OSTP, and KPYM-presented an R coefficient below 0.9, which were due to the specificity of the monitored peptides to different isoforms of the same protein.

Differentially abundant proteins between endometrial cancer and control uterine aspirates
In order to assess the potential of the 52 selected proteins to detect EC, we compared the abundance of each biomarker candidate between 20 EC patients and 18 non-EC controls. Importantly, both patients and controls were postmenopausal women suffering from an abnormal vaginal bleeding, as these clinical features are present in 93% of patients suffering from EC. However, only 15% of those will be finally diagnosed with EC [25].
Based on the Bradford assays, 250 ng of the total protein concentration after albumin and IgGs depletion was injected for each sample. The constant amount of injected protein among samples was further confirmed by the integration of the total ion chromatogram of the MS1 scans, as shown in Supplementary Figure 2. After MS data curation, the relative levels (light/heavy ratios) of the final 98 monitored peptides in MS2 were subjected to Mann Whitney test for their comparison between tumor and control samples. Forty eight peptides corresponding to 26 proteins showed significant differences between the two groups with adjusted p-value < 0.05 (Benjamini corrected) and fold change greater than 3: PERM, CADH1, SPIT1, ENOA, MMP9, LDHA, CASP3, KPYM, PRDX1, OSTP, PDIA1, NAMPT, MIF, CTNB1, K2C8, ANXA2, CAPG, FABP5, MUC1, CAYP1, XPO2, NGAL, SG2A1, ANXA1, HSPB1, PIGR. All these proteins showed higher levels in tumor samples as compared to control samples (Table 1; Supplementary Figure 3).
To further evaluate their performance as biomarkers for EC diagnosis, we performed a ROC analysis to determine the sensitivity and specificity of each biomarker. Interestingly, these differentially abundant proteins showed Area Under the ROC Curve (AUC) values for discriminating between EC and controls patients ranging from 0.75 to 0.97. The 10 best-performing individual proteins were PERM, CADH1, SPIT1, ENOA, MMP9, LDHA, CASP3, KPYM isoform M1-M2, PRDX1 and OSTP isoform A, all of them with AUC values higher than 0.9 ( Figure 4). Among those proteins, PERM, CADH1, SPIT1 and OSTP isoform A were of special interest as each of them presented sensitivities higher than 80% when specificity was fixed to 95% (Table 1). This is particularly important in EC diagnosis, as biomarkers with high specificity could complement the output of low invasive techniques such as the transvaginal ultrasonography, which currently presents very high sensitivity but lack of specificity [26].
Furthermore, we conducted a bioinformatics analysis using Ingenuity Pathway Analysis (IPA) to better understand the association of these proteins with cancer and their origin regarding the subcellular location. As expected, integration of the data resulted in the identification of cancer, inflammatory disease, organismal injury and abnormalities, and reproductive system disease as the top diseases associated to these biomarkers. The top five molecular and cellular functions involved with these proteins included cellular movement, cellular death and survival, cellular development, cellular growth and proliferation, and celltocell signaling and interaction, all of them important processes altered in cancer. These proteins are mainly found in the cytoplasm, plasma membrane and extracellular space ( Table 1), indicating that they are coming either from secretion of the epithelial and inflammatory cells of the endometrium or by necrosis of cells in the proximal tissue. This is in concordance with the observation that all biomarkers in this study were found more abundant in EC patients as compared to controls, as both processes are related to the higher proliferation rate of epithelial cells in EC.

DISCUSSION
About 30% of EC patients are diagnosed at advanced stages of the disease, associated with a drastic increase in the mortality and morbidity [1]. Therefore, the identification of sensitive and specific biomarkers to improve early detection of EC is a crucial clinical need. Despite the major effort and investments made to identify EC biomarkers, no protein has yet reached the stage of clinical application. The poor translation of the results produced by those studies in the clinic can be explained by two determinant factors: on the one side, the lack of studies in biofluids to identify accessible EC biomarkers. Most of the studies were based in tissues, and/or those that used biofluids were limited to serum or plasma [27][28][29]. Blood presents several important advantages, as it is in direct contact with all body tissues, and its collection is rapid, easy and minimally invasive. However, the search of biomarkers in plasma or serum is extremely challenging due to the low concentration of the potential biomarkers and the wide dynamic range in protein abundance [30]. On the other side, the lack of verification studies as a bridge between discovery and validation phases, which  has been defined as the current bottleneck of the biomarker pipeline [19,21]. Discovery studies generate large lists of differentially abundant proteins. Many of those biomarker candidates are never validated or turn to be false positives due to the small number of samples analyzed, the biological variability or the limited quantitative performance of the technologies employed in this phase. There is a need to verify and refine those lists to the best candidates that can enter a validation phase. This is the critical role of the verification phase. In order to overcome these limitations, we presented a stepwise workflow to select potential EC biomarkers and verified them by targeted MS-based analysis in uterine aspirate samples. Targeted MS-based approaches have gained in popularity for biomarker verification in complex clinical samples because they combine precision, sensitivity, multiplexing and absence of missing values. Among those, selected reaction monitoring (SRM) acquisition mode performed on a triple quadrupole mass spectrometer has been the reference method for the accurate quantification of peptides in biological matrices [31][32][33]. However, SRM is limited in selectivity and requires a substantial method development. We implemented the PRM acquisition, a new generation of targeted MS-based approach, performed on high-resolution accurate mass spectrometers (HRAM) such as the hybrid quadrupole-orbitrap. To date, the advantage of PRM for large scale analysis has been evaluated [24,34] but not yet commonly introduced as a technique for biomarker searches in clinical [35] or cell lines samples [36][37][38]. The high resolution and the accurate mass (i.e. 35,000 at 200 m/z and below 5 ppm) of the orbitrap analyzer decrease the risk of inferences due to the complexity of the chemical background and we obtained clear and easily readable chromatograms profiles. This aspect, in conjunction with the use of spectral matching as a quality metric, significantly facilitates the data processing to compare with SRM data, for instance only 0.2% of the chromatographic peak needed to be manually curated. A straightforward and highly automatable data processing is an important feature of large scale studies. The PRM acquisition allowed the quantification of 100 pairs of peptides at the reasonable throughput of one analysis per hour with an excellent precision (i.e. the CV% between full workflow duplicates was below 15% for 99% of the detected peptides). Finally, the design of an LC-PRM method is easier and faster than for LC-SRM, as the selection of the product ions to quantify is performed post-acquisition and the list of XICs can be refined iteratively to remove potential interferences coming from the background, without the need of a new analysis [39].  Another strength of this study is the use of uterine aspirates as a biological sample for biomarker detection. A useful diagnostic biomarker not only has to ameliorate the discrimination between patients suffering the disease and benign cases, but also should be economically profitable and advantageous in the clinical scenario.
In the case of diagnostic biomarkers for EC, fasten the diagnostic process, improving the comfort of patients, and reducing the sanitary costs are very important values. Therefore, the search of biomarkers in easy-to-access biofluids is highly recommended. Uterine aspirates seem an interesting alternative to other biofluids, such as blood or plasma, as they are in direct contact with the tumor in the endometrium, being enriched in proteins secreted from the epithelial cells in the tumor. Additionally, they are collected in the current process of EC diagnosis prior to subsequent more invasive diagnostic techniques. We here demonstrated the convenience of the soluble fraction of uterine aspirates as a source of EC biomarkers and the feasibility of its analysis by MS. The use of the soluble fraction is expected to overcome the diagnostic failure of 22% associated to this sampling, as the current diagnostic procedure relies on the cellular material in the sample [4].
Our final achievement was to eliminate doubtful biomarker candidates derived from the variable amounts of blood contamination in uterine aspirates and successfully verify the differential abundance of 26 EC biomarkers in this sample. A bioinformatics analysis confirmed their individually and collectively association with cancer, and showed that they maintain a strong association with commonly altered molecular processes in cancer such as cellular movement, cellular death and survival, etc. Among all candidates, ten provided high sensitivity and specificity, with AUC values over 0.9, and four of those, PERM, CADH1, SPIT1 and OSTP, were highlighted as they achieved sensitivity over 80% when fixing a specificity of 95%. The protein biomarkers verified in this study merit further validation in an extended study with a larger cohort of patients and controls. A prospective large multicentric study has been initiated with the aim to confirm the diagnostic power of these biomarkers and hence, to evaluate their validity and clinical applications. A limitation of the present study is that we did not use combinations of multiple markers to avoid overfitting due to the relatively small number of subjects included [40]. However, the AUC for individual proteins were already very high.
In conclusion, this study brings forward the proteomic search of biomarkers in uterine aspirates following an appropriate workflow, and so, could be expanded to other types of gynecological diseases such as endometriosis and ovarian cancer. Moreover, this study proves the efficiency of high resolution MS in order to verify a large number of potential biomarkers to fill the gap between discovery and validation studies. The described workflow permitted to reduce step by step an initial list of 506 potential biomarkers down to 10 proteins with an increased likelihood to reach the stage of a clinical assay after a subsequent validation phase.

Patients and sample collection
A total of 42 patients (22 women suffering from EC and 20 non-EC controls, i.e., women having EC symptoms but not diagnosed with EC) were recruited in the Vall d'Hebron University Hospital (Barcelona, Spain) during 2012 to 2015. Informed consent forms, approved by the Vall d´Hebron Ethical Committee, were signed by all patients (approval number: PR_AMI_50-2012). The clinical and pathological characteristics of the patients are described in Supplementary Table 5. Inclusion criteria were postmenopause, a minimum age of 50 years and vaginal bleeding. Women who had been treated previously for gynecological pelvic cancer were excluded. Patients known to be positive for the human immunodeficiency virus and/or the hepatitis virus were excluded for safety reasons.
Uterine aspirates were collected by aspiration with a Cornier Pipelle (Eurogine Ref. 03040200) in the office of the clinician or in the operating room prior to surgery and transferred to 1.5 ml microtubes. Phosphate buffer saline was added in a 1:1 (v/v) ratio and centrifuged at 2,500 rcf for 20 min in order to separate the soluble fraction (supernatant) from the solid fraction (pellet). The separated fractions were kept at −80°C until use. From the 42 supernatants collected, samples coming from four patients were used for potential biomarker selection process and the development of the LC-PRM method. The list of selected biomarker candidates was then verified in the 20 EC and 18 nonEC remaining samples by LCPRM analysis.

Evaluation of the detectability of potential protein biomarkers in uterine aspirates by LC-MS/MS analysis
Uterine aspirate supernatants from two patients diagnosed with EC and two non-EC controls were sonicated (Labsonic M, Sartorius Stedim Biotech) at 100% amplitude during 8 cycles of 15 seconds and 50 µl of each sample was depleted from albumin and IgG using the Albumin & IgG depletion spin trap kit according to the manufacturer's instructions. Total protein concentration was measured by the Bradford assay, performed in triplicate. Proteins were purified by acetone precipitation overnight at −20ºC, resuspended with 0.2% Rapigest surfactant (Waters), sequentially digested at 37ºC by Lys-C (protease/total www.impactjournals.com/oncotarget protein amount ratio of 1/150 w/w) and trypsin (1/50 w/w) overnight, and finally desalted onto SPE cartridges. The LC-MS detectability of 506 potential biomarkers in the supernatants of uterine aspirate samples was then evaluated using a LTQ-Orbitrap Velos mass spectrometer (Thermo Scientific) operated in data dependent acquisition (DDA) mode. The liquid chromatography system consisted of an UltiMate 3000 RSLC nano configured in binary gradient mode. The setup was operated in column switching mode and samples were loaded onto a trap column (Acclaim PepMap100 2 cm × 75 μm i.d., C18, 3 μm, 100 Å) for 3 min at 5 µl/min by an aqueous solution containing 0.05% trifluoroacetic acid and 1% acetonitrile (v/v). Peptides were then eluted onto an analytical column (Acclaim PepMap RSLC 15 cm × 75 μm i.d., C18, 2 μm, 100 Å) by applying a 66 min linear gradient from 2 to 35% solvent B at 300 nl/min. The solvents A and B consisted of water with 0.1% (v/v) formic acid and acetonitrile with 0.1% (v/v) formic acid, respectively. The electrospray ionization was performed through a fused silica emitter by applying a voltage of 1.5 kV. The DDA method was based on a high resolution survey scan (60,000 at 400 m/z) followed by the fragmentation and analysis of the 6 most intense precursor ions in the LTQ ion trap at normalized collision energy of 35. Dynamic exclusion of precursors already selected for MS/MS experiments was set to 90 s. Peptides and related proteins identification was performed using Proteome Discoverer software (v1.4) (Thermo Scientific, Waltham, MA, USA) by Mascot search engine using Swiss-Prot human database (SwissProt 201108 with 531473 sequences entries, restricted to the 20,245 entries of the human taxonomy). Trypsin specificity was set to cleave after arginine and lysine residues excepted when flanked by a proline on the C-terminal side. A fragment ion mass tolerance of 0.8 Da and a precursor mass tolerance of 10 ppm were applied. Up to one tryptic missed cleavage was tolerated and carbamidomethylation of cystein and oxidation of methionine were specified as dynamic modifications. Results were filtered by Proteome Discoverer using one peptide per protein, a maximum search engine rank of 1 and a false discovery rate (FDR) below 0.01 (calculated by the node "Target decoy PSM validator"). The expectation value for accepting a spectrum was below 4 * 10 −3 to set FDR at 0.01.

Effect of differential blood content on biomarker candidate detection in uterine aspirates
Uterine aspirates from one EC and one control patients were split into four equal-volume aliquots and spiked with increasing volumes of full blood (0, 10, 20 and 40% (v/v)). Samples were centrifuged at 2500 rcf for 20 min in order to separate the soluble part from the pellet. Supernatants were treated and analyzed by LC-MS/MS as described in the previous paragraph. The elution profile areas of peptides of 129 potential biomarkers identified in the uterine aspirates of these two patients with the different percentage of blood added and/or full blood were extracted from the high resolution survey scans (the identity of peptide was confirmed by MS2) using Skyline software (v3.1) (McCoss Lab, University of Washington, Seattle, WA, USA). The levels of the surrogate peptides of each protein across the four aliquots with increasing percentage of full blood were plotted. The slope of the linear regression was calculated for each peptide and those presenting a positive slope in both patients were rejected from the study.

Preparation of uterine aspirate samples for the LC-PRM analysis
Supernatants from uterine aspirates coming from 20 EC patients and 18 non-EC controls were sonicated to disrupt potential microvesicles, protein aggregates, and/ or mucus by 5 cycles at 100% amplitude during 5 seconds (Labsonic M, Sartorius Stedim Biotech). Albumin and immunoglobulin G were then depleted from 50 µl of supernatant samples using the Albumin & IgG depletion spin trap kit according to the manufacturer's instructions. Total protein concentration was measured by the Bradford assay performed in triplicate. Each of the 38 samples were then separated into two aliquots of 25 µg to generate duplicates for the whole process, with exception of one sample for which the amount of material was not sufficient for duplication. The samples were diluted into a 50 mM solution of ammonium bicarbonate to a final volume of 120 µl and were denatured by addition of 185 µl of 10 M urea suspended in 50 mM ammonium bicarbonate, incubated at 22°C under agitation for 20 min, and followed by 10 min incubation in an ultrasonic bath (Branson 5510, Branson Ultrasonics). The samples were then reduced with 7.8 µl of 200 mM dithiothreitol for 60 min at 37°C, and alkylated with 12.2 µl of 400 mM iodoacetamide at 22°C for 30 min in the dark. The samples were digested for 4 h at 37°C with LysC (protease/total protein amount ratio of 1/150; w/w). Afterwards, the concentration of urea was diluted to 1 M with 50 mM ammonium bicarbonate buffer, and samples were incubated overnight at 37°C with trypsin (protease/total protein amount ratio of 1/50; w/w). The trypsin activity was quenched by addition of 1 µl of neat formic acid per 100 µl of solution. The samples were spiked with the mix of heavy synthetic peptides and then desalted onto solid phase extraction cartridges. The eluates were subsequently evaporated to dryness in a vacuum centrifuge and suspended in 0.1% formic acid before LC-PRM analysis.

LC-PRM setup
The LCMS setup consisted of a Dionex Ultimate 3000 RSLC chromatography system configured for a high-pressure binary gradient and operated in column switching mode. The mobile phase A consisted of 0.1% formic acid in water, the phase B in 0.1% formic acid in acetonitrile and the loading phase in 0.05% trifluoroacetic acid and 1% acetonitrile in water. The equivalent of 250 ng of each digested sample was injected and loaded onto a trap column (75 µm × 2 cm, C18 pepmap 100, 3 µm) at 5 µl/min and further eluted onto the analytical column (75 µm × 15 cm, C18 pepmap 100, 2µm) at 300 nl/min by a linear gradient starting from 2 % A to 35 % B in 48 min. The MS analysis was performed by a hybrid quadrupole orbitrap mass spectrometer (Q Exactive plus, Thermo Scientific) operated in PRM mode. The MS cycle started with a full MS1 scan performed at a resolving power of 70,000 (at 200 m/z) followed by time scheduled targeted PRM scans acquired at a resolving power of 35,000 (at 200 m/z) with a normalized collision energy of 20. The quadrupole isolation window for the PRM events was set to 1 m/z unit and the duration of the time scheduled windows for each pair of endogenous and isotopically labeled peptides were set to 2 min. PRM data are accessible in a public database (Panorama server) at: https://panoramaweb.org/labkey/ PRM_analysis_of_EC_uterine_aspirate.url; reviewer account is: panorama+domon@proteinms.net; Password is: KTjy3~A#.

PRM data processing
The elution profile of the five most intense fragment ions of each precursor were extracted using Skyline. The selection of the best product ions was supported by a spectral library obtained from a reference LC-PRM acquisition of the synthetic peptide mix injected without biological matrix. The elution profiles of the samples were first manually reviewed and obvious interfered PRM XICs were replaced by the next most intense available product ion. The data set was then refined using the cosine of the spectral contrast angle (cos θ) calculated between the peak areas of the five XICs of product ions of the reference (PRM acquisition of the synthetic peptides mix) and the areas of the corresponding XICs for the endogenous and heavy peptides in the biological samples [41]. The formula is as follows: Where A exp are the areas of either the endogenous or heavy XICs of selected product ions for a peptide measured in a sample, and A ref are the areas of the same XICs measured in a reference synthetic peptides mixture.
Peptides detection and identification were confirmed if the cos θ of the endogenous and the isotope labeled peptide were higher than 0.98 [24]. Scores below 0.98 are principally due to MS measurements below the limit of detection and in such cases the area values were replaced by an estimation of the background. Peptides with cos θ below 0.98 in more than 50% of the 38 samples in duplicates were eliminated from the verification study.
For the quantitative analysis, the area ratios between the endogenous and their corresponding heavy peptides were compared between samples. The area ratios were calculated as the sum of the areas of the XICs of products ions of the endogenous peptide divided by the sum of the XICs of the respective isotope labeled version.

Statistical analysis
The statistical analysis was performed in SPSS (v20.0) (IBM, Armonk, NY, USA) and Graph Pad Prism (v.6.0) (GraphPad Software, La Jolla, CA, USA). The averaged light/heavy area ratios were calculated between duplicates. The linear correlation between the signature peptides of the same protein was calculated using the Pearson correlation coefficient. Due to the non-normality of the data, assessed by Kolmogorov-Smirnova and Shapiro-Wilk tests, comparison of the abundance of the monitored peptides between tumor and control samples was performed using the non-parametric Mann-Whitney U test. P-values were adjusted for multiple comparisons using Benjamini-Hochberg FDR method [42]. Adjusted p-values lower than 0.05 along with fold changes greater than three were considered statistically significant. Receiver operating characteristic (ROC) curves were used to calculate the relationship between sensitivity and specificity for EC versus non-EC control group and hence, to evaluate the diagnostic performance for each biomarker candidate.

ACKNOWLEDGMENTS AND FUNDING
This work was supported by the Spanish Ministry of Health (RD12/0036/0035), the Spanish Ministry of