Learn more about the PDX models in the Champions TumorGraft® Database
Champions Oncology has several sources for the clinically-relevant TumorGraft (patient-derived xenograft (PDX)) models it maintains in its bank. The first source is from the work we do directly with patients in which we develop TumorGraft models from individual patient tumors and use them to help guide treatment decisions in the clinic. The second source of Champions TumorGraft® models in our bank is research collaborations we have established with external institutes and hospitals. We are also obtaining a large number of new and drug discovery-relevant models through clinical validation studies being led by our Medical Affairs team. Additionally, we will be adding to our tumor bank through a series of co-clinical initiatives and matched patient clinical programs. We will keep you informed of these initiatives as they progress.
In summary, Champions Oncology remains committed to building the world’s most clinically-relevant TumorGraft model bank.
Model status provides information on how readily clients may access models for use in their studies; it is defined across three categories:
This is information related to the tumor that was collected from a patient to develop a TumorGraft model (pathological and histological determinants). We also include the Champions TumorGraft (CTG) number, which is a unique identifier assigned to each TumorGraft model by our Model Development and Characterization team. Information that may in any way allow identification of patients (name, date of birth etc.) is not available though this database and is restricted to only those individuals at Champions Oncology who are rigorously trained in safe-guarding patient confidentiality. Finally, all patients who donate samples to generate TumorGraft models are required to sign an Informed Consent document.
CTG Number | Unique identification number given to TumorGraft models. |
Tumor Type | Defines the type of cancer the patient was diagnosed with. |
Tumor Status | Classifies the harvested tumor as a primary or metastatic lesion. |
Tumor Subtype sarcoma, glioma, head & neck models only | Defines the subtype of tumor the patient was diagnosed with. |
ER/PR/HER2 Status breast models only | Specifies if the estrogen, progesterone, or HER2 receptors are expressed. |
Harvest Site | Identifies the site in the body where the tumor was harvested from. |
Histology | Indicates what the tumor looks like under a microscope. |
Tumor Grade | Defines how closely the tumor resembles normal tissue. |
Much of this information comes to us from pathology reports or from clinical updates given by the treating oncologist. We are also in the process of reviewing the histology and grade of Champions TumorGraft® models (based on H&E stained slides we have generated) with the help of our pathologist consultant team.
As noted in the table above, data for breast, sarcoma, glioma, and head & neck Champions TumorGraft®models is slightly modified to contain more information specifically related to these tumor types. For breast models we include whether the estrogen, progesterone, and HER2 receptors are expressed in the models (identified from patient pathology reports), whilst for sarcoma, glioma, and head & neck models, Tumor Subtype is included.
This information relates to data about the patient from whom the tumor was obtained. It includes:
Diagnosis | First diagnosis or Recurrent tumor. |
Treatment History | Pretreated (therapy received before Champions TumorGraft® model development) or Naïve. |
Disease Stage | I-IV, based on radiology/pathology reports. |
BRCA1/2 status breast and ovarian models only | Mutated, Deleted, or Normal. |
Smoking History | Smoker, Former smoker, Non-smoker. |
Age | In years, at time of tumor resection. |
Gender | Male or Female. |
Ethnicity | Caucasian, Black or African American, Asian, Hispanic, Mixed, Native Hawaiian or Pacific Islander. |
Please note that although Patient Profile and Tumor Profile are described separately here, they are grouped together in the same table in the database.
This is information related to the treatments a patient received before and after the collection of their tumor.
CTG Number | Unique identification number given to TumorGraft models. |
Pre/Post-Collection | Indicates whether a treatment was received before or after collection of the tumor. |
Drug | Identifies the generic name of the drug received by the patient. |
Outcome | Defines the response of the patient to the drug (Responded, No response, Mixed response, N/A (neoadjuvant)) (based on radiology screens/blood tests). |
Duration (months) | Indicates how long the patient responded to the drug. |
Please note that for Mixed or No response, the length of response is not included because some level of disease progression has occurred. These tables only include chemotherapy drugs (labeled using generic drug names). We do not show radiotherapy treatments or alternative/experimental therapies (e.g. naturopathy or experimental vaccines). We also obtain primary radiological and blood test records that help define treatment responses. Many of the post-collection treatments are identified by Champions Oncology through our work directly with patients (see above).
One important parameter for Champions TumorGraft® models is their growth pattern. Our bioinformatics specialist has gathered all the accumulated growth data we have for every model and performed a linear regression analysis of this data in order to generate a mean growth curve for each model (depicted as Days versus Mean tumor volume in mm3). We also show a 95% confidence interval for each data point, represented as purple shading around the curve, to give an estimate of the uncertainty in the analysis.
Target Size | Doubling Time |
---|---|
150mm3 | 46 days |
1000mm3 | 93 days |
Some growth metrics based on this analysis are also provided to help users evaluate the growth pattern of each model. These are the average time (in days) it has taken each model to reach either 150mm3 or 1000mm3 following implantation of a single tumor fragment with a volume of 64mm3 into the flank of a recipient mouse.
One of the pre-eminent features of our Champions TumorGraft® Database is the availability of information on the response of our TumorGraft models to different chemotherapeutic agents. The general scheme for screening drugs against our TumorGraft models is shown in the figure below. Tumor fragments (around 64mm3) are implanted into the flanks of recipient mice and tumor dimensions are recorded with digital calipers. Estimated tumor volumes are calculated using the formula: TV= width2 x length x π/2. Once tumor implants reach a volume of approximately 200mm3, dosing with test agents (or vehicle control) begins (designated as Day 0). Test agents are administered according to instructions from the manufacturer, as well as established literature. Tumor dimensions are measured as indicated previously and data, including individual and mean estimated tumor volumes (Mean TV ± SD), recorded for each group.
Drug Testing Workflow
At study completion, percent tumor growth inhibition (%TGI) values are calculated for all test agents (T) versus the vehicle control (C) using initial (i) and final (f) tumor measurements by the formula: %TGI=[1-(Tf-Ti)/(Cf-Ci)]x100.
At present, we are only showing tables with the %TGI values for each test agent screened against each model in the database. We are not currently showing drug response curves. We are in the process of re-analyzing all of the drug response data for each model using more rigorous algorithms in a similar fashion to our growth curves. These drug response curves will be made available in subsequent updates to our database.
The Champions TumorGraft® database provides molecular profiles for models through Next Generation Sequencing (NGS) technology and immunohistochemical staining of model tissue. At the present time these profiles include data on:
Gene Mutations (SNVs and indels) | from Whole Exome Sequencing |
Copy Number Variations (CNV) | from Whole Exome Sequencing |
Gene Splice Variants | from RNASeq |
Gene Expression Levels | from RNASeq |
Gene Fusions and Translocations | from RNASeq |
Protein Expression | Label-Free Data-Independent Acquisition (DIA) Quantitative Proteomics |
Tumor Pathology | from H&E staining of models |
Champions Oncology has established collaborations with the New York Genome Center, Rutgers University, and the Broad Institute to generate the NGS data for its TumorGraft models. All initial processing of data is performed at these centers, with final annotation and packaging for display undertaken by bioinformaticists at Champions. As well as showing this information in the database, processed NGS data files can also be downloaded by clients from individual model pages using the provided hyperlinks (where available).
RNA sequencing libraries are prepared using the TruSeq Stranded mRNA Library Preparation Kit in accordance with the manufacturer’s instructions. Briefly, 500ng of total RNA is used for purification and fragmentation of mRNA. Purified mRNA undergoes first and second strand cDNA synthesis and cDNA is then adenylated, ligated to Illumina sequencing adapters, and amplified by PCR (using 10 cycles). Final libraries are quality-controlled using fluorescent-based assays including PicoGreen (Life Technologies) or Qubit Fluorometer (Invitrogen) and Fragment Analyzer (Advanced Analytics) or BioAnalyzer (Agilent 2100), prior to sequencing on an Illumina HiSeq2500 sequencer (v4 chemistry) using 2 x 50bp cycles.
After sequencing, FASTQ files are aligned to a combined index of human and mouse reference genomes and all reads mapping uniquely to mouse removed (a process called “de-mousing”). The resultant demoused FASTQ files are then aligned to the NCBI GRCh37 human reference using STAR aligner (v2.4.2a) (PMID: 23104886). Quantification of genes annotated in Gencode v19 is performed using FeatureCounts (v1.4.3) and quantification of transcripts using RSEM (doi: 10.1186/1471-2105-12-323). QC collected with Picard (v1.83) and RSeQC (PMID: 22743226; http://broadinstitute.github.io/picard/). Normalization of feature counts is done using the edgeR package (doi:10.1093/bioinformatics/btp616). Fusion Analysis
Fusion analysis is applied to raw FASTQ files using FusionCatcher (http://dx.doi.org/10.1101/011650) using the most updated human genome references.
What does RPKM stand for and how is it calculated? RPKM values are a normalized measure of how abundantly expressed a specific gene is. RPKM stands for Reads Per Kilobase Million and is calculated from the number of reads (sequenced fragments) that align to a reference human genome (see above). These absolute read counts need to be corrected for gene length (the “Per Kilobase” part of RPKM) and sample sequencing depth (the “Million” part of RPKM) because these two elements can artificially skew higher the absolute reads for some genes. Long genes will naturally have more reads aligning to them because they have more fragments that can be sequenced. Samples with greater sequencing depth will have more reads aligning to them because more of the fragmented mRNA (cDNA) is actually sequenced.
In simple terms, RPKM is calculated as follows:
How are z-scores calculated? Z-scores are one way to measure the expression of a gene in one model relative to all models. They are a measure of the number of standard deviations away from the mean Log2(RPKM+1) for a gene in a specific model is. They are calculated by subtracting the mean Log2(RPKM+1) for a gene from the Log2(RPKM+1) value for that gene in the model of interest and dividing the result by the standard deviation of the Log2(RPKM+1) values for that gene.
How is fold expression calculated? Fold expression is another metric that measures the expression of a gene in a model relative to the mean expression of that gene across all models (using our “cancer-normal” substitute, mean Log2(RPKM+1)). In essence, it is calculated by dividing the Log2(RPKM+1) value for a specific gene in a specific model by the mean Log2(RPKM+1) value for that gene. However, there are additional correction factors that go into the calculation to account for batch sequencing effects (correcting for differences in Log2(RPKM+1) values that can be attributed to samples being sequenced at different times across different runs).
Peptide intensities were obtained by Spectronaut. Protein intensities were calculated by MSstats, using quantile-normalization per sample. Outliers were detected and removed using Median Absolute Deviation (MAD), with a cutoff of median +/- 3MADs. The values shown are Variance Stabilizing Normalization (VSN) - normalized and imputed abundances (log2(intensities)).
Three datasets are provided for proteomics/phosphoproteomics, which were generated by different processing steps after normalization:
Strictly Filtered: Contains only proteins which were detected in >= 90% of the samples, constituting a conservative dataset; The missing values were imputed.
Leniently Filtered: Contains only proteins which were detected in >= 60% of the samples, constituting a lenient dataset; The missing values were imputed.
Unfiltered: Values with neither filtering nor imputation.
For “missing at random” proteins: bpca- Bayesian PCA missing value estimation For “missing not at random” proteins (e.g. low/undetected in many samples): MinProb - imputation of left-censored missing data by random draws from a Gaussian distribution centered in a minimal value.
Proteomics Workflow
Kinase activity scores were calculated from phospho-peptide abundances using the ssGSEA algorithm, based on an assembly of databases containing kinase → substrate links information (Signor, PhosphositePlus, Hijazi et al. 2020).
Please Note: Proteomics Data is normalized per indication whereas Champions PDX Data is normalized based on the whole bank.
Whole Exome Sequencing is performed as paired-end 125-bp reads using an Illumina HiSeq 2500 machine following the manufacturer’s protocol using a defined capture kit (usually Agilent v4XT, but also V4Target, SureSelectTarget, and SeqCap EZ exome v3) to reach an average coverage of 100x. Raw reads from each model are aligned to a concatenated human and mouse reference genomes (GRC37/hg19 and GRCm38/mm10 respectively) using BWA aligner followed by filtering out of reads not aligning to the human contigs (“de-mousing”).
Local realignment around INDELs and base quality score recalibration is done using the GATK pipeline. SNP analysis is conducted using multiple callers (muTect, Lofreq and Strelka) against a normal cell line sample as a control (NA12878). SNPs below the quality threshold (quality score <20, Read depth <10) are removed from the analysis, as are SNPs found in online germline databases (HapMap, ExAC, 1000G).
Copy Number Variation Analysis is performed using EXCAVATOR (http://dx.doi.org/10.1186/gb-2013-14-10-r120) with the NA12878 normal cell line as a baseline control.