Analysis of cancer mutational signatures has been instrumental in identifying the responsible endogenous and exogenous molecular processes in cancer. The quantitative approach used to deconvolute mutational signatures is becoming an integral part of cancer research. Therefore, development of a stand-alone tool with a user-friendly interface for analysis of cancer mutational signatures is necessary. In this manuscript we introduce CANCERSIGN, which enables users to identify 3-mer and 5-mer mutational signatures within whole genome, whole exome or pooled samples. Additionally, this tool enables users to perform clustering on tumor samples based on the proportion of mutational signatures in each sample. Using CANCERSIGN, we analysed all the whole genome somatic mutation datasets profiled by the International Cancer Genome Consortium (ICGC) and identified a number of novel signatures. By examining signatures found in exonic and non-exonic regions of the genome using WGS and comparing these to signatures found in WES data, we observe that WGS can identify additional non-exonic signatures that are enriched in the non-coding regions of the genome, while the deeper sequencing of WES may help identify weak signatures that are otherwise missed in shallower WGS data.
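The 3-mer signatures mentioned in this abstract are conventionally built from counts of single base substitutions in their trinucleotide context, folded onto the pyrimidine strand to give 96 categories. The sketch below illustrates only that counting step; it is not CANCERSIGN's code, and the function names (context_label, profile) and the "T[C>T]A" label format are assumptions made for illustration.

```python
from collections import Counter

COMPLEMENT = str.maketrans("ACGT", "TGCA")

# The 96 categories: 2 pyrimidine reference bases x 3 alternatives x 16 flanks.
ALL_CONTEXTS = [
    f"{five}[{ref}>{alt}]{three}"
    for ref in "CT"
    for alt in "ACGT" if alt != ref
    for five in "ACGT"
    for three in "ACGT"
]


def revcomp(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def context_label(trinucleotide, alt):
    """Label one substitution, e.g. ('TCA', 'T') -> 'T[C>T]A'.

    The middle base of `trinucleotide` is the reference base. Mutations at
    a purine (A or G) are reverse-complemented so that every label is
    written with a pyrimidine (C or T) reference base.
    """
    tri, alt = trinucleotide.upper(), alt.upper()
    if len(tri) != 3 or set(tri) - set("ACGT"):
        raise ValueError(f"bad trinucleotide: {trinucleotide!r}")
    if len(alt) != 1 or alt not in "ACGT" or alt == tri[1]:
        raise ValueError(f"bad alternative base: {alt!r}")
    if tri[1] in "AG":
        tri, alt = revcomp(tri), revcomp(alt)
    return f"{tri[0]}[{tri[1]}>{alt}]{tri[2]}"


def profile(mutations):
    """Return the 96 counts, in ALL_CONTEXTS order, for (trinucleotide, alt) pairs."""
    counts = Counter(context_label(tri, alt) for tri, alt in mutations)
    return [counts[label] for label in ALL_CONTEXTS]
```

Stacking one such 96-element vector per sample gives the count matrix that signature deconvolution methods typically take as input.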
Publications
2020
2019
The apolipoprotein B messenger RNA editing enzyme, catalytic polypeptide-like (APOBEC) family of single-stranded DNA (ssDNA) cytosine deaminases provides innate immunity against virus and transposon replication [1-4]. A well-studied mechanism is APOBEC3G restriction of human immunodeficiency virus type 1, which is counteracted by a virus-encoded degradation mechanism [1-4]. Accordingly, most work has focused on retroviruses with obligate ssDNA replication intermediates and it is unclear whether large double-stranded DNA (dsDNA) viruses may be similarly susceptible to restriction. Here, we show that the large dsDNA herpesvirus Epstein-Barr virus (EBV), which is the causative agent of infectious mononucleosis and multiple cancers [5], utilizes a two-pronged approach to counteract restriction by APOBEC3B. Proteomics studies and immunoprecipitation experiments showed that the ribonucleotide reductase large subunit of EBV, BORF2 [6,7], binds APOBEC3B. Mutagenesis mapped the interaction to the APOBEC3B catalytic domain, and biochemical studies demonstrated that BORF2 stoichiometrically inhibits APOBEC3B DNA cytosine deaminase activity. BORF2 also caused a dramatic relocalization of nuclear APOBEC3B to perinuclear bodies. On lytic reactivation, BORF2-null viruses were susceptible to APOBEC3B-mediated deamination as evidenced by lower viral titres, lower infectivity and hypermutation. The Kaposi's sarcoma-associated herpesvirus homologue, ORF61, also bound APOBEC3B and mediated relocalization. These data support a model where the genomic integrity of human γ-herpesviruses is maintained by active neutralization of the antiviral enzyme APOBEC3B.
2018
BACKGROUND: Multiple endogenous and exogenous sources of DNA damage contribute to the overall mutation burden in cancer, with distinct and overlapping combinations contributing to each cancer type. Many mutation sources result in characteristic mutation signatures, which can be deduced from tumor genomic DNA sequences. Examples include spontaneous hydrolytic deamination of methyl-cytosine bases in CG motifs (AGEING signature) and C-to-T and C-to-G mutations in 5'-TC(A/T) motifs (APOBEC signature).
METHODS: The deconstructSigs R package was used to analyze single base substitution mutation signatures in over 1000 cancer cell lines. Two additional approaches were used to analyze the APOBEC mutation signature.
RESULTS: Most cell lines show evidence for multiple mutation signatures. For instance, the AGEING signature, which is the largest source of mutation in most primary tumors, predominates in the majority of cancer cell lines. The APOBEC mutation signature is enriched in cancer cell lines from breast, lung, head/neck, bladder, and cervical cancers, where this signature also comprises a large fraction of all mutations.
CONCLUSIONS: The single base substitution mutation signatures of cancer cell lines often reflect those of the original tumors from which they are derived. Cancer cell lines with enrichments for distinct mutation signatures such as APOBEC have the potential to become model systems for fundamental research on the underlying mechanisms and for advancing clinical strategies to exploit these processes.
Human APOBEC3H (A3H) is a single-stranded DNA cytosine deaminase that inhibits HIV-1. Seven haplotypes (I-VII) and four splice variants (SV154/182/183/200) with differing antiviral activities and geographic distributions have been described, but the genetic and mechanistic basis for variant expression and function remains unclear. Using a combined bioinformatic/experimental analysis, we find that SV200 expression is specific to haplotype II, which is primarily found in sub-Saharan Africa. The underlying genetic mechanism for differential mRNA splicing is an ancient intronic deletion [del(ctc)] within A3H haplotype II sequence. We show that SV200 is at least fourfold more HIV-1 restrictive than other A3H splice variants. To counteract this elevated antiviral activity, HIV-1 protease cleaves SV200 into a shorter, less restrictive isoform. Our analyses indicate that, in addition to Vif-mediated degradation, HIV-1 may use protease as a counter-defense mechanism against A3H in >80% of sub-Saharan African populations.
2017
APOBEC3s (A3s) are single-stranded DNA cytosine deaminases that provide innate immune defences against retroviruses and mobile elements. A3s are specific to eutherian mammals because no direct homologs exist at the syntenic genomic locus in metatherian (marsupial) or prototherian (monotreme) mammals. However, these species do have the likely evolutionary precursors of the A3s, the antibody gene deaminase AID and the RNA/DNA editing enzyme APOBEC1 (A1). Here, we used cell culture-based assays to determine whether opossum A1 restricts the infectivity of retroviruses including human immunodeficiency virus type 1 (HIV-1) and the mobility of LTR/non-LTR retrotransposons. Opossum A1 partially inhibited HIV-1, as well as simian immunodeficiency virus (SIV), murine leukemia virus (MLV), and the retrotransposon MusD. The mechanism of inhibition required catalytic activity, except for human LINE1 (L1) restriction, which was deamination-independent. These results indicate that opossum A1 functions as an innate barrier to infection by retroviruses such as HIV-1, and controls LTR/non-LTR retrotransposition in marsupials.
APOBEC3 (A3) family proteins are DNA cytosine deaminases recognized for contributing to HIV-1 restriction and mutation. Prior studies have demonstrated that A3D, A3F, and A3G enzymes elicit a robust anti-HIV-1 effect in cell cultures and in humanized mouse models. Human A3H is polymorphic and can be categorized into three phenotypes: stable, intermediate, and unstable. However, the anti-viral effect of endogenous A3H in vivo has yet to be examined. Here we utilize a hematopoietic stem cell-transplanted humanized mouse model and demonstrate that stable A3H robustly affects HIV-1 fitness in vivo. In contrast, the selection pressure mediated by intermediate A3H is relaxed. Intriguingly, viral genomic RNA sequencing revealed that HIV-1 frequently adapts to better counteract stable A3H during replication in humanized mice. Molecular phylogenetic analyses and mathematical modeling suggest that stable A3H may be a critical factor in human-to-human viral transmission. Taken together, this study provides evidence that stable variants of A3H impose selective pressure on HIV-1.
[This corrects the article DOI: 10.1371/journal.ppat.1006348.].
2016
The dinucleotide CpG is highly underrepresented in the genome of human immunodeficiency virus type 1 (HIV-1). To identify the source of CpG depletion in the HIV-1 genome, we investigated two biological mechanisms: (1) CpG methylation-induced transcriptional silencing and (2) CpG recognition by Toll-like receptors (TLRs). We hypothesized that HIV-1 has been under selective evolutionary pressure by these mechanisms, leading to the reduction of CpG in its genome. A CpG-depleted genome would enable HIV-1 to avoid methylation-induced transcriptional silencing and/or to avoid recognition by TLRs that identify foreign CpG sequences. We investigated these two hypotheses by determining the sequence context dependency of CpG depletion and comparing it with that of CpG methylation and TLR recognition. We found that in both human and HIV-1 genomes the CpG motifs flanked by T/A were depleted most and those flanked by C/G were depleted least. Similarly, our analyses of human methylome data revealed that the CpG motifs flanked by T/A were methylated most and those flanked by C/G were methylated least. Given that a similar CpG depletion pattern was observed for the human genome, within which CpGs are not likely to be recognized by TLRs, we argue that the main source of CpG depletion in HIV-1 is likely host-induced methylation. Analyses of CpG motifs in over 100 viruses revealed that this unique CpG representation pattern is specific to the human and simian immunodeficiency viruses.
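To make the idea of CpG depletion and its flanking-context dependency concrete, the sketch below computes the widely used observed/expected CpG ratio and tallies each CpG by its immediate 5' and 3' neighbours. The abstract does not state which statistic the study used, so the ratio formula here is an assumption, and the function names are hypothetical.

```python
from collections import Counter


def cpg_obs_exp(seq):
    """Observed/expected CpG ratio: count(CG) * N / (count(C) * count(G)).

    Values well below 1 indicate CpG depletion relative to base composition.
    Returns 0.0 when the sequence contains no C or no G.
    """
    seq = seq.upper()
    c, g = seq.count("C"), seq.count("G")
    if c == 0 or g == 0:
        return 0.0
    # "CG" cannot overlap itself, so str.count finds every occurrence.
    return seq.count("CG") * len(seq) / (c * g)


def cpg_flank_counts(seq):
    """Count CpG sites keyed by (5' neighbour, 3' neighbour).

    CpGs at the very start or end of the sequence lack a neighbour on one
    side and are skipped.
    """
    seq = seq.upper()
    counts = Counter()
    for i in range(1, len(seq) - 2):
        if seq[i:i + 2] == "CG":
            counts[(seq[i - 1], seq[i + 2])] += 1
    return counts
```

Comparing the counts for A/T-flanked keys such as ('T', 'A') with those for C/G-flanked keys such as ('C', 'G') gives a first look at the kind of context dependency the abstract describes.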
Cytosine mutations within TCA/T motifs are common in cancer. A likely cause is the DNA cytosine deaminase APOBEC3B (A3B). However, A3B-null breast tumours still have this mutational bias. Here we show that APOBEC3H haplotype I (A3H-I) provides a likely solution to this paradox. A3B-null tumours with this mutational bias have at least one copy of A3H-I despite little genetic linkage between these genes. Although deemed inactive previously, A3H-I has robust activity in biochemical and cellular assays, similar to A3H-II after compensation for lower protein expression levels. Gly105 in A3H-I (versus Arg105 in A3H-II) results in lower protein expression levels and increased nuclear localization, providing a mechanism for accessing genomic DNA. A3H-I also associates with clonal TCA/T-biased mutations in lung adenocarcinoma suggesting this enzyme makes broader contributions to cancer mutagenesis. These studies combine to suggest that A3B and A3H-I, together, explain the bulk of 'APOBEC signature' mutations in cancer.
2015
The human genome encodes a family of editing enzymes known as APOBEC3 (apolipoprotein B mRNA editing enzyme, catalytic polypeptide-like 3). They induce context-dependent G-to-A changes, referred to as "hypermutation", in the genomes of viruses such as HIV, SIV, HBV and endogenous retroviruses. Hypermutation is characterized by aligning affected sequences to a reference sequence. We show that indels (insertions/deletions) in the sequences lead to an incorrect assignment of APOBEC3 targeted and non-target sites. This can result in an incorrect identification of hypermutated sequences and erroneous biological inferences made based on hypermutation analysis.
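The indel problem can be illustrated with a small sketch that classifies each reference G as a target or control site from the downstream base in the query, skipping gap columns instead of reading them as bases. The target definition used here (G followed by A or G) is a simplifying assumption, and the function is a hypothetical illustration, not the method of the paper.

```python
def classify_g_sites(ref_aln, qry_aln):
    """Count G-to-A changes at target and control sites in a pairwise alignment.

    Both inputs are aligned strings of equal length with '-' for gaps.
    A site is considered when the reference has G and the query has a base.
    Its context is the next non-gap base in the query: A or G marks a
    target site (assumed definition), C or T a control site.
    """
    if len(ref_aln) != len(qry_aln):
        raise ValueError("aligned sequences must have equal length")
    ref, qry = ref_aln.upper(), qry_aln.upper()
    counts = {"target_mut": 0, "target_total": 0,
              "control_mut": 0, "control_total": 0}
    for i, (r, q) in enumerate(zip(ref, qry)):
        if r != "G" or q == "-":
            continue
        # Skip alignment gaps when looking for the downstream base.
        downstream = next((b for b in qry[i + 1:] if b != "-"), None)
        if downstream is None:
            continue  # no downstream base, so the context is undefined
        kind = "target" if downstream in "AG" else "control"
        counts[kind + "_total"] += 1
        if q == "A":
            counts[kind + "_mut"] += 1
    return counts
```

A version that read the column at i + 1 directly would see '-' after the first G in an alignment such as "G-AGT" / "A-AGT" and misassign or drop the site, which is the kind of error the abstract warns about.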