APOBEC3 (A3) family proteins are DNA cytosine deaminases recognized for contributing to HIV-1 restriction and mutation. Prior studies have demonstrated that A3D, A3F, and A3G enzymes elicit a robust anti-HIV-1 effect in cell cultures and in humanized mouse models. Human A3H is polymorphic and can be categorized into three phenotypes: stable, intermediate, and unstable. However, the antiviral effect of endogenous A3H in vivo has yet to be examined. Here we utilize a hematopoietic stem cell-transplanted humanized mouse model and demonstrate that stable A3H robustly affects HIV-1 fitness in vivo. In contrast, the selection pressure mediated by intermediate A3H is relaxed. Intriguingly, viral genomic RNA sequencing revealed that HIV-1 frequently adapts to better counteract stable A3H during replication in humanized mice. Molecular phylogenetic analyses and mathematical modeling suggest that stable A3H may be a critical factor in human-to-human viral transmission. Taken together, this study provides evidence that stable variants of A3H impose selective pressure on HIV-1.
Publications
2017
2016
The dinucleotide CpG is highly underrepresented in the genome of human immunodeficiency virus type 1 (HIV-1). To identify the source of CpG depletion in the HIV-1 genome, we investigated two biological mechanisms: (1) CpG methylation-induced transcriptional silencing and (2) CpG recognition by Toll-like receptors (TLRs). We hypothesized that HIV-1 has been under selective evolutionary pressure by these mechanisms leading to the reduction of CpG in its genome. A CpG depleted genome would enable HIV-1 to avoid methylation-induced transcriptional silencing and/or to avoid recognition by TLRs that identify foreign CpG sequences. We investigated these two hypotheses by determining the sequence context dependency of CpG depletion and comparing it with that of CpG methylation and TLR recognition. We found that in both human and HIV-1 genomes the CpG motifs flanked by T/A were depleted most and those flanked by C/G were depleted least. Similarly, our analyses of human methylome data revealed that the CpG motifs flanked by T/A were methylated most and those flanked by C/G were methylated least. Given that a similar CpG depletion pattern was observed for the human genome within which CpGs are not likely to be recognized by TLRs, we argue that the main source of CpG depletion in HIV-1 is likely host-induced methylation. Analyses of CpG motifs in over 100 viruses revealed that this unique CpG representation pattern is specific to the human and simian immunodeficiency viruses.
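The kind of measurement described above can be illustrated with a minimal sketch. It assumes the widely used observed/expected normalization (CpG count × sequence length / (C count × G count)), which is not necessarily the statistic the authors used. The function names `cpg_obs_exp` and `cpg_flank_counts` are hypothetical.

```python
from collections import Counter


def cpg_obs_exp(seq: str) -> float:
    """Observed/expected CpG ratio: (#CG * N) / (#C * #G).

    Values well below 1 indicate CpG depletion relative to base composition.
    Assumption: this common normalization, not necessarily the paper's.
    """
    seq = seq.upper()
    n = len(seq)
    c = seq.count("C")
    g = seq.count("G")
    if c == 0 or g == 0:
        return 0.0
    cpg = sum(1 for i in range(n - 1) if seq[i:i + 2] == "CG")
    return cpg * n / (c * g)


def cpg_flank_counts(seq: str) -> Counter:
    """Count CpG occurrences keyed by (5' flanking base, 3' flanking base)."""
    seq = seq.upper()
    counts = Counter()
    # Start at 1 and stop 2 before the end so both flanks exist.
    for i in range(1, len(seq) - 2):
        if seq[i:i + 2] == "CG":
            counts[(seq[i - 1], seq[i + 2])] += 1
    return counts


if __name__ == "__main__":
    toy = "ACGTTCGA"  # toy sequence for illustration only, not real data
    print(cpg_obs_exp(toy))
    print(cpg_flank_counts(toy))
```

Comparing the flank-specific counts against counts expected from base composition would give a context-dependent depletion profile of the sort the abstract compares between genomes.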
Cytosine mutations within TCA/T motifs are common in cancer. A likely cause is the DNA cytosine deaminase APOBEC3B (A3B). However, A3B-null breast tumours still have this mutational bias. Here we show that APOBEC3H haplotype I (A3H-I) provides a likely solution to this paradox. A3B-null tumours with this mutational bias have at least one copy of A3H-I despite little genetic linkage between these genes. Although deemed inactive previously, A3H-I has robust activity in biochemical and cellular assays, similar to A3H-II after compensation for lower protein expression levels. Gly105 in A3H-I (versus Arg105 in A3H-II) results in lower protein expression levels and increased nuclear localization, providing a mechanism for accessing genomic DNA. A3H-I also associates with clonal TCA/T-biased mutations in lung adenocarcinoma suggesting this enzyme makes broader contributions to cancer mutagenesis. These studies combine to suggest that A3B and A3H-I, together, explain the bulk of 'APOBEC signature' mutations in cancer.
2015
The human genome encodes a family of editing enzymes known as APOBEC3 (apolipoprotein B mRNA editing enzyme, catalytic polypeptide-like 3). They induce context-dependent G-to-A changes, referred to as "hypermutation", in the genomes of viruses such as HIV, SIV, and HBV and of endogenous retroviruses. Hypermutation is characterized by aligning affected sequences to a reference sequence. We show that indels (insertions/deletions) in the sequences lead to an incorrect assignment of APOBEC3 target and non-target sites. This can result in incorrect identification of hypermutated sequences and in erroneous biological inferences based on hypermutation analysis.
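How an indel can change the apparent context of a mutated site can be shown with a small sketch. It assumes that the downstream context is read from the query sequence and that gap columns are skipped; both are choices made for this illustration, not a description of the authors' software. The function name `g_to_a_sites` is hypothetical.

```python
def g_to_a_sites(ref_aln: str, qry_aln: str) -> list[tuple[int, str]]:
    """Return (alignment column, dinucleotide context) for each G->A change.

    The context is the mutated G plus the next non-gap base in the query,
    so a deletion right after the site does not hide the true neighbour.
    """
    if len(ref_aln) != len(qry_aln):
        raise ValueError("aligned sequences must have equal length")
    ref_aln, qry_aln = ref_aln.upper(), qry_aln.upper()
    sites = []
    for i, (r, q) in enumerate(zip(ref_aln, qry_aln)):
        if r == "G" and q == "A":
            # Skip gap characters to find the real downstream neighbour.
            downstream = next((b for b in qry_aln[i + 1:] if b != "-"), "")
            sites.append((i, "G" + downstream))
    return sites


if __name__ == "__main__":
    # Toy alignment: the query carries a one-base deletion after the site.
    print(g_to_a_sites("GTGA", "A-GA"))
```

In the toy alignment, reading the next column directly would give a gap in the query or a T in the reference, whereas the gap-aware reading gives GG, a context the abstracts in this list associate with APOBEC3G.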
RATIONALE: Mass spectrometric identification of compounds in chromatography can be obtained from molecular masses from soft ionization mass spectrometry techniques such as field ionization (FI) and fragmentation patterns from hard ionization techniques such as electron ionization (EI). Simultaneous detection by EI and FI mass spectrometry allows alignment of the different information from each method.
METHODS: We report the construction and characteristics of a combined instrument consisting of a gas chromatograph and two parallel mass spectrometry ionization sources, EI and FI. When considering both ion yield and signal-to-noise it was postulated that good-quality EI and FI mass spectra could be obtained simultaneously using a post-column splitter with a split fraction of 1:10 for EI/FI. This has been realised and we report its application for the analysis of several complex mixtures.
RESULTS: The differences between the full width at half maximum (FWHM) of the EI and FI chromatograms were statistically insignificant, and the retention times of the chromatograms were highly correlated (r² = 0.9999) with no detectable bias. The applicability and significance of this combined instrument and the attendant methodology are illustrated by the analysis of standard samples of 13 compounds with diverse structures, and the analysis of mixtures of fatty acids, fish oil, hydrocarbons and yeast metabolites.
CONCLUSIONS: This combined dual-source instrument saves time and resources, and more importantly generates equivalent chromatograms aligned in time, in EI and FI (i.e. peaks with similar shapes and identical positions). The identical FWHMs and retention times of the EI and FI chromatograms in this combined instrument enable the accurate assignment of fragment ions from EI to their corresponding molecular ions in FI.
2014
The influence of major histocompatibility complex class I (MHC-I) alleles on human immunodeficiency virus (HIV) diversity in humans has been well characterized at the population level. MHC-I alleles likely affect viral diversity in the simian immunodeficiency virus (SIV)-infected pig-tailed macaque (Macaca nemestrina) model, but this is poorly characterized. We studied the evolution of SIV in pig-tailed macaques with a range of MHC-I haplotypes. SIV(mac251) genomes were amplified from the plasma of 44 pig-tailed macaques infected with SIV(mac251) at 4 to 10 months after infection and characterized by Illumina deep sequencing. MHC-I typing was performed on cellular RNA using Roche/454 pyrosequencing. MHC-I haplotypes and viral sequence polymorphisms at both individual mutations and groups of mutations spanning 10-amino-acid segments were linked using in-house bioinformatics pipelines, since cytotoxic T lymphocyte (CTL) escape can occur at different amino acids within the same epitope in different animals. The approach successfully identified 6 known CTL escape mutations within 3 Mane-A1*084-restricted epitopes. The approach also identified over 70 new SIV polymorphisms linked to a variety of MHC-I haplotypes. Using functional CD8 T cell assays, we confirmed that one of these associations, a Mane-B028 haplotype-linked mutation in Nef, corresponded to a CTL epitope. We also identified mutations associated with the Mane-B017 haplotype that were previously described to be CTL epitopes restricted by Mamu-B*017:01 in rhesus macaques. This detailed study of pig-tailed macaque MHC-I genetics and SIV polymorphisms will enable a refined level of analysis for future vaccine design and strategies for treatment of HIV infection.
IMPORTANCE: Cytotoxic T lymphocytes select for virus escape mutants of HIV and SIV, and this limits the effectiveness of vaccines and immunotherapies against these viruses. Patterns of immune escape variants are similar in HIV type 1-infected human subjects that share the same MHC-I genes, but this has not been studied for SIV infection of macaques. By studying SIV sequence diversity in 44 MHC-typed SIV-infected pigtail macaques, we defined over 70 sites within SIV where mutations were common in macaques sharing particular MHC-I genes. Further, pigtail macaques sharing nearly identical MHC-I genes with rhesus macaques responded to the same CTL epitope and forced immune escape. This allows many reagents developed to study rhesus macaques to also be used to study pigtail macaques. Overall, our study defines sites of immune escape in SIV in pigtailed macaques, and this enables a more refined level of analysis of future vaccine design and strategies for treatment of HIV infection.
We report a systematic study of the effects of types and positions of amino acid residues of tripeptides on the formation constants logβ, acid dissociation constants pKa, and the copper coordination modes of the copper(II) complexes with 27 tripeptides formed from the amino acids glutamic acid, glycine, and histidine. logβ values were calculated from pH titrations with 1 mmol L⁻¹:1 mmol L⁻¹ solutions of the metal and ligand and previously reported ligand pKa values. Generalized multiplicative analysis of variance (GEMANOVA) was used to model the logβ values of the saturated, most protonated, monoprotonated, logβ(CuL) - logβ(HL), and pKa of the amide group. The resulting model of the saturated copper species has a two-term model describing an interaction between the central and the C-terminal residues plus a smaller, main effect of the N-terminal residue. The model supports the conclusion that two copper coordination modes exist depending on the absence or presence of His at the central position, giving species in which copper is coordinated via two or three fused chelate rings, respectively. The GEMANOVA model for pKamide, which is the same as that for the saturated complex, showed that Gly-Gly-His has the lowest pKamide values among the 27 tripeptides. Visible spectroscopy indicated the formation of metal-ligand dimers for tripeptides His-His-Gly and His-His-Glu, but not for His-His-His, and the formation of multiple ligand bis complexes CuL2 and Cu(HL)2 for tripeptides (Glu/Gly)-His-(Glu/Gly) and His-(Glu/Gly)-(Glu/Gly), respectively.
We used a multivariate data analysis approach to identify motifs associated with HIV hypermutation by different APOBEC3 enzymes. The analysis showed that APOBEC3G targets G mainly within GG, TG, TGG, GGG, TGGG and also GGGT. The G nucleotides flanked by a C at the 3' end (in +1 and +2 positions) were indicated as disfavoured targets by APOBEC3G. The G nucleotides within GGGG were found to be targeted at a frequency much lower than expected. We found that the infrequent G-to-A mutation within GGGG is not limited to the inaccessibility, to APOBEC3, of poly Gs in the central and 3' polypurine tracts (PPTs), which remain double-stranded during HIV reverse transcription. GGGG motifs outside the PPTs were also disfavoured. The motifs GGAG and GAGG were also found to be disfavoured targets for APOBEC3. The motif-dependent mutation of G within the HIV genome by members of the APOBEC3 family other than APOBEC3G was limited to GA→AA changes. The results did not show evidence of other types of context-dependent G-to-A changes in the HIV genome.
2013
Almost half of the human genome is composed of transposable elements. The genomic structures and life cycles of some of these elements suggest they are a result of waves of retroviral infection and transposition over millions of years. The reduction of retrotransposition activity in primates compared to that in nonprimates, such as mice, has been attributed to the positive selection of several antiretroviral factors, such as apolipoprotein B mRNA editing enzymes. Among these, APOBEC3G is known to mutate G to A within the context of GG in the genome of endogenous as well as several exogenous retroelements (the first G of the dinucleotide is the one mutated, i.e., GG→AG). On the other hand, APOBEC3F and to a lesser extent other APOBEC3 members induce G-to-A changes within the dinucleotide GA (GA→AA). It is known that these enzymes can induce deleterious mutations in the genome of retroviral sequences, but the evolution and/or inactivation of retroelements as a result of mutation by these proteins is not clear. Here, we analyze the mutation signatures of these proteins on large populations of long interspersed nuclear element (LINE), short interspersed nuclear element (SINE), and endogenous retrovirus (ERV) families in the human genome to infer possible evolutionary pressure and/or hypermutation events. Sequence context dependency of mutation by APOBEC3 allows investigation of the changes in the genome of retroelements by inspecting the depletion of G and enrichment of A within the APOBEC3 target and product motifs, respectively. Analysis of approximately 22,000 LINE-1 (L1), 24,000 SINE Alu, and 3,000 ERV sequences showed a footprint of GG→AG mutation by APOBEC3G and GA→AA mutation by other members of the APOBEC3 family (e.g., APOBEC3F) on the genome of ERV-K and ERV-1 elements but not on those of ERV-L, LINE, or SINE.
We report pKa values with measurement uncertainties for all labile protons of the 27 tripeptides prepared from the amino acids glutamic acid (E), glycine (G) and histidine (H). Each tripeptide (GGG, GGE, GGH, …, HHH) was subjected to alkali titration, and pKa values were calculated from triplicate potentiometric titration data using HyperQuad 2008 software. A generalized multiplicative analysis of variance (GEMANOVA) of pKa values for the most acidic proton gave the optimum model having two terms, an interaction between the end amino acids plus an isolated main effect of the central amino acid.
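The count of 27 follows from three residue choices at each of three positions (3 × 3 × 3). As a minimal sketch, the full set can be enumerated in the order suggested by the abstract (G, then E, then H); the name `tripeptides` is hypothetical.

```python
from itertools import product

# One-letter codes in the order implied by "GGG, GGE, GGH, ..., HHH".
RESIDUES = "GEH"

# Every ordered choice of three residues: N-terminal, central, C-terminal.
tripeptides = ["".join(p) for p in product(RESIDUES, repeat=3)]

if __name__ == "__main__":
    print(len(tripeptides))
    print(tripeptides[:3], tripeptides[-1])
```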