Publications

2016

Alinejad-Rokny, Hamid, Firoz Anwar, Shafagh A Waters, Miles P Davenport, and Diako Ebrahimi. 2016. “Source of CpG Depletion in the HIV-1 Genome”. Molecular Biology and Evolution 33 (12): 3205-12.

The dinucleotide CpG is highly underrepresented in the genome of human immunodeficiency virus type 1 (HIV-1). To identify the source of CpG depletion in the HIV-1 genome, we investigated two biological mechanisms: (1) CpG methylation-induced transcriptional silencing and (2) CpG recognition by Toll-like receptors (TLRs). We hypothesized that HIV-1 has been under selective evolutionary pressure by these mechanisms leading to the reduction of CpG in its genome. A CpG depleted genome would enable HIV-1 to avoid methylation-induced transcriptional silencing and/or to avoid recognition by TLRs that identify foreign CpG sequences. We investigated these two hypotheses by determining the sequence context dependency of CpG depletion and comparing it with that of CpG methylation and TLR recognition. We found that in both human and HIV-1 genomes the CpG motifs flanked by T/A were depleted most and those flanked by C/G were depleted least. Similarly, our analyses of human methylome data revealed that the CpG motifs flanked by T/A were methylated most and those flanked by C/G were methylated least. Given that a similar CpG depletion pattern was observed for the human genome within which CpGs are not likely to be recognized by TLRs, we argue that the main source of CpG depletion in HIV-1 is likely host-induced methylation. Analyses of CpG motifs in over 100 viruses revealed that this unique CpG representation pattern is specific to the human and simian immunodeficiency viruses.
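CpG depletion of the kind described above is commonly quantified as an observed/expected ratio, and the flanking-base analysis amounts to counting CpG motifs by the class of their neighbours. The following is a minimal sketch under those assumptions, not the authors' pipeline; the function names and the W (A/T) and S (C/G) class labels are illustrative choices.

```python
from collections import Counter


def cpg_obs_exp(seq: str) -> float:
    """Observed/expected CpG ratio: f(CG) / (f(C) * f(G)).

    Values well below 1 indicate CpG depletion. Returns NaN when the
    ratio is undefined (too short, or no C or no G in the sequence).
    """
    seq = seq.upper()
    n = len(seq)
    c, g = seq.count("C"), seq.count("G")
    if n < 2 or c == 0 or g == 0:
        return float("nan")
    cg = sum(1 for i in range(n - 1) if seq[i:i + 2] == "CG")
    return (cg / (n - 1)) / ((c / n) * (g / n))


def flanked_cpg_counts(seq: str) -> Counter:
    """Count CpG motifs grouped by flanking-base class.

    'W' stands for A or T and 'S' for C or G, so 'WCGW' is a CpG
    flanked by A/T on both sides. CpGs at the sequence ends, or next
    to ambiguous bases, are skipped.
    """
    seq = seq.upper()
    base_class = {"A": "W", "T": "W", "C": "S", "G": "S"}
    counts = Counter()
    for i in range(1, len(seq) - 2):
        if seq[i:i + 2] == "CG":
            left, right = seq[i - 1], seq[i + 2]
            if left in base_class and right in base_class:
                counts[base_class[left] + "CG" + base_class[right]] += 1
    return counts
```

Comparing depletion between contexts would also require an expected count for each flanked motif; that normalisation is omitted here because the abstract does not specify it.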

Starrett, Gabriel J, Elizabeth M Luengas, Jennifer L McCann, Diako Ebrahimi, Nuri A Temiz, Robin P Love, Yuqing Feng, et al. 2016. “The DNA Cytosine Deaminase APOBEC3H Haplotype I Likely Contributes to Breast and Lung Cancer Mutagenesis”. Nature Communications 7: 12918. https://doi.org/10.1038/ncomms12918.

Cytosine mutations within TCA/T motifs are common in cancer. A likely cause is the DNA cytosine deaminase APOBEC3B (A3B). However, A3B-null breast tumours still have this mutational bias. Here we show that APOBEC3H haplotype I (A3H-I) provides a likely solution to this paradox. A3B-null tumours with this mutational bias have at least one copy of A3H-I despite little genetic linkage between these genes. Although deemed inactive previously, A3H-I has robust activity in biochemical and cellular assays, similar to A3H-II after compensation for lower protein expression levels. Gly105 in A3H-I (versus Arg105 in A3H-II) results in lower protein expression levels and increased nuclear localization, providing a mechanism for accessing genomic DNA. A3H-I also associates with clonal TCA/T-biased mutations in lung adenocarcinoma suggesting this enzyme makes broader contributions to cancer mutagenesis. These studies combine to suggest that A3B and A3H-I, together, explain the bulk of 'APOBEC signature' mutations in cancer.

2015

Alinejad-Rokny, Hamid, and Diako Ebrahimi. 2015. “A Method to Avoid Errors Associated With the Analysis of Hypermutated Viral Sequences by Alignment-Based Methods”. Journal of Biomedical Informatics 58: 220-25. https://doi.org/10.1016/j.jbi.2015.10.008.

The human genome encodes a family of editing enzymes known as APOBEC3 (apolipoprotein B mRNA editing enzyme, catalytic polypeptide-like 3). They induce context-dependent G-to-A changes, referred to as "hypermutation", in the genomes of viruses such as HIV, SIV, HBV and endogenous retroviruses. Hypermutation is characterized by aligning affected sequences to a reference sequence. We show that indels (insertions/deletions) in the sequences lead to an incorrect assignment of APOBEC3 target and non-target sites. This can result in an incorrect identification of hypermutated sequences and erroneous biological inferences based on hypermutation analysis.
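The effect of an indel on site assignment can be shown with a toy alignment. The sketch below is an illustration under stated assumptions, not the method proposed in the paper: it treats a reference G as an APOBEC3 target when the next base in the query is G or A (GG/GA contexts) and as a non-target when it is C or T. The function names and the `skip_gaps` switch are hypothetical.

```python
def downstream_base(aligned: str, i: int, skip_gaps: bool) -> str:
    """Return the base after column i, optionally skipping '-' gaps."""
    j = i + 1
    while skip_gaps and j < len(aligned) and aligned[j] == "-":
        j += 1
    return aligned[j] if j < len(aligned) else ""


def g_site_labels(ref: str, query: str, skip_gaps: bool):
    """Label every reference G by its downstream context in the query.

    Returns (column, label, mutated) tuples, where `mutated` is True
    when the query carries A at a reference G (a G-to-A change).
    """
    if len(ref) != len(query):
        raise ValueError("aligned sequences must have equal length")
    labels = []
    for i, base in enumerate(ref):
        if base != "G":
            continue
        nxt = downstream_base(query, i, skip_gaps)
        if nxt in ("G", "A"):
            label = "target"
        elif nxt in ("C", "T"):
            label = "non-target"
        else:
            label = "undetermined"  # gap or end of sequence
        labels.append((i, label, query[i] == "A"))
    return labels
```

With `ref = "AGCAT"` and `query = "AA-AT"`, the deletion directly after the mutated G leaves the site undetermined when the raw alignment column is read, but makes it a GA target when the gap is skipped, while the reference context is GC, a non-target. The same G-to-A change can therefore be counted in different categories depending on how the indel is handled.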

Hejazi, Leila, Michael Guilhaus, Brynn Hibbert, and Diako Ebrahimi. 2015. “Gas Chromatography With Parallel Hard and Soft Ionization Mass Spectrometry”. Rapid Communications in Mass Spectrometry: RCM 29 (1): 91-9. https://doi.org/10.1002/rcm.7091.

RATIONALE: Mass spectrometric identification of compounds in chromatography can be obtained from molecular masses from soft ionization mass spectrometry techniques such as field ionization (FI) and fragmentation patterns from hard ionization techniques such as electron ionization (EI). Simultaneous detection by EI and FI mass spectrometry allows alignment of the different information from each method.

METHODS: We report the construction and characteristics of a combined instrument consisting of a gas chromatograph and two parallel mass spectrometry ionization sources, EI and FI. When considering both ion yield and signal-to-noise it was postulated that good-quality EI and FI mass spectra could be obtained simultaneously using a post-column splitter with a split fraction of 1:10 for EI/FI. This has been realised and we report its application for the analysis of several complex mixtures.

RESULTS: The differences between the full width at half maximum (FWHM) of the EI and FI chromatograms were statistically insignificant, and the retention times of the chromatograms were highly correlated (r² = 0.9999) with no detectable bias. The applicability and significance of this combined instrument and the attendant methodology are illustrated by the analysis of standard samples of 13 compounds with diverse structures, and the analysis of mixtures of fatty acids, fish oil, hydrocarbons and yeast metabolites.

CONCLUSIONS: This combined dual-source instrument saves time and resources, and more importantly generates equivalent chromatograms aligned in time, in EI and FI (i.e. peaks with similar shapes and identical positions). The identical FWHMs and retention times of the EI and FI chromatograms in this combined instrument enable the accurate assignment of fragment ions from EI to their corresponding molecular ions in FI.

2014

Gooneratne, Shayarana L, Hamid Alinejad-Rokny, Diako Ebrahimi, Patrick S Bohn, Roger W Wiseman, David H O’Connor, Miles P Davenport, and Stephen J Kent. 2014. “Linking Pig-Tailed Macaque Major Histocompatibility Complex Class I Haplotypes and Cytotoxic T Lymphocyte Escape Mutations in Simian Immunodeficiency Virus Infection”. Journal of Virology 88 (24): 14310-25. https://doi.org/10.1128/JVI.02428-14.

The influence of major histocompatibility complex class I (MHC-I) alleles on human immunodeficiency virus (HIV) diversity in humans has been well characterized at the population level. MHC-I alleles likely affect viral diversity in the simian immunodeficiency virus (SIV)-infected pig-tailed macaque (Macaca nemestrina) model, but this is poorly characterized. We studied the evolution of SIV in pig-tailed macaques with a range of MHC-I haplotypes. SIVmac251 genomes were amplified from the plasma of 44 pig-tailed macaques infected with SIVmac251 at 4 to 10 months after infection and characterized by Illumina deep sequencing. MHC-I typing was performed on cellular RNA using Roche/454 pyrosequencing. MHC-I haplotypes and viral sequence polymorphisms at both individual mutations and groups of mutations spanning 10-amino-acid segments were linked using in-house bioinformatics pipelines, since cytotoxic T lymphocyte (CTL) escape can occur at different amino acids within the same epitope in different animals. The approach successfully identified 6 known CTL escape mutations within 3 Mane-A1*084-restricted epitopes. The approach also identified over 70 new SIV polymorphisms linked to a variety of MHC-I haplotypes. Using functional CD8 T cell assays, we confirmed that one of these associations, a Mane-B028 haplotype-linked mutation in Nef, corresponded to a CTL epitope. We also identified mutations associated with the Mane-B017 haplotype that were previously described to be CTL epitopes restricted by Mamu-B*017:01 in rhesus macaques. This detailed study of pig-tailed macaque MHC-I genetics and SIV polymorphisms will enable a refined level of analysis for future vaccine design and strategies for treatment of HIV infection.

IMPORTANCE: Cytotoxic T lymphocytes select for virus escape mutants of HIV and SIV, and this limits the effectiveness of vaccines and immunotherapies against these viruses. Patterns of immune escape variants are similar in HIV type 1-infected human subjects that share the same MHC-I genes, but this has not been studied for SIV infection of macaques. By studying SIV sequence diversity in 44 MHC-typed SIV-infected pigtail macaques, we defined over 70 sites within SIV where mutations were common in macaques sharing particular MHC-I genes. Further, pigtail macaques sharing nearly identical MHC-I genes with rhesus macaques responded to the same CTL epitope and forced immune escape. This allows many reagents developed to study rhesus macaques to also be used to study pigtail macaques. Overall, our study defines sites of immune escape in SIV in pigtailed macaques, and this enables a more refined level of analysis of future vaccine design and strategies for treatment of HIV infection.

Khoury, Rima Raffoul, Gordon J Sutton, Diako Ebrahimi, and Brynn Hibbert. 2014. “Formation Constants of Copper(II) Complexes With Tripeptides Containing Glu, Gly, and His: Potentiometric Measurements and Modeling by Generalized Multiplicative Analysis of Variance”. Inorganic Chemistry 53 (3): 1278-87. https://doi.org/10.1021/ic4009575.

We report a systematic study of the effects of types and positions of amino acid residues of tripeptides on the formation constants logβ, acid dissociation constants pKa, and the copper coordination modes of the copper(II) complexes with 27 tripeptides formed from the amino acids glutamic acid, glycine, and histidine. logβ values were calculated from pH titrations with 1 mmol L⁻¹:1 mmol L⁻¹ solutions of the metal and ligand and previously reported ligand pKa values. Generalized multiplicative analysis of variance (GEMANOVA) was used to model the logβ values of the saturated, most protonated, monoprotonated, logβ(CuL) - logβ(HL), and pKa of the amide group. The resulting model of the saturated copper species has a two-term model describing an interaction between the central and the C-terminal residues plus a smaller, main effect of the N-terminal residue. The model supports the conclusion that two copper coordination modes exist depending on the absence or presence of His at the central position, giving species in which copper is coordinated via two or three fused chelate rings, respectively. The GEMANOVA model for pKamide, which is the same as that for the saturated complex, showed that Gly-Gly-His has the lowest pKamide values among the 27 tripeptides. Visible spectroscopy indicated the formation of metal-ligand dimers for tripeptides His-His-Gly and His-His-Glu, but not for His-His-His, and the formation of multiple ligand bis complexes CuL2 and Cu(HL)2 for tripeptides (Glu/Gly)-His-(Glu/Gly) and His-(Glu/Gly)-(Glu/Gly), respectively.

Ebrahimi, Diako, Hamid Alinejad-Rokny, and Miles P Davenport. 2014. “Insights into the Motif Preference of APOBEC3 Enzymes”. PloS One 9 (1): e87679. https://doi.org/10.1371/journal.pone.0087679.

We used a multivariate data analysis approach to identify motifs associated with HIV hypermutation by different APOBEC3 enzymes. The analysis showed that APOBEC3G targets G mainly within GG, TG, TGG, GGG, TGGG and also GGGT. The G nucleotides flanked by a C at the 3' end (in +1 and +2 positions) were indicated as disfavoured targets by APOBEC3G. The G nucleotides within GGGG were found to be targeted at a frequency much less than what is expected. We found that the infrequent G-to-A mutation within GGGG is not limited to the inaccessibility, to APOBEC3, of poly Gs in the central and 3'polypurine tracts (PPTs) which remain double stranded during the HIV reverse transcription. GGGG motifs outside the PPTs were also disfavoured. The motifs GGAG and GAGG were also found to be disfavoured targets for APOBEC3. The motif-dependent mutation of G within the HIV genome by members of the APOBEC3 family other than APOBEC3G was limited to GA→AA changes. The results did not show evidence of other types of context dependent G-to-A changes in the HIV genome.

2013

Anwar, Firoz, Miles P Davenport, and Diako Ebrahimi. 2013. “Footprint of APOBEC3 on the Genome of Human Retroelements”. Journal of Virology 87 (14): 8195-204. https://doi.org/10.1128/JVI.00298-13.

Almost half of the human genome is composed of transposable elements. The genomic structures and life cycles of some of these elements suggest they are a result of waves of retroviral infection and transposition over millions of years. The reduction of retrotransposition activity in primates compared to that in nonprimates, such as mice, has been attributed to the positive selection of several antiretroviral factors, such as apolipoprotein B mRNA editing enzymes. Among these, APOBEC3G is known to mutate G to A within the context of GG in the genome of endogenous as well as several exogenous retroelements (the first, 5' G is the one mutated, giving GG→AG). On the other hand, APOBEC3F and to a lesser extent other APOBEC3 members induce G-to-A changes within the dinucleotide GA. It is known that these enzymes can induce deleterious mutations in the genome of retroviral sequences, but the evolution and/or inactivation of retroelements as a result of mutation by these proteins is not clear. Here, we analyze the mutation signatures of these proteins on large populations of long interspersed nuclear element (LINE), short interspersed nuclear element (SINE), and endogenous retrovirus (ERV) families in the human genome to infer possible evolutionary pressure and/or hypermutation events. Sequence context dependency of mutation by APOBEC3 allows investigation of the changes in the genome of retroelements by inspecting the depletion of G and enrichment of A within the APOBEC3 target and product motifs, respectively. Analysis of approximately 22,000 LINE-1 (L1), 24,000 SINE Alu, and 3,000 ERV sequences showed a footprint of GG→AG mutation by APOBEC3G and GA→AA mutation by other members of the APOBEC3 family (e.g., APOBEC3F) on the genome of ERV-K and ERV-1 elements but not on those of ERV-L, LINE, or SINE.
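The idea of reading a footprint from target depletion and product enrichment can be sketched with simple dinucleotide counts. This is a hedged illustration only: the abstract does not give the statistic used, so the product/target ratios below (AG/GG for APOBEC3G, AA/GA for APOBEC3F-like activity) and the function names are assumptions.

```python
from collections import Counter


def dinucleotide_counts(seq: str) -> Counter:
    """Count overlapping dinucleotides, so 'GGG' contributes two 'GG'."""
    seq = seq.upper()
    return Counter(seq[i:i + 2] for i in range(len(seq) - 1))


def footprint_ratios(seq: str) -> dict:
    """Product/target ratios for the two APOBEC3 contexts.

    A higher ratio than in comparable unaffected sequences would be
    consistent with GG->AG or GA->AA editing. A ratio is None when the
    target motif is absent from the sequence.
    """
    counts = dinucleotide_counts(seq)

    def ratio(product: str, target: str):
        return counts[product] / counts[target] if counts[target] else None

    return {
        "AG/GG": ratio("AG", "GG"),  # APOBEC3G-type context
        "AA/GA": ratio("AA", "GA"),  # APOBEC3F-type context
    }
```

On their own these ratios also reflect base composition, so in practice they would be compared against a baseline such as related sequences or an inferred ancestral sequence.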

Khoury, Rima Raffoul, Gordon J Sutton, Brynn Hibbert, and Diako Ebrahimi. 2013. “Measurement and Modeling of Acid Dissociation Constants of Tri-Peptides Containing Glu, Gly, and His Using Potentiometry and Generalized Multiplicative Analysis of Variance”. Dalton Transactions (Cambridge, England: 2003) 42 (8): 2940-7. https://doi.org/10.1039/c2dt32797j.

We report pKa values with measurement uncertainties for all labile protons of the 27 tri-peptides prepared from the amino acids glutamic acid (E), glycine (G) and histidine (H). Each tri-peptide (GGG, GGE, GGH, …, HHH) was subjected to alkali titration and pKa values were calculated from triplicate potentiometric titration data using HyperQuad 2008 software. A generalized multiplicative analysis of variance (GEMANOVA) of pKa values for the most acidic proton gave the optimum model having two terms, an interaction between the end amino acids plus an isolated main effect of the central amino acid.

2012

Ebrahimi, Diako, Firoz Anwar, and Miles P Davenport. 2012. “APOBEC3G and APOBEC3F Rarely Co-Mutate the Same HIV Genome”. Retrovirology 9: 113. https://doi.org/10.1186/1742-4690-9-113.

BACKGROUND: The human immune proteins APOBEC3G and APOBEC3F (hA3G and hA3F) induce destructive G-to-A changes in the HIV genome, referred to as 'hypermutation'. These two proteins co-express in human cells, co-localize to mRNA processing bodies and might co-package into HIV virions. Therefore they are expected to also co-mutate the HIV genome. Here we investigate the mutational footprints of hA3G and hA3F in a large population of full genome HIV-1 sequences from naturally infected patients to uniquely identify sequences hypermutated by either or both of these proteins. We develop a method of identification based on the representation of hA3G and hA3F target and product motifs that does not require an alignment to a parental/consensus sequence.

RESULTS: Out of nearly 100 hypermutated HIV-1 sequences only one sequence from the HIV-1 outlier group showed clear signatures of co-mutation by both proteins. The remaining sequences were affected by either hA3G or hA3F.

CONCLUSION: Using a novel method of identification of HIV sequences hypermutated by the hA3G and hA3F enzymes, we report a very low rate of co-mutation of full-length HIV sequences, and discuss the potential mechanisms underlying this.