Exhaustive Genotyping of the CEM15 (APOBEC3G) Gene and Absence of Association with AIDS Progression in a French Cohort

  1. Hervé Do1,3,
  2. Alexandre Vasilescu3,
  3. Gora Diop3,
  4. Thomas Hirtzig3,
  5. Simon C. Heath3,
  6. Cédric Coulonges1,
  7. Jay Rappaport4,
  8. Amu Therwath1,
  9. Mark Lathrop3,
  10. Fumihiko Matsuda3 and
  11. Jean-François Zagury1,2,4
  1. 1 Équipe Génomique, Bioinformatique et Pathologies du Système Immunitaire, INSERM EMI0355, Paris
  2. 2 Chaire de Bioinformatique, Conservatoire National des Arts et Métiers, Paris
  3. 3 Centre National de Génotypage, Evry, France
  4. 4 Center for Neurovirology, Temple University, Philadelphia, Pennsylvania
  1. Reprints or correspondence: Dr. Jean-François Zagury, Equipe Génomique, Bioinformatique et Pathologies du Système Immunitaire, INSERM EMI0355, 15 rue de l'Ecole de Médecine, 75006 Paris, France (jfz{at}ccr.jussieu.fr) or Fumihiko Matsuda, Centre National de Génotypage, 2 rue Gaston Crémieux, 91057 Evry Cedex, France (fumi{at}cng.fr).

Abstract

CEM15 (or APOBEC3G) has recently been identified as an inhibitor of human immunodeficiency virus type 1 (HIV-1) replication in vitro. To evaluate the impact of its genetic variations on the progression of acquired immunodeficiency syndrome (AIDS), we have performed an extensive genetic analysis of CEM15. We have sequenced CEM15 in a cohort of 327 HIV-1-seropositive patients with extreme disease progression phenotypes—either slow progression or rapid progression—and in 446 healthy control subjects, all of white descent. We have identified 29 polymorphisms with allele frequencies >1%, 14 of which were newly characterized. There were no significant associations between the polymorphisms or haplotypes of CEM15 and a disease progression phenotype in our cohort.

HIV-1, in addition to its structural and functional proteins (Gag, Pol, and Env), encodes several regulatory or auxiliary proteins—namely Tat, Rev, Nef, Vif, Vpr, and Vpu. The Vif-deficient virions can replicate in particular T cell lines (“permissive” cells) but not in others (“nonpermissive” cells) [1]. This observation was explained by the fact that Vif overcomes an endogenous inhibitor of HIV-1 produced in nonpermissive cells [2]. Sheehy et al. identified this factor, CEM15 (apolipoprotein B mRNA editing enzyme, catalytic polypeptide-like 3G, or APOBEC3G; MIM 607113 in the Online Mendelian Inheritance in Man database [http://www.ncbi.nlm.nih.gov/Omim/]), and demonstrated that it has an inhibitory effect on the replication of Vif-deficient HIV-1 virions [3]. Like other members of the APOBEC family [4], CEM15 has a DNA mutator activity: it catalyzes the deamination of cytosine bases, transforming them into uracil bases, thus interfering with the viral replication cycle [5, 6]. The HIV-1 Vif protein counteracts the antiretroviral activity of CEM15 by inducing its ubiquitination and degradation [7, 8].

In the present study, we have performed a systematic genetic analysis of the CEM15 gene by extensive nucleotide sequencing, to clarify whether genetic variations of this gene could influence disease onset and progression. We used the Genetics of Resistance to Immunodeficiency Virus (GRIV) cohort, which consists of 2 subpopulations of white French HIV-1-seropositive individuals with extreme phenotypes: 245 patients with slow progression, who are equivalent to a 1% extreme subset of patients in a cohort of 24,500 patients who experienced seroconversion [9], and 82 patients with rapid progression. We also included 446 healthy control subjects of similar ethnic origin. This cohort is, to our knowledge, the largest of its kind in the world and has made it possible to confirm and extend the major genetic associations described to date [10–12]. Indeed, transversal studies performed on patient cohorts with extreme phenotypes are complementary to longitudinal studies performed on seroconverter cohorts [9]. Single-nucleotide polymorphisms (SNPs) and other genetic variations identified through sequencing were evaluated for their association with disease susceptibility and progression.

Materials and methods

The GRIV cohort was established in 1995 in France to generate a large collection of DNAs for the identification and study of polymorphisms associated with rapid and slow progression to AIDS. Only white individuals of European descent living in France were recruited. These criteria limit the influence of the virogenetic and environmental factors (subjects are all infected by B strains and live in a similar environment) and put emphasis on the genetic background of each individual to determine the various patterns of progression. The GRIV cohort consists of 2 subpopulations of HIV-1-seropositive individuals with extreme phenotypes of disease progression: 245 patients with slow progression, defined as individuals who have been seropositive and asymptomatic for ⩾8 years with a CD4+ cell count >500 cells/mm3 in the absence of antiretroviral therapy, and 82 patients with rapid progression, defined as patients with a decrease in their CD4+ cell count to <300 cells/mm3 in <3 years after the last seronegative test. The DNA was obtained from fresh peripheral blood mononuclear cells or from Epstein-Barr virus-transformed cell lines.

Polymerase chain reaction conditions and primers used to genotype the CEM15 gene are presented in table 1. Alignment, SNP discovery, and genotyping were performed with the software Genalys (version 2.0b; M. Takahashi, Centre National de Génotypage) [13]. For all polymorphisms, we performed an initial screening of 150 patients with slow progression, 50 patients with rapid progression, and 155 control subjects. If the P value evaluating the association was of borderline significance (.05–.10), the genotyping was then extended to the rest of the cohort.

Table 1

Primers used to amplify the exons of CEM15 by polymerase chain reaction (PCR).

Statistical analyses were performed only on the polymorphisms with minor-allele frequencies >1% in the whole population. Differences in the allele frequencies of individual polymorphisms between the 3 groups were examined using a Fisher's exact test. Linkage disequilibrium (LD) was computed for each pair of polymorphisms. Haplotype estimates were obtained using the expectation maximization (EM) algorithm [14] and the Phase2 algorithm [15], either for all polymorphisms or for polymorphisms gathered in linkage groups (to determine haploblocks). For each haplotype (with a frequency >1% in the whole population), the expected numbers of individuals in each group with and without that haplotype were computed from the estimated marginal haplotype probability distribution for each individual. These numbers were rounded to the nearest integer, and a nominal P value was computed using Fisher's exact test.
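As an illustration of the single-polymorphism comparison described above, the following minimal Python sketch computes a two-sided Fisher's exact test on a 2 × 2 table of allele counts for two groups. The function name and the allele counts are hypothetical and are not data from the study. The two-sided rule used here (summing the probabilities of all tables that are no more probable than the observed one) is one common convention; the text does not state which convention was applied.

```python
from math import comb


def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher's exact test P value for the table [[a, b], [c, d]].

    Rows are groups (e.g., slow vs. rapid progression); columns are the
    counts of the two alleles of one polymorphism.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed
        return comb(row1, x) * comb(row2, col1 - x) / total

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    probs = [prob(x) for x in range(lowest, highest + 1)]
    p_obs = prob(a)
    # Sum every table that is at most as probable as the observed table
    p_value = sum(p for p in probs if p <= p_obs * (1 + 1e-9))
    return min(1.0, p_value)


# Hypothetical allele counts (major, minor) in two groups
slow = (270, 30)
rapid = (88, 12)
p = fisher_exact_2x2(slow[0], slow[1], rapid[0], rapid[1])
print(f"P = {p:.3f}")
```

The sketch relies only on the standard library; a statistics package would normally be used in practice.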

Results

We systematically screened the CEM15 gene for polymorphisms by resequencing the exons with their flanking regions as well as the 1-kb region upstream of the first exon, under the assumption that putative regulatory elements would be located within that region. Of the 42 polymorphisms identified in the CEM15 gene, 29 had allele frequencies of ⩾1% in the whole population (figure 1). Among these 29 polymorphisms, 14 SNPs were newly identified in the present study. As shown in figure 1, 5 SNPs are located in the 1-kb region upstream of the first exon, and 4 SNPs are located in exons. The SNP CEM15_27431271 leads to a synonymous change, whereas the SNPs CEM15_27431277 and CEM15_27431298 lead to nonsynonymous changes (histidine to arginine at position 186 and glutamine to glutamic acid at position 275, respectively, in protein sequence NP_068594.1). The SNP CEM15_27431342 is located in the 5′ untranslated region. The polymorphism distributions in cases and controls were in Hardy-Weinberg equilibrium. Figure 2 shows pairwise LD measured between each pair of polymorphisms.
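The Hardy-Weinberg check mentioned above can be sketched as follows. The text does not say which test was used, so this example assumes the standard 1-degree-of-freedom chi-square goodness-of-fit test for a biallelic polymorphism. The function name and genotype counts are hypothetical.

```python
from math import erfc, sqrt


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square test of Hardy-Weinberg proportions for one biallelic SNP.

    Arguments are the observed counts of the three genotypes.
    Returns (chi-square statistic, P value with 1 degree of freedom).
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of the first allele
    q = 1 - p
    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Upper tail of a chi-square distribution with 1 degree of freedom
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical genotype counts for one polymorphism in one group
chi2, p_value = hwe_chi_square(120, 25, 2)
print(f"chi2 = {chi2:.3f}, P = {p_value:.3f}")
```

For polymorphisms with a low minor-allele frequency, expected counts of the rare homozygote are small, and an exact test may be preferable to the chi-square approximation.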

Figure 1

Genetic organization of the CEM15 gene, located on chromosome 22 (22q13.1–q13.2). Coding and untranslated regions are indicated by black and white rectangles, respectively. The regions that have been sequenced are indicated by a horizontal line, with start and end positions numbered according to the first nucleotide of the initiation codon being considered as +1 (indicated by a black triangle). The polymorphism numbers are the attribution numbers from the Centre National de Génotypage database (the correspondence with the National Center for Biotechnology Information dbSNP database (http://www.ncbi.nlm.nih.gov/projects/SNP/) is given in table 2). The newly characterized polymorphisms are indicated by an asterisk. The genomic sequence used for alignment is GenBank sequence NT_011520.

Figure 2

Linkage disequilibrium (LD) plot of the CEM15 gene. The color figure and legend are available in their entirety in the online edition of the Journal of Infectious Diseases.
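For readers unfamiliar with pairwise LD, the sketch below computes the two most common measures, D′ and r², for a pair of biallelic polymorphisms from one haplotype frequency and the two allele frequencies. The text does not state which measure underlies the plot, and the function name and frequencies are hypothetical.

```python
def pairwise_ld(p_ab, p_a, p_b):
    """Return (D', r^2) for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A at locus 1 and
          allele B at locus 2
    p_a, p_b: frequencies of allele A and allele B
    """
    d = p_ab - p_a * p_b  # raw disequilibrium coefficient D
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    return d_prime, r2


# Hypothetical frequencies: the two minor alleles usually travel together
d_prime, r2 = pairwise_ld(p_ab=0.08, p_a=0.10, p_b=0.12)
print(f"D' = {d_prime:.2f}, r2 = {r2:.2f}")
```

D′ reaches 1 whenever at least one of the four possible haplotypes is absent, whereas r² reaches 1 only when the two polymorphisms carry the same information, so the two measures can differ substantially for the same pair.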

We performed a statistical analysis to test whether these polymorphisms or deduced haplotypes are associated with disease progression. We failed to obtain any significant association between polymorphisms and AIDS progression by use of Fisher's exact test (the lowest P value was .19, for CEM15_27431290, slow progression vs. rapid progression; table 2). Similarly, using estimated haplotypes, we did not find any significant association between particular haplotypes of CEM15 and AIDS progression (table 3). The 2 algorithms, Phase2 and EM, yielded similar haplotype estimates. The detailed list of haplotypes is given in figure 3. The lowest P value was .15, for haplotype 12, in the comparison between rapid progression and slow progression (table 3). No association was identified either in the dominant/recessive models or in the simple comparison of the diplotype distribution (data not shown).
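To make the idea of EM-based haplotype estimation concrete, the following sketch handles the simplest case: 2 biallelic polymorphisms, for which only double heterozygotes have ambiguous phase. It is a deliberately reduced illustration, not the multilocus procedure of the cited references, and the function name and genotype counts are hypothetical.

```python
def em_two_locus(geno, max_iter=500, tol=1e-12):
    """Estimate 2-locus haplotype frequencies by expectation maximization.

    geno[i][j] is the number of individuals carrying i copies of allele A
    at locus 1 and j copies of allele B at locus 2 (i, j in 0..2).
    Returns a dict of frequencies for haplotypes AB, Ab, aB, and ab.
    """
    n_people = sum(sum(row) for row in geno)
    base = {"AB": 0.0, "Ab": 0.0, "aB": 0.0, "ab": 0.0}
    # Count haplotypes whose phase is unambiguous
    for i in range(3):
        for j in range(3):
            count = geno[i][j]
            if i == 1 and j == 1:
                continue  # double heterozygotes are resolved by EM below
            if i == 1:
                second = "B" if j == 2 else "b"
                base["A" + second] += count
                base["a" + second] += count
            else:
                first = "A" if i == 2 else "a"
                base[first + "B"] += count * j
                base[first + "b"] += count * (2 - j)
    n_dh = geno[1][1]
    freq = {h: 0.25 for h in base}
    for _ in range(max_iter):
        # E-step: probability that a double heterozygote is AB/ab (cis)
        cis = freq["AB"] * freq["ab"]
        trans = freq["Ab"] * freq["aB"]
        w = cis / (cis + trans) if cis + trans > 0 else 0.5
        # M-step: update frequencies from expected haplotype counts
        new = {
            "AB": (base["AB"] + n_dh * w) / (2 * n_people),
            "ab": (base["ab"] + n_dh * w) / (2 * n_people),
            "Ab": (base["Ab"] + n_dh * (1 - w)) / (2 * n_people),
            "aB": (base["aB"] + n_dh * (1 - w)) / (2 * n_people),
        }
        change = max(abs(new[h] - freq[h]) for h in freq)
        freq = new
        if change < tol:
            break
    return freq


# Hypothetical 3 x 3 table of joint genotype counts
counts = [[60, 8, 1], [9, 15, 2], [1, 2, 2]]
print(em_two_locus(counts))
```

With more than 2 polymorphisms, the number of phase-compatible haplotype pairs per individual grows quickly, which is why dedicated multilocus implementations are used in practice.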

Figure 3

List of all haplotypes with frequencies >1% in the whole population (similar results were obtained with the Phase2 and expectation maximization algorithms). The color figure and legend are available in their entirety in the online edition of the Journal of Infectious Diseases.

Table 2

List of all polymorphisms with frequencies >1% in the whole population.

Table 3

List of all haplotypes with frequencies >1% in the whole population.

To refine our haplotype analysis, we examined the distributions of haploblocks, which were computed by gathering the polymorphisms in linkage groups for various levels of LD (85%, 90%, and 95%). No association could be identified by use of this approach. Finally, we looked more specifically for associations based on the haplotypes/haploblocks derived solely from the SNPs located in the 1-kb region upstream of exon 1 (assumed to be the promoter region) or solely from the 2 nonsynonymous SNPs (amino acid substitutions). Using these 2 approaches, we failed to obtain any significant association between AIDS progression and polymorphisms (data not shown).
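The grouping of polymorphisms into linkage groups at a given LD level can be pictured with the sketch below. The text does not describe the grouping rule, so this example assumes a simple single-linkage rule: 2 polymorphisms join the same group whenever their pairwise LD reaches the threshold, directly or through a chain of intermediates. The function name, SNP labels, and LD values are all hypothetical.

```python
def linkage_groups(snps, ld, threshold):
    """Group SNPs whose pairwise LD is at least `threshold` (single linkage).

    snps: list of SNP identifiers
    ld:   dict mapping (snp_x, snp_y) pairs to an LD value between 0 and 1
    Returns a sorted list of groups, each a list of SNP identifiers.
    """
    parent = {s: s for s in snps}

    def find(s):
        # Follow parent links to the group representative, with path halving
        while parent[s] != s:
            parent[s] = parent[parent[s]]
            s = parent[s]
        return s

    for (x, y), value in ld.items():
        if value >= threshold:
            parent[find(x)] = find(y)  # merge the two groups

    groups = {}
    for s in snps:
        groups.setdefault(find(s), []).append(s)
    return sorted(groups.values())


# Hypothetical pairwise LD values for 4 SNPs
snps = ["s1", "s2", "s3", "s4"]
ld = {
    ("s1", "s2"): 0.97, ("s1", "s3"): 0.86, ("s1", "s4"): 0.10,
    ("s2", "s3"): 0.88, ("s2", "s4"): 0.20, ("s3", "s4"): 0.40,
}
for level in (0.85, 0.90, 0.95):
    print(level, linkage_groups(snps, ld, level))
```

Under this assumed rule, lowering the threshold can only merge groups, so the groups at 85% are unions of the groups at 95%.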

Discussion

In this study, we have undertaken a systematic investigation of the association between genetic variations in the CEM15 gene and susceptibility to disease progression in the GRIV cohort. We have identified 29 polymorphisms with minor-allele frequencies >1% in our whole population, 14 of which were newly characterized. Two SNPs of the 29 polymorphisms, which are located in exons, lead to amino acid substitutions. However, these substitutions (His186Arg for CEM15_27431277 and Gln275Glu for CEM15_27431298) involve amino acids with similar properties that are unlikely to modify the protein function. We failed to obtain any significant association between polymorphisms, haplotypes, or haploblocks in the CEM15 gene and extreme AIDS progression phenotypes in the GRIV cohort. Therefore, it seems that there is no major genetic determinant in the CEM15 gene influencing the various profiles of progression to AIDS. As for any AIDS genomic study, our results will need to be confirmed in other cohorts, especially in seroconverter cohorts, which have a structure different from that of the GRIV cohort [9]. Since CEM15 is known to play a relevant role in HIV-1 replication [3], it will also be of interest to investigate the other cellular proteins, such as Cullin-5 or Rbx-1, that participate with Vif in its ubiquitination and proteolysis [8]. The gene polymorphisms of these cellular proteins could potentially exhibit associations in combination with the CEM15 polymorphisms, if not on their own. Finally, CEM15, because of its activity on single-stranded DNA, has been suspected to play a potential role in innate antiretroviral defense [6]. Therefore, the identification of 14 novel polymorphisms with relatively high frequencies (>1%) and haplotype information on this gene should prove useful in assessing the potential association of CEM15 polymorphisms with other retroviral diseases.

Acknowledgments

The authors are grateful to all of the patients and medical staff who have generously collaborated with the Genetics of Resistance to Immunodeficiency Virus project. The authors also thank the Epidemiological study on the Genetics and Environment of Asthma (EGEA) cooperative group, for having given access to data on the EGEA study, which was partly supported by an INSERM/Merck Sharp & Dohme-Chibret convention.

Footnotes

  • Financial support: Agence Nationale de Recherche sur le SIDA, AIDS-Cancer Vaccine Development Foundation, and Neovacs SA (grants to the Genetics of Resistance to Immunodeficiency Virus project); Ministère de la Recherche et des Nouvelles Technologies (support to the Centre National de Génotypage).

  • Received May 12, 2004.
  • Accepted August 4, 2004.

References