HIV-1 Variation before Seroconversion in Men Who Have Sex with Men: Analysis of Acute/Early HIV Infection in the Multicenter AIDS Cohort Study

  1. Geoffrey S Gottlieb1,
  2. Laura Heath2,
  3. David C Nickle2,
  4. Kim G Wong2,
  5. Stephanie E Leach2,
  6. Benjamin Jacobs2,
  7. Surafel Gezahegne2,
  8. Angélique B van ’t Wout2,
  9. Lisa P Jacobson3,
  10. Joseph B Margolick4 and
  11. James I Mullins1,2
  1. 1Division of Allergy and Infectious Diseases, Department of Medicine
  2. 2Department of Microbiology, School of Medicine, University of Washington, Seattle
  3. 3Departments of Epidemiology
  4. 4Molecular Microbiology and Immunology, Johns Hopkins Bloomberg School of Public Health, Baltimore, Maryland
  1. Reprints or correspondence: Dr. Geoffrey S. Gottlieb, Div. of Allergy and Infectious Diseases, Dept. of Medicine, University of Washington, Seattle, WA 98195 (gottlieb{at}u.washington.edu)

Abstract

Understanding the characteristics of human immunodeficiency virus (HIV) necessary for infection in a new host is a critical goal for acquired immunodeficiency syndrome (AIDS) research. We studied the characteristics of HIV-1 envelope genes in 38 men in the Multicenter AIDS Cohort Study cohort before seroconversion. We found a range of diversity (0.2%–5.6% [median, 0.86%]), V1-V2 loop length (58–93 aa), and potential N-linked glycosylation sites (n=2–9). However, at least 46% of the men had replicating virus that appeared to have been derived from a single viral variant. Nearly all variants were predicted to be CCR5 tropic. We found no correlation between these viral characteristics and the HIV outcomes of time to clinical AIDS or death and/or a CD4 cell count <200 cells/μL

The viral characteristics required for HIV-1 transmission remain to be fully understood, yet their elucidation is critical to the development of prevention strategies. There have been mixed reports on the relative homogeneity or heterogeneity of virus populations that become established during primary infection [1–7]. The lack of consistent findings is likely due in part to different biological and methodological factors, such as differences in the mode of transmission, gender, HIV-1 subtypes, the precise timing of evaluation after infection, the gene regions examined, and the criteria by which viral populations were categorized as heterogeneous or homogeneous (e.g., see [1–8]). Overall, however, there is consensus that HIV-1 populations are relatively homogeneous in primary infection compared with chronic infection, with viral population diversity growing at a generally consistent pace during untreated asymptomatic infection [2].

To further our understanding of the factors involved in early infection, we assessed HIV-1 variant populations in participants in the Multicenter AIDS Cohort Study (MACS) during primary HIV infection (when plasma HIV-1 RNA was detectable but before seroconversion had occurred)

Methods

The MACS is an ongoing prospective study of HIV-1 infection (http://www.statepi.jhsph.edu/macs/macs.html). HIV serostatus was determined using a combination of HIV EIA and Western blot assays. Subjects were classified as being seronegative if they had a negative or equivocal EIA result and/or a negative or indeterminate confirmatory Western blot result. Plasma HIV-1 RNA loads were measured by reverse-transcription polymerase chain reaction (PCR; HIV-1 Amplicor; Roche), and T cell subsets were quantified by flow cytometry. For estimates of time to clinical events, we used the standard criterion of seroconversion date, estimated as the midpoint between the last seronegative and first seropositive visit.
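
The midpoint rule for the seroconversion date can be illustrated with a short sketch. This is not code from the study; the function name and the example dates are hypothetical, and the only rule taken from the text is that the estimate is the midpoint between the last seronegative and first seropositive visit.

```python
from datetime import date


def estimate_seroconversion_date(last_seronegative: date, first_seropositive: date) -> date:
    """Return the midpoint between the two visits (hypothetical helper).

    When the interval has an odd number of days, the half day is dropped,
    because adding a timedelta to a date ignores the sub-day part.
    """
    if first_seropositive < last_seronegative:
        raise ValueError("first seropositive visit precedes last seronegative visit")
    interval = first_seropositive - last_seronegative
    return last_seronegative + interval / 2


if __name__ == "__main__":
    # Made-up visit dates, 10 days apart.
    print(estimate_seroconversion_date(date(1990, 1, 1), date(1990, 1, 11)))
```

Times to clinical events would then be counted from the returned date rather than from either visit.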

Plasma samples were analyzed as described elsewhere [2]. Briefly, plasma HIV-1 RNA was isolated, and nested PCR of the HIV-1 envelope (env) V1-V5 region (using primer sets ED3-BH2 [first round] and ED5-DR8 [second round]) was performed using end-point dilution. To avoid template resampling [2, 8], we cloned and then picked single clones from dilutions with a low copy number or a number of clones corresponding to no more than 5% of the total from dilutions with a copy number >100 copies/mL. For PCR-negative samples, we attempted PCR again using the more sensitive ED31-BH2 (first round) and DR7-ED33 (second round) primer sets

All sequences were determined using dye-terminator chemistries and were assessed for potential sample mix-up and contamination by established techniques. Sequences were deposited in GenBank and were assigned accession numbers AF138652-AF138657 and EU184091-EU184657. Each sequence was aligned with references from the HIV database (http://www.hiv.lanl.gov) using CLUSTAL W, followed by manual adjustment using MacClade (version 4). Regions in the alignment that could not be unambiguously aligned were removed. No hypermutated sequences were identified using Hypermut (version 2.0; http://www.hiv.lanl.gov/content/sequence/HYPERMUT/hypermut.html). Pairwise nucleotide distances were estimated using distance-based methods and evolutionary models HKY85 (Hasegawa-Kishino-Yano, 85) or GTR+Γ+I (general time-reversible models with a gamma distribution and invariable sites) under maximum likelihood (ML) criteria and implemented in PAUP* (version 4.0b10). Neighbor-joining and ML trees were estimated using PAUP* or PhyML

Viral diversity was measured by determining the ML pairwise genetic distances between all sequences obtained at a given time point in PAUP*. Viral divergence was measured by estimating, using ML criteria, a most recent common ancestor (ANC) sequence at the root node of each subject’s clade of sequences, using reference sequences (B.FR.83.HXB2 [K03455], B.US.83.RF [M17451], B.US.86.JRFL [U63632], B.US.90.WEAU160 [U21135]) from the HIV database as outgroups, as described elsewhere [4]
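
As a rough illustration of how diversity is summarized from pairwise distances, the sketch below averages distances over all sequence pairs from one time point. It is a simplification and not the method used in the study: it uses the uncorrected proportion of differing sites (p-distance) in place of the ML distances under HKY85 or GTR+Γ+I estimated in PAUP*, and it assumes pre-aligned sequences of equal length. All function names are hypothetical.

```python
from itertools import combinations
from typing import Sequence


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Columns with a gap ('-') in either sequence are skipped. This is an
    uncorrected distance, not an ML distance under a substitution model.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x != y:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


def mean_pairwise_diversity(sequences: Sequence[str]) -> float:
    """Mean of the distances over all unordered pairs of sequences."""
    pairs = list(combinations(sequences, 2))
    if not pairs:
        raise ValueError("need at least two sequences")
    return sum(p_distance(a, b) for a, b in pairs) / len(pairs)


if __name__ == "__main__":
    # Toy alignment, not study data.
    print(mean_pairwise_diversity(["ACGT", "ACGA", "ACGT"]))
```

Because p-distances do not correct for multiple substitutions at a site, they tend to underestimate model-based distances as divergence grows; for closely related sequences the two are usually close.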

Genotypic coreceptor analysis of the V3 loop was performed as described elsewhere (http://indra.mullins.microbiol.washington.edu/pssm/). Potential N-linked glycosylation sites (PNLGS) were predicted using N-GLYCOSITE (http://www.hiv.lanl.gov/content/sequence/GLYCOSITE/glycosite.html)
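
For orientation, potential N-linked glycosylation sites are commonly identified by the sequon N-X-S/T, where X is any amino acid except proline. The sketch below applies that pattern to an amino acid sequence. It assumes this common definition and is not a reimplementation of N-GLYCOSITE, whose exact rules may differ; the function name is hypothetical.

```python
import re
from typing import List

# N followed by any residue except P, then S or T. The lookahead lets
# overlapping sequons (e.g., "NNTS") each be counted.
_SEQUON = re.compile(r"N(?=[^P][ST])")


def find_pnlgs(protein: str) -> List[int]:
    """Return 0-based positions of the N in each N-X-[S/T] sequon (X != P).

    Alignment gaps ('-') are removed first, so positions refer to the
    ungapped sequence.
    """
    ungapped = protein.upper().replace("-", "")
    return [match.start() for match in _SEQUON.finditer(ungapped)]


if __name__ == "__main__":
    # Toy peptide, not study data.
    print(find_pnlgs("NGTNPSANNTS"))
```

Counting the returned positions per sequence gives a per-clone PNLGS number that can be compared across clones from the same subject.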

Rates of disease progression were measured by time from seroconversion to a clinical AIDS-defining event (1993 Centers for Disease Control and Prevention definition), death, or CD4 cell count <200 cells/μL. Subjects who did not reach an AIDS end point were censored at time of initiation of highly active antiretroviral therapy or time of loss to follow-up. Statistical analysis was done using JMP software (version 5.1.2; SAS Institute)

This study was conducted with institutional review board approval from the University of Washington and the parent institutions of the MACS

Results

From 1984 through November 2004, a total of 6973 men were enrolled in the MACS, including 615 seroconverters, of whom 57 were identified as having a positive plasma HIV-1 RNA load at their last seronegative visit by systematic testing of the last seronegative visit of all seroconverters who had specimens available. Forty-five of the 57 subjects had RNA-positive and antibody-negative (RNA+Ab) blood samples available for further analyses. We confirmed viral RNA positivity at the RNA+Ab visit for 38 of the 45 subjects (table 1). In the 7 subjects for whom we could not confirm the presence of viral RNA (with a sensitivity of 1–10 copies/PCR, or less than 40–80 copies/mL of plasma; see Methods), the plasma viral loads determined previously by the Amplicor HIV-1 RNA assay (versions 1.0 and 1.5; cutoff of <400 copies/mL) were between 423 and 1029 copies/mL (whether these represent false-positive results or subsequent sample degradation occurring before our analysis could not be determined). These subjects were excluded from subsequent analyses. Fourteen (36.8%) of the 38 subjects had plasma viral loads >500,000 copies/mL at the RNA+Ab visit, suggesting that samples were obtained from them near the time of peak viremia of primary infection.

Each of the 38 confirmed RNA+Ab subjects was shown by viral phylogenetic analysis to be infected with HIV-1 subtype B. Each subject’s sequence population was monophyletic; hence, there was no evidence of dual infection (figure 1A). Nor was there any evidence of clustering of intersubject sequences; thus, there was no evidence of close epidemiologic linkages between subjects (figure 1A). Viral population heterogeneity of the env V1-V5 region was examined in the 37 subjects for whom env V1-V5 sequences were available (mean, 15.2 sequences/subject; range, 11–19). A wide distribution of intrasubject diversities was observed at both the amino acid (median, 0.86%; range, 0.2%–5.6%) and nucleotide levels (figure 1A and 1B and figure 2). There was no significant correlation between viral load at the seronegative visit and envelope V1-V5 region diversity (nucleotide or amino acid; P=.70 and P=.47, respectively).

Figure 1

A, Maximum-likelihood phylogenetic tree (gap stripped; implemented in PhyML) of the envelope V1-V5 region (567 independent clones from 37 subjects) at the HIV RNA-positive and antibody-negative visit. Median pairwise nucleotide diversity was 0.4% (range, 0.19%–3.9%). B, Maximum-likelihood phylogenetic tree (general time-reversible model, gap stripped; implemented in PAUP*) of the envelope V1-V5 region (15 independent clones from each subject) from the 2 Multicenter AIDS Cohort Study subjects with the lowest (0.19%; white circles) and highest (3.9%; shaded diamonds) mean pairwise nucleotide diversity. Shaded boxes show the clades/sequences from the 2 subjects in the panel. C, Histogram showing the no. of phylogenetically informative sites per subject, as determined by removing “private” mutations and assessing the no. of sequences with distinct patterns of informative sites. D, Histogram showing the no. of unique viral variants per subject, as determined by assessing the no. of sequences with distinct patterns of phylogenetically informative sites. Putative recombinants were determined by assessing for crossover events between 2 unique viral sequence variants to form a novel “recombinant” (see figures 2–5).

Figure 2

Distributions of intrasubject mean amino acid diversity. All intrasubject pairwise amino acid distances were calculated using PAUP*. Mean amino acid diversity was calculated from the distance matrix. The line shows the cumulative percentage of subjects

The number of unique variants replicating in, and potentially transmitted to, each subject was estimated by examining phylogenetically informative sites (nucleotide changes shared by 2 or more sequences). Six subjects (16%) had clonal populations (1 variant; i.e., no informative sites) and 17 (46%) had 0 or 1 informative site (1 or 2 unique variants), clearly suggesting outgrowth from a single unique variant. In contrast, 9 (24%) subjects had sequences with 7 or more informative sites (4–13 unique variants, counting an insertion or deletion of any length as an informative site), suggesting that multiple variants were likely to have been transmitted in these cases. In addition, 12 subjects (32%) had evidence of recombination between unique viral variants (figure 1C and 1D and figure 3).
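
The counting of informative sites and unique variants can be sketched as follows. This is an illustrative approximation, not the procedure used in the study. It assumes the usual parsimony definition (a column is informative when at least 2 different states each occur in 2 or more sequences), treats a gap as an ordinary state in each column (whereas the study counted an insertion or deletion of any length as a single site), and handles "private" changes by replacing them with the column's most common state. All function names are hypothetical.

```python
from collections import Counter
from typing import List, Sequence


def informative_columns(alignment: Sequence[str]) -> List[int]:
    """Indices of columns where 2 or more states each occur in >= 2 sequences."""
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("sequences must be aligned to the same length")
    columns = []
    for i in range(length):
        counts = Counter(seq[i] for seq in alignment)
        shared_states = [state for state, n in counts.items() if n >= 2]
        if len(shared_states) >= 2:
            columns.append(i)
    return columns


def count_unique_variants(alignment: Sequence[str]) -> int:
    """Number of distinct patterns across the informative columns.

    A state seen in only one sequence (a private change) is replaced by the
    column's most common state; this is an assumption made for this sketch.
    """
    if not alignment:
        return 0
    columns = informative_columns(alignment)
    counts = {i: Counter(seq[i] for seq in alignment) for i in columns}
    patterns = set()
    for seq in alignment:
        pattern = []
        for i in columns:
            state = seq[i]
            if counts[i][state] < 2:
                state = counts[i].most_common(1)[0][0]
            pattern.append(state)
        patterns.add(tuple(pattern))
    return len(patterns)


if __name__ == "__main__":
    # Toy alignment, not study data: column 1 is informative, column 3 is private.
    toy = ["ACGT", "ACGT", "ATGT", "ATGA"]
    print(informative_columns(toy), count_unique_variants(toy))
```

With no informative columns every sequence shares the same empty pattern, so the count is 1, which matches the description of clonal populations as 1 variant with no informative sites.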

Figure 3

Unique envelope gp120 viral variants in 37 subjects. Assessed by analyzing the no. of phylogenetically informative sites (with “private” mutations removed) for each subject. Each colored box represents an individual subject. Colors represent unique variants for each subject; putative recombinants are labeled (rec). Nucleotide positions at phylogenetically informative sites are based on the individual subject alignments

One (0.17%) of 587 V3 loop sequences from the 38 subjects had a genotype consistent with SI/X4 tropism (this sequence had a positive SI PSSM [syncytium-inducing position-specific scoring matrix] score but no canonical 11/25 mutation); thus, nearly all transmitted viruses were CCR5 tropic (figure 4).
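
The "canonical 11/25 mutation" refers to the widely used rule that a positively charged residue (arginine or lysine) at position 11 or 25 of the V3 loop is associated with SI/X4 tropism. The sketch below checks only that rule and is not the PSSM scoring used in the study. It assumes the V3 sequence has already been aligned to the standard 35 positions (gaps as '-'); the function name and the example sequences are illustrative.

```python
def has_11_25_mutation(v3_aligned: str) -> bool:
    """True if position 11 or 25 (1-based) of an aligned V3 loop is R or K.

    Assumes a 35-position alignment; this is the simple 11/25 rule only,
    not a PSSM score.
    """
    v3 = v3_aligned.upper()
    if len(v3) != 35:
        raise ValueError("expected a V3 loop aligned to 35 positions")
    return v3[10] in "RK" or v3[24] in "RK"


if __name__ == "__main__":
    # Illustrative 35-residue V3 sequences, not study data.
    r5_like = "CTRPNNNTRKSIHIGPGRAFYTTGEIIGDIRQAHC"
    x4_like = "CTRPNNNTRKRIHIGPGRAFYTTGEIIGDIRQAHC"  # R at position 11
    print(has_11_25_mutation(r5_like), has_11_25_mutation(x4_like))
```

As the single SI/X4-like sequence in this data set shows, a sequence can score as SI by PSSM without meeting the 11/25 rule, so the two approaches do not always agree.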

Figure 4

Envelope V3 loop amino acid variation. In the top panel, the amino acids found at each position of the V3 loop, based on 522 sequences with open reading frames from 38 subjects, are shown in descending order of prevalence. The most common amino acid at each site is shown at the top in boldface. Gaps in the V3 loop are designated by “(−)”. In the bottom panel, sequence logos (generated at http://weblogo.berkeley.edu/logo.cgi) for the same data set are shown. The characters at each logo position and their size depict the relative proportions of the designated amino acids at each site.

We evaluated V1-V2 loop length variation and PNLGS in the subset of 487 sequences with open reading frames (ORFs) from 37 subjects (figure 5). The median length was 66 aa (range, 58–93 aa), with length variation detected in 7 subjects. Thirty-three subjects (89%) had variation in the number of PNLGS in this region. A mean of 5.6 PNLGS (range, 2–9) were found, which was strongly correlated with loop length (adjusted R²=0.59; P<.0001).

Figure 5

Histogram of V1-V2 amino acid loop length vs. potential N-linked glycosylation sites (PNLGS). Individual subjects (n=37) are numbered; black bars represent V1-V2 loop lengths, and gray bars represent PNLGS from individual sequences. Seven subjects had V1-V2 length variants, and 33 subjects had PNLGS variants

Table 1

Characteristics of the Multicenter AIDS Cohort Study HIV RNA–positive and antibody-negative cohort

Analysis of PNLGS in the 447 env V1-V5 sequences with ORFs also demonstrated a wide range of variation (data not shown). Only 4 (11%) of 37 subjects had the same numbers of PNLGS in all clones. However, there was no significant correlation between PNLGS and envelope V1-V5 nucleotide or amino acid diversity (P=.4 and P=.8, respectively)

We found no significant correlations between any of the aforementioned early viral genetic parameters (i.e., diversity, number of unique viral variants, divergence from the estimated ANC, V1-V2 loop length, PNLGS) and any HIV disease outcome measure or surrogate marker evaluated (i.e., set-point viral load and time from seroconversion to clinical AIDS, death, or CD4 cell count <200 cells/μL), self-reported risk factors for mode of transmission, or history of sexually transmitted infections in the 6 months preceding their visit (P>.05, for all comparisons). However, as expected, there was a significant correlation between set-point viral load at 1 year after seroconversion and time from seroconversion to AIDS (adjusted R²=0.3; P=.009).

Discussion

In our study, approximately half (46%) of the subjects had HIV-1 gp120 gene populations shortly after transmission and before seroconversion that were substantially homogeneous (⩽2 variants). Although such clonality may have emerged after transmission, this finding suggests that these subjects were infected with a single clonal population or unique viral variant. Because viral evolution occurs rapidly, it is not possible to determine how many of the remainder were infected with multiple variants, but the number is between 54% (the remainder of the subjects) and the conservative cutoff of 24% of individuals with multiple variants harboring at least 7 informative sites. Amino acid diversity ranged up to 5.6% over the envelope V1-V5 region, and variation in V1-V2 loop lengths (range, 58–93 aa) and PNLGS (range, 2–9) was also evident, as were putative recombinants. For this analysis, we omitted phylogenetically noninformative (“private”) mutations, because they were likely to have been introduced by viral replication in vivo early during infection or during PCR and were less likely to have been transmitted from the donor [8].

Understanding why early viral variants in certain subjects are heterogeneous or homogeneous may provide insight into host-virus interactions. Learn et al. [4] suggested that a marginally diverse infecting inoculum of HIV-1 envelope populations present very early during infection may become more homogeneous within a few months after infection, and Herbeck et al. [9] found evidence for evolution toward an ANC sequence early during infection, indicating that HIV recovers certain ancestral features when infecting a new host. In addition, Derdeyn et al. [10] found evidence that viruses with shorter envelope V1-V4 loop lengths and fewer PNLGS were transmitted to, or selectively grew in, the recipients.

The low level of diversity observed soon after infection may in part reflect the virus population diversity in the donor [12]. However, not all transmissions occur during acute/early infection in the donor, and some filtering and transmission bottlenecks clearly occur from donors with high viral diversity [10]

Our data are consistent with those of Ritola et al. [6] and Sagar et al. [13] showing that men can harbor complex viral populations early during infection, whereas Long et al. [3], who examined heterosexual transmission in Kenya in individuals infected with non-B HIV-1 subtypes, found lower levels of diversity in men compared with women. Differences in mode of transmission and potential differences in inoculum size at penile-vaginal versus anorectal mucosal surfaces may influence early viral replication dynamics and diversification

In contrast to several studies that have suggested a direct correlation between viral diversity and rate of disease progression [5, 11, 14, 15], we did not observe any association between progression rate and any of the measures of early viral population heterogeneity. In contrast to the 4 aforementioned studies, which assessed surrogate markers for disease progression (i.e., viral load and rate of CD4 cell count decline), our study could assess for a correlation between early viral population diversity and actual time to clinical AIDS as well as the surrogate markers of set-point viral load and time to a CD4 cell count <200 cells/μL used in earlier studies. These differences and the smaller cohort size in 3 of these 4 studies (n=12, n=15, and n=23) [5, 11, 14, 15] may help explain the discrepancy with our findings. The one large study (n=156) [5] performed to date to address this question was conducted in Kenya, where multiple subtypes circulate (typically A, D, and C); primarily used the heteroduplex mobility assay as a qualitative measure of heterogeneity; and did not control for potentially faster progression linked to subtype D infection

In conclusion, we have shown, in a cohort of men who have sex with men who were infected with HIV-1 subtype B, that variable levels of envelope gene and protein diversity are present during acute infection and before the establishment of substantial immune responses. Strategies to prevent HIV transmission or attenuate infection will likely have to take this potential viral diversity into account. (Supplemental data relevant to this study can be found at http://mullinslab.microbiol.washington.edu/HIV/Gottlieb2007/.)

Acknowledgments

We thank the staff and clinicians of the Multicenter AIDS Cohort Study (MACS) and the study participants, without whom this study would not have been possible. Data used in this manuscript were collected by the MACS, which has centers (principal investigators) at The Johns Hopkins University Bloomberg School of Public Health (Joseph B. Margolick and Lisa Jacobson); the Howard Brown Health Center and Northwestern University Medical School (John Phair); the University of California, Los Angeles (Roger Detels and Beth Jamieson); and the University of Pittsburgh (Charles Rinaldo)

Footnotes

  • Potential conflicts of interest: none reported

    Presented in part: 14th Conference on Retroviruses and Opportunistic Infections, Los Angeles, 25–28 February 2007 (oral abstract 121)

    Financial support: This work was supported by the US Public Health Service (grants R01-AI058894 and R37-AI047734 to G.S.G., J.B.M., and J.I.M. and grant P01-AI57005 to J.I.M.) and the University of Washington Center for AIDS Research and STDs (grant P30-AI27757). The Multicenter AIDS Cohort Study is funded by the National Institute of Allergy and Infectious Diseases, with additional supplemental funding from the National Cancer Institute (grants UO1-AI-35042, 5-MO1-RR-00722 (GCRC), UO1-AI-35043, UO1-AI-37984, UO1-AI-35039, UO1-AI-35040, UO1-AI-37613, and UO1-AI-35041)

  • Present affiliation: Department of Clinical Viro-Immunology, Sanquin Research, Amsterdam, The Netherlands

  • Received June 6, 2007.
  • Accepted October 29, 2007.

References