Presented in part: European Multicolloquium of Parasitology, Paris, France, 24–28 August 2008.
Background. Theoretical and experimental data support the geographic differentiation strategy as a valuable tool for detecting loci under selection. In the context of Plasmodium falciparum malaria, few populations have been studied, with limited genomic coverage.
Methods. We examined geographic differentiation in P. falciparum populations on the basis of 12 single-nucleotide polymorphisms (SNPs) in 4 genes encoding drug resistance determinants, 5 SNPs in 2 genes encoding antigens, and a set of 17 putatively neutral SNPs dispersed on 13 chromosomes. We sampled 326 parasite isolates representing 7 P. falciparum populations from regions with varied levels of malaria transmission (Gabon, Kenya, Madagascar, Mali, Mayotte, Haiti, and the Philippines).
Results. Frequencies of drug resistance alleles varied considerably among populations (mean FST, 0.52). In contrast, allele frequencies varied significantly less for antigenic and neutral SNPs (mean FST, 0.16 and 0.24, respectively). This contrasting pattern was more pronounced when only the African populations were considered. Signature of selection was detected for most of the resistant SNPs but not for the antigenic SNPs.
Conclusion. These data further validate the utility of geographic differentiation for identifying loci under strong positive selection, such as drug resistance loci. This study also provides frequencies of molecular markers of resistance in some overlooked populations.
During their journey within the human host, Plasmodium falciparum parasites experience at least 2 types of natural selection: they are exposed to antimalarial drugs, and they face the protective immunity developed by the host. Major parasitic loci conferring drug resistance and immune evasion mechanisms have been identified [1, 2]. However, progressive antimalarial treatment policy changes and the lack of an effective vaccine call for the search for additional loci in the near future.
Analyzing the distribution of within-species genetic variation can help identify loci that do not follow neutral evolution and hence are supposed to be targeted by natural selection [3]. Tests based on within-species polymorphisms interrogate the site-frequency spectrum in a genomic region of interest, linkage disequilibrium between 2 or more loci, or population differentiation. Recently, more-sophisticated tests combining population differentiation and haplotype structure have been developed [4]. In the case of population differentiation, theory predicts that when selection varies in strength between compared populations, allele frequencies at the selected locus are unusually different [5]. Loci conferring drug resistance typically are under this regime of selection, referred to as divergent directional selection. Some loci encoding antigens targeted by acquired immunity typically are under balancing selection, a regime that tends to maintain unusually similar allele frequencies among populations through some type of rare allele advantage.
In the context of malaria, most studies using among-population differentiation have focused on parasite antigens [6, 7]. Concerning drug resistance, most studies have focused on samples representing a worldwide panel of isolates rather than local P. falciparum populations [8, 9] or have focused on local populations but have used neutral markers with different mutation patterns [10, 11]. The largest population-based study conducted to date analyzed the pattern of differentiation among 11 populations from sites across Southeast Asia [12]. Geographic differentiation was found to be higher in a set of single-nucleotide polymorphisms (SNPs) associated with antimalarial drug resistance than among neutral SNPs located on chromosomes 2 and 3. It is expected that drug-pressure selection in these settings of low malaria transmission is very intense, because a high proportion of infections are symptomatic.
Altogether, these data support the geographic differentiation strategy as a valuable tool for detecting loci under selection in the P. falciparum genome. However, few populations have been studied, with limited genomic coverage.
To substantiate this strategy further in the context of malaria, we examined P. falciparum geographic differentiation for 17 SNPs in 6 genes known to be involved in drug resistance or immune response among 326 parasite isolates representing 7 populations from regions with varied levels of malaria transmission (5 populations from Africa, 1 from the Caribbean, and 1 from Southeast Asia). To control for nonselective processes, we evaluated differentiation with a novel set of 17 putatively neutral SNPs (nSNPs) distributed across 13 of 14 chromosomes.
Data on samples are summarized in Table 1. Blood samples from 326 people infected with P. falciparum were collected in 7 countries. Five parasite populations were from Africa: 3 from the mainland (Gabon, Kenya, and Mali) and 2 from islands off the southeast coast (Madagascar and Mayotte, the latter in the Comoros Archipelago). One parasite population was from Southeast Asia (the Philippines), and 1 was from the Caribbean (Haiti). Isolates from Mali and Madagascar were from asymptomatic children, and all other isolates were obtained from patients who had nonsevere malaria before treatment was initiated. Isolates were collected before the implementation of artemisinin-based combination therapy; the treatments in use at the different locations at the time of collection were chloroquine, sulfadoxine-pyrimethamine, or both combined.
Comparison of the estimates of allele frequencies obtained with the predominant-allele and maximum-likelihood methods. In the predominant-allele method, only the predominant allele obtained from each infection is scored. These are compared with the estimations obtained using a maximum-likelihood method implemented in the MalHaploFreq program [14]. The linear regression is shown as a line, and its associated R2 and P values are indicated.
Interpopulation variance (FST) in allele frequencies at polymorphic single-nucleotide polymorphisms (SNPs). A, Seven-population sample. B, Four-population sample (Mali, Gabon, Kenya, and Madagascar). Black bars indicate resistant SNPs (rSNPs), white bars indicate putatively neutral SNPs (nSNPs), and hatched bars indicate antigenic SNPs (aSNPs). The horizontal line represents the mean FST value for the nSNP set. No or low differentiation is indicated by an FST value of <0.15, and high differentiation is indicated by an FST value of ⩾0.25. FST values significantly different from zero were determined by a permutation test (100,000 permutations) and are indicated by an asterisk. FST values for the different sets of SNPs were compared using the nonparametric Mann-Whitney U test. FST values for the nSNP set were significantly lower than those for the rSNP set (P=.002 and P < .001 for the 7- and 4-population samples, respectively) but were not different from those for the aSNP set (P=.38 and P=.19 for the 7- and 4-population samples, respectively).
Detection of single-nucleotide polymorphisms (SNPs) evolving under selection with the BayesFst method. Shown are results obtained for the 7-population sample (A) and the 4-population sample (B) (Mali, Kenya, Gabon, and Madagascar). The vertical line represents the critical P value (.05) for the locus-specific effects αi (such as selection). SNPs located left of the vertical line are identified as evolving under neutral evolution. SNPs located right of the vertical line in the upper part of the graph and right of the vertical line in the lower part of the graph are identified as evolving under divergent directional and balancing selection, respectively. White circles indicate putatively neutral SNPs, black circles indicate resistant SNPs, and white triangles indicate antigenic SNPs.
All studies were conducted after ethical clearance from the relevant respective ethics committees, and informed consent was obtained from each patient.
The 34 SNPs included in the study (and their chromosomal locations) are shown in Table 2.
Drug- and immune-selected SNPs. Twelve nonsynonymous SNPs in 4 genes (pfcrt, pfmdr1, pfdhfr, and pfdhps) associated with resistance to several antimalarial drugs (resistant SNPs [rSNPs]) and 5 nonsynonymous SNPs in 2 genes (ama1 and msp1) encoding antigenic proteins (antigenic SNPs [aSNPs]) were included. These SNPs were chosen because their involvement in drug resistance or in immune evasion of P. falciparum has been reported in the literature [1, 2, 13].
nSNPs. nSNPs were chosen in a 2-step process. First, the PlasmoDB database (release 5.2) was scanned to identify SNPs that were putatively neutral among 7 African strains of P. falciparum: 3D7, 106/1, FCR3, Ghana1, Senegal34.04, D6, and RO33. Four criteria were used to select the nSNPs: (1) they must be located in noncoding (intergenic or intronic) regions; (2) they must be located a good distance (∼50 kb) from genes containing multiple SNPs (because these regions are potentially targeted by selection), genes encoding antigens, and genes involved in drug resistance; (3) they must be polymorphic among the 7 African strains in PlasmoDB; and (4) they must lend themselves to the design of a restriction fragment-length polymorphism (RFLP) assay. Second, the 26 chosen nSNPs were genotyped by polymerase chain reaction (PCR)-RFLP analysis, and their polymorphism was evaluated in 29 isolates from Mali and 27 isolates from Madagascar. One of the nSNPs was excluded because of an indel in the amplified fragment. Eighteen nSNPs were polymorphic and 7 were monomorphic in these 2 subpopulations. Finally, 17 nSNPs were included in the study: 2 of the monomorphic and 15 of the polymorphic nSNPs.
DNA was extracted from blood spotted onto filter papers (filter paper samples) either by the chelex method (isolates from Mali, Gabon, Madagascar, and Mayotte) or by means of a commercial kit (isolates from the Philippines, Kenya, and Haiti).
The 34 SNPs included in the study were genotyped by nested PCR followed by RFLP assay (full methods are available at http://jclain.public.univ-paris5.fr). Digestion products were resolved on 2% or 2.5% agarose gels. A characteristic band-size pattern was obtained for each polymorphism. DNA from 6 laboratory strains (3D7, HB3, 7G8, W2, Dd2, and FCR3) was used as a control for the PCR and digestion. Alleles corresponding to the 3D7 strain were considered to be wild type.
Because 1 parasite isolate can contain multiple P. falciparum clones, estimating allele frequencies in P. falciparum populations is not straightforward. Therefore, 2 methods were used to estimate allele frequencies. First, frequencies were estimated using the predominant-allele method as defined by Anderson et al [12], in which only the predominant allele obtained from each infection was scored. In cases where 2 alleles were detected in an isolate, the predominant allele was identified by comparing the RFLP banding pattern with the one observed for a 50:50 mixture of the 2 alleles prepared with control strains. Second, allele frequencies were estimated using the maximum-likelihood method implemented in the MalHaploFreq program (version 2.1.1) [14]. For each locus, each isolate was classified either as allele 1 or allele 2 if a single allele was detected or as a heterozygote if the 2 alleles were detected.
A SNP locus was considered to be polymorphic if the frequency of the minority allele was ⩾.05 in at least 1 population.
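The predominant-allele scoring and the polymorphism criterion can be illustrated with a minimal Python sketch. The function names, the "wt"/"mut" labels, and the use of None for failed typing are illustrative assumptions, not part of the study's software; the maximum-likelihood estimation performed with MalHaploFreq is not reproduced here.

```python
from typing import Dict, Optional, Sequence


def predominant_allele_frequency(calls: Sequence[Optional[str]]) -> float:
    """Mutant-allele frequency from one predominant call per isolate.

    Each entry is the predominant allele scored for one isolate:
    "wt", "mut", or None when the isolate could not be typed
    (hypothetical encoding chosen for this sketch).
    """
    scored = [c for c in calls if c is not None]
    if not scored:
        raise ValueError("no scored isolates for this locus")
    return sum(c == "mut" for c in scored) / len(scored)


def is_polymorphic(freq_by_population: Dict[str, float],
                   threshold: float = 0.05) -> bool:
    """True if the minority allele reaches the threshold in at least
    one population (the criterion stated in the text)."""
    return any(min(f, 1.0 - f) >= threshold
               for f in freq_by_population.values())
```

For example, a locus whose mutant allele is absent everywhere except for a frequency of 0.04 in one population would be classified as monomorphic under this rule and excluded from the differentiation analysis.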
Locus-by-locus geographic differentiation. Differentiation analysis was performed only with SNPs that were polymorphic. Therefore, the following SNPs were excluded: 13_3091, pfmdr1 N1042D, and pfdhfr I164L. FST values were estimated among all populations for each polymorphic SNP with the Arlequin program (version 3.1) [15], which allows missing data (fixed at 5%). FST values for the set of rSNPs or aSNPs were compared with those obtained for the set of nSNPs by a nonparametric test (Mann-Whitney U test). For each set of SNPs, the 95% confidence interval (CI) for the mean FST was determined by computing FST bootstrap percentile values (>20,000 bootstraps).
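As a rough illustration of the quantities involved, the sketch below computes a per-locus FST from population allele frequencies and a bootstrap percentile CI for the mean FST of a set of SNPs. It is a simplified stand-in, not the Arlequin procedure: it assumes Wright's heterozygosity-based estimator, FST = (HT − HS)/HT, with populations weighted equally and no correction for sample size, and it resamples loci with replacement. The function names and default seed are hypothetical.

```python
import random
from typing import List, Sequence, Tuple


def fst_biallelic(freqs: Sequence[float]) -> float:
    """Wright's FST for one biallelic locus from per-population
    frequencies of one allele (unweighted; simplifying assumption)."""
    # Mean within-population expected heterozygosity.
    h_s = sum(2 * p * (1 - p) for p in freqs) / len(freqs)
    # Expected heterozygosity of the pooled population.
    p_bar = sum(freqs) / len(freqs)
    h_t = 2 * p_bar * (1 - p_bar)
    if h_t == 0:
        return 0.0  # locus is monomorphic overall
    return (h_t - h_s) / h_t


def bootstrap_mean_ci(values: Sequence[float], n_boot: int = 20000,
                      level: float = 0.95, seed: int = 0
                      ) -> Tuple[float, float]:
    """Percentile CI for the mean, resampling values with replacement."""
    rng = random.Random(seed)
    n = len(values)
    means: List[float] = sorted(
        sum(rng.choices(values, k=n)) / n for _ in range(n_boot)
    )
    tail = (1 - level) / 2
    lower = means[int(tail * n_boot)]
    upper = means[min(n_boot - 1, int((1 - tail) * n_boot))]
    return lower, upper
```

Under this estimator, two populations fixed for different alleles give FST = 1, and populations with identical frequencies give FST = 0, which matches the intuition behind the "maximum variance" cases reported for some resistance SNPs.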
Neutrality test based on genetic differentiation: BayesFst test. A coalescence-based test was performed to identify SNPs with a pattern of allele frequencies among populations outside neutral expectations (outlier SNPs). This approach is based on a hierarchical Bayesian model implemented via Markov chain Monte Carlo simulations [16] and was performed using the BayesFst program (available at http://www.reading.ac.uk/Statistics/genetics/software.html).
Briefly, for locus i in population j, Fij is the probability that 2 randomly chosen alleles in a population have a common ancestor within that population, without there having been any intervening migration or mutation. Each Fij value reflects contributions from locus-specific effects αi (such as mutation and some forms of selection) and population-specific effects βj (such as effective population size Ne, migration rates, and population-mating patterns). To model these locus- and population-specific effects, Beaumont et al [16] adopted the following regression equation: 1/Fij = 1 + αi + βj + γij (Fij was later defined as FST). The parameter γij (ie, interaction effects between locus i and population j) was tested first and was found to not be significant, so it was removed from the simulations. To identify loci under selection, we focused on the posterior distribution of the locus-effect parameter αi. Under neutrality, the prior mean of αi is expected to be zero. For each locus, αi is considered to be significantly different from zero at the level P when the 100(1−P)% interval of the posterior distribution of αi excludes zero. A significant and positive αi value suggests that the locus i is under divergent directional selection, whereas a significant and negative αi value suggests balancing selection. The program was run twice on each data set to check that similar results were obtained with different starting values for the algorithm (the random-number seeds).
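The decision rule on αi can be sketched as follows, assuming that posterior draws of αi for a locus are available as a plain list of numbers. This is only an illustration of the interval criterion described above: the function name and labels are hypothetical, an equal-tailed interval is assumed, and the code does not reproduce the BayesFst program or its output format.

```python
from typing import Sequence


def classify_locus(alpha_samples: Sequence[float], p: float = 0.05) -> str:
    """Classify a locus from posterior samples of its locus effect.

    Uses the equal-tailed 100(1 - p)% posterior interval: a locus is
    flagged only when that interval excludes zero.
    """
    s = sorted(alpha_samples)
    n = len(s)
    if n == 0:
        raise ValueError("no posterior samples")
    lower = s[int((p / 2) * n)]
    upper = s[min(n - 1, int((1 - p / 2) * n))]
    if lower > 0:
        return "divergent directional selection"
    if upper < 0:
        return "balancing selection"
    return "neutral"
```

Running the same classification on two independent chains, as was done with different random-number seeds, is a simple way to check that a locus is not flagged because of poor convergence.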
The significance level was fixed at .05 for the different tests performed.
Each of the frequencies (except 2) estimated by the predominant-allele method was included in the 95% CI of the frequency estimated using the maximum-likelihood method (R2=0.98; P <.001) (Figure 1). Further analyses were performed using allele frequencies obtained with the predominant-allele method.
nSNPs. Patterns of diversity estimated from the 17 nSNPs are shown in Table 3 for the 7 populations. Allelic diversity, expected heterozygosity, and the proportion of polyclonal infections averaged per locus differed among the populations and were lowest in the areas where malaria transmission is known to be low. One SNP (13_3091) was monomorphic across all populations, and 1 SNP (07_4168) was polymorphic in a single population (the Philippines). For the 15 remaining nSNPs, there were significant differences among the 7 populations in the number of monomorphic nSNPs (P=.009, χ2 test) but not in the distribution of allele frequencies (P=.26, Kruskal-Wallis test). In Mayotte, Haiti, and the Philippines, about half of the monomorphic SNPs were specific to each population.
Patterns of Genetic Diversity at the Putatively Neutral Biallelic Single-Nucleotide Polymorphisms in the 7 Studied Populations
rSNPs. The frequency of drug resistance alleles varied considerably from site to site (Table 4). The maximum variance of allele frequencies between pairs of populations (with one population fixed for the wild-type allele and another fixed for the mutant allele) was found for 4 polymorphisms (pfcrt K76T, pfdhfr C59R, pfdhfr S108N, and pfmdr1 N86Y). Globally, the prevalence of drug resistance alleles was the lowest in Madagascar and Haiti, was intermediate in Mali, and was highest in Gabon, Kenya, Mayotte, and the Philippines. We did not detect a polymorphism in pfdhps in Haiti, Mayotte, and Madagascar; in pfdhfr in Madagascar; or in pfcrt in Haiti and Madagascar. The pfmdr1 N1042D mutant allele was not detected in any of the populations sampled. The pfdhps K540E and pfmdr1 S1034C mutant alleles were specific to Kenya and the Philippines, respectively. The pfdhfr I164L mutant allele was found only once, in Kenya, combined with the N51I+S108N haplotype.
Frequency of Mutant Alleles at Drug Resistance and Antigenic Single-Nucleotide Polymorphisms
aSNPs. The frequency of the mutant alleles varied little across the 7 populations at aSNPs (Table 4). They were found at low values for ama1 (range, 0–0.32) and at high values for msp1 (range, 0.70–1), except for msp1 in the Philippines (range, 0.42–0.45).
Genetic differentiation at putatively neutral and selected SNPs. Genetic differentiation over all 7 populations was assessed for each SNP (Figure 2A). FST values for the 16 polymorphic nSNPs (mean, 0.24; range, 0.03–0.49; 95% CI, 0.17–0.31) were significantly lower than those for the rSNPs (mean, 0.52; range, 0.06–0.65; 95% CI, 0.45–0.59; Z=−2.98; P=.002, Mann-Whitney U test) but were not different from those for the aSNPs (mean, 0.16; range, 0.03–0.34; P=.38, Mann-Whitney U test). Pairwise comparison between the 7 populations for the 16 nSNPs indicated that the high differentiation observed was explained by the populations from Mayotte, Haiti, and the Philippines (data not shown). When the analysis was restricted to the populations showing less differentiation at nSNPs (the 3 populations from mainland Africa plus Madagascar, hereafter referred to as the 4-population sample) (Figure 2B), FST values for the nSNPs dropped dramatically (mean, 0.04; range, −0.02 to 0.15; 95% CI, 0.01–0.07), whereas they remained high for the rSNPs (mean, 0.46; range, 0.13–0.72; 95% CI, 0.32–0.59). With the 4-population sample, there was a significant difference in the FST values between the nSNPs and the rSNPs (P<.001) but not between the nSNPs and the aSNPs (mean, 0.004; range, −0.06 to 0.04; P=.16, Mann-Whitney U test).
Outlier detection. The BayesFst test, based on coalescent simulations, was performed to detect SNPs with a differentiation pattern outside neutral expectations (outliers). With the 7-population sample (Figure 3A), polymorphic rSNPs were identified as outliers evolving under divergent directional selection except for 3 of them (pfmdr1 Y184F, S1034C, and D1246Y). One nSNP (11_2231) and 1 aSNP (ama1 K282I) were identified as evolving under balancing selection, with the other SNPs identified as being under neutral evolution. When the analysis was restricted to the 4-population sample (Figure 3B), similar results were obtained except for pfdhps A437G, 11_2231, and ama1 K282I, for which departure from neutral expectations was not significant.
Similar results were found for both locus-by-locus differentiation and BayesFst tests when we used allele frequencies estimated with the maximum-likelihood method instead of those estimated with the predominant-allele method (data not shown). This indicates that the conclusions obtained are robust with respect to the use of different approaches to allele frequency estimation.
Geographic differentiation among selected SNPs. Extreme geographical variation in allele frequencies was found among the loci conferring resistance to chloroquine and sulfadoxine-pyrimethamine, compared with that for nSNPs. In addition, most of the rSNPs extensively described in the literature as being determinants of resistance to chloroquine (pfcrt K76T and pfmdr1 N86Y) and to sulfadoxine-pyrimethamine (pfdhfr N51I, C59R, and S108N and pfdhps A437G and K540E) were found to be under divergent directional selection, at the continental scale (the 4-population sample defined by populations from mainland Africa plus Madagascar) and/or at a larger geographic scale (7 populations located on 3 continents). This indicates that the drug pressure that selects for resistance varied considerably with respect to intensity and time in the different populations sampled. Therefore, although other processes contributed to the differentiation observed (isolation by distance; data not shown), the geographic structure for the rSNPs can be attributed in large part to drug-mediated selection. This is consistent with the antimalarial drug policies of the countries included. All parasites were collected before the implementation of new antimalarial strategies and, therefore, have been mainly exposed to chloroquine and/or sulfadoxine-pyrimethamine for decades. Our findings extend, with novel sets of nSNPs and populations, the high level of differentiation at rSNPs found in Southeast Asia [12]. Because African parasite populations are large, it is expected that neutral polymorphisms will be maintained for a long time in populations. This fits well with the low differentiation observed here and by others at neutral loci among African populations [17, 18]. Against this neutral background, we observed very clear signatures of directional selection.
Using parasite populations from 2 Southeast Asian countries that differed in their antifolate drug history, the population-differentiation strategy has been successful in demonstrating that copy-number polymorphism in the GTP cyclohydrolase I gene, which encodes the first enzyme in the folate biosynthesis pathway, is adaptive [19]. Our data suggest that a similar strategy might be particularly useful to identify new loci under strong directional selection in African populations.
Three polymorphisms in the pfmdr1 gene (Y184F, S1034C, and D1246Y) and 1 polymorphism in the pfdhps gene (A437G) were not identified as being under selection in the 7- or 4-population samples. In addition, only 1 of the 5 aSNPs (ama1 K282I) was identified as being under balancing selection. Altogether, these observations highlight a limitation of the strategy based on population differentiation. As shown by Beaumont and Balding [16], the BayesFst method poorly identifies loci under balancing selection or under weak directional selection. Indeed, concerning pfmdr1 D1246Y, this is consistent with functional assays showing that this polymorphism had very little effect on resistance to and transport of chloroquine [20, 21]. In contrast, both the pfmdr1 Y184F and the pfdhps A437G polymorphisms were found to significantly alter drug resistance in functional assays [1, 21]. Further evidence from linkage disequilibrium patterns also suggests that these 2 polymorphisms are targeted by selection [22, 23]. These discordant data show that independent strategies are necessary to demonstrate that a polymorphism is adaptive.
Finally, 2 caveats should be mentioned. First, the nSNPs included in this study were initially chosen from a limited set of African strains in PlasmoDB. They were then typed in 2 subpopulations from Mali and Madagascar for validation. This might result in an overestimation of the genetic distance estimated with nSNPs, particularly when comparing African and non-African populations. Theoretically, such a bias is conservative for the detection of divergent directional selection and might result in an overestimation of the number of loci under balancing selection. Second, the populations examined differed both by the time (from 1998 to 2007) and strategy (cross-sectional survey and passive case detection) of sampling.
Geographic patterns of frequencies of drug resistance alleles. The distribution of drug resistance alleles differed considerably between Sainte Marie Island (in Madagascar) and Mayotte, although these 2 African islands are separated by 669 km. For a long time both regions relied mostly on chloroquine as an antimalarial, but first-line strategies became different in 2002 when Mayotte switched to chloroquine plus sulfadoxine-pyrimethamine. Accordingly, polymorphisms in pfcrt, pfmdr1, and pfdhfr are common in Mayotte, and those in pfdhfr and pfdhps are absent among samples from Sainte Marie Island. A very low frequency of mutant pfcrt was found in Madagascar in this and other studies [24–26], although chloroquine drug pressure and late chloroquine treatment failure have been reported in this area since the 1980s. As outlined elsewhere [24, 26], further investigations are needed to clarify the contribution of pfmdr1 polymorphisms to chloroquine resistance in this region. Finally, a surprising finding in Mayotte was the absence of the polymorphisms in pfdhps that emerged during the 1990s in East Africa and spread along the east coast [23]. This shows that the ocean efficiently restricted their dispersal further east. Given that these polymorphisms are of recent origin [23], it might be that there has not been sufficient time and parasite exchange for their successful dispersal in Mayotte.
Very little is known about malaria in Haiti [27–30]. We report here a comprehensive, although not exhaustive, analysis of molecular resistance prevalence in Haiti for the 4 most characterized drug resistance genes. Molecular resistance in Haiti appeared to be the lowest among all populations studied, based on a limited number of isolates (n=44) collected in 2007. Only 2 polymorphisms were detected (pfdhfr S108N and pfmdr1 Y184F), both at a high frequency (0.45 and 1, respectively). Although chloroquine was used as a first-line antimalarial during the past decades, mutant pfcrt was not detected by us and has been detected only at a low frequency by others [30]. As in Madagascar, whether the pfmdr1 Y184F polymorphism plays a role in chloroquine resistance in Haiti is unknown. Its fixation in Haiti might be explained by other forces, such as genetic drift, which is known to be an effective force in small, isolated populations. The large geographic distance between Haiti and the 3 hot spots of resistance (Africa, Southeast Asia, and South America), the relative isolation of Haiti as an island, and the low and seasonal incidence of malaria in the region [28] might explain the very low frequency of drug resistance alleles found there.
Finally, high levels of molecular resistance were found in the Philippines, with 4 major drug resistance alleles fixed or almost fixed (the double-mutant pfdhfr C59R+S108N, pfdhps A437G, pfcrt K76T, and pfmdr1 N86Y). The combination of alleles with a high frequency found in the Philippines differed from those found previously in other parts of Southeast Asia. The pattern specific to the Philippines confirms the intense local adaptation to drug treatment reported in Southeast Asia [12].
Conclusion. We report here evidence for significant local adaptive differentiation in drug resistance genes, determined using several worldwide P. falciparum populations from regions with varied levels of malaria transmission. These data highlight the utility of population-based strategies for identifying loci under strong selection. These tools could be useful for future population genetics studies, in particular to control for demographic processes when frequencies of drug resistance alleles are compared with respect to time and space.
We thank all collaborating centers for their participation in collecting materials and data. We are grateful to Lise Musset and Franck Prugnolle for critical reading of the manuscript, Jacqueline Millet for help with the R software, and Audrey Sabbagh, Pauline Garnier-Géré, and Thierry Wirth for helpful discussions and suggestions.
Potential conflicts of interest: none reported.
Financial support: French Ministry of Health (grant to the Centre National de Référence du Paludisme); Institut de Médecine et d'Epidémiologie Appliquée/Fondation Léon Mba (grant 5988CLN90PFGO); French Ministry of Foreign Affairs for the France-Philippine Technical Cooperation Project on Malaria (grant); BioMalPar Network of Excellence.