
Secreted Glycoprotein from Live Zaire ebolavirus-Infected Cultures: Preparation, Structural and Biophysical Characterization, and Thermodynamic Stability

  1. Laura G. Barrientos1,
  2. Amy M. Martin2,
  3. Robert M. Wohlhueter2 and
  4. Pierre E. Rollin1
  1. 1 Special Pathogens Branch, Division of Viral and Rickettsial Diseases, National Center for Zoonotic, Vector-Borne, and Enteric Diseases, Atlanta, Georgia
  2. 2 Biotechnology Core Facility Branch, Division of Scientific Resources, National Center for Prevention, Detection, and Control of Infectious Diseases, Centers for Disease Control and Prevention, Atlanta, Georgia
  1. Reprints or correspondence: Dr. Laura G. Barrientos, Centers for Disease Control and Prevention, 1600 Clifton Rd. NE, Mail Stop G-36, Atlanta, GA 30333 (LBarrientos1{at}cdc.gov).

Abstract

Milligram quantities of Zaire ebolavirus nonstructural, secreted glycoprotein (sGP) were purified to homogeneity, and this preparation was characterized by an array of biophysical and biochemical experiments. Mass spectrometry analysis revealed sGP posttranslational modifications and regions susceptible to limited proteolysis. In solution, sGP has an absolute molar mass of 103 kDa, is monodisperse, and folds into a predominantly β-sheet conformation with a distinct tertiary structure. sGP appears to have a unique free-energy landscape that facilitates reversible folding and a strong propensity for disulfide-linked dimeric quaternary structure under a wide range of conditions; the low apparent free energy of conformation transition of sGP (ΔG = 1.7 ± 0.1 kcal/mol) suggests that the molecule is well suited as a thermodynamically facile switch, which would allow it to report on relatively subtle changes in milieu. In addition, a conformational transition at 37°C was detected in thermal denaturing experiments. On the basis of biophysical and biochemical considerations alone, we propose that the property of being a thermodynamically facile switch is an important clue to sGP functionality.

The glycoprotein genes (fourth from the 3′ end of 7 linearly arranged genes) of all 4 known Ebolavirus (EBOV) species (Zaire [ZEBOV], Sudan, Reston, and Cote d'Ivoire) are unusual in that they share a common, unique organization that sets them apart from other viruses (including Marburgvirus, a closely related filovirus). This organization encodes in 2 reading frames, programmed to express 2 distinct viral glycoproteins [1, 2]: a virion-associated structural glycoprotein (GP; 676 aa) and a secreted glycoprotein (sGP; 364 aa), both sharing the same 295 N-terminal amino acid residues but ending in different C-terminal sequences. During the course of infection, GP is also processed and released to cell culture media as soluble GP1 [3] and GPD1,2 [4].

sGP is the primary gene product and is synthesized directly from the unedited glycoprotein mRNA as a precursor molecule that is proteolytically processed by a signalase and furin (and/or furin-like proteases) at the N- and C-termini, respectively, to yield the molecules known as sGP [2, 5] and Δ-peptide [6]. Both are efficiently secreted from infected cells into cell culture media, and the concentration of sGP in the circulatory system is relatively high in human patients acutely infected with EBOV [7]. The sGP molecule is a homodimer bridged by disulfide bonds involving 2 cysteine residues located near the N- and C-termini (C53 and C306). Matrix-assisted laser desorption/ionization-time-of-flight mass spectrometry (MALDI-TOF MS) of protease-digested sGP has clearly revealed that the monomers are held together in a parallel orientation via intermolecular disulfide bonds between C53–C53′ and C306–C306′ [8]. In addition, it was shown that intramolecular disulfide bonds among the 4 internal and closely positioned cysteines were formed between alternating residues (C108–C135 and C121–C147). We previously termed the region comprising residues C108–C147 an “F2-like” module, because this disulfide bond topology is similar to that in fibronectin type II domains (F2 modules) and might constrain this portion of sGP to a similar global fold [8].

The function of sGP remains undefined, despite numerous efforts to elucidate it [1, 3, 9–16]. In general, studies aimed at characterizing sGP have employed various recombinant systems, which offer a safer and more convenient means of working with EBOV gene products. However, when conducting exploratory studies, the use of EBOV-expressed sGP is more desirable, because it ensures that the findings reflect the authentic structure, which is not guaranteed when working with recombinant proteins. Furthermore, both the lack of detail on the nature of posttranslational modifications and the lack of monoclonal antibodies and a functional assay with which to probe the conformation of the protein have prompted this work.

We prepared biochemical quantities of highly (>95%) pure sGP derived from ZEBOV-infected cell cultures, to perform both structural and functional studies. We previously reported the unambiguous determination of all disulfide linkages by use of this preparation [8]. Here, we focus on the biophysical and biochemical properties of the molecule, as well as on its thermodynamic stability. Such information is essential to a proper description of the molecule and might shed light on a long-sought functional description of its role in the human disease caused by EBOV.

Methods

All work with infectious EBOV was performed in biosafety level 4 facilities at the Centers for Disease Control and Prevention in Atlanta, Georgia.

Preparation of highly purified sGP from live EBOV-infected cultures. sGP used in this analysis was derived from Vero E6 cell cultures (ATCC CRL-1586) infected with low-passage stocks of ZEBOV (1976 Mayinga strain). The protocol to produce and achieve biochemical quantities of highly (>95%) pure protein is provided in the Appendix, which appears only in the online edition of the Journal. The purity of the sample was checked by gel filtration chromatography (GFC), by SDS-PAGE under reduced and nonreduced conditions, and by MALDI-TOF MS analysis of tryptic peptides. The concentrations of protein stock solutions were determined by refractive index (RI) detection (OPTILAB DSP; Wyatt Technology), as described by Barrientos et al. [17]. Protein stock solutions were typically 0.5 µg/µL and were stored at 4°C or −80°C.

Physicochemical characterization of sGP. A detailed description of all of the methods used to characterize sGP is presented in the Appendix. In brief, sequence and posttranslational modifications in sGP were characterized by MALDI-TOF MS analysis of reverse-phase (RP)-high-performance liquid chromatography (HPLC)-purified tryptic or GluC (Staphylococcus aureus protease V8) digest of sGP and reduced and carboxymethylated sGP ([RCM]sGP). The analysis of HPLC-purified peptides was complemented by sequencing data derived from Edman degradation and electrospray ionization tandem mass spectrometry (ESI-MS/MS).

For oligosaccharide site occupancy and structures analysis, ZIC-HILIC (Sequant) was used for glycopeptide enrichment, followed by RP-HPLC separation and MALDI-TOF MS analysis of glycopeptide-containing fractions with or without endoglycosidase and sequential exoglycosidase treatment.

sGP was also characterized by an array of biophysical methodologies, essentially as described elsewhere [17, 18]. Circular dichroism (CD), steady-state tryptophan fluorescence, and multiangle laser light scattering (MALLS) were used to assess sGP secondary, tertiary, and quaternary structures, respectively. The reversibility of protein (un)folding by thermal and chemical denaturation with guanidinium hydrochloride (GdnHCl) was also assessed by use of these techniques at different protein concentrations (0.05, 0.5, and 5.0 µg/µL) and pH levels (7.4 and 6.0).

The effect that incubating the protein at 37°C had on conformational changes was investigated at different protein concentrations (0.05, 0.5, and 5.0 µg/µL) and pH levels (7.4 and 6.0). Aliquots were subjected to MALLS and fluorescence analysis (at 20°C) at various time points.

Equilibrium (un)folding induced by GdnHCl was monitored by fluorescence, and the thermodynamic parameters were extracted from the Gibbs energy function [17, 18]. Thermal denaturation was also monitored by fluorescence.

sGP tertiary structure was also probed by trypsin-limited proteolysis of the intact protein at 20°C. RP-HPLC and MALDI-TOF MS were used to monitor the appearance and accumulation of released peptides.

Results and Discussion

The premise behind this study of sGP structure, folding, and stability is that structural elucidation might clarify the poorly understood function of this viral protein. The use of the authentic form (i.e., derived from live EBOV infection) of sGP eliminates any issues related to the fidelity of glycoprotein processing and/or collateral influences by EBOV proteins expressed concomitantly with sGP. Herein, we present the results of extensive physicochemical characterizations of the molecule.

Production and purification of sGP from culture fluid from EBOV-infected Vero E6 cells. sGP was purified to homogeneity by lectin and GFC from culture fluids of EBOV-infected Vero E6 cells incubated in serum-free medium. The advantage of purifying sGP from EBOV-infected Vero E6 (or comparable) cultures is that it offers a simple, rapid, and reproducible means of producing relatively large quantities of authentic protein. GFC, SDS-PAGE, and mass spectrometry analysis (figure 1) indicate that the purified protein is highly (>95%) pure and is the disulfide-linked homodimer sGP in which the subunits are joined in a parallel orientation via 2 intermolecular (C53–C53′ and C306–C306′) cystine bridges.

Figure 1

Identification of purified secreted glycoprotein by gel filtration chromatography, SDS-PAGE, and matrix-assisted laser desorption/ionization-time-of-flight mass spectrometry. The figure and legend are available in their entirety in the online edition of the Journal of Infectious Diseases.

Primary sequence and posttranslational modifications (PTMs). Confirmation of the encoded amino acid sequence and identification of PTMs require peptide mapping with complete sequence coverage. Our previous analysis (with sequence coverage of 97%) of the mass-to-charge ratios (m/z) of tryptic peptides provided convincing evidence for the complete assignment of all disulfide linkages and showed that all 6 sites bearing the consensus sequence for N-glycosylation (NX-T/S) are fully occupied [8]. In addition to these PTMs, we found that 1 hexose unit is added to peptides T26 and T26T27 (see peptide notation in figure 2). The 2 strong peaks at m/z 2298.19 and 2426.29 observed in figure 1C were matched to T26 and T26T27, respectively, by use of the FindMod tool (http://ca.expasy.org/tools/findmod/) to predict a potential PTM. This search indicated that these peptides bear the C-mannosylation recognition signal WXXW, which potentially explains the increment of 162 atomic mass units (amu), compared with the peptide masses of T26 and T26T27.

Figure 2

Amino acid sequence of the monomeric molecule of secreted glycoprotein from the 1976 Mayinga isolate of Zaire ebolavirus (GenBank accession no. U23187). The figure and legend are available in their entirety in the online edition of the Journal of Infectious Diseases.

Enzymatic cleavage of sGP and [RCM]sGP with trypsin and GluC followed by RP-HPLC helped to further define primary structure and PTMs. Peaks were collected and analyzed by MALDI-TOF MS, and ESI-MS/MS and N-terminal sequencing by Edman degradation were used as indicated. The primary sequence and all previously found PTMs were reconfirmed in this extensive analysis (PTMs highlighted in figures 2 and 3). No evidence for unoccupied N-glycosylation sites, amino acid mutations, or other PTMs, including tryptophan oxidation and O-linked glycans, was found by use of our methodology; however, the question remains open as to whether other techniques or methodologies might reveal otherwise.

Figure 3

Summary of all structural information known to date regarding native Zaire ebolavirus secreted glycoprotein (sGP) homodimer. A, Schematic representation of sGP homodimer showing the parallel orientation of the monomeric units, all of the posttranslational modifications, and the sites susceptible to limited trypsin digestion. N- and C-termini are indicated by N, N′, C, and C′. The disulfide bonds, N-glycans, and C-mannosyl residue in sGP are indicated by S-S, Y, and hexagons, respectively. The residue labels are in yellow, blue, and green, respectively. Preferential cleavage sites on sGP, detected by limited trypsin digestion, were classified as primary (R64/R64′, R85/R85′, and K95/K95′), secondary (K276/K276′ and K295/K295′), and tertiary (K190/K190′ and K191/K191′) merely on the basis of qualitative kinetic evaluation, and the residue labels were color coded as orange, pink, and fuchsia, respectively. B, Diagram of sGP showing the glycan structures found at the 6 occupied sites, the C-mannosylated tryptophan site, the predicted global fold of the F2-like module [8], and the sites susceptible to limited trypsin proteolysis. Only 1 monomeric subunit is shown, for simplicity. Labels and their color codes are as in panel A. The hypothetical glycan structures are drawn in planar projection.

C-mannosylation of sGP at position W288. Edman degradation and ESI-MS/MS of RP-HPLC-purified peptide T26 were performed to gain information on the position of the hexose unit. As is shown in figure 4A, the amino acid sequence, as revealed by Edman degradation, matches that predicted for peptide T26, except that no phenylthiohydantoin derivative of tryptophan (PTH-W) eluted at cycle 12, which suggests that W288 has been modified. ESI-MS/MS analysis corroborated this conclusion (figure 4B). T26 was identified as a doubly charged ion at m/z 1149.47. MS/MS analysis of this species revealed the sequences GE(C2-Man)WAFWETK (y9*), AFWETK (y6), FWETK (y5), and WETK (y4). The singly charged y9 fragment was also accompanied by a mass loss of 120 amu (y9*-120), known as the fingerprint mass for α-C2-mannosyltryptophan ([C2-Man]W) because of the stability of the C–C bond in aromatic C-glycosides [19]. Therefore, we conclude that a mannosyl residue is attached to W288 and that W291 is unmodified. The structure of the modified tryptophan, (C2-Man)W, is shown in figure 3B. sGP is the first viral protein reported to bear this modification, and the sequence motif WAFW (also present in GP) is conserved among all 4 species of EBOV.
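The fragment assignments above can be cross-checked with a short calculation. The following Python sketch computes theoretical, singly charged monoisotopic y-ion m/z values for the C-terminal fragments named in the text, adds one hexose residue to the fragment that contains the modified tryptophan, and applies the 120-amu loss. The function and variable names are illustrative, standard monoisotopic residue masses are assumed, and none of the computed values are taken from figure 4B.

```python
# Standard monoisotopic residue masses (Da) for the residues in GEWAFWETK.
RESIDUE = {
    "G": 57.02146, "E": 129.04259, "W": 186.07931, "A": 71.03711,
    "F": 147.06841, "T": 101.04768, "K": 128.09496,
}
WATER = 18.01056          # added once per y ion (free C-terminus)
PROTON = 1.00728          # charge carrier for a singly charged ion
HEXOSE = 162.05282        # anhydro-hexose added by C-mannosylation
C_GLYCOSIDE_LOSS = 120.04226  # C4H8O4, the cross-ring loss of C-glycosides


def y_ion_mz(fragment, n_hexose=0):
    """Return the singly charged monoisotopic y-ion m/z of a fragment."""
    mass = sum(RESIDUE[aa] for aa in fragment)
    return mass + WATER + PROTON + n_hexose * HEXOSE


y4 = y_ion_mz("WETK")
y5 = y_ion_mz("FWETK")
y6 = y_ion_mz("AFWETK")
y9_star = y_ion_mz("GEWAFWETK", n_hexose=1)   # carries (C2-Man)W
y9_star_minus_120 = y9_star - C_GLYCOSIDE_LOSS

for name, mz in [("y4", y4), ("y5", y5), ("y6", y6),
                 ("y9*", y9_star), ("y9*-120", y9_star_minus_120)]:
    print(f"{name:8s} {mz:10.4f}")
```

Because y4 through y6 contain only the second tryptophan and carry no hexose increment, whereas y9* does, the pattern places the modification on the first tryptophan of the WAFW motif.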

Figure 4

Secreted glycoprotein (sGP) tryptic peptide T26 contains the C-mannosylation acceptor motif WXXW, and the first tryptophan is C-mannosylated, as evidenced by Edman degradation and mass spectrometry. A, Chemical microsequencing by Edman degradation of the high-performance liquid chromatography-purified peptide T26. N-terminal Edman sequencing was performed on a Procise 492 Protein Sequencer (Applied Biosystems). Sixteen Edman cycles were collected and analyzed both by use of Procise software and by manually overlaying successive traces. The elution profile of amino acid standards (upper panel) and the first 12 cycles are shown. The retention time and the identified amino acid of the eluted trace are labeled in each cycle. No phenylthiohydantoin derivative of tryptophan (PTH-W) eluted at cycle 12. B, Tandem mass spectrometry spectrum for peptide T26. The dominant y fragments (labeled) confirm that the first tryptophan residue, W288, is C-mannosylated and the second tryptophan, W291, is unmodified in intact sGP.

Site-specific N-glycosylation and glycan structures. We extended our analysis to a comprehensive characterization of all N-linked glycans of sGP. We used ZIC-HILIC (a hydrophilic stationary-phase liquid chromatography) in batch mode, a robust method for glycopeptide enrichment [20], to isolate the glycopeptides from the tryptic digests and to determine the specific carbohydrate structures, as well as their sites of attachment to the protein. The resulting glycopeptide-containing eluate was separated by reverse-phase HPLC (RP-HPLC; figure 5A). The HPLC fractions were analyzed by MALDI-TOF MS. After deglycosylation with PNGase F, the carbohydrate-free peptides were identified by their mass (or Edman sequence, as in the case of peptide T24) (figure 5A and table 1). Peptide T21 (comprising masses of 6500–8500) contains 2 sites for N-glycosylation (N228 and N238). When GluC was used to digest this peptide, 2 peptides resulted, each with a single site (referred to in table 1 as ΔT21[220–231] and ΔT21[236–245], respectively); each was purified and analyzed. Thus, all 6 potential sites for N-glycosylation were shown to be occupied.

Figure 5

Separation of N-linked tryptic glycopeptides and determination of site-specific glycan profile and structures. A, Reverse-phase high-performance liquid chromatography of ZIC-HILIC-enriched tryptic glycopeptides. Secreted glycoprotein (sGP) or reduced and carboxymethylated sGP ([RCM]sGP) was digested with trypsin, and the glycopeptides were isolated by use of ZIC-HILIC in batch mode. The eluted pool of glycopeptides was chromatographed on a C18 reverse-phase column. Matrix-assisted laser desorption/ionization-time-of-flight mass spectrometry (MALDI-TOF MS) analysis of the glycopeptide-containing peak fractions with and without PNGase F treatment demonstrates that all 6 predicted N-glycosylation sites are fully occupied. The glycopeptides are labeled in blue characters according to the notation depicted in figure 2. Other peptides detected were also labeled (the disulfide-linked peptides T10+T14 and T30+T30′ in yellow characters and the C-mannosylated peptide T26 in green characters). Milliabsorbance units (mAu) are shown on the Y-axis. B and C, Representative MALDI-TOF MS analysis of the glycoforms attached at a single site for N-glycosylation and determination of glycan structures by glycan sequencing. The analysis is illustrated with the first chromatographic peak consisting of peptide T20, comprising site N204, which is N-glycosylated with multiple glycosyl structures. The sugar composition of the oligosaccharide attached at the glycosylation site was deduced from the difference between the observed mass and the theoretical mass of a corresponding nonglycosylated peptide, as summarized in table 1. From the estimated carbohydrate composition, the types of oligosaccharide structures were predicted, and the proposed structures were probed by sequential digestion with exoglycosidase enzymes. In this example, sialidase α(2–3) (Streptococcus pneumoniae) (SA), β-(1–4)-galactosidase (S. pneumoniae) (Gal), and β-glucosaminidase (S. pneumoniae) (Glc) were used. 
The bisecting N-acetylglucosamine (GlcNAc) slows the cleavage of the nonreducing terminal β-linked GlcNAc (C, lower panel). The symbols used to draw the planar projection of the proposed sugar structures are as in figure 3B. Data were acquired in both linear (B) and reflectron (C) modes, and α-cyano-4-hydroxycinnamic acid was used as the matrix. The [M+H]+ ion, accompanied by [M+Na]+, was observed in positive ion spectra. The linear mode is more sensitive for the clear detection of sialylated glycans (treatment with SA perturbed the spectra by a 291-amu loss, as shown in panel B in orange and in dashed lines). High mass accuracy was obtained in reflectron mode. The measured monoisotopic masses of the [M+H]+ ions are labeled, and the calculated values are shown in parentheses. The species marked by an asterisk (*) are listed in table 1.

Multiple forms of glycosyl structure were found at each site, reflecting microheterogeneity in glycosylation (table 1). The sugar composition of the oligosaccharide attached at each glycosylation site was deduced from the difference between the observed mass and the theoretical mass of a corresponding nonglycosylated peptide through the use of the GlycoMod tool provided through the ExPASy Web site. From their estimated carbohydrate compositions, the glycan structures were predicted using GlycoSuite. Resistance of the majority of the oligosaccharides to endoglycosidase H hydrolysis reveals a predominant presence of complex-type glycans, with only a few structures that appear to be sensitive to hydrolysis (hybrid-type) (the classification is shown in table 1). When necessary, multiple structural possibilities were narrowed down by sequentially digesting the glycopeptides with multiple glycosidases. A representative spectrum is illustrated for the first eluting chromatographic peak (retention time, 22.2 min) (figure 5B and 5C). Sialylated glycans were detected in the positive ion spectra recorded in linear mode (figure 5B). To simplify the analysis, glycans were desialylated (by sialidase treatment) before sequential digestion with other exoglycosidases (figure 5C).
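The mass-difference reasoning described above can be illustrated with a brute-force search. The Python sketch below enumerates monosaccharide compositions whose summed residue masses match a given glycopeptide-minus-peptide mass difference within a tolerance. It is written in the spirit of a composition search and is not the GlycoMod algorithm; the function name, the tolerance, the count limit, and the example input are assumptions chosen for illustration, and the example mass is constructed from the residue table rather than taken from table 1.

```python
from itertools import product

# Monoisotopic residue masses (Da) of common N-glycan building blocks.
MONOSACCHARIDE = {
    "Hex": 162.05282,     # e.g., mannose or galactose
    "HexNAc": 203.07937,  # e.g., N-acetylglucosamine
    "dHex": 146.05791,    # e.g., fucose
    "NeuAc": 291.09542,   # N-acetylneuraminic (sialic) acid
}


def candidate_compositions(delta_mass, tol=0.05, max_count=10):
    """List compositions whose residue masses sum to delta_mass +/- tol."""
    names = list(MONOSACCHARIDE)
    hits = []
    for counts in product(range(max_count + 1), repeat=len(names)):
        total = sum(n * MONOSACCHARIDE[name]
                    for n, name in zip(counts, names))
        if abs(total - delta_mass) <= tol:
            hits.append(dict(zip(names, counts)))
    return hits


# Hypothetical input: a core-fucosylated biantennary glycan
# (Hex5 HexNAc4 dHex1), built from the table above, not measured.
example_delta = 5 * 162.05282 + 4 * 203.07937 + 1 * 146.05791
hits = candidate_compositions(example_delta)
for composition in hits:
    print(composition)
```

The same table explains two increments quoted in the text: removal of one sialic acid shifts a peak by about 291 amu, and one hexose adds about 162 amu. A composition alone does not fix linkage or branching, which is why exoglycosidase digestion is still needed to narrow the candidates.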

Planar representations of all of the proposed structures are summarized in figure 3B. The majority of the glycoforms of sGP belong to the complex type, with core fucosylated structures with or without bisecting GlcNAc. Sialylated glycans were detected at all 6 sites. Only N257 bears mixed-type oligosaccharides. It is important to note that the proposed structures, although likely, are considered to be hypothetical because MS does not provide information on the stereochemical properties of the linkages, and limited information on linkages was obtained by using specific enzymes.

Biophysical characterization of sGP. The native conformation of sGP was assessed by CD, fluorescence spectroscopy, and MALLS. The far-UV CD spectra of sGP recorded at 20°C have 2 bands, which we interpret as evidence that the protein is folded and contains a mixture of elements of secondary structure (figure 6A). The negative bands at 210 and 220 nm are typical of β-sheet and α-helix content, respectively. The β-sheet content is predominant, whereas the α-helical content is estimated to be 15%.

Figure 6

Biophysical characterization, protein folding, and thermodynamic stability of secreted glycoprotein (sGP) homodimer. A, Circular dichroism (CD) spectra of sGP. Far-UV CD spectra of folded sGP were recorded at 20°C with a protein concentration of 0.05 µg/µL in phosphate buffer (50 mmol/L; pH 7.2). The molar circular dichroism values, also called Δε (L·mol−1·cm−1), were calculated, and the elements of secondary structure were extracted from these data. B, Steady-state tryptophan fluorescence spectra of sGP. Fluorescence spectra of sGP (0.05 µg/µL) in PBS buffer (pH 7.4) in the native state (F) and the unfolded state (U) in 6N guanidinium hydrochloride (GdnHCl). Fluorescence was excited at 280 nm, the emission spectra were normalized (I/I347), and data were recorded at 20°C. C, Characterization of sGP by gel filtration chromatography (GFC) in conjunction with inline multiangle light scattering. Measurements were made with 50 µg of protein in PBS buffer (pH 7.4) containing 0.02% sodium azide. The elution profile was monitored by refractive index (black continuous lines), and the predicted molar masses from light-scattering measurements (black squares) are displayed. Agilent high-performance liquid chromatography coupled to a triple detector system (Wyatt Technology) was used, and the data were recorded and analyzed using the ASTRA program (Wyatt Technology). D, In vitro refolding of sGP. The glycoproteins were denatured with 6N GdnHCl in PBS buffer (pH 7.4) and refolded at 3 different concentrations (0.05, 0.5, and 5 µg/µL). The refolding products were applied onto a Superdex-200 column in pH 7.4 PBS buffer. The GFC profile (elution volume vs. milliabsorbance units [mAu]) shows that sGP refolds preferentially to the dimeric form (upper and lower panels). Small amounts of oligomeric species are observed only at the highest concentration, 5 µg/µL (lower panel). 
The orange traces in panels A, B, and C show that refolding at low protein concentration is quantitatively reversible. A similar aliquot (containing 50 µg of protein) as that applied in the upper panel (black traces) was incubated at 37°C for 3 days. Steady-state fluorescence was continuously monitored, as indicated in Methods, and the I335/I347 ratio was calculated at various times (insert in upper panel). At day 3, the sample was reapplied onto the column (orange traces, upper panel). E, Unfolding curves for sGP. Equilibrium GdnHCl denaturation curves for sGP are shown in the upper panel. The unfolding transition was followed by the intrinsic fluorescence excited at 280 nm. Raw data were converted to the fraction of folded molecules and plotted against GdnHCl concentration. The solid lines represent nonlinear least-square fits of the data to a 2-state model. All measurements were performed at 20°C on a sample containing 0.05 µg/µL in PBS buffer (pH 7.4). The following thermodynamic parameters were recovered: [GdnHCl]1/2 = 0.9 ± 0.1 mol/L, ΔG = 1.7 ± 0.1 kcal·mol−1, and m = 1.86 ± 0.1 kcal·mol−2·L. The temperature-induced denaturation curve for sGP is shown in the lower panel. The unfolding transition was followed by the intrinsic fluorescence excited at 280 nm. The experiment was performed at 20°C on a sample containing 0.05 µg/µL in PBS buffer (pH 7.4).

Table 1

Glycoforms, glycosylation type, and deduced oligosaccharide structures.

Steady-state tryptophan fluorescence of native sGP, recorded at 20°C with excitation at 280 nm, shows a blue-shifted emission maximum (λmax) at 335 nm, reflecting asymmetry in the environment of the indole ring(s) (proof that they exist in the context of a tertiary structure) (figure 6B). Unfolding of sGP induced by 6N GdnHCl resulted in a decrease in the fluorescence quantum yield and a red shift of the λmax from the native value of 335 nm to 347 nm (figure 6B). The position of the λmax in the unfolded state reflects a highly polar (i.e., solvent-exposed) microenvironment.

sGP was also characterized using GFC monitored by online MALLS, UV absorbance, and RI detectors. With this setup, molar mass distribution, absolute molar masses, and number-average (Mn), weight-average (Mw), and z-average (Mz) molar mass moments can readily be computed. Figure 6C shows a plot of sGP molar mass versus elution time. The computed molar mass of the purified sGP is 102,933 ± 3017 Da. This value is in agreement with the calculated molar mass from the sequence (65.3 kDa) plus an estimated carbohydrate content (∼35–41 kDa) of dimeric sGP; this figure is construed as a more accurate measurement than that estimated by GFC or SDS-PAGE (figure 1A and 1B). Moreover, the linear nature of the absolute molar mass distribution across the peak in figure 6C revealed that the sample is monodisperse, with Mw/Mn and Mz/Mn values of 1.006 ± 0.019 and 1.013 ± 0.034, respectively. The range of molar mass is likely due to microheterogeneity associated with variations in N-linked sugar composition.
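The molar mass moments and the mass balance quoted above can be made concrete with a small calculation. The Python sketch below uses the standard definitions of Mn, Mw, and Mz for a set of chromatographic slices, each with a concentration and a molar mass, and then checks that the measured dimer mass falls inside the range expected from the sequence mass plus the estimated carbohydrate content. The slice values are hypothetical and serve only to show how ratios close to 1 arise for a narrow distribution; they are not data from figure 6C, and this is not the ASTRA implementation.

```python
def molar_mass_moments(conc, mass):
    """Return (Mn, Mw, Mz) from per-slice concentrations and molar masses."""
    sum_c = sum(conc)
    sum_cm = sum(c * m for c, m in zip(conc, mass))
    mn = sum_c / sum(c / m for c, m in zip(conc, mass))   # number average
    mw = sum_cm / sum_c                                   # weight average
    mz = sum(c * m * m for c, m in zip(conc, mass)) / sum_cm  # z average
    return mn, mw, mz


# Hypothetical slices across a narrow peak (arbitrary concentration units, Da).
conc = [0.2, 0.6, 1.0, 0.6, 0.2]
mass = [101_000.0, 102_000.0, 103_000.0, 104_000.0, 105_000.0]
mn, mw, mz = molar_mass_moments(conc, mass)
print(f"Mw/Mn = {mw / mn:.4f}, Mz/Mn = {mz / mn:.4f}")

# Mass balance from the text (kDa): sequence mass plus carbohydrate estimate.
protein_kda = 65.3
expected_low = protein_kda + 35.0
expected_high = protein_kda + 41.0
measured_kda = 102.933
within_expected = expected_low <= measured_kda <= expected_high
print(f"expected {expected_low:.1f}-{expected_high:.1f} kDa, "
      f"measured {measured_kda:.1f} kDa, consistent: {within_expected}")
```

For any distribution Mn ≤ Mw ≤ Mz, with equality only for a single species, so ratios within error of 1 are what a monodisperse sample is expected to give.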

sGP exists preferably in the dimeric-folded state. At this juncture, we stress that the functionality of sGP remains undefined, and there is the possibility that the folded, dimeric conformation observed in solution may or may not be functionally relevant. There are well-known cases in which aggregation, conformational changes, or misfolding of the protein is closely connected with protein functionality. In this context, we searched for possible mechanisms that could lead to protein oligomerization, conformational change, or misfolding. We used biophysical methodology similar to that described by Barrientos et al. [17, 18]. First, we studied GdnHCl-induced and thermally induced (un)folding of sGP. The effect of the chaotropic agent was tested at 3 different protein concentrations (0.05, 0.5, and 5 µg/µL) and 2 pH levels (7.4 and 6.0). CD and fluorescence analysis indicated that chaotropic agent-induced folding/unfolding was completely reversible (figure 6A and 6B, orange traces). MALLS/RI/UV-GFC further indicated that, at low protein concentrations (0.05 and 0.5 µg/µL), refolding to the folded, dimeric state of sGP was quantitatively reversible (figure 6C, orange trace). In addition, we did not detect formation of higher-order aggregates (figure 6D, top panel, black trace). Only at the highest protein concentration (5 µg/µL) were small amounts of higher-order species detected (<2%) (figure 6D, bottom panel). The effect of thermal denaturation was also tested in the same fashion. Natively folded protein was unfolded at 80°C, but, on cooling to 20°C, it refolded to its original state. Both fluorescence and MALLS-GFC analysis indicated reversible folding at the lowest protein concentration (0.05 µg/µL), with no aggregation or precipitation. At a higher protein concentration, however, protein aggregation was detected by both techniques; even so, the native dimer form predominated (∼70%).

We also tested whether a long incubation period at 37°C has an effect on conformational integrity and oligomerization of the dimer. Purified dimeric sGP at 3 different concentrations (0.05, 0.5, and 5 µg/µL) and 2 different pH levels (7.4 and 6.0) was incubated at 37°C for up to 72 h. Tryptophan fluorescence (excitation at 280 nm; emission spectra of 300–400 nm) was measured at 24-h intervals. After 72 h, each sample was subjected to MALLS/RI/UV-GFC analysis. Figure 6D (top panel, black and orange traces) and the insert in figure 6D (top panel) illustrate one example of the results that were obtained. In all cases, the fluorescence spectra of the native protein remained unchanged, and no precipitation or protein aggregation occurred. MALLS-GFC detected only dimeric sGP, and the RI signal indicated no loss of sample. These results indicate that the folded dimer is the predominant and a relatively stable form of sGP. Consistent with these experimental results, we observe that sGP stock solutions have a long shelf life at 4°C, from which we estimate that the dimeric folded form has a half-life of >100 years.

The effect of protein precipitants was also investigated. sGP (1 µg/mL) was precipitated by using a mixture of acidified methanol/acetone and then redissolved in buffer. A mixture of states was detected, including soluble and insoluble high-molecular-weight aggregates and soluble hexamers, tetramers, and dimers. Therefore, high-order aggregates were formed only under aggressive conditions that are unlikely to be encountered physiologically.

sGP appears to have a bizarre folding, free-energy landscape that allows it to refold smoothly and efficiently, under a wide range of conditions, to the preferential dimeric form. We have shown that the dimer is held together by 2 intermolecular disulfide bonds [2, 8], and it is possible that the propensity to revert to a soluble dimer is merely a consequence of this covalent fact. Given that the oligomerization constant is ∼10,000-fold higher than the concentration of sGP in tissue culture (0.5–1 ng/µL), we conclude that aggregation of sGP is not important in the context of EBOV infection, and we postulate that the folded dimeric conformation is the functional form of sGP (if it has any functionality).

Thermodynamic stability of sGP. The stability of dimeric sGP toward chemical and thermal denaturation was further probed by tryptophan fluorescence at the lowest protein concentration (0.05 µg/µL), in an attempt to observe equilibrium between folded and unfolded dimeric states. Figure 6E (top panel) displays the unfolding profile generated for sGP in the presence of increasing amounts of GdnHCl. As is discussed above, folded dimer and unfolded dimer are the only 2 species in equilibrium at the low protein concentration (0.05 µg/µL). We consider the fractional extent of emission red-shift between its limits as a measure of fractional unfolding; that is, it quantifies a simple 2-state equilibrium between folded and unfolded dimer: folded dimer ⇌ unfolded dimer.

The midpoint of the GdnHCl denaturation curve is 0.9±0.1 mol/L. Analysis of the unfolding curve allows the extraction of thermodynamic parameters: the free energy of unfolding (ΔG) is 1.7±0.1 kcal/mol, with an m value of 1.86±0.1 kcal·mol−2·L. These results indicate that sGP is highly susceptible to physicochemical denaturation.
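
As a rough consistency check on these parameters, the sketch below assumes the common linear extrapolation model, ΔG([D]) = ΔG(H2O) − m·[D]. The article does not state the fitting model or the measurement temperature in this section, so the model, the 25°C default, and all function names are assumptions made for illustration.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)


def delta_g(denaturant_molar, dg_water=1.7, m_value=1.86):
    """Free energy of unfolding (kcal/mol) at a given [GdnHCl] in mol/L.

    Assumes the linear extrapolation model; the defaults are the values
    reported in the text (1.7 kcal/mol and 1.86 kcal*mol^-2*L).
    """
    return dg_water - m_value * denaturant_molar


def fraction_unfolded(denaturant_molar, temp_kelvin=298.15,
                      dg_water=1.7, m_value=1.86):
    """Fraction of unfolded dimer for a 2-state equilibrium.

    temp_kelvin defaults to 25 degrees C, which is an assumption.
    """
    dg = delta_g(denaturant_molar, dg_water, m_value)
    k_unfold = math.exp(-dg / (R_KCAL * temp_kelvin))
    return k_unfold / (1.0 + k_unfold)


def midpoint(dg_water=1.7, m_value=1.86):
    """Denaturant concentration (mol/L) at which delta_g is zero."""
    return dg_water / m_value


if __name__ == "__main__":
    print(f"predicted midpoint: {midpoint():.2f} mol/L GdnHCl")
    for conc in (0.0, 0.5, midpoint(), 1.5, 2.0):
        print(f"[GdnHCl] = {conc:.2f} mol/L -> "
              f"fraction unfolded = {fraction_unfolded(conc):.3f}")
```

With the reported values, the predicted midpoint is about 0.91 mol/L, within the stated 0.9±0.1 mol/L. Under the same assumptions, roughly 5% of the dimer would be unfolded even without denaturant, which illustrates how small a free energy of 1.7 kcal/mol is.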

The thermal melting curve (figure 6E, bottom panel) for sGP exhibited 2 well-resolved transitions: 1 at 37°C and 1 at the melting temperature (Tm, 60°C). A transition just at 37°C suggests a physiological role of this state change, perhaps some regional melting or a conformational change, although our low-resolution methods do not shed light on the precise nature of the change.

Probing the structure of sGP with limited trypsin proteolysis. Vulnerability to limited proteolysis by endoproteases is a useful probe of protein structure. To gain insight into the protein tertiary fold of the native sGP, we exposed it to mild trypsin digestion at neutral pH (7.2) and 20°C and then evaluated the effects by use of RP-HPLC and mass spectrometry. Preferential cleavage sites were classified as primary, secondary, or tertiary on the basis of qualitative kinetic evaluation (figure 3A and 3B; shown in orange [primary], pink [secondary], and fuchsia [tertiary]). The primary and secondary sites are located in the N-terminal regions (residues 65–95) and C-terminal regions (residues 276–295), respectively, indicating that these regions are particularly flexible and exposed. The tertiary site (comprising both K190 and K191) was located between the “F2-like region” [8] and the glycan-rich region. Collectively, these sites presumably lie at flexible surface loops or links between subdomains. It is worth noting that the F2-like region (comprising residues V96-K190/K191) is highly resistant to proteolysis. In contrast, the C-mannosylated site is within a protease-sensitive segment. Perhaps this PTM at W288 confers functionally significant hydrophilicity on an otherwise highly hydrophobic region comprising residues W288-A28-F300-W301.

Summary. sGP is the most abundant viral protein in the culture fluid of EBOV-infected cells, a situation that facilitated the isolation and purification of biochemical quantities of pure protein for both structural and functional studies. The characterization of this preparation should prove to be a very useful guide to accelerate other studies of sGP produced from recombinant systems. The work presented here enriches our understanding of the biophysical and biochemical properties of the protein and guides the way to the determination of a high-resolution, 3-dimensional structure and the grail of sGP functionality. A high-resolution model is required to inspire a better functional understanding of this unique protein. In the present study, we separated the tasks of characterizing the posttranslational modification of sGP into site-specific glycosylation, glycan structures, and residues most sensitive to protease cleavage. The composite of these tasks, illustrated in figure 3, provides sufficient prerequisite information to guide a domain-by-domain approach to the determination of high-resolution structures of putative 3-dimensional domains in sGP. Our ability to produce and purify milligrams of authentic sGP, reinforced by the wealth of biophysical and biochemical information from this study, offers hope regarding the feasibility of pursuing crystallization and X-ray diffraction of the intact, fully glycosylated sGP molecule (i.e., to conquer).

We also demonstrated that sGP exhibited all of the characteristics of a well-folded protein, with a free-energy landscape that facilitates reversible folding and a strong propensity for disulfide-linked dimeric quaternary structure, even at high concentrations (>5 µg/µL). The low apparent free energy of conformation transition of sGP (ΔG, 1.7±0.1 kcal/mol) suggests that the molecule is well suited as a thermodynamically facile switch, which would allow it to report on relatively subtle changes in milieu. In addition, a conformational transition at 37°C was detected in the thermal denaturing experiment. On the basis of biophysical and biochemical considerations alone, we propose that the property of being a thermodynamically facile switch will be an important clue to reveal sGP functionality.

Ebola hemorrhagic fever in humans is an “accidental event” that presumably played no role in the past and continuing evolution of EBOV in nature. Therefore, the biological function of sGP, to the extent that it has one, is manifested only in its infection of the maintenance host(s), in which sGP arose before the divergence of the 4 EBOV species, and in its retention by natural selection as a result of some contribution it makes to the efficient transmission of virus from one member of the host species to the next. Two EBOV mutants in which sGP is not produced [21] or is produced at low levels as a secondary gene product [22] are significantly more cytopathic than their wild-type counterpart. Therefore, the production of sGP might be an exquisite evolutionary feature for viral survival. Possible roles of sGP in the reservoir host could include interaction with macrophages or dendritic cells that prolong the infection and ameliorate illness. The facile conversion between folded and unfolded states, thanks to the low energetic barrier separating these states, might be involved in switching to a conformational form, which, lacking binding affinity, does not inhibit viral entry into cells in the context of a rapidly progressing disease in humans and nonhuman primates.

Acknowledgments

We thank Anthony Sanchez, for his help with the large-scale production of secreted glycoprotein derived from culture medium of Zaire ebolavirus-infected Vero E6 cells; Claudia Chesley, for providing editorial assistance; Amy Hartman, Stuart Nichol, Tom Ksiazek, and Brian Holloway, for their support; and Sequant (Umeå, Sweden), for the kind gift of ZIC-HILIC.

Supplement sponsorship. This article was published as part of a supplement entitled “Filoviruses: Recent Advances and Future Challenges,” sponsored by the Public Health Agency of Canada, the National Institutes of Health, the Canadian Institutes of Health Research, Cangene, CUH2A, Smith Carter, Hemisphere Engineering, Crucell, and the International Centre for Infectious Diseases.

Footnotes

  • Potential conflicts of interest: none reported.

  • Financial support: Centers for Disease Control and Prevention (CDC); Research Participation Program administered by the Oak Ridge Institute for Science and Education through an interagency agreement between the US Department of Energy and the CDC (support to L.G.B. in the Division of Viral and Rickettsial Diseases). Supplement sponsorship is detailed in the Acknowledgments.

  • The findings and conclusions in this report are those of the authors and do not necessarily represent the views of the funding agency.
