Computer-assisted mass spectrometric analysis of
naturally occurring and artificially introduced cross-links
in proteins and protein complexes
Leo J. de Koning1, Piotr T. Kasper1, Jaap Willem Back1, Merel A. Nessen1, Frank Vanrobaeys2,
Jozef Van Beeumen2, Ermanno Gherardi3, Chris G de Koster1 and Luitzen de Jong1
1 Biomolecular Mass Spectrometry group, Swammerdam Institute for Life Sciences, University of Amsterdam, the Netherlands
2 Laboratory of Protein Biochemistry and Protein Engineering, University of Gent, Belgium
3 MRC Centre, Cambridge, UK
Mass spectrometry has become a major tool in the
structural analysis of proteins and protein complexes
and in large scale analysis of the function of genes
(proteomics) [1].
For mass spectrometric analysis, the proteins and
proteomes under study are usually first subjected to
proteolytic digestion or chemical cleavage. A large
number of informatics tools have been developed that
help in extracting relevant information from the complex
mass spectrometric data [2]. Most of these programs
match the in silico predicted digest with the
corresponding mass spectrometric data for protein
identification and for mapping protein modifications.
Novel strategies and methodologies in proteomics
call for dedicated programs that further integrate mass
spectrometric analyses with biochemical experiments.
Keywords
cross-linking; data analysis; protein
structure; mass spectrometry; NK1
Correspondence
L. de Jong, Swammerdam Institute for Life
Sciences, Mass spectrometry group,
University of Amsterdam, Nieuwe
Achtergracht 166, Amsterdam, 1018 WV,
the Netherlands
Fax: +31 20 525 6971
Tel: +31 20 525 5691
E-mail:
(Received 23 September 2005, revised 28
October 2005, accepted 7 November 2005)
doi:10.1111/j.1742-4658.2005.05053.x
A versatile software tool, virtualmslab, is presented that can perform
advanced complex virtual proteomic experiments with mass spectrometric
analyses to assist in the characterization of proteins. The virtual experimental
results allow rapid, flexible and convenient exploration of sample preparation
strategies and are used to generate MS reference databases that can be
matched with the real MS data obtained from the equivalent real experiments.
Matches between virtual and acquired data reveal the identity and
nature of reaction products that may lead to characterization of post-trans-
lational modification patterns, disulfide bond structures, and cross-linking
in proteins or protein complexes. The most important unique feature of this
program is the ability to perform multistage experiments in any user-defined
order, thus allowing the researcher to vary experimental approaches that
can be conducted in the laboratory. Several features of virtualmslab are
demonstrated by mapping both disulfide bonds and artificially introduced
protein cross-links. It is shown that chemical cleavage at aspartate residues
in the protease resistant RNase A, followed by tryptic digestion can be opti-
mized so that the rigid protein breaks up into MALDI-MS detectable frag-
ments, leaving the disulfide bonds intact. We also show the mapping of a
number of chemically introduced cross-links in the NK1 domain of hepato-
cyte growth factor ⁄ scatter factor. The virtualmslab program was used to
explore the limitation and potential of mass spectrometry for cross-link
studies of more complex biological assemblies, showing the value of high
performance instruments such as a Fourier transform mass spectrometer.
The program is freely available upon request.
Abbreviations
BS3, bis(sulfosuccinimidyl)suberate; HGF ⁄ SF, hepatocyte growth factor ⁄ scatter factor; NEM, N-ethylmaleimide; RNase A, ribonuclease A;
SAXS, small angle X-ray scattering; TFA, trifluoroacetic acid.
FEBS Journal 273 (2006) 281–291 © 2005 The Authors Journal compilation © 2005 FEBS
To support our protein studies we have developed a
tool, virtualmslab, which allows us to perform a
variety of advanced virtual proteomics experiments
with MS analyses. Besides matching the in silico predicted
reaction products with the corresponding mass
spectrometric data using mass band filtering, the most
important and unique feature of virtualmslab is its
experiment editor, which calculates the results of:
(a) any virtual, complete or partial, protein modification
reaction including cleavages, either simultaneously
or in any desired order; and (b) in and out filtering of
defined reaction products.
Our aim is to establish general, rapid and relatively
simple procedures for the analysis of naturally occur-
ring cross-links, e.g. disulfide bonds, and artificially
introduced cross-links in proteins. Cross-links impose
distance constraints on amino acid residues that can be
used to model the 3-D structure of proteins and pro-
tein complexes [3–7]. Mass spectrometric analysis of
peptides derived from digested cross-linked proteins is
exceptionally suited for the rapid, sensitive and precise
mapping of the cross-links [3,6,8]. To support mass
spectrometric analysis of cross-links in proteins, soft-
ware tools have been developed [9–12] for the identifi-
cation of digest fragments where peptides are linked
together. However, available software suffers from lim-
itations, often preventing general application in cross-
link analysis [6].
Here we demonstrate several features of the virtual-
mslab program for protein cross-link analysis.
As a model system for the analysis of the disulfide
bond structure of a protein we used ribonuclease A
(RNase A), for which the 3-D structure is known in
detail [13] and the disulfide bond structure was
established as early as 1960 [14]. We use the virtualmslab
program to explore a successful experimental
strategy to assess the disulfide bond structure from a
single MALDI-TOF mass spectrum.
In a separate study the Met receptor tyrosine kin-
ase and its ligand, hepatocyte growth factor ⁄ scatter
factor (HGF ⁄ SF) were used to explore and test a
cross-linking strategy with virtualmslab. Signal
transduction via the Met receptor is involved in cell
growth and migration during embryogenesis as well
as in cancer [15] but both the assembly of the
HGF ⁄ SF complex and the basis for receptor activa-
tion remain poorly understood. Insight into the spa-
tial arrangement of the HGF ⁄ SF–Met complex can
be obtained by chemical cross-linking. To examine
the viability of a mass spectrometric analysis of
chemically induced cross-links for this complex, the
experimental strategy has been tested by carrying out
virtual cross-linking with successive MS analysis.
Following the virtual experiments, mapping of cross-
links in the NK1 domain of HGF ⁄ SF treated with
an amine specific cross-linking agent has been
accomplished, based on the virtualmslab assisted
analysis of the MS data from the corresponding un-
fractionated tryptic digest.
Results and discussion
General setup of VIRTUALMSLAB
virtualmslab is used to perform complex virtual
proteomic experiments and integrate these with mass
spectrometric analyses. The virtual experimental results
allow rapid and convenient exploration of proteomics
strategies and are used to generate MS reference databases
that can be matched with the real MS data
obtained from the equivalent real experiments. Matches
between virtual and real data reveal the identity
and nature of reaction products that may lead to the
characterization of post-translational modification pat-
terns, disulfide bond structures, chemical modifications
and cross-linking in protein mixtures, complexes, and
assemblies.
Proteins
A single protein, a list of proteins from a mixture or
from a protein complex or a complete proteome under
study, can be entered or imported (in fasta format)
into the program as amino acid sequences. Amino acid
residues, N- and C-terminal end groups, and modifica-
tions can be custom defined, including specific isotopic
and ⁄ or virtual labelling to keep track of specific amino
acid residues in the analyses. The entered proteins or a
custom selection can be concurrently added to the
experiments.
The virtual experiment
The protein or protein mixture can be subjected to an
experiment including several subsequent and ⁄ or parallel
steps. The individual steps include:
• any customizable chemical and proteolytic cleavage,
with optional specific isotope or virtual element substitution;
• modifications of amino acid residues, partial
sequences and ⁄ or end groups, with optional specific
isotope or virtual element substitution;
• in- and ⁄ or out-filtering of reaction products containing
any combination of specific amino acid residues,
partial sequences and end groups;
• mass band-filtering;
• partial unintentional modifications (such as oxidation,
deamidation, etc.) of specific amino acid residues,
partial sequences and end groups.
Multipass experiment editor
A unique feature of virtualmslab is the ability to
perform the above listed calculations in succession and
in any desired order. Cleavage, modification and filter-
ing may be carried out in different steps, and the
resulting virtual experimental peptide mixture may sug-
gest alternatives for performing the real experiment in
a certain sequence in the laboratory.
For instance, if amines are modified prior to enzy-
matic cleavage, the result is different from a modifi-
cation to amines that has been introduced after
proteolysis; in the first instance cleaved peptides carry
free amino termini, in the second instance these amines
are considered to be modified. Upon running the programmed
experiment, the resulting peptide mixture
database is displayed with various sorting criteria for
inspection.
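The dependence of the outcome on the order of the steps can be illustrated with a minimal Python sketch. This is not the virtualmslab implementation: the function names, the toy sequence, and the convention of marking a modified N-terminus with '*' and a modified lysine with 'k' are assumptions made only for illustration, and cleavage is taken to be complete.

```python
def modify_amines(peptides):
    """Label every free amine: '*' marks a modified N-terminus, 'k' a modified lysine."""
    return ["*" + pep.replace("K", "k") for pep in peptides]


def cleave_trypsin(peptides):
    """Cleave C-terminal to unmodified K or R; a modified lysine ('k') is not cleaved."""
    products = []
    for pep in peptides:
        start = 0
        for i, residue in enumerate(pep):
            # never cut after the last residue, which would give an empty fragment
            if residue in "KR" and i < len(pep) - 1:
                products.append(pep[start:i + 1])
                start = i + 1
        products.append(pep[start:])
    return products


protein = ["GAKLRSKE"]  # toy sequence, not taken from the paper

# Modification first: only the original N-terminus and the lysines are labelled,
# so the peptides released afterwards carry free amino termini.
modify_then_cleave = cleave_trypsin(modify_amines(protein))
print(modify_then_cleave)   # ['*GAkLR', 'SkE']

# Cleavage first: every newly formed N-terminus is labelled as well.
cleave_then_modify = modify_amines(cleave_trypsin(protein))
print(cleave_then_modify)   # ['*GAk', '*LR', '*Sk', '*E']
```

Because each step takes and returns a list of peptides, the steps can be chained in any order, which is the property the experiment editor exploits.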
Mass spectrometric data
To match virtual with real data, mass spectra are
imported into the program as monoisotopic
mass ⁄ intensity lists in ascii format. virtualmslab is
capable of providing reference lists for use in internal
calibrations. Masses (m ⁄ z-values) in any generated
digest or MS ⁄ MS prediction, including those of multi-
ply charged ions, can be double clicked for inclusion
in a reference list that can subsequently be exported in
ascii format. Lists like these serve as input for internal
calibration in many MS software packages. LC ⁄ MS
data can be time-segmented and the segments are indi-
vidually processed, normalized to the base peak in the
segment and imported as a series. Each individually
imported mass spectrum can be activated or deactiva-
ted for adding to a combined spectrum number ⁄
mass ⁄ intensity list, which is used for matching.
Data matching
Data matching can be achieved in matching quests. In
each quest, matching criteria can be defined for search-
ing unmodified and post-translationally modified pep-
tides, peptides with an internal disulfide or chemically
induced cross-link (intrapeptide cross-link products),
and peptide pairs bonded together with a disulfide
bridge or a chemically induced cross-link (interpeptide
cross-link products). Each mass in the combined mass
list is matched within a custom defined mass window
with the masses of the peptides from the virtual experiment,
selected according to the quest criteria. A match
can be performed for a number of quests simultaneously.
For instance, a digest of a disulfide-containing
protein (described in more detail below) can be analysed
for the presence of unmodified peptides (quest 1), peptides
containing an internal disulfide linkage (quest 2),
or peptide pairs connected by a disulfide bond (quest 3).
The resulting output sheet, illustrated in Figure 1, shows
the experimental mass list with the match quest assignments.
For convenient analysis of the assignments the
result table can be sorted on each heading.
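Matching within a custom defined mass window can be sketched as follows. The code is a hedged illustration rather than the program's own routine: the function names, the dictionary layout of the reference list and the example masses are all assumptions.

```python
def ppm_error(observed, theoretical):
    """Relative deviation of an observed mass from a theoretical mass in p.p.m."""
    return (observed - theoretical) / theoretical * 1e6


def match_masses(observed_masses, reference, window_ppm):
    """Return (observed mass, reference label, error in p.p.m.) for every match.

    reference maps a label (for instance a peptide sequence) to a theoretical mass.
    """
    hits = []
    for obs in observed_masses:
        for label, theo in reference.items():
            err = ppm_error(obs, theo)
            if abs(err) <= window_ppm:
                hits.append((obs, label, err))
    return hits


# Invented reference list and mass list, for illustration only
reference = {"peptide A": 1000.0, "peptide B": 2000.0}
hits = match_masses([1000.004, 2000.1], reference, window_ppm=5.0)
for obs, label, err in hits:
    print(f"{obs:.3f} -> {label} ({err:+.1f} p.p.m.)")
```

In this example only the first mass is assigned: it deviates by about 4 p.p.m. from peptide A, whereas the second mass lies about 50 p.p.m. from peptide B and falls outside the 5 p.p.m. window.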
Platform
virtualmslab runs on a Microsoft Visual Basic plat-
form and is freely available from the author LJdK,

Mapping of disulfide bonds with aid of VIRTUALMSLAB
Disulfide bonds in proteins can be mapped by mass
spectrometric identification of the corresponding digest
peptides [16]. For this, efficient cleavage between cys-
teine containing sections of the protein, leaving the
disulfide bridges intact, is essential. However, disulfide
bonded proteins often have a rigid structure rendering
the native protein resistant to cleavage by proteases. In
that case, chemical cleavage may be considered, such
as the use of cyanogen bromide to cleave at methionine
residues, or pH 2 at elevated temperature to cleave
peptide bonds at the C- or N-terminal side of aspartate
residues. RNase A was used as a model protein to
show the development of a procedure with the aid of
virtualmslab for mapping disulfide bonds in a rigid
protease resistant protein [14]. Virtual experiments with
the virtualmslab program showed that MALDI-MS
detectable fragments, with masses ranging from 800
to 4000 atomic mass units, could be generated by initial
specific acid cleavage in front of and behind aspartate
residues [17,18] to break up the rigid protein,
followed by tryptic cleavage, which takes place behind
lysine and arginine residues.
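A two-stage virtual digest with mass band filtering of this kind can be sketched in Python. The sketch is a simplification under stated assumptions and not the authors' code: cleavage is modelled as complete and only at the C-terminal side of aspartate (the text describes cleavage on either side), disulfide bonds are ignored, and the function names are hypothetical. The residue masses are standard monoisotopic values and the sequence is the RNase A sequence shown in Fig. 2.

```python
# Standard monoisotopic residue masses (u)
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406, "N": 114.04293,
    "D": 115.02694, "Q": 128.05858, "K": 128.09496, "E": 129.04259, "M": 131.04049,
    "H": 137.05891, "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056

# RNase A, residues 1-124, written in blocks of ten residues
RNASE_A = "".join([
    "KETAAAKFER", "QHMDSSTSAA", "SSSNYCNQMM", "KSRNLTKDRC", "KPVNTFVHES",
    "LADVQAVCSQ", "KNVACKNGQT", "NCYQSYSTMS", "ITDCRETGSS", "KYPNCAYKTT",
    "QANKHIIVAC", "EGNPYVPVHF", "DASV",
])


def cleave_after(peptides, residues):
    """Complete cleavage C-terminal to any residue in `residues` (simplified rule)."""
    products = []
    for pep in peptides:
        start = 0
        for i, aa in enumerate(pep[:-1]):  # no cut after the last residue
            if aa in residues:
                products.append(pep[start:i + 1])
                start = i + 1
        products.append(pep[start:])
    return products


def peptide_mass(pep):
    """Monoisotopic mass of a linear, unmodified peptide."""
    return sum(MONO[aa] for aa in pep) + WATER


def mass_band(peptides, low, high):
    """Keep only the peptides whose mass lies within [low, high]."""
    return [p for p in peptides if low <= peptide_mass(p) <= high]


# Stage 1: acid cleavage at aspartate; stage 2: trypsin; then the 800-4000 u band
fragments = cleave_after(cleave_after([RNASE_A], "D"), "KR")
detectable = mass_band(fragments, 800.0, 4000.0)
print(len(fragments), "fragments,", len(detectable), "within 800-4000 u")
```

Swapping the two `cleave_after` calls, or inserting a modification step between them, changes the fragment list without any change to the individual functions.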
Experimentally, RNase A was cleaved by treatment
at pH 2, followed by trypsin digestion and mass analy-
sis of the resulting peptide mixture. Based on a single
MALDI-FTICR mass spectrum, 42 fragments were
assigned by virtualmslab within a mass window of
4 p.p.m., corresponding to a sequence coverage of
> 90%. Figure 1 shows part of the output sheet for
the assignment over three quests. The first quest
matches all unmodified peptide masses (specified by
the question mark) to the experimental masses. The
second quest matches the combined masses of all pairs
of peptides, each containing at least one cysteine minus
the mass of two hydrogen atoms, assigning the disul-
fide linked peptides. The third quest matches the mass
of all peptides containing cysteine minus the mass of
two hydrogen atoms, assigning the peptides with an
internal disulfide link.
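The mass arithmetic behind the second and third quests can be written out as a short sketch. It assumes a dictionary of reduced peptide masses as input and requires two cysteines for an internal bridge; both choices, like the function name and the toy peptides, are assumptions for illustration and not a description of the program's internals.

```python
from itertools import combinations

H_ATOM = 1.007825  # monoisotopic mass of hydrogen (u)


def disulfide_quests(peptide_masses):
    """Build reference masses for disulfide-linked species.

    peptide_masses maps a sequence to the monoisotopic mass of the reduced peptide.
    Returns (interpeptide, intrapeptide) dictionaries of theoretical masses.
    """
    cys = {seq: m for seq, m in peptide_masses.items() if "C" in seq}
    # Two cysteine-containing peptides joined by one bridge: M1 + M2 - 2 H
    interpeptide = {
        (a, b): cys[a] + cys[b] - 2 * H_ATOM for a, b in combinations(cys, 2)
    }
    # One peptide with an internal bridge: M - 2 H (needs two cysteines)
    intrapeptide = {
        seq: m - 2 * H_ATOM for seq, m in cys.items() if seq.count("C") >= 2
    }
    return interpeptide, intrapeptide


# Toy peptides with masses computed from standard residue masses
toy = {"GCG": 235.06267, "ACA": 263.09397, "GG": 132.05348, "CGC": 281.05040}
inter, intra = disulfide_quests(toy)
print(len(inter), "candidate pairs,", len(intra), "internal candidate")
```

Either dictionary can then serve as the reference list for a quest matched against the experimental mass list.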
From the assignments, a peptide map was constructed
as shown in Fig. 2. Due to partial cleavage at
both D and R ⁄ K, many overlapping peptides were
observed. About 80% of all peaks in the MALDI-
FTICR mass spectrum with intensity above 5% of the
base peak could be assigned, assuming cleavage at D
or K ⁄ R. This demonstrates the high specificity of
chemical cleavage at aspartate residues. We were
aware of the possible occurrence of deamidations of
asparagines and subsequent partial cleavage at the
resulting aspartate residues. However, virtualmslab
analysis allowing partial modification of N to D fol-
lowed by partial cleavage on the newly formed D resi-
dues, showed no matches for the resulting peptides.
This indicates the absence of severe deamidations
under our experimental conditions. Of the 42 assigned
fragments, a total of 23 were unambiguously attrib-
uted to peptides with a correct disulfide bridge, consid-
ering four disulfide linkages in RNase A. Of the 23
disulfide-containing fragments, three were assigned to
the C26–C84 linkage, 12 to C40–C95, four to C58–
C110, and four to C65–C72. Several disulfide-linked
peptides were also present as free SH-containing pep-
tides, indicating partial in-source reduction of disul-
fides [19]. It should be noted that this phenomenon
enables assignment of pairs of in-source cleavage prod-
ucts to corresponding disulfide linked peptides, the
sum of the masses of the cleavage products, due to
incorporation of two H atoms, being 2 atomic mass
units more than the mass of the parent compounds.
This information can be used to confirm the results of
the virtualmslab analysis.
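The confirmation step described above can be expressed as a small search over a list of neutral masses. The sketch below is only an illustration of the stated mass relation (fragment 1 + fragment 2 = parent + 2 H); the function name, the tolerance handling and the example masses are assumptions.

```python
from itertools import combinations

H_ATOM = 1.007825  # monoisotopic mass of hydrogen (u)


def in_source_pairs(masses, window_ppm):
    """Find (parent, fragment 1, fragment 2) triples consistent with in-source reduction.

    The two free SH-containing fragments together weigh 2 H more than the
    disulfide-linked parent. Masses are taken to be neutral monoisotopic masses.
    """
    found = []
    for parent in masses:
        expected = parent + 2 * H_ATOM
        for a, b in combinations(masses, 2):
            if abs(a + b - expected) / expected * 1e6 <= window_ppm:
                found.append((parent, a, b))
    return found


# Invented mass list: 400.0 and 602.01565 add up to 1000.0 plus two hydrogens
triples = in_source_pairs([1000.0, 400.0, 602.01565, 700.0], window_ppm=5.0)
print(triples)
```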

Despite the overwhelming evidence for the correct
disulfide linkages, three minor peaks were assigned by
virtualmslab to peptides with conflicting disulfide
linkages; two of these correspond to a peptide with
Fig. 1. Part of the output sheet of the VIRTUALMSLAB analysis of the MALDI-FTICR-MS data of the RNase A digest peptide mixture. The first
column lists the mass spectrum with spectrum number. The second column lists the numbers of the matches corresponding to the quest
numbers on the VIRTUALMSLAB console shown in the inset. Column 3 lists the theoretical masses of the assignments with the match error in
p.p.m. Columns 4 and 5 list the peptide assignments with the precursor proteins (in this experiment this is only RNase A), the peptide position
in the protein and the residue sequence.
an internal C40–C58 linkage, the third corresponds to
a peptide with an internal C84–C95 linkage. These
species can conceivably be naturally occurring disul-
fide-bridge variants, or can be the result of disulfide
interchange reactions during the experiment. Disulfide
interchanges can in principle be catalysed by free
thiols at neutral or high pH. If this were the case, a
thiol scavenger should be able to prevent disulfide
interchange. To investigate this possibility we added
N-ethylmaleimide (NEM) to the acid-cleaved RN-
ase A preparation before the start of the digestion at
pH 8.0 by trypsin. It should be noted that at this pH
NEM not only reacts with SH groups, but to a lesser
extent also with amines. The presence of NEM during
trypsin digestion therefore results in complex peptide
mixtures, due to partial modification at the amino
terminus and at lysine residues, and because modification
at lysine residues prevents cleavage by trypsin.
Accordingly, analysis using virtualmslab including
modifications with NEM in the match quests results
in the assignment of no fewer than 84 peptides. Of
these, 31 represent free SH-containing peptides, as the
result of in-source decay, and 53 are correct disulfide-
linked species. No unambiguous evidence was found
for peptides with internal C40–C58, C84–C95 or any
other conflicting linkages under these conditions, indi-
cating that their minor presence in the absence of
NEM must have been the result of disulfide interchange
reactions. A possible explanation is the phenomenon
of β-elimination [20], occurring under the
alkaline conditions during trypsin digestion, creating
the necessary catalyst for the interchange reaction.
Even trace amounts of free sulfhydryl groups can
trigger a cascade of reshuffling of disulfide-linked
peptides, which may explain the minor formation of
the detected peptides with an internal C40–C58 or
C84–C95 disulfide bond. Ambiguities caused by these
interchange reactions can be resolved by adding
NEM before and during digestion.
In conclusion, it appears that well-controlled acidic
cleavage followed by tryptic digestion effectively breaks
up the rigid RNase A molecule into MALDI-MS
detectable fragments, leaving the vulnerable disulfide
bonds intact. The virtualmslab analysis of the data
from a single MALDI-mass spectrum acquired with a
high performance FTICR mass spectrometer unambig-
uously reveals the origin of all disulfide bonds.
Identification of cross-links in the NK1 domain of HGF/SF
HGF ⁄ SF and its receptor Met stimulate cell growth, cell
differentiation and migration during embryogenesis. In
[Fig. 2 graphic. RNase A sequence, residues 1–124, N- to C-terminus:
KETAAAKFER QHMDSSTSAA SSSNYCNQMM KSRNLTKDRC KPVNTFVHES LADVQAVCSQ
KNVACKNGQT NCYQSYSTMS ITDCRETGSS KYPNCAYKTT QANKHIIVAC EGNPYVPVHF DASV]
Fig. 2. Peptide map constructed from the VIRTUALMSLAB assign-
ments of the MALDI-FTICR-MS data of the RNase A digest peptide
mixture. The first column shows the sequence with the four well
established disulfide links. The second column shows the peptides
resulting from the in-source MALDI reduction of S–S-linked pep-
tides. Column 3 shows the linked peptides, clearly confirming all
four established disulfide links. Column 4 shows the peptides asso-
ciated with conflicting internal disulfide bridges.
cancer they promote invasive growth in surrounding
tissues and metastasis of the tumour. Both proteins
are produced as inactive singular proenzymes, which
upon cleavage form an active disulfide-linked α ⁄ β
heterodimer. Several individual domains of both Met
and HGF ⁄ SF have been elucidated, but the 3-D structures
of the full-length proteins are not yet resolved. The
NK1 domain of the α-chain of HGF ⁄ SF is found to be
the main interaction site with Met, while the β-chain
might make additional interactions. To obtain a model
of the interaction of Met and HGF ⁄ SF, the complex has
been subjected to solution phase small angle X-ray
scattering (SAXS) (Gherardi, E., Sandin, S., Petoukhov,
M. V., Finch, J., Öfverstedt, L.-G., Nunez, R., Blundell,
T. L., Vande Woude, G. F., Skoglund, U. & Svergun,
D. I., unpublished data). Experimentally determined
constraints on the distances of amino acid residues
should be helpful to either discard or confirm the solu-
tions obtained by SAXS. Identification of the sites of
artificially induced cross-links can provide such distance
constraints and, with these constraints, a detailed model
of the interaction between the two proteins can be
designed, based on SAXS data and the known 3-D
structures of single protein domains. Such a model will
be of great value both to understand how HGF ⁄ SF
interaction with Met leads to receptor dimerization
and signal transduction and to design Met inhibitors as
anticancer drugs [15].
Mass spectrometric analysis of digests of cross-linked
proteins is known to be a powerful way to identify sites
of cross-linking [3,6,8]. However, the identification of
cross-linked sites in biological assemblies as complicated
as the HGF ⁄ SF-Met complex is unprecedented. We
use the amine-specific homobifunctional cross-linker
bis(sulfosuccinimidyl)suberate (BS3). Besides reaction
with amines, the activated ester is also susceptible to
hydrolysis, which may lead to single labelling, i.e. modification
of amines without actual cross-linking. Clearly,
the above analyses of the naturally occurring disulfide
linkages in RNase A must be taken a step further for
this complex, which is built from four peptide chains
distributed over two disulfide-linked αβ heterodimers. The complex
has more than 1600 amino acid residues adding up to a
mass of over 180 kDa, and it has 98 lysine residues
which can be heterogeneously cross-linked or singly
labelled by the cross-linking reagent.
To anticipate limitations of a mass spectrometric
analysis of this complicated system, analysis has first
been completed with the virtualmslab program. The
above HGF ⁄ SF-Met complex was subjected to reduction
and alkylation of cysteine residues by iodoacetamide,
followed by digestion with trypsin, allowing
a maximum of three miscleavages. Mass filtering
between 200 and 4500 Da resulted in a digest mixture
of 534 peptides. From this, a database was generated
of all possible realistic peptide pairs linked together
with BS3 via their lysine residues, excluding lysines
cleaved by trypsin. From this database of 16 554 BS3-linked
peptide pairs, the mass list was extracted and
taken as our virtual mass spectrum. Figure 3 shows
the mass distribution of cross-linked peptide pairs,
illustrating that most of the cross-linked peptide pairs
have masses > 3000 Da. Each entry of this mass spectrum
was then matched against the complete theoretical
set of peptides, including unmodified peptides,
peptides that are modified by a partially hydrolysed
cross-linker, intrapeptide cross-linking products and
interpeptide cross-linking products. The match presents
all peptide candidates for assignment of each mass in
the spectrum as a function of the match mass window.
It shows how many alternative peptide candidate
assignments can be anticipated if the experimental
mass spectrum is searched for cross-linked peptides at
a specific instrumental mass accuracy. In Fig. 4 the
results are summarized as the average number of peptide
candidates for all 16 554 masses in the virtual
spectrum, segmented in four mass ranges, vs. the mass
window. As expected, the number of candidates comes
down to almost 1 for all mass ranges if the mass window
is narrowed to 0 p.p.m. Still, for about 15% of
the mass entries an alternative candidate, beside the
Fig. 3. Calculated mass distribution of the BS3 cross-linked peptide
pairs in the tryptic digest of the HGF ⁄ SF Met protein complex
allowing cross-links between all lysine residues. [Graphic not reproduced;
x-axis: Mass (Dalton), 700–4700; y-axis scale: 0–1000.]
authentic cross-linked peptide pair is given. Most of
these alternatives are due to shifted tryptic cleavage
places for the peptides with RK, RR, KK and KR
sequence elements which will yield peptides with identi-
cal elemental composition. Nevertheless, these alter-
native assignments will pinpoint the same cross-link.
When the mass window is zoomed out, the number of
candidate peptides gradually increases to 2.5 for the
low mass segment of m 1000–2000 Da. This number
appears to level off if the window becomes broader
than ±60 p.p.m. The gradual increase indicates that
for this mass range the density of the candidate pep-
tide masses is relatively low. The levelling-off points
out that the distribution of the alternative candidate
peptide masses around the mass of the authentic cross-
linked peptide pair is about ±60 p.p.m. wide. This
limited width is a consequence of the known discon-
tinuous mass distribution of peptides [10]. For compar-
ison, the gap between m ⁄ z 2000 and m ⁄ z 2001 is
500 p.p.m. For the highest mass segment of m 4000–
5000 Da, which covers most of the cross-linked peptides
(see Fig. 3), the number of peptide candidates rapidly
increases with increasing detection mass window,
while this number only begins to level off to about 10
outside ±90 p.p.m. This indicates that, for the higher
mass range, the density of candidate peptides masses is
much higher and the mass distribution width has
increased to over ±90 p.p.m. For comparison, the gap
between m ⁄ z 5000 and m ⁄ z 5001 is 200 p.p.m.
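Both the p.p.m. comparison and the candidate count as a function of the mass window can be reproduced with a few lines of Python. The sketch is illustrative only: the function names and the short reference list are assumptions, and a sorted list with binary search is simply one convenient way to count masses inside a window.

```python
from bisect import bisect_left, bisect_right


def ppm_gap(mz, delta=1.0):
    """Width of a mass difference `delta` at `mz`, expressed in p.p.m."""
    return delta / mz * 1e6


def count_candidates(sorted_masses, target, window_ppm):
    """Number of reference masses within +/- window_ppm of the target mass."""
    tol = target * window_ppm * 1e-6
    lo = bisect_left(sorted_masses, target - tol)
    hi = bisect_right(sorted_masses, target + tol)
    return hi - lo


def average_candidates(targets, all_masses, window_ppm):
    """Average number of candidates per target mass for a given window."""
    ref = sorted(all_masses)
    return sum(count_candidates(ref, t, window_ppm) for t in targets) / len(targets)


print(round(ppm_gap(2000.0)), round(ppm_gap(5000.0)))  # 500 200

# Invented reference masses clustered around 1000 u
reference = [1000.0, 1000.01, 1000.05, 1001.0]
for window in (20.0, 60.0):
    print(window, count_candidates(reference, 1000.0, window))
```

With the invented list, widening the window from 20 to 60 p.p.m. raises the count from two to three candidates, which is the kind of trend summarized per mass range in Fig. 4.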
The above virtual analysis reveals that instrumental
mass accuracy is crucial. For mass accuracies better
than 2 p.p.m., such as can be obtained with a high
performance Fourier transform mass spectrometer,
most of the identifications can be based on accurate
mass with additional tandem mass spectrometric validation.
For mass accuracies better than 20 p.p.m. the
identification is filtered to three or four possible candidates
(see Fig. 4). This moderate number of alternative
candidates should still allow unambiguous identifica-
tion based on additional tandem mass spectrometric
validation. It thus appears that a cross-linking
approach to obtain structural information about an
assembly as complicated as the HGF ⁄ SF-Met complex
is feasible, especially with adequate fractionation of
the peptide mixture, e.g. by reversed phase HPLC.
To experimentally test this finding, we have carried
out the mass spectrometric analysis of a cross-linked
peptide mixture with at least the same complexity
as a reversed phase HPLC fraction of a peptide
mixture derived from cross-linked HGF ⁄ SF-Met. We
chose the NK1 domain of HGF ⁄ SF as the test protein
for these experiments. The size of NK1, with 183 residues
adding up to almost 22 kDa, is roughly one-tenth
of that of the entire HGF ⁄ SF complex and therefore
of similar complexity as an average reversed phase
HPLC fraction from the complex, assuming sorting of
the peptides in at least 10 fractions. Moreover, a 3-D
structure of the NK1 domain is available, so that
cross-link identification can be validated. BS3 was used
to covalently cross-link amines within the NK1 subunit.
Cross-linked and control preparations were subjected
to SDS ⁄ PAGE. Subsequently, protein bands
corresponding to the monomeric NK1 were treated
with trypsin and the resulting peptide mixtures were
mass analysed. The processed MS data were loaded
into the virtualmslab program and matched with the
corresponding virtual experimental results. A total of
13 peaks in the MALDI-TOF mass spectrum of the
cross-linked NK1 digest could be related to cross-link-
ing products. Some of these peaks were matched with
one or two alternative assignments within a mass win-
dow of ±30 p.p.m. corresponding to the mass accu-
racy of our MALDI-TOF instrument. As anticipated,
the relatively limited average number of possible pep-
tide assignments found for the cross-linked NK1 is
smaller than the average number of three candidate
assignments found by the virtualmslab program for
the entire HGF ⁄ SF-Met complex in a mass window of
±30 p.p.m. (Fig. 4).
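The virtual reference list of BS3-modified peptides used in such a match can be sketched as follows. The mass increments are the standard values for a suberate bridge (C8H10O2) and for a hydrolysed, singly attached linker (C8H12O3); they are not quoted in the text. The function names and the example masses are hypothetical, only lysine side chains are considered, and a lysine at the C-terminus of a tryptic peptide is treated as unmodified because a modified lysine would not have been cleaved.

```python
from itertools import combinations_with_replacement

BS3_BRIDGE = 138.06808    # C8H10O2 added when the spacer bridges two amines (u)
BS3_DEAD_END = 156.07864  # C8H12O3 added when one end of the linker is hydrolysed (u)


def has_linkable_lysine(seq):
    """True if the peptide contains a lysine that is not its C-terminal residue."""
    return "K" in seq[:-1]


def crosslink_reference(peptide_masses):
    """Return (interpeptide pairs, singly labelled peptides) with theoretical masses.

    peptide_masses maps a sequence to the monoisotopic mass of the unmodified peptide.
    """
    linkable = [seq for seq in peptide_masses if has_linkable_lysine(seq)]
    pairs = {
        (a, b): peptide_masses[a] + peptide_masses[b] + BS3_BRIDGE
        for a, b in combinations_with_replacement(linkable, 2)
    }
    dead_ends = {seq: peptide_masses[seq] + BS3_DEAD_END for seq in linkable}
    return pairs, dead_ends


# Invented sequences and round-number masses, for illustration only
toy = {"AKGR": 400.0, "GKAK": 500.0, "GGR": 300.0}
pairs, dead_ends = crosslink_reference(toy)
print(len(pairs), "pairs,", len(dead_ends), "singly labelled peptides")
```

Pairing with replacement also allows a peptide to be linked to a second copy of itself; intrapeptide cross-links are left out of this sketch.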
Based on the peptide assignments, a list of candi-
date cross-links is given in Table 1. Four of these
candidate cross-links have been confirmed by tandem

mass spectrometric analyses of the corresponding
cross-linked peptides using either ESI-QTOF or
Fig. 4. Calculated average number of peptide candidates within a
mass window at different mass ranges in a tryptic digest mixture
of BS3 cross-linked HGF ⁄ SF Met protein complex (for details see
text). [Graphic not reproduced; x-axis: Mass accuracy (ppm), 0–100;
y-axis: Average number of peptide candidates, 0–10; series: 1000–2000,
2000–3000, 3000–4000 and 4000–5000 Dalton.]
MALDI-TOF ⁄ TOF (Fig. 5). These validated cross-links
were fit into an available crystal structure of the
protein (PDB: 1BHT) [21]. It was found that the
measured distances between amino groups are compatible
with the calculated distance of 11.4 Å, which
can be spanned by the BS3 cross-linker (Fig. 6).
Another candidate cross-linked peptide pair connecting
K44 and K91 was assigned by virtualmslab.
Tandem MS data allowed neither confirmation nor
rejection of the assignment, still leaving open the possibility
that it corresponds to an unknown species.
However, this candidate cross-link also fits nicely into
the 3-D structure of NK1 (Fig. 6).
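Checking a candidate cross-link against a structure amounts to comparing an interatomic distance with the span of the linker. A minimal sketch, assuming that the coordinates of the two amine nitrogens have already been read from the structure file; the function name is hypothetical and the coordinates below are placeholders, not values from PDB entry 1BHT.

```python
import math

MAX_SPAN = 11.4  # Å, the span of the BS3 cross-linker quoted in the text


def within_span(amine_a, amine_b, max_span=MAX_SPAN):
    """True if two amine positions (x, y, z in Å) lie within the linker span."""
    return math.dist(amine_a, amine_b) <= max_span


# Placeholder coordinates for illustration only
close_pair = within_span((0.0, 0.0, 0.0), (3.0, 4.0, 0.0))    # 5 Å apart
distant_pair = within_span((0.0, 0.0, 0.0), (12.0, 0.0, 0.0))  # 12 Å apart
print(close_pair, distant_pair)
```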
The candidate cross-links in Table 1 suggest cross-
linking between the N-terminal part of the protein
[Y28 (N-terminus) and K34] with the region including
K132, K137 and K170, which are close together.
However, the first seven residues of the protein N-ter-
minal region, specified as amino acids 28–34, are not
resolved in the crystal structure and links to their
amine groups cannot be drawn. This can be explained
by assuming flexibility of the seven N-terminal resi-
dues that might localize preferentially into this region.
Alternatively, we may assume that K132, K137 and
K170 have a relatively high reactivity towards the
cross-linking agent, enabling them to trap the flexible
amino terminus.
The results imply that a single MALDI-TOF mass
spectrum with moderate mass accuracy of an unfractionated
proteolytic digest of a cross-linked protein can
disclose significant information on the protein struc-
ture. This opens new avenues in the computer assisted
analysis of more complex biological assemblies, by
combining advanced peptide separation techniques
Table 1. Candidate cross-links found in NK1 using BS
3
as a cross-
linking agent. The cross-link candidates are nominated by the
VIRTUALMSLAB program by assigning peaks in the MALDI-TOF mass
spectrum of the tryptic digest of cross-linked NK1 to the corres-
ponding cross-linked peptides. Residue Y28 is the N-terminal resi-
due in the construct used.
Residue 1 Residue 2
Assigned
peaks (m ⁄ z)
Experimental mass
discrepancy (p.p.m)
28 34 1191.65
d
)13
28 34 1347.75
c
+14
28 132 1805.93
c
+4
28 137 1972.04
b

)8
28 170 2171.02
b
)18
34 47 1301.82
c
)6
34 132 1357.77
b
+1
34 137 1523.88
b
)34
44 47 1127.70
a
+3
58 60 1264.79
a
)47
44 91 1523.56
b
)20
132 170 2337.14
a
)34
137 170 2503.25
a,e
)15
a
Identification of the corresponding assigned cross-linked peptide

has been confirmed by tandem MS.
b
Assigned cross-linked peptide
shows no alternative noncross-linked peptide assignments.
c
As-
signed cross-linked peptide shows one alternative noncross-linked
peptide assignments.
d
Assigned cross-linked peptide shows two
alternative noncross-linked peptide assignments.
e
MS ⁄ MS data is
shown in Fig. 5.
A
B
Fig. 5. MALDI-TOF ⁄ TOF MS ⁄ MS analysis
of a NK1 cross-linked peptide with m ⁄ z
2503.3. NK1 K137 is linked to NK1 K170
(see Table 1). (A) Structures of the cross-
linked peptide. Observed fragment ions are
indicated. (B) MALDI-TOF ⁄ TOF MS ⁄ MS
data: fragment ion annotations correspond
to the annotations in A.
Computer-assisted mass spectrometric analysis of cross-links L. J. de Koning et al.
288 FEBS Journal 273 (2006) 281–291 ª 2005 The Authors Journal compilation ª 2005 FEBS
with mass analysis, and by taking advance of the high
mass accuracy of FTICR-MS.
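The peak assignments in Table 1 rest on two simple calculations: the theoretical mass of a cross-linked peptide pair and its relative deviation from the observed peak. The following Python sketch shows both under stated assumptions. It uses standard monoisotopic residue masses and the commonly used net mass increment of 138.068 Da for one BS3 bridge (C8H10O2). The peptide sequences and the observed peak are hypothetical and are not NK1 data, and the code is not the VIRTUALMSLAB implementation.

```python
# Monoisotopic residue masses in Da (only the residues needed for this example).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "V": 99.06841,
    "L": 113.08406, "K": 128.09496, "E": 129.04259, "R": 156.10111,
}
WATER = 18.01056        # H2O added to the residue sum of a free peptide
PROTON = 1.00728        # mass of the charge-carrying proton
BS3_BRIDGE = 138.06808  # net mass added when BS3 links two amines (C8H10O2)


def peptide_mass(sequence):
    """Neutral monoisotopic mass of an unmodified linear peptide."""
    return sum(RESIDUE_MASS[residue] for residue in sequence) + WATER


def crosslinked_mh(seq_a, seq_b):
    """Singly protonated mass of two peptides joined by one BS3 bridge."""
    return peptide_mass(seq_a) + peptide_mass(seq_b) + BS3_BRIDGE + PROTON


def ppm_error(observed, theoretical):
    """Relative mass discrepancy in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


# Hypothetical lysine-containing peptides, not taken from the NK1 sequence.
theoretical = crosslinked_mh("GKAR", "SKLEVR")
observed = 1299.790  # hypothetical peak position
print(f"theoretical [M+H]+ = {theoretical:.3f}")
print(f"discrepancy = {ppm_error(observed, theoretical):+.0f} p.p.m.")
```

A candidate is retained when the absolute discrepancy is smaller than the mass window chosen for the instrument, which is why higher mass accuracy reduces the number of competing assignments.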
In conclusion, it appears that advanced mass spectrometric studies on proteins can be promoted significantly by software tools, such as the VIRTUALMSLAB program, that can merge and tune mass spectrometric analysis with biochemical experiments. In contrast to other available software such as ASAP [10], MS2Assign [9] and SearchXLinks, the unique multistage experiment editor in our program is a convenient tool to predict and optimize possible outcomes beforehand, which saves time in finding successful experimental strategies. ASAP and SearchXLinks [11] have the order of events hard-coded into the program and do not allow for multipass experiments. MS2Assign has the unique ability to handle MS/MS data, which the other programs, including VIRTUALMSLAB, cannot. VIRTUALMSLAB also allows a large number of candidate proteins to be input in a single analysis. The recently described program CLPM [12] is flawed in the sense that it nominates only the match with the least mass deviation for a given observed mass, thus bypassing critical assessment and verification.
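The difference between nominating only the closest match and listing every candidate inside the mass window can be made concrete with a short Python sketch. The candidate labels and masses are hypothetical, and the functions are illustrative only; they do not reproduce the code of any of the programs discussed.

```python
def ppm_error(observed, theoretical):
    """Relative mass discrepancy in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def candidates_in_window(observed, candidates, tol_ppm):
    """All (label, mass, ppm) candidates within tol_ppm, closest first."""
    hits = [(label, mass, ppm_error(observed, mass)) for label, mass in candidates]
    hits = [hit for hit in hits if abs(hit[2]) <= tol_ppm]
    return sorted(hits, key=lambda hit: abs(hit[2]))


def best_only(observed, candidates, tol_ppm):
    """Only the closest candidate, or None if nothing lies within the window."""
    hits = candidates_in_window(observed, candidates, tol_ppm)
    return hits[0] if hits else None


# Hypothetical candidate list for one observed peak.
candidates = [
    ("cross-link A-B", 1500.800),
    ("linear peptide C", 1500.815),
    ("cross-link D-E", 1500.900),
]
observed = 1500.810

all_hits = candidates_in_window(observed, candidates, tol_ppm=50)
for label, mass, ppm in all_hits:
    print(f"{label}: {mass:.3f} ({ppm:+.1f} p.p.m.)")
print("closest only:", best_only(observed, candidates, tol_ppm=50)[0])
```

In this example the closest match is a linear peptide, while a cross-linked candidate lies only slightly further away. Reporting the full list keeps such alternatives visible for verification, for instance by tandem MS.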
The potential of our software program has been shown for the cross-link studies presented in this paper. However, its applications can be extended to other studies, including studies comprising entire cellular proteomes.
Experimental procedures
Materials
N-ethylmaleimide, HCl, and the gradient-grade solvents acetonitrile, ethanol and water were from Merck (Darmstadt, Germany). The cross-linking agent BS3 was from Pierce (Rockford, IL, USA). Ribonuclease A and lysozyme were from Sigma-Aldrich Chemie GmbH (Steinheim, Germany). Trypsin (sequencing grade) was from Roche Diagnostics GmbH (Mannheim, Germany).
The NK1 fragment of HGF/SF was expressed in the yeast Pichia pastoris and purified from culture supernatants [22].
Cross-linking
Protein cross-linking was carried out with BS3 by incubating the protein at a concentration of 0.5 mg·mL−1 (23 µM) in 50 mM Na-phosphate buffer, 150 mM NaCl, pH 7.4, with 1 mM cross-linker, for 30 min at room temperature. Cross-link spacer distances were approximated as described by Green et al. [23].
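As a consistency check on the concentrations quoted above, the following Python sketch derives the molar mass implied by 0.5 mg·mL−1 and 23 µM, and the molar excess of cross-linker over protein. The function names are assumptions introduced here, and the derived values are plain arithmetic on the stated numbers rather than independently reported quantities.

```python
def implied_molar_mass(conc_mg_per_ml, conc_micromolar):
    """Molar mass (g/mol) implied by a mass and a molar concentration."""
    grams_per_litre = conc_mg_per_ml  # 1 mg/mL equals 1 g/L
    mol_per_litre = conc_micromolar * 1e-6
    return grams_per_litre / mol_per_litre


def molar_excess(reagent_millimolar, protein_micromolar):
    """Fold molar excess of reagent over protein."""
    return reagent_millimolar * 1000.0 / protein_micromolar


print(round(implied_molar_mass(0.5, 23)))  # 21739
print(round(molar_excess(1, 23), 1))       # 43.5
```

The two concentrations therefore correspond to a protein of roughly 22 kDa exposed to an approximately 43-fold molar excess of cross-linker.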
Preparation of peptides
For cleavage at aspartate [18], proteins (0.1 mg·mL−1) were dissolved in 0.013 M HCl (pH 2) and incubated in a closed plastic Eppendorf vial in an oven at 108 °C for 2 h. Digestion by trypsin, both in the presence and absence of 10 mM NEM, was carried out in 100 mM NH4HCO3 at 37 °C for 4 h using a protease : substrate ratio of 1 : 50 (w/w). In-gel digestion by trypsin of Coomassie-stained protein bands was carried out according to published procedures [24]. Peptide mixtures were desalted and concentrated with ZipTip µC18 pipette tips (Millipore Corporation, Billerica, MA, USA), washed with 0.1% (v/v) trifluoroacetic acid (TFA) or 1% (v/v) formic acid solution and eluted with a solution containing 50% (v/v) acetonitrile and 0.1% (v/v) TFA or 1% (v/v) formic acid.
Fig. 6. Space-filling model of the NK1 domain of HGF/SF (1BHT). Four confirmed (solid lines) and one candidate cross-link (dashed line) are shown in this model. Measured distances between the linked amino acids are indicated. The different angles between the two views A and B are indicated by the arrows. The model was visualized using PYMOL ().
Mass spectrometry
MALDI-MS analyses were performed with a TofSpec 2EC mass spectrometer (Micromass, Wythenshawe, UK) in the reflectron mode. Peptides were mixed in a 1 : 1 ratio (v/v) with a 10 mg·mL−1 matrix (α-cyano-4-hydroxycinnamic acid) solution in a 50 : 50 (v/v) ethanol/acetonitrile mixture. For analyses, 0.5 µL of the mixture was spotted on a MALDI steel target plate and allowed to dry. MALDI ultra-high-resolution accurate mass analysis was performed with a 7T ApexQ FTICR-MS instrument (Bruker Daltonics, Bremen, Germany). For the analyses, an aliquot of 0.5 µL peptide mixture was mixed with a 10 mg·mL−1 dihydroxybenzoic acid solution containing 0.1% (v/v) TFA in a 30 : 70 (v/v) acetonitrile/water mixture, spotted onto a Bruker Daltonics AnchorChip™, and allowed to dry. MALDI MS/MS analyses were performed with a TOF/TOF 4700 Proteomics Analyser (Applied Biosystems, Framingham, MA, USA). The sample (0.5 µL) was cocrystallized with an equal volume of matrix solution (7 mg·mL−1 α-cyano-4-hydroxycinnamic acid dissolved in 50% v/v acetonitrile/0.1% TFA in water) and applied to the target. Prior to analysis, the instrument was externally mass calibrated with a standard peptide mixture, as outlined by the manufacturer. Electrospray ionization MS and MS/MS analyses were performed with a QTOF mass spectrometer (Micromass). Peptide mixtures were directly infused from gold-plated nanospray tips (New Objective, Woburn, MA, USA) into the ESI-QTOF. Selected ions were collided with argon in the hexapole collision cell, at a pressure of 4 × 10−5 mbar measured on the quadrupole Penning gauge. Recorded spectra were internally mass calibrated on signals from trypsin autodigestion fragments and unambiguously identified digest fragments from the proteins studied. Mass spectra were deconvoluted to lists of monoisotopic masses, which were analysed using the VIRTUALMSLAB program. The suggested nomenclature [9,25] for fragment ions from cross-linked peptides has been used.
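The text does not state how the spectra were deconvoluted. As a minimal illustration of one step in such a procedure, the Python sketch below converts the monoisotopic m/z of a protonated ion into a neutral monoisotopic mass, inferring the charge from the spacing between the first two isotope peaks. The function names and the example peak positions are assumptions introduced here.

```python
PROTON = 1.007276           # Da, mass of one proton
ISOTOPE_SPACING = 1.003355  # Da, mass difference between 13C and 12C


def charge_from_spacing(mz_mono, mz_next):
    """Charge state inferred from the spacing of adjacent isotope peaks."""
    return round(ISOTOPE_SPACING / (mz_next - mz_mono))


def neutral_mass(mz, charge):
    """Neutral monoisotopic mass of an [M + zH]z+ ion."""
    return charge * (mz - PROTON)


# Hypothetical isotope peaks of a doubly charged electrospray ion.
mz_mono, mz_next = 650.8900, 651.3917
z = charge_from_spacing(mz_mono, mz_next)
print(z, round(neutral_mass(mz_mono, z), 4))
```

For MALDI spectra, where peptide ions are predominantly singly charged, the conversion reduces to subtracting one proton mass from the monoisotopic peak.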
Acknowledgements
This work was supported by grants from the Netherlands Organization for Scientific Research (NWO), Chemical Sciences division (CW), and Regieorgaan Genomics. The ApexQ FTICR mass spectrometer was largely funded by NWO-CW, and the TofSpec 2EC and QTOF by NWO, Medical Sciences division.
References
1 Aebersold R & Mann M (2003) Mass spectrometry-based proteomics. Nature 422, 198–207.
2 Cristoni S & Bernardi LR (2004) Bioinformatics in mass spectrometry data analysis for proteomics studies. Exp Rev Proteom 1, 469–483.
3 Back JW, de Jong L, Muijsers AO & de Koster CG (2003) Chemical cross-linking and mass spectrometry for protein structural modeling. J Mol Biol 331, 303–313.
4 Geisler N, Schunemann J & Weber K (1992) Chemical cross-linking indicates a staggered and antiparallel protofilament of desmin intermediate filaments and characterizes one higher-level complex between protofilaments. Eur J Biochem 206, 841–852.
5 Hermanson GT (1996) Bioconjugate Techniques. Academic Press, San Diego, USA.
6 Sinz A (2003) Chemical cross-linking and mass spectrometry for mapping three-dimensional structures of proteins and protein complexes. J Mass Spectrom 38, 1225–1237.
7 Wong SS (1991) Chemistry of Protein Conjugation and Cross-Linking. CRC Press, Boca Raton, USA.
8 Borch J, Jorgensen TJ & Roepstorff P (2005) Mass spectrometric analysis of protein interactions. Curr Opin Chem Biol 9, 509–516.
9 Schilling B, Row RH, Gibson BW, Guo X & Young MM (2003) MS2Assign, automated assignment and nomenclature of tandem mass spectra of chemically crosslinked peptides. J Am Soc Mass Spectrom 14, 834–850.
10 Clauser KR, Baker P & Burlingame AL (1999) Role of accurate mass measurement (±10 ppm) in protein identification strategies employing MS or MS/MS and database searching. Anal Chem 71, 2871–2882.
11 Wefing S, Schnaible V & Hoffman D (2001) SearchXLinks in rchxlinks/de.
12 Tang Y, Chen Y, Lichti CF, Hall RA, Raney KD & Jennings SF (2005) CLPM: a cross-linked peptide mapping algorithm for mass spectrometric analysis. BMC Bioinformatics 6 (Suppl. 2), S9.
13 Wlodawer A, Svensson LA, Sjolin L & Gilliland GL (1988) Structure of phosphate-free ribonuclease A refined at 1.26 Å. Biochemistry 27, 2705–2717.
14 Spackman DH, Stein WH & Moore S (1960) The disulfide bonds of ribonuclease. J Biol Chem 235, 648–659.
15 Birchmeier C, Birchmeier W, Gherardi E & Vande Woude GF (2003) Met, metastasis, motility and more. Nat Rev Mol Cell Biol 4, 915–925.
16 Gorman JJ, Wallis TP & Pitt JJ (2002) Protein disulfide bond determination by mass spectrometry. Mass Spectrom Rev 21, 183–216.
17 Li A, Sowder RC, Henderson LE, Moore SP, Garfinkel DJ & Fisher RJ (2001) Chemical cleavage at aspartyl residues for protein identification. Anal Chem 73, 5395–5402.
18 Inglis AS (1983) Cleavage at aspartic acid. Methods Enzymol 91, 324–332.
19 Patterson SD & Katta V (1994) Prompt fragmentation of disulfide-linked peptides during matrix-assisted laser desorption ionization mass spectrometry. Anal Chem 66, 3727–3732.
20 Kim JS & Kim HJ (2001) Matrix-assisted laser desorption/ionization time-of-flight mass spectrometric observation of a peptide triplet induced by thermal cleavage of cystine. Rapid Comm Mass Spectrom 15, 2296–2300.
21 Ultsch M, Lokker NA, Godowski PJ & de Vos AM (1998) Crystal structure of the NK1 fragment of human hepatocyte growth factor at 2.0 Å resolution. Structure 6, 1383–1393.
22 Chirgadze DY, Hepple JP, Zhou H, Byrd RA, Blundell TL & Gherardi E (1999) Nat Struct Biol 6, 72–79.
23 Green NS, Reisler E & Houk KN (2001) Quantitative evaluation of the lengths of homobifunctional protein cross-linking reagents used as molecular rulers. Protein Sci 10, 1293–1304.
24 Shevchenko A, Wilm M, Vorm O & Mann M (1996) Mass spectrometric sequencing of proteins from silver-stained polyacrylamide gels. Anal Chem 68, 850–858.
25 Pearson KM, Pannell LK & Fales HM (2002) Intramolecular cross-linking experiments on cytochrome c and ribonuclease A using an isotope multiplet method. Rapid Comm Mass Spectrom 6, 139–159.
L. J. de Koning et al. Computer-assisted mass spectrometric analysis of cross-links. FEBS Journal 273 (2006) 281–291. © 2005 The Authors. Journal compilation © 2005 FEBS.