In silico modeling and in vivo efficacy of cancer preventive vaccinations

Cancer vaccine feasibility would benefit from reducing the number and duration of vaccinations without diminishing efficacy. However, the duration of in vivo studies and the huge number of possible variations in vaccination protocols have discouraged their optimization. In this study, we employed an established mouse model of preventive vaccination using HER-2/neu transgenic mice (BALB-neuT) to validate in silico-designed protocols that reduce the number of vaccinations and optimize efficacy. With biological training, the in silico model captured the overall in vivo behavior and highlighted certain critical issues. First, although vaccinations could be reduced in number without sacrificing efficacy, the intensity of early vaccinations was a key determinant of long-term tumor prevention needed for predictive utility in the model. Second, after vaccinations ended, older mice exhibited more rapid tumor onset and sharper decline in antibody levels than young mice, emphasizing immune aging as a key variable in models of vaccine protocols for elderly individuals. Long-term studies confirmed predictions of in silico modeling in which an immune plateau phase, once reached, could be maintained with a reduced number of vaccinations. They also showed that rapid priming in young mice is required for long-term antitumor protection, and that the accuracy of mathematical modeling of early immune responses is critical. Finally, they indicated that the design and modeling of cancer vaccines and vaccination protocols must take into account the progressive aging of the immune system, by striving to boost immune responses in elderly hosts. Our results show that an integrated in vivo-in silico approach could improve both mathematical and biological models of cancer immunoprevention.


The SimTriplex simulator
SimTriplex is a specific simulator for modeling mammary carcinoma onset in HER-2/neu transgenic mice and the immune response elicited by the cellular vaccine Triplex. SimTriplex, inspired by the Celada-Seiden educational model ImmSim (5), is implemented through an Agent Based Model (ABM) technique, which makes it possible to describe, in a defined physical space, the immune system entities with their different biological states and the interactions between different entities. The system evolution in space and in time is generated from the interactions and diffusion of the different entities. The major advantage of this technique is that the entities and the relationships can be described in terms that are very similar to the biological world. The intrinsic nonlinearity of the system is treated with no additional effort. The approach is biologically understandable, and relevance is warranted by the intrinsic biological know-how. The model is flexible and extensible, as the behavior of entities is modeled using actual biological knowledge, and can be easily modified to reflect observations from biological experiments.
SimTriplex is a bit-string polyclonal lattice model. In short, bit-string refers to the fact that antigen-receptor interactions are modeled through strings composed of bits (see below); polyclonal indicates that lymphocyte clones of different specificity are represented, as opposed to monoclonal models including only a single population of genetically identical lymphocytes; and lattice means that a discrete lattice is used to represent the biological space, i.e. the biological region under consideration.

Model space and time
The version of SimTriplex described here simulates all processes in a virtual region in which interactions take place. The biological space is mapped onto a two-dimensional triangular lattice (six neighbors) with periodic boundary conditions. Physical proximity is modeled through the concept of lattice-site: all interactions among cells and molecules take place within a lattice-site in a single time step, so that there is no correlation between entities residing on different sites at a given time. In the present implementation the lattice grid, which has 16×16 = 256 sites, is conventionally assumed to represent 1 mm³ of living tissue, and the time step corresponds to 8 hours.
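The lattice geometry described above can be illustrated with a minimal sketch. The axial (q, r) coordinate convention, the function names and the helper converting time steps to days are assumptions made for illustration only; the text does not specify how SimTriplex indexes its sites.

```python
SIDE = 16            # 16 x 16 = 256 sites, as stated in the text
HOURS_PER_STEP = 8   # one time step corresponds to 8 hours

# Six neighbor offsets of a triangular lattice in axial (q, r) coordinates.
# This coordinate convention is an assumption of the sketch.
OFFSETS = [(1, 0), (-1, 0), (0, 1), (0, -1), (1, -1), (-1, 1)]


def neighbors(q, r, side=SIDE):
    """Return the six neighboring sites of (q, r) with periodic wrap-around."""
    return [((q + dq) % side, (r + dr) % side) for dq, dr in OFFSETS]


def steps_to_days(steps, hours_per_step=HOURS_PER_STEP):
    """Convert a number of simulation time steps into days of real time."""
    return steps * hours_per_step / 24
```

With periodic boundaries every site, including those on the edge of the grid, has exactly six distinct neighbors, and 1200 time steps correspond to 400 days.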

Bit-strings for antigen recognition
SimTriplex uses bit-strings of fixed length to model all molecules involved in antigen recognition: antibody specificities, clonotypic antigen receptors of T and B cells, their cognate antigens, and major histocompatibility complex (MHC) class I (MHC-I) and class II (MHC-II) glycoproteins. Each antigen receptor is represented by a randomly generated string of 12 bits, yielding a repertoire of 2¹² = 4096 different receptors. B cell epitopes are bit strings of length 12; T cell epitopes result from the juxtaposition of peptide fragments (length = 6) and MHC-I or MHC-II domains (length = 6). The repertoire in SimTriplex, even though it is obviously orders of magnitude below actual repertoires, is proportional to the initial number of T and B cells in the system (Tab. S1).
The affinity of antigen-receptor binding is based on the Hamming distance between the two bit-strings, i.e. the number of 1 bits in their logical XOR; in practice the binding probability between the two entities is a function of the number of mismatched bits, and it declines as the number of mismatches decreases. Whether binding succeeds or fails, given the binding probability v(m) between two strings with Hamming distance m, is decided by drawing a random number between 0 and 1: binding succeeds if the number is less than the probability and fails otherwise. The binding probability v(m) is defined as: v(m) = v_c^{(m − N)/(m_c − N)} for m ≥ m_c, and v(m) = 0 for m < m_c, where N is the bit-string length, v_c ∈ (0,1) is a free parameter that determines the slope of the function and m_c ∈ (N/2, N) is the threshold value below which the binding is unsuccessful. In the present version m_c = 10.
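The binding rule can be sketched as follows, assuming the power-law form v(m) = v_c^((m − N)/(m_c − N)) for m ≥ m_c and v(m) = 0 otherwise, which satisfies v(m_c) = v_c and v(N) = 1. The value chosen for v_c and all function names are hypothetical; only N = 12 and m_c = 10 come from the text.

```python
import random

N_BITS = 12   # bit-string length (from the text)
M_C = 10      # threshold below which binding never occurs (from the text)
V_C = 0.05    # hypothetical value; the text only states that v_c lies in (0, 1)


def hamming(a, b):
    """Number of mismatched bits between two bit-strings stored as integers."""
    return bin(a ^ b).count("1")


def binding_probability(m, n=N_BITS, m_c=M_C, v_c=V_C):
    """Assumed form: zero below the threshold, rising from v_c at m_c to 1 at n."""
    if m < m_c:
        return 0.0
    return v_c ** ((m - n) / (m_c - n))


def binds(receptor, epitope, rng=random):
    """Draw a uniform number in [0, 1); binding succeeds if it is below v(m)."""
    return rng.random() < binding_probability(hamming(receptor, epitope))
```

Under these assumptions two perfectly complementary strings (m = 12) always bind, while any pair with fewer than 10 mismatched bits never binds.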

Molecular and cellular entities
In addition to antigens and their clonotypic receptors, two major classes of entities are simulated in SimTriplex, soluble mediators and cells. Tab. S1 shows all such entities included in SimTriplex.
Soluble molecules that, unlike antibodies, do not entail clonotypic variability, such as interleukins (IL), are represented "in bulk" in each lattice location by numerical variables that keep track of the count (i.e. concentration) of each entity present at a given time in that location. It should be noted that the present version of the model includes only one T cell cytokine; therefore the functional role of the entity named "IL-2" was overloaded to represent the function of multiple T cell cytokines, including γ-interferon. Future versions of the simulator are planned to include more entities of this type, possibly including Th2 cytokines.
Cellular entities (Tab. S1) include major lymphocyte populations, largely inherited from the ancestral Celada-Seiden model with the addition of NK cells, and two types of tumor cells expressing the target antigen (HER-2/neu gene product p185), mammary carcinoma cells (CC) and vaccine cells (VC), which additionally express two biological adjuvants, allogeneic MHC-I and transgenic IL-12. The initial number of each cell type (Tab. S1) is proportional to cell concentrations in mouse peripheral blood.

Differentiation of mature T and B cells
The processes of hematopoietic differentiation that give rise to mature T and B cells in the lattice are simulated by separate pre-processing modules that recapitulate the recombinational, selective and mutational events that shape the T and B cell repertoire. The initial repertoire of bit-strings representing antigen receptors is generated at random, then a "thymic function" implements positive and negative selection of T cells based on the recognition of self MHC according to the rules described in 1.2 above. Furthermore, whenever a mature B cell duplicates (see below, 1.9), its antigen receptor undergoes stochastic mutations to simulate the events leading to affinity maturation.
As it happens in real life, stochastic mechanisms of repertoire generation and of affinity maturation generate differences between the results of individual runs of the simulator due to the use of different seeds for the generation of uniformly distributed pseudo-random numbers. Such inter-run differences are usually taken as a representation of inter-individual biological heterogeneity. For this reason, all experimental conditions (e.g. different vaccination schedules) were tested in silico using multiple runs of SimTriplex, to simulate the variability between mice observed in vivo. Obviously the starting parameters of each run (e.g. random seeds) were duly recorded. This offers the opportunity, not available to scientists performing in vivo experiments, to "resuscitate" any individual mouse and to test its individual response to different experimental conditions.
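The role of recorded seeds can be illustrated with a small sketch: a virtual mouse is identified by its seed, so the same initial repertoire can be regenerated at will and exposed to a different schedule. The function name and the number of cells are hypothetical, and Python's generator merely stands in for whatever pseudo-random generator SimTriplex uses.

```python
import random


def initial_repertoire(seed, n_cells=10, n_bits=12):
    """Regenerate the random receptor bit-strings of the mouse identified by seed.

    n_cells is a hypothetical value chosen for illustration only.
    """
    rng = random.Random(seed)  # private generator, independent of global state
    return [rng.getrandbits(n_bits) for _ in range(n_cells)]
```

Calling `initial_repertoire` twice with the same seed returns the same list of receptors, which is what allows an individual in silico mouse to be "resuscitated".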

Neoplastic progression and vaccination
Neoplastic progression to mammary carcinoma in HER-2/neu transgenic mice is an unrelenting process fueled by transgene expression in the mammary gland throughout the life of the host. Cancer cells in the system are continuously produced by preneoplastic lesions. To simulate this process, at each time-step 3 new CC are distributed at random positions in the lattice; this parameter was dimensioned to yield a kinetics of tumor incidence similar to that observed in untreated transgenic mice. Once inserted in the system, cancer cells duplicate following an exponential law with parameters chosen to emulate tumor growth observed in the real mice.
Vaccinations are also simulated through the introduction at random lattice positions of VC at any given time-step. Each vaccine administration consisted of 50 proliferation-blocked VC.
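A minimal sketch of how cells might be introduced at each time step is given below. The function names and the representation of the schedule as a set of time steps are assumptions; the numbers of cells (3 CC per step, 50 VC per administration) and the 16×16 lattice are taken from the text.

```python
import random

SIDE = 16          # lattice side (from the text)
CC_PER_STEP = 3    # new cancer cells introduced at every time step (from the text)
VC_PER_DOSE = 50   # vaccine cells per administration (from the text)


def random_site(rng):
    """Pick a lattice site uniformly at random."""
    return (rng.randrange(SIDE), rng.randrange(SIDE))


def seed_cells(step, schedule, rng):
    """Return the sites receiving new CC and new VC at this time step.

    schedule is assumed to be a set of time steps at which a dose is given.
    """
    new_cc = [random_site(rng) for _ in range(CC_PER_STEP)]
    new_vc = []
    if step in schedule:
        new_vc = [random_site(rng) for _ in range(VC_PER_DOSE)]
    return new_cc, new_vc
```

The exponential duplication of cancer cells after insertion is not sketched here because its parameters are not reported in this section.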

Age
Cellular entities have age structure. They are born, they interact and duplicate and eventually die by apoptosis or by immune lysis. To keep track of the age of each cellular entity, a variable encodes the number of time-steps since cell birth. Parameters encoding the expected half-life of each cell type are shown in Tab. S1. To simulate memory cells, the half-life of TH, TC and B cells is increased after successful interaction with target antigens. Death probability approaches 1 when the age is twice the half-life.

Parameters
SimTriplex has two main classes of parameters: the first comprises known immunological parameters; the second includes free parameters with unknown values that were heuristically set to plausible values after a series of tests (tuning phase).
Known immunological parameters are shown in Tab. S2A. Hypermut is the per-bit mutation probability for the antibodies; its value comes straight from the original Celada-Seiden model. Plasmarel controls the number of antibody molecules released by a plasma cell per time step; the indicated value represents an actual production of 10 ng/ml every 8 hours. ProbMAg is the probability for a macrophage to phagocytose an antigen. Bdup, THdup and TCdup are the duplication times of lymphocytes expressed in time-steps.
Free parameters used to tune the model with experimental results are shown in Tab. S2B. Nbitstr indicates the number of bits used to represent antigens and antigen receptors (see 1.2, above). Parameters min_match and affinity_level regulate specific antigen recognition (see 1.2, above): min_match specifies the minimal number of matching bits that are required to have a non-zero probability of binding; affinity_level is the probability of interaction between two binding sites whose match is min_match. max_lfact regulates the probability for a cell that is duplicating to create a new cell; IL2_eff and IL12_eff are factors expressing the efficiency of lymphocyte stimulation by IL-2 and IL-12. Parameter thymus_eff represents the efficiency of the thymic function in selecting non-autoreactive thymocytes (see 1.4, above).

Position and movement of entities
Each entity is characterized by its position in the lattice. Interactions take place only among entities occupying the same lattice site. Molecular and cellular entities are allowed to move with uniform probability between neighboring sites of the lattice at each time step; this mimics Brownian motion. Movement of cancer cells is not allowed (a separate version of the simulator that includes metastatic spread is under preparation).

Cellular states and interactions
Each cell is characterized by a set of internal states, as shown in Tab. S3. Each cell can be in different internal states and all cells are tracked individually.
States Resting, Active and Duplica reflect the activation state of immune system cells and their proliferation. All immune cells are initialized as Active, except TC that start as Resting. Antigenic stimulation will then govern status transitions. It should be noted that, for historical reasons, plasma cells are modeled as a separate cell type, rather than as a separate state of B cells. All cells without Duplica state (Tab. S3) do not replicate during the simulation. Cancer cells are always in Duplica state.
States Intern, PresI and PresII refer to antigen presentation. Antigen presenting cells (APC), which in SimTriplex include dendritic cells, macrophages and B cells, start as Active and change their status to Intern after encountering and internalizing an antigen. PresI and PresII mean that the cell is presenting a processed antigen in the context of MHC-I or MHC-II, respectively.
BoundToAb marks those cancer cells that are bound by a specific antibody against a surface tumor antigen, and thus are prone to complement-mediated lysis or to antibody-dependent cell-mediated cytotoxicity by NK cells.

Interactions
When two entities that may interact lie in the same lattice site, they interact according to probabilistic laws. An interaction between two entities is a complex action that eventually triggers a state change of one or both entities. Allowed interactions are shown in Tab. S4. Interactions can be antigen-specific or nonspecific. Specific interactions, e.g. between antigens and antigen receptors, need a preliminary recognition phase between the two entities that is governed by the functions described under 1.2 above; the specific interaction then takes place only if this first phase occurs successfully. Antigen-nonspecific interactions, such as those mediated by cytokines, lack the recognition phase.
We will follow the sequence of interactions elicited by vaccination and culminating in the destruction of cancer cells (Fig. S1). The first bout of specific interactions is elicited by tumor antigens (TAA) released by VC that either die spontaneously or are lysed by polyclonal TC activation triggered by the allogeneic MHC expressed by vaccine cells. The specific interaction results in TC replication (state change into Duplica) and increases TC lifetime by one time-step. Once TAA are released, they can interact with APC or antibodies. The interaction of TAA with APC will first cause antigen internalization, then antigen presentation by the APC. A presenting APC activates B, TH and TC cells; IL-12 produced by VC enhances these interactions and NK cytotoxicity. Activated T cells release IL-2 (which, as mentioned before, is meant here to represent also γ-interferon). A specific interaction between TH and B cells will change the latter into plasma cells that start antibody production. The efferent phase is thus mediated by interactions of NK cells, TC and antibodies with cancer cells. The parameters controlling the probability of cancer cell death as a consequence of such interactions were set according to experimental evidence of CC sensitivity to the various lytic mechanisms.

Phasing of events
The sequence of events implemented in SimTriplex is the following.
• The simulator accepts pre-determined input parameters, in particular the vaccination schedule encoded in a bit-string.
• The lattice is initialized and filled at random positions with initial cell populations.
  o Internal events not driven by interactions take place, e.g. cell aging and spontaneous cell death.
  o Diffusion: cells move on the lattice, the density of molecular entities is modified to simulate diffusion.
  o A trace of the state of the system at the end of the loop is saved.
• The simulation stops either when the total number of cancer cells exceeds a threshold, typically 10⁵, taken to represent the appearance of a macroscopic tumor that would cause a real mouse to be scored as tumor-positive, or after a predefined number of time-steps, usually exceeding one year of real time.

The genetic algorithm
In combination with the SimTriplex simulator, a genetic algorithm (GA) was used to discover new vaccination schedules. The following description of this approach follows that published previously (2). The entities of the GA are defined according to standard GA terminology, but the approach differs from a standard GA because a simulator was used to compute the fitness function, i.e. to measure the ability of a given vaccination schedule to protect mice from mammary carcinoma growth. In this system each GA chromosome in the chromosome population represents a vaccine schedule. The chromosome is a binary string of 1200 bits, in which each gene (i.e. each bit) represents a constant time-step of 8 hours, t_i, during which it is possible to administer a vaccination. If the i-th gene is expressed, i.e. the i-th bit is set to 1, then a vaccination has to be administered at time-step i; otherwise, if the i-th gene is not expressed, i.e. the i-th bit is set to 0, then no vaccination has to be administered at time-step i. The population comprises 80 chromosomes. The selection operator is tournament selection. Reproduction uses uniform crossover; standard implementations of mutation and elitism were used (6).
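The chromosome encoding and the named operators can be sketched as below. This is an illustrative outline rather than the published implementation: the tournament size, the mutation rate and the convention that a higher fitness is better are assumptions, and the fitness function itself (computed by SimTriplex) is passed in as a hypothetical callable.

```python
import random

N_GENES = 1200   # one gene per 8-hour time step (from the text)
POP_SIZE = 80    # number of chromosomes in the population (from the text)


def decode(chromosome):
    """Return the time steps at which a vaccination is administered."""
    return [i for i, bit in enumerate(chromosome) if bit == 1]


def tournament(population, fitness, rng, k=2):
    """Pick k random chromosomes and return the fittest one.

    k = 2 and "higher is better" are assumptions of this sketch.
    """
    contenders = rng.sample(population, k)
    return max(contenders, key=fitness)


def uniform_crossover(a, b, rng):
    """Each gene of the child is copied from either parent with equal probability."""
    return [x if rng.random() < 0.5 else y for x, y in zip(a, b)]


def mutate(chromosome, rng, rate=1.0 / N_GENES):
    """Flip each bit independently with probability rate (hypothetical default)."""
    return [1 - bit if rng.random() < rate else bit for bit in chromosome]
```

In a full run each fitness evaluation would require several simulator runs per chromosome, so the cost of the GA is dominated by the simulator rather than by these operators.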
An effective vaccination schedule must result in a tumor-free mouse survival time of 1200 time-steps, equal to a lifespan of 400 days. In defining the fitness function, two fundamental and competing requirements were considered: i) any schedule must be an effective one, i.e. the mouse survival time must reach 400 days; ii) the best schedules must have a minimal cardinality, i.e. they must guarantee tumor-free survival with the minimum number of vaccine injections. Furthermore, as each run of SimTriplex uses a different series of stochastic parameters to simulate individual variability, the fitness function f was calculated using eight different in silico mice, by summing up the same function for each mouse: where N_cc^1 and N_cc^2 are the maximum number of cancer cells in the lattice during the initial phase (time steps 0…150) and during the rest of the simulation, respectively. The division of the simulation in two phases and the two constants γ_1 = 1.7·10⁴ and γ_2 = 5·10³ were chosen to constrain the dynamics of tumor cells to follow that previously recorded in SimTriplex during the simulation of a highly effective vaccination schedule (Chronic protocol) that prevented in almost all mice the onset of macroscopic mammary carcinoma.
Additional constraints were based on immunological knowledge and practical laboratory requirements. For example, only one vaccination was allowed per day, and only two vaccinations in the same week, excluding weekends.
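The example constraints can be expressed as a simple feasibility check on a decoded schedule. The sketch below assumes that time step 0 falls on a Monday, that weeks are calendar weeks counted from that day, and that three 8-hour time steps make up one day; the function name and these conventions are not taken from the source.

```python
STEPS_PER_DAY = 3  # 24 hours / 8 hours per time step


def is_feasible(vaccination_steps, first_weekday=0):
    """Check one-per-day, two-per-week and no-weekend constraints.

    first_weekday is the weekday of day 0 (0 = Monday); a hypothetical convention.
    """
    days = [step // STEPS_PER_DAY for step in vaccination_steps]
    if len(days) != len(set(days)):
        return False  # more than one vaccination on the same day
    per_week = {}
    for day in days:
        if (day + first_weekday) % 7 >= 5:
            return False  # Saturday or Sunday
        week = (day + first_weekday) // 7
        per_week[week] = per_week.get(week, 0) + 1
    return all(count <= 2 for count in per_week.values())
```

For instance, vaccinations at time steps 0 and 9 (Monday and Thursday of the first week) pass the check, whereas a third dose in the same week or a dose at time step 15 (a Saturday) does not.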