J. Math. Biol.
DOI 10.1007/s00285-010-0350-z
Mathematical Biology

Derivation, identification and validation of a computational model of a novel synthetic regulatory network in yeast
Lucia Marucci · Stefania Santini · Mario di Bernardo · Diego di Bernardo

Received: 4 November 2009 / Revised: 19 May 2010
© Springer-Verlag 2010
Abstract
Systems biology aims at building computational models of biological pathways in order to study in silico their behaviour and to verify biological hypotheses. Modelling can become a new powerful method in molecular biology, if correctly used. Here we present step-by-step the derivation and identification of the dynamical model of a biological pathway using a novel synthetic network recently constructed in the yeast Saccharomyces cerevisiae for In-vivo Reverse-Engineering and Modelling Assessment (IRMA). This network consists of five genes regulating each other's transcription. Moreover, it includes one protein–protein interaction, and its genes can be switched on by addition of galactose to the medium. In order to describe the network dynamics, we adopted a deterministic modelling approach based on non-linear differential equations. We show how, through iteration between experiments and modelling, it is possible to derive a semi-quantitative prediction of network behaviour and to better understand the biology of the pathway of interest.
Electronic supplementary material
The online version of this article (doi:10.1007/s00285-010-0350-z) contains supplementary material, which is available to authorized users.
L. Marucci (✉) · D. di Bernardo
Telethon Institute of Genetics and Medicine (TIGEM), 80131 Naples, Italy
e-mail: marucci@tigem.it

D. di Bernardo
e-mail: dibernardo@tigem.it

L. Marucci · S. Santini · M. di Bernardo · D. di Bernardo
Department of Computer and Systems Engineering, Federico II University, 80125 Naples, Italy

M. di Bernardo
e-mail: mario.dibernardo@unina.it
Keywords
Mathematical modelling · Synthetic biology · Hill functions · Parameters identification

Mathematics Subject Classification (2000)
92-08 · 93A30 · 93B30
1 Introduction
The emerging discipline of Synthetic Biology can be defined as the engineering of biology. Up to now, two major goals have been actively investigated: the building of new biological networks in the cell that perform a specific task [e.g. periodic expression of a gene (Elowitz and Leibler 2000) or genetic switching (Gardner et al. 2000)] and the modification of networks that occur in nature in order to achieve some desired functionalities (e.g. production of a specific compound useful for medical applications; Ro et al. 2006). Synthetic Biology is an interdisciplinary area requiring a deep synergy between biology, biotechnology and nanotechnology on one side and mathematical modelling, information technology and control theory on the other. Such a combination of disciplines is needed to construct robust and predictable synthetic networks. In particular, quantitative models are needed for a precise and unambiguous description of synthetic circuits (Kaznessis 2007). Mathematical models allow hypotheses and observations to be compared rigorously, thus providing additional insight into the biological mechanisms.

Model derivation from experimental data can be carried out following three major approaches: white-box, black-box and gray-box. In white-box modelling, the model and parameter values are entirely derived from first principles, while in black-box modelling the model is completely derived from input–output data. The third alternative, the so-called gray-box approach (Nelles 2000), combines the two. Specifically, first principles are used to partially derive the model structure, while parameters or terms in the model are determined from measurement data. The approach we use in this paper is a gray-box one. In this case, modelling entails three main steps to be executed iteratively: (i) derivation of the model equations; (ii) identification of the model parameters from experimental data and/or literature; (iii) validation [or invalidation (Anderson and Papachristodoulou 2009)] of the
model.

Step (i) requires introducing simplifying hypotheses and choosing a proper formal framework. A huge variety of mathematical formalisms have been proposed in the literature, such as directed graphs, Bayesian networks, Boolean networks and their generalizations, ordinary and partial differential equations, qualitative differential equations, stochastic equations, and rule-based formalisms (see, for example, De Jong 2002; Ventura et al. 2006; Szallasi et al. 2006 and references therein). Deterministic formalisms are commonly used to describe the average behaviour of a population of cells (De Jong 2002). They have been shown to be viable for the analysis of synthetic networks in a number of works (e.g. Elowitz and Leibler 2000; Gardner et al. 2000; Kramer et al. 2004; Tigges et al. 2009; Stricker et al. 2008). The reaction mechanism is described by applying the law of mass action: the rate of any given elementary reaction is proportional to the product of the concentrations of the species reacting in the elementary process (reactants) (Alon 2006). The differential equations (DEs) modelling approach is based on the following biological assumptions: the quantified concentrations do not vary with respect to space and they are continuous functions of time. These assumptions hold for processes evolving on long time scales in which the number of molecules of the species in the reaction volume is sufficiently large. In other experimental settings, approaches based on partial differential equations or stochastic models would be more appropriate (Szallasi et al. 2006).
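As an illustration of the law of mass action stated above, the following minimal Python sketch integrates a single elementary reaction A + B -> C with a forward-Euler scheme. The reaction, the rate constant and the initial concentrations are hypothetical and are not taken from the IRMA network; the sketch only shows how a rate proportional to the product of the reactant concentrations turns into a differential equation model under the well-mixed, continuous-concentration assumptions.

```python
def mass_action_rate(k, concentrations):
    """Rate of an elementary reaction: k times the product of reactant concentrations."""
    rate = k
    for c in concentrations:
        rate *= c
    return rate


def simulate_binding(a0, b0, k, dt=0.001, steps=5000):
    """Forward-Euler integration of A + B -> C under mass-action kinetics."""
    a, b, c = a0, b0, 0.0
    for _ in range(steps):
        r = mass_action_rate(k, [a, b])
        a -= r * dt  # each reaction event consumes one A ...
        b -= r * dt  # ... and one B ...
        c += r * dt  # ... and produces one C
    return a, b, c


# Hypothetical values, chosen only to make the sketch run
a, b, c = simulate_binding(a0=1.0, b0=0.5, k=2.0)
print(f"A={a:.4f}, B={b:.4f}, C={c:.4f}")
```

The totals A + C and B + C stay constant during the simulation, which is a quick sanity check on the stoichiometry.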
Step (ii) is required to estimate unknown model parameters from the available experimental data. A crucial issue that arises when estimating model parameters is structural identifiability (Walter and Pronzato 1997). The notion of identifiability addresses the feasibility of estimating unknown parameters from data collected in well-defined stimulus-response experiments (Cobelli and Distefano 1980). Structural non-identifiability is related to the model structure, independently of the experimental data. In contrast, practical non-identifiability also takes into account the amount and the quality of the measured data used for parameter calibration. Of note, a parameter that is structurally identifiable may still be practically non-identifiable, due to the unavoidable presence of noise in biological experimental data (Raue et al. 2009). Unfortunately, while being well assessed in the case of linear dynamical systems, the identifiability analysis of highly non-linear systems remains an open problem (Boubaker and Fourati 2004).

The parameter estimation problem can be formulated from the mathematical viewpoint as a constrained optimization problem where the goal is to minimize the objective function, defined as the error between model predictions and real data. In biological applications, the objective function usually displays a large number of local optima, as measurements are strongly affected by noise. For this kind of problem, classical optimization methods, based on gradient descent from an arbitrary initial guess of the solution, can be unfeasible and show convergence difficulties. The above considerations suggest looking at stochastic optimization algorithms, like evolutionary strategies, which rely on random exploration of the whole space of solutions, are not sensitive to initial conditions and avoid becoming trapped in local optima. In Moles et al. (2003), the performance of both local and global-search optimization methods is compared in the identification of the 36 unknown parameters of a non-linear biochemical network. The authors show that only evolutionary strategies are able to successfully solve the parameter estimation problem, while gradient-based methods tend to converge to local minima. Among the stochastic techniques, genetic algorithms (GA) (Mitchell 1998) provide a very flexible approach to non-linear optimization. Their application has shown good results in the parametrisation of synthetic networks (Weber et al. 2007; Tigges et al. 2009).
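To make the optimization formulation concrete, here is a small Python sketch of an elitist evolutionary search that minimizes a sum-of-squared-errors objective within box constraints. It is not the hybrid genetic algorithm used in the paper: the one-variable model dx/dt = alpha - d*x, the synthetic noisy data, the bounds and all tuning values (`pop_size`, `n_parents`, `sigma`, the seeds) are assumptions made purely for illustration.

```python
import math
import random


def model(t, alpha, d):
    """Closed-form solution of dx/dt = alpha - d*x with x(0) = 0."""
    return alpha / d * (1.0 - math.exp(-d * t))


def sse(params, times, data):
    """Objective function: sum of squared errors between model and data."""
    alpha, d = params
    return sum((model(t, alpha, d) - y) ** 2 for t, y in zip(times, data))


def evolve(times, data, bounds, pop_size=30, n_parents=10,
           generations=100, sigma=0.05, seed=0):
    """Elitist (mu + lambda) evolutionary search inside box constraints."""
    rng = random.Random(seed)
    pop = [tuple(rng.uniform(lo, hi) for lo, hi in bounds)
           for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=lambda p: sse(p, times, data))
        parents = pop[:n_parents]  # the best candidates survive unchanged
        history.append(sse(parents[0], times, data))
        children = []
        for _ in range(pop_size - n_parents):
            parent = rng.choice(parents)
            # Gaussian mutation, clipped so the constraints stay satisfied
            child = tuple(
                min(hi, max(lo, x + rng.gauss(0.0, sigma * (hi - lo))))
                for x, (lo, hi) in zip(parent, bounds)
            )
            children.append(child)
        pop = parents + children
        sigma *= 0.98  # slowly narrow the search
    best = min(pop, key=lambda p: sse(p, times, data))
    return best, history


# Synthetic data from hypothetical "true" values alpha = 2.0, d = 0.5
noise = random.Random(1)
times = [float(t) for t in range(11)]
data = [model(t, 2.0, 0.5) + noise.gauss(0.0, 0.05) for t in times]
bounds = [(0.0, 5.0), (0.05, 5.0)]  # lower bound on d avoids division by zero
best, history = evolve(times, data, bounds)
print("estimated (alpha, d):", best)
```

Because the parents are carried over unchanged, the best objective value recorded in `history` can never get worse from one generation to the next. The random initial population is what removes the dependence on a single starting guess.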
Finally, step (iii) is required to check the validity and usefulness of the model, that is, to evaluate its ability to predict the behaviour of the actual physical system. Theoretically, the modeller should be confident that the formalism is able to describe all input–output behaviours of the system (Smith and Doyle 1992). This condition can never be guaranteed, since it would require an infinite number of experiments. However, it is possible to test a necessary condition: the model is able to describe all observed input–output behaviours of the system (Smith and Doyle 1992). To this aim, one possible approach is to use a cross-validation-like procedure (Arlot and Celisse 2010) by splitting the experimental data into two sets: one of them is used for the
parameter identification, while the other one is used to validate the predictive power of the model. If the predictive performance of the model is not satisfactory, it is invalidated (Anderson and Papachristodoulou 2009). Thus, it is necessary to refine the model (for example, by increasing the level of detail) and/or to perform new experiments, going back to step (i) of the modelling procedure.

In what follows, we describe the gray-box modelling of the IRMA network (Fig. 1) as a representative example of the modelling problem for a small biological pathway, and present the detailed derivation of the model whose equations were given in Cantone et al. (2009). Note that usually, when a genetic circuit is presented to the Synthetic Biology community, only the best-performing mathematical model is shown, without providing any detail of how the model was obtained. Here, instead, we provide the “history” of the derivation of the final model and experimental data-set, highlighting the major modelling choices and challenges faced during the process.

Fig. 1 Diagram of the network. Schematic diagram of the synthetic gene network. New transcriptional units (rectangles) were built by assembling promoters with non-self coding sequences. Genes were tagged at the 3′ end with the specified sequences. Each cassette encodes for a protein (circle) regulating the transcription of another gene in the network (solid lines)

The aim is to build a model able to correctly predict the dynamical changes in the mRNA concentrations of the five network genes following both internal and external perturbations (e.g. gene over-expression, galactose addition). We chose differential equations (DEs) to model the dynamics of the genes. The task is challenging since, to our knowledge, up to now quantitative DE models have been developed only for synthetic networks composed of a smaller number of genes than IRMA (e.g. Gardner et al. 2000; Elowitz and Leibler 2000; Tigges et al. 2009; Kramer et al. 2004; Stricker et al. 2008). Regarding the identifiability issue, we adopted the novel approach proposed by Raue and colleagues (see Raue et al. 2009), which is able to deal with non-linear models with a high number of parameters. This approach exploits the profile likelihood and is able to detect both structurally and practically non-identifiable parameters. For the parameter identification, in order to cope with the high number of unknown quantities, the noise of experimental data and the presence of non-linear aspects in the optimisation procedure, we used an ad hoc designed hybrid genetic algorithm (see Sect. 3 for further details). Finally, for the model validation, we tested the predictions of the model against data not used for the parameter identification.
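The idea of validating against data not used for identification can be sketched in a few lines of Python. The example below is a toy under stated assumptions, not the procedure applied to IRMA: the model is a single exponential decay x(t) = x0*exp(-d*t), the "experiments" are noise-free synthetic time series, and the function names and the tolerance are hypothetical.

```python
import math


def fit_decay_rate(experiments):
    """Least-squares slope of log(x/x0) against t, pooled over experiments."""
    num = den = 0.0
    for times, values in experiments:
        x0 = values[0]
        for t, x in zip(times, values):
            num += t * math.log(x / x0)
            den += t * t
    return -num / den


def rmse(d, experiment):
    """Root-mean-square error of x0*exp(-d*t) on one experiment."""
    times, values = experiment
    x0 = values[0]
    errors = [(x0 * math.exp(-d * t) - x) ** 2 for t, x in zip(times, values)]
    return math.sqrt(sum(errors) / len(errors))


def holdout_check(experiments, n_identification, tolerance):
    """Fit on the first experiments, then test predictions on the rest."""
    identification = experiments[:n_identification]
    validation = experiments[n_identification:]
    d = fit_decay_rate(identification)
    worst = max(rmse(d, e) for e in validation)
    return d, worst, worst <= tolerance


def make_experiment(x0, d, times):
    """Synthetic, noise-free time series (hypothetical data)."""
    return times, [x0 * math.exp(-d * t) for t in times]


ts = [0.0, 1.0, 2.0, 3.0, 4.0]
consistent = [make_experiment(1.0, 0.3, ts), make_experiment(2.0, 0.3, ts),
              make_experiment(1.5, 0.3, ts)]
conflicting = consistent[:2] + [make_experiment(1.5, 0.6, ts)]

d_ok, err_ok, valid_ok = holdout_check(consistent, 2, tolerance=0.05)
d_bad, err_bad, valid_bad = holdout_check(conflicting, 2, tolerance=0.05)
print(valid_ok, valid_bad)
```

In the second call the held-out experiment follows a different decay rate, so the fitted model fails the check. In the iterative scheme described above, that outcome would send the modeller back to step (i).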
2 Results and discussion
2.1 Derivation of model equations: step (i)

For each species in the network, i.e. each mRNA (italic capital letters) and corresponding protein concentration (roman small letters), we wrote one equation which expresses its change in time as the result of production and degradation:
\begin{align}
\frac{d[\mathit{CBF1}]}{dt} &= \alpha_1 + v_1 H^{+-}([\mathrm{Swi5}],[\mathrm{Ash1}];k_1,k_2,h_1,h_2) - d_1[\mathit{CBF1}], \tag{1}\\
\frac{d[\mathrm{Cbf1}]}{dt} &= \beta_1[\mathit{CBF1}] - d_2[\mathrm{Cbf1}], \tag{2}\\
\frac{d[\mathit{GAL4}]}{dt} &= \alpha_2 + v_2 H^{+}([\mathrm{Cbf1}];k_3,h_3) - d_3[\mathit{GAL4}], \tag{3}\\
\frac{d[\mathrm{Gal4}]}{dt} &= \beta_2[\mathit{GAL4}] - d_4[\mathrm{Gal4}], \tag{4}\\
\frac{d[\mathit{SWI5}]}{dt} &= \alpha_3 + v_3 H^{+}([\mathrm{Gal4}_{\mathrm{free}}];k_4,h_4) - d_5[\mathit{SWI5}], \tag{5}\\
\frac{d[\mathrm{Swi5}]}{dt} &= \beta_3[\mathit{SWI5}] - d_6[\mathrm{Swi5}], \tag{6}\\
\frac{d[\mathit{GAL80}]}{dt} &= \alpha_4 + v_4 H^{+}([\mathrm{Swi5}];k_5,h_5) - d_7[\mathit{GAL80}], \tag{7}\\
\frac{d[\mathrm{Gal80}]}{dt} &= \beta_4[\mathit{GAL80}] - d_8[\mathrm{Gal80}], \tag{8}\\
\frac{d[\mathit{ASH1}]}{dt} &= \alpha_5 + v_5 H^{+}([\mathrm{Swi5}];k_6,h_6) - d_9[\mathit{ASH1}], \tag{9}\\
\frac{d[\mathrm{Ash1}]}{dt} &= \beta_5[\mathit{ASH1}] - d_{10}[\mathrm{Ash1}]. \tag{10}
\end{align}

The first two terms on the right-hand side of the mRNA equations represent the production, where $\alpha$ are the basal transcription rates; $v$ are the maximal transcription rates modulated by the Hill functions,

\[
H^{+}(y;k,h)=\frac{y^{h}}{y^{h}+k^{h}},\qquad H^{-}(z;k,h)=\frac{k^{h}}{z^{h}+k^{h}}
\]
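As a rough illustration of how these equations behave, the Python sketch below implements the two Hill functions and integrates Eqs. (7)-(8) for GAL80 mRNA and Gal80 protein with a forward-Euler scheme. Two simplifications are assumed here and are not part of the paper: [Swi5] is held at a fixed level, which decouples this mRNA-protein pair from the rest of the network, and every parameter value is hypothetical, chosen only so that the code runs.

```python
def hill_act(y, k, h):
    """Activating Hill function H+(y; k, h) = y^h / (y^h + k^h)."""
    return y ** h / (y ** h + k ** h)


def hill_rep(z, k, h):
    """Repressing Hill function H-(z; k, h) = k^h / (z^h + k^h)."""
    return k ** h / (z ** h + k ** h)


def simulate_gal80(swi5, p, t_end=400.0, dt=0.01):
    """Forward-Euler integration of Eqs. (7)-(8) for a fixed [Swi5] level."""
    mrna = prot = 0.0
    for _ in range(int(t_end / dt)):
        # Basal plus Hill-modulated transcription, Eq. (7)
        production = p["alpha4"] + p["v4"] * hill_act(swi5, p["k5"], p["h5"])
        dmrna = production - p["d7"] * mrna
        # Translation minus protein degradation, Eq. (8)
        dprot = p["beta4"] * mrna - p["d8"] * prot
        mrna += dt * dmrna
        prot += dt * dprot
    return mrna, prot


# Hypothetical parameter values, not the identified IRMA parameters
params = {"alpha4": 0.01, "v4": 0.2, "k5": 1.0, "h5": 2.0,
          "d7": 0.1, "beta4": 0.5, "d8": 0.05}
mrna, prot = simulate_gal80(swi5=1.0, p=params)

# Analytical steady state for comparison
mrna_ss = (params["alpha4"]
           + params["v4"] * hill_act(1.0, params["k5"], params["h5"])) / params["d7"]
prot_ss = params["beta4"] * mrna_ss / params["d8"]
print(mrna, mrna_ss, prot, prot_ss)
```

With a constant input, both variables relax to the steady state obtained by setting the derivatives to zero, so comparing the simulated end point with `mrna_ss` and `prot_ss` is a simple check on the implementation.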