TY - JOUR
T1 - Identification of Nonlinear State-Space Systems from Heterogeneous Datasets
AU - Pan, Wei
AU - Yuan, Ye
AU - Ljung, Lennart
AU - Goncalves, Jorge
AU - Stan, Guy Bart
N1 - Funding Information:
Manuscript received January 12, 2017; revised May 28, 2017 and September 2, 2017; accepted September 17, 2017. Date of publication October 2, 2017; date of current version June 18, 2018. The work of W. Pan was supported by Microsoft Research through the Ph.D. Scholarship Program for his stay at Imperial College London. The work of G.-B. Stan was supported by the EPSRC under Grant EP/P009352/1 and by the EPSRC Fellowship for Growth EP/M002187/1. Recommended by Associate Editor A. E. Motter. (Corresponding author: Ye Yuan.) W. Pan is with the Department of Bioengineering, Imperial College London, London SW7 2AZ, U.K., and also with DJI Innovations, Shenzhen 518057, China.
Publisher Copyright:
© 2014 IEEE.
PY - 2018/6
Y1 - 2018/6
AB - This paper proposes a new method to identify nonlinear state-space systems from heterogeneous datasets. The method is described in the context of identifying biochemical/gene networks (i.e., identifying both reaction dynamics and kinetic parameters) from experimental data. Simultaneous integration of various datasets has the potential to yield better performance for system identification. Data collected experimentally typically vary depending on the specific experimental setup and conditions. Such heterogeneous data are usually obtained through 1) replicate measurements from the same biological system or 2) application of different experimental conditions such as changes/perturbations in biological inductions, temperature, gene knock-out, gene over-expression, etc. We formulate the identification problem using a Bayesian learning framework that makes use of 'sparse group' priors to allow inference of the sparsest model that can explain the whole set of observed heterogeneous data. To scale to a large number of features, the resulting nonconvex optimization problem is relaxed to a reweighted Group Lasso problem using a convex-concave procedure. As an illustrative example of the effectiveness of our method, we use it to identify a genetic oscillator (generalized eight-species repressilator). Through this example, we show that our algorithm outperforms Group Lasso when the number of experiments is increased, even when each single time-series dataset is short. We additionally assess the robustness of our algorithm against noise by varying the intensity of process noise and measurement noise.
KW - biological system modeling
KW - system identification
UR - http://www.scopus.com/inward/record.url?scp=85030779537&partnerID=8YFLogxK
U2 - 10.1109/TCNS.2017.2758966
DO - 10.1109/TCNS.2017.2758966
M3 - Article
SN - 2325-5870
VL - 5
SP - 737
EP - 747
JO - IEEE Transactions on Control of Network Systems
JF - IEEE Transactions on Control of Network Systems
IS - 2
ER -