
Deep Neural Network Based Feature Extraction for Low-resource Speech Recognition

QIN Chu-Xiong, ZHANG Lian-Hai

Hou Limin, Wang Longyang, Wang Huaizhen. SMC for Systems With Matched and Mismatched Uncertainties and Disturbances Based on NDOB. ACTA AUTOMATICA SINICA, 2017, 43(7): 1257-1264. doi: 10.16383/j.aas.2017.e160014
Citation: QIN Chu-Xiong, ZHANG Lian-Hai. Deep Neural Network Based Feature Extraction for Low-resource Speech Recognition. ACTA AUTOMATICA SINICA, 2017, 43(7): 1208-1219. doi: 10.16383/j.aas.2017.c150654


doi: 10.16383/j.aas.2017.c150654
Funds:

Supported by National Natural Science Foundation of China (61673395, 61403415, 61302107)

More Information
    Author Bio:

    ZHANG Lian-Hai  Associate professor in the Department of Information and System Engineering, Information Engineering University. His research interest covers speech signal processing and intelligent information processing. E-mail: lianhaiz@sina.com

    Corresponding author: QIN Chu-Xiong  Ph.D. candidate in the Department of Information and System Engineering, Information Engineering University. His main research interest is intelligent information processing. Corresponding author of this paper. E-mail: chuxiongq313@gmail.com
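  • Abstract: To address the sharp degradation of deep neural network (DNN) feature-based acoustic modeling under low-resource training conditions, two DNN feature extraction methods suited to low-resource speech recognition are proposed. First, based on a network structure with shared-hidden-layer training, a relatively resource-rich corpus is used to assist the training of a deep bottleneck neural network. Because the BN layer lies within the shared layers, Dropout, Maxout, and rectified linear units are introduced to mitigate the overfitting caused by the irregular distribution of multi-stream training samples, while also reducing the number of network parameters and the training time. Second, to improve DNN feature extraction, a low-dimensional high-level feature extraction technique based on convex non-negative matrix factorization (CNMF) is proposed: a weight matrix of the network is factorized, the resulting basis matrix is used as the weight matrix of the feature layer, and a new low-dimensional feature is extracted from that layer. Experiments on the 1-hour low-resource Czech training corpus of Vystadial 2013 show that, with 26.7 hours of English data used for auxiliary training, Dropout with rectified linear units improves the recognition rate by 7.0% relative to the baseline system; Dropout with Maxout improves it by 12.6% relative to the baseline, reduces the number of network parameters by 62.7% relative to the other systems, and reduces the training time by 25%. The matrix-factorization-based low-dimensional features achieve better recognition rates than bottleneck features (BNF) under both monolingual training and auxiliary training, and under auxiliary training they also outperform the DNN-hidden Markov model recognition system, with improvements ranging from 0.8% to 3.4%.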
  • In recent years, control methods for systems affected by uncertainties and disturbances have attracted considerable research attention [1]-[7]. Compared with other control methods, sliding mode control (SMC) has attracted significant interest owing to its conceptual simplicity, ease of implementation, and robustness to external disturbances and model uncertainties [8]-[10]. SMC is a nonlinear control strategy that forces the closed-loop trajectories onto the switching manifold in finite time using a discontinuous feedback control action. SMC has therefore been widely used in applications such as motion control and process control. However, conventional SMC exhibits severe chattering, which can degrade system performance. Moreover, it guarantees invariance only if the uncertainties and disturbances satisfy the matching condition, and it cannot effectively attenuate mismatched uncertainties and disturbances.

    Note that matched and mismatched uncertainties and disturbances are widespread in practical engineering, for example in power systems [11], electronic systems [12], [13], and motor systems [14]. The sliding motion of traditional SMC is severely affected by mismatched uncertainties and disturbances, and the well-known robustness of SMC no longer holds. Algorithms such as LMI-based control [15], [16], adaptive control [17], [18], backstepping-based control [19], and integral sliding-mode control [20], [21] have been proposed to handle mismatched uncertainties robustly, but at the price of compromised nominal control performance.

    As a practical alternative, disturbance observer-based control has proven promising and effective in compensating for unknown external disturbances and model uncertainties without degrading the existing controller [22], [23]. It can completely remove non-vanishing disturbances from the system as long as they can be estimated accurately [24]. Recently, several authors have introduced a disturbance observer (DOB) into SMC to alleviate the chattering problem and retain the nominal control performance [25]-[29]. The idea is to construct the control law by directly combining the SMC feedback with feedforward compensation based on the disturbance estimate. However, the methods in [25]-[29] apply only to systems with matched uncertainties. A nonlinear extension of the DOB, proposed by W. H. Chen, estimates matched as well as mismatched disturbances [30], [31]. Reference [1] extends a recent result on sliding mode control to general $n$th-order systems with a larger class of mismatched uncertainties by proposing an extended disturbance observer. Reference [13] investigates an extended state observer (ESO)-based SMC approach for pulse-width-modulation-based DC-DC buck converter systems subject to mismatched disturbances; the method achieves better disturbance rejection even when the disturbances do not satisfy the so-called matching condition. The nonlinear disturbance observer (NDOB)-based SMC in [14] is proposed only for mismatched uncertainties; it can ensure system performance and reduce chattering.

    In this paper, to improve the performance of systems affected by mismatched/matched uncertainties and disturbances, a novel nonlinear control scheme is proposed in which SMC is integrated with an NDOB. By fully taking the disturbance estimates into account, a new sliding-mode surface is first designed that is insensitive to both matched and mismatched disturbances. The contributions of this paper are as follows:

    1) The control is proposed for a general $n$th-order system with mismatched/matched uncertainties and disturbances.

    2) The novel sliding surface is extended to a general $n$th-order system and modified to improve system performance without causing a large increase in the control.

    3) The proposed method exhibits the properties of nominal performance recovery and chattering reduction as well as excellent dynamic and static performance as compared with the traditional SMC.

    The paper is organized as follows. Section 2 states the problem of nominal sliding mode controller design for a class of nonlinear systems with mismatched and matched uncertainties and disturbances. Section 3 presents the generalization of the NDOB, the novel sliding surface design, and the stability analysis. Simulation results are presented in Section 4, followed by concluding remarks in Section 5.

    Consider a class of single-input single-output dynamic systems with matched and mismatched uncertainties, depicted by

    $ \begin{equation} \left\{\begin{array}{llllll} & \!\!\!\!\!\!\!{{\dot x}_i} ={x_{i + 1}} + {d_i}(t)\\ & ~\vdots\\ & \!\!\!\!\!\!\!{{\dot x}_n} =a(x) + b(x)u + {d_n}(t)\\ & \!\!\!\!\!\!\!y ={x_1} \end{array} \right. \ i= 1, \ldots, n - 1\ \end{equation} $

    (1)

    where $x = {\left[{{x_1} \ldots {x_n}} \right]^T}$ is the state vector, $u$ is the control input, $y$ is the controlled output, ${d_i}(t)$ and ${d_n}(t)$ are the mismatched and matched uncertainties and disturbances, respectively. $a(x)$ and $b(x)$ denote smooth nominal functions.

    Taking a second-order system as an illustration, we can get the control model of the system:

    $ \begin{equation} \left\{ \begin{array}{lllll} & \!\!\!\!\!\!\!{{\dot x}_1} ={x_2} + {d_1}(t)\\ & \!\!\!\!\!\!\!{{\dot x}_2} = a(x) + b(x)u + {d_2}(t)\\ & \!\!\!\!\!\!\!y ={x_1} \end{array} \right.. \end{equation} $

    (2)

    Assumption 1: The lumped disturbances ${d_1}(t)$, ${d_2}(t)$ in (2) are bounded, with the bound denoted by ${d^*}$.

    The sliding mode surface and control law of traditional SMC are usually designed as follows:

    $ \begin{equation} {s_1} = {x_2} + {k_1}{x_1} \end{equation} $

    (3)

    $ \begin{equation} u = - {b^{ - 1}}(x)\left[{{k_1}{x_2} + {\eta _1}{\rm sgn}(s_1) + a(x)} \right] \end{equation} $

    (4)

    where ${k_1} > 0$ is a design constant, ${\eta _1} > 0$ is the switching gain to be designed. Taking the derivative of (3), and combining (2) and (4) gives

    $ \begin{equation} {\dot s_1} = - {\eta _1}{\rm sgn}(s_1) + {k_1}{d_1}(t) + {d_2}(t). \end{equation} $

    (5)

    The states of (2) will reach the sliding mode surface ${s_1} = 0$ in finite time if the switching gain ${\eta _1}$ in the control law (4) is devised such that ${\eta _1} > {d^*}$ . Once the nominal sliding surface ${s_1} = 0$ is reached, the sliding motion is obtained and given by

    $ \begin{equation} {\dot x_1} + {k_1}{x_1} = {d_1}(t). \end{equation} $

    (6)

    Remark 1: Equation (6) implies that if ${d_1}(t) = 0$, the system states are driven to the desired equilibrium point, so conventional SMC is insensitive to the matched disturbance. However, in the presence of a mismatched disturbance ${d_1}(t) \ne 0$, the state ${x_1}$ is affected by the disturbance and does not converge to zero, even though the control law forces the system states to reach the sliding-mode surface in finite time. This is the essential reason why the nominal SMC design is insensitive to matched disturbances but sensitive to mismatched ones.
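    The steady-state offset described in Remark 1 can be checked numerically. The following minimal sketch integrates (2) under the conventional law (4) with forward Euler. The plant functions $a(x) = -x_2$ and $b(x) = 1$, the constant disturbances, the gains, and the step size are illustrative assumptions, not values from the paper.

```python
def sign(v):
    """Return -1, 0 or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def simulate_conventional_smc(d1=0.5, d2=0.0, k1=2.0, eta1=3.0,
                              x0=(1.0, 0.0), dt=1e-3, t_end=10.0):
    """Forward-Euler simulation of system (2) under surface (3) and law (4).

    a(x) = -x2 and b(x) = 1 are hypothetical nominal functions chosen
    only for illustration; d1 and d2 are constant disturbances.
    """
    x1, x2 = x0
    for _ in range(round(t_end / dt)):
        a, b = -x2, 1.0                           # assumed a(x), b(x)
        s1 = x2 + k1 * x1                         # sliding surface (3)
        u = -(k1 * x2 + eta1 * sign(s1) + a) / b  # control law (4)
        dx1 = x2 + d1                             # mismatched channel
        dx2 = a + b * u + d2                      # matched channel
        x1 += dt * dx1
        x2 += dt * dx2
    return x1, x2
```

    With a constant $d_1$, (6) predicts $x_1 \to d_1/k_1$ (0.25 for the default arguments), and the sketch returns this value up to chattering-level error. Calling it with `d1=0.0` drives $x_1$ to approximately zero instead.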

    In this paper, the matched and mismatched disturbance rejection problem is considered for (1). A novel sliding mode controller based on an NDOB is proposed in two steps. First, an NDOB is employed to estimate the matched and mismatched disturbances. A novel sliding mode controller is then designed for the system based on the disturbance estimates. The control structure for the second-order system is shown in Fig. 1.

    Fig. 1  Control structure of second-order system.

    A nonlinear disturbance observer (NDOB) is adopted in [14] to estimate the disturbance. Consider a class of nonlinear systems with uncertainties and external disturbances:

    $ \begin{equation} \left\{ \begin{array}{ccccc} \dot x =f(x) + {g_1}(x)u + {g_2}(x)d(t)\\ y =h(x)~~~~~~~~~~~~~~~~~~~~~~~~~~ \end{array} \right.. \end{equation} $

    (7)

    An NDOB is used to estimate the compound disturbance of the system and compensate for it accordingly, in order to improve the robustness of the controller. The NDOB is given by

    $ \begin{equation} \left\{ \begin{array}{lllllll} & \!\!\!\!\!\!\!\dot z = - l{g_2}(x)z - l\left[{{g_2}(x)lx + f(x) + {g_1}(x)u} \right]\\ & \!\!\!\!\!\!\!\hat d(t) = z + lx \end{array} \right. \end{equation} $

    (8)

    where $\hat d(t)$ is the estimation vector of the disturbance vector $d(t)$ , and $z$ is the internal state vector of the nonlinear observer. $l$ is the disturbance observer gain matrix to be designed.

    Assumption 2: The derivative of the disturbance in system (1) is bounded and satisfies

    $ \begin{equation*}\mathop {\lim }\limits_{t \to \infty } d_i^{(j)}(t) = 0, \quad \quad i = 1, \ldots, n; j = 1, \ldots, n - 1.\end{equation*} $

    Remark 2: This assumption satisfies the requirements of the simulation sample. An extended disturbance observer is proposed in [1], which can observe a class of systems with mismatched uncertainties and get for the sliding mode surface and $u$ . A nonlinear disturbance observer is designed to observe and get for the sliding mode surface and $u$ in our paper.

    The disturbance estimation error of the NDOB is defined:

    $ \begin{equation} {e_d}(t) = d(t) - \hat d(t). \end{equation} $

    (9)

    The error dynamics can be derived as

    $ \begin{align} & {{{\dot{e}}}_{d}}(t)=\dot{d}(t)-\dot{\hat{d}}(t) \\ & \ \ \ \ \ \ \ \ \ =\dot{d}(t)-\dot{z}-l\left[f(x)+{{g}_{1}}(x)u+{{g}_{2}}(x)d(t) \right]. \\ \end{align} $

    (10)

    Substituting the value of $\dot z$ , error dynamics reduce to

    $ \begin{equation} {\dot e_d}(t) = \dot d(t) - {lg_2}(x){e_d}(t). \end{equation} $

    (11)

    Lemma 1 [7]: Suppose that Assumptions 1 and 2 are satisfied for (7). The disturbance estimate $\hat d(t)$ of the NDOB (8) tracks the disturbance $d$ of (7) asymptotically as long as the observer gain $l$ is chosen such that $lg_2(x) > 0$ holds; that is, the error dynamics ${\dot e_d}(t) + {lg_2}(x){e_d}(t) = 0$ are stable and the estimation error converges to zero asymptotically.
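    The observer (8) and the error dynamics (11) can be illustrated with a scalar example. The sketch below is a minimal forward-Euler simulation under assumed values: $f(x) = -x$, $g_1 = g_2 = 1$, a constant disturbance, a sinusoidal input, and gain $l = 6$. The function name and all numbers are hypothetical choices for illustration.

```python
import math


def simulate_ndob(d=1.5, l=6.0, dt=1e-3, t_end=2.0):
    """Estimate a constant disturbance d with the NDOB (8), scalar case.

    Plant (7): dx/dt = f(x) + g1*u + g2*d with assumed f(x) = -x and
    g1 = g2 = 1. Returns the final estimate d_hat = z + l*x.
    """
    g1, g2 = 1.0, 1.0
    x, z = 0.0, 0.0                  # d_hat(0) = z + l*x = 0
    for k in range(round(t_end / dt)):
        u = math.sin(k * dt)         # arbitrary known input
        f = -x                       # assumed nominal dynamics
        dz = -l * g2 * z - l * (g2 * l * x + f + g1 * u)  # observer (8)
        dx = f + g1 * u + g2 * d     # true plant; d is unknown to (8)
        z += dt * dz
        x += dt * dx
    return z + l * x
```

    For a constant disturbance, (11) reduces to $\dot e_d = -lg_2 e_d$, so the estimation error decays exponentially at rate $lg_2$ regardless of the input $u$. After 2 s with $l = 6$ the returned estimate is very close to the true value of 1.5.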

    Consider the system

    $ \begin{equation} \dot x = f(t, x, u) \end{equation} $

    (12)

    where $f$ is piecewise continuous in $t$ and locally Lipschitz in $x$ and $u$. The input $u(t)$ is a piecewise continuous, bounded function of $t$ for all $t \ge {0}$.

    Definition 1: The system (12) is said to be input-to-state stable (ISS) if there exist a class $\mathcal{KL}$ function $\beta$ and a class $\mathcal{K}$ function $\gamma $ such that for any initial state $x({t_0})$ and any bounded input $u(t)$, the solution $x(t)$ exists for all $t \ge {t_0}$ and satisfies

    $ \begin{equation} \left\| {x(t)} \right\| \le \beta (\left\| {x({t_0})} \right\|, t- {t_0}) + \gamma (\mathop {\sup }\limits_{{t_0} \le \tau \le t} \left\| {u(\tau )} \right\|). \end{equation} $

    (13)

    Such a function $\gamma$ in (13) is referred to as an ISS-gain for (12). ISS implies that (12) is bounded-input bounded-state stable when $u \ne 0$ and its zero solution (with $u = 0$ ) is globally asymptotically stable.
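    As a concrete instance of Definition 1 (a standard textbook example, not taken from the paper), consider the scalar system $\dot x = -kx + u$ with $k > 0$, which has the same form as the sliding motion (6). Solving it explicitly and bounding the convolution term gives

    $ \begin{equation*} \left| {x(t)} \right| \le {e^{ - k(t - {t_0})}}\left| {x({t_0})} \right| + \frac{1}{k}\mathop {\sup }\limits_{{t_0} \le \tau \le t} \left| {u(\tau )} \right|, \end{equation*} $

    so (13) holds with $\beta (r, t) = r{e^{ - kt}}$ and $\gamma (r) = r/k$.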

    Lemma 2 [14]: Consider a nonlinear system $\dot x = F(x, w)$, which is input-to-state stable (ISS). If the input satisfies $\mathop {\lim }_{t \to \infty } w(t) = 0$, then the state satisfies $\mathop {\lim }_{t \to \infty } x(t) = 0$.

    A novel sliding mode manifold for (2) under matched and mismatched disturbances is designed as

    $ \begin{equation} {s_2} = {x_2} + {k_2}{x_1} + {\hat d_1}(t). \end{equation} $

    (14)

    Theorem 1: Consider system (2) with matched and mismatched disturbances and the proposed sliding-mode surface (14). Let the control law be designed as

    $ \begin{align} u = & - {b^{ - 1}}(x)\Big\{ {k_2}\left[{{x_2} + {{\hat d}_1}(t)} \right] + {\eta _2}{\rm sgn}({s_2})\nonumber\\ & + a(x) + {{\hat d}_2}(t) + {{\dot {\hat d}_1}}(t) \Big\}. \end{align} $

    (15)

    Suppose that the second-order system satisfies Assumptions 1 and 2 and that the observer gain $l$ is chosen such that $l{g_2}(x) > 0$ holds. If the switching gain is chosen such that ${\eta _2} > {k_2}{e_{d1}} + {e_{d2}}$ and ${k_2} > 0$, then the closed-loop system is asymptotically stable.

    Proof: Consider a candidate Lyapunov function as

    $ \begin{equation} {V_1} = \frac{1}{2}s_2^T{s_2}. \end{equation} $

    (16)

    Taking the derivative of ${V_1}$ in (16), we obtain

    $ \begin{equation} {\dot V_1} = {s_2}{\dot s_2} = {s_2}\left({\dot x_2} + {k_2}{\dot x_1} + \dot{\hat{d}}_1 (t)\right). \end{equation} $

    (17)

    Substituting (15) into (2) yields

    $ \begin{align} {{\dot x}_2} = & a(x) + b(x)u + {d_2}(t)\nonumber\\ = & a(x) - b(x) \times {b^{ - 1}}(x)\Big\{ {k_2}\left[{{x_2} + {{\hat d}_1}(t)} \right] + {\eta _2}{\rm sgn}({s_2})\nonumber\\ & + a(x) + {{\hat d}_2}(t) + {{\dot {\hat d}_1}(t)}\Big\} + {d_2}(t). \end{align} $

    (18)

    Substituting (18) into (17) yields

    $ \begin{align} {\dot V}_1= & {s_2} \Big\{ - {k_2}\left[{{x_2} + {{\hat d}_1}(t)} \right] - {\eta _2}{\rm sgn}({s_2}) - {{\hat d}_2}(t)\nonumber\\ & - {{\dot {\hat d}_1}(t)} + {d_2}(t) + {k_2}{x_2} + {k_2}{d_1}(t) + {{\dot {\hat d}_1}(t)} \Big\}\nonumber\\[1mm] = & {s_2} \Big\{ - {k_2}{{\hat d}_1}(t) - {\eta _2}{\rm sgn}({s_2}) - {{\hat d}_2}(t) + {d_2}(t) + {k_2}{d_1}(t) \Big\}\nonumber\\ = & {s_2}\left( { - {\eta _2}{\rm sgn}({s_2}) + {k_2}{e_{d1}} + {e_{d2}}} \right)\nonumber\\[1mm] \le & \left[{-{\eta _2} + {k_2}{e_{d1}} + {e_{d2}}} \right]\left| {{s_2}} \right|\nonumber\\[1mm] = & - \sqrt 2 \left[{{\eta _2}-({k_2}{e_{d1}} + {e_{d2}})} \right]V_1^{\frac{1}{2}} \end{align} $

    (19)

    where ${e_{d1}} = {d_1} - {\hat d_1}$ and ${e_{d2}} = {d_2} - {\hat d_2}$.

    It can be derived from (19) that the system states will reach the defined sliding surface ${s_2} = 0$ in finite time when ${\eta _2} > {k_2}{e_{d1}} + {e_{d2}}$. The condition ${s_2} = 0$ implies

    $ \begin{equation} {\dot x_1} = - {k_2}{x_1} + {e_{d1}}. \end{equation} $

    (20)

    With this result, it can be derived that (20) is ISS. According to Lemma 2, we know that the system states satisfy and . This implies that the system states can converge to the desired equilibrium point along the sliding surface asymptotically under the proposed control law.

    Remark 3: Since the matched and mismatched disturbances are estimated by the NDOB, the switching gain of the proposed method can be designed to be much smaller than that of traditional SMC, because the magnitude of the estimation error ${e_d}$ is much smaller than the magnitude of the disturbance $d$. The chattering problem can therefore be alleviated to some extent while the nominal performance of the sliding mode control is retained.
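    The design in Theorem 1 can be sketched in simulation as follows. This is a minimal forward-Euler sketch, assuming the hypothetical plant $a(x) = -x_2$, $b(x) = 1$, constant disturbances, one scalar NDOB (8) per channel with $g_2 = 1$, and a backward difference as a stand-in for ${\dot {\hat d}_1}(t)$ in (15). None of these choices or numerical values come from the paper.

```python
def sign(v):
    """Return -1, 0 or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def simulate_ndob_smc(d1=0.5, d2=0.3, k2=2.0, eta2=3.0, l=6.0,
                      x0=(1.0, 0.0), dt=1e-3, t_end=10.0):
    """System (2) under surface (14) and law (15), with NDOB estimates.

    Assumptions: a(x) = -x2, b(x) = 1, constant d1 and d2, and
    d1_hat_dot approximated by a backward difference of d1_hat.
    Returns (x1, x2, d1_hat, d2_hat) at t_end.
    """
    x1, x2 = x0
    z1, z2 = -l * x1, -l * x2        # both estimates start at zero
    prev_d1_hat = 0.0
    for _ in range(round(t_end / dt)):
        a, b = -x2, 1.0              # assumed a(x), b(x)
        d1_hat = z1 + l * x1
        d2_hat = z2 + l * x2
        d1_hat_dot = (d1_hat - prev_d1_hat) / dt  # assumed approximation
        prev_d1_hat = d1_hat
        s2 = x2 + k2 * x1 + d1_hat                # sliding surface (14)
        u = -(k2 * (x2 + d1_hat) + eta2 * sign(s2)
              + a + d2_hat + d1_hat_dot) / b      # control law (15)
        # Observer (8) per channel: f1 = x2, f2 = a(x), g1 = (0, b)
        dz1 = -l * z1 - l * (l * x1 + x2)
        dz2 = -l * z2 - l * (l * x2 + a + b * u)
        dx1 = x2 + d1
        dx2 = a + b * u + d2
        z1 += dt * dz1
        z2 += dt * dz2
        x1 += dt * dx1
        x2 += dt * dx2
    return x1, x2, z1 + l * x1, z2 + l * x2
```

    In this sketch $x_1$ settles near zero despite the nonzero mismatched disturbance, in line with (20), while $x_2$ settles near $-d_1$ so that ${\dot x_1} = {x_2} + {d_1}$ vanishes. The switching gain only has to dominate the estimation errors rather than the disturbances themselves.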

    Consider the following third-order system, depicted by

    $ \begin{equation} \left\{ \begin{array}{llll} {{\dot x}_1} = {x_2} + {d_1}(t)\\ {{\dot x}_2} = {x_3} + {d_2}(t)\\ {{\dot x}_3} = a(x) + b(x)u + {d_3}(t)\\ y = {x_1}. \end{array} \right. \end{equation} $

    (21)

    A sliding mode manifold for (21) is designed as

    $ \begin{equation} {s_3} = {x_3} + {x_2} + {k_3}{x_1} + {\hat d_1}(t) + {\hat d_2}(t) + \dot{\hat{d}}_1(t). \end{equation} $

    (22)

    Theorem 2: Consider system (21) with matched and mismatched disturbances and the proposed sliding-mode surface (22). Let the control law be designed as

    $ \begin{align} u= & - {b^{ - 1}}(x)\Big\{ {k_3}\left[{{x_2}+ {{\hat d}_1}(t)} \right] + {\eta _3}{\rm sgn}({s_3}) + {x_3} + a(x)\nonumber\\ & + {{\hat d}_2}(t)+ {{\hat d}_3}(t) + {{\dot {\hat d}_1}(t)}+{{\dot {\hat d}_2}(t)}+{{\ddot {\hat d}_1}(t)} \Big\}. \end{align} $

    (23)

    Suppose that the third-order system satisfies Assumptions 1 and 2 and that the observer gain $l$ is chosen such that $l{g_2}(x) > 0$ holds. If the switching gain is chosen such that ${k_3} > 0$ and ${\eta _3} > {k_3}{e_{d1}} + {e_{d2}} + {e_{d3}}$, then the closed-loop system is asymptotically stable.

    A sliding mode manifold for (1) in the case $n > 2$ is designed as

    $\begin{equation} {s_n} = \sum\limits_{i = 2}^n {{x_i}} + {k_n}{x_1} + \sum\limits_{j = 1}^{n - 1} {\sum\limits_{i = 1}^{n - j} {\hat d_i^{(j - 1)}(t)} }. \end{equation} $

    (24)

    Theorem 3: Consider system (1) with matched and mismatched disturbances and the proposed sliding-mode surface (24). Let the control law be designed as

    $ \begin{align} u = & - {b^{ - 1}}(x)\Big\{ {k_n}\left[{{x_2} + {{\hat d}_1}(t)} \right] + {\eta _n}{\rm sgn}({s_n}) + \sum\limits_{i = 3}^n {{x_i}} \nonumber\\ & + a(x) + \sum\limits_{i = 2}^n {{{\hat d}_i}(t)} + \sum\limits_{j = 1}^{n - 1} {\sum\limits_{i = 1}^{n - j} {\hat d_i^{(j)}(t)} } \Big\}. \end{align} $

    (25)

    Suppose that the general high-order system satisfies Assumptions 1 and 2 and that the observer gain $l$ is chosen such that $l{g_2}(x) > 0$ holds. If the switching gain is chosen such that ${\eta _n} > {k_n}{e_{d1}} + \sum\limits_{i = 2}^n {{e_{di}}}$ and ${k_n} > 0$, then the closed-loop system is asymptotically stable.

    Proof: Consider a candidate Lyapunov function as

    $ \begin{equation} {V_n} = \frac{1}{2}s_n^T{s_n}. \end{equation} $

    (26)

    Taking the derivative of ${V_n}$ in (26), we obtain

    $ \begin{align} {{\dot V}_n} = & {s_n}{{\dot s}_n} = {s_n}\big(\sum\limits_{i = 2}^n {{{\dot x}_i}} + {k_n}{{\dot x}_1} + \sum\limits_{j = 1}^{n - 1} {\sum\limits_{i = 1}^{n - j} {\hat d_i^{(j)}(t)} } \big) \nonumber\\ = & {s_n}\Big[ {x_3} + {d_2} + {x_4} + {d_3} + \cdots + {x_n} + {d_{n-1}} + a(x)\nonumber\\ & + b(x)u + {d_n}(t) + {k_n}{x_2} + {k_n}{d_1} + \sum\limits_{j = 1}^{n-1} {\sum\limits_{i = 1}^{n-j} {\hat d_i^{(j)}(t)} } \Big]. \end{align} $

    (27)

    Substituting (25) into (27) yields

    $\begin{align}{{\dot V}_n}= & {s_n}\Bigg\{ {x_3} + {d_2} + {x_4} + {d_3} + \cdots + {x_n} + {d_{n - 1}} + a(x)\nonumber\\ & - b(x)\!\! \times\!\! {b^{ - 1}}(x)\bigg\{ {k_n}\left[{{x_2} + {{\hat d}_1}(t)} \right] + {\eta _n}{\rm sgn}({s_n})\nonumber\\ & + \sum\limits_{i = 3}^n {{x_i}} + a(x) + \sum\limits_{i = 2}^n {{{\hat d}_i}(t)} + \sum\limits_{j = 1}^{n - 1} {\sum\limits_{i = 1}^{n - j} {\hat d_i^{(j)}(t)} } \bigg\}\nonumber\\ & + {d_n}(t) + {k_n}{x_2} + {k_n}{d_1} + \sum\limits_{j = 1}^{n - 1} {\sum\limits_{i = 1}^{n - j} {\hat d_i^{(j)}(t)} } \Bigg\}\nonumber\\ = & {s_n}\Bigg[{-{\eta _n}{\rm sgn}({s_n}) + {k_n}{e_{d1}} + \sum\limits_{i = 2}^n {{e_{di}}} } \Bigg]\nonumber\\ \le & \left| {{s_n}} \right|( - {\eta _n} + {k_n}{e_{d1}} + \sum\limits_{i = 2}^n {{e_{di}}}). \end{align} $

    (28)

    It can be derived from ${\dot V_n}$ in (28) that the system states will reach the defined sliding surface ${s_n} = 0$ in finite time when ${\eta _n} > {k_n}{e_{d1}} + \sum\limits_{i = 2}^n {{e_{di}}}$. The condition ${s_n} = 0$ implies:

    $ \begin{equation} x_1^{(n - 1)} + x_1^{(n - 2)} + \ldots + {\dot x_1} + {k_n}{x_1} - \sum\limits_{j = 1}^{n - 1} {\sum\limits_{i = 1}^{n - j} {e_{di}^{(j - 1)}}} = 0. \end{equation} $

    (29)

    We can define

    $ \begin{equation} \left\{ {\begin{array}{*{20}{lll}} {{Y_1} = {x_1}}\\ {{Y_2} = {{\dot x}_1}}\\ ~~~~\vdots \\ {{Y_{n - 1}} = x_1^{(n - 2)}} \end{array}} \right.. \end{equation} $

    (30)

    With (30), Equation (29) can be written as

    $ \begin{equation} \dot Y = \left[ {\begin{array}{*{20}{c}} 0 & 1 & 0 & \cdots & 0\\ 0 & 0 & 1 & \cdots & 0\\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1\\ {-{k_n}} & {-1} & {-1} & \cdots & { - 1} \end{array}} \right]Y + \left[{\begin{array}{*{20}{c}} 0\\ 0\\ \vdots \\ 0\\ {\sum\limits_{j = 1}^{n-1} {\sum\limits_{i = 1}^{n-j} {e_{di}^{(j-1)}} } } \end{array}} \right]. \end{equation} $

    (31)

    Equation (31) can also be expressed as

    $ \begin{equation} \dot Y = AY + B\sum\limits_{j = 1}^{n - 1} {\sum\limits_{i = 1}^{n - j} {e_{di}^{(j - 1)}} } \end{equation} $

    (32)

    where $A$ is the $(n - 1) \times (n - 1)$ matrix in (31) and $B = {\left[ {0\; \cdots \;0\;1} \right]^T}$.

    In view of Assumption 1 and (11), we can conclude:

    $ \begin{equation} \begin{array}{l} \sum\limits_{j = 1}^{n - 1} {\sum\limits_{i = 1}^{n - j} {e_{di}^{(j - 1)}} } = \sum\limits_{i = 1}^{n - 1} {{e_{di}}} + \sum\limits_{i = 1}^{n - 2} {({{\dot d}_i} - l{g_2}{e_{di}})} \\ + \sum\limits_{i = 1}^{n - 3} {({{\ddot d}_i} - l{g_2}{{\dot d}_i} + {l^2}g_2^2{e_{di}})} + \cdots \\ + \sum\limits_{i = 1}^2 {\left( d_i^{(n - 3)} - l{g_2}d_i^{(n - 4)} + {l^2}g_2^2d_i^{(n - 5)} - \cdots + {( - 1)^{n - 3}}{l^{n - 3}}g_2^{n - 3}{e_{di}} \right)} \\ + \left( d_1^{(n - 2)} - l{g_2}d_1^{(n - 3)} + {l^2}g_2^2d_1^{(n - 4)} - \cdots + {( - 1)^{n - 2}}{l^{n - 2}}g_2^{n - 2}{e_{d1}} \right). \end{array} \end{equation} $

    (33)

    Combining Lemma 1, (11), and (33) gives

    $ \begin{equation} \mathop {\lim }\limits_{t \to \infty } \sum\limits_{j = 1}^{n - 1} {\sum\limits_{i = 1}^{n - j} {e_{di}^{(j - 1)}} } = 0. \end{equation} $

    (34)

    According to the Hurwitz criterion, $A$ can be made a Hurwitz matrix by selecting a proper ${k_n}$. Taking $\sum\nolimits_{j = 1}^{n - 1} {\sum\nolimits_{i = 1}^{n - j} {e_{di}^{(j - 1)}} }$ as the input of (32), it follows that (32) is ISS. By (34) and Lemma 2, the state of (32) converges to zero as $t \to \infty $, that is, $\mathop {\lim }\limits_{t \to \infty } Y(t) = 0$. This implies that the system states converge to zero along the sliding surface asymptotically under the proposed control law.
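    Whether a given ${k_n}$ makes $A$ Hurwitz can be checked from its characteristic polynomial, which for the companion form in (31) is ${\lambda ^{n - 1}} + {\lambda ^{n - 2}} + \cdots + \lambda + {k_n}$. The sketch below applies the standard Routh-Hurwitz test to that polynomial; the function names are hypothetical, and a zero in the first column of the Routh array is simply treated as "not Hurwitz".

```python
def sliding_polynomial(n, k_n):
    """Coefficients, highest power first, of
    lambda^(n-1) + lambda^(n-2) + ... + lambda + k_n."""
    return [1.0] * (n - 1) + [float(k_n)]


def is_hurwitz(coeffs):
    """Routh-Hurwitz test: True if all roots have negative real part.

    coeffs lists the polynomial coefficients, highest power first.
    All first-column entries of the Routh array must be positive.
    """
    if coeffs[0] <= 0:
        return False
    width = (len(coeffs) + 1) // 2
    prev2 = list(coeffs[0::2])       # first row of the Routh array
    prev1 = list(coeffs[1::2])       # second row of the Routh array
    prev2 += [0.0] * (width - len(prev2))
    prev1 += [0.0] * (width - len(prev1))
    for _ in range(len(coeffs) - 2):
        if prev1[0] <= 0:            # zero or negative pivot
            return False
        new = [(prev1[0] * prev2[j + 1] - prev2[0] * prev1[j + 1]) / prev1[0]
               for j in range(width - 1)]
        new.append(0.0)
        prev2, prev1 = prev1, new
    return prev1[0] > 0
```

    For example, `is_hurwitz(sliding_polynomial(3, 8.0))` is true, since ${\lambda ^2} + \lambda + 8$ has both roots in the left half-plane. For $n = 4$ the same test passes only for $0 < {k_4} < 1$, which illustrates why ${k_n}$ has to be selected with care.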

    Remark 4: The above proof implies that states of system can be driven to the desired equilibrium point and the control law can force the system states to reach the sliding-mode surface in finite time. This is the main reason why the proposed NDOB-based SMC method is insensitive to matched uncertainties and disturbances as well as mismatched uncertainties and disturbances.

    To evaluate the effectiveness of the proposed method, two examples are given below.

    $ \begin{equation} \left\{ \begin{array}{l} {{\dot x}_1} = {x_2} + {d_1}(t)\\ {{\dot x}_2} = {x_3} + {d_2}(t)\\ {{\dot x}_3} = - 2{x_2} - {x_3} + {e^{{x_1}}} + u + {d_3}(t)\\ y = {x_1}. \end{array} \right. \end{equation} $

    (35)

    In order to show the advantages of the NDOB-based SMC method proposed in this paper compared with the nominal sliding control, we will use simulations to compare the performance between them for (35). The control parameters of the two control methods are listed in Table Ⅰ. Consider the initial states of (35) as . Mismatched/matched uncertainties and disturbances ${d_1} = 1.5, {d_2} = 0.5, {d_3} = 1$ are imposed on the system at 25, 30, 15 sec respectively. The simulation results are shown in Figs. 2-4.

    Table Ⅰ  Control Parameters for the Numerical Example in Case 1

    | Controller | Parameters |
    | --- | --- |
    | SMC1 | k = 8, η = 16 |
    | SMC2 | k = 8, η = 10 |
    | NDOB-SMC | k = 8, η = 10, l = diag{6, 6, 6} |

    Fig. 2  System state variables.
    Fig. 3  Control signal of system.
    Fig. 4  Reference/estimation uncertainties and disturbances.

    The reference/estimation uncertainties and disturbances of the system are shown in Fig. 4. It can be seen from Fig. 4 that the proposed control gives a better estimation of the disturbance and a better performance.

    In $d-q$ coordinates the permanent magnet synchronous motor (PMSM) dynamic equation can be written as

    $ \begin{equation} \dot \omega = - a\omega + b{i_q} - d \end{equation} $

    (36)

    where $a = B/J$ , $b = 3p{\psi _f}/{(2J)}$ , $d = T_L/J$ . $\omega$ is the rotor angular velocity, $p$ is the number of pole pairs, ${\psi _f}$ is the flux linkage, ${T_L}$ is the load torque, $B$ is the viscous friction coefficient, $J$ is the moment of inertia.

    The speed error ${x_1} = {\omega _{\rm ref}} - \omega$ is defined as the state variable, and its derivative is

    $ \begin{equation} {\dot x_1} = {x_2} = {\dot \omega _{\rm ref}} - \dot \omega = {\dot \omega _{\rm ref}} + a\omega - b{i_q} + d \end{equation} $

    (37)

    where ${\omega _{\rm ref}}$ is the reference speed signal.

    The speed error derivative dynamic equation of the motor can be expressed as follows, with the parameters variations taken into account:

    $ \begin{eqnarray} {{\dot x}_1} & = & {{\dot \omega }_{\rm ref}} + a\omega + \Delta a\omega - b{i_q} - \Delta b{i_q} + d + \Delta d\nonumber\\ & = & {x_2} + \Delta a\omega - \Delta b{i_q} + \Delta d\nonumber\\ & = & {x_2} + {d_1} \end{eqnarray} $

    (38)

    where $\Delta a$, $\Delta b$, and $\Delta d$ are the parameter variations of $a$, $b$, and $d$, and ${d_1} = \Delta a\omega - \Delta b{i_q} + \Delta d$ is regarded as the mismatched uncertainty.

    The second-order model of speed error derivative dynamic equation of PMSM system is described by

    $ \begin{equation} {\dot x_2} = {\ddot \omega _{\rm ref}} - \ddot \omega = {\ddot \omega _{\rm ref}} + a\dot \omega - b{\dot i_q} + \dot d. \end{equation} $

    (39)

    ${\dot x_2}$ can be expressed as follows, with the parameters variations taken into account:

    $ \begin{eqnarray} {{\dot x}_2} & = & {{\ddot \omega }_{\rm ref}} + a\dot \omega + \Delta a\dot \omega - b{{\dot i}_q} - \Delta b{{\dot i}_q} + \dot d + \Delta \dot d\nonumber\\ & = & - a{x_2} - b{{\dot i}_q} + {d_2}\nonumber\\ & = & - a{x_2} - bu + {d_2} \end{eqnarray} $

    (40)

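    where $u = {\dot i_q}$ and ${d_2} = {\ddot \omega _{\rm ref}} + a{\dot \omega _{\rm ref}} + \Delta a\dot \omega - \Delta b{\dot i_q} + \dot d + \Delta \dot d$.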

    The second-order model of the PMSM speed regulation system can be represented in the following state-space form:

    $ \begin{equation} \left\{ \begin{array}{l} {{\dot x}_1} = {x_2} + {d_1}\\ {{\dot x}_2} = - a{x_2} - bu + {d_2} \end{array} \right.. \end{equation} $

    (41)

    To demonstrate the efficiency of the proposed method, simulation studies are carried out in this section. The parameters of the PMSM are as follows: ${R_s} = 1.62\ \Omega $, ${L_d} = {L_q} = 0.005$ H, $B = 7.403 \times {10^{ - 5}}$ N$\cdot$m$\cdot$s/rad, $J = 1.74 \times {10^{ - 4}}$ kg$\cdot$m$^2$, ${\psi _f} = 1.608$ Wb, $p = 2$. The control parameters of the two control methods are listed in Table Ⅱ. The simulation results are shown in Figs. 5-8.
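    For orientation, the model coefficients in (36) can be evaluated from the listed motor parameters. The short sketch below does only that arithmetic; the function name is hypothetical, and the load torque of 3 N$\cdot$m is the value used in the variable-load test.

```python
def pmsm_coefficients(B, J, psi_f, p, T_L):
    """Return (a, b, d) of (36): a = B/J, b = 3*p*psi_f/(2*J), d = T_L/J."""
    a = B / J
    b = 3 * p * psi_f / (2 * J)
    d = T_L / J
    return a, b, d


# Motor parameters as listed in the text (SI units).
a, b, d = pmsm_coefficients(B=7.403e-5, J=1.74e-4, psi_f=1.608, p=2, T_L=3.0)
```

    With these values $a$ is roughly 0.43 s$^{-1}$ while $b$ and $d$ are of order $10^4$, which is consistent with the large gains listed in Table Ⅱ.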

    Table Ⅱ  Control Parameters for the Numerical Example in Case 2

    | Controller | Parameters |
    | --- | --- |
    | SMC1 | k = 8000, η = 950 |
    | NDOB-SMC | k = 8000, η = 950, l = diag{6, 6} |

    Fig. 5  Variable speed curve.
    Fig. 6  Speed curve at variable load.
    Fig. 7  $i_q$ response curve.
    Fig. 8  Speed curve under variable parameter.

    Fig. 5 shows the variable-speed response: the reference speed changes from 20 rad/s to 40 rad/s at 5 s, and from 40 rad/s to -20 rad/s at 10 s. The proposed method exhibits a faster response and a smoother transition than the nominal SMC method.

    An unknown load torque ${T_L} = 3$ N$\cdot$m is applied to the PMSM at 10 s and removed at 15 s. The response curves of the rotor speed and the $q$-axis current are shown in Figs. 6 and 7. The proposed control method achieves good tracking performance in the presence of unknown external load torque variations.

    The response curves of the PMSM under the proposed method in the presence of mechanical parameter variations are shown in Fig. 8. The moment of inertia is changed from its nominal value to $2J$ at 10 s and to $1.5J$ at 15 s. Fig. 8 shows that the proposed method is insensitive to mismatched uncertainty and robust, whereas the nominal SMC method is sensitive to mismatched uncertainty.

    In this paper, the rejection of mismatched/matched uncertainties and disturbances has been studied for second-, third-, and higher-order systems. A novel NDOB-based SMC approach has been proposed. The controller not only gives the closed-loop states better tracking performance, but also enhances disturbance attenuation and system robustness. The proposed method exhibits nominal performance recovery and chattering reduction compared with the nominal SMC. Simulation results demonstrate the effectiveness of the proposed method.


  • Associate editor responsible for this paper: JIA Jia
  • Fig. 1  SHL based network structures

    Fig. 2  Diagram of SHL-BN-MDNN training scheme

    Fig. 3  CNMF based low-dimensional feature extraction approach

    Fig. 4  WER of CNMF based low-dimensional features under different factorization parameters

    Table 1  WER of BNF based on different training methods (%)

    | Training scheme | WER | DNN parameters (MB) |
    | --- | --- | --- |
    | Monolingual BNF | 67.42 | 3.57 |
    | SHL + BNF | 63.25 | 8.34 |
    | SHL + Dropout + Maxout + BNF | 58.95 | 3.11 |
    | SHL + Dropout + ReLU + BNF | 62.74 | 8.34 |
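    The relative figures quoted in the abstract can be reproduced from the Table 1 entries. The sketch below is simple arithmetic on those values; the function and dictionary key names are hypothetical.

```python
def relative_reduction(baseline, value):
    """Fractional reduction of `value` with respect to `baseline`."""
    return (baseline - value) / baseline


# WER (%) and DNN parameter size (MB) taken from Table 1.
wer = {"mono_bnf": 67.42, "shl_bnf": 63.25,
       "shl_dropout_maxout_bnf": 58.95, "shl_dropout_relu_bnf": 62.74}
params_mb = {"shl_bnf": 8.34, "shl_dropout_maxout_bnf": 3.11}

maxout_gain = relative_reduction(wer["mono_bnf"], wer["shl_dropout_maxout_bnf"])
relu_gain = relative_reduction(wer["mono_bnf"], wer["shl_dropout_relu_bnf"])
param_cut = relative_reduction(params_mb["shl_bnf"],
                               params_mb["shl_dropout_maxout_bnf"])
```

    These evaluate to about 12.6% (Dropout + Maxout), about 7% (Dropout + ReLU), and about 62.7% (parameter reduction), matching the relative improvements reported for the monolingual BNF baseline.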

    Table 2  WER under different dropout and maxout parameters (%)

    | Pooling size | HDF = 0.1, BN-DF = 0 | HDF = 0.1, BN-DF = 0.1 | HDF = 0.2, BN-DF = 0 | HDF = 0.2, BN-DF = 0.2 | HDF = 0.3, BN-DF = 0 | HDF = 0.3, BN-DF = 0.3 |
    | --- | --- | --- | --- | --- | --- | --- |
    | 342×3 (40×3) | 59.72 | 61.14 | 58.95 | 60.32 | 60.13 | 61.5 |

    Pooling size 512×2 (40×2), three settings reported: 62.11, 60.77, 61.89
    Pooling size 256×4 (40×4), three settings reported: 61.23, 60.36, 61.84

    Table 3  Recognition performance (WER) of each feature type under monolingual training (%)

    | Recognition task | BNF | CNMF low-dimensional feature | SVD low-dimensional feature |
    | --- | --- | --- | --- |
    | Low-resource Vystadial_en | 21.6 | 20.6 | 21.51 |
    | Low-resource Vystadial_cz | 64.8 | 63.76 | 64.43 |

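    Table 4  WER of CNMF-based low-dimensional features with SHL multilingual training (%)

    | CNMF feature extraction scheme | Layer 3 | Layer 4 | Layer 5 |
    |---|---|---|---|
    | Sigmoid + 40-dimensional factorization | 64.27 | 64.94 | 64.71 |
    | Sigmoid + 50-dimensional factorization | 63.86 | 63.81 | 64.99 |
    | Dropout + Maxout + 40-dimensional factorization | 60.33 | 60.13 | 59.59 |
    | Dropout + Maxout + 50-dimensional factorization | 59.59 | 59.12 | 59.95 |
    | Dropout + ReLU + 40-dimensional factorization | 63.71 | 61.59 | 61.28 |
    | Dropout + ReLU + 50-dimensional factorization | 62.15 | 60.26 | 61.84 |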

    Table 5  WER of BNF and CNMF-based low-dimensional features on the GMM tandem system (%)

    | Experimental configuration | BNF | CNMF low-dimensional features |
    |---|---|---|
    | Vystadial_en (monolingual fMLLR) + Sigmoid-DNN | 21.6 | 20.6 |
    | Vystadial_cz (monolingual fMLLR) + Sigmoid-DNN | 64.8 | 63.76 |
    | Vystadial_cz (monolingual fbanks) + Sigmoid-DNN | 63.25 | 63.81 |
    | Vystadial_cz (monolingual fbanks) + Dropout-maxout-DNN | 58.95 | 59.12 |
    | Vystadial_cz (monolingual fbanks) + Dropout-ReLU-DNN | 62.74 | 60.26 |

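    Table 6  WER of SGMM tandem systems and DNN-HMM hybrid systems based on SHL multilingual training (%)

    | DNN hidden-layer structure | BNF | CNMF low-dimensional features | DNN-HMM |
    |---|---|---|---|
    | Sigmoid, 5 layers × 1024 (BN: 40) | 63.15 | 61.79 | 63.94 |
    | Sigmoid, 3 layers × 1024 (BN: 40) | 63.09 | 61.85 | 63.99 |
    | Sigmoid, 3 layers × 512 (BN: 40) | 63.5 | 61.84 | 63.96 |
    | Dropout + Maxout, 5 layers × 342 (*3, BN: 40) | 58.03 | 57.8 | 58.24 |
    | Dropout + Maxout, 3 layers × 342 (*3, BN: 40) | 60.61 | 60.4 | 63.99 |
    | Dropout + Maxout, 3 layers × 171 (*3, BN: 40) | 62.61 | 64.72 | 68.77 |
    | Dropout + ReLU, 5 layers × 1024 (BN: 40) | 60.72 | 58.82 | 59.57 |
    | Dropout + ReLU, 3 layers × 1024 (BN: 40) | 64.35 | 59.16 | 59.92 |
    | Dropout + ReLU, 3 layers × 512 (BN: 40) | 63.43 | 61.68 | 62.2 |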
  • [1] Thomas S. Data-driven Neural Network Based Feature Front-ends for Automatic Speech Recognition[Ph.D. dissertation], Johns Hopkins University, Baltimore, USA, 2012.
    [2] Grézl F, Karaát M, Kontár S, Černocký J. Probabilistic and bottle-neck features for LVCSR of meetings. In:Proceedings of the 2007 International Conference on Acoustics, Speech and Signal Processing (ICASSP). Hawaii, USA:IEEE, 2007. 757-760
    [3] Yu D, Seltzer M L. Improved bottleneck features using pretrained deep neural networks. In:Proceedings of the 12th Annual Conference of the International Speech Communication Association (INTERSPEECH). Florence, Italy:Curran Associates, Inc., 2011. 237-240
    [4] Bao Y B, Jiang H, Dai L R, Liu R. Incoherent training of deep neural networks to de-correlate bottleneck features for speech recognition. In:Proceedings of the 2013 International Conference on Acoustics, Speech, and Signal Processing (ICASSP). Vancouver, BC, Canada:IEEE, 2013. 6980-6984
    [5] Hinton G E, Deng L, Yu D, Dahl D E, Mohamed A R, Jaitly N, Senior A, Vanhoucke V, Nguyen P, Sainath T N, Kingsbury B. Deep neural networks for acoustic modeling in speech recognition:the shared views of four research groups. IEEE Signal Processing Magazine, 2012, 29 (6):82-97 doi: 10.1109/MSP.2012.2205597
    [6] Lal P, King S. Cross-lingual automatic speech recognition using tandem features. IEEE Transactions on Audio, Speech, and Language Processing, 2013, 21 (12):2506-2515 doi: 10.1109/TASL.2013.2277932
    [7] Veselý K, Karafiát M, Grézl F, Janda M, Egorova E. The language-independent bottleneck features. In:Proceedings of the 2012 IEEE Spoken Language Technology Workshop (SLT). Miami, Florida, USA:IEEE, 2012. 336-341
    [8] Tüske Z, Pinto J, Willett D, Schlüter R. Investigation on cross-and multilingual MLP features under matched and mismatched acoustical conditions. In:Proceedings of the 2013 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP). Vancouver, BC, Canada:IEEE, 2013. 7349-7353
    [9] Gehring J, Miao Y J, Metze F, Waibel A. Extracting deep bottleneck features using stacked auto-encoders. In:Proceedings of the 2013 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP). Vancouver, BC, Canada:IEEE, 2013. 3377-3381
    [10] Miao Y J, Metze F. Improving language-universal feature extraction with deep maxout and convolutional neural networks. In:Proceedings of the 15th Annual Conference of the International Speech Communication Association (INTERSPEECH). Singapore:International Speech Communication Association, 2014. 800-804
    [11] Huang J T, Li J Y, Dong Y, Deng L, Gong Y F. Cross-language knowledge transfer using multilingual deep neural network with shared hidden layers. In:Proceedings of the 2013 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP). Vancouver, BC, Canada:IEEE, 2013. 7304-7308
    [12] Hinton G E, Srivastava N, Krizhevsky A, Sutskever I, Salakhutdinov R R. Improving neural networks by preventing co-adaptation of feature detectors. Computer Science, 2012, 3 (4):212-223
    [13] Goodfellow I J, Warde-Farley D, Mirza M, Courville A, Bengio Y. Maxout networks. In:Proceedings of the 30th International Conference on Machine Learning (ICML). Atlanta, GA, USA:ICML, 2013:1319-1327
    [14] Zeiler M D, Ranzato M, Monga R, Mao M, Yang K, Le Q V, Nguyen P, Senior A, Vanhoucke V, Dean J, Hinton G H. On rectified linear units for speech processing. In:Proceedings of the 2013 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP). Vancouver, BC, Canada:IEEE, 2013. 3517-3521
    [15] Dahl G E, Yu D, Deng L, Acero A. Context-dependent pre-trained deep neural networks for large-vocabulary speech recognition. IEEE Transactions on Audio, Speech, and Language Processing, 2012, 20 (1):30-42 doi: 10.1109/TASL.2011.2134090
    [16] Lee D D, Seung H S. Learning the parts of objects by non-negative matrix factorization. Nature, 1999, 401 (6755):788-791 doi: 10.1038/44565
    [17] Wilson K W, Raj B, Smaragdis P, Divakaran A. Speech denoising using nonnegative matrix factorization with priors. In:Proceedings of the 2008 International Conference on Acoustics, Speech, and Signal Processing (ICASSP). Las Vegas, NV, USA:IEEE, 2008. 4029-4032
    [18] Mohammadiha N. Speech Enhancement Using Nonnegative Matrix Factorization and Hidden Markov Models[Ph.D. dissertation], KTH Royal Institute of Technology, Stockholm, Sweden, 2013.
    [19] Ding C H Q, Li T, Jordan M I. Convex and semi-nonnegative matrix factorizations. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2010, 32 (1):45-55 doi: 10.1109/TPAMI.2008.277
    [20] Price P, Fisher W, Bernstein J, Pallett D. Resource management RM12.0[Online], available:https://catalog.ldc.upenn.edu/LDC93S3B, May 16, 2015
    [21] Garofolo J, Lamel L, Fisher W, Fiscus J, Pallett D, Dahlgren N, Zue V. TIMIT acoustic-phonetic continuous speech corpus[Online], available:https://catalog.ldc.upenn.edu/LDC93S1, May 16, 2015
    [22] Korvas M, Plátek O, Dušek O, Žćilka L, Jurčíček F. Vystadial 2013 English data[Online], available:https://lindat.mff.cuni.cz/repository/xmlui/handle/11858/00-097C-0000-0023-4671-4, May 17, 2015
    [23] Korvas M, Plátek O, Dušek O, Žćilka L, Jurčíček F. Vystadial 2013 Czech data[Online], available:https://lindat.mff.cuni.cz/repository/xmlui/handle/11858/00-097C-0000-0023-4670-6?show=full, May 17, 2015
    [24] Povey D, Ghoshal A, Boulianne G, Burget L, Glembek O, Goel N, Hannemann M, Motlicek P, Qian Y M, Schwarz P, Silovsky J, Stemmer G, Vesely K. The Kaldi speech recognition toolkit. In:Proceedings of the 2011 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU). Hawaii, USA:IEEE Signal Processing Society, 2011. 1-4
    [25] Miao Y J. Kaldi + PDNN:Building DNN-based ASR Systems with Kaldi and PDNN. arXiv preprint arXiv:1401. 6984, 2014.
    [26] Thurau C. Python matrix factorization module[Online], available:https://pypi.python.org/pypi/PyMF/0.1.9, September 25, 2015
    [27] Sainath T N, Kingsbury B, Ramabhadran B. Auto-encoder bottleneck features using deep belief networks. In:Proceedings of the 2012 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP). Kyoto, Japan:IEEE, 2012. 4153-4156
    [28] Miao Y J, Metze F. Improving low-resource CD-DNN-HMM using dropout and multilingual DNN training. In:Proceedings of the 12th Annual Conference of the International Speech Communication Association (INTERSPEECH). Lyon, France:Interspeech, 2013. 2237-2241
    [29] Miao Y J, Metze F, Rawat S. Deep maxout networks for low-resource speech recognition. In:Proceedings of the 2013 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU). Olomouc, Czech:IEEE, 2013. 398-403
    [30] Povey D, Burget L, Agarwal M, Akyazi P, Feng K, Ghoshal A, Glembek O, Goel N K, Karafiát M, Rastrow A, Rastrow R C, Schwarz P, Thomas S. Subspace Gaussian mixture models for speech recognition. In:Proceedings of the 2010 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP). Texas, USA:IEEE, 2010. 4330-4333
    [31] 吴蔚澜, 蔡猛, 田垚, 杨晓昊, 陈振锋, 刘加, 夏善红.低数据资源条件下基于Bottleneck特征与SGMM模型的语音识别系统.中国科学院大学学报, 2015, 32 (1):97-102 http://www.cnki.com.cn/Article/CJFDTOTAL-ZKYB201501017.htm

    Wu Wei-Lan, Cai Meng, Tian Yao, Yang Xiao-Hao, Chen Zhen-Feng, Liu Jia, Xia Shan-Hong. Bottleneck features and subspace Gaussian mixture models for low-resource speech recognition. Journal of University of Chinese Academy of Sciences, 2015, 32 (1):97-102 http://www.cnki.com.cn/Article/CJFDTOTAL-ZKYB201501017.htm
  • Publication history
    • Received: 2015-10-16
    • Accepted: 2016-10-20
    • Published: 2017-07-20