## Definition

In Mamba, we have the formula:

$$ h'(t) = Ah(t) + Bx(t) $$

Here, $A$ and $B$ are parameter matrices, $h$ is the hidden state, and $x$ is the input. This comes from control theory, where the equation is originally written as:

$x'(t) = Ax(t) + Bu(t)$

Below we follow the original state-space notation, where $x$ represents the hidden state and $u$ represents the input at the current time.

## Motivation

Without the input term, the equation reduces to:

$x'(t) = Ax(t)$

which reminds us of $(e^x)' = e^x$. This is where the discretization process comes from.

Move $Ax(t)$ to the LHS:

$$ x'(t) - Ax(t) = Bu(t) $$

Multiply both sides by $e^{-At}$:

$e^{-At} x'(t) - Ae^{-At}x(t) = e^{-At} Bu(t)$

Define $F(t) = e^{-At}x(t)$. By the product rule,

$F'(t) = -Ae^{-At}x(t) + e^{-At} x'(t)$

which is exactly the LHS above, so we also know that

$F'(t) = e^{-At}Bu(t)$

Through integration, we can get a formula which contains both $u$ and $x$:

$$ \forall \lambda \in (-\infty, +\infty), \quad F(t) = F(\lambda) + \int_{\lambda}^{t} F'(\tau) \mathrm{d}\tau $$

Specifically, for convenience, let $\lambda = 0$ and substitute $F(t) = e^{-At}x(t)$ and $F'(t)$ into the above equation:

$$ e^{-At}x(t) = x(0) + \int_{0}^{t} \left[ -Ae^{-A\tau}x(\tau) + e^{-A\tau}x'(\tau) \right] \mathrm{d} \tau $$

which also equals:

$$ e^{-At}x(t) = x(0) + \int_{0}^{t} e^{-A\tau} Bu(\tau) \mathrm{d} \tau $$

Multiply both sides by $e^{At}$, the inverse of $e^{-At}$:

$$ x(t) = e^{At}x(0) + e^{At}\int_{0}^{t} e^{-A\tau} Bu(\tau) \mathrm{d} \tau $$

In particular, at a sampling time $t_k$,

$$ x(t_k) = e^{At_k} x(0) + e^{At_k} \int_{0}^{t_k} e^{-A\tau} Bu(\tau) \mathrm{d} \tau $$

To get a recurrence of the form $x(t_{k+1}) = (\cdot)\, x(t_k) + \dots$, write $t_{k+1} = t_k + (t_{k+1} - t_k)$:

$x(t_{k+1}) = e^{A(t_k + (t_{k+1}-t_k))}x(0) + e^{A(t_k + (t_{k+1} - t_k))} \int_{0}^{t_{k+1}} e^{-A\tau} Bu(\tau) \mathrm{d} \tau$

Split the integral at $t_k$ and simplify:

$$ x(t_{k+1}) = e^{A(t_{k+1} - t_k)} \left[ e^{At_k}x(0) + e^{At_k} \int_{0}^{t_k} e^{-A\tau} Bu(\tau) \mathrm{d}\tau \right] + e^{At_{k+1}} \int_{t_k}^{t_{k+1}} e^{-A\tau} Bu(\tau) \mathrm{d} \tau $$

Notice that the term in $[\,]$ equals $x(t_k)$.
Hence, it is equivalent to:

$$ x(t_{k+1}) = e^{A(t_{k+1} - t_k)} x(t_k) + \int_{t_k}^{t_{k+1}} e^{A(t_{k+1} - \tau)} Bu(\tau) \mathrm{d} \tau $$

## Zero-Order Hold

Here we introduce the zero-order hold. We only use its key step, rather than the detailed zero-order hold theory.

Denote $T = t_{k+1} - t_k$ and take $T$ to be small. Under the zero-order hold, we regard $u(\tau)$ as the constant $u(t_k)$ for $\tau \in [t_k, t_{k+1})$. Therefore,

$$ x(t_{k+1}) = e^{AT}x(t_k) + \int_{t_k}^{t_{k+1}} e^{A(t_{k+1} - \tau)} Bu(t_k) \mathrm{d} \tau $$

Since $Bu(t_k)$ does not depend on $\tau$, and assuming $A$ is invertible, the second term of the RHS is equivalent to:

$$ e^{At_{k+1}} \left[ \int_{t_k}^{t_{k+1}} e^{-A\tau} \mathrm{d} \tau \right] Bu(t_k) = e^{At_{k+1}} \left[ -A^{-1} \left( e^{-At_{k+1}} - e^{-At_k} \right) \right] Bu(t_k) $$

Because $A^{-1}$ commutes with $e^{At_{k+1}}$, this simplifies to:

$$ \Rightarrow A^{-1} \left[ e^{A(t_{k+1} - t_k)} - I \right] Bu(t_k) $$

Substitute back into the above equation:

$$ x(t_{k+1}) = e^{AT}x(t_k) + A^{-1} \left[ e^{AT} - I \right] Bu(t_k) $$

We write the identity matrix $I$ instead of $1$, and $A^{-1}$ instead of $\frac{1}{A}$, because all of our computations here are matrix operations.

Since $T$ is a small step, we write it as $\Delta$, the step size in Mamba's notation. Replacing every $T$ by $\Delta$, then multiplying and dividing the second term by $\Delta$:

$$ x(t_{k+1}) = e^{A\Delta}x(t_k) + A^{-1} \left( e^{A\Delta} - I \right) Bu(t_k) $$

$$ x(t_{k+1}) = e^{A\Delta} x(t_k) + (\Delta A)^{-1} \left( e^{\Delta A} - I \right) \Delta B u(t_k) $$

## Result

Recall that our target is $x(t_{k+1}) = \overline{A} x(t_k) + \overline{B} u(t_k)$, where $\overline{A}$ and $\overline{B}$ are both parameter matrices. Here we get

$$ \overline{A} = e^{\Delta A} $$

$$ \overline{B} = (\Delta A)^{-1} (e^{\Delta A} - I) \Delta B $$

However, in the SSM the recurrence is $h(t_{k+1}) = \overline{A} h(t_k) + \overline{B} x(t_{k+1})$. This is because of the definition of $h$, where $h_0 = \overline{B} x_0$.

In control theory, $x_1$ is computed from $x_0$ and $u_0$. In the SSM, $h_1$ is computed from the hidden state $h_0$.
Because the SSM input $x_0$ corresponds to $u_0$, which control theory pairs with the state $x_0$, while the SSM pairs it with $h_0 = \overline{B} x_0$, the hidden-state index is shifted by one relative to the control-theory state, and $h_1$ therefore takes the input $x_1$.
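
The formulas for $\overline{A}$ and $\overline{B}$ can be checked numerically. The following is a minimal sketch, not Mamba's actual implementation: it assumes a scalar system (one diagonal entry of $A$, so $A^{-1}$ is just $1/a$ with $a \neq 0$), and the function names `zoh_discretize`, `integrate_held_input`, and `ssm_scan` are hypothetical, introduced only for illustration. It compares one discrete step against a fine-grained Runge-Kutta integration of the continuous equation with the input held constant, and also shows the SSM indexing where $h_0 = \overline{B} x_0$.

```python
import math


def zoh_discretize(a, b, delta):
    """Scalar zero-order-hold discretization of x'(t) = a*x(t) + b*u(t).

    Assumes a != 0, the scalar counterpart of A being invertible.
    """
    a_bar = math.exp(delta * a)
    # (delta*a)^{-1} * (exp(delta*a) - 1) * (delta*b);
    # expm1 computes exp(z) - 1 accurately when z is small.
    b_bar = math.expm1(delta * a) / (delta * a) * (delta * b)
    return a_bar, b_bar


def integrate_held_input(a, b, x0, u, delta, substeps=1000):
    """Reference value: integrate x' = a*x + b*u over one step of length
    delta with classical RK4, keeping the input u constant (the hold)."""
    step = delta / substeps
    x = x0

    def f(state):
        return a * state + b * u

    for _ in range(substeps):
        k1 = f(x)
        k2 = f(x + 0.5 * step * k1)
        k3 = f(x + 0.5 * step * k2)
        k4 = f(x + step * k3)
        x += step * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return x


def ssm_scan(a_bar, b_bar, inputs):
    """SSM-style recurrence h_k = a_bar*h_{k-1} + b_bar*x_k with a zero
    initial state, so that h_0 = b_bar*x_0."""
    h = 0.0
    states = []
    for x in inputs:
        h = a_bar * h + b_bar * x
        states.append(h)
    return states


if __name__ == "__main__":
    a, b, delta = -0.5, 2.0, 0.1
    a_bar, b_bar = zoh_discretize(a, b, delta)
    discrete = a_bar * 1.0 + b_bar * 0.3          # one discrete step
    reference = integrate_held_input(a, b, 1.0, 0.3, delta)
    print("discrete step:", discrete)
    print("RK4 reference:", reference)
    print("SSM states   :", ssm_scan(a_bar, b_bar, [1.0, 0.0, 0.0]))
```

When the input really is constant over the step, the discrete update should agree with the reference integration up to the integrator's error, which illustrates that the zero-order hold itself introduces no approximation beyond holding $u$ fixed.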