Dynamic Generative Control

Chinese version: Dynamic Generative Control (动态生成式控制) on Zhihu (zhihu.com)

In several previous blog posts, we formulated the control problem as a generative model in the following fashion:

\[[v,w,x] = g(z), \quad z \sim \mathbb{D},\]

in which \(v\) is the collection of control objectives to be minimized, \(w\) contains the output control signals, and \(x\) represents the input signals thought to carry useful information for the control problem.

When the system is dynamic, the form above lacks the explicit modeling of dynamic changes. The simplest form of a dynamic generative control model can be written as

\[\begin{bmatrix} v_t & w_t & x_t \\ v_{t-1} & w_{t-1} & x_{t-1} \end{bmatrix} = g(z), \quad z \sim \mathbb{D}.\]
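To make the shape of this model concrete, here is a minimal numpy sketch. The dimensions and the linear decoder are hypothetical stand-ins; in practice \(g\) would be a trained generative network and \(v\), \(w\), \(x\) would typically be vectors:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions: v, w, x are scalars here for simplicity.
DIM_Z = 4                                    # latent dimension of z
DIM_ROW = 3                                  # one [v, w, x] row per time step
W_g = rng.normal(size=(2 * DIM_ROW, DIM_Z))  # stand-in for a trained decoder

def g(z):
    """Map a latent z to the stacked matrix
    [[v_t,     w_t,     x_t    ],
     [v_{t-1}, w_{t-1}, x_{t-1}]]."""
    return (W_g @ z).reshape(2, DIM_ROW)

z = rng.normal(size=DIM_Z)   # z ~ D (here, a standard normal prior)
sample = g(z)
print(sample.shape)          # (2, 3): rows are steps t and t-1
```

The only structural point is that one latent \(z\) jointly generates both time steps, which is what lets the model capture the transition between them.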

Now, \(g(z)\) generates the system state both before and after a control step, so after training it must have modeled the dynamic transitions of the controlled system. The result is a simpler dynamic control algorithm than the one built on a static model:

\[ \begin{align} & \text{Initialize } [v_0, x_0] = r (u_0, w_0) \\ & \text{for } t = 1 \text{ to } \infty \text{ do} \\ & \quad \text{Initialize } [v', w', x'] = [v_{t-1}, w_{t-1}, x_{t-1}] \\ & \quad \text{for } N \text{ steps} \text{ do} \\ & \quad \quad z' = \underset{z}{\text{argmin}} \left\| g(z) - \begin{bmatrix} v' - \epsilon & w' & x' \\ v_{t-1} & w_{t-1} & x_{t-1} \end{bmatrix} \right\| \\ & \quad \quad \begin{bmatrix} v' & w' & x' \\ \_ & \_ & \_ \end{bmatrix} = g(z') \\ & \quad w_t = w' \\ & \quad [v_t, x_t] = r (u_t, w_t) \end{align} \]
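A minimal numpy sketch of this loop follows. Everything in it is a hypothetical stand-in: `g` is a random linear decoder rather than a trained model, the real system `r` is a toy quadratic plant driven by an exogenous input `u`, and the inner argmin is done by finite-difference gradient descent. The point is only to show the shape of the algorithm, not a deployable controller:

```python
import numpy as np

rng = np.random.default_rng(0)
DIM_Z = 4

# Stand-in for the trained generative model g (rows: step t, step t-1).
W_g = rng.normal(size=(6, DIM_Z)) * 0.5
def g(z):
    return (W_g @ z).reshape(2, 3)   # columns: v, w, x

def r(u, w):
    """Toy plant: the objective v shrinks as control w approaches input u."""
    return (u - w) ** 2, u           # returns (v, x)

def solve_z(target, steps=200, lr=0.05):
    """argmin_z ||g(z) - target|| by finite-difference gradient descent."""
    z = np.zeros(DIM_Z)
    for _ in range(steps):
        base = np.linalg.norm(g(z) - target)
        grad = np.zeros(DIM_Z)
        for i in range(DIM_Z):
            dz = np.zeros(DIM_Z); dz[i] = 1e-5
            grad[i] = (np.linalg.norm(g(z + dz) - target) - base) / 1e-5
        z -= lr * grad
    return z

eps, N, u = 0.1, 5, 1.0
w_prev = 0.0
v_prev, x_prev = r(u, w_prev)        # initialize [v_0, x_0] = r(u_0, w_0)
for t in range(1, 4):
    v_c, w_c, x_c = v_prev, w_prev, x_prev
    for _ in range(N):               # inner loop: push v down by eps per step
        target = np.array([[v_c - eps, w_c, x_c],
                           [v_prev, w_prev, x_prev]])
        v_c, w_c, x_c = g(solve_z(target))[0]
    w_t = w_c                        # emit the control signal
    v_t, x_t = r(u, w_t)             # observe the real system's response
    v_prev, w_prev, x_prev = v_t, w_t, x_t
```

Note how the second row of `target` is clamped to the previous step's observed values, which is the anchoring discussed below.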

As before, this algorithm proceeds by attempting to minimize the control objectives \(v\) until equilibrium is reached. The difference is that the objective step size \(\epsilon\) no longer needs careful tuning, because the variables at step \(t\) are anchored by those at step \(t-1\). As long as \(g\) has been trained properly to model the system transitions, it should find the control parameters that drive the objectives in \(v\) as low as possible.

Without a doubt, the inner loop of the algorithm disambiguates the uncertainty among all variables, just as discussed before, making it possible to amortize the algorithm at each control step. This reduces the model much more aggressively, down to the computation level of an MCU rather than a CPU / GPU / NPU, making generative control very cheap to use compared to alternatives such as LLMs. See here for a good mathematical tutorial on amortization, and here for a philosophical discussion.
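To illustrate amortization concretely, here is a hypothetical linear toy: instead of running the inner argmin loop at every control step, we fit a direct inverse map \(q\) from targets to latents offline, so that deployment costs a single matrix multiply. The decoder, dimensions, and least-squares fit are all illustrative assumptions, not the method of any particular library:

```python
import numpy as np

rng = np.random.default_rng(1)
DIM_Z, DIM_OUT = 4, 6
W_g = rng.normal(size=(DIM_OUT, DIM_Z))   # stand-in linear decoder
def g(z):
    return W_g @ z

# Amortization: learn a direct map q(target) ~ argmin_z ||g(z) - target||,
# by fitting on (output, latent) pairs sampled from the model itself.
Z = rng.normal(size=(1000, DIM_Z))            # sampled latents z ~ D
Y = Z @ W_g.T                                 # corresponding outputs g(z)
W_q, *_ = np.linalg.lstsq(Y, Z, rcond=None)   # least-squares inverse map

def q(target):
    return target @ W_q                       # one matrix-vector product

target = g(rng.normal(size=DIM_Z))            # a reachable target
z_hat = q(target)
print(np.linalg.norm(g(z_hat) - target))      # close to 0 in this linear toy
```

In the linear case the amortized map is exact on reachable targets; with a neural \(g\), \(q\) would itself be a small trained network, but the deployment-time saving is the same: no inner optimization loop.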

Furthermore, if we let \(x = [v_{t-1}, w_{t-1}, x_{t-1}, x_t]\), the formulation above becomes equivalent to the static model \([v,w,x] = g(z)\), and the dynamic algorithm above reduces to the static algorithm. This also matches our previous discussion that the more useful signals we can provide in \(x\), the more powerful our generative world model becomes. Writing out this equivalence explicitly is left to the reader.
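As a hint toward that exercise, one direction of the bookkeeping is simply to collect the previous step's variables into the input block:

\[\tilde{x}_t = [v_{t-1}, w_{t-1}, x_{t-1}, x_t], \qquad [v_t, w_t, \tilde{x}_t] = \tilde{g}(z), \quad z \sim \mathbb{D},\]

where \(\tilde{g}\) is just \(g\) with its six output blocks rearranged into a single row. Clamping \(\tilde{x}_t\) in the static algorithm then plays exactly the role of anchoring by the step-\((t-1)\) values in the dynamic algorithm.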
