Generative Control Inference


In the previous post, I gave a peek into generative control, a new idea for controlling a system by learning its intrinsic characteristics as a generative world model. It avoids the overshoot and over-exploration problems of PID control and reinforcement learning. Rewriting the formulation a bit, we have
\[[v,w,x] = g(z), \quad z \sim \mathbb{D},\]
in which \(v\) is the collection of control objectives to be minimized, \(w\) contains the output control signals, and \(x\) represents the input signals that are believed to carry information relevant to the control problem.
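To make this interface concrete, here is a minimal PyTorch sketch of what \(g\) could look like. The MLP architecture and all dimensions are illustrative assumptions, not the actual model behind this post.

```python
import torch
import torch.nn as nn

# Illustrative dimensions for objectives v, controls w, inputs x, and latent z.
DIM_V, DIM_W, DIM_X, DIM_Z = 2, 4, 8, 16

class WorldModel(nn.Module):
    """Sketch of the world model g: latent z -> concatenated [v, w, x]."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(DIM_Z, 64), nn.ReLU(),
            nn.Linear(64, 64), nn.ReLU(),
            nn.Linear(64, DIM_V + DIM_W + DIM_X),
        )

    def forward(self, z):
        # Split the joint output back into objectives, controls, and inputs.
        return self.net(z).split([DIM_V, DIM_W, DIM_X], dim=-1)
```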

During training, we are presented with a dataset of \([v,w,x]\) triplets, each recording the control objective \(v\) achieved under the output and input conditions \([w,x]\). We have an algorithm that produces \(g\), a generative model that can disambiguate the uncertainty among multiple acceptable values of \(v\), \(w\), or \(x\), each conditioned on the others. This uncertainty is usually the reason why traditional system identification methods fail on complex control problems.
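The training algorithm itself is not the subject of this post; as one plausible stand-in, a GLO-style loop that jointly optimizes one latent per training triplet together with the decoder weights would look roughly like this:

```python
import torch

def train_world_model(dataset, epochs=100, lr=1e-3):
    """Fit g on a (N, DIM_V + DIM_W + DIM_X) tensor of [v, w, x] triplets.

    This GLO-style objective is an assumption; the post does not specify
    how g is actually trained.
    """
    g = WorldModel()
    # One learnable latent per training triplet.
    z = torch.randn(len(dataset), DIM_Z, requires_grad=True)
    opt = torch.optim.Adam([*g.parameters(), z], lr=lr)
    for _ in range(epochs):
        opt.zero_grad()
        v, w, x = g(z)
        loss = ((torch.cat([v, w, x], dim=-1) - dataset) ** 2).mean()
        loss.backward()
        opt.step()
    return g
```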

In our formulation, the control problem can be stated as: given inputs \(x = x_0\), output the control signals \(w\) such that the objectives \(v\) are minimized. We solve this problem with the following iterative algorithm.
\[\begin{align}
& \text{Initialize } v_0, w_0 \\
& \text{For } t = 1 \rightarrow T \\
& \quad \quad z_t = \underset{z}{\text{argmin}} ~ \| g(z) - [v_{t-1} - \epsilon, w_{t-1}, x_0] \| \\
& \quad \quad [v_t, w_t, \_] = g(z_t) \\
& \text{Output } w_T.
\end{align}\]

This algorithm works by iteratively reconstructing \([v,w,x]\) triplets, forcing \(x = x_0\) while pushing \(v\) downwards by \(\epsilon\) at each step. At equilibrium, the control problem is solved: according to the system world model \(g\), the objectives \(v\) cannot go down any further and the inputs \(x\) cannot get any closer to \(x_0\).
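In code, the loop could look like the sketch below, which approximates the inner argmin with a fixed number of gradient steps on \(z\); the step counts, learning rate, \(\epsilon\), and zero initializations are all assumptions. Warm-starting \(z\) across iterations keeps the procedure deterministic.

```python
import torch

EPS = 0.05  # assumed step size for pushing the objectives downwards

def infer_control(g, x0, T=50, inner_steps=100, lr=1e-2):
    """Return w_T for input condition x0 (a 1-D tensor of size DIM_X)."""
    v = torch.zeros(DIM_V)  # v_0: assumed zero initialization
    w = torch.zeros(DIM_W)  # w_0: assumed zero initialization
    z = torch.zeros(DIM_Z, requires_grad=True)  # warm-started across steps
    opt = torch.optim.Adam([z], lr=lr)
    for _ in range(T):
        # Reconstruction target [v_{t-1} - eps, w_{t-1}, x_0].
        target = torch.cat([v - EPS, w, x0])
        for _ in range(inner_steps):  # approximate the argmin over z
            opt.zero_grad()
            vz, wz, xz = g(z)
            loss = ((torch.cat([vz, wz, xz]) - target) ** 2).sum()
            loss.backward()
            opt.step()
        with torch.no_grad():
            v, w, _ = g(z)  # [v_t, w_t, _] = g(z_t); x is discarded
    return w  # w_T
```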

Since this algorithm is deterministic given a fixed initialization \(v_0, w_0\), it disambiguates the intrinsic system uncertainty that is present in the data. In turn, this makes it possible to fit a neural network \(h\) such that \(w_T = h(x_0)\), where direct regression from the data fails.
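Because each \(x_0\) now maps to a single deterministic \(w_T\), distilling the solver into \(h\) reduces to ordinary supervised regression. A minimal sketch, with an assumed architecture:

```python
import torch
import torch.nn as nn

def distill_policy(g, x_samples, epochs=100, lr=1e-3):
    """Fit h with w_T = h(x_0) from a (N, DIM_X) tensor of input conditions."""
    h = nn.Sequential(nn.Linear(DIM_X, 32), nn.ReLU(), nn.Linear(32, DIM_W))
    # Label every input condition with the iterative solver's output.
    targets = torch.stack([infer_control(g, x0) for x0 in x_samples])
    opt = torch.optim.Adam(h.parameters(), lr=lr)
    for _ in range(epochs):
        opt.zero_grad()
        loss = ((h(x_samples) - targets) ** 2).mean()
        loss.backward()
        opt.step()
    return h
```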

We have applied this modeling to many problems in the home appliances sector, delivering control networks \(h\) whose sizes span from a few KBs to a few MBs, running on very cheap hardware. Soon we will share an online SDK where you can upload data and download a C program that can be compiled anywhere to deploy \(h\).
