Posts

Latent Problem Modeling: A Pragmatic Form of Artificial Intelligence

Chinese Version:  潜变量问题建模:一种务实的人工智能形式 Why Latent Problem Modeling? Instead of pursuing the broad and often vague ambition of Artificial General Intelligence (AGI)—which aims to emulate the entirety of human cognition—it may be more effective to focus on a more actionable and testable alternative: latent problem modeling. Latent Problem Modeling can be understood as a form of world modeling, where the "world" is defined not by external physical reality, but by the structure of a given problem: its inputs, outputs, and objectives. By modeling the space of possible problems and solutions, the system constructs an internal model of the world in which it operates. Latent Problem Modeling does not define intelligence as mimicking humans. Rather, it frames intelligence as the capacity to define a problem, understand the structure that defines it, and discover solutions within that structure. It models the latent structure of objectives, inputs, and outputs that together form a probl...

Praising the Nobel Prize Committee for Breaking Down the Wall between Physics and Computer Science

The Nobel Prize committee was brave and progressive to break down the wall between physics and computer science. After all, knowledge is knowledge: once learned, it all becomes a unified entity in one's mind. The separation into "disciplines" exists only because of the material limitation of the human brain, which makes it impossible for one person to learn every piece of knowledge in a single lifetime. It was not always so: when rational thinking first emerged, everyone could learn every piece of knowledge, until knowledge exploded and humans had to invent "disciplines" in order to preserve and teach it. Nor will it be so in the future, when AI becomes a universal tool for problem solving, rendering "scientific understanding" unnecessary and impossible. The Nobel Prize's own separation into disciplines will be outdated as well when that happens.

Dynamic Generative Control

Chinese version:  动态生成式控制 - 知乎 (zhihu.com) In several previous blog posts, we formulated the control problem as a generative model in the following fashion: \[[v,w,x] = g(z), \quad z \sim \mathbb{D},\] in which \(v\) is the collection of control objectives to be minimized, \(w\) contains the output control signals, and \(x\) represents the input signals that are thought to offer information to the control problem. When the system is dynamic, the form above lacks explicit modeling of dynamic changes. The simplest form of a dynamic generative control model can be written as \[\begin{bmatrix} v_t & w_t & x_t \\ v_{t-1} & w_{t-1} & x_{t-1} \end{bmatrix} = g(z), \quad z \sim \mathbb{D}.\] Now \(g(z)\) can generate the system states before and after a control step, so after training it must be able to model the dynamic changes in the controlled system. The result is a simpler dynamic control algorithm than that of a static model: \[ \begin{align} ...
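The dynamic formulation above can be illustrated with a minimal sketch: a generator \(g(z)\) that emits the stacked system states at \(t\) and \(t-1\) in one matrix, so that a trained \(g\) would capture how one control step changes \([v, w, x]\). All dimensions and the fixed random weights below are illustrative assumptions; in practice \(g\) would be a trained neural network.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical sizes for the objective v, control output w, and input x.
V_DIM, W_DIM, X_DIM = 1, 2, 3
ROW = V_DIM + W_DIM + X_DIM   # one time slice [v, w, x]
Z_DIM = 8                     # latent dimension of z ~ D

# Stand-in for trained generator weights (untrained, for shape only).
W_g = rng.standard_normal((Z_DIM, 2 * ROW))

def g(z: np.ndarray) -> np.ndarray:
    """Generator: latent z -> 2 x ROW matrix [v_t, w_t, x_t; v_{t-1}, w_{t-1}, x_{t-1}]."""
    return np.tanh(z @ W_g).reshape(2, ROW)

# Draw z ~ D and generate the state pair before and after one control step.
z = rng.standard_normal(Z_DIM)
states = g(z)
v_t, w_t, x_t = np.split(states[0], [V_DIM, V_DIM + W_DIM])
v_prev, w_prev, x_prev = np.split(states[1], [V_DIM, V_DIM + W_DIM])
print(states.shape)  # (2, 6)
```

Because each sample couples consecutive time slices, fitting \(g\) to recorded trajectories forces it to encode the system's one-step dynamics in the latent space.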

Dynamic System Control using a Static Generative Model

Chinese version:  静态生成式模型的动态系统控制算法 - 知乎 (zhihu.com) Recapping the last blog post, we formulated the control problem as a generative model in the following fashion: \[[v,w,x] = g(z), \quad z \sim \mathbb{D},\] in which \(v\) is the collection of control objectives to be minimized, \(w\) contains the output control signals, and \(x\) represents the input signals that are thought to offer information to the control problem. This model is static because it lacks an explicit formulation of the continuous evolution of the system states \([v,w,x]\). It is possible to change the model to a dynamic one, but in this blog post we show an algorithm that can offer dynamic control using a static model. \[ \begin{align} & \text{Initialize } \epsilon_0 \text{ and } [v_0, x_0] = r (u_0, w_0) \\ & \quad \text{for } t = 1 \text{ to } \infty \text{ do} \\ & \quad \text{Initialize } [v', w', x'] = [v_{t-1}, w_{t-1}, x_{t-1}] \\ & \quad \text{While } v_{t-1} - v' >  \epsilon_{t...

Generative Control Inference

Chinese version:  生成式控制论的推理算法 - 知乎 (zhihu.com) In the previous post , I gave a peek into generative control, a new idea that controls a system by learning its intrinsic characteristics as a generative world model. It can avoid the over-shoot and over-exploration problems of PID control and reinforcement learning. Rewriting the formulation a bit, we have \[[v,w,x] = g(z), \quad z \sim \mathbb{D},\] in which \(v\) is the collection of control objectives to be minimized, \(w\) contains the output control signals, and \(x\) represents the input signals that are thought to offer information to the control problem. During training, we are presented with a dataset of \([v,w,x]\) triplets, each representing the control objective \(v\) achieved under the output and input condition \([w,x]\). We have an algorithm that can produce \(g\), which is a generative model that can disambiguate the uncertainty of multiple acceptable \(v\), \(w\) or \(x\)'s, each under the condition of the other ...
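One plausible inference scheme for such a model, sketched here under stated assumptions, is latent-space search: given a generator \(g(z) \to [v, w, x]\), find a \(z\) whose generated input \(x\) matches the observed input, then read off the corresponding control \(w\). The naive random-search optimizer, the random (untrained) weights, and all dimensions are illustrative assumptions, not the post's actual algorithm.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical sizes for objective v, control w, input x, latent z.
V_DIM, W_DIM, X_DIM, Z_DIM = 1, 2, 3, 8
W_g = rng.standard_normal((Z_DIM, V_DIM + W_DIM + X_DIM))

def g(z: np.ndarray):
    """Stand-in generator: latent z -> (v, w, x) split from one vector."""
    out = np.tanh(z @ W_g)
    return np.split(out, [V_DIM, V_DIM + W_DIM])

x_obs = rng.standard_normal(X_DIM) * 0.1   # observed input signals

# Naive random search over z for a sample whose x matches x_obs.
best_z, best_err = None, np.inf
for _ in range(500):
    z = rng.standard_normal(Z_DIM)
    _, _, x = g(z)
    err = float(np.linalg.norm(x - x_obs))
    if err < best_err:
        best_z, best_err = z, err

# The control signals consistent with the observed inputs.
v, w, x = g(best_z)
print(w.shape)  # (2,)
```

In practice the random search would be replaced by gradient-based optimization over \(z\), possibly with an extra penalty on \(v\) to prefer latents that also minimize the control objective.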

A Peek into Generative Control

Chinese version:  一窥生成式控制论 - 知乎 (zhihu.com) As a peek into some of my most recent work in applying generative modeling to various industries, I'm proudly presenting an idea that illustrates how powerful generative models can be for industrial control. My team and I are working to rapidly expand this idea into many different areas, and we have yet to see its limit. A General Formulation of Generative Models \[y = g(z), \quad z \sim \mathbb{D}\] where \(y\) is a sample from data, \(g\) is a generator neural network written as a function, and \(z\) follows a pre-defined distribution \(\mathbb{D}\). It is easy to see that generative adversarial networks (GANs) naturally result in models of the above form. For autoregressive language models (transformers or otherwise), \(z\) is the concatenation of all the random sampling variables used during the decoding process. For Stable Diffusion, \(z\) is the concatenation of all the noise added in the diffusion process. A...
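The general formulation \(y = g(z),\ z \sim \mathbb{D}\) can be sketched in a few lines. Here \(g\) is a hypothetical two-layer network with fixed random weights standing in for a trained generator, and \(\mathbb{D}\) is the standard normal distribution; every dimension below is an illustrative assumption.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions for the latent and data spaces.
Z_DIM, HIDDEN, Y_DIM = 8, 32, 4

# Fixed random weights as a stand-in for a trained generator
# (e.g., a GAN generator or a diffusion model's sampler).
W1 = rng.standard_normal((Z_DIM, HIDDEN))
W2 = rng.standard_normal((HIDDEN, Y_DIM))

def g(z: np.ndarray) -> np.ndarray:
    """Generator: map latent z to a data sample y."""
    h = np.tanh(z @ W1)   # nonlinearity makes g expressive
    return h @ W2

# Draw z ~ D and generate a sample y = g(z).
z = rng.standard_normal(Z_DIM)
y = g(z)
print(y.shape)  # (4,)
```

Whatever the training procedure, the trained model reduces to this interface: a deterministic function applied to a sample from a fixed distribution.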

Thoughts on AIGC for Non-AI Industries

1. AIGC is a paradigm shift from goal-oriented problem solving to free-form interactive engineering. It is time to expand our imagination to products that can talk and draw with customers, on top of completing their own tasks through these interactions. Your fridge can help order groceries when asked, but it can also answer generic questions like ChatGPT does. There is no reason to limit the AI to what its shell product is designed to do. For manufacturers of these products, this means better customer stickiness. 2. The entire AIGC economy is in its infancy because right now the paying customers are tech-savvy people who can afford a few tens of dollars in subscription fees every month. To make it truly ubiquitous in every product and every place, the AI model serving cost must be reduced by a factor of thousands. When that is achieved, products like GitHub Copilot might just be free, like Bing search. At that time, every product that is capable of accessing the Inter...