Dual control theory is a branch of control theory that deals with the control of systems whose characteristics are initially unknown. It is called dual because in controlling such a system the controller's objectives are twofold:

* Action: to control the system as well as possible based on current knowledge of it, and
* Investigation: to experiment with (probe) the system so as to learn about its behaviour and control it better in the future.
These two objectives may be partly in conflict. In the context of reinforcement learning, this is known as the exploration–exploitation trade-off, as exemplified by the multi-armed bandit problem.
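The exploration–exploitation trade-off can be illustrated with a minimal epsilon-greedy strategy for a multi-armed bandit. This is a sketch, not part of dual control theory proper: the arm means, step count, and epsilon value below are illustrative assumptions.

```python
import random

def epsilon_greedy_bandit(true_means, steps=5000, epsilon=0.1, seed=0):
    """Epsilon-greedy on a Gaussian bandit: explore with probability
    epsilon, otherwise exploit the arm with the best current estimate."""
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms        # pulls per arm
    estimates = [0.0] * n_arms   # running mean reward per arm
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)  # explore: probe a random arm
        else:
            arm = max(range(n_arms), key=lambda a: estimates[a])  # exploit
        reward = rng.gauss(true_means[arm], 1.0)
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
    return estimates, counts
```

Exploration (random pulls) sacrifices immediate reward to improve the estimates, which is exactly the investigation half of the dual objective; exploitation is the action half.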
Dual control theory was developed by Alexander Aronovich Fel'dbaum in 1960. He showed that, in principle, the optimal control can be found by dynamic programming, but this is computationally impractical for all but the simplest problems; as a result, a number of methods for designing sub-optimal dual controllers have been devised.
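One common sub-optimal approach combines certainty-equivalence control with a deliberate probing signal. The sketch below is an illustration under assumed specifics (a scalar plant y = b·u + e with unknown gain b, recursive least squares estimation, and a fixed-amplitude probe), not a definitive dual control design.

```python
import random

def dual_control_unknown_gain(b_true=2.0, setpoint=1.0, steps=200,
                              probe_amplitude=0.05, seed=1):
    """Certainty-equivalence control of y = b*u + e with a probing term.

    The gain b is unknown: it is estimated online by recursive least
    squares while the control keeps y near the setpoint. The probing
    term keeps the input exciting so the estimate keeps improving,
    reflecting the dual (action + investigation) objective.
    """
    rng = random.Random(seed)
    b_hat, p = 0.5, 100.0  # initial gain estimate and its (large) variance
    history = []
    for _ in range(steps):
        # action: certainty-equivalence control using the current estimate
        u = setpoint / b_hat
        # investigation: small perturbation to keep learning about b
        u += probe_amplitude * rng.choice([-1.0, 1.0])
        y = b_true * u + rng.gauss(0.0, 0.1)  # unknown plant plus noise
        # recursive least squares update of b_hat from the pair (u, y)
        k = p * u / (1.0 + p * u * u)
        b_hat += k * (y - b_hat * u)
        p *= 1.0 - k * u
        history.append((y, b_hat))
    return b_hat, history
```

Without the probing term the input can become poorly exciting once the output settles, so the estimate stops improving; the probe trades a small tracking error for continued learning, which is the essence of the sub-optimal dual designs mentioned above.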