Vapnik–Chervonenkis dimension

<p>In <a href="/facts/Vapnik%25E2%2580%2593Chervonenkis_theory/jDST0Ulo">Vapnik–Chervonenkis theory</a>, the Vapnik–Chervonenkis (VC) dimension is a measure of the size (capacity, complexity, expressive power, richness, or flexibility) of a class of sets. The notion can be extended to classes of binary functions, such as the set of classifiers that a learning algorithm can output. It is defined as the <a href="/facts/Cardinality/m6VPojwT">cardinality</a> of the largest set of points that the class can <a href="/facts/Shattering_(machine_learning)/Kum02mT4">shatter</a>: there is at least one configuration of that many points such that, for every possible labeling of those points, some member of the class classifies all of them correctly. It was originally defined by <a href="/facts/Vladimir_Vapnik/UQfpyUuN">Vladimir Vapnik</a> and <a href="/facts/Alexey_Chervonenkis/EAgIkPrg">Alexey Chervonenkis</a>.
</p><p>Informally, the capacity of a classification model is related to how complicated it can be. For example, consider the <a href="/facts/Heaviside_step_function/bgsUtxgP">thresholding</a> of a high-<a href="/facts/Degree_of_a_polynomial/Bf8vEIhf">degree</a> <a href="/facts/Polynomial/Lzak8VVx">polynomial</a>: if the polynomial evaluates above zero at a point, that point is classified as positive, otherwise as negative. A high-degree polynomial can be wiggly, so it can fit a given set of training points well. Because it is so wiggly, however, the classifier can be expected to make errors on other points. Such a polynomial has a high capacity. A much simpler alternative is to threshold a linear function. This function may not fit the training set well, because it has a low capacity. This notion of capacity is made rigorous below.
</p>
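<p>The contrast between a thresholded linear function and a thresholded high-degree polynomial can be made concrete with a small brute-force check. The sketch below is illustrative only and makes several assumptions that are not part of the text above: the points lie on the real line, the class consists of polynomials of degree at most <code>max_degree</code>, and the helper names (<code>fit_threshold_polynomial</code>, <code>classify</code>, <code>shatters</code>) are hypothetical. For each labeling it tries to build a polynomial with one root between every adjacent pair of differently labeled points. It reports failure when more roots would be needed than the degree allows, relying on the fact that a nonzero polynomial of degree d has at most d real roots.</p>

```python
from itertools import product


def fit_threshold_polynomial(xs, labels, max_degree):
    """Try to realize `labels` on sorted, distinct points `xs`.

    Returns (roots, lead) describing p(x) = lead * prod(x - r), or None
    when the labeling needs more sign changes than max_degree allows.
    """
    # One root midway between each adjacent pair with different labels.
    roots = [
        (a + b) / 2
        for a, b, la, lb in zip(xs, xs[1:], labels, labels[1:])
        if la != lb
    ]
    if len(roots) > max_degree:
        return None  # a degree-d polynomial has at most d real roots
    # To the right of every root the product is positive, so the leading
    # sign must match the label of the rightmost point.
    lead = 1.0 if labels[-1] else -1.0
    return roots, lead


def classify(x, roots, lead):
    """Positive (True) if the polynomial evaluates above zero at x."""
    value = lead
    for r in roots:
        value *= x - r
    return value > 0


def shatters(points, max_degree):
    """Check whether thresholded polynomials of degree <= max_degree
    realize every labeling of the given points on the real line."""
    xs = sorted(points)
    for labels in product([False, True], repeat=len(xs)):
        fit = fit_threshold_polynomial(xs, labels, max_degree)
        if fit is None:
            return False
        roots, lead = fit
        # Sanity check: the constructed polynomial reproduces the labeling.
        assert all(classify(x, roots, lead) == y for x, y in zip(xs, labels))
    return True
```

<p>Under these assumptions, <code>shatters([0, 1], 1)</code> succeeds while <code>shatters([0, 1, 2], 1)</code> fails on the labeling positive, negative, positive, so a thresholded linear function on the line shatters two points but not three. Raising <code>max_degree</code> to 2 makes the three-point case succeed, which is the sense in which the higher-degree polynomial has more capacity.</p>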
