In statistical hypothesis testing, a uniformly most powerful (UMP) test is a hypothesis test which has the greatest power $1-\beta$ among all possible tests of a given size α. For example, according to the Neyman–Pearson lemma, the likelihood-ratio test is UMP for testing simple (point) hypotheses.
Setting
Let $X$ denote a random vector (corresponding to the measurements), taken from a parametrized family of probability density functions or probability mass functions $f_{\theta}(x)$, which depends on the unknown deterministic parameter $\theta \in \Theta$. The parameter space $\Theta$ is partitioned into two disjoint sets $\Theta_0$ and $\Theta_1$. Let $H_0$ denote the hypothesis that $\theta \in \Theta_0$, and let $H_1$ denote the hypothesis that $\theta \in \Theta_1$. The binary test of hypotheses is performed using a test function $\varphi(x)$ with a reject region $R$ (a subset of measurement space):
$$\varphi(x) = \begin{cases} 1 & \text{if } x \in R \\ 0 & \text{if } x \in R^c \end{cases}$$

meaning that $H_1$ is in force if the measurement $X \in R$ and that $H_0$ is in force if the measurement $X \in R^c$. Note that $R \cup R^c$ is a disjoint covering of the measurement space.
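As a minimal illustration, the following Python sketch implements such a test function for a scalar measurement, assuming a hypothetical reject region $R = \{x : x > x_0\}$; the threshold value is an arbitrary choice, not implied by the definition.

```python
# A minimal sketch of the test function phi(x) above, assuming a
# scalar measurement and a hypothetical reject region
# R = {x : x > x0}. The threshold x0 = 1.645 is an arbitrary
# illustrative choice.
def phi(x, x0=1.645):
    """Return 1 if the measurement falls in R (H1 in force), else 0."""
    return 1 if x > x0 else 0

print(phi(2.3))   # x in R   -> 1, H1 is in force
print(phi(-0.4))  # x in R^c -> 0, H0 is in force
```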
Formal definition
A test function $\varphi(x)$ is UMP of size $\alpha$ if for any other test function $\varphi'(x)$ satisfying
$$\sup_{\theta \in \Theta_0} \operatorname{E}[\varphi'(X) \mid \theta] = \alpha' \leq \alpha = \sup_{\theta \in \Theta_0} \operatorname{E}[\varphi(X) \mid \theta]$$

we have
$$\forall \theta \in \Theta_1, \quad \operatorname{E}[\varphi'(X) \mid \theta] = 1 - \beta'(\theta) \leq 1 - \beta(\theta) = \operatorname{E}[\varphi(X) \mid \theta].$$
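To make the definition concrete, here is a hedged numerical sketch for the assumed family $X \sim N(\theta, 1)$ with $\Theta_0 = \{0\}$ and $\Theta_1 = (0, \infty)$: a one-sided threshold test $\varphi$ is compared against a two-sided rival $\varphi'$ of the same size, and $\varphi$ attains at least the rival's power at every $\theta \in \Theta_1$.

```python
from scipy.stats import norm

# Sketch of the UMP definition for X ~ N(theta, 1),
# H0: theta = 0 vs. H1: theta > 0 (an assumed example family).
# phi : one-sided test, reject if x > 1.645   (size alpha = 0.05)
# phi': two-sided rival, reject if |x| > 1.96 (same size 0.05)
alpha = 0.05
c1 = norm.isf(alpha)        # ~1.645
c2 = norm.isf(alpha / 2)    # ~1.960

def power_phi(theta):       # E[phi(X) | theta]
    return norm.sf(c1, loc=theta)

def power_rival(theta):     # E[phi'(X) | theta]
    return norm.sf(c2, loc=theta) + norm.cdf(-c2, loc=theta)

print(power_phi(0.0), power_rival(0.0))  # both sizes are ~0.05
for theta in (0.5, 1.0, 2.0):            # phi dominates on Theta_1
    print(theta, power_phi(theta), ">=", power_rival(theta))
```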
The Karlin–Rubin theorem

The Karlin–Rubin theorem can be regarded as an extension of the Neyman–Pearson lemma to composite hypotheses.[1] Consider a scalar measurement having a probability density function parameterized by a scalar parameter θ, and define the likelihood ratio $l(x) = f_{\theta_1}(x) / f_{\theta_0}(x)$. If $l(x)$ is monotone non-decreasing in $x$ for any pair $\theta_1 \geq \theta_0$ (meaning that the greater $x$ is, the more likely $H_1$ is), then the threshold test:
$$\varphi(x) = \begin{cases} 1 & \text{if } x > x_0 \\ 0 & \text{if } x < x_0 \end{cases}$$

where $x_0$ is chosen such that $\operatorname{E}_{\theta_0} \varphi(X) = \alpha$, is the UMP test of size α for testing $H_0 : \theta \leq \theta_0$ vs. $H_1 : \theta > \theta_0$.
Note that exactly the same test is also UMP for testing $H_0 : \theta = \theta_0$ vs. $H_1 : \theta > \theta_0$.
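As a hedged sketch of the theorem's ingredients, assuming the $N(\theta, 1)$ location family, the snippet below checks numerically that the likelihood ratio is increasing in $x$ and chooses the threshold $x_0$ so that $\operatorname{E}_{\theta_0} \varphi(X) = \alpha$; the parameter values are illustrative only.

```python
import numpy as np
from scipy.stats import norm

# Sketch of the Karlin-Rubin ingredients for the assumed N(theta, 1)
# family; theta0, theta1 and alpha are illustrative values.
theta0, theta1, alpha = 0.0, 1.0, 0.05

# l(x) = f_theta1(x) / f_theta0(x) is monotone increasing in x.
xs = np.linspace(-3.0, 3.0, 7)
lr = norm.pdf(xs, loc=theta1) / norm.pdf(xs, loc=theta0)
print(np.all(np.diff(lr) > 0))      # True

# Threshold chosen so that E_{theta0}[phi(X)] = alpha.
x0 = norm.isf(alpha, loc=theta0)
print(x0, norm.sf(x0, loc=theta0))  # x0 ~ 1.645, size ~ 0.05
```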
Important case: exponential family
Although the Karlin–Rubin theorem may seem weak because of its restriction to a scalar parameter and a scalar measurement, it turns out that there exists a host of problems for which the theorem holds. In particular, the one-dimensional exponential family of probability density functions or probability mass functions with
$$f_{\theta}(x) = g(\theta) h(x) \exp(\eta(\theta) T(x))$$

has a monotone non-decreasing likelihood ratio in the sufficient statistic $T(x)$, provided that $\eta(\theta)$ is non-decreasing.
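For instance (an assumed example, not drawn from the text above), the Poisson family $f_\theta(x) = e^{-\theta} \theta^x / x!$ fits this form with $g(\theta) = e^{-\theta}$, $h(x) = 1/x!$, $\eta(\theta) = \log\theta$ (non-decreasing), and $T(x) = x$; the sketch below checks the monotone likelihood ratio numerically.

```python
import numpy as np
from scipy.stats import poisson

# Sketch: the Poisson(theta) pmf is a one-parameter exponential family
# with eta(theta) = log(theta) non-decreasing and T(x) = x, so its
# likelihood ratio should be non-decreasing in x. theta0 < theta1 are
# illustrative values.
theta0, theta1 = 2.0, 3.0
x = np.arange(0, 15)
lr = poisson.pmf(x, theta1) / poisson.pmf(x, theta0)
print(np.all(np.diff(lr) > 0))  # True: monotone likelihood ratio in x
```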
Example
Let $X = (X_0, \ldots, X_{M-1})$ denote $M$ i.i.d. normally distributed $N$-dimensional random vectors with mean $\theta m$ and covariance matrix $R$. We then have
$$\begin{aligned} f_{\theta}(X) ={} & (2\pi)^{-MN/2} |R|^{-M/2} \exp\left\{ -\frac{1}{2} \sum_{n=0}^{M-1} (X_n - \theta m)^T R^{-1} (X_n - \theta m) \right\} \\ ={} & (2\pi)^{-MN/2} |R|^{-M/2} \exp\left\{ -\frac{1}{2} \sum_{n=0}^{M-1} \left( \theta^2 m^T R^{-1} m \right) \right\} \exp\left\{ -\frac{1}{2} \sum_{n=0}^{M-1} X_n^T R^{-1} X_n \right\} \exp\left\{ \theta\, m^T R^{-1} \sum_{n=0}^{M-1} X_n \right\} \end{aligned}$$

which is exactly in the form of the exponential family shown in the previous section, with the sufficient statistic being
$$T(X) = m^T R^{-1} \sum_{n=0}^{M-1} X_n.$$

Thus, we conclude that the test
$$\varphi(T) = \begin{cases} 1 & T > t_0 \\ 0 & T < t_0 \end{cases} \qquad \operatorname{E}_{\theta_0} \varphi(T) = \alpha$$

is the UMP test of size $\alpha$ for testing $H_0 : \theta \leq \theta_0$ vs. $H_1 : \theta > \theta_0$.
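A hedged numerical sketch of this test follows; the dimensions, $m$, $R$, $\theta_0$, and $\alpha$ are hypothetical choices. Under $\theta = \theta_0$, $T(X)$ is Gaussian with mean $M \theta_0\, m^T R^{-1} m$ and variance $M\, m^T R^{-1} m$ (since $\operatorname{Var}(m^T R^{-1} X_n) = m^T R^{-1} R\, R^{-1} m$), which gives the threshold $t_0$ in closed form.

```python
import numpy as np
from scipy.stats import norm

# Sketch of the UMP test in the Gaussian example; N, M, m, R, theta0
# and alpha are hypothetical values chosen only for illustration.
rng = np.random.default_rng(0)
N, M, theta0, alpha = 3, 50, 0.0, 0.05
m = np.array([1.0, 0.5, -0.5])
R = np.diag([1.0, 2.0, 0.5])       # covariance matrix
Rinv_m = np.linalg.solve(R, m)     # R^{-1} m
q = m @ Rinv_m                     # m^T R^{-1} m

# Under theta = theta0, T(X) ~ N(M*theta0*q, M*q), so the size-alpha
# threshold t0 solves P_{theta0}(T > t0) = alpha.
t0 = M * theta0 * q + np.sqrt(M * q) * norm.isf(alpha)

# Apply the test to M simulated measurements at a true theta > theta0.
theta_true = 0.3
X = rng.multivariate_normal(theta_true * m, R, size=M)  # shape (M, N)
T = Rinv_m @ X.sum(axis=0)
print("reject H0" if T > t0 else "accept H0")
```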
Further discussion
In general, UMP tests do not exist for vector parameters or for two-sided tests (a test in which the alternative hypothesis lies on both sides of the null hypothesis). The reason is that in these situations, the most powerful test of a given size for one possible value of the parameter (e.g. for $\theta_1$ where $\theta_1 > \theta_0$) is different from the most powerful test of the same size for a different value of the parameter (e.g. for $\theta_2$ where $\theta_2 < \theta_0$). As a result, no test is uniformly most powerful in these situations.
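As a hedged sketch of this point, assume again the $N(\theta, 1)$ family with $H_0 : \theta = 0$ (illustrative values throughout). The most powerful size-$\alpha$ test against an alternative above the null rejects for large $x$, while against an alternative below the null it rejects for small $x$, so no single test is most powerful against both.

```python
from scipy.stats import norm

# Sketch for X ~ N(theta, 1), H0: theta = 0, alpha = 0.05 (assumed
# values): the Neyman-Pearson most powerful test depends on which
# side of the null the alternative lies.
alpha = 0.05
upper = norm.isf(alpha)   # MP vs theta1 > 0: reject if x >  1.645
lower = norm.ppf(alpha)   # MP vs theta2 < 0: reject if x < -1.645
print(upper, lower)       # different reject regions -> no UMP test
                          # against the two-sided H1: theta != 0
```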
Further reading
- Ferguson, T. S. (1967). "Sec. 5.2: Uniformly most powerful tests". Mathematical Statistics: A decision theoretic approach. New York: Academic Press.
- Mood, A. M.; Graybill, F. A.; Boes, D. C. (1974). "Sec. IX.3.2: Uniformly most powerful tests". Introduction to the theory of statistics (3rd ed.). New York: McGraw-Hill.
- Scharf, L. L. (1991). Statistical Signal Processing. Addison-Wesley. Section 4.7.
References
1. Casella, G.; Berger, R. L. (2008). Statistical Inference. Brooks/Cole. ISBN 0-495-39187-5. (Theorem 8.3.17)