In mathematics, the signal-to-noise statistic distance between two vectors $a$ and $b$ with mean values $\mu_a$ and $\mu_b$ and standard deviations $\sigma_a$ and $\sigma_b$, respectively, is:
$$D_{sn} = \frac{\mu_a - \mu_b}{\sigma_a + \sigma_b}$$

In the case of Gaussian-distributed data and unbiased class distributions, this statistic can be related to classification accuracy given an ideal linear discrimination, and a decision boundary can be derived.
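As an illustration of the formula, the statistic can be sketched in a few lines of Python. The function name `signal_to_noise` is hypothetical, and the use of the sample standard deviation (`statistics.stdev`) is an assumption: the definition above does not say whether the sample or the population standard deviation is meant.

```python
from statistics import mean, stdev


def signal_to_noise(a, b):
    """Return (mean(a) - mean(b)) / (stdev(a) + stdev(b)).

    Assumption: the sample standard deviation is used, so each
    vector needs at least two values.
    """
    spread = stdev(a) + stdev(b)
    if spread == 0:
        # Both vectors are constant, so the ratio is undefined.
        raise ValueError("both vectors have zero standard deviation")
    return (mean(a) - mean(b)) / spread


# Means are 2 and 5, both sample standard deviations are 1.
print(signal_to_noise([1, 2, 3], [4, 5, 6]))  # -1.5
```

The sign indicates which vector has the larger mean; swapping the arguments flips the sign but not the magnitude.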
This distance is frequently used to identify vectors that differ significantly. One use is in bioinformatics, to locate genes that are differentially expressed in microarray experiments.
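A minimal sketch of this kind of use is shown below: each gene is scored by the statistic between its expression values in two classes, and genes are ranked by absolute score. The gene names and expression values are made up for illustration, the helper names are hypothetical, and the sample standard deviation is again an assumption. This is not the exact procedure of any particular study.

```python
from statistics import mean, stdev


def signal_to_noise(a, b):
    """(mean(a) - mean(b)) / (stdev(a) + stdev(b)), sample std assumed."""
    return (mean(a) - mean(b)) / (stdev(a) + stdev(b))


def rank_by_signal_to_noise(features):
    """Sort features by |D_sn|, largest first.

    `features` maps a feature name to a pair (class_a_values, class_b_values).
    """
    scores = {name: signal_to_noise(a, b) for name, (a, b) in features.items()}
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


# Hypothetical expression values for three genes in two classes.
expression = {
    "gene_A": ([1, 2, 3], [4, 5, 6]),   # clearly separated classes
    "gene_B": ([1, 2, 3], [1, 2, 4]),   # nearly identical classes
    "gene_C": ([5, 5, 6], [5, 6, 5]),   # identical means
}

ranked = rank_by_signal_to_noise(expression)
for name, score in ranked:
    print(f"{name}: {score:+.3f}")
```

Genes at the top of the ranking are the candidates for differential expression; where to cut the list is a separate choice that this sketch does not address.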
References
Auffarth, B., Lopez, M., Cerquides, J. (2010). Comparison of redundancy and relevance measures for feature selection in tissue classification of CT images. Advances in Data Mining. Applications and Theoretical Aspects, pp. 248–262. Springer. https://www.researchgate.net/publication/225143114_Comparison_of_Redundancy_and_Relevance_Measures_for_Feature_Selection_in_Tissue_Classification_of_CT_Images
Golub, T.R. et al. (1999). Molecular Classification of Cancer: Class Discovery and Class Prediction by Gene Expression Monitoring. Science 286, 531–537. http://archive.broadinstitute.org/mpr/publications/projects/Leukemia/Golub_et_al_1999.pdf
Slonim, D.K. et al. (2000). Class Prediction and Discovery Using Gene Expression Data. Proceedings of the Fourth Annual International Conference on Computational Molecular Biology, Tokyo, Japan, April 8–11, pp. 263–272. https://www.researchgate.net/publication/2804500_Class_Prediction_and_Discovery_Using_Gene_Expression_Data
Pomeroy, S.L. et al. (2002). Gene Expression-Based Classification and Outcome Prediction of Central Nervous System Embryonal Tumors. Nature 415, 436–442. http://www.broad.mit.edu/mpr/CNS/