Though computational statistics is widely used today, it has a relatively short history of acceptance in the statistics community. For the most part, the founders of the field of statistics relied on mathematics and asymptotic approximations, rather than on computation, in the development of statistical methodology.[5]
In 1908, William Sealy Gosset performed his now well-known Monte Carlo simulation, which led to the discovery of the Student's t-distribution.[6] With the help of computational methods, he also produced plots of the empirical distributions overlaid on the corresponding theoretical distributions. The computer has since revolutionized simulation and has made the replication of Gosset's experiment little more than an exercise.[7][8]
Later, scientists developed computational methods for generating pseudo-random deviates, methods for transforming uniform deviates into other distributional forms using the inverse cumulative distribution function or acceptance-rejection methods, and state-space methodology for Markov chain Monte Carlo.[9] One of the first efforts to generate random digits in a fully automated way was undertaken by the RAND Corporation in 1947. The resulting tables were published as a book in 1955, and also as a series of punch cards.
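As an illustration of the inverse cumulative distribution function approach mentioned above, here is a minimal Python sketch (the function name `exponential_from_uniform` and the example rate are illustrative, not taken from any source): uniform deviates on (0, 1) are pushed through the inverse CDF of the exponential distribution to produce exponential deviates.

```python
import math
import random

def exponential_from_uniform(rate, n, seed=None):
    """Draw n exponential(rate) deviates by inverse-transform sampling.

    For the exponential distribution, F(x) = 1 - exp(-rate * x), so the
    inverse CDF is F^{-1}(u) = -ln(1 - u) / rate.  Feeding uniform(0, 1)
    deviates through the inverse CDF yields exponential deviates.
    """
    rng = random.Random(seed)
    return [-math.log(1.0 - rng.random()) / rate for _ in range(n)]

if __name__ == "__main__":
    draws = exponential_from_uniform(rate=2.0, n=10_000, seed=1)
    print(sum(draws) / len(draws))  # should be close to 1 / rate = 0.5
```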
By the mid-1950s, several articles had been published and several patents for random number generating devices had been proposed.[10] The development of these devices was motivated by the need for random digits to perform simulations and other fundamental tasks in statistical analysis. One of the best known of these devices is ERNIE, which produces the random numbers that determine the winners of Premium Bonds, a lottery bond issued in the United Kingdom. In 1958, John Tukey's jackknife was developed. It is a method to reduce the bias of parameter estimates in samples under nonstandard conditions,[11] and it requires computers for practical implementation. By this point, computers had made many tedious statistical studies feasible.[12]
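A minimal Python sketch of the jackknife's bias-reduction idea, assuming a generic estimator passed in as a function (the helper name `jackknife_bias_correct` and the toy data are illustrative): the estimator is recomputed on each leave-one-out sample, and the average of those recomputations is combined with the full-sample estimate to remove the leading-order bias.

```python
def jackknife_bias_correct(data, estimator):
    """Jackknife bias reduction: recompute the estimator on each
    leave-one-out sample and combine with the full-sample estimate."""
    n = len(data)
    theta_hat = estimator(data)
    leave_one_out = [estimator(data[:i] + data[i + 1:]) for i in range(n)]
    theta_dot = sum(leave_one_out) / n
    bias = (n - 1) * (theta_dot - theta_hat)
    return theta_hat - bias  # bias-corrected (jackknifed) estimate

if __name__ == "__main__":
    sample = [2.1, 3.4, 1.9, 5.0, 4.2, 3.3]
    # The plug-in variance estimator divides by n, not n - 1, and is biased.
    biased_var = lambda xs: sum((x - sum(xs) / len(xs)) ** 2 for x in xs) / len(xs)
    print(jackknife_bias_correct(sample, biased_var))  # close to the unbiased variance
```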
Maximum likelihood estimation is used to estimate the parameters of an assumed probability distribution, given some observed data. It is achieved by maximizing a likelihood function so that the observed data is most probable under the assumed statistical model.
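A short Python sketch of the idea, assuming an exponential model with unknown rate (the grid search and function names are illustrative only): the log-likelihood is evaluated over candidate rates, and the maximizer is compared with the closed-form answer 1 / mean.

```python
import math

def exp_log_likelihood(rate, data):
    """Log-likelihood of an exponential(rate) model for the observed data."""
    n = len(data)
    return n * math.log(rate) - rate * sum(data)

def mle_exponential_rate(data, grid=None):
    """Pick the rate that maximizes the log-likelihood over a coarse grid;
    for the exponential model the closed-form answer is 1 / mean(data)."""
    if grid is None:
        grid = [0.01 * k for k in range(1, 1001)]  # candidate rates 0.01 .. 10.0
    return max(grid, key=lambda r: exp_log_likelihood(r, data))

if __name__ == "__main__":
    data = [0.4, 1.2, 0.7, 2.5, 0.9, 1.6]
    print(mle_exponential_rate(data))  # grid-search MLE
    print(len(data) / sum(data))       # closed-form MLE, 1 / mean
```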
Monte Carlo is a statistical method that relies on repeated random sampling to obtain numerical results. The concept is to use randomness to solve problems that might be deterministic in principle. Monte Carlo methods are often applied to physical and mathematical problems and are most useful when other approaches are difficult to apply. They are mainly used in three problem classes: optimization, numerical integration, and generating draws from a probability distribution.
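A minimal Python sketch of Monte Carlo numerical integration, one of the three problem classes listed above (the function name and example integrand are illustrative): the integral of f over [0, 1] is estimated by averaging f at uniformly drawn points.

```python
import math
import random

def monte_carlo_integral(f, n, seed=None):
    """Estimate the integral of f over [0, 1] as the average of f at
    n uniformly drawn points (plain Monte Carlo integration)."""
    rng = random.Random(seed)
    return sum(f(rng.random()) for _ in range(n)) / n

if __name__ == "__main__":
    estimate = monte_carlo_integral(lambda x: math.exp(-x * x), n=100_000, seed=0)
    print(estimate)  # roughly 0.7468, the value of the integral of exp(-x^2) on [0, 1]
```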
The Markov chain Monte Carlo method creates samples from a continuous random variable with probability density proportional to a known function. These samples can be used to evaluate an integral over that variable, such as its expected value or variance. The more steps that are included, the more closely the distribution of the sample matches the actual desired distribution.
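As a sketch of the idea, the following Python code implements a random-walk Metropolis sampler, one standard Markov chain Monte Carlo algorithm (the step size, burn-in length, and target density are illustrative choices): the chain needs the target density only up to a constant of proportionality, and long-run averages over the samples approximate expectations under the target.

```python
import math
import random

def metropolis_sample(log_unnorm_density, n_steps, step_size=1.0, x0=0.0, seed=None):
    """Random-walk Metropolis sampler.

    Proposes x' = x + Normal(0, step_size) and accepts with probability
    min(1, p(x') / p(x)), so the chain's stationary distribution is
    proportional to exp(log_unnorm_density(x)).
    """
    rng = random.Random(seed)
    x = x0
    samples = []
    for _ in range(n_steps):
        proposal = x + rng.gauss(0.0, step_size)
        log_accept = log_unnorm_density(proposal) - log_unnorm_density(x)
        if rng.random() < math.exp(min(0.0, log_accept)):
            x = proposal
        samples.append(x)
    return samples

if __name__ == "__main__":
    # Target density proportional to exp(-x^2 / 2): a standard normal, up to a constant.
    chain = metropolis_sample(lambda x: -0.5 * x * x, n_steps=50_000, seed=3)
    kept = chain[5_000:]                          # discard burn-in
    print(sum(kept) / len(kept))                  # expected value, near 0
    print(sum(v * v for v in kept) / len(kept))   # second moment, near 1 (the variance)
```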
The bootstrap is a resampling technique used to generate samples from an empirical probability distribution defined by an original sample of the population. It can be used to find a bootstrapped estimator of a population parameter. It can also be used to estimate the standard error of an estimator as well as to generate bootstrapped confidence intervals. The jackknife is a related technique.[13]
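A minimal Python sketch of the nonparametric bootstrap, assuming the statistic of interest is the sample mean (the toy data and number of resamples are illustrative): resampling the original data with replacement yields bootstrap replicates of the estimator, from which a standard error and a percentile confidence interval can be read off.

```python
import random

def bootstrap_replicates(data, estimator, n_boot=2000, seed=None):
    """Resample the data with replacement n_boot times and return the
    estimator evaluated on each bootstrap sample."""
    rng = random.Random(seed)
    n = len(data)
    return [estimator([rng.choice(data) for _ in range(n)]) for _ in range(n_boot)]

if __name__ == "__main__":
    sample = [4.2, 5.1, 3.8, 6.0, 4.9, 5.5, 4.4, 5.2]
    mean = lambda xs: sum(xs) / len(xs)
    reps = sorted(bootstrap_replicates(sample, mean, seed=7))
    boot_mean = mean(reps)
    se = (sum((r - boot_mean) ** 2 for r in reps) / (len(reps) - 1)) ** 0.5
    ci = (reps[int(0.025 * len(reps))], reps[int(0.975 * len(reps))])
    print(se)  # bootstrap standard error of the sample mean
    print(ci)  # 95% percentile bootstrap confidence interval
```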
Nolan, D. & Temple Lang, D. (2010). "Computing in the Statistics Curricula". The American Statistician. 64 (2): 97–107.
Wegman, Edward J. (1988). "Computational Statistics: A New Agenda for Statistical Theory and Practice". Journal of the Washington Academy of Sciences. 78 (4): 310–322. JSTOR.
Lauro, Carlo (1996). "Computational statistics or statistical computing, is that the question?". Computational Statistics & Data Analysis. 23 (1): 191–193. doi:10.1016/0167-9473(96)88920-1.
Watnik, Mitchell (2011). "Early Computational Statistics". Journal of Computational and Graphical Statistics. 20 (4): 811–817. doi:10.1198/jcgs.2011.204b. ISSN 1061-8600.
"Student" [William Sealy Gosset] (1908). "The probable error of a mean". Biometrika. 6 (1): 1–25. doi:10.1093/biomet/6.1.1. JSTOR 2331554.
Trahan, Travis John (2019). Recent Advances in Monte Carlo Methods at Los Alamos National Laboratory (Report). doi:10.2172/1569710. OSTI 1569710.
Metropolis, Nicholas; Ulam, S. (1949). "The Monte Carlo Method". Journal of the American Statistical Association. 44 (247): 335–341. doi:10.1080/01621459.1949.10483310. ISSN 0162-1459. PMID 18139350.
Robert, Christian; Casella, George (2011). "A Short History of Markov Chain Monte Carlo: Subjective Recollections from Incomplete Data". Statistical Science. 26 (1). arXiv:0808.2902. doi:10.1214/10-sts351. ISSN 0883-4237.
L'Ecuyer, Pierre (2017). "History of uniform random number generation". 2017 Winter Simulation Conference (WSC). pp. 202–230. doi:10.1109/WSC.2017.8247790. ISBN 978-1-5386-3428-8.
Quenouille, M. H. (1956). "Notes on Bias in Estimation". Biometrika. 43 (3–4): 353–360. doi:10.1093/biomet/43.3-4.353. ISSN 0006-3444.
Teichroew, Daniel (1965). "A History of Distribution Sampling Prior to the Era of the Computer and its Relevance to Simulation". Journal of the American Statistical Association. 60 (309): 27–49. doi:10.1080/01621459.1965.10480773. ISSN 0162-1459.
Rizzo, Maria (2007). Statistical Computing with R. CRC Press. ISBN 9781420010718.