Sigmoid function
Mathematical function having a characteristic "S"-shaped curve or sigmoid curve

A sigmoid function is any mathematical function whose graph has a characteristic S-shaped curve. A common example is the logistic function, defined by the formula σ(x) = 1 / (1 + e⁻ˣ); in artificial neural networks, "sigmoid" is often used as a synonym for it. Special cases include the Gompertz curve, used in modeling systems that saturate at large values of x, and the ogee curve, used in the spillways of some dams. Sigmoid functions are defined for all real numbers, are usually monotonically increasing, and most often take values between 0 and 1 or between −1 and 1. Variants like the hyperbolic tangent serve as activation functions in artificial neurons, while sigmoid curves appear as cumulative distribution functions in statistics. The logistic function's inverse is the logit function.
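As a minimal sketch of the logistic function and its inverse, the following Python snippet evaluates both and checks that they undo each other. The names `logistic` and `logit` are illustrative choices, and the branch on the sign of x is one common way (assumed here, not taken from the article) to avoid overflow in the exponential.

```python
import math


def logistic(x: float) -> float:
    """Logistic sigmoid 1 / (1 + e^-x), evaluated without overflow."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    # For negative x use the equivalent form e^x / (1 + e^x),
    # so the exponential never receives a large positive argument.
    z = math.exp(x)
    return z / (1.0 + z)


def logit(p: float) -> float:
    """Inverse of the logistic function, defined for 0 < p < 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return math.log(p / (1.0 - p))


if __name__ == "__main__":
    for x in (-4.0, -1.0, 0.0, 1.0, 4.0):
        p = logistic(x)
        print(f"x={x:+.1f}  logistic={p:.6f}  logit(logistic)={logit(p):+.6f}")
```

Applying `logit` to the output of `logistic` recovers the input up to rounding, as long as the output has not been rounded to exactly 0 or 1.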

Definition

A sigmoid function is a bounded, differentiable, real function that is defined for all real input values and has a positive derivative at each point[2][3] and exactly one inflection point.

Properties

In general, a sigmoid function is monotonic, and has a first derivative which is bell shaped. Conversely, the integral of any continuous, non-negative, bell-shaped function (with one local maximum and no local minimum, unless degenerate) will be sigmoidal. Thus the cumulative distribution functions for many common probability distributions are sigmoidal. One such example is the error function, which is related to the cumulative distribution function of a normal distribution; another is the arctan function, which is related to the cumulative distribution function of a Cauchy distribution.
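The link between bell-shaped curves and sigmoids can be checked numerically. The sketch below integrates the standard normal density (a bell-shaped function) with a simple trapezoidal rule and compares the result with the closed form based on the error function. The helper names, the lower cutoff of −10 and the step count are assumptions chosen for illustration.

```python
import math


def gaussian_pdf(t: float) -> float:
    """Standard normal density, a bell-shaped function."""
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)


def integrate_up_to(f, x: float, lower: float = -10.0, steps: int = 20000) -> float:
    """Trapezoidal approximation of the integral of f from `lower` to x."""
    if x <= lower:
        return 0.0
    h = (x - lower) / steps
    total = 0.5 * (f(lower) + f(x))
    for i in range(1, steps):
        total += f(lower + i * h)
    return total * h


def normal_cdf(x: float) -> float:
    """Normal cumulative distribution function written via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


if __name__ == "__main__":
    for x in (-3.0, -1.0, 0.0, 1.0, 3.0):
        approx = integrate_up_to(gaussian_pdf, x)
        print(f"x={x:+.1f}  integral={approx:.6f}  erf-based CDF={normal_cdf(x):.6f}")
```

The running integral rises from near 0 to near 1 along an S-shaped curve, which is the sense in which the cumulative distribution function is sigmoidal.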

A sigmoid function is constrained by a pair of horizontal asymptotes as $x \to \pm\infty$.

A sigmoid function is convex for values less than a particular point, and it is concave for values greater than that point: in many of the examples here, that point is 0.
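For the logistic function this convex-then-concave behaviour can be verified directly. The following worked derivation is a sketch that writes $\sigma$ for the logistic function $\sigma(x) = 1/(1+e^{-x})$:

```latex
\begin{aligned}
\sigma'(x)  &= \frac{e^{-x}}{\left(1+e^{-x}\right)^{2}}
             = \sigma(x)\,\bigl(1-\sigma(x)\bigr) > 0, \\
\sigma''(x) &= \sigma'(x)\,\bigl(1-2\sigma(x)\bigr)
             = \sigma(x)\,\bigl(1-\sigma(x)\bigr)\,\bigl(1-2\sigma(x)\bigr).
\end{aligned}
```

Since $\sigma(x) < 1/2$ for $x < 0$ and $\sigma(x) > 1/2$ for $x > 0$, the second derivative is positive (convex) to the left of 0 and negative (concave) to the right, so the single inflection point is at $x = 0$, where $\sigma'(0) = 1/4$.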

Examples

  • Logistic function: $f(x) = \frac{1}{1 + e^{-x}}$
  • Hyperbolic tangent (shifted and scaled version of the logistic function, above): $f(x) = \tanh x = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}$
  • Arctangent function: $f(x) = \arctan x$
  • Gudermannian function: $f(x) = \operatorname{gd}(x) = \int_{0}^{x} \frac{dt}{\cosh t} = 2\arctan\left(\tanh\left(\frac{x}{2}\right)\right)$
  • Error function: $f(x) = \operatorname{erf}(x) = \frac{2}{\sqrt{\pi}} \int_{0}^{x} e^{-t^{2}}\,dt$
  • Generalised logistic function: $f(x) = \left(1 + e^{-x}\right)^{-\alpha}, \quad \alpha > 0$
  • Smoothstep function: $f(x) = \begin{cases} \left(\int_{0}^{1} \left(1 - u^{2}\right)^{N} du\right)^{-1} \int_{0}^{x} \left(1 - u^{2}\right)^{N} du, & |x| \leq 1 \\ \operatorname{sgn}(x), & |x| \geq 1 \end{cases} \quad N \in \mathbb{Z},\ N \geq 1$
  • Some algebraic functions, for example $f(x) = \frac{x}{\sqrt{1 + x^{2}}}$
  • and in a more general form[4] $f(x) = \frac{x}{\left(1 + |x|^{k}\right)^{1/k}}$
  • Up to shifts and scaling, many sigmoids are special cases of $f(x) = \varphi(\varphi(x, \beta), \alpha)$, where $\varphi(x, \lambda) = \begin{cases} (1 - \lambda x)^{1/\lambda} & \lambda \neq 0 \\ e^{-x} & \lambda = 0 \end{cases}$ is the inverse of the negative Box–Cox transformation, and $\alpha < 1$ and $\beta < 1$ are shape parameters.[5]
  • Smooth transition function[6] normalized to (−1, 1):

$$f(x) = \begin{cases} \dfrac{2}{1 + e^{-2m\frac{x}{1 - x^{2}}}} - 1, & |x| < 1 \\ \operatorname{sgn}(x), & |x| \geq 1 \end{cases} = \begin{cases} \tanh\left(m\dfrac{x}{1 - x^{2}}\right), & |x| < 1 \\ \operatorname{sgn}(x), & |x| \geq 1 \end{cases}$$

using the hyperbolic tangent mentioned above. Here, $m$ is a free parameter encoding the slope at $x = 0$, which must be greater than or equal to $\sqrt{3}$ because any smaller value will result in a function with multiple inflection points, which is therefore not a true sigmoid. This function is unusual because it actually attains the limiting values of −1 and 1 within a finite range, meaning that its value is constant at −1 for all $x \leq -1$ and at 1 for all $x \geq 1$. Nonetheless, it is smooth (infinitely differentiable, $C^{\infty}$) everywhere, including at $x = \pm 1$.
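A few of the examples above can be sketched in Python to make their relationships concrete. The snippet below is illustrative only: the function names are hypothetical, the default k = 2 and m = √3 are assumed values, and the identity tanh(x) = 2·logistic(2x) − 1 is what "shifted and scaled version of the logistic function" refers to.

```python
import math


def logistic(x: float) -> float:
    """Logistic function 1 / (1 + e^-x), written to avoid overflow."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def algebraic_sigmoid(x: float, k: float = 2.0) -> float:
    """General algebraic form x / (1 + |x|^k)^(1/k); k = 2 gives x / sqrt(1 + x^2)."""
    return x / (1.0 + abs(x) ** k) ** (1.0 / k)


def smooth_transition(x: float, m: float = math.sqrt(3.0)) -> float:
    """Smooth transition function normalized to (-1, 1), with slope m at x = 0."""
    if abs(x) >= 1.0:
        # The limiting values are attained exactly outside the open interval (-1, 1).
        return math.copysign(1.0, x)
    return math.tanh(m * x / (1.0 - x * x))


if __name__ == "__main__":
    for x in (-2.0, -0.5, 0.0, 0.5, 2.0):
        print(
            f"x={x:+.1f}  tanh={math.tanh(x):+.4f}  "
            f"2*logistic(2x)-1={2.0 * logistic(2.0 * x) - 1.0:+.4f}  "
            f"algebraic={algebraic_sigmoid(x):+.4f}  "
            f"transition={smooth_transition(x):+.4f}"
        )
```

The second and third printed columns agree, and the last column reaches exactly ±1 once |x| ≥ 1, unlike the other functions, which only approach their asymptotes.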

Applications

Many natural processes, such as those of complex system learning curves, exhibit a progression from small beginnings that accelerates and approaches a climax over time. When a specific mathematical model is lacking, a sigmoid function is often used.[7]

The van Genuchten–Gupta model is based on an inverted S-curve and applied to the response of crop yield to soil salinity.

Examples of the application of the logistic S-curve to the response of crop yield (wheat) to both the soil salinity and depth to water table in the soil are shown in modeling crop response in agriculture.

In artificial neural networks, sometimes non-smooth functions are used instead for efficiency; these are known as hard sigmoids.
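The text does not fix a formula for a hard sigmoid, so the sketch below assumes one common piecewise-linear convention: the line x/6 + 1/2 clipped to the interval [0, 1]. Other slopes are also in use, and the function names are illustrative.

```python
import math


def hard_sigmoid(x: float) -> float:
    """Piecewise-linear stand-in for the logistic function (assumed convention).

    Returns 0 for x <= -3, 1 for x >= 3, and x/6 + 1/2 in between.
    """
    return min(1.0, max(0.0, x / 6.0 + 0.5))


def logistic(x: float) -> float:
    """Reference logistic function, for comparison only."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


if __name__ == "__main__":
    for x in (-4.0, -2.0, 0.0, 2.0, 4.0):
        print(f"x={x:+.1f}  hard={hard_sigmoid(x):.4f}  logistic={logistic(x):.4f}")
```

The hard version needs only a multiplication, an addition and two comparisons, which is where the efficiency gain comes from. The price is a derivative that is discontinuous at the two clipping points.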

In audio signal processing, sigmoid functions are used as waveshaper transfer functions to emulate the sound of analog circuitry clipping.[8]
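A waveshaper of this kind can be sketched by passing each sample through a sigmoid. The example below assumes tanh as the transfer function and a hypothetical `drive` gain applied before shaping. Neither choice comes from the cited source, and real designs often add oversampling and filtering.

```python
import math


def soft_clip(sample: float, drive: float = 4.0) -> float:
    """Waveshaper transfer function: tanh of the amplified input sample."""
    return math.tanh(drive * sample)


def sine_wave(num_samples: int, cycles: float = 1.0) -> list[float]:
    """One or more cycles of a unit-amplitude sine wave as a list of samples."""
    return [math.sin(2.0 * math.pi * cycles * n / num_samples) for n in range(num_samples)]


if __name__ == "__main__":
    clean = sine_wave(16)
    shaped = [soft_clip(s) for s in clean]
    for dry, wet in zip(clean, shaped):
        print(f"in={dry:+.3f}  out={wet:+.3f}")
```

Small inputs pass through almost linearly, scaled by `drive`, while large inputs are squashed smoothly toward ±1 instead of being cut off abruptly.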

In biochemistry and pharmacology, the Hill and Hill–Langmuir equations are sigmoid functions.
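As an illustration, the Hill–Langmuir equation gives the fraction of bound receptor as θ = [L]ⁿ / (K_Aⁿ + [L]ⁿ), where [L] is the ligand concentration, K_A the concentration producing half occupation, and n the Hill coefficient. The sketch below uses hypothetical parameter names and checks that, on a logarithmic concentration axis, this is a logistic curve.

```python
import math


def hill_fraction(ligand: float, k_half: float, n: float) -> float:
    """Fraction bound according to the Hill-Langmuir equation."""
    if ligand < 0.0 or k_half <= 0.0 or n <= 0.0:
        raise ValueError("ligand must be >= 0; k_half and n must be > 0")
    ligand_n = ligand ** n
    return ligand_n / (k_half ** n + ligand_n)


def hill_as_logistic(ligand: float, k_half: float, n: float) -> float:
    """Same quantity written as a logistic function of log-concentration."""
    return 1.0 / (1.0 + math.exp(-n * (math.log(ligand) - math.log(k_half))))


if __name__ == "__main__":
    for conc in (0.1, 0.5, 1.0, 2.0, 10.0):
        print(
            f"[L]={conc:5.1f}  hill={hill_fraction(conc, 1.0, 2.0):.4f}  "
            f"logistic form={hill_as_logistic(conc, 1.0, 2.0):.4f}"
        )
```

The two columns agree for every positive concentration, and the fraction equals one half exactly when the ligand concentration equals `k_half`.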

In computer graphics and real-time rendering, some of the sigmoid functions are used to blend colors or geometry between two values, smoothly and without visible seams or discontinuities.
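A typical blend of this kind can be sketched with the cubic 3t² − 2t³, which is the N = 1 smoothstep from the Examples section rescaled from [−1, 1] to [0, 1]. The function names and the RGB tuple representation below are assumptions made for illustration.

```python
def smoothstep01(t: float) -> float:
    """Cubic smoothstep on [0, 1]: clamps t, then returns 3t^2 - 2t^3."""
    t = min(1.0, max(0.0, t))
    return t * t * (3.0 - 2.0 * t)


def blend(a: float, b: float, t: float) -> float:
    """Interpolate from a to b with an S-shaped weight instead of a linear one."""
    return a + (b - a) * smoothstep01(t)


def blend_rgb(c0: tuple, c1: tuple, t: float) -> tuple:
    """Blend two RGB colors channel by channel."""
    return tuple(blend(x, y, t) for x, y in zip(c0, c1))


if __name__ == "__main__":
    black, white = (0.0, 0.0, 0.0), (255.0, 255.0, 255.0)
    for step in range(5):
        t = step / 4.0
        print(f"t={t:.2f}  color={blend_rgb(black, white, t)}")
```

Because the weight has zero slope at t = 0 and t = 1, the blended value joins the two end values without a visible kink.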

Titration curves between strong acids and strong bases have a sigmoid shape due to the logarithmic nature of the pH scale.
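The shape can be reproduced with a simple model. The sketch below assumes a fully dissociated monoprotic acid and base, ideal behaviour, and a water ion product of 10⁻¹⁴. It solves the charge balance [H⁺] − K_w/[H⁺] = Δ, where Δ is the excess acid concentration after mixing. All names and numerical values are illustrative.

```python
import math

KW = 1.0e-14  # ion product of water (assumed value, roughly 25 degrees Celsius)


def titration_ph(acid_conc: float, acid_vol: float, base_conc: float, base_vol: float) -> float:
    """pH after adding base_vol of strong base to acid_vol of strong acid.

    Concentrations are in mol/L; volumes may be in any unit, as long as
    both volumes use the same one.
    """
    total_vol = acid_vol + base_vol
    excess = (acid_conc * acid_vol - base_conc * base_vol) / total_vol
    root = math.sqrt(excess * excess + 4.0 * KW)
    if excess >= 0.0:
        hydrogen = 0.5 * (excess + root)
    else:
        # Solve for hydroxide first to avoid cancellation, then convert.
        hydroxide = 0.5 * (-excess + root)
        hydrogen = KW / hydroxide
    return -math.log10(hydrogen)


if __name__ == "__main__":
    for added in (0.0, 10.0, 20.0, 24.0, 25.0, 26.0, 30.0, 40.0):
        ph = titration_ph(0.1, 25.0, 0.1, added)
        print(f"base added={added:5.1f} mL  pH={ph:6.3f}")
```

The pH changes slowly far from the equivalence point and very steeply near it, which produces the S-shaped curve when pH is plotted against the volume of base added.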

The logistic function can be calculated efficiently by utilizing type III Unums.[9]

A hierarchy of sigmoid growth models with increasing complexity (number of parameters) was built[10] with the primary goal of re-analyzing kinetic data, the so-called N-t curves, from heterogeneous nucleation experiments[11] in electrochemistry. Not counting the maximal number of nuclei Nmax, the hierarchy at present includes three models, with 1, 2 and 3 parameters respectively: a tanh²-based model called α21,[12] originally devised to describe diffusion-limited crystal growth (not aggregation) in 2D; the Johnson–Mehl–Avrami–Kolmogorov (JMAKn) model;[13] and the Richards model.[14] It was shown that, for this purpose, even the simplest model works, which implies that the revisited experiments are an example of two-step nucleation, with the first step being the growth of the metastable phase in which the nuclei of the stable phase form.[15]

Further reading

  • Mitchell, Tom M. (1997). Machine Learning. WCB McGraw–Hill. ISBN 978-0-07-042807-2. (NB. See "Chapter 4: Artificial Neural Networks", in particular pp. 96–97, where Mitchell uses the terms "logistic function" and "sigmoid function" synonymously. He also calls this function the "squashing function", and the sigmoid (aka logistic) function is used to compress the outputs of the "neurons" in multi-layer neural nets.)
  • Humphrys, Mark. "Continuous output, the sigmoid function". Archived from the original on 2022-07-14. Retrieved 2022-07-14. (NB. Properties of the sigmoid, including how it can shift along axes and how its domain may be transformed.)

References

  1. Han, Jun; Morag, Claudio (1995). "The influence of the sigmoid function parameters on the speed of backpropagation learning". In Mira, José; Sandoval, Francisco (eds.). From Natural to Artificial Neural Computation. Lecture Notes in Computer Science. Vol. 930. pp. 195–201. doi:10.1007/3-540-59497-3_175. ISBN 978-3-540-59497-0.

  2. Han, Jun; Morag, Claudio (1995). "The influence of the sigmoid function parameters on the speed of backpropagation learning". In Mira, José; Sandoval, Francisco (eds.). From Natural to Artificial Neural Computation. Lecture Notes in Computer Science. Vol. 930. pp. 195–201. doi:10.1007/3-540-59497-3_175. ISBN 978-3-540-59497-0.

  3. Ling, Yibei; He, Bin (December 1993). "Entropic analysis of biological growth models". IEEE Transactions on Biomedical Engineering. 40 (12): 1193–1200. doi:10.1109/10.250574. PMID 8125495. https://ieeexplore.ieee.org/document/250574

  4. Dunning, Andrew J.; Kensler, Jennifer; Coudeville, Laurent; Bailleux, Fabrice (2015-12-28). "Some extensions in continuous methods for immunological correlates of protection". BMC Medical Research Methodology. 15 (107): 107. doi:10.1186/s12874-015-0096-9. PMC 4692073. PMID 26707389. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4692073

  5. "grex --- Growth-curve Explorer". GitHub. 2022-07-09. Archived from the original on 2022-08-25. Retrieved 2022-08-25. https://github.com/ogarciav/grex

  6. EpsilonDelta (2022-08-16). "Smooth Transition Function in One Dimension | Smooth Transition Function Series Part 1". 13:29/14:04 – via www.youtube.com. https://www.youtube.com/watch?v=vD5g8aVscUI

  7. Gibbs, Mark N.; Mackay, D. (November 2000). "Variational Gaussian process classifiers". IEEE Transactions on Neural Networks. 11 (6): 1458–1464. doi:10.1109/72.883477. PMID 18249869. S2CID 14456885.

  8. Smith, Julius O. (2010). Physical Audio Signal Processing (2010 ed.). W3K Publishing. ISBN 978-0-9745607-2-4. Archived from the original on 2022-07-14. Retrieved 2020-03-28.

  9. Gustafson, John L.; Yonemoto, Isaac (2017-06-12). "Beating Floating Point at its Own Game: Posit Arithmetic" (PDF). Archived (PDF) from the original on 2022-07-14. Retrieved 2019-12-28.

  10. Kleshtanova, Viktoria; Ivanov, Vassil V; Hodzhaoglu, Feyzim; Prieto, Jose Emilio; Tonchev, Vesselin (2023). "Heterogeneous Substrates Modify Non-Classical Nucleation Pathways: Reanalysis of Kinetic Data from the Electrodeposition of Mercury on Platinum Using Hierarchy of Sigmoid Growth Models". Crystals. 13 (12). MDPI: 1690. Bibcode:2023Cryst..13.1690K. doi:10.3390/cryst13121690. https://doi.org/10.3390%2Fcryst13121690

  11. Markov, I.; Stoycheva, E. (1976). "Saturation Nucleus Density in the Electrodeposition of Metals onto Inert Electrodes II. Experimental". Thin Solid Films. 35 (1). Elsevier: 21–35. doi:10.1016/0040-6090(76)90109-7.

  12. Ivanov, V.V.; Tielemann, C.; Avramova, K.; Reinsch, S.; Tonchev, V. (2023). "Modelling Crystallization: When the Normal Growth Velocity Depends on the Supersaturation". Journal of Physics and Chemistry of Solids. 181. Elsevier: 111542. doi:10.1016/j.jpcs.2022.111542 (inactive 2025-01-28).

  13. Fanfoni, M.; Tomellini, M. (1998). "The Johnson-Mehl-Avrami-Kohnogorov Model: A Brief Review". Il Nuovo Cimento D. 20. Springer: 1171–1182. doi:10.1007/s002690050098.

  14. Tjørve, E.; Tjørve, K.M.C. (2010). "A Unified Approach to the Richards-Model Family for Use in Growth Analyses: Why We Need Only Two Model Forms". Journal of Theoretical Biology. 267 (3). Elsevier: 417–425. doi:10.1016/j.jtbi.2010.02.027. PMID 20176032.

  15. Kleshtanova, Viktoria; Ivanov, Vassil V; Hodzhaoglu, Feyzim; Prieto, Jose Emilio; Tonchev, Vesselin (2023). "Heterogeneous Substrates Modify Non-Classical Nucleation Pathways: Reanalysis of Kinetic Data from the Electrodeposition of Mercury on Platinum Using Hierarchy of Sigmoid Growth Models". Crystals. 13 (12). MDPI: 1690. Bibcode:2023Cryst..13.1690K. doi:10.3390/cryst13121690. https://doi.org/10.3390%2Fcryst13121690