Consider, for example, this code segment in the Java programming language:[5]
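A minimal sketch of such a segment, assuming hypothetical classes named Foo and Helper (only the method name getHelper() comes from the text):

```java
// Helper is a hypothetical stand-in for an object that is costly to construct.
class Helper {
}

// Lazy initialization that is only correct when a single thread calls getHelper().
class Foo {
    private Helper helper;

    public Helper getHelper() {
        if (helper == null) {
            helper = new Helper(); // created on first use
        }
        return helper;
    }
}
```

On a single thread, repeated calls to getHelper() return the same Helper instance.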
The problem is that this does not work when using multiple threads. A lock must be obtained in case two threads call getHelper() simultaneously. Otherwise, either they may both try to create the object at the same time, or one may wind up getting a reference to an incompletely initialized object.
Synchronizing with a lock can fix this, as is shown in the following example:
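A sketch of the synchronized variant, again assuming hypothetical Foo and Helper classes; the only change from unsynchronized lazy initialization is the synchronized modifier on the method:

```java
// Helper and Foo are hypothetical names used for illustration.
class Helper {
}

// Correct under concurrency, but every call acquires and releases the lock on this Foo.
class Foo {
    private Helper helper;

    public synchronized Helper getHelper() {
        if (helper == null) {
            helper = new Helper();
        }
        return helper;
    }
}
```

Because the whole method holds the monitor of the Foo instance, at most one thread can run the null check and the construction at a time.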
This is correct and will most likely perform well enough. However, only the first call to getHelper() creates the object, and only the few threads that try to access it during that time need to be synchronized; after that, every call just gets a reference to the member variable. Since synchronizing a method could in some extreme cases decrease performance by a factor of 100 or higher,[6] acquiring and releasing a lock on every call seems unnecessary once the initialization has been completed. Many programmers, including the authors of the double-checked locking design pattern, have attempted to optimize this situation in the following manner:
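A sketch of that attempted optimization, assuming hypothetical Foo and Helper classes. It illustrates the pattern under discussion and is not safe to use as written in Java:

```java
// Helper and Foo are hypothetical names used for illustration.
class Helper {
}

// BROKEN under the Java memory model: helper is neither volatile nor otherwise
// safely published, so the unsynchronized read below is a data race.
class Foo {
    private Helper helper;

    public Helper getHelper() {
        if (helper == null) {              // first check, without the lock
            synchronized (this) {
                if (helper == null) {      // second check, with the lock held
                    helper = new Helper();
                }
            }
        }
        // A thread that skipped the lock may see a non-null reference here
        // before the writes made by the Helper constructor are visible to it.
        return helper;
    }
}
```

The two null checks give the pattern its name: the first avoids the lock once the field is set, and the second prevents a duplicate construction by a thread that waited for the lock.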
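Intuitively, this algorithm is an efficient solution to the problem. But if the pattern is not written carefully, it will have a data race. For example, thread A may notice that the value is not initialized, obtain the lock, and begin to initialize the value. In some languages, the generated code is allowed to update the shared variable to point to a partially constructed object before A has finished the initialization. Thread B then sees that the shared variable appears to be initialized, skips the lock, and returns its value; if B uses the object before A's initialization is visible to it, the program may misbehave or crash.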
Most runtimes have memory barriers or other methods for managing memory visibility across execution units. Without a detailed understanding of the language's behavior in this area, the algorithm is difficult to implement correctly. One of the dangers of using double-checked locking is that even a naive implementation will appear to work most of the time: it is not easy to distinguish between a correct implementation of the technique and one that has subtle problems. Depending on the compiler, the interleaving of threads by the scheduler and the nature of other concurrent system activity, failures resulting from an incorrect implementation of double-checked locking may only occur intermittently. Reproducing the failures can be difficult.
In C++11, double-checked locking is not needed for the singleton pattern, because the standard guarantees that a static local variable is initialized in a thread-safe way:
"If control enters the declaration concurrently while the variable is being initialized, the concurrent execution shall wait for completion of the initialization."
C++11 and beyond also provide a built-in double-checked locking pattern in the form of std::once_flag and std::call_once:
If one truly wishes to use the double-checked idiom instead of the trivially working example above (for instance, because Visual Studio before the 2015 release did not implement the C++11 standard's language about concurrent initialization quoted above[8]), one needs to use acquire and release fences:[9]
pthread_once() must be used to initialize library (or sub-module) code when its API does not have a dedicated initialization procedure required to be called in single-threaded mode.
As of J2SE 5.0, the volatile keyword is defined to create a memory barrier. This allows a solution that ensures that multiple threads handle the singleton instance correctly. This new idiom is described in [3] and [4].
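A sketch of the volatile-based idiom, assuming hypothetical Foo and Helper classes; the names helper and localRef follow the surrounding text:

```java
// Helper and Foo are hypothetical names used for illustration.
class Helper {
}

// Double-checked locking that relies on the Java 5+ semantics of volatile.
class Foo {
    // A volatile write happens-before any later read that observes it,
    // so a reader never sees a partially constructed Helper.
    private volatile Helper helper;

    public Helper getHelper() {
        Helper localRef = helper;          // one volatile read on the fast path
        if (localRef == null) {
            synchronized (this) {
                localRef = helper;         // re-check with the lock held
                if (localRef == null) {
                    helper = localRef = new Helper();
                }
            }
        }
        return localRef;
    }
}
```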
Note the local variable "localRef", which seems unnecessary. The effect of this is that in cases where helper is already initialized (i.e., most of the time), the volatile field is only accessed once (due to "return localRef;" instead of "return helper;"), which can improve the method's overall performance by as much as 40 percent.[10]
Java 9 introduced the VarHandle class, which allows use of relaxed atomics to access fields, giving somewhat faster reads on machines with weak memory models, at the cost of more difficult mechanics and loss of sequential consistency (field accesses no longer participate in the synchronization order, the global order of accesses to volatile fields).[11]
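One way this might look, as a sketch that assumes hypothetical Foo and Helper classes and pairs an acquire read with a release write through a VarHandle (requires Java 9 or later):

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;

// Helper and Foo are hypothetical names used for illustration.
class Helper {
}

class Foo {
    // Every access to this field goes through the VarHandle below.
    private Helper helper;

    private static final VarHandle HELPER;

    static {
        try {
            // Look up a handle for the private field "helper" declared in Foo.
            HELPER = MethodHandles.lookup()
                    .findVarHandle(Foo.class, "helper", Helper.class);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    public Helper getHelper() {
        Helper localRef = (Helper) HELPER.getAcquire(this);   // acquire read
        if (localRef == null) {
            synchronized (this) {
                localRef = (Helper) HELPER.getAcquire(this);  // re-check under the lock
                if (localRef == null) {
                    localRef = new Helper();
                    HELPER.setRelease(this, localRef);        // release write publishes the object
                }
            }
        }
        return localRef;
    }
}
```

The release write orders the constructor's writes before the publication of the reference, and the acquire read orders the reader's later accesses after it, without the stronger ordering that a volatile access carries.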
If the helper object is static (one per class loader), an alternative is the initialization-on-demand holder idiom[12] (see Listing 16.6[13] from the previously cited text).
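A sketch of the holder idiom, assuming hypothetical classes Foo, Helper and a nested HelperHolder:

```java
// Helper, Foo and HelperHolder are hypothetical names used for illustration.
class Helper {
}

class Foo {
    // The JVM initializes HelperHolder, and therefore creates the Helper,
    // on the first call to getHelper(); class initialization is thread-safe.
    private static class HelperHolder {
        static final Helper helper = new Helper();
    }

    public static Helper getHelper() {
        return HelperHolder.helper;
    }
}
```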
This relies on the fact that nested classes are not loaded until they are referenced.
The semantics of final fields in Java 5 can be used to safely publish the helper object without using volatile:[14]
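A sketch of this approach, assuming hypothetical classes Foo, Helper and FinalWrapper; the names helperWrapper and tempWrapper follow the surrounding text:

```java
// Helper, Foo and FinalWrapper are hypothetical names used for illustration.
class Helper {
}

// The final field guarantees that any thread that sees a FinalWrapper
// reference also sees the fully initialized value it wraps.
final class FinalWrapper<T> {
    public final T value;

    FinalWrapper(T value) {
        this.value = value;
    }
}

class Foo {
    private FinalWrapper<Helper> helperWrapper; // deliberately not volatile

    public Helper getHelper() {
        // Read the field once; the null check and the return both use this copy.
        FinalWrapper<Helper> tempWrapper = helperWrapper;
        if (tempWrapper == null) {
            synchronized (this) {
                if (helperWrapper == null) {
                    helperWrapper = new FinalWrapper<>(new Helper());
                }
                tempWrapper = helperWrapper;
            }
        }
        return tempWrapper.value;
    }
}
```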
The local variable tempWrapper is required for correctness: simply using helperWrapper for both null checks and the return statement could fail due to read reordering allowed under the Java Memory Model.[15] Performance of this implementation is not necessarily better than the volatile implementation.
In .NET Framework 4.0, the Lazy<T> class was introduced, which internally uses double-checked locking by default (ExecutionAndPublication mode) to store either the exception that was thrown during construction, or the result of the function that was passed to Lazy<T>:[16]
Schmidt, D.; et al. Pattern-Oriented Software Architecture, Vol. 2, 2000, pp. 353–363.
Pattern languages of program design. 3 (PDF) (Nachdr. ed.). Reading, Mass: Addison-Wesley. 1998. ISBN 978-0201310115.
Gregoire, Marc (24 February 2021). Professional C++. John Wiley & Sons. ISBN 978-1-119-69545-5.
Bacon, David; et al. The "Double-Checked Locking is Broken" Declaration. http://www.cs.umd.edu/~pugh/java/memoryModel/DoubleCheckedLocking.html
Boehm, Hans-J (Jun 2005). "Threads cannot be implemented as a library" (PDF). ACM SIGPLAN Notices. 40 (6): 261–268. doi:10.1145/1064978.1065042. Archived from the original (PDF) on 2017-05-30. Retrieved 2014-08-12. https://web.archive.org/web/20170530160703/http://www.hpl.hp.com/techreports/2004/HPL-2004-209.pdf
Haggar, Peter (1 May 2002). "Double-checked locking and the Singleton pattern". IBM. Archived from the original on 2017-10-27. Retrieved 2022-05-19. https://web.archive.org/web/20171027162134/https://www.ibm.com/developerworks/java/library/j-dcl/index.html
"Support for C++11-14-17 Features (Modern C++)". https://msdn.microsoft.com/en-au/library/hh567368.aspx#concurrencytable
"Double-Checked Locking is Fixed In C++11". http://preshing.com/20130930/double-checked-locking-is-fixed-in-cpp11/
Bloch, Joshua (2018). Effective Java (Third ed.). Addison-Wesley. p. 335. ISBN 978-0-13-468599-1. "On my machine, the method above is about 1.4 times as fast as the obvious version without a local variable."
"Chapter 17. Threads and Locks". docs.oracle.com. Retrieved 2018-07-28. https://docs.oracle.com/javase/specs/jls/se10/html/jls-17.html#jls-17.4.4
Goetz, Brian; et al. Java Concurrency in Practice, 2006, p. 348.
Goetz, Brian; et al. "Java Concurrency in Practice – listings on website". Retrieved 21 October 2014. http://jcip.net.s3-website-us-east-1.amazonaws.com/listings.html
[1] Javamemorymodel-discussion mailing list. https://mailman.cs.umd.edu/mailman/private/javamemorymodel-discussion/2010-July/000422.html
[2] Manson, Jeremy (2008-12-14). "Date-Race-Ful Lazy Initialization for Performance – Java Concurrency (&c)". Retrieved 3 December 2016. http://jeremymanson.blogspot.ru/2008/12/benign-data-races-in-java.html
Albahari, Joseph (2010). "Threading in C#: Using Threads". C# 4.0 in a Nutshell. O'Reilly Media. ISBN 978-0-596-80095-6. "Lazy actually implements […] double-checked locking. Double-checked locking performs an additional volatile read to avoid the cost of obtaining a lock if the object is already initialized."