Main article: Stateful firewall
Most businesses and other institutions use a firewall to protect their internal network from malicious attacks originating from outside. All traffic between the internal network and the external Internet must pass through a firewall, which discards traffic likely to be harmful.
A stateful firewall tracks the state of each logical connection passing through it, and rejects data packets inappropriate for the state of the connection. For example, a website would not be allowed to send a page to a computer on the internal network, unless the computer had requested it. This requires a firewall to keep track of the pages recently requested, and match requests with responses.
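The request/response matching described above can be sketched in a few lines. This is a minimal illustration, not how any particular firewall product works: the `StatefulFilter` class, its method names, and the addresses in the usage note are hypothetical, and a real firewall also tracks protocol state, timeouts, and sequence numbers.

```python
from dataclasses import dataclass, field
from typing import Set, Tuple

# A flow is identified by (internal_ip, internal_port, external_ip, external_port).
Flow = Tuple[str, int, str, int]


@dataclass
class StatefulFilter:
    """Toy connection tracker: inbound packets pass only if they answer a request."""

    open_flows: Set[Flow] = field(default_factory=set)

    def outbound(self, src_ip: str, src_port: int, dst_ip: str, dst_port: int) -> bool:
        # An internal host sends a request; remember the flow so the reply can be matched.
        self.open_flows.add((src_ip, src_port, dst_ip, dst_port))
        return True

    def inbound(self, src_ip: str, src_port: int, dst_ip: str, dst_port: int) -> bool:
        # A reply carries the request's addresses reversed; anything else is rejected.
        return (dst_ip, dst_port, src_ip, src_port) in self.open_flows

    def close(self, src_ip: str, src_port: int, dst_ip: str, dst_port: int) -> None:
        # Forget the flow once the internal host ends the connection.
        self.open_flows.discard((src_ip, src_port, dst_ip, dst_port))
```

With this sketch, a call such as `inbound("203.0.113.7", 443, "10.0.0.5", 50000)` returns `False` until the internal host has first called `outbound("10.0.0.5", 50000, "203.0.113.7", 443)`. The per-flow lookup on every packet is one source of the processing cost discussed in this section.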
A firewall must also analyze network traffic in much more detail than other networking components, such as routers and switches. Routers only have to deal with the network layer, but firewalls must also process the transport and application layers. All this additional processing takes time and limits network throughput. While routers and most other networking components can handle speeds of 100 billion bits per second (100 Gbit/s), firewalls limit traffic to about 1 Gbit/s,[12] which is unacceptable for passing large amounts of scientific data.
Modern firewalls can use custom hardware (ASICs) to accelerate traffic handling and inspection and so achieve higher throughput. This can be an alternative to a Science DMZ and allows in-place inspection through existing firewalls, as long as unified threat management (UTM) inspection is disabled.
While a stateful firewall may be necessary for critical business data, such as financial records, credit cards, employment data, student grades, and trade secrets, science data requires less protection, because copies usually exist in multiple locations and there is less economic incentive to tamper with it.[13]
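To put the two rates quoted above in perspective, the following sketch computes ideal transfer times for a data set. The 100 TB size is a hypothetical figure chosen for illustration, and the calculation ignores protocol overhead, packet loss, and storage limits, so real transfers would take longer.

```python
def transfer_hours(dataset_bytes: float, rate_bits_per_s: float) -> float:
    """Ideal transfer time in hours for a data set at a given line rate."""
    return dataset_bytes * 8 / rate_bits_per_s / 3600


DATASET_BYTES = 100e12   # hypothetical 100 TB data set (assumption)
FIREWALL_RATE = 1e9      # about 1 Gbit/s through a firewall
ROUTER_RATE = 100e9      # 100 Gbit/s through routers

print(f"Through the firewall: {transfer_hours(DATASET_BYTES, FIREWALL_RATE):.1f} h")  # 222.2 h
print(f"Through routers only: {transfer_hours(DATASET_BYTES, ROUTER_RATE):.1f} h")    # 2.2 h
```

Under these assumptions the same data set takes more than nine days behind the firewall and a little over two hours without it.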
Main article: DMZ (computing)
A firewall must restrict access to the internal network but allow external access to services offered to the public, such as web servers on the internal network. This is usually accomplished by creating a separate internal network called a DMZ, a play on the term "demilitarized zone." External devices are allowed to access devices in the DMZ. Devices in the DMZ are usually maintained more carefully to reduce their vulnerability to malware. Hardened devices are sometimes called bastion hosts.
The Science DMZ takes the DMZ idea one step further by moving high-performance computing into its own DMZ.[14] Specially configured routers pass science data directly to or from designated devices on an internal network, thereby creating a virtual DMZ. Security is maintained by setting access control lists (ACLs) in the routers to allow only traffic to and from particular sources and destinations. Security is further enhanced by using an intrusion detection system (IDS) to monitor traffic and look for indications of attack. When an attack is detected, the IDS can automatically update router tables, resulting in what some call a Remotely Triggered BlackHole (RTBH).[15]
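The combination of router ACLs and IDS-triggered blocking can be sketched as follows. This is a simplified model under stated assumptions: the `RouterAcl` class and its methods are hypothetical, the addresses come from documentation ranges, and the blackhole is modeled as a simple set of blocked addresses rather than the routing-table update a real deployment would use.

```python
import ipaddress


class RouterAcl:
    """Toy ACL: permit traffic only between listed remote and local networks."""

    def __init__(self, allowed_pairs):
        # Each entry is (remote_network, local_network) in CIDR notation.
        self.allowed = [
            (ipaddress.ip_network(remote), ipaddress.ip_network(local))
            for remote, local in allowed_pairs
        ]
        self.blackholed = set()

    def permits(self, remote_ip: str, local_ip: str) -> bool:
        remote = ipaddress.ip_address(remote_ip)
        local = ipaddress.ip_address(local_ip)
        if remote in self.blackholed:
            # A blackholed source is dropped even if the ACL would allow it.
            return False
        return any(remote in r and local in l for r, l in self.allowed)

    def blackhole(self, remote_ip: str) -> None:
        # Stand-in for the IDS updating router tables after detecting an attack.
        self.blackholed.add(ipaddress.ip_address(remote_ip))


# Allow one collaborating site to reach a single designated transfer host.
acl = RouterAcl([("198.51.100.0/24", "192.0.2.10/32")])
print(acl.permits("198.51.100.20", "192.0.2.10"))  # True
print(acl.permits("203.0.113.5", "192.0.2.10"))    # False
acl.blackhole("198.51.100.20")
print(acl.permits("198.51.100.20", "192.0.2.10"))  # False
```

Unlike a stateful firewall, this check involves no per-connection state: each packet is judged on its addresses alone, which is why routers can apply such rules at line rate.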
The Science DMZ provides a well-configured location for the networking, systems, and security infrastructure that supports high-performance data movement. In data-intensive science environments, data sets have outgrown portable media, and the default configurations used by many equipment and software vendors are inadequate for high-performance applications. The components of the Science DMZ are specifically configured to support high-performance applications and to facilitate the rapid diagnosis of performance problems. Without dedicated infrastructure, it is often impossible to achieve acceptable performance. Simply increasing network bandwidth is usually not enough, because performance problems have many causes, ranging from underpowered firewalls to dirty fiber optics to untuned operating systems.
The Science DMZ is the codification of a set of shared best practices, concepts developed over the years by the scientific networking and systems community. The Science DMZ model describes the essential components of high-performance data transfer infrastructure in a way that is accessible to non-experts and scalable across any size of institution or experiment.
The primary components of a Science DMZ are:
Optional Science DMZ components include:
Dan Goodin (June 26, 2012). "Scientists experience life outside the firewall with 'Science DMZs.'". Retrieved 2013-05-12. https://arstechnica.com/security/2012/06/science-dmz/
Eli Dart; Brian Tierney; Eric Pouyoul; Joe Breen (January 2012). "Achieving the Science DMZ" (PDF). Retrieved 2015-12-31. http://www.internet2.edu/presentations/jt2012winter/ScienceDMZ-Tutorial-Jan2012-1.pdf
Dart, E.; Rotman, L.; Tierney, B.; Hester, M.; Zurawski, J. (2013). "The Science DMZ". Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis on - SC '13. p. 1. doi:10.1145/2503210.2503245. ISBN 978-1-4503-2378-9. S2CID 52861484.
"Why Science DMZ?". Retrieved 2013-05-12. http://fasterdata.es.net/science-dmz/motivation/
Dart, Eli; Metzger, Joe (June 13, 2011). "The Science DMZ". CERN LHCOPN/LHCONE workshop. Retrieved 2013-05-26. This is the earliest cite-able reference to the Science DMZ. Work on the concept had been going on for several years prior to this. http://indico.cern.ch/getFile.py/access?subContId=0&contribId=16&resId=0&materialId=slides&confId=131550
"Implementation of a Science DMZ at San Diego State University to Facilitate High-Performance Data Transfer for Scientific Applications". National Science Foundation. September 10, 2012. Retrieved 2013-05-13. https://www.nsf.gov/awardsearch/showAward?AWD_ID=1245312
"SDNX - Enabling End-to-End Dynamic Science DMZ". National Science Foundation. September 7, 2012. Retrieved 2013-05-13. https://www.nsf.gov/awardsearch/showAward?AWD_ID=1246386
"Improving an existing science DMZ". National Science Foundation. September 12, 2012. Retrieved 2013-05-13. https://www.nsf.gov/awardsearch/showAward?AWD_ID=1246187
Dart, Eli; Rotman, Lauren (Aug 2012). "The Science DMZ: A Network Architecture for Big Data". LBNL-report. http://www.es.net/news-and-publications/publications-and-presentations/
Brett Ryder (Feb 25, 2010). "The Data Deluge". The Economist. http://www.economist.com/node/15579717
"Network Requirements and Expectations". Lawrence Berkeley National Laboratory. http://fasterdata.es.net/fasterdata-home/requirements-and-expectations/
"Firewall Performance Comparison" (PDF). https://www.juniper.net/content/dam/www/assets/datasheets/us/en/security/security-products-comparison-chart.pdf
pmoyer (Dec 13, 2012). "Research & Education Network (REN) Architecture: Science-DMZ". Retrieved 2013-05-12. http://community.brocade.com/community/brocadeblogs/service_providers/blog/2012/12/13/research-education-network-ren-architecture-science-dmz
"Science DMZ: Data Transfer Nodes". Lawrence Berkeley Laboratory. 2013-04-04. Retrieved 2013-05-13. http://fasterdata.es.net/science-dmz/DTN/