Comparison of regular expression engines

This is a comparison of regular expression engines.

Libraries

List of regular expression libraries
Name | Official website | Programming language | Software license | Used by
Boost.Regex[1] | Boost C++ Libraries | C++ | Boost | Notepad++ >= 6.0.0, EmEditor
Boost.Xpressive | Boost C++ Libraries | C++ | Boost |
DEELX | RegExLab | C++ | Proprietary |
FREJ[2] | Fuzzy Regular Expressions for Java | Java | LGPL |
GLib/GRegex[3] | GLib reference manual | C | LGPL |
GNU regex | Gnulib reference manual | C | LGPL | GNU libc, GNU programs
GRETA | Microsoft Research | C++ | Proprietary |
Gregex | Grovf Inc. | RTL, HLS | Proprietary | FPGA accelerated >100 Gbit/s regex engine for cybersecurity, financial, e-commerce industries.
Hyperscan | Intel | C, x86-specific assembly (SSSE3+[4]) | 3-clause BSD | Rspamd
ICU | International Components for Unicode | C, C++[5] | ICU | Foundation (Apple and Swift open-source versions)
Jakarta Regexp | The Apache Jakarta Project | Java | Apache |
java.util.regex | Java's User manual | Java | GNU GPLv2 with Classpath exception | jEdit
JRegex | JRegex | Java | BSD |
MATLAB | Regular Expressions | MATLAB Language | Proprietary |
Oniguruma | Kosako | C | BSD | Atom, Take Command Console, Tera Term, TextMate, Sublime Text, SubEthaEdit, EmEditor, jq, Ruby
Pattwo | Stevesoft | Java (compatible with Java 1.0) | LGPL |
PCRE | pcre.org | C, C++[6] | BSD | Apache HTTP Server, Nginx, BBEdit, Edbrowse, Julia, HHVM, Notepad++ < 6.0.0, PHP, Delphi, R, Exim, SWI-Prolog, Elixir, Erlang
Qt/QRegExp | Digia (archived 2013-12-12 at the Wayback Machine) | C++ | Qt GNU GPL v. 3.0, Qt GNU LGPL v. 2.1, Qt Commercial | Kate, Kile
regex - Henry Spencer's regular expression libraries | ArgList | C | BSD |
RE2 | RE2 | C++ | BSD | Go, Google Sheets, Gmail, G Suite
Henry Spencer's Advanced Regular Expressions | Tcl | C | BSD |
RGX | RGX | C++ based component library | P6R |
RXP | Titan IC | RTL | Proprietary | Hardware-accelerated search acceleration using RegEx, available for ASIC, FPGA and cloud. Enables massively parallel content processing at ultra-high speeds.
SubReg | Matt Bucknall | C | MIT |
TPerlRegEx | TPerlRegEx VCL Component | Object Pascal | MPLv1.1 |
TRE[7] | Ville Laurikari | C | BSD | musl
TRegExpr | TRegExpr, documentation, (RegExp Studio) | Object Pascal | Dual-license: freeware, or LGPL with static linking exception | Total Commander
Wolfram Language (Mathematica) | Wolfram Language Documentation Center | Wolfram Language | Proprietary | Mathematica, the Wolfram Development Platform
XRegExp | XRegExp | JavaScript | MIT |

Languages

List of languages and frameworks including regular expression support
Language | Official website | Software license | Remarks
ActionScript 3 | ActionScript Technology Center | Free |
APL (APLX, Dyalog, GNU) | APL Wiki | Licensed by the respective implementation | ⎕SS (PCRE), ⎕R/⎕S (PCRE), ⎕SS (PCRE2), respectively
C++11 (C++) | C++ standards website | Licensed by the respective implementation | Since ISO/IEC 14882:2011(E); the default grammar is similar to ECMAScript (Grammar Description)
D | D | Boost Software License | [8]
Elixir | elixir-lang.org | Apache 2.0 | Standard library includes PCRE-based Regex module. The matching algorithms of the library are based on the PCRE library, but not all of the PCRE library is interfaced and some parts of the library go beyond what PCRE offers. Currently PCRE version 8.40 (release date 2017-01-11) is used.
Erlang | erlang.org | Apache 2.0 | Standard library includes PCRE-based re module. The matching algorithms of the library are based on the PCRE library, but not all of the PCRE library is interfaced and some parts of the library go beyond what PCRE offers. Currently PCRE version 8.40 (release date 2017-01-11) is used.
Free Pascal (Object Pascal) | freepascal.org | LGPL with static linking exception | Free Pascal 2.6+ ships with TRegExpr from Sorokin and two other regular expression libraries; see wiki.lazarus.freepascal.org/Regexpr.
Go | go.dev | BSD-style |
Haskell | Haskell.org | BSD3 | Omitted in the language report, and in GHC's Hierarchical Libraries
Java | Java | GNU General Public License | REs are written as strings in source code: all backslashes must be doubled, harming readability.
JavaScript (ECMAScript) | ECMA-262 | BSD3 | Limited, but REs are first-class citizens of the language with a specific /.../mod syntax.
Julia | JuliaLang.org | MIT License | REs are part of the language core library, which uses PCRE; an optional wrapper for ICU (C code) is available.
Lua | Lua.org | MIT License | Uses a simplified, limited dialect; can be bound to a more powerful library, like PCRE, or an alternative parser like LPeg.
Mathematica | Wolfram | Proprietary |
.NET | MSDN | MIT License[9][10] |
Nim | nim-lang.org | MIT License | Standard library includes PCRE-based re and nre modules, as well as various alternatives (e.g. strutils, pegs (Parsing Expression Grammar matching), strscans, parseutils, etc.).
OCaml | Caml | LGPL | As of 2010, the standard module is generally regarded as deprecated;[11] often recommended libraries are pcre (with full support for PCRE) and re (which is not as complete but claims better performance and provides frontends to popular syntaxes: PCRE, Perl, POSIX, Emacs, shell globbing).
Perl | Perl.com | Artistic License, or GNU General Public License | Full, central part of the language
PHP | PHP.net | PHP License | Has two implementations, with PCRE being the more efficient in speed and functionality
POSIX C (C) | POSIX.1 web publication | Licensed by the respective implementation | Supports POSIX BRE and ERE syntax
Python | python.org | Python Software Foundation License | Python has two major implementations: the built-in re module and the regex library.
Ruby | ruby-lang.org | GNU Library General Public License | Ruby 1.8, Ruby 1.9, and Ruby 2.0 and later versions use different engines; Ruby 1.9 integrates Oniguruma, Ruby 2.0 and later integrate Onigmo, a fork of Oniguruma.
Rust | docs.rs | MIT License | The primary regex crate does not allow look-around expressions. There is an Oniguruma binding called onig that does.
SAP ABAP | SAP.com | Proprietary |
Tcl | tcl.tk | Tcl/Tk License (BSD-style) | Tcl library doubles as a regular expression library.
Wolfram Language | Wolfram Research | Proprietary: usable for free on a limited scale on the Wolfram Development platform |
XML Schema | W3C | Licensed by the respective implementation |
XPath 3/XQuery | W3C | Licensed by the respective implementation |

Language features

NOTE: An application that uses a library for regular expression support does not necessarily expose the library's full feature set. For example, GNU grep uses PCRE but supports no lookahead, even though PCRE does.
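
Because a host application may expose only part of an engine's syntax, feature support is best checked empirically. The following is a minimal sketch, assuming Python's built-in re module as the engine under test; the probe patterns and the `probe` helper are hypothetical illustrations, not part of any library. Accepting a pattern at compile time only shows that the syntax is recognised, not that the semantics match another engine.

```python
import re

# Hypothetical probe patterns, one per feature; chosen for illustration only.
PROBES = {
    "non-greedy quantifier": r"a+?",
    "shy (non-capturing) group": r"(?:ab)+",
    "look-ahead": r"a(?=b)",
    "look-behind": r"(?<=a)b",
    "backreference": r"(a)\1",
    "recursion": r"a(?R)?b",
    "variable-length look-behind": r"(?<=a+)b",
}


def probe(patterns=PROBES):
    """Map each feature name to True if re compiles its pattern, else False."""
    results = {}
    for feature, pattern in patterns.items():
        try:
            re.compile(pattern)
            results[feature] = True
        except re.error:
            # The engine rejected the syntax for this feature.
            results[feature] = False
    return results


if __name__ == "__main__":
    for feature, accepted in probe().items():
        print(f"{feature}: {'accepted' if accepted else 'rejected'}")
```

With the built-in re module, the recursion and variable-length look-behind probes are rejected, which is consistent with the footnotes that attribute those features to the optional regex library.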

Part 1

Language feature comparison (part 1)
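Engine | "+" quantifier | Negated character classes | Non-greedy quantifiers[12] | Shy groups[13] | Recursion | Look-ahead | Look-behind | Backreferences[14] | >9 indexable captures
Boost.Regex | Yes | Yes | Yes | Yes | Yes[15] | Yes | Yes | Yes | Yes
Boost.Xpressive | Yes | Yes | Yes | Yes | Yes[16] | Yes | Yes | Yes | Yes
CL-PPCRE | Yes | Yes | Yes | Yes | No | Yes | Yes | Yes | Yes
EmEditor | Yes | Yes | Yes | Yes | No | Yes | Yes | Yes | No
FREJ | No[17] | No | Some[18] | Yes | No | No | No | Yes | Yes
GLib/GRegex | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes
GNU grep | Yes | Yes | Yes | Yes | No | Yes | Yes | Yes |
Haskell | Yes | Yes | Yes | Yes | No | Yes | Yes | Yes | Yes
RXP | Yes | Yes | Yes | Yes | No | No | No | Yes | Yes
ICU Regex | Yes | Yes | Yes | Yes | No | Yes | Yes | Yes | Yes
Java | Yes | Yes | Yes | Yes | No | Yes | Yes | Yes | Yes
JavaScript (ECMAScript) | Yes | Yes | Yes | Yes | No | Yes | Yes[19] | Yes | Yes
JGsoft | Yes | Yes | Yes | Yes | Yes[20] | Yes | Yes | Yes | Yes
Lua | Yes | Yes | Some[21] | No | No | No | No | Yes | No
.NET | Yes | Yes | Yes | Yes | No | Yes | Yes | Yes | Yes
OCaml | Yes | Yes | No | No | No | No | No | Yes | No
PCRE | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes
Perl | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes
PHP | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes
Python | Yes | Yes | Yes | Yes | Yes[22] | Yes | Yes | Yes | Yes
Qt/QRegExp | Yes | Yes | Yes | Yes | No | Yes | No | Yes | Yes
RE2 | Yes | Yes | Yes | Yes | No | No | No | No | Yes
Ruby, Onigmo | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes
TRE | Yes | Yes | Yes | Yes | No | No | No | Yes | No
Vim | Yes | Yes | Yes | Yes | No | Yes | Yes | Yes | No
RGX | Yes | Yes | Yes | Yes | No | Yes | Yes | Yes | Yes
Tcl | Yes | Yes | Yes | Yes | No | Yes | Yes | Yes | Yes
TRegExpr | Yes | ? | Yes | ? | ? | ? | ? | ? | ?
XML Schema | Yes | Yes | No | No | No | No | No | |
XPath 3/XQuery | Yes | Yes | Yes | Yes | No | No | No | Yes | Yes
XRegExp | Yes | Yes | Yes | Yes | No | Yes | Yes[23] | Yes | Yes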
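
Several of the columns above can be illustrated concretely. The sketch below assumes Python's built-in re module; the sample strings and variable names are illustrative choices, and other engines may differ in syntax or behaviour. It covers non-greedy quantifiers (footnote 12), shy groups (footnote 13), the backreference example from footnote 14, and a capture index above 9.

```python
import re

# Non-greedy quantifier: ".+?" stops at the first closing quote,
# while the greedy ".+" runs to the last one.
text = '"one" and "two"'
greedy = re.findall(r'".+"', text)
lazy = re.findall(r'".+?"', text)

# Backreference: \1 must repeat exactly what group 1 matched.
# The pattern is tested against the whole string; an unanchored search
# would still find the substring "aa" inside "abaab".
backref = re.compile(r"([ab]+)\1")
whole_abab = backref.fullmatch("abab") is not None
whole_abaab = backref.fullmatch("abaab") is not None
inner = backref.search("abaab").group(0)

# Shy group: (?:...) groups for repetition without creating a capture,
# so the only captured group is (c).
shy = re.match(r"(?:ab)+(c)", "ababc")

# More than nine indexable captures: \10 refers to the tenth group.
tenth = re.match(r"(a)(b)(c)(d)(e)(f)(g)(h)(i)(j)\10", "abcdefghijj")

if __name__ == "__main__":
    print(greedy, lazy)
    print(whole_abab, whole_abaab, inner)
    print(shy.groups(), tenth.group(10))
```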

Part 2

Language feature comparison (part 2)
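Engine | Directives[24] | Conditionals | Atomic groups[25] | Named capture[26] | Comments | Embedded code | Unicode property support[27] | Balancing groups[28] | Variable-length look-behinds[29]
Boost.Regex | Yes | Yes | Yes | Yes | Yes | No | Some[30] | No | No
Boost.Xpressive | Yes | No | Yes | Yes | Yes | No | No | No | No
CL-PPCRE | Yes | Yes | Yes | Yes | Yes | Yes | Some[31] | No | No
EmEditor | Yes | Yes | ? | ? | Yes | No | ? | No | No
FREJ | No | No | Yes | Yes | Yes | No | ? | No | No
GLib/GRegex | Yes | Yes | Yes | Yes | Yes | No | Some[32] | No | No
GNU grep | Yes | Yes | ? | Yes | Yes | No | No | No | No
Haskell | ? | ? | ? | ? | ? | No | No | No | No
RXP | Yes | Yes | No | Yes | Yes | No | No | No | No
ICU Regex | Yes | No | Yes | Yes[33] | Yes | No | Yes | No | No
Java | Yes | No | Yes | Yes[34] | Yes | No | Some[35] | No | No
JavaScript (ECMAScript) | No | No | No | Yes | No | No | Some[36][37][38] | No | Yes
JGsoft | Yes | Yes | Yes | Yes | Yes | No | Some[39] | No | Yes
Lua | No | No | No | No | No | No | No | No | No
.NET | Yes | Yes | Yes | Yes | Yes | No | Some[40] | Yes | Yes
OCaml | No | No | No | No | No | No | No | No | No
PCRE | Yes | Yes | Yes | Yes | Yes | Yes | Yes | No | No
Perl | Yes | Yes | Yes | Yes | Yes | Yes | Yes | No | No[41]
PHP | Yes | Yes | Yes | Yes | Yes | No | No | No | No
Python | Yes | Yes | Yes[42] | Yes | Yes | No | Yes[43] | No | Yes[44]
Qt/QRegExp | No | No | No | No | No | No | No | No | No
RE2 | Yes | No | ? | Yes | No | No | Some[45] | No | No
Ruby, Onigmo | Yes | Yes | Yes | Yes | Yes | No | Some[46] | No | No
Tcl | Yes | No | Yes | No | Yes | No | Yes | No | No
TRE | Yes | No | No | No | Yes | No | ? | No | No
Vim | Yes | No | Yes | No | No | No | No | No | Yes
RGX | Yes | Yes | Yes | Yes | Yes | No | Yes | No | No
XML Schema | No | No | No | No | No | No | Yes | No | No
XPath 3/XQuery | No | No | No | No | No | No | Yes | No | No
XRegExp | Leading only | No | No | Yes | Yes | No | Yes | No | Yes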
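
As a rough illustration of four of these columns, the sketch below uses Python's built-in re module (3.6 or later is assumed for scoped inline flags). It shows a directive in the style of footnote 24's "(?i:test)", a named capture, comments in verbose mode, and a conditional. The patterns and sample strings are invented for the example; other engines use different syntax for some of these features.

```python
import re

# Directive: (?i:...) makes only the enclosed part case-insensitive.
scoped = re.compile(r"(?i:test)ing")
scoped_upper_prefix = scoped.fullmatch("TESTing") is not None
scoped_all_upper = scoped.fullmatch("TESTING") is not None

# Named capture plus comments: in re.VERBOSE mode, whitespace is ignored
# and "#" starts a comment that runs to the end of the line.
dated = re.compile(
    r"""
    (?P<year>\d{4})    # four-digit year
    -                  # literal separator
    (?P<month>\d{2})   # two-digit month
    """,
    re.VERBOSE,
)
fields = dated.fullmatch("2024-05").groupdict()

# Conditional: (?(1)>) requires ">" only if group 1 (the "<") took part
# in the match.
tag = re.compile(r"(<)?\w+(?(1)>)")
balanced = tag.fullmatch("<tag>") is not None
bare = tag.fullmatch("tag") is not None
unclosed = tag.fullmatch("<tag") is not None

if __name__ == "__main__":
    print(scoped_upper_prefix, scoped_all_upper)
    print(fields)
    print(balanced, bare, unclosed)
```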

API features

API feature comparison
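Engine | Native UTF-16 support[47] | Native UTF-8 support[48] | Multi-line matching | Partial match[49]
Boost.Regex | No | No | Yes | Yes
GLib/GRegex | Yes | Yes | Yes | Yes
RXP | Yes | Yes | No | Yes
ICU Regex | Yes | No | Yes | ?
Java | Yes[50] | Yes[51] | Yes | Yes
.NET | No[52] | Yes | Yes | ?
PCRE | Yes[53] | Yes | Yes | Yes
Qt/QRegExp | Yes | No | No | Yes[54]
Qt/QRegularExpression | Yes | Yes | Yes | Yes
Tcl | Yes | Yes[55] | Yes | ?
TRE | Yes | Yes | Yes | ?
RGX | No | No | Yes | ?
wxWidgets::wxRegEx[56] | Yes | Yes | Yes | ?
XRegExp | Yes | Yes | Yes | No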
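
The article does not define the "Multi-line matching" column. The sketch below assumes it refers to the mode in which ^ and $ also match at line boundaries, and shows that mode with Python's built-in re module. The sample text and variable names are illustrative only.

```python
import re

log = "alpha 1\nbeta 2\ngamma 3"

# Default mode: ^ matches only at the very start of the string,
# and $ only at the very end.
first_word_only = re.findall(r"^\w+", log)
last_digit_only = re.findall(r"\d$", log)

# Multi-line mode: ^ and $ also match at the start and end of each line.
every_first_word = re.findall(r"^\w+", log, re.MULTILINE)
every_last_digit = re.findall(r"\d$", log, re.MULTILINE)

if __name__ == "__main__":
    print(first_word_only, last_digit_only)
    print(every_first_word, every_last_digit)
```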

References

  1. Formerly called Regex++.

  2. One of the fuzzy regular expression engines.

  3. Included since version 2.13.0.

  4. "Getting Started – Hyperscan 5.4.0 documentation". https://intel.github.io/hyperscan/dev-reference/getting_started.html#requirements

  5. ICU4J, the Java version, does not support regular expressions.

  6. C++ bindings were developed by Google and became officially part of PCRE in 2006.

  7. One of the fuzzy regular expression engines.

  8. "STD.regex - D Programming Language - Digital Mars". http://www.digitalmars.com/d/2.0/phobos/std_regex.html

  9. "Dotnet/Corefx". GitHub. 16 February 2022. https://github.com/dotnet/corefx/blob/7116584186f8f3a886616aaf8cb5d4a982c60e27/src/System.Text.RegularExpressions/src/System/Text/RegularExpressions/Regex.cs#L2

  10. "Dotnet/Corefx". GitHub. 16 February 2022. https://github.com/dotnet/corefx#license

  11. "Regex - Regular Expressions in OCaml". https://stackoverflow.com/questions/3221067#comment3323649_3221067

  12. Non-greedy quantifiers match as few characters as possible, instead of the default of as many as possible. Note that many older, pre-POSIX engines were non-greedy and didn't have greedy quantifiers at all.

  13. Shy groups, also called non-capturing groups, cannot be referred to with backreferences; they are used to speed up matching where the group's content does not need to be accessed later.

  14. Backreferences enable referring to previously matched groups in later parts of the regex and/or replacement string (where applicable). For instance, ([ab]+)\1 matches "abab" but not "abaab".

  15. "Perl Regular Expression Syntax - 1.47.0". http://www.boost.org/doc/libs/1_47_0/libs/regex/doc/html/boost_regex/syntax/perl_syntax.html#boost_regex.syntax.perl_syntax.recursive_expressions

  16. "User's Guide - 1.47.0". http://www.boost.org/doc/libs/1_47_0/doc/html/xpressive/user_s_guide.html#boost_xpressive.user_s_guide.grammars_and_nested_matches.embedding_a_regex_by_reference

  17. FREJ has no repetition quantifiers, but has an "optional" element that behaves similarly to a simple "?" quantifier.

  18. FREJ has no repetition quantifiers, but has an "optional" element that behaves similarly to a simple "?" quantifier.

  19. As of ES2018

  20. "Recursive Regex—Tutorial". https://www.rexegg.com/regex-recursion.php#engines

  21. Lua's only non-greedy quantifier is -, which is a non-greedy version of *. It does not have non-greedy versions of + or ?; in the former case, the non-greedy effect can be achieved by repeating the token followed by -, but in the latter case, there is no equivalent.

  22. Supported by the optional regex library only. https://pypi.org/project/regex/#recursive-patterns-hg-issue-27

  23. As of ES2018

  24. Also known as flag modifiers, mode modifiers, or option letters. Example pattern: "(?i:test)".

  25. Also called independent sub-expressions.

  26. Similar to back references, but with names instead of indices.

  27. "UTS #18: Unicode Regular Expressions". https://www.unicode.org/reports/tr18/

  28. A special feature that allows matching balanced constructs without recursion.

  29. Refers to the possibility of including quantifiers in look-behinds, thus making their length unpredictable.

  30. Unicode property support may be incomplete (products are continuously updated!). All will be incomplete when a new Unicode revision is released until they are updated to comply.

  31. Unicode property support may be incomplete (products are continuously updated!). All will be incomplete when a new Unicode revision is released until they are updated to comply.

  32. Unicode property support may be incomplete (products are continuously updated!). All will be incomplete when a new Unicode revision is released until they are updated to comply.

  33. Available as of ICU55.

  34. Available as of JDK7.

  35. Unicode property support may be incomplete (products are continuously updated!). All will be incomplete when a new Unicode revision is released until they are updated to comply.

  36. Unicode property support may be incomplete (products are continuously updated!). All will be incomplete when a new Unicode revision is released until they are updated to comply.

  37. The support and range of properties is dependent on implementation.

  38. "ECMA-262, 9th edition, June 2018 ECMAScript® 2018 Language Specification". www.ecma-international.org. Retrieved 4 August 2020. https://www.ecma-international.org/ecma-262/9.0/#sec-runtime-semantics-unicodematchproperty-p

  39. Unicode property support may be incomplete (products are continuously updated!). All will be incomplete when a new Unicode revision is released until they are updated to comply.

  40. Unicode property support may be incomplete (products are continuously updated!). All will be incomplete when a new Unicode revision is released until they are updated to comply.

  41. Experimental support added in v5.29.9.

  42. Supported by Python v3.11 and later, and the optional regex library only. https://pypi.python.org/pypi/regex

  43. May only be available in the regex library when used with Python versions after 3.3.

  44. Supported by the optional regex library only. https://pypi.python.org/pypi/regex

  45. Unicode property support may be incomplete (products are continuously updated!). All will be incomplete when a new Unicode revision is released until they are updated to comply.

  46. Unicode property support may be incomplete (products are continuously updated!). All will be incomplete when a new Unicode revision is released until they are updated to comply.

  47. Means the format can be used internally without explicit conversion.

  48. Means the format can be used internally without explicit conversion.

  49. Partial match of the whole regular expression. For example, the pattern ".*END$" will match any string partially, but only strings ending with END fully. http://www.boost.org/doc/libs/1_34_1/libs/regex/doc/partial_matches.html

  50. Supports the Unicode 15.0 standard from 2023. https://docs.oracle.com/en/java/javase/21/docs/api/java.base/java/lang/Character.html

  51. Supports the Unicode 15.0 standard from 2023. https://docs.oracle.com/en/java/javase/21/docs/api/java.base/java/lang/Character.html

  52. The implementation uses the original UCS-2 feature set, so it recognizes only 64K characters in total (versus UTF-16's 1,112,064 characters). A Microsoft developer representative answered a bug report on this as "will not fix" in 2010.

  53. Since version 8.30.

  54. Partial matching is performed implicitly, requiring a separate call to matchedLength() if an exact match fails.

  55. Tcl includes facilities to convert to and from UTF-8.

  56. wxRegEx uses any system-supplied POSIX library; if none is available, and in Unicode mode, it uses Henry Spencer's library.