regexpu-core

regexpu is a source code transpiler that enables the use of ES2015 Unicode regular expressions in JavaScript-of-today (ES5).

regexpu-core contains regexpu's core functionality, i.e. rewritePattern(pattern, flags), which rewrites regular expressions that use the ES2015 u flag into equivalent ES5-compatible regular expression patterns.

Installation

To use regexpu-core programmatically, install it as a dependency via npm:

npm install regexpu-core --save

Then, require it:

const rewritePattern = require('regexpu-core');

API

This module exports a single function named rewritePattern.

rewritePattern(pattern, flags, options)

This function takes a string that represents a regular expression pattern as well as a string representing its flags, and returns an ES5-compatible version of the pattern.

rewritePattern('foo.bar', 'u');
// → 'foo(?:[\\0-\\t\\x0B\\f\\x0E-\\u2027\\u202A-\\uD7FF\\uDC00-\\uFFFF]|[\\uD800-\\uDBFF][\\uDC00-\\uDFFF]|[\\uD800-\\uDBFF])bar'

rewritePattern('[\\u{1D306}-\\u{1D308}a-z]', 'u');
// → '(?:[a-z]|\\uD834[\\uDF06-\\uDF08])'

rewritePattern('[\\u{1D306}-\\u{1D308}a-z]', 'ui');
// → '(?:[a-z\\u017F\\u212A]|\\uD834[\\uDF06-\\uDF08])'
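The extra \u017F and \u212A in the last output come from Unicode case folding: with both u and i set, [a-z] also matches ſ (U+017F) and the Kelvin sign (U+212A). The following sketch checks that the rewritten pattern quoted above agrees with the native one on a few samples. It assumes a runtime with native support for the u flag, and the variable names are illustrative only.

```javascript
// Rewritten output quoted from the `ui` example above.
const rewrittenClass = '(?:[a-z\\u017F\\u212A]|\\uD834[\\uDF06-\\uDF08])';

// ES5-style regex: only the `i` flag, no `u`.
const es5Regex = new RegExp('^' + rewrittenClass + '$', 'i');
// Native ES2015 regex for the original pattern, used for comparison.
const nativeRegex = new RegExp('^[\\u{1D306}-\\u{1D308}a-z]$', 'ui');

const samples = ['s', 'S', '\u017F', '\u212A', '\uD834\uDF06', '\uD834\uDF09'];
const agreement = samples.every((sample) => {
  return es5Regex.test(sample) === nativeRegex.test(sample);
});
console.log(agreement);
// → true

// Without the rewrite, ES5 case-insensitive matching misses U+017F.
console.log(/^[a-z]$/i.test('\u017F'));
// → false
```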

regexpu-core can rewrite non-ES2015 regular expressions too, which is useful to demonstrate how their behavior changes once the u and i flags are added:

// In ES5, the dot operator only matches BMP symbols:
rewritePattern('foo.bar');
// → 'foo(?:[\\0-\\t\\x0B\\f\\x0E-\\u2027\\u202A-\\uFFFF])bar'

// But with the ES2015 `u` flag, it matches astral symbols too:
rewritePattern('foo.bar', 'u');
// → 'foo(?:[\\0-\\t\\x0B\\f\\x0E-\\u2027\\u202A-\\uD7FF\\uDC00-\\uFFFF]|[\\uD800-\\uDBFF][\\uDC00-\\uDFFF]|[\\uD800-\\uDBFF])bar'
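To see the difference in practice, the two outputs above can be compiled as plain ES5 regular expressions and run against a string containing an astral symbol. This is only a sketch that reuses the quoted outputs; the variable names are not part of regexpu-core.

```javascript
// Both patterns are quoted from the outputs above.
const es5Dot = 'foo(?:[\\0-\\t\\x0B\\f\\x0E-\\u2027\\u202A-\\uFFFF])bar';
const unicodeDot = 'foo(?:[\\0-\\t\\x0B\\f\\x0E-\\u2027\\u202A-\\uD7FF\\uDC00-\\uFFFF]|[\\uD800-\\uDBFF][\\uDC00-\\uDFFF]|[\\uD800-\\uDBFF])bar';

// U+1F4A9 is an astral symbol, stored as the surrogate pair \uD83D\uDCA9.
const astralInput = 'foo\uD83D\uDCA9bar';

// Neither regex uses the `u` flag; the anchors force a full-string match.
const es5Matches = new RegExp('^' + es5Dot + '$').test(astralInput);
const unicodeMatches = new RegExp('^' + unicodeDot + '$').test(astralInput);

console.log(es5Matches, unicodeMatches);
// → false true
```

The ES5 dot consumes a single UTF-16 code unit, so it cannot span the surrogate pair, whereas the rewritten pattern has an explicit alternative for a high surrogate followed by a low surrogate.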

The optional options argument recognizes the following properties:

dotAllFlag (default: false)

Setting this option to true enables support for the s (dotAll) flag.

rewritePattern('.');
// → '[\\0-\\t\\x0B\\f\\x0E-\\u2027\\u202A-\\uFFFF]'

rewritePattern('.', '', {
  'dotAllFlag': true
});
// → '[\\0-\\t\\x0B\\f\\x0E-\\u2027\\u202A-\\uFFFF]'

rewritePattern('.', 's', {
  'dotAllFlag': true
});
// → '[\\0-\\uFFFF]'

rewritePattern('.', 'su', {
  'dotAllFlag': true
});
// → '(?:[\\0-\\uD7FF\\uE000-\\uFFFF]|[\\uD800-\\uDBFF][\\uDC00-\\uDFFF]|[\\uD800-\\uDBFF](?![\\uDC00-\\uDFFF])|(?:[^\\uD800-\\uDBFF]|^)[\\uDC00-\\uDFFF])'
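The effect of the s flag is visible in the character classes themselves: the default dot leaves out the four line terminators, and the dotAll version covers every code unit. A minimal check using the two non-Unicode outputs quoted above (the helper name is hypothetical):

```javascript
// Outputs quoted from the examples above (no `u` flag involved).
const defaultDot = '[\\0-\\t\\x0B\\f\\x0E-\\u2027\\u202A-\\uFFFF]';
const dotAllDot = '[\\0-\\uFFFF]';

// The line terminators that `.` skips by default.
const lineTerminators = ['\n', '\r', '\u2028', '\u2029'];

// Hypothetical helper: tests each input against the anchored pattern.
function testEach(source, inputs) {
  const regex = new RegExp('^' + source + '$');
  return inputs.map((input) => regex.test(input));
}

console.log(testEach(defaultDot, lineTerminators));
// → [false, false, false, false]
console.log(testEach(dotAllDot, lineTerminators));
// → [true, true, true, true]
```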

unicodePropertyEscape (default: false)

Setting this option to true enables support for Unicode property escapes:

rewritePattern('\\p{Script_Extensions=Anatolian_Hieroglyphs}', 'u', {
  'unicodePropertyEscape': true
});
// → '(?:\\uD811[\\uDC00-\\uDE46])'
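The output is expressed in UTF-16 surrogates because ES5 patterns cannot address astral code points directly. The sketch below shows the standard surrogate-pair arithmetic for the range U+14400 to U+14646 (the range that appears in the useUnicodeFlag example in this README) and checks the rewritten pattern against it. The helpers toSurrogatePair and toHex are hypothetical and not part of regexpu-core.

```javascript
// Hypothetical helper: UTF-16 surrogate pair for an astral code point.
function toSurrogatePair(codePoint) {
  const offset = codePoint - 0x10000;
  const high = 0xD800 + (offset >> 10);   // top 10 bits
  const low = 0xDC00 + (offset & 0x3FF);  // bottom 10 bits
  return [high, low];
}

function toHex(number) {
  return number.toString(16).toUpperCase();
}

console.log(toSurrogatePair(0x14400).map(toHex));
// → ['D811', 'DC00']
console.log(toSurrogatePair(0x14646).map(toHex));
// → ['D811', 'DE46']

// Rewritten output quoted from the example above, compiled without `u`.
const hieroglyphs = new RegExp('^(?:\\uD811[\\uDC00-\\uDE46])$');
console.log(hieroglyphs.test(String.fromCodePoint(0x14400)));
// → true
console.log(hieroglyphs.test(String.fromCodePoint(0x14647)));
// → false
```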

lookbehind (default: false)

Setting this option to true enables support for lookbehind assertions.

rewritePattern('(?<=.)a', '', {
  'lookbehind': true
});
// → '(?<=[\\0-\\t\\x0B\\f\\x0E-\\u2027\\u202A-\\uFFFF])a'

namedGroup (default: false)

Setting this option to true enables support for named capture groups.

rewritePattern('(?<name>.)\\k<name>', '', {
  'namedGroup': true
});
// → '(.)\\1'

onNamedGroup

This option is a function that gets called when a named capture group is found. It receives two parameters: the name of the group, and its index.

rewritePattern('(?<name>.)\\k<name>', '', {
  'namedGroup': true,
  onNamedGroup(name, index) {
    console.log(name, index);
    // → 'name', 1
  }
});
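Since the rewritten pattern only has numbered groups, one possible use of onNamedGroup is to keep a name-to-index map for reading matches later. The following is a sketch under stated assumptions: createGroupCollector is a hypothetical helper, the option name namedGroup is taken from the heading above, and the callback is invoked by hand with the values reported above so that the snippet runs without regexpu-core installed.

```javascript
// Hypothetical helper (not part of regexpu-core): records each group name
// together with the numeric index that replaces it in the rewritten pattern.
function createGroupCollector() {
  const groups = {};
  return {
    'groups': groups,
    'onNamedGroup': function (name, index) {
      groups[name] = index;
    }
  };
}

const collector = createGroupCollector();

// With regexpu-core installed, the callback would be passed in like this:
// rewritePattern('(?<name>.)\\k<name>', '', {
//   'namedGroup': true,
//   'onNamedGroup': collector.onNamedGroup
// });
// Here the call reported above ('name', 1) is simulated by hand instead.
collector.onNamedGroup('name', 1);

// '(.)\\1' is the rewritten pattern from the namedGroup example.
const match = new RegExp('(.)\\1').exec('xaab');
console.log(match[collector.groups.name]);
// → 'a'
```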

useUnicodeFlag (default: false)

Setting this option to true enables the use of Unicode code point escapes of the form \u{…}. Note that in regular expressions, such escape sequences only work correctly when the ES2015 u flag is set. Enabling this setting often results in more compact output, although there are cases (such as \p{Lu}) where it actually increases the output size.

rewritePattern('\\p{Script_Extensions=Anatolian_Hieroglyphs}', 'u', {
  'unicodePropertyEscape': true,
  'useUnicodeFlag': true
});
// → '[\\u{14400}-\\u{14646}]'
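Because this output relies on \u{…} escapes, it has to be compiled with the u flag. A small check, assuming a runtime with native u flag support (variable names are illustrative):

```javascript
// Compact output quoted from the example above.
const compact = '[\\u{14400}-\\u{14646}]';
const symbol = String.fromCodePoint(0x14400);

// With the `u` flag, the escapes are read as code points.
const withFlag = new RegExp('^' + compact + '$', 'u').test(symbol);

// Without it, the pattern is not interpreted as a code point range.
let withoutFlag;
try {
  withoutFlag = new RegExp('^' + compact + '$').test(symbol);
} catch (error) {
  // Some engines may reject the pattern outright without the `u` flag.
  withoutFlag = false;
}

console.log(withFlag, withoutFlag);
// → true false
```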

Author

Mathias Bynens (twitter/mathias)

License

regexpu-core is available under the MIT license.