You can not select more than 25 topics Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.

83 lines
2.9 KiB

4 years ago
# flatstr
Flattens the underlying C structures of a concatenated JavaScript string
## About
If you're doing lots of string concatenation and then writing that
string somewhere, you may find that passing your string through
`flatstr` vastly improves performance.
## Usage
```js
var flatstr = require('flatstr')
flatstr(someHeavilyConcatenatedString)
```
## Benchmarks
Benchmarks test flat vs non-flat strings being written to
an `fs.WriteStream`.
```
unflattenedManySmallConcats*10000: 147.540ms
flattenedManySmallConcats*10000: 105.994ms
unflattenedSeveralLargeConcats*10000: 287.901ms
flattenedSeveralLargeConcats*10000: 226.121ms
unflattenedExponentialSmallConcats*10000: 410.533ms
flattenedExponentialSmallConcats*10000: 219.973ms
unflattenedExponentialLargeConcats*10000: 2774.230ms
flattenedExponentialLargeConcats*10000: 1862.815ms
```
In each case, flattened strings win,
here's the performance gains from using `flatstr`
```
ManySmallConcats: 28%
SeveralLargeConcats: 21%
ExponentialSmallConcats: 46%
ExponentialLargeConcats: 33%
```
## How does it work
In the v8 C++ layer, JavaScript strings can be represented in two ways.
1. As an array
2. As a tree
When JavaScript strings are concatenated, tree structures are used
to represent them. For the concat operation, this is cheaper than
reallocating a larger array. However, performing other operations
on the tree structures can become costly (particularly where lots of
concatenation has occurred).
V8 has a a method called `String::Flatten`which converts the tree into a C array. This method is typically called before operations that walk through the bytes of the string (for instance, when testing against a regular expression). It may also be called if a string is accessed many times over,
as an optimization on the string. However, strings aren't always flattened. One example is when we pass a string into a `WriteStream`, at some point the string will be converted to a buffer, and this may be expensive if the underlying representation is a tree.
`String::Flatten` is not exposed as a JavaScript function, but it can be triggered as a side effect.
There are several ways to indirectly call `String::Flatten` (see `alt-benchmark.js`),
but coercion to a number appears to be (one of) the cheapest.
However since Node 10 the V8 version has stopped using Flatten in all
places identified. Thus the code has been updated to seamlessly
use the native runtime function `%FlattenString` without having to use
the `--allow-natives-syntax` flag directly.
One final note: calling flatstr too much can in fact negatively effect performance. For instance, don't call it every time you concat (if that
was performant, v8 wouldn't be using trees in the first place). The best
place to use flatstr is just prior to passing it to an API that eventually
runs non-v8 code (such as `fs.WriteStream`, or perhaps `xhr` or DOM apis in the browser).
## Acknowledgements
* Sponsored by nearForm
## License
MIT