7. About a year ago I started
to maintain CSSO
(a CSS minifier)
github.com/css/csso
8. CSSO was based on Gonzales
(a CSS parser)
github.com/css/gonzales
9. What's wrong with Gonzales
• Development stopped in 2013
• Inconvenient and buggy AST format
• Parsing mistakes
• Excessively complex code base
• Slow, high memory consumption, pressure on GC
10. But I didn’t want
to spend my time developing the
parser…
15. PostCSS pros
• Actively developed
• Parses CSS well, even non-standard syntax
+ tolerant mode
• Saves formatting info
• Handy API to work with AST
• Fast
17. That forces developers to
• Use fragile or inefficient approaches
• Invent their own parsers
• Use additional parsers:
postcss-selector-parser
postcss-value-parser
18. Switching to PostCSS meant writing
our own selector and value parsers,
which is pretty much the same as
writing an entirely new parser
19. However, as a result of continuous
refactoring, within a few months
the CSSO parser was completely rewritten
(which was not planned)
20. And was extracted
to a separate project
github.com/csstree/csstree
22. My previous talk about parser performance
CSSO – performance boost story (in Russian)
tinyurl.com/csso-speedup
23. After my talk at the HolyJS
conference the parser's
performance was improved
one more time :)
* Thanks Vyacheslav @mraleph Egorov for inspiration
24.
CSSTree: 24 ms
Mensch: 31 ms
CSSOM: 36 ms
PostCSS: 38 ms
Rework: 81 ms
PostCSS Full: 100 ms
Gonzales: 175 ms
Stylecow: 176 ms
Gonzales PE: 214 ms
ParserLib: 414 ms
bootstrap.css v3.3.7 (146Kb)
github.com/postcss/benchmark
Legend: non-detailed AST, detailed AST
PostCSS Full = PostCSS + postcss-selector-parser + postcss-value-parser
25. Epic fail 😱
As I realised later, I had extracted
the wrong version of the parser
github.com/csstree/csstree/commit/57568c758195153e337f6154874c3bc42dd04450
26.
CSSTree: 24 ms
Mensch: 31 ms
CSSOM: 36 ms
PostCSS: 38 ms
Rework: 81 ms
PostCSS Full: 100 ms
Gonzales: 175 ms
Stylecow: 176 ms
Gonzales PE: 214 ms
ParserLib: 414 ms
bootstrap.css v3.3.7 (146Kb)
github.com/postcss/benchmark
CSSTree time after parser update: 13 ms
43. Key API
    scanner.token      // current token or null
    scanner.next()     // go to the next token
    scanner.lookup(N)  // look ahead, returns the Nth token from the current token
44.
• lookup(N)
fills the token buffer up to N tokens (if they are not
computed yet), returns the token at index N-1 of the buffer
• next()
shifts a token from the buffer, if any, otherwise computes
the next token
45. The same number of tokens is computed,
but not all at once,
which requires less memory
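The lazy buffering described on the last two slides can be sketched as follows. This is only an illustration, not CSSTree's actual scanner: the `LazyScanner` class and the toy `tokenize` generator (one token per character) are hypothetical names introduced here.

```javascript
// Hypothetical tokenizer: yields one token per character (illustration only).
function* tokenize(source) {
    for (let i = 0; i < source.length; i++) {
        yield { type: source.charCodeAt(i), offset: i };
    }
}

class LazyScanner {
    constructor(source) {
        this.tokens = tokenize(source); // tokens are computed on demand
        this.buffer = [];               // tokens computed ahead by lookup()
        this.token = null;              // current token or null
        this.next();
    }

    // Compute one more token, or return null at the end of input.
    compute() {
        const step = this.tokens.next();
        return step.done ? null : step.value;
    }

    // Look ahead: returns the Nth token from the current one (N >= 1).
    lookup(n) {
        while (this.buffer.length < n) {
            const token = this.compute();
            if (token === null) {
                return null;
            }
            this.buffer.push(token);
        }
        return this.buffer[n - 1];
    }

    // Shift a token from the buffer, if any, or compute the next one.
    next() {
        this.token = this.buffer.length > 0 ? this.buffer.shift() : this.compute();
    }
}

const scanner = new LazyScanner('a:b');
```

Only the tokens that `lookup()` actually asked for are held in the buffer, so memory use depends on the look-ahead depth rather than on the size of the source.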
51.
    [
        {
            type: 46,
            value: '.',
            offset: 0,
            line: 1,
            column: 1
        },
        …
    ]
We can avoid storing a substring in the token: it is very expensive for punctuation (moreover, those substrings are never used).
Many constructions are assembled from several substrings. One long substring is better than a concatenation of several small ones.
54. Moreover, not an Array but a TypedArray
Array of objects → arrays of numbers
55. Array vs. TypedArray
• Can't have holes
• Faster in theory (less checking)
• Can be stored outside the heap (when big
enough)
• Prefilled with zeros
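A minimal sketch of the "arrays of numbers" idea, assuming a toy tokenizer that only distinguishes runs of Latin letters from single punctuation characters. The names `tokenizeToArrays`, `IDENT` and `PUNCT` are hypothetical and not taken from CSSTree.

```javascript
// Token types for the toy tokenizer (assumed values, for illustration only).
const IDENT = 1;
const PUNCT = 2;

function isLetter(code) {
    return (code >= 65 && code <= 90) || (code >= 97 && code <= 122);
}

// Returns parallel typed arrays instead of an array of token objects.
function tokenizeToArrays(source) {
    // A token is at least one character long, so source.length is an upper bound.
    const types = new Uint8Array(source.length);
    const starts = new Uint32Array(source.length);
    const ends = new Uint32Array(source.length);
    let count = 0;
    let pos = 0;

    while (pos < source.length) {
        const start = pos;
        if (isLetter(source.charCodeAt(pos))) {
            while (pos < source.length && isLetter(source.charCodeAt(pos))) {
                pos++;
            }
            types[count] = IDENT;
        } else {
            pos++;
            types[count] = PUNCT;
        }
        starts[count] = start;
        ends[count] = pos;
        count++;
    }

    return { types, starts, ends, count };
}

const source = '.foo.bar';
const tokens = tokenizeToArrays(source);

// A class selector spans two tokens ('.' and 'foo'); take one substring
// for the whole construction instead of concatenating per-token strings.
const selector = source.substring(tokens.starts[0], tokens.ends[1]);
```

No string is created per token here; a substring is taken once, and only when a node really needs its text.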
69. Lines & columns
    line = lines[offset];
    column = offset - lines.lastIndexOf(line - 1, offset);
Searching backwards like this is acceptable only for short lines,
that's why we cache the start offset of the last line
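The line/column computation with a cached line start might look like this. It is a sketch under assumptions: `lines` is taken to be a typed array holding the 1-based line number of every character, only `\n` counts as a line break, and `computeLines` and `createLocator` are hypothetical helpers.

```javascript
// Build a per-offset line table: lines[i] is the 1-based line number of source[i].
// Only '\n' is treated as a line break here (a simplification).
function computeLines(source) {
    const lines = new Uint32Array(source.length);
    let line = 1;
    for (let i = 0; i < source.length; i++) {
        lines[i] = line;
        if (source.charCodeAt(i) === 10) { // '\n' ends the current line
            line++;
        }
    }
    return lines;
}

function createLocator(source) {
    const lines = computeLines(source);
    let cachedLine = 0;      // line of the last lookup
    let cachedLineStart = 0; // offset of the first character of that line

    return function locate(offset) {
        const line = lines[offset];
        if (line !== cachedLine) {
            // lastIndexOf scans backwards for the last character of the previous
            // line; it returns -1 for the first line, which gives a start of 0.
            cachedLineStart = lines.lastIndexOf(line - 1, offset) + 1;
            cachedLine = line;
        }
        return { line, column: offset - cachedLineStart + 1 };
    };
}

const locate = createLocator('a {\n  color: red\n}');
```

The backward scan runs only when the requested offset is on a different line than the previous request, so consecutive lookups within one long line stay cheap.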
74. Performance «killers»*
• RegExp
• String concatenation
• toLowerCase/toUpperCase
• substr/substring
• …
* Polluted GC pulls performance down
We can't avoid using these things, but we can get rid of the rest
80. Heuristics
• Comparison with the reference strings only (str)
• Reference strings may be in lower case and
contain Latin letters only (no Unicode)
• I read once on Twitter…
81. Setting the 6th bit to 1 changes an upper-case
Latin letter to lower case
(works for ASCII Latin letters only)
'A' = 01000001
'a' = 01100001
('A'.charCodeAt(0) | 32) === 'a'.charCodeAt(0)
82. Case-insensitive string comparison
    function cmpStr(source, start, end, str) {
        …
        for (var i = start; i < end; i++) {
            …
            // source[i].toLowerCase()
            if (sourceCode >= 65 && sourceCode <= 90) { // 'A' .. 'Z'
                sourceCode = sourceCode | 32;
            }
            if (sourceCode !== strCode) {
                return false;
            }
        }
        …
    }
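The slide elides parts of `cmpStr`. One way to fill them in is shown below; the bounds check, the length check and the way the character codes are read are assumptions of this sketch, not the original implementation.

```javascript
// Compares source[start..end) with a reference string, ignoring ASCII case.
// Assumes `str` is lower case and contains Latin letters only.
function cmpStr(source, start, end, str) {
    if (start < 0 || end > source.length) {
        return false;
    }
    // Most mismatches are rejected here, before any character is read.
    if (end - start !== str.length) {
        return false;
    }
    for (let i = start; i < end; i++) {
        let sourceCode = source.charCodeAt(i);
        const strCode = str.charCodeAt(i - start);

        // Equivalent of source[i].toLowerCase() for 'A'..'Z'
        if (sourceCode >= 65 && sourceCode <= 90) {
            sourceCode = sourceCode | 32;
        }
        if (sourceCode !== strCode) {
            return false;
        }
    }
    return true;
}
```

For example, `cmpStr('@MEDIA screen', 1, 6, 'media')` compares the keyword in place, without calling `substring` or `toLowerCase` on the source.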
83. Benefits
• Most comparisons stop at the length check
• No substring (no pressure on GC)
• No temporary strings (e.g. result of
toLowerCase/toUpperCase)
• String comparison doesn't pollute the GC
86. What's wrong with arrays?
• As arrays grow, their memory
has to be relocated frequently
(unnecessary memory moving)
• Pressure on GC
• We don't know the size of resulting arrays
92. Pros
• No memory relocation
• No GC pollution during AST assembly
• next/prev references for free
• Cheap insertion and deletion
• Better for monomorphic walkers
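Assuming the structure these pros refer to is a doubly linked list (as the next/prev references suggest), a minimal version could look like this. The `List` class and its method names are hypothetical and kept deliberately small.

```javascript
// A minimal doubly linked list (hypothetical `List` class, illustration only).
class List {
    constructor() {
        this.head = null;
        this.tail = null;
    }

    createItem(data) {
        return { prev: null, next: null, data };
    }

    // Append without reallocating or copying existing items.
    append(item) {
        item.prev = this.tail;
        item.next = null;
        if (this.tail !== null) {
            this.tail.next = item;
        } else {
            this.head = item;
        }
        this.tail = item;
        return item;
    }

    // Insert `item` before `ref` in O(1); no elements are shifted.
    insertBefore(item, ref) {
        item.prev = ref.prev;
        item.next = ref;
        if (ref.prev !== null) {
            ref.prev.next = item;
        } else {
            this.head = item;
        }
        ref.prev = item;
        return item;
    }

    // Remove `item` in O(1) by relinking its neighbours.
    remove(item) {
        if (item.prev !== null) {
            item.prev.next = item.next;
        } else {
            this.head = item.next;
        }
        if (item.next !== null) {
            item.next.prev = item.prev;
        } else {
            this.tail = item.prev;
        }
        item.prev = null;
        item.next = null;
        return item;
    }

    toArray() {
        const result = [];
        for (let cursor = this.head; cursor !== null; cursor = cursor.next) {
            result.push(cursor.data);
        }
        return result;
    }
}

const list = new List();
const a = list.append(list.createItem('a'));
const c = list.append(list.createItem('c'));
list.insertBefore(list.createItem('b'), c);
```

Every item has the same shape (`prev`, `next`, `data`), which is what keeps code that walks the list monomorphic.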
93. These approaches and others
reduced memory consumption and
pressure on the GC, and made the parser
twice as fast as before
94.
CSSTree: 24 ms
Mensch: 31 ms
CSSOM: 36 ms
PostCSS: 38 ms
Rework: 81 ms
PostCSS Full: 100 ms
Gonzales: 175 ms
Stylecow: 176 ms
Gonzales PE: 214 ms
ParserLib: 414 ms
bootstrap.css v3.3.7 (146Kb)
github.com/postcss/benchmark
This is the result of these changes: CSSTree at 13 ms
116.
    class Scanner {
        ...
        next() {
            var next = this.currentToken + 1;
            this.currentToken = next;
            this.tokenStart = this.tokenEnd;
            this.tokenEnd = this.offsetAndType[next] & 0xFFFFFF;
            this.tokenType = this.offsetAndType[next] >> 24;
        }
    }
Now we need just one read
117.
    class Scanner {
        ...
        next() {
            var next = this.currentToken + 1;
            this.currentToken = next;
            this.tokenStart = this.tokenEnd;
            next = this.offsetAndType[next];
            this.tokenEnd = next & 0xFFFFFF;
            this.tokenType = next >> 24;
        }
    }
-50% reads (~250k) 👌
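How the offset and the type can share one array element is sketched below. The `pack` helper and the constant names are hypothetical; the layout (type in the high 8 bits, end offset in the low 24 bits) is inferred from the masks used in the slide code.

```javascript
const OFFSET_MASK = 0xFFFFFF; // low 24 bits: token end offset
const TYPE_SHIFT = 24;        // high 8 bits: token type

// Pack a token type (0..255) and an end offset (0..16777215) into one number.
function pack(type, offset) {
    // `>>> 0` keeps the result an unsigned 32-bit integer
    return ((type << TYPE_SHIFT) | offset) >>> 0;
}

const offsetAndType = new Uint32Array(2);
offsetAndType[0] = pack(46, 1);  // a token of type 46 that ends at offset 1
offsetAndType[1] = pack(200, 4); // a token of hypothetical type 200 that ends at offset 4

// One read from the array, then two cheap bit operations on a local variable.
const packed = offsetAndType[1];
const tokenEnd = packed & OFFSET_MASK;
const tokenType = packed >>> TYPE_SHIFT; // unsigned shift keeps types above 127 positive
```

The sketch uses `>>>` where the slide uses `>>`. Both give the same result for type values below 128, but a signed shift would turn larger values negative. A 24-bit offset also limits the source to about 16 million characters.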
121. New strategy
• Preallocate a 16Kb buffer by default
• Create a new buffer only if the current one is smaller
than needed for parsing
• Significantly improves performance,
especially when parsing a number of
small CSS fragments
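The buffer strategy above can be sketched as follows. The `getBuffer` helper is hypothetical, and reading "16Kb" as 16 × 1024 array entries (rather than bytes) is an assumption.

```javascript
const MIN_BUFFER_SIZE = 16 * 1024; // preallocated by default (assumed unit: entries)
let sharedBuffer = new Uint32Array(MIN_BUFFER_SIZE);

// Returns a buffer with room for at least `size` entries, reusing the shared
// one whenever it is large enough. Hypothetical helper for illustration.
function getBuffer(size) {
    if (sharedBuffer.length < size) {
        sharedBuffer = new Uint32Array(Math.max(size, MIN_BUFFER_SIZE));
    }
    return sharedBuffer;
}

const first = getBuffer(100);   // small fragment: reuses the preallocated buffer
const second = getBuffer(2000); // still fits: no allocation
const third = getBuffer(50000); // too big: a larger buffer is allocated
const fourth = getBuffer(100);  // the larger buffer is kept and reused
```

A reused buffer still contains data from the previous parse, so a parser built this way has to track how many entries are valid instead of relying on zero-filled memory.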
122.
CSSTree: 24 ms
Mensch: 31 ms
CSSOM: 36 ms
PostCSS: 38 ms
Rework: 81 ms
PostCSS Full: 100 ms
Gonzales: 175 ms
Stylecow: 176 ms
Gonzales PE: 214 ms
ParserLib: 414 ms
bootstrap.css v3.3.7 (146Kb)
github.com/postcss/benchmark
Current results for CSSTree: 13 ms → 7 ms
130. Your own validator in 8 lines of code
    var csstree = require('css-tree');
    var syntax = csstree.syntax.defaultSyntax;
    var ast = csstree.parse('… your css …');
    csstree.walkDeclarations(ast, function(node) {
        if (!syntax.match(node.property.name, node.value)) {
            console.log(syntax.lastMatchError);
        }
    });
131. Some tools and plugins
• csstree-validator – npm package + cli command
• stylelint-csstree-validator – plugin for stylelint
• gulp-csstree – plugin for gulp
• SublimeLinter-contrib-csstree – plugin for Sublime Text
• vscode-csstree – plugin for VS Code
• csstree-validator – plugin for Atom
More is coming…