Using Xtext for the first time is usually a very positive experience. Although Xtext is a complex generic framework, it is very easy to create your first Xtext-based editor, thanks to Xtext’s smart defaults and intuitive APIs. Even with minimal initial effort, the results are quite spectacular. Unfortunately, the initial excitement often turns into disillusionment as soon as you use your plugin on a big project.
Many development teams hit a performance wall as their plugin gets deployed and has to support larger projects. Internally, Xtext is a complex beast. The internals are carefully hidden from the user, but understanding them is critical to understanding where the performance bottlenecks come from.
At Sigasi we have built commercial tool support for complex hardware description languages (VHDL, Verilog, SystemVerilog) using the Xtext framework. Our plugin needs to handle large, industrial-sized projects (>400k lines of code) that include large generated files (2 to 10 MB). To handle such projects, we have developed a set of techniques over the last four years.
In this talk we will cover some performance critical pieces of the Xtext framework and evaluate what can be done to optimize it (think: parallel loading, caching, fast linking,…). We will also discuss some workarounds that can be used if nothing else works (light-weight editors, reducing the workload of the compiler).
6. The challenge
● Pre-specified languages
● Large projects
○ > 250 KLOC is not uncommon
○ design + external libraries
○ big files
■ some libraries are distributed as 1 file
■ generated hardware cores
7. Adopting Xtext
● Started with the early Xtext 2.0 snapshots
● Initial performance analysis
○ Clean build performance of a big project (330k LOC)
■ > 20 minutes
■ > 2 GB
○ Editing big files (> 1 MB)
■ unusable
8. Adopting Xtext
● Started with the early Xtext 2.0 snapshots
● Initial performance analysis
○ Clean build performance of a big project (330k LOC)
■ > 20 minutes → < 1 min
■ > 2 GB → ~ 1 GB memory
○ Editing big files (> 1 MB)
■ unusable → usable with reduced editor
11. Analyzing builds: builder overview
[Diagram: resource changes → Global indexing → Global index (resource descriptions) → Linking & Validation (errors, warnings on Eclipse resources) → Custom Validation → Builder Participants]
12. Analyzing builds: builder overview
[Diagram: resource changes → Global indexing → Global index (resource descriptions)]
● Usage
○ Location of exported declarations
○ Incremental compilation
● Implementation
○ IDefaultResourceDescriptionStrategy
○ Default: all declarations
○ Customize!
○ Runs before linking!
● IResourceDescription & IEObjectDescription
○ Always in memory
○ Persisted to disk at shutdown
13. Analyzing builds: builder overview
[Diagram: Linking & Validation → linking errors on Eclipse resources]
● Usage
○ Determine the IScope for all cross-references
○ Link each cross-reference or create a linking error
● Implementation
○ ILinkingService, IScopeProvider, LazyLinkingResource
○ Requires the global index for the global scope
○ Direct link or link via the global scope
○ Linking may trigger further linking and resource loading
14. Analyzing builds: builder overview
[Diagram: Linking & Validation → errors, warnings on Eclipse resources]
● Usage
○ Execute all custom validations
○ Create errors / warnings
● Implementation
○ AbstractDeclarativeValidator
○ Executes validations using reflection
○ Works against the linked model
○ May trigger linking & resource loading
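The reflective dispatch behind AbstractDeclarativeValidator can be illustrated with a minimal, self-contained sketch. This is not the Xtext API: the `@Check` annotation, the `issues` list, and the example checks are all stand-ins; the real validator dispatches on EMF model element types and reports through a message acceptor.

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

// Minimal sketch of declarative validation: methods annotated with @Check
// whose single parameter matches the element's type are invoked reflectively.
class DeclarativeValidatorSketch {
    @Retention(RetentionPolicy.RUNTIME)
    @interface Check {}

    final List<String> issues = new ArrayList<>();

    @Check
    void checkName(String name) {
        if (name.isEmpty()) issues.add("error: empty name");
    }

    @Check
    void checkCount(Integer count) {
        if (count < 0) issues.add("warning: negative count");
    }

    // Dispatch: run every @Check method applicable to the element's type.
    void validate(Object element) {
        for (Method m : getClass().getDeclaredMethods()) {
            if (m.isAnnotationPresent(Check.class)
                    && m.getParameterCount() == 1
                    && m.getParameterTypes()[0].isInstance(element)) {
                try {
                    m.invoke(this, element);
                } catch (ReflectiveOperationException e) {
                    throw new RuntimeException(e);
                }
            }
        }
    }

    public static void main(String[] args) {
        DeclarativeValidatorSketch v = new DeclarativeValidatorSketch();
        v.validate("");
        v.validate(-1);
        System.out.println(v.issues);
    }
}
```

Because each check only sees elements of its declared parameter type, adding a validation is just adding a method; no central dispatch code needs to change.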
15. Analyzing builds: builder overview
[Diagram: resource changes → Global indexing → Global index (resource descriptions) → Linking & Validation (errors, warnings on Eclipse resources) → Custom Validation → Builder Participants]
● iterations?
● order?
16. Analyzing builds: metrics
● For each build
○ # of files being built
○ timing: global index, linking, validation, individual builder participants
● Instrument by overriding ClusteringBuilderState & XtextBuilder
● Example: Building 134 resources, timing: { global index=1806, linking=378, validation=823, totalLinkingAndValidation=1364 }
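The kind of per-phase bookkeeping behind that example line can be sketched with plain Java. This is not the Xtext instrumentation itself (that lives in overrides of ClusteringBuilderState and XtextBuilder); the class name, phase names, and output format are illustrative.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;
import java.util.function.Supplier;

// Sketch of per-build phase timing: wrap each build phase, accumulate elapsed
// milliseconds per phase name, and emit one summary line per build.
class BuildMetrics {
    private final Map<String, Long> timings = new LinkedHashMap<>();

    void record(String phase, long millis) {
        timings.merge(phase, millis, Long::sum);
    }

    // Run one phase of work and record how long it took.
    <T> T measure(String phase, Supplier<T> work) {
        long start = System.nanoTime();
        try {
            return work.get();
        } finally {
            record(phase, (System.nanoTime() - start) / 1_000_000);
        }
    }

    String summary(int resourceCount) {
        StringJoiner j = new StringJoiner(", ",
                "Building " + resourceCount + " resources, timing: {", "}");
        timings.forEach((phase, ms) -> j.add(phase + "=" + ms));
        return j.toString();
    }

    public static void main(String[] args) {
        BuildMetrics m = new BuildMetrics();
        m.record("global index", 1806);
        m.record("linking", 378);
        m.record("validation", 823);
        System.out.println(m.summary(134));
    }
}
```

Logging one such line per build (nightly and in the field) is what makes regressions and unexpected workloads visible, as the later notes describe.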
17. Analyzing builds: resource loads
● Observation:
○ Most time spent in resource loads
○ Certain files are loaded multiple times?!
● Solutions
○ Reduce memory pressure
○ Make loading faster
[Diagram: resources → LOAD → Global indexing → (potential reloads) → Linking & Validation → (potential reloads) → Custom Validation → Builder Participants]
19. Memory pressure: EMF models
Reduce EMF size
○ Watch out for inferred model
http://www.sigasi.com/content/view-complexity-your-xtext-ecore-model
○ Avoid
■ EMF classes with just one list of things
● ListOfThings: '(' things+=Thing (',' things+=Thing)* ')'
● class ListOfThings { contains Thing[] things }
■ Often-unused fields
○ Trade-off: code duplication in the grammar vs. an efficient model
○ Fine-grained control with an Xcore model
20. Memory pressure: Global index
In YourResourceDescriptionStrategy
○ Export foo, foo.rec, foo.rec.field1, foo.rec.field2
○ Add user-data: someType & anotherType
○ To reduce memory usage: don’t export child elements
■ export foo.rec + a hash of its fields
■ export foo + a hash of foo’s contents
■ but: these elements can no longer be linked without loading the resource
package foo is
record rec is
field1 : someType;
field2 : anotherType(X downto Y);
end;
end;
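The "export the parent plus a hash of its children" idea can be sketched in a few lines of plain Java. This is not the Xtext user-data API; the class and the signature strings are illustrative. The point is that the index stays small while the builder can still detect that a file's public interface changed by comparing digests.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.List;

// Sketch: export only the parent element, together with a digest of its
// children's signatures, instead of one index entry per child.
class ExportedDescription {
    final String qualifiedName;
    final String childrenHash; // digest of the child signatures

    ExportedDescription(String qualifiedName, List<String> childSignatures) {
        this.qualifiedName = qualifiedName;
        this.childrenHash = hash(childSignatures);
    }

    static String hash(List<String> signatures) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            for (String s : signatures) {
                md.update(s.getBytes(StandardCharsets.UTF_8));
                md.update((byte) 0); // separator so ["ab","c"] != ["a","bc"]
            }
            StringBuilder hex = new StringBuilder();
            for (byte b : md.digest()) hex.append(String.format("%02x", b));
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        ExportedDescription a = new ExportedDescription("foo.rec",
                List.of("field1 : someType", "field2 : anotherType"));
        ExportedDescription b = new ExportedDescription("foo.rec",
                List.of("field1 : someType", "field2 : changedType"));
        // Same name, different digest => the public interface changed,
        // so dependent resources must be rebuilt.
        System.out.println(a.childrenHash.equals(b.childrenHash));
    }
}
```

The trade-off from the slide still applies: a child such as `foo.rec.field1` is no longer in the index, so linking to it requires loading the resource.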
21. Optimize loading
● What is resource load?
○ Parse
○ build EMF model & install EMF proxies
○ build Node model
22. Optimize loading
● What is resource load?
○ Parse
○ build EMF model & install EMF proxies
○ build Node model
● Parallelise
○ parse multiple files simultaneously
○ ~3x faster loads on a 4-core machine
○ only loading, not linking
23. Optimize loading
● What is resource load?
○ Parse
○ build EMF model & install EMF proxies
○ build Node model
● Parallelise
○ parse multiple files simultaneously
○ ~3x faster loads on a 4-core machine
○ only loading, not linking
● Cache
○ serialize the EMF and node model in a cache
○ originally 3–4x faster loads
○ now ~1.5x (no backtracking, simplified grammar)
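The parallel-loading idea sketched above fits a standard thread-pool pattern: submit one parse task per file, then collect the results in file order. This is a stand-alone sketch, not Sigasi's implementation; `parse` is a placeholder for the real resource load (parse + node model + EMF model), and linking deliberately stays out of the pool.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Sketch: parse many files concurrently with a fixed thread pool.
// Only loading is parallelised; linking remains single-threaded.
class ParallelLoader {
    static String parse(String file) {
        return "AST(" + file + ")"; // placeholder for the real resource load
    }

    static List<String> loadAll(List<String> files, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (String f : files) {
                futures.add(pool.submit(() -> parse(f)));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> fu : futures) {
                results.add(fu.get()); // preserves the original file order
            }
            return results;
        } catch (InterruptedException | ExecutionException e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(loadAll(List.of("a.vhd", "b.vhd", "c.vhd"), 4));
    }
}
```

Collecting futures in submission order keeps the output deterministic even though the parses themselves finish in arbitrary order, which matters if later phases assume a stable resource order.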
27. VHDL linking in depth
History of VHDL linking
● 1st version
○ AbstractDeclarativeScopeProvider
○ best effort scoping
● 2nd version
○ removed reflection
● 3rd version
○ a rule-based internal Java DSL
○ first attempt to be 100% correct
● 4th version
○ batch/eager linking
○ type errors
28. VHDL linking in depth
foo(baz) <= bar(bak.f("?"), ('1', 2)) + 1;
● Most elements in an expression are overloaded
○ subprograms, literals, enum literals
● foo(baz)
○ 9 kinds of meanings
○ 4 of them can have subprogram overloading
● overloading includes return type
● overload resolution is very hard
○ you have to find 1 unambiguous solution
○ resolve all cross-references together
29. VHDL linking in depth
Xtext lazy linking
● good
○ declarative: only a few rules
○ fine-grained: can be good for performance
○ re-use: scoping, auto-complete, serialisation
● bad
○ hard to debug
■ can call itself
■ lots of caching
■ indirection, huge stack traces
○ performance
■ a context is built for every cross-reference
30. VHDL linking in depth
Batch/Eager linking
● good
○ simple top-down algorithm
○ natural fit for vhdl
○ well described in literature
● bad
○ resolve 1 reference = resolve all references
○ a lot of extra Xtext customisation
■ auto-complete & serialization?
■ linking errors?
31. VHDL linking in depth
Our hybrid approach
● Eager/batch linking of design units
○ Big files are partially scoped
○ Parent-scope of a design unit is the global scope
○ Local scoping is executed eagerly
● Global scope
○ Import declarations of other design units
○ The only global query is “find design unit x.y”
■ load resource of x.y
■ create an ‘ExternalScope’ & cache it
● Always load dependent resources
○ needed for validation, hovers, highlighting anyway
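The global-scope query in this hybrid approach reduces to a cache lookup keyed by qualified name. Below is a minimal sketch of that caching, with a plain `String` standing in for the real `ExternalScope` and a loader function standing in for loading the resource that contains the design unit; none of these names are Xtext API.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Sketch: the only global lookup is "find design unit x.y". The result
// (an 'ExternalScope' built from the loaded resource) is cached per name,
// so each dependent resource is loaded at most once per build.
class ExternalScopeCache {
    private final Map<String, String> cache = new ConcurrentHashMap<>();
    private final Function<String, String> loader;
    int loads = 0; // how often a resource actually had to be loaded

    ExternalScopeCache(Function<String, String> loader) {
        this.loader = loader;
    }

    String findDesignUnit(String qualifiedName) {
        return cache.computeIfAbsent(qualifiedName, name -> {
            loads++; // cache miss: load resource of x.y, build the scope
            return loader.apply(name);
        });
    }

    public static void main(String[] args) {
        ExternalScopeCache scopes =
                new ExternalScopeCache(n -> "ExternalScope(" + n + ")");
        scopes.findDesignUnit("work.counter");
        scopes.findDesignUnit("work.counter"); // served from the cache
        System.out.println(scopes.loads);
    }
}
```

Because dependent resources are always loaded anyway (for validation, hovers, highlighting), the cache turns repeated global-scope queries into cheap map lookups rather than repeated loads.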
32. VHDL linking in depth
Conclusion
● Easier to implement, debug and optimize
● Type error reporting during linking
● Memory intensive?
○ Every dependency is loaded
○ OK in practice for VHDL
● A lot of Xtext customisation!
○ A lot of classes are affected
○ Forward compatible?
33. UI responsiveness
● Measuring: detect a blocked UI thread
○ initially Svelto https://github.com/dragos/svelto
○ now our own method & logging
○ Eclipse Mars
● Improvements
○ UI is for drawing only!
○ Make sure everything is cancellable
● Safeguards
○ certain services should never be executed on the UI thread => check & log
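The blocked-UI-thread detection can be sketched as a watchdog that posts a ping into the UI thread's work queue and reports whether it was handled within a threshold. Real tools (Svelto, the Eclipse Mars UI freeze monitor) hook into the SWT event loop; this stand-alone version substitutes a plain `BlockingQueue` for the event queue, so everything here is illustrative.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Sketch of UI-freeze detection: enqueue a ping for the "UI" thread and
// check whether it gets acknowledged within the timeout.
class UiWatchdog {
    static boolean uiResponsive(BlockingQueue<CountDownLatch> uiQueue, long timeoutMs) {
        try {
            CountDownLatch ping = new CountDownLatch(1);
            uiQueue.put(ping);
            return ping.await(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<CountDownLatch> uiQueue = new LinkedBlockingQueue<>();
        // Simulated UI thread: drains the queue and acknowledges each ping.
        Thread ui = new Thread(() -> {
            try {
                while (true) uiQueue.take().countDown();
            } catch (InterruptedException e) {
                // shutdown: stop draining, as if the UI thread were blocked
            }
        });
        ui.setDaemon(true);
        ui.start();

        System.out.println(uiResponsive(uiQueue, 1000)); // UI thread is draining
        ui.interrupt(); // simulate a blocked UI thread
        ui.join();
        System.out.println(uiResponsive(uiQueue, 100)); // ping never handled
    }
}
```

When a ping times out, a production version would dump the UI thread's stack trace to the log, which is what makes the offending long-running operation findable after the fact.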
38. Come talk to us about...
● Documentation generation
● Fancy linking algorithms / type systems
● Graphical views
● Cross-language support
● Testing Xtext-plugins
● Lexical macros
● Managing large numbers of validations
● ...
Editor's Notes
Overall structure of the presentation
Quickly introduce Sigasi & its relationship with Xtext
Explain the requirements of our clients, their projects, their expectations of our product
Break down the scalability problems that we had to face
Explain the problems faced with build performance & responsive UIs
Build performance
Start with grammar & model => mostly memory considerations
Explain a build in the ‘builder’ and in the editor
Focus on resource loads
Build performance breakdown: global index, linking, validation, builder participants => introduce metrics?
Evaluate global index phase
Evaluate linking phase
Introduce us:
3500 active users
We’re 2 engineers from sigasi
We’re a company that builds a plugin for VHDL & starting (System)Verilog, hardware design languages, these languages are standardized by IEEE and are the industry standard for designing digital hardware.
We’ve mainly focused on VHDL, but we’re starting to improve our Verilog support
We started using Xtext 4 years ago (just this, we revisit this decision later)
We have a lot of users, worldwide
<add screenshot with code>
to discuss
user definable types
hover, semantic highlighting
complex expressions
hierarchy view
Assist hardware designer:
HW design is very complex. Assist the HW designer as well as possible so that she can focus on the important aspects.
The core of our product is the compiler, written with Xtext
We’re only a front-end compiler => we don’t do code generation, design simulation or synthesis
We use it to… [see points]
Our product gets used on some existing project by our clients.
Vhdl is old
files have a compilation order (we disregard this)
huge spec (800p); can be used as a general purpose language, many additions over the years & many constructs are never used
extra verbose language => loads of opportunities for improving the experience with an IDE
Projects are
messy: source and binary artefacts are sometimes mixed, source folders contain old versions, many files with compilation errors
everything is distributed as source: even huge libraries, that we will have to compile, can’t see the difference with user code
huge files: see slide
emacs & notepad++:
they aren’t used to making clean projects that IDEs can consume
they don’t accept slow editors actions, or an editor that suddenly blocks until a compilation has finished
used to simply being able to open a file, free of context
In 2011 Sigasi had a difficult choice to make, improve our old plugin basically a proof of concept or adopt Xtext.
In a sort-of bet-the-company move we chose xtext. Why? see an older presentation by Hendrik.
After some quick experimentation we chose to use Xtext. Although we knew that there were major performance problems we were confident that we could overcome them. Some of this was with the excellent support of Itemis.
How did we improve?
do nothing => wait for xtext framework improvements
we did some contributions, we’re happy with the cooperation
every release some improvements => don’t forget to read the release notes
typical improvement cycle
what to measure & how to measure?
learn xtext performance characteristics
then you try to improve or try to find a way to circumvent the issue or cheat
This is what happens when you trigger a clean/incremental build
We start with a set of resources.
We make an index of the resources in the global index phase
why? it’s impossible to keep the entire compilation graph in memory
the global index will help us find where certain elements are located without having to load those files
all files are processed, each file gets an entry in the global index
all data goes into a global index, this index stays in memory (at startup/shutdown it is persisted to the disk)
During linking:
all cross-references are resolved; if one cannot be resolved, a linking error is placed on the Eclipse resource
validation follows directly after linking
Builder participants (done)
they do whatever they want
Understand where time is spent
This is a gross simplification of the build pipeline
GI -> L&V -> Part
GI: ResourceDescriptionStrategy
absolutely no linking here
walk parsed, unlinked AST, using ‘get*’ on a cross-reference will cause linking!!
used to find global names and to determine if the public interface exposed by a file has changed
linking: IScopeProvider => link cross-references, if it can’t be linked => create link-error
Validation: Abstract*Validator: create errors & warnings
Builder participant: can do virtually anything, there are ParallelBuilderParticipants
Memory consumption profile is basically your ResourceSet + the global index
fat arrow: => passing a resource set
normally everything is loaded once in the GI phase
loading means parsing, building the node model and building the emf ast => it’s expensive!
if you run out of memory => the resource set gets cleared & a lot of files will need to be reloaded
running out of memory is bad m’kay
in principle, once the GI is made, you can link & validate 1 resource with an empty resource set
small arrow: passing a resource => linking passes linked resource to validator
Performance profile
GI: first ast walk: minimize what you want to do in the global indexing phase (you normally don’t have to walk the entire model)
linking: second ast walk: depends on your scoping complexity
validation: 3rd ast walk: Your validation budget is pretty big (linking & GI is expensive) as long as the analysis is mainly local & dependent resources
avoid throwing uncaught exceptions (mostly NPEs), they are expensive & always caught behind the scenes
Now we’re going to look at EMF resource loading
if your memory is big enough
all resources loaded in global indexing phase
in this phase no linking should be done
pass RS to linking & validation
in linking phase everything gets linked
validation works with an already linked model
pass RS to builder participants
if it isn’t
when there is too much memory pressure the ClusteringBuilderState will clear the ResourceSet
may cause reloads during linking
may cause relinking of reloaded dependencies in validation & builder participants
you can hash subelements
Note: even with incomplete linking, you get OK Xtext support for rename, refactor, …
-> improve incrementally.
Nightly -> see progress, regressions
log also during usage -> detect unexpected use cases
cancellable => some xtext classes like outline have been retrofitted for this