Scaling Xtext
Lieven Lemiengre
Sigasi
● IDE for Hardware Description Languages
○ VHDL, (System)Verilog
● Using Xtext for 4 years
● Large user base
○ (commercial, free, students)
Our company goal
● Assist hardware designers
● High quality interactive front-end compiler
○ Instant feedback
■ parsing, semantic, linting, style checking
○ IDE Services
■ visualisations
■ design exploration
■ documentation generation
○ Integrate with ecosystem
■ other compilers, simulators, synthesizers
Visualisations
The challenge
● Pre-specified languages
● Large projects
○ > 250 KLOC is not uncommon
○ design + external libraries
○ big files
■ some libraries are distributed as 1 file
■ generated hardware cores
Adopting Xtext
● Started with the early Xtext 2.0 snapshots
● Initial performance analysis
○ Clean build performance of a big project (330k LOC)
■ > 20 minutes → < 1 min
■ > 2 GB → ~ 1 GB memory
○ Editing big files (> 1 MB)
■ unusable → usable with reduced editor
Improving performance
● Xtext framework improvements
● Measure → analyze → improve or cheat
○ faster build
○ reduce memory usage
○ UI responsiveness
Overview
● Analysing build performance
○ Analyze the build
■ Macro build measurements
■ Key performance points
● Reduce workload
● Parallelize the build
○ Tracking performance
○ VHDL linking in depth
● Analyzing UI issues
○ Monitoring the UI thread
○ Safeguards
Analyzing builds: builder overview
[Diagram labels: resource changes, Global indexing, resource descriptions, Global index, Linking validation, Custom Validation, Builder Participants, errors, warnings, Eclipse resources, ?]
Analyzing builds: builder overview
[Diagram labels: resource changes, Global indexing, resource descriptions, Global index]
● Usage
○ Location of exported declarations
○ Incremental compilation
● Implementation
○ IDefaultResourceDescriptionStrategy
○ Default: all declarations
○ Customize!
○ Runs before linking!
● IResourceDescriptions & IEObjectDescriptions
○ Always in memory
○ Persisted to disk at shutdown
Analyzing builds: builder overview
[Diagram labels: Linking validation, linking errors, Eclipse resources]
● Usage
○ Determine the IScope of all cross-references
○ Link each cross-reference or create a linking error
● Implementation
○ ILinkingService, IScopeProvider, LazyLinkingResource
○ Requires the global index for the global scope
○ Direct link or link to global scope
○ Linking may trigger further linking and resource loading
Analyzing builds: builder overview
● Usage
○ Execute all custom validations
○ Creates errors / warnings
● Implementation
○ AbstractDeclarativeValidator
○ Execute validations using reflection
○ Works against linked model
○ May trigger linking & resource loading
[Diagram labels: Linking validation, errors, warnings, Eclipse resources]
Analyzing builds: builder overview
[Diagram labels: resource changes, Global indexing, resource descriptions, Global index, Linking validation, Custom Validation, Builder Participants, errors, warnings, Eclipse resources, ?]
● iterations ?
● order ?
Analyzing builds: metrics
● For each build
○ # of files being built
○ timing: Global index, Linking, Validation, individual builder participants
● Instrument by overriding ClusteringBuilderState & XtextBuilder
● Example: Building 134 resources, timing: {
global index=1806,
linking=378,
validation=823,
totalLinkingAndValidation=1364
}
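The per-phase measurement above can be sketched in plain Java. This is only an illustration: BuildTimer and its methods are hypothetical names, and the sketch does not show how the phases are hooked into ClusteringBuilderState or XtextBuilder.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Supplier;

/** Hypothetical helper: accumulates wall-clock time per build phase. */
class BuildTimer {
    // LinkedHashMap keeps the phases in the order they were first measured.
    private final Map<String, Long> millisByPhase = new LinkedHashMap<>();

    /** Runs one phase and adds its elapsed time to that phase's total. */
    <T> T time(String phase, Supplier<T> work) {
        long start = System.nanoTime();
        try {
            return work.get();
        } finally {
            long elapsedMillis = (System.nanoTime() - start) / 1_000_000;
            millisByPhase.merge(phase, elapsedMillis, Long::sum);
        }
    }

    /** Formats one log line in the style of the example above. */
    String report(int resourceCount) {
        return "Building " + resourceCount + " resources, timing: " + millisByPhase;
    }

    Map<String, Long> timings() {
        return millisByPhase;
    }
}
```

An overridden builder method would wrap each phase in a call such as timer.time("linking", ...) and log timer.report(n) once per build.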
Analyzing builds: resource loads
● Observation:
○ Most time spent in resource loads
○ Certain files are loaded multiple times?!
● Solutions
○ Reduce memory pressure
○ Make loading faster
[Diagram labels: Global indexing, Linking validation, Custom Validation, Builder Participants, resources; annotations: LOAD, POTENTIAL RELOADS (twice)]
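One way to confirm the "loaded multiple times" observation is to count loads per resource. A minimal sketch in plain Java; LoadCounter is a hypothetical name, and where recordLoad would be called inside Xtext is left open.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical helper: counts how often each resource is loaded in a build. */
class LoadCounter {
    private final Map<String, AtomicInteger> loads = new ConcurrentHashMap<>();

    /** Call this wherever a resource is actually parsed (assumed hook). */
    void recordLoad(String uri) {
        loads.computeIfAbsent(uri, key -> new AtomicInteger()).incrementAndGet();
    }

    /** Returns the resources that were loaded more than once, sorted by URI. */
    Map<String, Integer> reloaded() {
        Map<String, Integer> result = new TreeMap<>();
        loads.forEach((uri, count) -> {
            if (count.get() > 1) {
                result.put(uri, count.get());
            }
        });
        return result;
    }
}
```

Logging reloaded() at the end of a build shows which files are being reloaded in later phases.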
Memory pressure
[Diagram labels: Global index, Resource Set]
Memory pressure?
● Size of EMF models
○ All the resources loaded during the build
● Size of global index
○ Always loaded
○ Depends on number of open projects
Memory pressure: EMF models
Reduce EMF size
○ Watch out for the inferred model
http://www.sigasi.com/content/view-complexity-your-xtext-ecore-model
○ Avoid
■ EMF classes with just one list of things
● ListOfThings : '(' things+=Thing (',' things+=Thing)* ')';
● class ListOfThings { contains Thing[] things }
■ Often unused fields
○ Code duplication in grammar vs efficient model
○ Fine-grained control with Xcore model
Memory pressure: Global index
In YourResourceDescriptionStrategy
○ Export foo, foo.rec, foo.rec.field1, foo.rec.field2
○ Add user-data: someType & anotherType
○ To reduce memory usage: don't export child elements
■ export foo.rec + hash of fields
■ export foo + hash of contents of foo
■ drawback: these elements can no longer be linked without loading the resource
package foo is
  type rec is record
    field1 : someType;
    field2 : anotherType(X downto Y);
  end record;
end package;
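The "export the container plus a hash of its contents" idea can be sketched in plain Java. CompactExport and Entry are hypothetical names rather than Xtext types; in a real strategy the fingerprint would presumably be stored as user data on the exported description.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.List;

/** Hypothetical sketch: export one entry per container, not one per child. */
class CompactExport {
    /** One index entry: a qualified name plus a fingerprint of the children. */
    static final class Entry {
        final String qualifiedName;
        final String fingerprint;

        Entry(String qualifiedName, String fingerprint) {
            this.qualifiedName = qualifiedName;
            this.fingerprint = fingerprint;
        }
    }

    /** Exports only the container; the children are folded into the hash. */
    static Entry export(String qualifiedName, List<String> childSignatures) {
        return new Entry(qualifiedName, sha256(String.join("\n", childSignatures)));
    }

    /** Hex-encoded SHA-256 of the given text. */
    static String sha256(String text) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            byte[] hash = digest.digest(text.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : hash) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 should be available on every JVM", e);
        }
    }
}
```

A changed field changes the fingerprint, so dependent resources can still be detected as affected during incremental builds, while the index holds one entry for foo.rec instead of one per field.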
Optimize loading
● What is resource load?
○ Parse
○ build EMF model & install EMF proxies
○ build Node model
● Parallelise
○ parse multiple files simultaneously
○ ~3 times faster loads on a 4-core machine
○ only loading, not linking
● Cache
○ serialize EMF and Node model in a cache
○ originally 3-4 times faster loads
○ now 1.5x (no backtracking, simplified grammar)
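Parallel loading can be sketched with a fixed-size thread pool. This is an illustration in plain Java under stated assumptions: ParallelLoader is a hypothetical name, and the parser argument stands in for the real "parse, build EMF model, build Node model" step, which must be safe to run concurrently.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

/** Hypothetical sketch: load many resources in parallel, link afterwards. */
class ParallelLoader {
    /**
     * Runs the parser for every URI on a thread pool and returns the models
     * in input order. Only loading is parallel; linking is not done here.
     */
    static <M> Map<String, M> loadAll(List<String> uris, Function<String, M> parser, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<M>> pending = new ArrayList<>();
            for (String uri : uris) {
                Callable<M> task = () -> parser.apply(uri);
                pending.add(pool.submit(task));
            }
            Map<String, M> models = new LinkedHashMap<>();
            for (int i = 0; i < uris.size(); i++) {
                models.put(uris.get(i), pending.get(i).get());
            }
            return models;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            throw new IllegalStateException("Loading was interrupted", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("A resource failed to load", e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

Keeping linking out of the parallel part matches the "only loading, not linking" restriction above: the loaded models are handed to the single-threaded phases once every parse has finished.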
Linking
[Diagram labels: Global indexing, Linking validation, Custom Validation, Builder Participants]
● Language specific
○ VHDL vs Verilog
● Avoiding linking
○ library files, only linked when used in user-code
● Many iterations
○ lazy linking vs eager linking
○ From 40% of build time to 20%
Custom Validation
[Diagram labels: Global indexing, Linking validation, Custom Validation, Builder Participants]
● Combine validations to avoid repeated model traversals
● Keep validations local; global validations moved into a builder participant
● Avoid validation
○ disabled validations
○ libraries: errors & warnings are suppressed anyway
● Monitor
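Combining validations into a single traversal can be sketched as follows. The Node class is a hypothetical stand-in for a model element, not an EMF EObject, and SinglePassValidator is not part of Xtext.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiConsumer;

/** Hypothetical sketch: run all checks during one walk over the model. */
class SinglePassValidator {
    /** Minimal stand-in for a model element. */
    static final class Node {
        final String name;
        final List<Node> children = new ArrayList<>();

        Node(String name) {
            this.name = name;
        }

        Node add(Node child) {
            children.add(child);
            return this;
        }
    }

    private final List<BiConsumer<Node, List<String>>> checks = new ArrayList<>();
    int visitedNodes = 0; // exposed only to show that each node is visited once

    void register(BiConsumer<Node, List<String>> check) {
        checks.add(check);
    }

    /** Walks the tree once and runs every registered check on each node. */
    List<String> validate(Node root) {
        List<String> issues = new ArrayList<>();
        visit(root, issues);
        return issues;
    }

    private void visit(Node node, List<String> issues) {
        visitedNodes++;
        for (BiConsumer<Node, List<String>> check : checks) {
            check.accept(node, issues);
        }
        for (Node child : node.children) {
            visit(child, issues);
        }
    }
}
```

With k checks the model is walked once instead of k times; each check still sees every node.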
Track performance
● Nightly build
● log build times
VHDL linking in depth
History of VHDL linking
● 1st version
○ AbstractDeclarativeScopeProvider
○ best effort scoping
● 2nd version
○ removed reflection
● 3rd version
○ special rule-based internal java dsl
○ first attempt to be 100% correct
● 4th version
○ batch/eager linking
○ type errors
VHDL linking in depth
foo(baz) <= bar(bak.f("?"), ('1', 2)) + 1;
● Most elements in an expression are overloaded
○ subprograms, literals, enumliterals
● foo(baz)
○ 9 kinds of meanings
○ 4 of them can have subprogram overloading
● overloading includes return type
● overload resolution is very hard
○ you have to find 1 unambiguous solution
○ resolve all cross-references together
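Why the return type matters can be shown with a heavily simplified resolver. This sketch assumes exact type matches and a known expected type. Real VHDL resolution also has to handle overloaded arguments and literals, which is why all references in an expression must be resolved together. OverloadResolver and Signature are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

/** Hypothetical sketch: overload resolution that includes the return type. */
class OverloadResolver {
    /** A subprogram signature: name, parameter types, return type. */
    static final class Signature {
        final String name;
        final List<String> parameterTypes;
        final String returnType;

        Signature(String name, List<String> parameterTypes, String returnType) {
            this.name = name;
            this.parameterTypes = parameterTypes;
            this.returnType = returnType;
        }
    }

    /**
     * Returns the single candidate that matches the name, the argument types
     * and the type expected by the context. Returns empty when no candidate
     * or more than one candidate matches (unresolved or ambiguous).
     */
    static Optional<Signature> resolve(List<Signature> candidates, String name,
                                       List<String> argumentTypes, String expectedType) {
        List<Signature> matches = new ArrayList<>();
        for (Signature candidate : candidates) {
            if (candidate.name.equals(name)
                    && candidate.parameterTypes.equals(argumentTypes)
                    && candidate.returnType.equals(expectedType)) {
                matches.add(candidate);
            }
        }
        return matches.size() == 1 ? Optional.of(matches.get(0)) : Optional.empty();
    }
}
```

Given two functions f(integer) that differ only in their return type, the arguments alone cannot pick one; only the type expected by the surrounding expression leaves exactly one solution.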
VHDL linking in depth
Xtext lazy linking
● good
○ declarative: only a few rules
○ fine-grained: can be good for performance
○ re-use: scoping, auto-complete, serialisation
● bad
○ hard to debug
■ can call itself
■ lots of caching
■ indirection, huge stack traces
○ performance
■ build context for every cross-reference
VHDL linking in depth
Batch/Eager linking
● good
○ simple top-down algorithm
○ natural fit for vhdl
○ well described in literature
● bad
○ resolve 1 reference = resolve all references
○ a lot of extra xtext customisation
■ auto-complete & serialization?
■ linking errors?
VHDL linking in depth
Our hybrid approach
● Eager/batch linking of design units
○ Big files are partially scoped
○ Parent-scope of a design unit is the global scope
○ Local scoping is executed eagerly
● Global scope
○ Import declarations of other design units
○ The only query is: find design unit x.y
■ load resource of x.y
■ create an ‘ExternalScope’ & cache it
● Always load dependent resources
○ needed for validation, hovers, highlighting anyway
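The global scope described above, with its single query and cached external scopes, can be sketched as a small cache. GlobalScope is a hypothetical name; the loader function stands in for "load resource of x.y and create an ExternalScope", and the sketch assumes single-threaded use.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical sketch: global scope that loads each design unit at most once. */
class GlobalScope<S> {
    private final Map<String, S> scopesByUnit = new HashMap<>();
    private final Function<String, S> loadAndBuildScope;

    GlobalScope(Function<String, S> loadAndBuildScope) {
        this.loadAndBuildScope = loadAndBuildScope;
    }

    /** The only query: find design unit "library.unit". */
    S findDesignUnit(String library, String unit) {
        String key = library + "." + unit;
        S scope = scopesByUnit.get(key);
        if (scope == null) {
            // Explicit get/put instead of computeIfAbsent, so the loader may
            // look up other design units while this one is being loaded.
            // Assumes the loader never asks for the unit it is loading.
            scope = loadAndBuildScope.apply(key);
            scopesByUnit.put(key, scope);
        }
        return scope;
    }
}
```

Repeated references to the same design unit then cost one map lookup instead of another resource load.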
VHDL linking in depth
Conclusion
● Easier to implement, debug and optimize
● Type error reporting during linking
● Memory intensive?
○ Every dependency is loaded
○ OK in practice for VHDL
● A lot of xtext customisation!
○ A lot of classes are affected
○ Forward compatible?
UI responsiveness
● Measuring: detect a blocked UI thread
○ initially Svelto https://github.com/dragos/svelto
○ now our own method & logging
○ Eclipse Mars
● Improvements
○ UI is for drawing only!
○ Make sure everything is cancellable
● Safeguards
○ certain services should never be executed on the UI
thread => check & log
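The "check & log" safeguard can be sketched in plain Java. UiThreadGuard is a hypothetical name; in an SWT application the test for "am I on the UI thread" would presumably rely on Display.getCurrent() rather than a stored Thread reference.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical sketch: record when an expensive service runs on the UI thread. */
class UiThreadGuard {
    private final Thread uiThread;
    private final List<String> violations = Collections.synchronizedList(new ArrayList<>());

    UiThreadGuard(Thread uiThread) {
        this.uiThread = uiThread;
    }

    /**
     * Returns true when the caller is off the UI thread. Otherwise the
     * violation is logged (not thrown), so the user is not interrupted.
     */
    boolean checkNotOnUiThread(String serviceName) {
        if (Thread.currentThread() == uiThread) {
            violations.add(serviceName + " was called on the UI thread");
            return false;
        }
        return true;
    }

    /** Snapshot of the logged violations. */
    List<String> violations() {
        return new ArrayList<>(violations);
    }
}
```

Services such as linking or validation would call checkNotOnUiThread at their entry point; the logged violations can then be reviewed from user logs.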
Lightweight Editor (fallback)
● Syntax-highlighting + markers
● For files > 1 MB
● Based on ContentTypes extension point
Two ContentTypes (based on file size)
<extension point="org.eclipse.core.contenttype.contentTypes">
  <content-type ...
      describer="com.sigasi...FullVhdlContentDescriber"
      name="VHDL editor">
    <describer class="...FullVhdlContentDescriber" />
  </content-type>
  <content-type ...
      describer="com.sigasi....LightweightVhdlContentDescriber"
      name="Lightweight VHDL editor">
    <describer class="...LightweightVhdlContentDescriber" />
  </content-type>
</extension>
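The size test that such a pair of describers could share can be sketched in plain Java. SizeBasedEditorChoice is a hypothetical name and does not implement the Eclipse describer interface; the 1 MB threshold is taken from the slide above.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

/** Hypothetical sketch: decide by size which editor a file should get. */
class SizeBasedEditorChoice {
    static final int LIGHTWEIGHT_THRESHOLD_BYTES = 1024 * 1024; // files > 1 MB

    /** Stops reading as soon as the threshold is exceeded. */
    static boolean needsLightweightEditor(InputStream contents) {
        byte[] buffer = new byte[8192];
        long total = 0;
        try {
            int read;
            while ((read = contents.read(buffer)) != -1) {
                total += read;
                if (total > LIGHTWEIGHT_THRESHOLD_BYTES) {
                    return true;
                }
            }
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The full describer would accept a file when this returns false and the lightweight describer when it returns true, so each file matches exactly one of the two content types.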
Future work
● Continuous process
● Cache global index info per resource?
● Linking without node model?
● StoredResources
Q/A
Come talk to us about...
● Documentation generation
● Fancy linking algorithms / type systems
● Graphical views
● Cross-language support
● Testing Xtext-plugins
● Lexical macros
● Manage large amount of validations
● ...

Scaling xtext - XtextCon 2015

  • 2. Sigasi ● IDE for Hardware Description Languages ○ VHDL, (System)Verilog ● Using Xtext for 4 years ● Large user base ○ (commercial, free, students)
  • 3.
  • 4. Our company goal ● Assist hardware designer ● High quality interactive front-end compiler ○ Instant feedback ■ parsing, semantic, linting, style checking ○ IDE Services ■ visualisations ■ design exploration ■ documentation generation ○ Integrate with ecosystem ■ other compilers, simulators, synthesizers
  • 6. The challenge ● Pre-specified languages ● Large projects ○ > 250 KLOC is not uncommon ○ design + external libraries ○ big files ■ some libraries are distributed as 1 file ■ generated hardware cores ●
  • 7. Adopting Xtext ● Started with the early Xtext 2.0 snapshots ● Initial performance analysis ○ Clean build performance of a big project (330k LOC) ■ > 20 minutes ■ > 2 GB ○ Editing big files (> 1 MB) ■ unusable
  • 8. Adopting Xtext ● Started with the early Xtext 2.0 snapshots ● Initial performance analysis ○ Clean build performance of a big project (330k LOC) ■ > 20 minutes → < 1 min ■ > 2 GB → ~ 1 GB memory ○ Editing big files (> 1 MB) ■ unusable → usable with reduced editor
  • 9. ● Xtext framework improvements ● Measure → analyze → improve or cheat ○ faster build ○ reduce memory usage ○ UI responsiveness Improving performance
  • 10. Overview ● Analysing build performance ○ Analyze the build ■ Macro build measurements ■ Key performance points ● Reduce workload ● Parallelize the build ○ Tracking performance ○ VHDL linking in depth ● Analyzing UI issues ○ Monitoring the UI thread ○ Saveguards
  • 11. Analyzing builds: builder overview Global indexing Linking Validation Custom Validation Global index Eclipse resources warnings errors resource descriptions Builder Participants resource changes ?
  • 12. Analyzing builds: builder overview Global indexing Global index resource descriptions resource changes ● Usage ○ Location of exported declarations ○ Incremental compilation ● Implementation ○ IResourceDescriptionsStrategy ○ Default: all declarations ○ Customize! ○ Runs before linking! ● IResourceDescriptions & IEObjectDescriptions ○ Always in memory ○ Persisted tot disk @ shutdown
  • 13. Analyzing builds: builder overview Linking Validation ● Usage ○ Determine IScope all cross references ○ Link cross reference or create linking error ● Implementation ○ ILinkingService, IScopeProvider, LazyLinkingResource ○ Requires global index for global scope ○ Direct link or link to global scope ○ Linking may trigger linking and resource loading linking errors Eclipse resources
  • 14. Analyzing builds: builder overview ● Usage ○ Execute all custom validations ○ Creates errors / warnings ● Implementation ○ AbstractDeclarativeValidator ○ Execute validations using reflection ○ Works against linked model ○ May trigger linking & resource loading Linking Validation errors warnings Eclipse resources
  • 15. Analyzing builds: builder overview Global indexing Linking Validation Custom Validation Global index Eclipse resources warnings errors resource descriptions Builder Participants resource changes ? ● iterations ? ● order ?
  • 16. Analyzing builds: metrics ● For each build ○ # of files being build ○ timing: Global index, Linking, Validation, Individual builder participants ● Instrument by overriding ClusteringBuilderState & XtextBuilder ● Example: Building 134 resources, timing: { global index=1806, linking=378, validation=823, totalLinkingAndValidation=1364 }
  • 17. Analyzing builds: resource loads ● Observation: ○ Most time spent in resource loads ○ Certain files are loaded multiple times?! ● Solutions ○ Reduce memory pressure ○ Make loading faster Global indexing Linking validation Custom Validation Builder Participants resources LOAD POTENTIAL RELOADS POTENTIAL RELOADS
  • 18. Memory pressure Global index Resource Set Memory pressure? ● Size of EMF models ○ All the resources loaded during the build ● Size of global index ○ Always loaded ○ Depends on number of open projects
  • 19. Memory pressure: EMF models Reduce EMF size ○ Watch out for inferred model http://www.sigasi.com/content/view-complexity-your-xtext-ecore-model ○ Avoid ■ Emf classes with just one list of things ● ListOfThings : ‘(‘ things+=Thing (‘,’ things+=Thing)* ‘)’ ● class ListOfThings { contains Thing[] things } ■ Often unused fields ○ Code duplication in grammar vs efficient model ○ Fine-grained control with Xcore model
  • 20. Memory pressure: Global index In YourResourceDescriptionStrategy ○ Export foo, foo.rec, foo.rec.field1, foo.rec.field2 ○ Add user-data: someType & anotherType ○ To reduce memory usage: don’t export child elements ■ export foo.rec + hash of fields ■ export foo + hash of contents of foo ■ can’t link these elements without loading anymore package foo is record rec is field1 : someType; field2 : anotherType(X downto Y); end; end;
  • 21. Optimize loading ● What is resource load? ○ Parse ○ build EMF model & install EMF proxies ○ build Node model
  • 22. Optimize loading ● What is resource load? ○ Parse ○ build EMF model & install EMF proxies ○ build Node model ● Parallelise ○ parse multiple files simultaneously ○ ~3 time faster loads on 4 core machine ○ only loading, not linking
  • 23. Optimize loading ● What is resource load? ○ Parse ○ build EMF model & install EMF proxies ○ build Node model ● Parallelise ○ parse multiple files simultaneously ○ ~3 time faster loads on 4 core machine ○ only loading, not linking ● Cache ○ serialize EMF and Node model in a cache ○ originally 3-4 time faster loads ○ now 1.5x (no backtracking, simplified grammar)
  • 24. Linking [Diagram: global indexing → linking → validation → custom validation → builder participants] ● Language specific ○ VHDL vs Verilog ● Avoiding linking ○ library files, only linked when used in user-code ● Many iterations ○ lazy linking vs eager linking ○ From 40% of build time to 20%
  • 25. Custom Validation [Diagram: global indexing → linking → validation → custom validation → builder participants] ● Combine validations to avoid model traversals ● Local analysis only; global validations moved into a builder participant ● Avoid validation ○ disabled validations ○ libraries: errors & warnings are suppressed anyway ● Monitor
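Combining validations so that the model is traversed only once can be sketched as follows. The `Node` class is a hypothetical stand-in for an EMF model element, and the checks are plain callbacks. Xtext validators are organised differently, so this only illustrates the traversal pattern.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiConsumer;

/** Hypothetical sketch: run every check during a single tree walk. */
final class CombinedValidation {

    /** Minimal tree node standing in for a model element. */
    static final class Node {
        final String name;
        final List<Node> children = new ArrayList<>();

        Node(String name) {
            this.name = name;
        }

        Node add(Node child) {
            children.add(child);
            return this;
        }
    }

    /** Visits each node exactly once and runs all checks on it. */
    static List<String> validate(Node root, List<BiConsumer<Node, List<String>>> checks) {
        List<String> issues = new ArrayList<>();
        visit(root, checks, issues);
        return issues;
    }

    private static void visit(Node node, List<BiConsumer<Node, List<String>>> checks,
                              List<String> issues) {
        for (BiConsumer<Node, List<String>> check : checks) {
            check.accept(node, issues);
        }
        for (Node child : node.children) {
            visit(child, checks, issues);
        }
    }
}
```

With N checks the model is still walked once rather than N times, which matters when the walk itself dominates the cost of the individual checks.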
  • 26. Track performance ● Nightly build ● log build times
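Logging build times per phase needs very little machinery. The sketch below is an assumption about how such logging could look, not the tooling used by the authors. `BuildTimer` is a hypothetical name, and the clock is injectable so the behaviour can be checked without real delays.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.LongSupplier;

/** Hypothetical per-phase build timer for a nightly performance log. */
final class BuildTimer {

    private final LongSupplier nanoClock;
    private final Map<String, Long> nanosPerPhase = new LinkedHashMap<>();

    BuildTimer(LongSupplier nanoClock) {
        this.nanoClock = nanoClock;
    }

    BuildTimer() {
        this(System::nanoTime);
    }

    /** Runs the work and adds its duration to the total of the given phase. */
    void time(String phase, Runnable work) {
        long start = nanoClock.getAsLong();
        try {
            work.run();
        } finally {
            nanosPerPhase.merge(phase, nanoClock.getAsLong() - start, Long::sum);
        }
    }

    /** One log line, for example "indexing=2ms linking=6ms". */
    String report() {
        StringBuilder line = new StringBuilder();
        for (Map.Entry<String, Long> entry : nanosPerPhase.entrySet()) {
            if (line.length() > 0) {
                line.append(' ');
            }
            line.append(entry.getKey()).append('=')
                .append(entry.getValue() / 1_000_000).append("ms");
        }
        return line.toString();
    }
}
```

Appending one such line per nightly build to a file is enough to make regressions between builds visible.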
  • 27. VHDL linking in depth History of VHDL linking ● 1st version ○ AbstractDeclarativeScopeProvider ○ best effort scoping ● 2nd version ○ removed reflection ● 3rd version ○ special rule-based internal Java DSL ○ first attempt to be 100% correct ● 4th version ○ batch/eager linking ○ type errors
  • 28. VHDL linking in depth foo(baz) <= bar(bak.f("?"), ('1', 2)) + 1; ● Most elements in an expression are overloaded ○ subprograms, literals, enum literals ● foo(baz) ○ 9 kinds of meanings ○ 4 of them can have subprogram overloading ● overloading includes return type ● overload resolution is very hard ○ you have to find 1 unambiguous solution ○ resolve all cross-references together
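Why the return type matters can be shown with a deliberately tiny model: one call with one argument, where both the subprogram and the argument literal have several possible meanings. This is a hedged sketch and not the VHDL algorithm. `OverloadResolver` and `Signature` are hypothetical, and real resolution has to handle whole expression trees.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: resolution succeeds only if exactly one reading type-checks. */
final class OverloadResolver {

    /** One visible declaration of an overloaded subprogram. */
    static final class Signature {
        final String name;
        final String paramType;
        final String returnType;

        Signature(String name, String paramType, String returnType) {
            this.name = name;
            this.paramType = paramType;
            this.returnType = returnType;
        }
    }

    /** All (declaration, argument type) pairs consistent with the expected type. */
    static List<String> solutions(List<Signature> candidates, List<String> argTypes,
                                  String expectedType) {
        List<String> result = new ArrayList<>();
        for (Signature s : candidates) {
            for (String argType : argTypes) {
                if (s.paramType.equals(argType) && s.returnType.equals(expectedType)) {
                    result.add(s.name + "(" + argType + ") return " + s.returnType);
                }
            }
        }
        return result;
    }

    /** Returns the single solution, or fails when there is none or more than one. */
    static String resolve(List<Signature> candidates, List<String> argTypes,
                          String expectedType) {
        List<String> found = solutions(candidates, argTypes, expectedType);
        if (found.size() != 1) {
            throw new IllegalStateException(
                found.isEmpty() ? "no matching declaration" : "ambiguous: " + found);
        }
        return found.get(0);
    }
}
```

The argument and the callee are resolved together here: neither can be linked in isolation, which is the reason the slide says all cross-references of an expression must be resolved at once.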
  • 29. VHDL linking in depth Xtext lazy linking ● good ○ declarative: only a few rules ○ fine-grained: can be good for performance ○ re-use: scoping, auto-complete, serialisation ● bad ○ hard to debug ■ can call itself ■ lots of caching ■ indirection, huge stack traces ○ performance ■ build context for every cross-reference
  • 30. VHDL linking in depth Batch/Eager linking ● good ○ simple top-down algorithm ○ natural fit for VHDL ○ well described in literature ● bad ○ resolve 1 reference = resolve all references ○ a lot of extra Xtext customisation ■ auto-complete & serialization? ■ linking errors?
  • 31. VHDL linking in depth Our hybrid approach ● Eager/batch linking of design units ○ Big files are partially scoped ○ Parent-scope of a design unit is the global scope ○ Local scoping is executed eagerly ● Global scope ○ Import declarations of other design units ○ Only query is find design unit x.y ■ load resource of x.y ■ create an ‘ExternalScope’ & cache it ● Always load dependent resources ○ needed for validation, hovers, highlighting anyway
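The global-scope query ("find design unit x.y, load its resource, create an ExternalScope and cache it") maps naturally onto a memoising lookup. The sketch below assumes a loader function that both loads the resource and builds the scope. `GlobalScopeCache` and `findDesignUnit` are hypothetical names, and the scope type is left generic.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical cache of external scopes, keyed by "library.unit". */
final class GlobalScopeCache<S> {

    private final Map<String, S> scopes = new ConcurrentHashMap<>();
    private final Function<String, S> loadAndBuildScope;

    GlobalScopeCache(Function<String, S> loadAndBuildScope) {
        this.loadAndBuildScope = loadAndBuildScope;
    }

    /** The only query on the global scope: find design unit library.unit. */
    S findDesignUnit(String library, String unit) {
        // The loader runs at most once per design unit; later lookups hit the cache.
        return scopes.computeIfAbsent(library + "." + unit, loadAndBuildScope);
    }
}
```

A real implementation would also have to drop cached scopes when the underlying resource changes. That invalidation is not shown on the slide and is left out here.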
  • 32. VHDL linking in depth Conclusion ● Easier to implement, debug and optimize ● Type error reporting during linking ● Memory intensive? ○ Every dependency is loaded ○ OK in practice for VHDL ● A lot of Xtext customisation! ○ A lot of classes are affected ○ Forward compatible?
  • 33. UI responsiveness ● Measuring: detect a blocked UI thread ○ initially Svelto https://github.com/dragos/svelto ○ now our own method & logging ○ Eclipse Mars ● Improvements ○ UI is for drawing only! ○ Make sure everything is cancellable ● Safeguards ○ certain services should never be executed on the UI thread => check & log
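The "check & log" safeguard can be sketched without any UI toolkit. Here the UI thread is passed in explicitly, which is an assumption made to keep the example self-contained. In an SWT application the check would more likely be based on the display (for example whether `Display.getCurrent()` returns a non-null value). `UiThreadGuard` is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

/** Hypothetical safeguard: logs when an expensive service runs on the UI thread. */
final class UiThreadGuard {

    private final Thread uiThread;
    private final List<String> log = new ArrayList<>();

    UiThreadGuard(Thread uiThread) {
        this.uiThread = uiThread;
    }

    /** Runs the service anyway, but records a violation if called on the UI thread. */
    <T> T guarded(String serviceName, Supplier<T> service) {
        if (Thread.currentThread() == uiThread) {
            synchronized (log) {
                log.add(serviceName + " called on UI thread");
            }
        }
        return service.get();
    }

    /** Snapshot of all violations recorded so far. */
    List<String> violations() {
        synchronized (log) {
            return new ArrayList<>(log);
        }
    }
}
```

Logging instead of throwing keeps the product usable for the end user while still surfacing the offending call path to the developers.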
  • 34. Lightweight Editor (fallback) ● Syntax-highlighting + markers ● For files > 1 MB ● Based on ContentTypes extension point
  • 35. Two ContentTypes (based on file size) <extension point="org.eclipse.core.contenttype.contentTypes"> <content-type ... describer="com.sigasi...FullVhdlContentDescriber" name="VHDL editor"> <describer class="...FullVhdlContentDescriber" /> </content-type> <content-type ... describer="com.sigasi....LightweightVhdlContentDescriber" name="Lightweight VHDL editor"> <describer class="...LightweightVhdlContentDescriber" /> </content-type> </extension>
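The size test behind the two describers might look like the following sketch. It is written against plain `java.io` so that it runs standalone. `EditorChooser` and its methods are hypothetical, and the 1 MB threshold is taken from the "Lightweight Editor" slide. An actual Eclipse `IContentDescriber` would return its validity constants from `describe(...)` rather than a string.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

/** Hypothetical size check used to pick the full or the lightweight editor. */
final class EditorChooser {

    /** Files larger than this get the lightweight editor (1 MB, as on the slide). */
    static final int LIGHTWEIGHT_THRESHOLD_BYTES = 1024 * 1024;

    /** Returns true once more than limit bytes were read; stops reading at that point. */
    static boolean exceeds(InputStream contents, long limit) {
        byte[] buffer = new byte[8192];
        long total = 0;
        try {
            int read;
            while ((read = contents.read(buffer)) != -1) {
                total += read;
                if (total > limit) {
                    return true;
                }
            }
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns "lightweight" for big files and "full" otherwise. */
    static String chooseEditor(InputStream contents) {
        return exceeds(contents, LIGHTWEIGHT_THRESHOLD_BYTES) ? "lightweight" : "full";
    }
}
```

Counting bytes from the stream avoids depending on file metadata, at the cost of reading up to the threshold for every file that is described.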
  • 36. Future work ● Continuous process ● Cache global index info per resource? ● Linking without node model? ● StoredResources
  • 37. Q/A
  • 38. Come talk to us about... ● Documentation generation ● Fancy linking algorithms / type systems ● Graphical views ● Cross-language support ● Testing Xtext-plugins ● Lexical macros ● Managing large numbers of validations ● ...

Editor's Notes

  1. Overall structure of the presentation: quickly introduce Sigasi and its relationship with Xtext. Explain the requirements of our clients, their projects, and their expectations of our product. Break down the scalability problems that we had to face. Explain the problems faced with build performance and responsive UIs. Build performance: start with grammar and model (mostly memory considerations); explain a build in the ‘builder’ and in the editor; focus on resource loads. Build performance breakdown: global index, linking, validation, builder participants (introduce metrics?). Evaluate the global index phase. Evaluate the linking phase.
  2. Introduce us: 3500 active users. We’re 2 engineers from Sigasi. We’re a company that builds a plugin for VHDL and, more recently, (System)Verilog. These hardware design languages are standardized by IEEE and are the industry standard for designing digital hardware. We’ve mainly focused on VHDL, but we’re starting to improve our Verilog support. We started using Xtext 4 years ago (just this; we revisit this decision later). We have a lot of users worldwide.
  3. <add screenshot with code> to discuss user definable types hover, semantic highlighting complex expressions hierarchy view
  4. Assist hardware designer: HW design is very complex. Assist the HW designer as well as possible so that she can focus on the important aspects. The core of our product is the compiler, written with Xtext. We’re only a front-end compiler => we don’t do code generation, design simulation or synthesis. We use it to… [see points]
  5. Our product gets used on existing projects by our clients. VHDL is old: files have a compilation order (we disregard this); the spec is huge (800 pages); it can be used as a general-purpose language; there have been many additions over the years, and many constructs are never used. It is an extra verbose language => loads of opportunities for improving the experience with an IDE. Projects are messy: source and binary artefacts are sometimes mixed, source folders contain old versions, and many files have compilation errors. Everything is distributed as source, even huge libraries that we have to compile and can’t distinguish from user code. Huge files: see slide. Users are used to emacs & notepad++: they aren’t used to making clean projects that IDEs can consume; they don’t accept slow editor actions, or an editor that suddenly blocks until a compilation has finished; they are used to simply being able to open a file, free of context.
  6. In 2011 Sigasi had a difficult choice to make: improve our old plugin (basically a proof of concept) or adopt Xtext. In a sort of bet-the-company move we chose Xtext. Why? See an older presentation by Hendrik. After some quick experimentation we chose to use Xtext. Although we knew that there were major performance problems, we were confident that we could overcome them. Some of this was with the excellent support of Itemis.
  8. How did we improve? Do nothing => wait for Xtext framework improvements: we made some contributions and we’re happy with the cooperation; every release brings some improvements => don’t forget to read the release notes. Typical improvement cycle: what to measure & how to measure? Learn the Xtext performance characteristics; then try to improve, find a way to circumvent the issue, or cheat.
  9. This is what happens when you trigger a clean/incremental build. We start with a set of resources and index them in the global index phase. Why? It’s impossible to keep the entire compilation graph in memory; the global index helps us find where certain elements are located without having to load those files. All files are processed, and each file gets an entry in the global index. The index stays in memory (it is persisted to disk at startup/shutdown). During linking, all cross-references are resolved; when one cannot be resolved, a linking error is placed on the Eclipse resource. Validation follows directly after linking. Builder participants (done): they do whatever they want. Understand where time is spent: this is a gross simplification of the build pipeline, GI -> L&V -> Part. GI: ResourceDescriptionStrategy. Absolutely no linking here: walk the parsed, unlinked AST; calling ‘get*’ on a cross-reference will cause linking!! The index is used to find global names and to determine whether the public interface exposed by a file has changed. Linking: IScopeProvider links cross-references; if a reference can’t be linked, a link error is created. Validation: Abstract*Validator creates errors & warnings. Builder participant: can do virtually anything; there are ParallelBuilderParticipants. The memory consumption profile is basically your ResourceSet + the global index. Fat arrow: passing a resource set. Normally everything is loaded once in the GI phase; loading means parsing, building the node model and building the EMF AST, so it’s expensive!
If you run out of memory, the resource set gets cleared and a lot of files will need to be reloaded, so running out of memory is bad. In principle, once the GI is made, you can link & validate 1 resource with an empty resource set. Small arrow: passing a resource; linking passes the linked resource to the validator. Performance profile: GI is the first AST walk: minimize what you do in the global indexing phase (you normally don’t have to walk the entire model). Linking is the second AST walk: its cost depends on your scoping complexity. Validation is the third AST walk: your validation budget is pretty big (linking & GI are expensive) as long as the analysis is mainly local (& dependent resources). Avoid throwing uncaught exceptions (mostly NPEs): they are expensive & always caught behind the scenes.
  14. Now we’re going to look at EMF resource loading. If your memory is big enough: all resources are loaded in the global indexing phase (no linking should be done in this phase); the resource set (RS) is passed to linking & validation; in the linking phase everything gets linked; validation works with an already linked model; the RS is passed to the builder participants. If it isn’t: when there is too much memory pressure, the clustering builder will clear the resource set, which may cause reloads during linking and may cause relinking of reloaded dependencies in validation & builder participants.
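The effect of memory pressure on the resource set can be made concrete with a small simulation. This is a sketch under assumptions and does not reproduce Xtext's clustering builder: `ClusteringLoader` is a hypothetical class, the "resource set" is a list of URIs, and the free-memory probe is injectable (the `jvmFreeBytes` helper shows one possible probe based on `Runtime`).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.LongSupplier;

/** Hypothetical simulation of a resource set that is cleared under memory pressure. */
final class ClusteringLoader {

    private final List<String> loaded = new ArrayList<>();
    private final LongSupplier freeBytes;
    private final long minFreeBytes;
    private int clears;

    ClusteringLoader(LongSupplier freeBytes, long minFreeBytes) {
        this.freeBytes = freeBytes;
        this.minFreeBytes = minFreeBytes;
    }

    /** One possible probe: memory the JVM could still hand out. */
    static long jvmFreeBytes() {
        Runtime runtime = Runtime.getRuntime();
        return runtime.maxMemory() - (runtime.totalMemory() - runtime.freeMemory());
    }

    /** Loads a resource; clears everything loaded so far if memory is too low. */
    void load(String uri) {
        if (freeBytes.getAsLong() < minFreeBytes) {
            loaded.clear(); // later phases will have to reload these resources
            clears++;
        }
        loaded.add(uri);
    }

    List<String> loaded() {
        return new ArrayList<>(loaded);
    }

    int clears() {
        return clears;
    }
}
```

Every clear trades memory for time: anything dropped here has to be parsed, and possibly linked, again in a later phase, which is why reducing model and index size pays off twice.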
  15. you can hash subelements
  16. Note: even with incomplete linking, you get OK Xtext support for rename, refactor, … -> improve incrementally.
  17. Nightly -> see progress, regressions. Log also during usage -> detect unexpected use cases.
  18. cancellable => some Xtext classes like outline have been retrofitted for this
  19. IContentDescriber