4. CASE: EVIRA
› Evira is the Finnish Food Safety Authority.
• Lots of official documentation
• Lots of content editors
• Contains mostly text, documents, forms and table data
• Few images and little rich content
› The same project also includes Evira's intranet, so the same
architecture had to work for the intranet as well.
7.
› Search with URL parameters
› Facet groups
› Ordering
› Search word highlights
› Customizable search results
› "Did you mean this" feature
› Filters and facets
› Easily customizable facets
› Fallback wildcard search
› File search for most common document types
9. KEY COMPONENTS: E&A&E
› Elasticsearch (Search and performance)
• Global search and an efficient way to query large data sets, with
full-text support
› Azure (Cloud platform and scale)
• Azure contains all the environments, files, data, backups, monitoring,
maintenance jobs, etc.
› Episerver CMS (Content editing, UI and master data store)
• Platform for content editing with many languages, versioning, document
bank, metadata, etc.
• Master data and primary data source for Elasticsearch
10. CUSTOMIZABLE PLATFORM
› Elasticsearch
• From the smallest to very large projects
• Runs locally on your laptop, as a cloud service, or in a private data center
› Azure Cloud
• From the smallest to very large projects
• IaaS and PaaS options
› Episerver CMS
• From medium size to very large projects
• Easily customizable front-end and pluggable/extendable back-end
11. SETUP OPTIONS
› Developer's laptop
› On premise / Private cloud
› Azure IaaS on virtual machines (one or many)
› PaaS on Azure App Services and Elastic Cloud
[Diagram components: IIS Web Server, Web Server, Web Site, SQL Server, SQL Database, Blob Storage, Search VM, Elasticsearch]
12. ELASTIC IS NOT JUST FOR SEARCH
› It’s a performance tool. It makes querying large data sets much more
efficient than tools like SQL Server or many other search tools
› We use Elastic for:
› Global search
› Internal searches and listings:
product news, announcements, comments, files, RSS, sitemap
› Handling large data sets, for example in some migrations
› Analytics and statistics
• Site visitor analytics
• Search usage analytics
› 404 statistics
› Error logging and log analysis
› Monitoring servers
› In short: full-text search, listings and performance, plus analytics and statistics
16. NOT A CRAWLER
› We have integrated Elasticsearch with Episerver's content events
› Near real-time (1 or 2 seconds of latency)
• Long latencies often cause multiple other problems
› We can send more data than what is visible (for example, access rights)
› Real-time is really hard to achieve if it is not built into the architecture
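The event-driven approach can be sketched roughly as follows. This is a minimal in-memory illustration, not Episerver's or Elasticsearch's actual API: `ContentEvents`, `index_page`, `remove_page` and the `allowed_roles` field are hypothetical names, and a plain dict stands in for the search index.

```python
from typing import Callable, Dict, List

class ContentEvents:
    """Hypothetical stand-in for the CMS publish/delete events."""

    def __init__(self) -> None:
        self._published: List[Callable[[dict], None]] = []
        self._deleted: List[Callable[[int], None]] = []

    def on_published(self, handler: Callable[[dict], None]) -> None:
        self._published.append(handler)

    def on_deleted(self, handler: Callable[[int], None]) -> None:
        self._deleted.append(handler)

    def publish(self, page: dict) -> None:
        for handler in self._published:
            handler(page)

    def delete(self, page_id: int) -> None:
        for handler in self._deleted:
            handler(page_id)

# A dict stands in for the Elasticsearch index (assumption for the sketch).
search_index: Dict[int, dict] = {}

def index_page(page: dict) -> None:
    # The document can carry data a crawler would never see,
    # such as the access rights of the page.
    search_index[page["id"]] = {
        "title": page["title"],
        "body": page["body"],
        "allowed_roles": page["allowed_roles"],
    }

def remove_page(page_id: int) -> None:
    search_index.pop(page_id, None)

events = ContentEvents()
events.on_published(index_page)
events.on_deleted(remove_page)

# Publishing updates the index immediately; no crawl cycle is involved.
events.publish({"id": 1, "title": "Import rules", "body": "Rules for importing food",
                "allowed_roles": ["Everyone"]})
```

Because the index is updated inside the event handler, the latency is bounded by the index refresh rather than by a crawl schedule.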
18. CQRS WITH CMS (TRADITIONAL FORMAT)
› Commands: Web Site -> Episerver CMS -> SQL Server database
› Queries: Web Site -> Elasticsearch -> Elasticsearch index
19. CQRS WITH CMS (AS WE USE IT)
› Commands: Web Site -> Episerver CMS -> SQL Server database
› Queries: Web Site -> Elasticsearch -> Elasticsearch index
› Simple queries: Web Site -> Episerver CMS -> SQL Server database
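A minimal sketch of this split, assuming in-memory dicts in place of the SQL Server master data and the Elasticsearch index. The function names (`save_page`, `get_page`, `search`) are made up for the illustration.

```python
from typing import Dict, List, Optional

master_store: Dict[int, dict] = {}  # stand-in for the SQL Server master data
query_index: Dict[int, dict] = {}   # stand-in for the Elasticsearch index

def save_page(page: dict) -> None:
    """Command: write to the master store, then project into the read model."""
    master_store[page["id"]] = dict(page)
    query_index[page["id"]] = {
        "id": page["id"],
        "parent_id": page["parent_id"],
        "text": (page["title"] + " " + page["body"]).lower(),
    }

def get_page(page_id: int) -> Optional[dict]:
    """Simple query: a single object is served by the CMS side."""
    return master_store.get(page_id)

def search(word: str) -> List[int]:
    """Query across the whole hierarchy: served by the read model."""
    needle = word.lower()
    return sorted(doc_id for doc_id, doc in query_index.items()
                  if needle in doc["text"])

save_page({"id": 1, "parent_id": 0, "title": "Food", "body": "Overview"})
save_page({"id": 2, "parent_id": 1, "title": "Salmonella", "body": "Guidance"})
hits = search("salmonella")
```

Commands never touch the index directly; the index is only ever a projection of the master data, which is what makes it disposable.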
20. CQRS WITH CMS (AS EPISERVER FIND USES IT)
› Commands: Web Site -> Episerver CMS -> SQL Server database
› Queries: Web Site -> Elasticsearch -> Episerver Find index (returns only the IDs)
› Simple queries: Web Site -> Episerver CMS -> SQL Server database
21. WHY ELASTIC WITH CMS
› Content Management Systems are generally good for managing
content, files, content relations, hierarchy, language variations,
content versions, access rights, user management, model type
management and CACHING
› They often handle content in a hierarchical structure
› So querying a page and its parent or child pages often
comes straight from the cache and does not even make a database
query.
› But CMSes often lack good tools for querying across hierarchies
22. CHOOSE THE BEST TOOL
› Use Episerver/CMS for simple queries
• If you need to query just one object, sibling objects, or child objects from
fewer than 2 hierarchy levels
› Use Elasticsearch
• Everything else
› Except don't use Elasticsearch:
• If a 1-2 second latency is too much
• If there are transactional requirements
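These rules of thumb can be written down as a small routing function. The sketch below is only one reading of the slide: `QueryNeed`, `choose_store` and the 2-second threshold are assumptions for illustration, and "cms" here means the Episerver API backed by its cache and SQL Server.

```python
from dataclasses import dataclass

@dataclass
class QueryNeed:
    hierarchy_levels: int = 0          # levels of child pages the query spans
    crosses_hierarchies: bool = False  # e.g. "all news pages on the site"
    needs_transaction: bool = False
    max_latency_seconds: float = 60.0  # how stale the result may be

# Assumed upper bound for the indexing latency mentioned on the slides.
INDEX_LATENCY_SECONDS = 2.0

def choose_store(need: QueryNeed) -> str:
    # Exceptions first: the index is eventually consistent and not transactional.
    if need.needs_transaction or need.max_latency_seconds < INDEX_LATENCY_SECONDS:
        return "cms"
    # One object, siblings, or children from fewer than 2 levels: the CMS cache wins.
    if not need.crosses_hierarchies and need.hierarchy_levels < 2:
        return "cms"
    # Everything else.
    return "elasticsearch"
```

For example, `choose_store(QueryNeed(hierarchy_levels=1))` picks the CMS, while a site-wide listing with `crosses_hierarchies=True` goes to Elasticsearch.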
23. ELASTIC INDEX = QUERY DATABASE
› We can always recreate the Elastic index
from the SQL Server "master data"
› That is why we do not really need
multiple nodes or shards
› Reindex: Episerver CMS gets all the data from the SQL Server database
and rebuilds the Elasticsearch index
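A rough sketch of such a rebuild, assuming a list of rows stands in for the master data and a dict for the index. `to_document` and `rebuild_index` are hypothetical helpers, and the field names are illustrative.

```python
from typing import Dict, Iterable

def to_document(row: dict) -> dict:
    """Map one master-data row to the shape stored in the index."""
    return {"title": row["title"], "fullTextField": row["body"]}

def rebuild_index(master_rows: Iterable[dict]) -> Dict[int, dict]:
    """Build a fresh index from scratch; the caller swaps it in when done."""
    return {row["id"]: to_document(row) for row in master_rows}

master_rows = [
    {"id": 1, "title": "Food", "body": "Overview"},
    {"id": 2, "title": "Salmonella", "body": "Guidance"},
]

index = {99: {"title": "stale entry"}}  # old index with leftover data
index = rebuild_index(master_rows)      # everything comes from the master data
```

Since nothing in the index is authoritative, a lost or corrupted index costs only the time of one rebuild.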
24. ELASTICSEARCH.NET & NEST
› Official .NET Elasticsearch clients
› Elasticsearch.NET & NEST make
usage strongly typed:
• No JSON
• No typos
• Every value has a type
• The IDE will help you
• Not like JavaScript
› Code example:
// Document type stored in the index
public class Tweet
{
    public string[] HashTags { get; set; }
    ...
}

// First 10 tweets whose HashTags contain the exact term "elasticsearch"
var response = client.Search<Tweet>(s => s
    .From(0)
    .Size(10)
    .Query(q =>
        q.Term(t => t.HashTags, "elasticsearch")
    )
);
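For comparison, this is roughly the request body that such a typed query turns into on the wire, built here as a plain Python dict. The sketch assumes NEST's default camel-casing of property names (`HashTags` becomes `hashTags`); `term_search_body` is a hypothetical helper.

```python
import json

def term_search_body(field: str, value: str, start: int = 0, size: int = 10) -> dict:
    """Build an Elasticsearch term query with paging."""
    return {
        "from": start,
        "size": size,
        "query": {"term": {field: {"value": value}}},
    }

body = term_search_body("hashTags", "elasticsearch")
print(json.dumps(body, indent=2))
```

A typo in the field name here only shows up at query time as an empty result, which is the problem the strongly typed client avoids.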
25. MAPPINGS ARE LIKE SCHEMA IN DB
› NEST will automatically map most types,
but not all
› Strings have two separate types:
• Text (analyzed):
the default type for strings
• Keyword (not analyzed):
keyword fields are only searchable by their exact value
› Automating the mappings helps a lot in
the long run
› Mappings are normally generated automatically from the content you insert
into the index, but sometimes you need custom mappings.
› Code example:
using Nest;

public class Tweet
{
    [Text]      // analyzed, full-text searchable
    public string Content { get; set; }
    [Keyword]   // not analyzed, exact match only
    public string Url { get; set; }
    [Keyword]
    public string[] HashTags { get; set; }
    ...
}
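The attribute-based mapping corresponds to an index mapping along these lines, shown as a Python dict. This sketch assumes the typeless mapping format of newer Elasticsearch versions (older versions nest the properties under a type name) and camel-cased field names; `string_field` is a hypothetical helper.

```python
def string_field(analyzed: bool) -> dict:
    """Pick the string type: 'text' is analyzed, 'keyword' is exact-match only."""
    return {"type": "text" if analyzed else "keyword"}

tweet_mapping = {
    "mappings": {
        "properties": {
            "content": string_field(analyzed=True),    # full-text search
            "url": string_field(analyzed=False),       # exact value only
            "hashTags": string_field(analyzed=False),  # exact value only
        }
    }
}
```

Generating this structure from the model classes, rather than writing it by hand, keeps the mapping and the code from drifting apart.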
26. SCORING OPTIMIZATION
› Boosting fields is the most important scoring customization
› We normally have 3 fields which we boost with different values:
• Titles (boost 2.0)
• FullTextField (boost 1.5)
• ExtraContent (boost 1.0)
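One common way to express such boosts is a `multi_match` query with the `field^boost` syntax. The sketch below builds that request body as a Python dict. The camel-cased field names and the `boosted_query` helper are assumptions, and the slides do not say which query type the authors actually use.

```python
from typing import Dict

def boosted_query(search_word: str, boosts: Dict[str, float]) -> dict:
    """Build a multi_match query where each field carries its own boost."""
    fields = [f"{name}^{boost}" for name, boost in boosts.items()]
    return {"query": {"multi_match": {"query": search_word, "fields": fields}}}

# Boost values taken from the slide; field names are illustrative.
BOOSTS = {"title": 2.0, "fullTextField": 1.5, "extraContent": 1.0}

body = boosted_query("salmonella", BOOSTS)
```

With these values a hit in the title counts twice as much as the same hit in the extra content.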
27. SCORING OPTIMIZATION
› Script scoring allows us to boost results with custom properties:
• Search result type
• Number of internal links
• Depth in hierarchy
• Recently published / edited
• Popularity by user visits
› Requires dynamic scripting to be enabled in Elasticsearch.
Not all hosting partners allow it.
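A script score of this kind might look like the following `function_score` request body, again as a Python dict. The fields `visits` and `depth`, the formula, and the `scripted_query` helper are made up for illustration. The `source` key applies to newer Elasticsearch versions (older ones used `inline`).

```python
def scripted_query(base_query: dict) -> dict:
    """Wrap a query so popular, shallow pages score higher."""
    # Painless script: more visits raise the score, deeper pages lower it.
    script = ("_score * (1.0 + Math.log(1.0 + doc['visits'].value))"
              " / (1.0 + doc['depth'].value)")
    return {
        "query": {
            "function_score": {
                "query": base_query,
                "script_score": {"script": {"lang": "painless", "source": script}},
                # The script already includes _score, so replace rather than multiply.
                "boost_mode": "replace",
            }
        }
    }

body = scripted_query({"match": {"title": "salmonella"}})
```

The same wrapper could take result type or publish date into account by extending the script, at the cost of running it for every matching document.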
28. SUMMARY
› Every dev loves Elasticsearch and it’s easy to start
› Choose the best tool for the purpose
› Compare options and look for weaknesses that another solution can cover
-> use the best of both solutions
› Elasticsearch fits with most CMSes because they lack good search tools
› CQRS pattern will help with performance but choose wisely how to use it
› Invest in making your platform customizable, so it also fits your next project.