ELASTICSEARCH
What’s new since 0.90?
techtalk @ ferret
• Latest stable release: Elasticsearch 1.1.0
• Released: 25.03.2014
• Based on Lucene 4.6.1
BREAKING CHANGES
in versions 1.x
CONFIGURATION
• The cluster.routing.allocation settings (disable_allocation,
disable_new_allocation and disable_replica_allocation) have
been replaced by the single setting:
cluster.routing.allocation.enable: all|primaries|new_primaries|none
• Elasticsearch on 64-bit Linux now uses mmapfs by default. Make
sure that you set MAX_MAP_COUNT to a sufficiently high
number. The RPM and Debian packages default this value to
262144.
MULTI-FIELDS
Existing multi-fields will be upgraded to the new format automatically.
Old format:
"title": {
    "type": "multi_field",
    "fields": {
        "title": { "type": "string" },
        "raw": {
            "type": "string",
            "index": "not_analyzed"
        }
    }
}
New format:
"title": {
    "type": "string",
    "fields": {
        "raw": {
            "type": "string",
            "index": "not_analyzed"
        }
    }
}
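The upgrade happens inside Elasticsearch, but the transformation itself is easy to picture. The following is a minimal sketch in Python of the same rewrite applied to a mapping dictionary. The helper name `upgrade_multi_field` is hypothetical and is not part of Elasticsearch or any client library, and it assumes the old mapping contains a sub-field named after the parent field.

```python
import copy


def upgrade_multi_field(name, mapping):
    """Rewrite an old-style multi_field mapping into the new `fields` format.

    Hypothetical helper for illustration only; Elasticsearch performs the
    real upgrade itself.
    """
    if mapping.get("type") != "multi_field":
        # Nothing to do: the mapping is not an old-style multi-field.
        return copy.deepcopy(mapping)

    sub_fields = copy.deepcopy(mapping.get("fields", {}))
    if name not in sub_fields:
        # Assumption of this sketch: the default sub-field must be present.
        raise ValueError("no default sub-field named %r" % name)

    # The sub-field that shares the parent's name becomes the main field.
    upgraded = sub_fields.pop(name)
    if sub_fields:
        # Every remaining sub-field moves under the main field's `fields`.
        upgraded["fields"] = sub_fields
    return upgraded


old = {
    "type": "multi_field",
    "fields": {
        "title": {"type": "string"},
        "raw": {"type": "string", "index": "not_analyzed"},
    },
}
new = upgrade_multi_field("title", old)
```

Here `new` has the same shape as the new-format mapping on the slide, and `old` is left unmodified.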
STOPWORDS
• Previously, the standard and pattern analyzers used
the list of English stopwords by default, which
caused some hard-to-debug indexing issues.
• Now they use the empty stopwords list
(i.e. _none_) instead.
RETURN VALUES
• The ok return value has been removed from all response bodies,
as it added no useful information.
• The found, not_found and exists return values have been
unified as found on all relevant APIs.
• Field values returned in response to the fields parameter are now
always arrays. Metadata fields are always returned as scalars.
• The analyze API no longer supports the text response format,
but does support JSON and YAML.
DEPRECATIONS
• Per-document boosting with the _boost field has been
removed. Use the function_score query instead.
• The custom_score and custom_boost_score queries are no longer
supported. Use function_score instead.
• The field query has been removed. Use the query_string
query instead.
• The path parameter in mappings has been deprecated. Use
the copy_to parameter instead.
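As a rough illustration of the first replacement, the sketch below builds a function_score request body in Python that multiplies each document's score by a numeric field. It is only one possible approach under stated assumptions: the field name `popularity` and the helper `boosted_query` are hypothetical, the boost is applied at query time rather than at index time as `_boost` was, and scripting is assumed to be enabled on the cluster.

```python
import json


def boosted_query(query, boost_field):
    """Wrap `query` in a function_score that scales _score by a numeric field.

    Hypothetical helper; `boost_field` must be a numeric field in the mapping.
    """
    return {
        "function_score": {
            "query": query,
            "script_score": {
                # Multiply the relevance score by the per-document value.
                "script": "_score * doc['%s'].value" % boost_field
            },
        }
    }


body = {"query": boosted_query({"match": {"title": "elasticsearch"}}, "popularity")}
print(json.dumps(body, indent=2))
```

The resulting `body` would be sent as the search request body; nothing in the sketch talks to a cluster.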
AGGREGATIONS
since version 1.0.0
AGGREGATION TYPES
• Bucketing aggregations
Aggregations that build buckets, where each bucket is associated with a key and a
document criterion.
Examples: range, terms, histogram
Bucketing aggregations can have sub-aggregations (bucketing or metric). The sub-aggregations
are computed for the buckets that their parent aggregation generates.
• Metrics aggregations
Aggregations that keep track of and compute metrics over a set of documents.
Examples: min, max, stats
{
    "aggs" : {
        "price_ranges" : {
            "range" : {
                "field" : "price",
                "ranges" : [
                    { "to" : 50 },
                    { "from" : 100 }
                ]
            },
            "aggs" : {
                "price_stats" : {
                    "stats" : { "field" : "price" }
                }
            }
        }
    }
}
{
    "aggregations": {
        "price_ranges" : {
            "buckets": [
                {
                    "to": 50,
                    "doc_count": 2,
                    "price_stats": {
                        "count": 2,
                        "min": 20,
                        "max": 47,
                        "avg": 33.5,
                        "sum": 67
                    }
                }, …
            ]
        }
    }
}
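To show how the nested result is typically consumed, here is a small Python sketch that walks the buckets of a response shaped like the one above and reads the sub-aggregation from each bucket. The `summarize` helper is hypothetical, and the sample response contains only the first bucket from the slide, since the remaining buckets are elided there.

```python
# Sample response: only the first bucket shown on the slide is included.
response = {
    "aggregations": {
        "price_ranges": {
            "buckets": [
                {
                    "to": 50,
                    "doc_count": 2,
                    "price_stats": {
                        "count": 2, "min": 20, "max": 47, "avg": 33.5, "sum": 67
                    },
                },
            ]
        }
    }
}


def summarize(response, agg_name, sub_agg_name):
    """Return (range label, doc_count, average) for every bucket."""
    rows = []
    for bucket in response["aggregations"][agg_name]["buckets"]:
        # Each bucket carries the results of its sub-aggregations by name.
        stats = bucket[sub_agg_name]
        # Open-ended ranges have no "from" or no "to" key.
        label = "%s-%s" % (bucket.get("from", "*"), bucket.get("to", "*"))
        rows.append((label, bucket["doc_count"], stats["avg"]))
    return rows


rows = summarize(response, "price_ranges", "price_stats")
```

Each tuple in `rows` pairs a bucket with the metric computed for just the documents in that bucket.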
CARDINALITY
The cardinality aggregation is a metric aggregation that computes approximate unique
counts using the HyperLogLog++ algorithm. The algorithm is close to accurate at low
cardinalities and uses a fixed amount of memory, so estimating high cardinalities
doesn't blow up memory.
{
    "aggs" : {
        "author_count" : {
            "cardinality" : {
                "field" : "author"
            }
        }
    }
}
PERCENTILES
The percentiles aggregation computes (approximate) values of arbitrary percentiles
using the t-digest algorithm. Computing exact percentiles is not reasonably feasible, as it would
require shards to stream all values to the node that coordinates search execution, which could be
gigabytes on a high-cardinality field.
since version 1.1.0
{
    "aggs" : {
        "load_time_outlier" : {
            "percentiles" : {
                "field" : "load_time"
            }
        }
    }
}
{
    ...
    "aggregations": {
        "load_time_outlier": {
            "1.0": 15,
            "5.0": 20,
            "25.0": 23,
            "50.0": 25,
            "75.0": 29,
            "95.0": 60,
            "99.0": 150
        }
    }
}
SIGNIFICANT_TERMS
{
    "query" : {
        "terms" : {
            "force" : [ "British Transport Police" ]
        }
    },
    "aggregations" : {
        "significantCrimeTypes" : {
            "significant_terms" : { "field" : "crime_type" }
        }
    }
}
An aggregation that identifies terms that are significant rather than merely popular in a result set.
Significance reflects the change in a term's document frequency between everyday use in the
corpus and the result set.
since version 1.1.0
{
    "aggregations" : {
        "significantCrimeTypes" : {
            "doc_count": 47347,
            "buckets" : [
                {
                    "key": "Bicycle theft",
                    "doc_count": 3640,
                    "score": 0.371235374214817,
                    "bg_count": 66799
                }, …
            ]
        }
    }
}
IMPROVEMENTS
1.1.0
TERMS AGGREGATION
• Before 1.1.0, terms aggregations returned up to size terms, so the way
to get all matching terms back was to set size to an arbitrarily high
number that would be larger than the number of unique terms.
• Since version 1.1.0, set size=0 to get ALL terms.
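A minimal Python sketch of the difference: the helper below builds a terms aggregation that asks for every term, choosing the size according to the server version. The function name `terms_agg`, the version tuple argument, and the fallback value of 1000000 are assumptions made for illustration only.

```python
def terms_agg(field, es_version, fallback_size=1000000):
    """Build a terms aggregation that requests all matching terms.

    `es_version` is a (major, minor, patch) tuple. `fallback_size` is an
    arbitrary large number used only for servers older than 1.1.0.
    """
    # From 1.1.0 on, size=0 means "return all terms".
    size = 0 if es_version >= (1, 1, 0) else fallback_size
    return {"terms": {"field": field, "size": size}}


body = {"aggs": {"authors": terms_agg("author", (1, 1, 0))}}
legacy = {"aggs": {"authors": terms_agg("author", (1, 0, 2))}}
```

With the older version the request still depends on the fallback being larger than the number of unique terms, which is exactly the guesswork that size=0 removes.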
MULTI-FIELD SEARCH
• The multi_match query now supports three types of execution:
• best_fields (field-centric, default) Find the field that best matches the
query string. Useful for finding a single concept like “full text search” in
either the title or the body field.
• most_fields (field-centric) Find all matching fields and add up their
scores. Useful for matching against multi-fields, where the same text
has been analyzed in different ways to improve the relevance score:
with/without stemming, shingles, edge-ngrams etc.
• cross_fields (term-centric) New execution mode which looks for
each term in any of the listed fields. Useful for documents whose
identifying features are spread across multiple fields, such as
first_name and last_name, and supports the minimum_should_match
operator in a more natural way than the other two modes.
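The three execution types differ only in the `type` value of the query. The Python sketch below builds such a query body. The helper `multi_match` and the sample search text are hypothetical, and the validation list covers only the three types named above.

```python
VALID_TYPES = ("best_fields", "most_fields", "cross_fields")


def multi_match(text, fields, match_type="best_fields", minimum_should_match=None):
    """Build a multi_match query clause for one of the three execution types."""
    if match_type not in VALID_TYPES:
        raise ValueError("unsupported multi_match type: %r" % match_type)
    clause = {"query": text, "fields": list(fields), "type": match_type}
    if minimum_should_match is not None:
        clause["minimum_should_match"] = minimum_should_match
    return {"multi_match": clause}


# Term-centric search for a name spread over two fields.
person = multi_match("Will Smith", ["first_name", "last_name"],
                     match_type="cross_fields", minimum_should_match="100%")

# Field-centric default: the best-matching field wins.
concept = multi_match("full text search", ["title", "body"])
```

In the `person` query, cross_fields lets "Will" match in first_name and "Smith" match in last_name while still requiring both terms to be present.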
CAT API
since version 1.0.0
JSON is great… for computers. Human eyes, especially when looking at an ssh terminal, need
compact and aligned text. The cat API aims to meet this need.
$ curl 'localhost:9200/_cat/nodes?h=ip,port,heapPercent,name'
192.168.56.40 9300 40.3 Captain Universe
192.168.56.20 9300 15.3 Kaluu
192.168.56.50 9300 17.0 Yellowjacket
192.168.56.10 9300 12.3 Remy LeBeau
192.168.56.30 9300 43.9 Ramsey, Doug
TRIBE NODES
since version 1.0.0
The tribes feature allows a tribe node to act as a federated client across multiple clusters.
elasticsearch.yml:
tribe:
    t1:
        cluster.name: cluster_one
    t2:
        cluster.name: cluster_two
The merged global cluster state means that almost all operations work in the same
way as a single cluster: distributed search, suggest, percolation, indexing, etc.
However, there are a few exceptions:
• The merged view cannot handle indices with the same name in multiple clusters.
• Master level read operations (e.g. Cluster State, Cluster Health) will automatically
execute with a local flag set to true since there is no master.
• Master level write operations (e.g. Create Index) are not allowed. These should be
performed on a single cluster.
BACKUP & RESTORE
since version 1.0.0
REPOSITORIES
Before any snapshot or restore operation can be performed, a snapshot
repository must be registered in Elasticsearch:
$ curl -XPUT 'http://localhost:9200/_snapshot/my_backup' -d '{
    "type": "fs",
    "settings": {
        "location": "/mount/backups/my_backup",
        "compress": true
    }
}'
Supported repository types:
• fs (filesystem)
• S3
• HDFS (Hadoop)
• Azure
SNAPSHOTS
$ curl -XPUT "localhost:9200/_snapshot/my_backup/snapshot_1" -d '{
    "indices": "index_1,index_2"
}'
A repository can contain multiple snapshots of the same cluster. Snapshots are
identified by unique names within the cluster.
• The index snapshot process is incremental.
• Only one snapshot process can be executed in the cluster at any
time.
• The snapshot process is executed in a non-blocking fashion.
RESTORE
A snapshot can be restored using the following command:
$ curl -XPOST "localhost:9200/_snapshot/my_backup/snapshot_1/_restore" -d '{
    "indices": "index_1,index_2",
    "rename_pattern": "index_(.+)",
    "rename_replacement": "restored_index_$1"
}'
• The restore operation can be performed on a functioning cluster.
• An existing index can only be restored if it is closed.
• The restored persistent settings are added to the existing
persistent settings.
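To preview which index names a restore with rename_pattern and rename_replacement would produce, the rename can be simulated locally. The Python sketch below assumes the pattern is applied as a regular-expression search and replace with Java-style `$1` group references. The helper `restored_name` is hypothetical, and it does not handle escaped dollar signs in the replacement.

```python
import re


def restored_name(index, rename_pattern, rename_replacement):
    """Simulate the index rename applied during a restore.

    Java-style group references ($1, $2, ...) are translated to the
    \\g<1> form that Python's re module understands.
    """
    py_replacement = re.sub(r"\$(\d+)", r"\\g<\1>", rename_replacement)
    # Indices that do not match the pattern keep their original name.
    return re.sub(rename_pattern, py_replacement, index)


names = [restored_name(name, "index_(.+)", "restored_index_$1")
         for name in ["index_1", "index_2", "other"]]
```

With the pattern from the slide, index_1 and index_2 map to restored_index_1 and restored_index_2, so the restore does not collide with the existing indices of the same name.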
ELASTICSEARCH-PY
Official low-level client for Elasticsearch
Features:
• translating basic Python data types to and from JSON (datetimes are not
decoded for performance reasons)
• configurable automatic discovery of cluster nodes
• persistent connections
• load balancing (with pluggable selection strategy) across all available nodes
• failed connection penalization (time-based: failed connections won’t be
retried until a timeout is reached)
• thread safety
• pluggable architecture
Versioning:
• There are two branches: master and 0.4. The master branch tracks all the
changes for Elasticsearch 1.0 and beyond, whereas 0.4 tracks Elasticsearch 0.90.

Comparing Linux OS Image Update Models - EOSS 2024.pdfComparing Linux OS Image Update Models - EOSS 2024.pdf
Comparing Linux OS Image Update Models - EOSS 2024.pdfDrew Moseley
 
Post Quantum Cryptography – The Impact on Identity
Post Quantum Cryptography – The Impact on IdentityPost Quantum Cryptography – The Impact on Identity
Post Quantum Cryptography – The Impact on Identityteam-WIBU
 
Machine Learning Software Engineering Patterns and Their Engineering
Machine Learning Software Engineering Patterns and Their EngineeringMachine Learning Software Engineering Patterns and Their Engineering
Machine Learning Software Engineering Patterns and Their EngineeringHironori Washizaki
 
CRM Contender Series: HubSpot vs. Salesforce
CRM Contender Series: HubSpot vs. SalesforceCRM Contender Series: HubSpot vs. Salesforce
CRM Contender Series: HubSpot vs. SalesforceBrainSell Technologies
 
React Server Component in Next.js by Hanief Utama
React Server Component in Next.js by Hanief UtamaReact Server Component in Next.js by Hanief Utama
React Server Component in Next.js by Hanief UtamaHanief Utama
 
Innovate and Collaborate- Harnessing the Power of Open Source Software.pdf
Innovate and Collaborate- Harnessing the Power of Open Source Software.pdfInnovate and Collaborate- Harnessing the Power of Open Source Software.pdf
Innovate and Collaborate- Harnessing the Power of Open Source Software.pdfYashikaSharma391629
 
Powering Real-Time Decisions with Continuous Data Streams
Powering Real-Time Decisions with Continuous Data StreamsPowering Real-Time Decisions with Continuous Data Streams
Powering Real-Time Decisions with Continuous Data StreamsSafe Software
 
Unveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML DiagramsUnveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML DiagramsAhmed Mohamed
 
Sending Calendar Invites on SES and Calendarsnack.pdf
Sending Calendar Invites on SES and Calendarsnack.pdfSending Calendar Invites on SES and Calendarsnack.pdf
Sending Calendar Invites on SES and Calendarsnack.pdf31events.com
 
Real-time Tracking and Monitoring with Cargo Cloud Solutions.pptx
Real-time Tracking and Monitoring with Cargo Cloud Solutions.pptxReal-time Tracking and Monitoring with Cargo Cloud Solutions.pptx
Real-time Tracking and Monitoring with Cargo Cloud Solutions.pptxRTS corp
 
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...Matt Ray
 
Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...
Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...
Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...Natan Silnitsky
 
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...confluent
 
Ahmed Motair CV April 2024 (Senior SW Developer)
Ahmed Motair CV April 2024 (Senior SW Developer)Ahmed Motair CV April 2024 (Senior SW Developer)
Ahmed Motair CV April 2024 (Senior SW Developer)Ahmed Mater
 
Salesforce Implementation Services PPT By ABSYZ
Salesforce Implementation Services PPT By ABSYZSalesforce Implementation Services PPT By ABSYZ
Salesforce Implementation Services PPT By ABSYZABSYZ Inc
 
Introduction Computer Science - Software Design.pdf
Introduction Computer Science - Software Design.pdfIntroduction Computer Science - Software Design.pdf
Introduction Computer Science - Software Design.pdfFerryKemperman
 
SpotFlow: Tracking Method Calls and States at Runtime
SpotFlow: Tracking Method Calls and States at RuntimeSpotFlow: Tracking Method Calls and States at Runtime
SpotFlow: Tracking Method Calls and States at Runtimeandrehoraa
 

Último (20)

Implementing Zero Trust strategy with Azure
Implementing Zero Trust strategy with AzureImplementing Zero Trust strategy with Azure
Implementing Zero Trust strategy with Azure
 
A healthy diet for your Java application Devoxx France.pdf
A healthy diet for your Java application Devoxx France.pdfA healthy diet for your Java application Devoxx France.pdf
A healthy diet for your Java application Devoxx France.pdf
 
Comparing Linux OS Image Update Models - EOSS 2024.pdf
Comparing Linux OS Image Update Models - EOSS 2024.pdfComparing Linux OS Image Update Models - EOSS 2024.pdf
Comparing Linux OS Image Update Models - EOSS 2024.pdf
 
Post Quantum Cryptography – The Impact on Identity
Post Quantum Cryptography – The Impact on IdentityPost Quantum Cryptography – The Impact on Identity
Post Quantum Cryptography – The Impact on Identity
 
Machine Learning Software Engineering Patterns and Their Engineering
Machine Learning Software Engineering Patterns and Their EngineeringMachine Learning Software Engineering Patterns and Their Engineering
Machine Learning Software Engineering Patterns and Their Engineering
 
2.pdf Ejercicios de programación competitiva
2.pdf Ejercicios de programación competitiva2.pdf Ejercicios de programación competitiva
2.pdf Ejercicios de programación competitiva
 
CRM Contender Series: HubSpot vs. Salesforce
CRM Contender Series: HubSpot vs. SalesforceCRM Contender Series: HubSpot vs. Salesforce
CRM Contender Series: HubSpot vs. Salesforce
 
React Server Component in Next.js by Hanief Utama
React Server Component in Next.js by Hanief UtamaReact Server Component in Next.js by Hanief Utama
React Server Component in Next.js by Hanief Utama
 
Innovate and Collaborate- Harnessing the Power of Open Source Software.pdf
Innovate and Collaborate- Harnessing the Power of Open Source Software.pdfInnovate and Collaborate- Harnessing the Power of Open Source Software.pdf
Innovate and Collaborate- Harnessing the Power of Open Source Software.pdf
 
Powering Real-Time Decisions with Continuous Data Streams
Powering Real-Time Decisions with Continuous Data StreamsPowering Real-Time Decisions with Continuous Data Streams
Powering Real-Time Decisions with Continuous Data Streams
 
Unveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML DiagramsUnveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML Diagrams
 
Sending Calendar Invites on SES and Calendarsnack.pdf
Sending Calendar Invites on SES and Calendarsnack.pdfSending Calendar Invites on SES and Calendarsnack.pdf
Sending Calendar Invites on SES and Calendarsnack.pdf
 
Real-time Tracking and Monitoring with Cargo Cloud Solutions.pptx
Real-time Tracking and Monitoring with Cargo Cloud Solutions.pptxReal-time Tracking and Monitoring with Cargo Cloud Solutions.pptx
Real-time Tracking and Monitoring with Cargo Cloud Solutions.pptx
 
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...
 
Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...
Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...
Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...
 
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
 
Ahmed Motair CV April 2024 (Senior SW Developer)
Ahmed Motair CV April 2024 (Senior SW Developer)Ahmed Motair CV April 2024 (Senior SW Developer)
Ahmed Motair CV April 2024 (Senior SW Developer)
 
Salesforce Implementation Services PPT By ABSYZ
Salesforce Implementation Services PPT By ABSYZSalesforce Implementation Services PPT By ABSYZ
Salesforce Implementation Services PPT By ABSYZ
 
Introduction Computer Science - Software Design.pdf
Introduction Computer Science - Software Design.pdfIntroduction Computer Science - Software Design.pdf
Introduction Computer Science - Software Design.pdf
 
SpotFlow: Tracking Method Calls and States at Runtime
SpotFlow: Tracking Method Calls and States at RuntimeSpotFlow: Tracking Method Calls and States at Runtime
SpotFlow: Tracking Method Calls and States at Runtime
 

Elasticsearch

  • 1. ELASTICSEARCH What’s new since 0.90? techtalk @ ferret
  • 2. • Latest stable release: Elasticsearch 1.1.0 • Released: 25.03.2014 • Based on Lucene 4.6.1
  • 4. CONFIGURATION • The cluster.routing.allocation settings (disable_allocation, disable_new_allocation and disable_replica_allocation) have been replaced by the single setting: cluster.routing.allocation.enable: all|primaries|new_primaries|none • Elasticsearch on 64-bit Linux now uses mmapfs by default. Make sure that you set MAX_MAP_COUNT to a sufficiently high number. The RPM and Debian packages default this value to 262144.
  • 5. MULTI-FIELDS Existing multi-fields will be upgraded to the new format automatically. Old format: "title": { "type": "multi_field", "fields": { "title": { "type": "string" }, "raw": { "type": "string", "index": "not_analyzed" } } } New format: "title": { "type": "string", "fields": { "raw": { "type": "string", "index": "not_analyzed" } } }
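The upgrade is done by Elasticsearch itself, but the shape of the change can be sketched in a few lines of Python. This is only an illustration of the two formats shown on the slide, not the server's actual upgrade code; the helper name `upgrade_multi_field` and the fallback to a plain `string` main field are assumptions.

```python
import copy


def upgrade_multi_field(name, mapping):
    """Rewrite a legacy multi_field mapping for field `name` into the 1.x form.

    Hypothetical helper for illustration; Elasticsearch performs the real
    upgrade automatically.
    """
    if mapping.get("type") != "multi_field":
        # Already in the new format: return an unchanged copy.
        return copy.deepcopy(mapping)

    sub_fields = copy.deepcopy(mapping.get("fields", {}))
    # The sub-field that shares the parent's name becomes the main field.
    # Falling back to a plain string field is an assumption of this sketch.
    main = sub_fields.pop(name, {"type": "string"})

    upgraded = dict(main)
    if sub_fields:
        upgraded["fields"] = sub_fields
    return upgraded


old = {
    "type": "multi_field",
    "fields": {
        "title": {"type": "string"},
        "raw": {"type": "string", "index": "not_analyzed"},
    },
}
new = upgrade_multi_field("title", old)
```

Here `new` has the same structure as the second mapping on the slide: the main field carries its type directly and `raw` remains as a sub-field.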
  • 6. STOPWORDS • Previously, the standard and pattern analyzers used the list of English stopwords by default, which caused some hard-to-debug indexing issues. • Now they use the empty stopwords list (i.e. _none_) instead.
  • 7. RETURN VALUES • The ok return value has been removed from all response bodies as it added no useful information. • The found, not_found and exists return values have been unified as found on all relevant APIs. • Field values, in response to the fields parameter, are now always returned as arrays. Metadata fields are always returned as scalars. • The analyze API no longer supports the text response format, but does support JSON and YAML.
  • 8. DEPRECATIONS • Per-document boosting with the _boost field has been removed. You can use function_score instead. • The custom_score and custom_boost_score queries are no longer supported. You can use function_score instead. • The field query has been removed. Use the query_string query instead. • The path parameter in mappings has been deprecated. Use the copy_to parameter instead.
  • 10. AGGREGATION TYPES • Bucketing aggregations: aggregations that build buckets, where each bucket is associated with a key and a document criterion. Examples: range, terms, histogram. Bucketing aggregations can have sub-aggregations (bucketing or metric). The sub-aggregations will be computed for the buckets which their parent aggregation generates. • Metrics aggregations: aggregations that keep track of and compute metrics over a set of documents. Examples: min, max, stats.
  • 11. Request: { "aggs" : { "price_ranges" : { "range" : { "field" : "price", "ranges" : [ { "to" : 50 }, { "from" : 100 } ] }, "aggs" : { "price_stats" : { "stats" : { "field" : "price" } } } } } } Response: { "aggregations": { "price_ranges" : { "buckets": [ { "to": 50, "doc_count": 2, "price_stats": { "count": 2, "min": 20, "max": 47, "avg": 33.5, "sum": 67 } }, … ] } } }
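The request and response above can be built and read back with plain dictionaries. The sketch below assumes nothing beyond the JSON on the slide; the helper names `range_with_stats` and `bucket_averages` are hypothetical, and the sample response contains only the one bucket the slide shows.

```python
def range_with_stats(agg_name, stats_name, field, ranges):
    """Build a range (bucketing) aggregation with a stats (metric) sub-aggregation."""
    return {
        "aggs": {
            agg_name: {
                "range": {"field": field, "ranges": ranges},
                # The sub-aggregation is computed once per bucket of the parent.
                "aggs": {stats_name: {"stats": {"field": field}}},
            }
        }
    }


def bucket_averages(response, agg_name, stats_name):
    """Collect the range bounds, document count and average for each bucket."""
    rows = []
    for bucket in response["aggregations"][agg_name]["buckets"]:
        rows.append({
            "from": bucket.get("from"),  # None for an open lower bound
            "to": bucket.get("to"),      # None for an open upper bound
            "doc_count": bucket["doc_count"],
            "avg": bucket[stats_name]["avg"],
        })
    return rows


request = range_with_stats("price_ranges", "price_stats", "price",
                           [{"to": 50}, {"from": 100}])

# Only the first bucket from the slide; the remaining buckets are elided there.
sample_response = {
    "aggregations": {
        "price_ranges": {
            "buckets": [
                {"to": 50, "doc_count": 2,
                 "price_stats": {"count": 2, "min": 20, "max": 47,
                                 "avg": 33.5, "sum": 67}},
            ]
        }
    }
}
rows = bucket_averages(sample_response, "price_ranges", "price_stats")
```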
  • 12. CARDINALITY The cardinality aggregation is a metric aggregation that computes approximate unique counts based on the HyperLogLog++ algorithm, which is close to accurate on low cardinalities and has fixed memory usage, so that estimating high cardinalities doesn't blow up memory. { "aggs" : { "author_count" : { "cardinality" : { "field" : "author" } } } }
  • 13. PERCENTILES (since 1.1.0) A percentiles aggregation computes (approximate) values of arbitrary percentiles based on the t-digest algorithm. Computing exact percentiles is not reasonably feasible, as it would require shards to stream all values to the node that coordinates search execution, which could be gigabytes on a high-cardinality field. Request: { "aggs" : { "load_time_outlier" : { "percentiles" : { "field" : "load_time" } } } } Response: { ... "aggregations": { "load_time_outlier": { "1.0": 15, "5.0": 20, "25.0": 23, "50.0": 25, "75.0": 29, "95.0": 60, "99.0": 150 } } }
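A small worked example may help show why exact percentiles need every value on the coordinating node: per-shard percentiles cannot simply be combined. The shard contents below are made-up numbers, and the median stands in for the 50th percentile.

```python
from statistics import median

# Hypothetical load_time values held by two shards.
shard_a = [1, 2, 3]
shard_b = [10, 20, 30, 40, 50, 60, 70]

# Naive approach: each shard reports its own median and the
# coordinating node averages the two results.
per_shard = [median(shard_a), median(shard_b)]
naive = sum(per_shard) / len(per_shard)

# Exact approach: every value is shipped to one node and the
# median is computed over the merged list.
exact = median(shard_a + shard_b)
```

The naive combination gives 21.0 while the exact median is 25.0, so an exact answer requires moving all values; a sketch structure such as t-digest trades a small error for a bounded amount of data per shard.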
  • 14. SIGNIFICANT_TERMS (since 1.1.0) An aggregation that identifies terms that are significant rather than merely popular in a result set. Significance is related to the changes in document frequency observed between everyday use in the corpus and frequency observed in the result set. Request: { "query" : { "terms" : { "force" : [ "British Transport Police" ] } }, "aggregations" : { "significantCrimeTypes" : { "significant_terms" : { "field" : "crime_type" } } } } Response: { "aggregations" : { "significantCrimeTypes" : { "doc_count": 47347, "buckets" : [ { "key": "Bicycle theft", "doc_count": 3640, "score": 0.371235374214817, "bg_count": 66799 }, … ] } } }
  • 16. TERMS AGGREGATION • Before 1.1.0, terms aggregations returned up to size terms, so the way to get all matching terms back was to set size to an arbitrarily high number, larger than the number of unique terms. • Since version 1.1.0, to get ALL terms just set size=0.
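As a minimal sketch of the difference, the helper below picks the size value from the server version. The function name `all_terms_agg`, the version tuple and the fallback of 10000 are illustrative assumptions; only the size=0 behaviour for 1.1.0 comes from the slide.

```python
def all_terms_agg(name, field, es_version, fallback_size=10000):
    """Build a terms aggregation that asks for every unique term.

    `es_version` is a (major, minor, patch) tuple. `fallback_size` is an
    arbitrary high number for servers older than 1.1.0 and must exceed the
    number of unique terms in the field.
    """
    size = 0 if tuple(es_version) >= (1, 1, 0) else fallback_size
    return {"aggs": {name: {"terms": {"field": field, "size": size}}}}


new_style = all_terms_agg("authors", "author", (1, 1, 0))
old_style = all_terms_agg("authors", "author", (1, 0, 2))
```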
  • 17. MULTI-FIELD SEARCH • The multi_match query now supports three types of execution:
 • best_fields (field-centric, default): Find the field that best matches the query string. Useful for finding a single concept like “full text search” in either the title or the body field. • most_fields (field-centric): Find all matching fields and add up their scores. Useful for matching against multi-fields, where the same text has been analyzed in different ways to improve the relevance score: with/without stemming, shingles, edge-ngrams etc. • cross_fields (term-centric): New execution mode which looks for each term in any of the listed fields. Useful for documents whose identifying features are spread across multiple fields, such as first_name and last_name, and supports the minimum_should_match operator in a more natural way than the other two modes.
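The three execution types are selected with the type parameter of the multi_match query. The sketch below builds such a request body; the helper name `multi_match` and the example query text are assumptions, while the field names first_name and last_name are taken from the slide.

```python
MULTI_MATCH_TYPES = ("best_fields", "most_fields", "cross_fields")


def multi_match(query, fields, match_type="best_fields", minimum_should_match=None):
    """Build a search body containing a multi_match query of the given type."""
    if match_type not in MULTI_MATCH_TYPES:
        raise ValueError("unsupported multi_match type: %r" % (match_type,))
    clause = {"query": query, "fields": list(fields), "type": match_type}
    if minimum_should_match is not None:
        clause["minimum_should_match"] = minimum_should_match
    return {"query": {"multi_match": clause}}


# Term-centric search: each term may match in either name field.
body = multi_match("Will Smith", ["first_name", "last_name"],
                   match_type="cross_fields", minimum_should_match="100%")
```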
  • 19. JSON is great… for computers. Human eyes, especially when looking at an ssh terminal, need compact and aligned text. The cat API aims to meet this need. $ curl 'localhost:9200/_cat/nodes?h=ip,port,heapPercent,name' 192.168.56.40 9300 40.3 Captain Universe 192.168.56.20 9300 15.3 Kaluu 192.168.56.50 9300 17.0 Yellowjacket 192.168.56.10 9300 12.3 Remy LeBeau 192.168.56.30 9300 43.9 Ramsey, Doug
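The cat output is meant for people, but it is simple enough to script against as well. The following sketch splits the sample output above into dictionaries; `parse_cat` is a hypothetical helper and it assumes the only column that may contain spaces (here, name) is requested last.

```python
def parse_cat(text, headers):
    """Split whitespace-aligned _cat output into one dict per row.

    Assumes the last requested column is the only one that can contain
    spaces, so it receives whatever remains of the line.
    """
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        values = line.split(None, len(headers) - 1)
        rows.append(dict(zip(headers, values)))
    return rows


sample = """\
192.168.56.40 9300 40.3 Captain Universe
192.168.56.20 9300 15.3 Kaluu
192.168.56.50 9300 17.0 Yellowjacket
192.168.56.10 9300 12.3 Remy LeBeau
192.168.56.30 9300 43.9 Ramsey, Doug
"""
nodes = parse_cat(sample, ["ip", "port", "heapPercent", "name"])
```

All values stay strings in this sketch; converting heapPercent to a float is left to the caller.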
  • 21. The tribes feature allows a tribe node to act as a federated client across multiple clusters. Configuration in elasticsearch.yml:
      tribe:
        t1:
          cluster.name: cluster_one
        t2:
          cluster.name: cluster_two
    The merged global cluster state means that almost all operations work in the same way as a single cluster: distributed search, suggest, percolation, indexing, etc. However, there are a few exceptions: • The merged view cannot handle indices with the same name in multiple clusters. • Master-level read operations (e.g. Cluster State, Cluster Health) will automatically execute with a local flag set to true since there is no master. • Master-level write operations (e.g. Create Index) are not allowed. These should be performed on a single cluster.
  • 22. BACKUP & RESTORE since version 1.0.0
  • 23. REPOSITORIES Before any snapshot or restore operation can be performed, a snapshot repository should be registered in Elasticsearch. $ curl -XPUT 'http://localhost:9200/_snapshot/my_backup' -d '{ "type": "fs", "settings": { "location": "/mount/backups/my_backup", "compress": true }}' Supported repository types: • fs (filesystem) • S3 • HDFS (Hadoop) • Azure
  • 24. SNAPSHOTS $ curl -XPUT "localhost:9200/_snapshot/my_backup/snapshot_1" -d '{ "indices": "index_1,index_2" }' A repository can contain multiple snapshots of the same cluster. Snapshots are identified by unique names within the cluster. • The index snapshot process is incremental. • Only one snapshot process can be executed in the cluster at any time. • The snapshotting process is executed in a non-blocking fashion.
  • 25. RESTORE A snapshot can be restored using the following command: $ curl -XPOST "localhost:9200/_snapshot/my_backup/snapshot_1/_restore" -d '{ "indices": "index_1,index_2", "rename_pattern": "index_(.+)", "rename_replacement": "restored_index_$1" }' • The restore operation can be performed on a functioning cluster. • An existing index can only be restored if it’s closed. • The restored persistent settings are added to the existing persistent settings.
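To preview what rename_pattern and rename_replacement would do to the index names, the substitution can be approximated locally. This is a sketch only: it assumes the rename behaves like a regular-expression replace with Java-style $1 group references, and both helper names are hypothetical.

```python
import re


def java_to_python_replacement(replacement):
    """Translate Java-style group references ($1) to Python's \\g<1> form."""
    return re.sub(r"\$(\d+)", lambda m: "\\g<" + m.group(1) + ">", replacement)


def preview_restore_names(indices, rename_pattern, rename_replacement):
    """Map each snapshot index name to the name it would get after restore."""
    template = java_to_python_replacement(rename_replacement)
    return {name: re.sub(rename_pattern, template, name) for name in indices}


renamed = preview_restore_names(
    ["index_1", "index_2"],
    rename_pattern="index_(.+)",
    rename_replacement="restored_index_$1",
)
```

With the values from the command above, index_1 maps to restored_index_1 and index_2 to restored_index_2, so the restored copies do not collide with the open originals.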
  • 27. Features: • translating basic Python data types to and from JSON (datetimes are not decoded for performance reasons) • configurable automatic discovery of cluster nodes • persistent connections • load balancing (with pluggable selection strategy) across all available nodes • failed connection penalization (time-based: failed connections won’t be retried until a timeout is reached) • thread safety • pluggable architecture Versioning: • There are two branches, master and 0.4. The master branch tracks all changes for Elasticsearch 1.0 and beyond, whereas 0.4 tracks Elasticsearch 0.90.