3. Infinispan
• Distributed cache
• Scalable, transactional data grid:
extreme performance and cloud
• NoSQL “DataBase”: key-value store
– How do you query a data grid?
SELECT * FROM GRID
8. A test on my bookshelf
• Where is Hibernate Search in Action?
• Can you pass me ISBN 978-1-933988-17-7?
• Can you fetch the books on Gaudí?
12. How do you implement these functions on a Key/Value store?
• Where is Hibernate Search in Action?
• Can you pass me ISBN 978-1-933988-17-7?
• Can you find the books on Gaudí?
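To make the difficulty concrete, here is a minimal sketch in which a plain `Map` stands in for the key/value store. The class `KeyValueBookshelf`, its `Book` record and the `topic` field are assumptions made for illustration only. Lookup by key is a single access, while any question about another attribute forces a scan of every value.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: a plain Map stands in for the key/value store.
class KeyValueBookshelf {

    // Illustrative record; the field names are assumptions made for this sketch.
    static final class Book {
        final String isbn;
        final String title;
        final String topic;

        Book(String isbn, String title, String topic) {
            this.isbn = isbn;
            this.title = title;
            this.topic = topic;
        }
    }

    private final Map<String, Book> store = new LinkedHashMap<>();

    void add(Book book) {
        store.put(book.isbn, book); // the ISBN is the key
    }

    // Lookup by key: one direct access.
    Book byIsbn(String isbn) {
        return store.get(isbn);
    }

    // Lookup by any other attribute: every stored value has to be visited.
    List<Book> byTopic(String topic) {
        List<Book> hits = new ArrayList<>();
        for (Book book : store.values()) {
            if (book.topic.equals(topic)) {
                hits.add(book);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        KeyValueBookshelf shelf = new KeyValueBookshelf();
        shelf.add(new Book("isbn-a", "Hibernate Search in Action", "search"));
        shelf.add(new Book("isbn-b", "Title B", "gaudi"));
        System.out.println(shelf.byIsbn("isbn-a").title);
        System.out.println(shelf.byTopic("gaudi").size());
    }
}
```

On a distributed grid the scan in `byTopic` would have to run on every node, which is what motivates the Map/Reduce and indexing approaches on the following slides.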
13. Document-based NoSQL: Map/Reduce
Infinispan is not strictly document based, but it
offers Map/Reduce.
Still, nothing rules out storing JSON, XML, YAML, or Java objects:
public class Book implements Serializable {
    final String title;
    final String author;
    final String editor;
    public Book(String title, String author, String editor) {
        this.title = title;
        this.author = author;
        this.editor = editor;
    }
}
14. Iterate & collect
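import java.util.Iterator;
import org.infinispan.distexec.mapreduce.Collector;
import org.infinispan.distexec.mapreduce.Mapper;
import org.infinispan.distexec.mapreduce.Reducer;

// Map phase: emits every book whose title matches exactly.
class TitleBookSearcher implements Mapper<String, Book, String, Book> {
    final String title;
    public TitleBookSearcher(String t) { title = t; }
    public void map(String key, Book value, Collector<String, Book> collector) {
        if ( title.equals( value.title ) )
            collector.emit( key, value );
    }
}

// Reduce phase: each key is emitted at most once, so the first value is the result.
class BookReducer implements Reducer<String, Book> {
    public Book reduce(String reducedKey, Iterator<Book> iter) {
        return iter.next();
    }
}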
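The flow behind the mapper and reducer above can be sketched with the JDK alone. The `Sketch*` interfaces below are local stand-ins that only mimic the shape of the types on the slide; they are not Infinispan's own classes, and the single-JVM `run` method ignores distribution entirely. The example counts titles containing a word, the kind of aggregate the reduce step exists for.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Stand-in interfaces, local to this sketch (assumptions, not Infinispan API).
interface SketchCollector<K, V> { void emit(K key, V value); }
interface SketchMapper<KI, VI, KO, VO> { void map(KI key, VI value, SketchCollector<KO, VO> collector); }
interface SketchReducer<K, V> { V reduce(K key, Iterator<V> values); }

class MapReduceSketch {

    static <KI, VI, KO, VO> Map<KO, VO> run(Map<KI, VI> data,
                                            SketchMapper<KI, VI, KO, VO> mapper,
                                            SketchReducer<KO, VO> reducer) {
        // Map phase: visit every entry and group emitted values by output key.
        final Map<KO, List<VO>> grouped = new HashMap<>();
        SketchCollector<KO, VO> collector =
                (k, v) -> grouped.computeIfAbsent(k, x -> new ArrayList<>()).add(v);
        for (Map.Entry<KI, VI> entry : data.entrySet()) {
            mapper.map(entry.getKey(), entry.getValue(), collector);
        }
        // Reduce phase: collapse each group into a single value.
        Map<KO, VO> result = new HashMap<>();
        for (Map.Entry<KO, List<VO>> entry : grouped.entrySet()) {
            result.put(entry.getKey(), reducer.reduce(entry.getKey(), entry.getValue().iterator()));
        }
        return result;
    }

    // Counts how many titles contain the given word (case-sensitive).
    static Map<String, Integer> countTitlesContaining(Map<String, String> titlesByKey, String word) {
        return MapReduceSketch.<String, String, String, Integer>run(
                titlesByKey,
                (key, title, collector) -> { if (title.contains(word)) collector.emit(word, 1); },
                (key, values) -> { int sum = 0; while (values.hasNext()) sum += values.next(); return sum; });
    }

    public static void main(String[] args) {
        Map<String, String> titles = new HashMap<>();
        titles.put("k1", "Hibernate Search in Action");
        titles.put("k2", "Title B");
        System.out.println(countTitlesContaining(titles, "Search"));
    }
}
```

Note that the map phase still touches every entry: Map/Reduce spreads the scan across nodes but does not avoid it.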
15. Implementing these simple functions:
✔ Find “Hibernate Search in Action”?
✔ Find by code “ISBN 978-1-933988-17-7”?
✗ How many books about “Shakespeare”?
• A correct score in full-text searches requires the
frequencies of text fragments relative to the corpus.
• Pre-tagging is impractical and limiting
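Why corpus-wide frequencies are needed can be seen in a small sketch of the textbook tf-idf weighting. This is an illustration under stated assumptions, not Lucene's actual scoring formula; the class and method names are hypothetical. A term that appears in every document gets weight zero, which cannot be computed by looking at one document in isolation.

```java
import java.util.Arrays;
import java.util.List;

// Textbook tf-idf, for illustration only (not Lucene's scoring formula).
class TfIdfSketch {

    // Term frequency: occurrences of the term divided by the document length.
    static double tf(String term, List<String> doc) {
        if (doc.isEmpty()) {
            return 0.0;
        }
        long occurrences = doc.stream().filter(term::equals).count();
        return (double) occurrences / doc.size();
    }

    // Inverse document frequency: ln(N / df); 0 when the term is absent from the corpus.
    static double idf(String term, List<List<String>> corpus) {
        long df = corpus.stream().filter(doc -> doc.contains(term)).count();
        return df == 0 ? 0.0 : Math.log((double) corpus.size() / df);
    }

    static double score(String term, List<String> doc, List<List<String>> corpus) {
        return tf(term, doc) * idf(term, corpus);
    }

    public static void main(String[] args) {
        List<List<String>> corpus = Arrays.asList(
                Arrays.asList("the", "plays", "of", "shakespeare"),
                Arrays.asList("the", "art", "of", "gaudi"));
        System.out.println(score("the", corpus.get(0), corpus));
        System.out.println(score("shakespeare", corpus.get(0), corpus));
    }
}
```

The `idf` term depends on the whole corpus, so it changes whenever any document is added or removed; this is the bookkeeping an index takes care of.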
16. Apache Lucene
• Apache™ open source project
• Integrated in countless projects
• .. including Hibernate, via Hibernate Search
• Can be clustered via Infinispan
– Performance
– Real time
– High availability
23. What's the catch?
• It needs an index: physical resources and
administration.
– in memory
– on filesystem
– in Infinispan
• Essentially immutable segments
– Optimized for data mining / queries, not for
updates.
• A world of strings and frequency vectors
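The "world of strings" can be pictured as an inverted index: a map from each term to the keys of the documents containing it. The sketch below is a deliberately naive, in-memory illustration (the class name and the tokenization rule are assumptions); a real Lucene index adds analyzers, frequencies, positions and segment files on top of this idea.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Naive in-memory inverted index, for illustration only.
class InvertedIndexSketch {

    // term -> keys of the documents that contain it
    private final Map<String, Set<String>> postings = new HashMap<>();

    void index(String docKey, String text) {
        // Assumed tokenization: lowercase, split on non-word characters.
        for (String token : text.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (token.isEmpty()) {
                continue;
            }
            postings.computeIfAbsent(token, t -> new TreeSet<>()).add(docKey);
        }
    }

    // A term query is a single map lookup instead of a scan over all documents.
    Set<String> search(String term) {
        Set<String> hits = postings.get(term.toLowerCase(Locale.ROOT));
        return hits == null ? Collections.<String>emptySet() : hits;
    }

    public static void main(String[] args) {
        InvertedIndexSketch index = new InvertedIndexSketch();
        index.index("b1", "Hibernate Search in Action");
        index.index("b2", "Title about search");
        System.out.println(index.search("search"));
    }
}
```

The price is visible in `index`: every write has to update the postings of every term in the text, which is why the structure favors queries over updates.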
24. Infinispan Query quickstart
• Enable indexing=true in the
configuration
• Add the infinispan-query.jar module
to the classpath
• Annotate the POJOs stored in the cache
to define how they are indexed
<dependency>
<groupId>org.infinispan</groupId>
<artifactId>infinispan-query</artifactId>
<version>5.1.3.FINAL</version>
</dependency>
25. Configuration through code
Configuration c = new Configuration()
    .fluent()
    .indexing()
    .addProperty(
        "hibernate.search.default.directory_provider",
        "ram")
    .build();
// EmbeddedCacheManager is the non-deprecated interface implemented by DefaultCacheManager
EmbeddedCacheManager manager = new DefaultCacheManager(c);
27. Annotations on the model
@ProvidedId @Indexed
public class Book implements Serializable {
    @Field String title;
    @Field String author;
    @Field String editor;
    public Book(String title, String author, String editor) {
        this.title = title;
        this.author = author;
        this.editor = editor;
    }
}
28. Running queries
SearchManager sm = Search.getSearchManager(cache);
Query query = sm.buildQueryBuilderForClass(Book.class)
    .get()
    .phrase()
    .onField("title")
    .sentence("in action")
    .createQuery();
List<Object> list = sm.getQuery(query).list();
29. Architecture
• Integrates Hibernate Search (engine)
– Listeners for Hibernate events &
transactions
• Infinispan events & transactions
– Maps Java types and model graphs to
Lucene Documents
– Thin-layer design
38. Quickstart Hibernate Search
• Add the hibernate-search dependency:
<dependency>
<groupId>org.hibernate</groupId>
<artifactId>hibernate-search-orm</artifactId>
<version>4.1.0.Final</version>
</dependency>
39. Quickstart Hibernate Search
• Everything else is optional:
– How to manage the indexes
– Extension modules, custom Analyzers
– Performance tuning
– Custom type mapping
– Clustering
• JGroups
• Infinispan
• JMS
40. Quickstart Hibernate Search
@Entity
public class Essay {
    @Id
    public Long getId() { return id; }
    public String getSummary() { return summary; }
    @Lob
    public String getText() { return text; }
    @ManyToOne
    public Author getAuthor() { return author; }
    ...
41. Quickstart Hibernate Search
@Entity @Indexed
public class Essay {
    @Id
    public Long getId() { return id; }
    public String getSummary() { return summary; }
    @Lob
    public String getText() { return text; }
    @ManyToOne
    public Author getAuthor() { return author; }
    ...
42. Quickstart Hibernate Search
@Entity @Indexed
public class Essay {
    @Id
    public Long getId() { return id; }
    @Field
    public String getSummary() { return summary; }
    @Lob
    public String getText() { return text; }
    @ManyToOne
    public Author getAuthor() { return author; }
    ...
43. Quickstart Hibernate Search
@Entity @Indexed
public class Essay {
    @Id
    public Long getId() { return id; }
    @Field
    public String getSummary() { return summary; }
    @Lob @Field @Boost(0.8f)
    public String getText() { return text; }
    @ManyToOne
    public Author getAuthor() { return author; }
    ...
44. Quickstart Hibernate Search
@Entity @Indexed
public class Essay {
    @Id
    public Long getId() { return id; }
    @Field
    public String getSummary() { return summary; }
    @Lob @Field @Boost(0.8f)
    public String getText() { return text; }
    @ManyToOne @IndexedEmbedded
    public Author getAuthor() { return author; }
    ...
45. A second example
@Entity
public class Author {
    @Id @GeneratedValue
    private Integer id;
    private String name;
    @OneToMany
    private Set<Book> books;
}

@Entity
public class Book {
    private Integer id;
    private String title;
}
46. Index structure
@Entity @Indexed
public class Author {
    @Id @GeneratedValue
    private Integer id;
    @Field(store=Store.YES)
    private String name;
    @OneToMany
    @IndexedEmbedded
    private Set<Book> books;
}

@Entity
public class Book {
    private Integer id;
    @Field(store=Store.YES)
    private String title;
}
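One way to picture the resulting index entry: the object graph is flattened into a single document of named fields, with embedded fields prefixed by the property name. The sketch below assumes the default naming (`books.title` for the `title` of each embedded `Book`); the class `IndexDocumentSketch` is hypothetical and only mimics the outcome, it is not how Hibernate Search builds Lucene Documents internally.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative flattening of an Author and its Books into one index document.
class IndexDocumentSketch {

    // Returns field name -> indexed values for a single Author.
    static Map<String, List<String>> toDocument(String authorName, List<String> bookTitles) {
        Map<String, List<String>> document = new LinkedHashMap<>();
        // Field declared directly on Author.
        document.put("name", Collections.singletonList(authorName));
        // Fields of the embedded collection, prefixed with the property name (assumed default).
        document.put("books.title", new ArrayList<>(bookTitles));
        return document;
    }

    public static void main(String[] args) {
        List<String> titles = new ArrayList<>();
        titles.add("Title A");
        titles.add("Title B");
        System.out.println(toDocument("Author A", titles));
    }
}
```

Because the book titles live inside the author's document, a query on `books.title` finds authors directly, without a join.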
56. Tips for optimal performance
• Tune chunk_size to the actual usage
of your index (avoid read locks
by avoiding fragmentation)
• Check the size of network packets:
blob size, JGroups packets,
network interface and hardware.
• Choose and configure a suitable
CacheLoader
57. Memory requirements
• RAMDirectory: the whole index (and more) in RAM.
• FSDirectory: a good OS does an excellent
job of caching IO, often better than
RAMDirectory.
• Infinispan: configurable, up to memory
shared across nodes
– Flexible
– Fast
– Network vs. disk
58. Modules for scalable cloud deployments
One Infinispan to rule them all
– Store Lucene indexes
– Hibernate second level cache
– Application managed cache
– Datagrid
– EJB, session replication in AS7
– As a JPA “store” via Hibernate OGM
59. Ingredients for the cloud
• JGroups DISCOVERY protocol
– MPING
– TCP_PING
– JDBC_PING
– S3_PING
• Choose a CacheLoader
– Database based, JClouds,
Cassandra, ...
60. Near future
• Simplify write scalability
• Auto-tuning of clustering
parameters: ergonomics!
• Parallel searching: multi/core +
multi/node
• A component of
– http://www.cloudtm.eu
63. NoSQL: flexibility comes at a cost
• Programming model
• one per product :-(
• no schema => app driven schema
• query (Map Reduce, specific DSL, ...)
• data structure transpires
• Transaction
• durability / consistency
64. Example: Infinispan
• Distributed Key/Value store
(or Replicated, local only efficient cache,
invalidating cache)
• Each node is equal
– Just start more nodes, or kill some
• No bottlenecks
– by design
• Cloud-network friendly
– JGroups
• And “cloud storage” friendly too!
66. It's a ConcurrentMap!
map.put( "user-34", userInstance );
map.get( "user-34" );
map.remove( "user-34" );
map.putIfAbsent( "user-38", another );
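The semantics of these calls can be tried with the JDK's own `ConcurrentHashMap`, used here as a local stand-in for a cache instance. The class and method names in this sketch are hypothetical; the point is the contract of `putIfAbsent`, where the first writer wins.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// ConcurrentMap contract shown on a plain JDK map (stand-in for a cache).
class ConcurrentMapSemantics {

    // Returns the value stored under the key after two competing putIfAbsent calls.
    static String firstWriterWins(ConcurrentMap<String, String> map,
                                  String key, String first, String second) {
        map.putIfAbsent(key, first);   // key absent: stores 'first' and returns null
        map.putIfAbsent(key, second);  // key present: 'first' stays in place
        return map.get(key);
    }

    public static void main(String[] args) {
        ConcurrentMap<String, String> map = new ConcurrentHashMap<>();
        System.out.println(firstWriterWins(map, "user-38", "alice", "bob"));
        System.out.println(map.remove("user-38")); // remove returns the previous value
    }
}
```

Code written against `ConcurrentMap` does not need to know whether the map is local or spread across a cluster, which is the appeal of exposing the grid through this interface.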
67. A few more details about Infinispan
● Support for Transactions (XA)
● CacheLoaders
● Cassandra, JDBC, Amazon S3 (jclouds),...
● Tree API for JBossCache compatibility
● Lucene integration
● Two-fold
● Some Hibernate integrations
● Second level cache
● Hibernate Search indexing backend
68. Goals of Hibernate OGM
• Encourage new data usage patterns
• Familiar environment
• Ease of use
– easy to jump in
– easy to jump out
• Push NoSQL exploration in enterprises
• “PaaS for existing API” initiative
69. What it is
• JPA front end to key/value stores
• Object CRUD (incl polymorphism and
associations)
• OO queries (JP-QL)
• Reuses
• Hibernate Core
• Hibernate Search (and Lucene)
• Infinispan
• Is not a silver bullet
• not for all NoSQL use cases
70. Entities as serialized blobs?
• Serialize objects into the (key) value
• store the whole graph?
• maintain consistency with duplicated
objects
• guaranteed identity a == b
• concurrency / latency
• structure change and (de)serialization,
class definition changes
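The identity problem listed above is easy to reproduce with plain Java serialization. In this sketch (class names are hypothetical) the same blob is read twice, as would happen when two entities each embed a copy of the same author: the two results are equal in content but are distinct objects, so `a == b` no longer holds.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Round-trips an object through a byte[] "blob" to show the loss of identity.
class BlobIdentitySketch {

    static final class Author implements Serializable {
        private static final long serialVersionUID = 1L;
        final String name;
        Author(String name) { this.name = name; }
    }

    static byte[] toBlob(Serializable value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value);
            }
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    static Author fromBlob(byte[] blob) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(blob))) {
            return (Author) in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] blob = toBlob(new Author("Author A"));
        Author a = fromBlob(blob);
        Author b = fromBlob(blob);
        System.out.println(a == b);                 // identity after two reads
        System.out.println(a.name.equals(b.name));  // content after two reads
    }
}
```

An update applied to `a` is invisible to `b`, which is the consistency problem with duplicated objects named in the bullets.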
71. OGM’s approach to
schema
• Keep what’s best from relational model
• as much as possible
• tables / columns / pks
• Decorrelate object structure from data
structure
• Data stored as (self-described) tuples
• Core types limited
• portability
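As a rough illustration of the tuple idea, the sketch below stores each entity as a map of column names to values under a key built from table name and primary key, with an association kept as a foreign-key column rather than a nested object. The key format, the column names and the class `TupleStoreSketch` are all assumptions for this example; they do not describe OGM's actual storage layout.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Entities as self-described tuples in a key/value grid (illustrative layout only).
class TupleStoreSketch {

    private final Map<String, Map<String, Object>> grid = new HashMap<>();

    // Assumed key format: table name and primary key joined by '#'.
    static String key(String table, Object pk) {
        return table + "#" + pk;
    }

    void save(String table, Object pk, Map<String, Object> columns) {
        grid.put(key(table, pk), new LinkedHashMap<>(columns)); // defensive copy
    }

    Map<String, Object> load(String table, Object pk) {
        return grid.get(key(table, pk));
    }

    public static void main(String[] args) {
        TupleStoreSketch store = new TupleStoreSketch();

        Map<String, Object> author = new LinkedHashMap<>();
        author.put("name", "Author A");
        store.save("Author", 1, author);

        Map<String, Object> book = new LinkedHashMap<>();
        book.put("title", "Title A");
        book.put("author_id", 1); // association stored as a foreign-key column
        store.save("Book", 10, book);

        Object authorId = store.load("Book", 10).get("author_id");
        System.out.println(store.load("Author", authorId).get("name"));
    }
}
```

Since each author exists once and is referenced by id, the duplication and identity issues of whole-graph blobs do not arise, and a class change only affects the columns it touches.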
73. Query
• Hibernate Search indexes entities
• Store Lucene indexes in Infinispan
• JP-QL to Lucene query transformation
• Works for simple queries
• Lucene is not a relational SQL engine