2.8. Hazelcast

Hazelcast is an in-memory data store cluster with replication and distributed processing support.

Hazelcast Architecture

Topologies. 

Embedded

The cluster node runs embedded inside the client application (Java only)

Client Server

Nodes are separate servers with connected clients (Java, Python, C++, C#, ...)

Smart Client

Connects to all server nodes and distributes requests

Single Socket Client

Connects to one server node that mediates requests
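The two topologies correspond to two ways of obtaining a HazelcastInstance. A minimal sketch, assuming the Hazelcast and Hazelcast client libraries are on the classpath (cluster addresses and error handling omitted):

```java
import com.hazelcast.client.HazelcastClient;
import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;

public class Topologies {
    public static void main(String[] args) {
        // Embedded: this JVM itself becomes a cluster member.
        HazelcastInstance member = Hazelcast.newHazelcastInstance();

        // Client-server: connect to an existing cluster as a client.
        // The default is a smart client, which routes each request
        // directly to the member owning the relevant partition.
        HazelcastInstance client = HazelcastClient.newHazelcastClient();

        client.shutdown();
        member.shutdown();
    }
}
```

A single socket client is obtained by disabling smart routing in the client network configuration.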

Partitioning.  Partitioned data structures are split between nodes

  • By default 271 partitions

  • Per instance configurable backup copies

  • Per instance configurable synchronization with backup copies

  • Read of backup copies possible with reduced consistency guarantees

Nodes can form partition groups

  • Used to distribute partitions across failure domains

  • Can be derived from deployment architecture in cloud

Partitioning uses a consistent hashing algorithm

  • Smart clients can communicate with relevant nodes directly

  • Non-partitioned data structures can have their partition specified manually
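The key-to-partition mapping can be illustrated with a short sketch. Hazelcast hashes the serialized form of the key; the sketch below uses hashCode for brevity and the class name is illustrative:

```java
// Sketch of mapping a key to one of the 271 default partitions.
// Every node (and every smart client) computes the same mapping,
// so requests can be sent directly to the partition owner.
public class PartitionSketch {
    static final int PARTITION_COUNT = 271;  // Hazelcast default

    static int partitionId(Object key) {
        // Fold the key hash into the range [0, PARTITION_COUNT).
        return Math.floorMod(key.hashCode(), PARTITION_COUNT);
    }

    public static void main(String[] args) {
        System.out.println(partitionId("user-42"));  // deterministic value in [0, 271)
    }
}
```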

Hazelcast Member Discovery

Hazelcast supports multiple member discovery mechanisms. When cluster membership changes, partitions are migrated accordingly. Standard member discovery mechanisms include multicast, bootstrapping from an existing cluster member, discovery through a shared registry (Active Directory, ZooKeeper, Consul, etcd), and discovery through resource enumeration (Amazon Elastic Compute Cloud, Google Cloud Platform).
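As a configuration sketch, the first two mechanisms can be selected through the join configuration; the member address below is illustrative:

```java
import com.hazelcast.config.Config;
import com.hazelcast.config.JoinConfig;
import com.hazelcast.core.Hazelcast;

public class JoinSetup {
    public static void main(String[] args) {
        Config config = new Config();
        JoinConfig join = config.getNetworkConfig().getJoin();

        // Disable multicast discovery and bootstrap from a known member instead.
        join.getMulticastConfig().setEnabled(false);
        join.getTcpIpConfig().setEnabled(true).addMember("10.0.0.1:5701");

        Hazelcast.newHazelcastInstance(config);
    }
}
```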

Distributed Collections

Map

distributed hash map with optional persistence

Set

distributed hash set with optional persistence

Multi Map

a hash map variant that supports multiple values per key

Replicated Map

a hash map variant that stores all entries everywhere

Queue

distributed blocking queue

List

ordered list stored on one node

Ring Buffer

distributed circular buffer

Event Journal

distributed map update journal

Cardinality Estimator

distributed set cardinality estimator
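A few of the collections in use; a sketch assuming an embedded member and illustrative structure names:

```java
import com.hazelcast.collection.IQueue;
import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;
import com.hazelcast.map.IMap;
import com.hazelcast.multimap.MultiMap;

public class CollectionsDemo {
    public static void main(String[] args) throws InterruptedException {
        HazelcastInstance hz = Hazelcast.newHazelcastInstance();

        // Partitioned map: entries are spread across the cluster.
        IMap<String, Integer> scores = hz.getMap("scores");
        scores.set("alice", 10);

        // Multi map: several values under one key.
        MultiMap<String, String> tags = hz.getMultiMap("tags");
        tags.put("doc-1", "java");
        tags.put("doc-1", "cache");

        // Distributed blocking queue.
        IQueue<String> jobs = hz.getQueue("jobs");
        jobs.put("job-1");
        System.out.println(jobs.take());

        hz.shutdown();
    }
}
```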

IMap Interface

public interface IMap <K,V> extends ConcurrentMap <K,V>, BaseMap<K,V>, Iterable <Map.Entry <K,V>> {

    // Synchronous access methods

    V get (Object key);
    Map <K,V> getAll (Set <K> keys);

    void set (K key, V value);
    void setAll (Map <? extends K, ? extends V> map);

    V put (K key, V value);
    V putIfAbsent (K key, V value);
    void putAll (Map <? extends K, ? extends V> map);

    V replace (K key, V value);
    boolean replace (K key, V oldValue, V newValue);

    V remove (Object key);
    boolean remove (Object key, Object value);
    void removeAll (Predicate <K,V> predicate);

    void delete (Object key);
    void clear ();

    boolean containsKey (Object key);
    boolean containsValue (Object value);

    Iterator <Entry <K,V>> iterator ();
    Iterator <Entry <K,V>> iterator (int fetchSize);

    Set <K> keySet ();
    Set <K> keySet (Predicate <K,V> predicate);
    Set <K> localKeySet ();
    Set <K> localKeySet (Predicate <K,V> predicate);
    Collection <V> values ();
    Collection <V> values (Predicate <K,V> predicate);
    Set <Map.Entry <K,V>> entrySet ();
    Set <Map.Entry <K,V>> entrySet (Predicate <K,V> predicate);

    // Entries can have limited lifetime

    boolean setTtl (K key, long ttl, TimeUnit timeunit);

    void set (K key, V value, long ttl, TimeUnit ttlUnit);
    void set (K key, V value, long ttl, TimeUnit ttlUnit, long maxIdle, TimeUnit maxIdleUnit);

    V put (K key, V value, long ttl, TimeUnit ttlUnit);
    V put (K key, V value, long ttl, TimeUnit ttlUnit, long maxIdle, TimeUnit maxIdleUnit);
    V putIfAbsent (K key, V value, long ttl, TimeUnit ttlUnit);
    V putIfAbsent (K key, V value, long ttl, TimeUnit ttlUnit, long maxIdle, TimeUnit maxIdleUnit);

    // Keys can be locked even if absent

    void lock (K key);
    void lock (K key, long leaseTime, TimeUnit timeUnit);
    boolean tryLock (K key);
    boolean tryLock (K key, long time, TimeUnit timeunit);
    boolean tryLock (K key, long time, TimeUnit timeunit, long leaseTime, TimeUnit leaseTimeunit);

    void unlock (K key);
    void forceUnlock (K key);

    boolean isLocked (K key);

    boolean tryPut (K key, V value, long timeout, TimeUnit timeunit);
    boolean tryRemove (K key, long timeout, TimeUnit timeunit);

    // Asynchronous access methods

    CompletionStage <V> getAsync (K key);

    CompletionStage <Void> setAsync (K key, V value);
    CompletionStage <Void> setAsync (K key, V value, long ttl, TimeUnit ttlUnit);
    CompletionStage <Void> setAsync (K key, V value, long ttl, TimeUnit ttlUnit, long maxIdle, TimeUnit maxIdleUnit);
    CompletionStage <Void> setAllAsync (Map <? extends K, ? extends V> map);

    CompletionStage <V> putAsync (K key, V value);
    CompletionStage <V> putAsync (K key, V value, long ttl, TimeUnit ttlUnit);
    CompletionStage <V> putAsync (K key, V value, long ttl, TimeUnit ttlUnit, long maxIdle, TimeUnit maxIdleUnit);
    CompletionStage <Void> putAllAsync (Map <? extends K, ? extends V> map);

    CompletionStage <V> removeAsync (K key);

    // Non transient entries can have backing store

    void loadAll(boolean replaceExistingValues);
    void loadAll(Set<K> keys, boolean replaceExistingValues);

    boolean evict (K key);
    void evictAll ();
    void flush();

    void putTransient (K key, V value, long ttl, TimeUnit ttlUnit);
    void putTransient (K key, V value, long ttl, TimeUnit ttlUnit, long maxIdle, TimeUnit maxIdleUnit);

    // Local and global state change listeners with predicate filtering are supported

    UUID addLocalEntryListener (MapListener listener);
    UUID addLocalEntryListener (MapListener listener, Predicate <K,V> predicate, boolean includeValue);
    UUID addLocalEntryListener (MapListener listener, Predicate <K,V> predicate, K key, boolean includeValue);

    UUID addEntryListener (MapListener listener, boolean includeValue);
    UUID addEntryListener (MapListener listener, K key, boolean includeValue);
    UUID addEntryListener (MapListener listener, Predicate <K,V> predicate, boolean includeValue);
    UUID addEntryListener (MapListener listener, Predicate <K,V> predicate, K key, boolean includeValue);

    boolean removeEntryListener (UUID id);

    UUID addPartitionLostListener (MapPartitionLostListener listener);
    boolean removePartitionLostListener (UUID id);

    // Interceptors can modify or cancel operations

    String addInterceptor (MapInterceptor interceptor);
    boolean removeInterceptor (String id);

    // Entry view provides entry access statistics

    EntryView <K,V> getEntryView (K key);
    LocalMapStats getLocalMapStats ();

    // Distributed processing

    <R> R executeOnKey (K key, EntryProcessor <K,V,R> entryProcessor);
    <R> Map <K,R> executeOnKeys (Set<K> keys, EntryProcessor <K,V,R> entryProcessor);
    <R> Map <K,R> executeOnEntries (EntryProcessor <K,V,R> entryProcessor);
    <R> Map <K,R> executeOnEntries (EntryProcessor <K,V,R> entryProcessor, Predicate <K,V> predicate);

    <R> Collection <R> project (Projection <? super Map.Entry<K,V>,R> projection);
    <R> Collection <R> project (Projection <? super Map.Entry<K,V>,R> projection, Predicate <K,V> predicate);

    <R> R aggregate (Aggregator <? super Map.Entry <K,V>,R> aggregator);
    <R> R aggregate (Aggregator <? super Map.Entry <K,V>,R> aggregator, Predicate <K,V> predicate);

    <R> CompletionStage <R> submitToKey (K key, EntryProcessor <K,V,R> entryProcessor);
    <R> CompletionStage <Map<K,R>> submitToKeys (Set<K> keys, EntryProcessor <K,V,R> entryProcessor);

    // Cache for continuous queries defined by predicates

    QueryCache <K,V> getQueryCache (String name);
    QueryCache <K,V> getQueryCache (String name, Predicate <K,V> predicate, boolean includeValue);
    QueryCache <K,V> getQueryCache (String name, MapListener listener, Predicate <K,V> predicate, boolean includeValue);

    ...
}
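A few of the interface methods above in use; a sketch assuming an embedded member and an illustrative map name:

```java
import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;
import com.hazelcast.map.IMap;
import java.util.concurrent.TimeUnit;

public class MapUsage {
    public static void main(String[] args) {
        HazelcastInstance hz = Hazelcast.newHazelcastInstance();
        IMap<String, Long> visits = hz.getMap("visits");

        // Entry with a 10 minute time-to-live.
        visits.put("session-1", 1L, 10, TimeUnit.MINUTES);

        // Pessimistic per-key locking; the key need not exist yet.
        visits.lock("session-1");
        try {
            visits.set("session-1", visits.get("session-1") + 1);
        } finally {
            visits.unlock("session-1");
        }

        // An entry processor runs on the partition owner, avoiding the
        // read-modify-write round trip above (in a real cluster the
        // processor must be serializable, which a plain lambda is not).
        visits.executeOnKey("session-1", entry -> {
            entry.setValue(entry.getValue() + 1);
            return null;
        });

        hz.shutdown();
    }
}
```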

Hazelcast Map SQL Query

Primitive entries in a map are accessible as an SQL table with a key column and a value column. Object entries in a map are accessible as an SQL table with a key column and one value column per object field. Field access via SQL restricts the permitted value serializers and does not support nested objects.
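A query sketch, assuming a map of primitive values named scores (illustrative) for which an SQL mapping has been declared:

```java
import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;
import com.hazelcast.sql.SqlResult;
import com.hazelcast.sql.SqlRow;

public class MapQuery {
    public static void main(String[] args) {
        HazelcastInstance hz = Hazelcast.newHazelcastInstance();

        // Declare the map as an SQL table; primitive entries surface
        // as the __key and this columns.
        hz.getSql().execute(
            "CREATE MAPPING scores TYPE IMap " +
            "OPTIONS ('keyFormat'='varchar', 'valueFormat'='int')");

        try (SqlResult result = hz.getSql().execute(
                "SELECT __key, this FROM scores WHERE this > 5")) {
            for (SqlRow row : result) {
                System.out.println(row.getObject("__key") + " -> " + row.getObject("this"));
            }
        }

        hz.shutdown();
    }
}
```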

Distributed Communication

Topic

publish subscribe messaging pattern implementation

  • configurable for sender or total ordering

  • totally ordered sends through topic owner

Reliable Topic

publish subscribe with backup ring buffer

  • configurable slow consumer handling

    • DISCARD_OLDEST or DISCARD_NEWEST

    • BLOCK

    • ERROR
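A reliable topic sketch with an explicit slow consumer policy; the topic name is illustrative:

```java
import com.hazelcast.config.Config;
import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;
import com.hazelcast.topic.ITopic;
import com.hazelcast.topic.TopicOverloadPolicy;

public class TopicUsage {
    public static void main(String[] args) {
        Config config = new Config();
        // When the backing ring buffer is full, overwrite the oldest item
        // rather than blocking or failing the publisher.
        config.getReliableTopicConfig("events")
              .setTopicOverloadPolicy(TopicOverloadPolicy.DISCARD_OLDEST);

        HazelcastInstance hz = Hazelcast.newHazelcastInstance(config);
        ITopic<String> topic = hz.getReliableTopic("events");
        topic.addMessageListener(msg -> System.out.println(msg.getMessageObject()));
        topic.publish("hello");

        hz.shutdown();
    }
}
```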

Distributed Coordination

Lock

distributed reentrant non-fair lock

Semaphore

distributed semaphore

Atomic Long

distributed counter

Atomic Reference

distributed atomic storage for a single object (stores a copy, not a true reference)

Positive Negative Counter

distributed counter with relaxed consistency

Countdown Latch

distributed counter with wait for zero support

ID Generator

cluster wide unique identifier generator (for long integers)

  • embeds node identity to avoid communication

  • provides rough time ordering (k-ordering)
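The structure of such identifiers can be sketched as a bit layout in the spirit of Hazelcast's FlakeIdGenerator (the real bit allocation differs; class and method names are illustrative):

```java
// K-ordered ID sketch: the timestamp occupies the high bits, so IDs from
// later moments compare greater; the node identity in the low bits lets
// each member generate IDs without coordinating with the others.
public class FlakeSketch {
    static long makeId(long timestampMillis, int nodeId, int sequence) {
        return (timestampMillis << 22)          // time: rough ordering
             | ((long) (nodeId & 0xFFF) << 10)  // 12-bit node identity
             | (sequence & 0x3FF);              // 10-bit per-node sequence
    }
}
```

In the actual API, a generator obtained via getFlakeIdGenerator returns such identifiers from newId().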

Hazelcast Positive Negative Counter

The counter is a conflict-free replicated data type (CRDT) represented by an array of positive and negative update totals aggregated per node. Each node can compute the current counter value by summing the array items. Updates to the array do not conflict and therefore do not need immediate synchronization.
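The mechanism can be sketched with two per-node tally maps; the class name is illustrative, not the Hazelcast implementation:

```java
import java.util.HashMap;
import java.util.Map;

// PN-counter sketch: each node only ever grows its own increment and
// decrement totals, and merging replicas takes the per-node maximum,
// so concurrent updates on different nodes never conflict.
public class PnCounterSketch {
    final Map<String, Long> p = new HashMap<>();  // increments per node
    final Map<String, Long> n = new HashMap<>();  // decrements per node

    void increment(String node) { p.merge(node, 1L, Long::sum); }
    void decrement(String node) { n.merge(node, 1L, Long::sum); }

    // Current value: sum of all increments minus sum of all decrements.
    long value() {
        long sum = 0;
        for (long v : p.values()) sum += v;
        for (long v : n.values()) sum -= v;
        return sum;
    }

    // Merge another replica's state; per-node maxima make this idempotent.
    void mergeFrom(PnCounterSketch other) {
        other.p.forEach((k, v) -> p.merge(k, v, Math::max));
        other.n.forEach((k, v) -> n.merge(k, v, Math::max));
    }
}
```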