JGroups is a middleware for reliable multicast communication in Java. JGroups provides both low level communication primitives, such as message transport and group membership, and high level communication functions, such as synchronous message exchange or distributed mutual exclusion. The architecture of JGroups is configurable to allow tailoring to application requirements.
The low level functions of the communication mechanism, such as group membership and message transport, are provided by channels.
public class JChannel implements Closeable {

    // Initialization accepts configuration options
    public JChannel ();
    public JChannel (String url);
    public JChannel (InputStream stream);

    // Join a group with a given name
    public void connect (String cluster);
    public String clusterName ();
    public void disconnect ();

    // View is the current list of members
    public View getView ();

    // Send a message to all or one group member.
    public void send (Message msg);
    public void send (Address dst, Object obj);
    public void send (Address dst, byte [] buf);
    public void send (Address dst, byte [] buf, int offset, int length);

    // Asynchronous notification about messages and membership is available
    public void setReceiver (Receiver r);
    public Receiver getReceiver ();

    ...
}
The addresses used in the channel methods are internal identifiers typically assigned by the transport protocol modules.
public interface Receiver {

    // Receive individual messages or batches of messages
    default void receive (Message msg) { ... }
    default void receive (MessageBatch batch) { ... }

    // Notification about membership view change
    default void viewAccepted (View new_view) { ... }

    // Notification to temporarily suspend sending messages
    default void block () { ... }
    default void unblock () { ... }

    // Group members can share state
    default void getState (OutputStream output) { ... }
    default void setState (InputStream input) { ... }
}
public interface Message ... {

    short BYTES_MSG = 0,
          NIO_MSG = 1,
          EMPTY_MSG = 2,
          OBJ_MSG = 3,
          LONG_MSG = 4,
          COMPOSITE_MSG = 5,
          FRAG_MSG = 6;

    short getType ();

    Address getDest ();
    Message setDest (Address new_dest);
    Address getSrc ();
    Message setSrc (Address new_src);

    // Headers are internal and interpreted by individual protocol modules
    Message putHeader (short id, Header hdr);
    <T extends Header> T getHeader (short id);
    Map<Short,Header> getHeaders ();

    // Flags are interpreted by individual protocol modules
    // Examples include disabling flow control or reliability
    short getFlags (boolean transient_flags);
    Message setFlag (short flag, boolean transient_flags);

    // Convenience methods on the interface
    // May not make sense for all message classes
    byte [] getArray ();
    int getOffset ();
    int getLength ();
    public Message setBuffer (byte [] b);
    Message setArray (byte [] b, int offset, int length);
    <T extends Object> T getObject ();
    Message setObject (Object obj);
    <T extends Object> T getPayload ();
    Message setPayload (Object pl);

    ...
}

public class BytesMessage ... {
    public BytesMessage (Address dest, byte [] array) { ... }
    public BytesMessage (Address dest, byte [] array, int offset, int length) { ... }
    ...
}

public class NioMessage ... {
    // Uses java.nio.ByteBuffer that can reduce copying overhead
    public NioMessage (Address dest, ByteBuffer buf) { ... }
    public ByteBuffer getBuf () { ... }
    public NioMessage setBuf (ByteBuffer b) { ... }
    ...
}

public class ObjectMessage ... {
    public ObjectMessage (Address dest, Object obj) { ... }
    ...
}

public class CompositeMessage ... implements Iterable<Message> {
    public CompositeMessage (Address dest, Message ... messages) { ... }
    public CompositeMessage add (Message msg) { ... }
    public <T extends Message> T get (int index) { ... }
    public Iterator<Message> iterator () { ... }
    ...
}
Somewhat inaptly named, building blocks use channels to provide high level functions of the communication mechanism, such as synchronous message exchange or group mutual exclusion.
public class MessageDispatcher implements ... {

    // Message dispatcher needs channel for communication and request handler for message delivery.
    public MessageDispatcher (JChannel channel) { ... }
    public MessageDispatcher (JChannel channel, RequestHandler req_handler) { ... }

    // Casting sends to multiple destinations or all members when none specified.
    public <T> RspList<T> castMessage (final Collection<Address> dests, Message msg, RequestOptions opts) { ... }
    public <T> CompletableFuture<RspList<T>> castMessageWithFuture (final Collection<Address> dests, Message msg, RequestOptions opts) { ... }

    // Sending sends to single destination.
    public <T> T sendMessage (Message msg, RequestOptions opts) { ... }
    public <T> CompletableFuture<T> sendMessageWithFuture (Message msg, RequestOptions opts) { ... }

    // Request handler interface if none provided externally.
    @Override public Object handle (Message msg) { ... }
    @Override public void handle (Message request, Response response) { ... }

    ...
}

public class RequestOptions {

    // Can wait for none, one or all responses.
    public ResponseMode getMode () { ... }
    public RequestOptions setMode (ResponseMode mode) { ... }

    // Can specify response filter if response expected.
    public RspFilter getRspFilter () { ... }
    public RequestOptions setRspFilter (RspFilter filter) { ... }

    // Can specify response timeout if response expected.
    public long getTimeout () { ... }
    public RequestOptions setTimeout (long timeout) { ... }

    ...
}

public class RspList<T> extends HashMap<Address,Rsp<T>> implements Iterable<Rsp<T>> {

    public int numReceived () { ... }
    public boolean isReceived (Address sender) { ... }

    // Standard get inherited.
    public T getFirst () { ... }
    public List<T> getResults () { ... }

    // Response is not expected from failed members.
    public int numSuspectedMembers () { ... }
    public List<Address> getSuspectedMembers () { ... }
    public boolean isSuspected (Address sender) { ... }

    ...
}
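The difference between waiting for one response and waiting for all responses, mentioned in the RequestOptions comments above, can be illustrated with plain JDK futures. This is only a sketch of the idea, not the JGroups implementation: the class ResponseCollector and its methods first and all are hypothetical names, and each reply is modeled as a CompletableFuture that completes when the corresponding member answers.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.stream.Collectors;

// Hypothetical helper, not part of JGroups: models two ways of waiting for replies.
final class ResponseCollector {

    // Completes as soon as any single reply arrives (wait-for-one semantics).
    static <T> CompletableFuture<T> first(List<CompletableFuture<T>> replies) {
        CompletableFuture<T> result = new CompletableFuture<>();
        for (CompletableFuture<T> reply : replies) {
            // complete() is a no-op if another reply has already won.
            reply.thenAccept(result::complete);
        }
        return result;
    }

    // Completes only when every reply has arrived (wait-for-all semantics).
    static <T> CompletableFuture<List<T>> all(List<CompletableFuture<T>> replies) {
        return CompletableFuture
            .allOf(replies.toArray(new CompletableFuture<?>[0]))
            .thenApply(ignored -> replies.stream()
                .map(CompletableFuture::join)
                .collect(Collectors.toList()));
    }
}
```

A real dispatcher additionally has to handle timeouts, response filters and suspected members, which this sketch leaves out.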
public class CounterService {

    // Channel stack must include COUNTER protocol.
    public CounterService (JChannel ch) { ... }

    public SyncCounter getOrCreateSyncCounter (String name, long initial_value) { ... }
    public CompletionStage<AsyncCounter> getOrCreateAsyncCounter (String name, long initial_value) { ... }

    public void deleteCounter (String name) { ... }

    ...
}

public interface SyncCounter extends BaseCounter {

    long get ();
    void set (long new_value);

    long addAndGet (long delta);
    long incrementAndGet ();
    long decrementAndGet ();

    long compareAndSwap (long expect, long update);
    boolean compareAndSet (long expect, long update);

    // Useful for complex updates under high contention.
    <T extends Streamable> T update (CounterFunction<T> updateFunction);
}

public interface AsyncCounter extends BaseCounter {

    CompletionStage<Long> get ();
    CompletionStage<Void> set (long new_value);

    ...
}
The cluster coordinator stores and updates counter values.
The cluster coordinator can have backup coordinators.
Each counter value carries a version, which is also sent to clients.
After a coordinator failure, the client value with the latest version is used to recover the counter.
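The recovery rule above can be sketched in a few lines. The sketch assumes that a new coordinator collects the (value, version) pairs remembered by clients and adopts the pair with the highest version; VersionedValue, CounterRecovery and reconcile are hypothetical names used for illustration, not JGroups classes.

```java
import java.util.Collection;

// Hypothetical model of a counter value tagged with a version that grows on every update.
final class VersionedValue {
    final long value;
    final long version;

    VersionedValue(long value, long version) {
        this.value = value;
        this.version = version;
    }
}

// Hypothetical recovery helper, not part of JGroups.
final class CounterRecovery {

    // Returns the reply with the highest version, which the new coordinator would adopt.
    static VersionedValue reconcile(Collection<VersionedValue> clientReplies) {
        VersionedValue latest = null;
        for (VersionedValue reply : clientReplies) {
            if (latest == null || reply.version > latest.version) {
                latest = reply;
            }
        }
        if (latest == null) {
            throw new IllegalArgumentException("no client replies to reconcile");
        }
        return latest;
    }
}
```

The version, rather than the value itself, decides which reply wins, because a counter can legitimately decrease.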
A stack of protocol modules is used to implement various aspects of the reliable multicast communication mechanism.
The transport modules are responsible for transporting messages. The UDP module uses IP multicast to deliver multicast messages and IP unicast to deliver unicast messages. The TCP and TCP_NIO2 modules use a mesh of TCP connections to deliver both multicast and unicast messages, with a thread per connection model and an asynchronous single thread model, respectively. The TUNNEL module tunnels the transport to a specialized router.
UDP: uses IP multicast to deliver multicast messages
TCP: uses mesh of TCP connections, thread per connection model
TCP_NIO2: uses mesh of TCP connections, asynchronous single thread model
TUNNEL: tunnels transport to specialized router
The discovery modules are responsible for locating the group upon initialization. The PING, MPING and BPING modules use IP multicast or IP broadcast over UDP. The TCPPING module attempts to contact members from a given list. The TCPGOSSIP module attempts to contact members using a specialized router. The FILE_PING, JDBC_PING, RACKSPACE_PING, SWIFT_PING and S3_PING modules keep track of members in various places, ranging from shared file systems and shared database tables to cloud storage services. The DNS_PING module relies on A and SRV records in DNS. The PDC module provides a persistent cache of discovered members.
PING: uses IP multicast over existing UDP transport
MPING: uses IP multicast over separate UDP transport
BPING: uses IP broadcast
TCPPING: uses list of member addresses
TCPGOSSIP: uses specialized router
FILE_PING: uses shared directory to keep track of members
JDBC_PING: uses shared database to keep track of members
RACKSPACE_PING: uses Rackspace Cloud File Storage
SWIFT_PING: uses Openstack Swift object storage
S3_PING: uses Amazon Simple Storage Service
DNS_PING: uses A and SRV records in DNS
PDC: caches discovered members
The merge modules are responsible for merging groups during recovery from network partitioning failures. In the MERGE2 module (versions 3.X only), group coordinators periodically multicast presence and membership information; distinct subgroups are merged upon discovery. In the MERGE3 module, all members periodically multicast a hash of their membership information; inconsistent membership information is retrieved and merged upon discovery.
MERGE2: group coordinator multicasts presence and membership view (3.X)
MERGE3: all members multicast presence and membership view
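The MERGE3 idea of comparing hashes of membership information can be sketched as follows. The sketch assumes that each member periodically announces the hash of the view it currently holds, and that more than one distinct hash means the group has split into subgroups. ViewHashMergeDetector and its methods are hypothetical names, and List.hashCode stands in for whatever hash the real protocol uses.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical detector, not part of JGroups: groups members by the hash of their view.
final class ViewHashMergeDetector {

    // Maps each announced view hash to the members that announced it.
    private final Map<Integer, Set<String>> membersByViewHash = new TreeMap<>();

    // Records one periodic announcement from a member about the view it holds.
    void onAnnouncement(String member, List<String> view) {
        membersByViewHash
            .computeIfAbsent(view.hashCode(), hash -> new TreeSet<>())
            .add(member);
    }

    // Number of distinct views seen so far, that is, the number of subgroups.
    int subgroupCount() {
        return membersByViewHash.size();
    }

    // A merge is needed as soon as members disagree about the view.
    boolean mergeNeeded() {
        return membersByViewHash.size() > 1;
    }
}
```

Sending only a hash keeps the periodic announcements small; the full membership information needs to be retrieved only when the hashes differ.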
The failure detection modules are responsible for detecting failed members. The FD module uses periodic ping with acknowledgment between neighboring members in a ring. The FD_ALL and FD_ALL2 modules use multicast heartbeat among all members in a group. The FD_SOCK module uses a ring of TCP sockets, where a closed socket marks the neighbor as suspect. The FD_HOST module (version 4.X only) augments member failure detection with host failure detection through an internal library method. The VERIFY_SUSPECT module provides additional verification of suspect members.
FD: uses periodic ping in logical ring
FD_ALL: uses multicast heartbeat
FD_ALL2: uses multicast heartbeat
FD_SOCK: uses TCP socket ring
FD_HOST: uses internal library method to ping hosts (4.X)
VERIFY_SUSPECT: additionally verifies suspect members
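Heartbeat based failure detection, as used by the FD_ALL family, reduces to remembering when each member was last heard from and suspecting those that stay silent for too long. The sketch below assumes exactly that and nothing more; HeartbeatFailureDetector, onHeartbeat and suspects are hypothetical names, and time is passed in explicitly so that the logic can be exercised without real clocks or sockets.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical detector, not part of JGroups: suspects members whose heartbeat is overdue.
final class HeartbeatFailureDetector {

    private final long timeoutMillis;
    // Time of the most recent heartbeat received from each member.
    private final Map<String, Long> lastHeard = new HashMap<>();

    HeartbeatFailureDetector(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    // Called whenever a heartbeat from the given member arrives.
    void onHeartbeat(String member, long nowMillis) {
        lastHeard.put(member, nowMillis);
    }

    // Returns members that have been silent for longer than the timeout.
    Set<String> suspects(long nowMillis) {
        Set<String> result = new TreeSet<>();
        for (Map.Entry<String, Long> entry : lastHeard.entrySet()) {
            if (nowMillis - entry.getValue() > timeoutMillis) {
                result.add(entry.getKey());
            }
        }
        return result;
    }
}
```

A suspicion raised this way is only a hint, since a slow network looks the same as a crashed member; this is where an additional verification step such as VERIFY_SUSPECT fits in.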
The reliable message transmission modules are responsible for providing reliable ordered message delivery.
uses negative acknowledgments and sequence numbering, old version (3.X)
uses negative acknowledgments and sequence numbering, new version
uses positive acknowledgments and sequence numbering, for unicast messages
uses negative acknowledgments and sequence numbering, for unicast messages (3.X)
uses both positive and negative acknowledgments and sequence numbering, for unicast messages (4.X)
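Reliable ordered delivery with negative acknowledgments and sequence numbering can be sketched from the receiver side. The sketch assumes that the sender numbers messages from 1, and that the receiver buffers out of order messages, delivers them in sequence, and asks for retransmission of the gaps it notices. NakReceiver, receive and missing are hypothetical names; retransmission itself, sender side buffering and garbage collection are left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Hypothetical receiver window, not part of JGroups.
final class NakReceiver {

    // Sequence number of the next message that may be delivered.
    private long nextToDeliver = 1;
    // Messages that arrived ahead of a gap, keyed by sequence number.
    private final TreeMap<Long, String> buffered = new TreeMap<>();

    // Accepts a message and returns the messages that became deliverable, in order.
    List<String> receive(long seqno, String payload) {
        List<String> delivered = new ArrayList<>();
        if (seqno < nextToDeliver) {
            return delivered; // duplicate of an already delivered message
        }
        buffered.putIfAbsent(seqno, payload);
        while (buffered.containsKey(nextToDeliver)) {
            delivered.add(buffered.remove(nextToDeliver));
            nextToDeliver++;
        }
        return delivered;
    }

    // Sequence numbers to request in a negative acknowledgment.
    List<Long> missing() {
        List<Long> gaps = new ArrayList<>();
        if (buffered.isEmpty()) {
            return gaps;
        }
        for (long seqno = nextToDeliver; seqno < buffered.lastKey(); seqno++) {
            if (!buffered.containsKey(seqno)) {
                gaps.add(seqno);
            }
        }
        return gaps;
    }
}
```

A gap becomes visible only when a later message arrives, which is why schemes based purely on negative acknowledgments cannot detect the loss of the last message sent without some additional mechanism.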
Other modules provide functions such as authentication, encryption, compression, fragmentation, flow control, atomic delivery, totally ordered delivery, and others.
rate limiting flow control for unicast
rate limiting flow control for multicast
message fragmentation
message fragmentation (4.X)
atomic delivery in group
helper for shared state transfer
totally ordered delivery through coordinator
bridge between multiple directly reachable clusters
bridge between multiple clusters with routing rules
member authentication
message body encryption
message body compression
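Of the functions listed above, message fragmentation is the simplest to illustrate. The sketch assumes that a large payload is split into fragments of at most a given size on the sending side and concatenated in the original order on the receiving side. Fragmenter, fragment and reassemble are hypothetical names; a real protocol module also has to tag fragments with headers so that they can be matched and ordered on arrival.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of JGroups.
final class Fragmenter {

    // Splits the payload into consecutive fragments of at most fragSize bytes.
    static List<byte[]> fragment(byte[] payload, int fragSize) {
        if (fragSize <= 0) {
            throw new IllegalArgumentException("fragment size must be positive");
        }
        List<byte[]> fragments = new ArrayList<>();
        for (int offset = 0; offset < payload.length; offset += fragSize) {
            int end = Math.min(payload.length, offset + fragSize);
            fragments.add(Arrays.copyOfRange(payload, offset, end));
        }
        return fragments;
    }

    // Concatenates fragments, assumed to be complete and in the original order.
    static byte[] reassemble(List<byte[]> fragments) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (byte[] fragment : fragments) {
            out.write(fragment, 0, fragment.length);
        }
        return out.toByteArray();
    }
}
```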
The JGroups Project Home Page. https://www.jgroups.org