What Is DSM?

DSM is an open-source Distributed Shared Memory toolkit for enterprise Java services. It gives service teams a runtime-first way to replicate control-plane data across nodes without adopting a general-purpose database platform inside every service.

Runtime Picture

application code
        |
        | put(...) / acquire(...) / update(...)
        v
+---------------------------------------------------+
| collection handle                                 |
| DsmRegister<E> / DsmLeaseRegister<E> / DsmCrdt... |
+---------------------------------------------------+
        |
        v
+---------------------------------------------------+
| DsmRuntime                                        |
| - validates collection locator                    |
| - stamps EntityMetadata                           |
| - records local state                             |
| - exposes RuntimeDiagnostics                      |
+---------------------------------------------------+
        |
        +-----------------------------+
        |                             |
        v                             v
+--------------------------+   +--------------------------+
| RuntimePlatformSync...   |   | RuntimeDataPlaneRep...   |
| steady-state delta flow  |   | digest/snapshot/replay   |
+--------------------------+   +--------------------------+
        |                             |
        +--------------+--------------+
                       |
                       v
    peer nodes in same clusterId/serviceId

What DSM Solves

DSM is designed for data that has to be shared and coordinated across a service family:

  • route hints and service-local discovery metadata
  • shard or partition ownership
  • low-volume replicated configuration
  • convergent counters and mergeable state

The core idea is simple: your application code talks to collection handles, DsmRuntime stamps and records the change, and the sync layer replicates those deltas across the cluster.
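
To make "convergent counters and mergeable state" concrete, here is a minimal sketch of a grow-only counter (G-Counter), a standard CRDT. It is not DSM's CRDT implementation: the class name GCounter and its methods are assumptions chosen for illustration, and the API of the DSM CRDT handles is not shown on this page. Each node increments only its own slot, and replicas merge by taking the per-node maximum, so merges can arrive in any order or more than once and still converge.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative grow-only counter; a hypothetical class, not part of DSM. */
final class GCounter {
    // One monotonically increasing slot per node.
    private final Map<String, Long> slots = new HashMap<>();

    /** Adds to the slot owned by nodeId; a node should only increment its own slot. */
    void increment(String nodeId, long amount) {
        if (amount < 0) {
            throw new IllegalArgumentException("a grow-only counter cannot decrease");
        }
        slots.merge(nodeId, amount, Long::sum);
    }

    /** Merges another replica by keeping the larger value for every node. */
    void merge(GCounter other) {
        other.slots.forEach((node, count) -> slots.merge(node, count, Long::max));
    }

    /** The counter value is the sum of all per-node slots. */
    long value() {
        return slots.values().stream().mapToLong(Long::longValue).sum();
    }

    public static void main(String[] args) {
        GCounter a = new GCounter();
        GCounter b = new GCounter();
        a.increment("worker-a", 3);
        b.increment("worker-b", 2);
        a.merge(b);
        b.merge(a);
        b.merge(a); // merging the same state again changes nothing
        System.out.println(a.value() + " " + b.value()); // prints "5 5"
    }
}
```

Taking the maximum rather than adding during merge is what makes repeated delivery harmless, which matters when both a steady-state delta flow and a repair path can deliver the same change.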

A Concrete Example

The standalone example in dsm-examples starts one runtime with three collections:

  • register: shared/gateway/route-hints
  • CRDT: shared/worker/request-counter
  • lease: shared/worker/shard-owner

That gives you three different kinds of shared state in one process:

shared/gateway/route-hints
	edge-eu-west-1 -> 10.1.0.8:8443

shared/worker/shard-owner
	shard-17 -> owned by worker-a, fencing token 42

shared/worker/request-counter
	merged cluster counter across all workers
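
The fencing token in the shard-owner entry is worth a short illustration. The sketch below shows the general fencing-token technique, not DSM's lease implementation: LeaseTable, FencedResource, and all their methods are hypothetical names, and lease expiry is left out. Every grant carries a strictly larger token, and a downstream resource refuses writes that carry a token older than one it has already seen.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical lease table; illustrative only, not DSM's DsmLeaseRegister. */
final class LeaseTable {
    record Lease(String resource, String owner, long fencingToken) {}

    private final Map<String, Lease> held = new HashMap<>();
    private long nextToken = 1;

    /** Grants the lease if nobody holds it; every grant gets a strictly larger token. */
    Optional<Lease> acquire(String resource, String owner) {
        if (held.containsKey(resource)) {
            return Optional.empty();
        }
        Lease lease = new Lease(resource, owner, nextToken++);
        held.put(resource, lease);
        return Optional.of(lease);
    }

    /** Releases the lease; a real system would also expire it after a timeout. */
    void release(Lease lease) {
        held.remove(lease.resource(), lease);
    }

    public static void main(String[] args) {
        LeaseTable table = new LeaseTable();
        FencedResource shard = new FencedResource();
        Lease first = table.acquire("shard-17", "worker-a").orElseThrow();
        table.release(first);
        Lease second = table.acquire("shard-17", "worker-b").orElseThrow();
        System.out.println(shard.write(second.fencingToken(), "from worker-b")); // true
        System.out.println(shard.write(first.fencingToken(), "from worker-a"));  // false
    }
}

/** A downstream resource that rejects writes carrying an older token than it has seen. */
final class FencedResource {
    private long highestTokenSeen = 0;
    private String value;

    boolean write(long fencingToken, String newValue) {
        if (fencingToken < highestTokenSeen) {
            return false; // the caller is a stale owner
        }
        highestTokenSeen = fencingToken;
        value = newValue;
        return true;
    }

    String value() {
        return value;
    }
}
```

The token lets the protected resource, rather than the former owner, decide whether a write is still valid, so a worker that has lost its lease without noticing cannot overwrite the new owner's work.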

The collection locator is always three-part:

tenantId/applicationId/collectionId

shared/gateway/route-hints
shared/worker/shard-owner
shared/worker/request-counter
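
As a rough illustration of the three-part rule, the sketch below parses and validates a locator string. CollectionLocator is a hypothetical record written for this page, not a DSM class, and the validation rules (no empty parts, no extra slashes) are assumptions about what a reasonable check would look like.

```java
/** Hypothetical value type for a three-part locator; not part of DSM. */
record CollectionLocator(String tenantId, String applicationId, String collectionId) {

    // Compact constructor: every part must be present and free of separators.
    CollectionLocator {
        requirePart("tenantId", tenantId);
        requirePart("applicationId", applicationId);
        requirePart("collectionId", collectionId);
    }

    /** Parses "tenantId/applicationId/collectionId" and rejects anything else. */
    static CollectionLocator parse(String text) {
        // The -1 limit keeps trailing empty parts, so "a/b/" is rejected instead of truncated.
        String[] parts = text.split("/", -1);
        if (parts.length != 3) {
            throw new IllegalArgumentException(
                    "expected tenantId/applicationId/collectionId but got: " + text);
        }
        return new CollectionLocator(parts[0], parts[1], parts[2]);
    }

    private static void requirePart(String name, String value) {
        if (value == null || value.isBlank() || value.contains("/")) {
            throw new IllegalArgumentException(name + " must be non-blank and contain no '/'");
        }
    }

    @Override
    public String toString() {
        return tenantId + "/" + applicationId + "/" + collectionId;
    }

    public static void main(String[] args) {
        CollectionLocator locator = CollectionLocator.parse("shared/gateway/route-hints");
        System.out.println(locator.applicationId()); // prints "gateway"
    }
}
```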

Why Teams Use It

  • It keeps coordination inside the service that owns the workflow.
  • It exposes explicit runtime boundaries: clusterId, serviceId, tenantId, applicationId, and collectionId.
  • It supports three different coordination models in one API surface: registers, leases, and CRDTs.
  • It ships with security, diagnostics, and Spring Boot support instead of leaving those as integration debt.

What DSM Is Not

DSM is not a general-purpose OLTP store, analytics engine, or document database. It is optimized for replicated metadata and coordination state where deterministic runtime behavior matters more than bulk storage throughput.

Use a database when you need rich querying, large document payloads, or system-of-record guarantees.

Use DSM when you need fast shared coordination state inside a service family.

Core Runtime Flow

  1. Application code mutates a collection handle.
  2. DsmRuntime stamps entity metadata and records the change locally.
  3. RuntimePlatformSyncService replicates register, lease, and CRDT deltas through platform envelopes.
  4. RuntimeDataPlaneReplicationService repairs lagging peers using digest, snapshot, and replay flows.

In practice, this means a register put(...) does not talk directly to every peer. The local runtime commits the update first, then the sync layer transports the delta, and repair catches up any peer that missed it.
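
The commit-locally-then-replicate order can be sketched in a few lines. This is a single-writer toy model under stated assumptions, not DSM code: LocalFirstRegister, Delta, flushTo, and repairFrom are hypothetical names, the version number stands in for the stamped entity metadata, and the repair step is reduced to applying a full snapshot.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

/** Hypothetical local-first register; all names are illustrative, not DSM API. */
final class LocalFirstRegister {
    /** A replicated change: key, value, and the version stamped by the writer. */
    record Delta(String key, String value, long version) {}

    private final Map<String, Delta> state = new HashMap<>();
    private final Queue<Delta> outbox = new ArrayDeque<>();
    private long nextVersion = 1;

    /** Steps 1 and 2: stamp the change, commit it locally, then queue it for transport. */
    void put(String key, String value) {
        Delta delta = new Delta(key, value, nextVersion++);
        state.put(key, delta);
        outbox.add(delta);
    }

    /** Step 3: hand queued deltas to a peer; returns how many were sent. */
    int flushTo(LocalFirstRegister peer) {
        int sent = 0;
        Delta delta;
        while ((delta = outbox.poll()) != null) {
            peer.apply(delta);
            sent++;
        }
        return sent;
    }

    /** Applies a remote delta only if it is newer than what this node already holds. */
    void apply(Delta delta) {
        state.merge(delta.key(), delta,
                (mine, theirs) -> theirs.version() > mine.version() ? theirs : mine);
    }

    /** Step 4, simplified: a lagging node applies a full snapshot from a healthy one. */
    void repairFrom(LocalFirstRegister source) {
        source.state.values().forEach(this::apply);
    }

    String get(String key) {
        Delta delta = state.get(key);
        return delta == null ? null : delta.value();
    }

    public static void main(String[] args) {
        LocalFirstRegister writer = new LocalFirstRegister();
        LocalFirstRegister peer = new LocalFirstRegister();
        writer.put("edge-eu-west-1", "10.1.0.8:8443");
        System.out.println(peer.get("edge-eu-west-1")); // null: not replicated yet
        writer.flushTo(peer);
        System.out.println(peer.get("edge-eu-west-1")); // 10.1.0.8:8443
    }
}
```

Because apply ignores anything that is not newer, a peer can receive the same change through both the delta path and the repair path without harm.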

What The Identifiers Mean

DSM has several identifiers. They are not interchangeable.

clusterId     = which DSM cluster is this?
serviceId     = which service family is allowed to join it?
tenantId      = which tenant or high-level owner does the data belong to?
applicationId = which application domain owns the collection?
collectionId  = which exact collection is being replicated?

Example:

clusterId     = prod-eu-west-cluster
serviceId     = gateway-service
tenantId      = shared
applicationId = gateway
collectionId  = route-hints

Nodes whose clusterId or serviceId values do not match should not share runtime traffic. If the collection locator is wrong, peers may reject or misroute replicated data.
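
One way to picture the cluster and service boundary is as an exact-match check on both identifiers before any traffic is accepted. The sketch below is an assumption about the shape of such a check, not DSM's admission logic: RuntimeBoundary and admits are hypothetical names, and the staging cluster name is invented for the example.

```java
/** Hypothetical pair of isolation identifiers; not a DSM type. */
record RuntimeBoundary(String clusterId, String serviceId) {

    /** A peer is admitted only when both identifiers match exactly. */
    boolean admits(RuntimeBoundary peer) {
        return clusterId.equals(peer.clusterId()) && serviceId.equals(peer.serviceId());
    }

    public static void main(String[] args) {
        RuntimeBoundary local = new RuntimeBoundary("prod-eu-west-cluster", "gateway-service");
        RuntimeBoundary sameFamily = new RuntimeBoundary("prod-eu-west-cluster", "gateway-service");
        // "staging-eu-west-cluster" is a made-up value for illustration.
        RuntimeBoundary staging = new RuntimeBoundary("staging-eu-west-cluster", "gateway-service");
        System.out.println(local.admits(sameFamily)); // true
        System.out.println(local.admits(staging));    // false
    }
}
```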

Deployment Boundary Rules

clusterId and serviceId are required isolation boundaries. Keep them stable for a deployed DSM cluster and service family. In practice:

  • clusterId separates one DSM cluster from another.
  • serviceId ensures only related services participate in the same runtime fabric.

Recommended rule: use different clusterId values for local, staging, and production, and keep serviceId stable per deployed service family.

Where To Go Next