After eight major versions and ten years of accumulating features, compatibility shims, and Redis keys, I needed to stop and ask: what does a lock actually need?
The answer turned out to be two Redis keys.
v9 is a ground-up rewrite. I deleted ~5,700 lines of code, reduced Redis keys per lock from 13 to 2, and shipped built-in metrics so you can finally see your locks working. Here's how I got there and what it means for your applications.
The problem with v8
Every lock in v8 created up to 13 Redis keys: QUEUED lists, PRIMED lists, INFO hashes, LOCKED hashes, RUN suffixes, changelog entries, expiring digest sets, and more. Each key had its own lifecycle, its own cleanup logic, and its own failure mode.
This complexity had real costs:
- Performance: Lock acquisition required multiple Lua scripts touching many keys. Under contention, throughput suffered.
- Memory: 13 keys per lock meant significant Redis memory overhead for applications with millions of unique digests.
- Reliability: More moving parts meant more ways for locks to become orphaned. The reaper system grew to 615 lines across 6 files just to clean up after itself.
- Debugging: When a lock got stuck, figuring out why meant inspecting a dozen Redis keys and understanding which Lua script was responsible for each one.
I maintained backward compatibility through all of this. Every edge case got its own workaround. The codebase reflected a decade of "just add one more key."
v9 asked: what if I didn't?
Two keys. That's it.
```
uniquejobs::LOCKED   # Hash — who holds the lock (JID → metadata)
uniquejobs:digests   # ZSet — global index of all active digests
```
The LOCKED hash tells you everything: which job IDs hold the lock, when they acquired it, what worker class and queue they belong to, and what lock type is active. The digests sorted set is a global index that the reaper uses to find orphaned locks.
That's the entire data model.
How locking works
The lock Lua script is 38 lines:
```lua
-- Already locked by this job? Idempotent return.
if redis.call("HEXISTS", locked, job_id) == 1 then
  return job_id
end

-- At capacity? Deny.
if redis.call("HLEN", locked) >= limit then
  return nil
end

-- Acquire: store metadata, register in global index.
redis.call("HSET", locked, job_id, metadata)
redis.call("ZADD", digests, score, digest)

-- Apply TTL if configured.
if pttl and pttl > 0 then
  redis.call("PEXPIRE", locked, pttl)
end

return job_id
```
Two Redis commands for a lock. One Lua script. No intermediate states, no QUEUED-to-PRIMED transitions, no race windows between steps. The lock either exists in the hash or it doesn't.
Unlocking is similarly minimal: HDEL the job from the hash, and if the hash is empty, ZREM the digest and UNLINK the key. Zero keys remain after the last holder releases.
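The lock/unlock flow above can be sketched as an in-memory Ruby model. Plain hashes stand in for the Redis structures, and the class and method names are illustrative, not the gem's actual API:

```ruby
# In-memory sketch of the v9 lock/unlock flow. @locked mirrors the LOCKED
# hash, @digests mirrors the digests sorted set. Illustrative only.
class LockSketch
  def initialize(limit: 1)
    @limit   = limit
    @locked  = {} # LOCKED hash: job_id => metadata
    @digests = {} # digests zset: digest => score
  end

  # Mirrors the Lua script: idempotent re-entry, capacity check, acquire.
  def lock(digest, job_id, metadata = {})
    return job_id if @locked.key?(job_id) # HEXISTS: already held by this job
    return nil if @locked.size >= @limit  # HLEN >= limit: deny
    @locked[job_id] = metadata            # HSET: store holder metadata
    @digests[digest] = Time.now.to_f      # ZADD: register in the global index
    job_id
  end

  # HDEL the holder; when the hash empties, ZREM the digest too.
  def unlock(digest, job_id)
    @locked.delete(job_id)
    @digests.delete(digest) if @locked.empty?
  end

  def digest_count
    @digests.size
  end
end
```

The sketch preserves the two properties the post calls out: acquiring is idempotent for the same job ID, and releasing the last holder leaves nothing behind.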
Why this matters
In v8, a crashed process could leave behind orphaned QUEUED, PRIMED, INFO, and RUN keys that the reaper might miss. In v9, a crashed process leaves behind exactly one thing: a LOCKED hash entry. The reaper scans the digests ZSET, checks if the LOCKED hash still exists, and removes stale entries. It's 27 lines of Lua.
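That reaper pass can be sketched in plain Ruby. Hashes stand in for the Redis structures, and the function name is mine:

```ruby
# Sketch of the v9 reaper: walk the digests index and drop every entry
# whose LOCKED hash no longer exists. The real thing is one Lua script.
def reap(digests, locked_hashes)
  orphans = digests.keys.reject { |digest| locked_hashes.key?(digest) }
  orphans.each { |digest| digests.delete(digest) } # ZREM stale entries
  orphans.size
end
```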
Performance
I benchmarked v8.0.12, v8.1.0, and v9.0.0 on the same hardware:
| Benchmark | v8.0.12 | v8.1.0 | v9.0.0 | Improvement |
|---|---|---|---|---|
| Lock + Unlock | 6,520 i/s | 6,855 i/s | 8,684 i/s | +33% |
| Lock + Execute | 8,445 i/s | -- | 14,406 i/s | +71% |
| No contention | 1,805 i/s | ~6,855 i/s | 8,019 i/s | +344% |
| Under contention | 2,920 i/s | ~5,400 i/s | 5,461 i/s | +87% |
| 500 jobs / 20 threads | 3.18 i/s | 9.54 i/s | 9.52 i/s | +199% |
Memory usage dropped significantly:
| Scenario | v8.0.12 | v9.0.0 | Reduction |
|---|---|---|---|
| 100 lock/unlock cycles | ~2.4 MB | 1.78 MB | -25% |
| 100 execute cycles | ~2.8 MB | 1.33 MB | -52% |
The no-contention case improved by 344% because v8 was doing unnecessary work checking QUEUED and PRIMED lists on every lock attempt. v9 does a single script round trip: HEXISTS + HLEN + HSET, plus a ZADD for the digest index.
Lock metrics: seeing your locks in action
v8 had no built-in way to answer "how many locks were acquired this hour?" or "which lock type has the most denials?" You could configure the reflection system and build your own tracking, but most people didn't.
v9 includes built-in metrics with zero configuration. Every lock acquisition, denial, release, and failure is recorded directly to Redis:
```
uniquejobs:metrics|260328|16:42   # Hash — counters per lock type per minute
```
The Sidekiq Web UI "Locks" tab now shows a metrics table:
| Type | Acquired | Denied | Released | Failures |
|---|---|---|---|---|
| until_executed | 1,204 | 87 | 1,191 | 0 |
| while_executing | 523 | 0 | 523 | 2 |
| until_and_while_executing | 312 | 14 | 640 | 0 |
| total | 2,039 | 101 | 2,354 | 2 |
These numbers tell you things that were previously invisible:
- Acquired vs Denied reveals your conflict rate. A high denial rate on `until_executed` might mean your jobs are too slow or your lock TTL is too long.
- Released > Acquired for `until_and_while_executing` is normal: this lock type acquires once on the client but releases twice (the "until" lock on the server, then the "while executing" runtime lock after the job completes).
- Failures > 0 means something went wrong during execution. Check your error tracker.
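As a worked example of reading these counters, here is a tiny helper (illustrative, not part of the gem) that computes denials as a share of all acquisition attempts:

```ruby
# Conflict rate: what fraction of lock attempts were denied, as a percent.
# An attempt is either acquired or denied, so attempts = acquired + denied.
def conflict_rate(acquired, denied)
  attempts = acquired + denied
  return 0.0 if attempts.zero?
  (denied.to_f / attempts * 100).round(1)
end
```

For the `until_executed` row above (1,204 acquired, 87 denied) this works out to roughly a 6.7% conflict rate.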
Cross-process recording
Getting metrics right was trickier than I expected. My first implementation used the reflection system: lock events fired reflections, and a server-side listener accumulated counters in memory, flushing to Redis every 60 seconds.
The problem: reflections only have listeners in the Sidekiq server process. When a job is enqueued from a Rails web process, the lock is acquired there — where nobody is listening. The until_executing lock type showed 0 Acquired despite working correctly because all its locks are acquired client-side.
The fix was direct Redis recording via HINCRBY from the locksmith itself, bypassing the reflection pipeline entirely for metrics. Now every lock acquisition and release writes to Redis immediately, regardless of which process it happens in.
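A minimal sketch of that direct-recording approach, with a Ruby hash standing in for Redis and HINCRBY modeled as a counter increment. The class name and field layout are mine, not the gem's:

```ruby
# Per-minute metrics recording, modeled on HINCRBY against a
# "uniquejobs:metrics|<yymmdd>|<HH:MM>" hash key. Works the same no matter
# which process (web or Sidekiq server) calls it, because it writes to the
# shared store immediately instead of buffering in listener memory.
class MetricsSketch
  def initialize(store = {})
    @store = store # stands in for Redis
  end

  def record(lock_type, event, at: Time.now)
    key   = format("uniquejobs:metrics|%s|%s",
                   at.strftime("%y%m%d"), at.strftime("%H:%M"))
    field = "#{lock_type}:#{event}"
    hash  = (@store[key] ||= Hash.new(0))
    hash[field] += 1 # HINCRBY key field 1
  end
end
```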
ReliableFetch
v9 includes an optional fetch strategy that provides crash recovery:
```ruby
Sidekiq.configure_server do |config|
  config[:fetch_class] = SidekiqUniqueJobs::Fetch::Reliable
end
```
Standard Sidekiq uses BRPOP to fetch jobs from queues. If the process crashes between fetching and completing a job, that job is lost. ReliableFetch uses LMOVE to atomically move jobs from the queue to a per-process working list:
```lua
local job = redis.call("LMOVE", queue, working, "RIGHT", "LEFT")
```

On startup, the fetch strategy scans for orphaned working lists from dead processes and requeues their jobs. During graceful shutdown, it requeues in-progress jobs while preserving their locks (so duplicates don't slip in during the restart window).
The fetch script also validates locks at pop time — if a job's lock has already been released (maybe by the reaper), the job is still processed but the server knows not to try unlocking.
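The queue and working-list mechanics can be sketched in plain Ruby. Arrays stand in for Redis lists, and the class and method names are illustrative:

```ruby
# Sketch of LMOVE-based reliable fetch. In Redis the move is one atomic
# command; here it's a pop-then-unshift on in-memory arrays.
class ReliableFetchSketch
  def initialize(queue, working = [])
    @queue   = queue   # producers LPUSH (unshift); oldest job at the right
    @working = working # per-process working list
  end

  # LMOVE queue working RIGHT LEFT: take the oldest job off the queue and
  # park it at the head of the working list until it completes.
  def fetch
    job = @queue.pop
    @working.unshift(job) if job
    job
  end

  # Job finished: remove it from the working list (LREM).
  def ack(job)
    @working.delete(job)
  end

  # Startup recovery: push a dead process's in-flight jobs back on the queue.
  def requeue_orphans
    @queue.concat(@working)
    @working.clear
  end
end
```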
What I deleted
The most satisfying part of this rewrite was deletion:
| Component | v8 | v9 |
|---|---|---|
| Locksmith | 449 lines | 163 lines |
| Reaper | 615 lines (6 files) | 27 lines (1 Lua script) |
| Lua scripts | 15 scripts | 8 scripts |
| README | 1,082 lines | 180 lines |
| Net lines | -- | -5,700 |
The entire orphan cleanup system — Manager, Observer, RubyReaper, LuaReaper, Resurrector — collapsed into a single 27-line Lua script that scans the digests ZSET and removes entries whose LOCKED hash no longer exists.
The QUEUED list, PRIMED list, INFO hash, and RUN suffix key — all gone. These existed to support a state machine (queued → primed → locked → executing) that added complexity without adding safety. The v9 model is binary: you're in the LOCKED hash or you're not.
The Changelog feature (a Redis ZSET recording every lock operation) was removed entirely. It wrote to Redis on every lock and unlock, consuming memory and I/O for data that few users ever looked at. The reflection system provides the same observability without the storage cost, and the new metrics give you the aggregate view that's actually useful.
Supply chain security
While I was at it, I modernized the release process:
- OIDC Trusted Publishing: No long-lived API keys. RubyGems.org verifies the GitHub Actions workflow identity via OpenID Connect.
- Sigstore attestations: Every gem is signed with a keyless signature logged in a public transparency log.
- SHA-256 + SHA-512 checksums: Generated in CI and attached to every GitHub release.
- Gem content verification: CI unpacks the built gem and fails if test files, rake tasks, or other dev artifacts leaked in.
Releasing is one command:
```
rake release[9.0.0]
```
Upgrading from v8
v9 automatically migrates your lock data on first startup. The UpgradeLocks module scans for v8 key patterns (QUEUED, PRIMED, INFO, RUN variants), removes the obsolete keys, and merges the old expiring_digests sorted set into the unified digests ZSET.
No manual steps. No downtime. Start the new version and it handles the rest.

The minimum supported versions have changed:

- Ruby: 3.2+ (was 2.7+)
- Sidekiq: 8.0+ (was 7.0+)
- Redis: 6.2+ (for LMOVE support)
What's next
v9.0.0.alpha1 is available on RubyGems. I'm running it in production and stabilizing for a final release. If you're on Sidekiq 8+ and Ruby 3.2+, give it a try:
```ruby
gem "sidekiq-unique-jobs", "~> 9.0.0.alpha1"
```
