perlre: Improve possessive quantifier example

Fixes #15498

How to get past Sub::Defer in the Perl debugger

dev.to #perl

YOU ARE IN A MAZE OF TWISTY LITTLE PASSAGES, ALL ALIKE

perl -d professional_adventure_2024.pl

Sub::Defer::CODE(0x8ab8938)(/epsi/perlbrew/perls/perl-5.16.3/lib/perl5/Sub/Defer.pm:55):
55:         $undeferred ||= undefer_sub($deferred_info->[3]);
auto(-1)  DB<2> v
52:       my $undeferred;
53:       my $deferred_info;
54        my $deferred = sub {
55==>       $undeferred ||= undefer_sub($deferred_info->[3]);
56:         goto &$undeferred;
57:       };
58:       $deferred_info = [ $target, $maker, \$undeferred, $deferred ];
59:       weaken($deferred_info->[3]);
60:       weaken($DEFERRED{$deferred} = $deferred_info);
61:       _install_coderef($target => $deferred) if defined $target;
  DB<2> n

Walkthrough:

  • 7n
  • 3s
perlrebackslash: Add detail about "/" inside qr//.

Closes #12144

This adds detail about the treatment of pattern delimiter characters
that also appear within the pattern proper.
File::Glob: Note use [!...] for negated char classes

glob() follows sh semantics, where a negated character class is spelled [!...], not Perl's regex-style [^...].

Fixes #21866
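For illustration, a small example of the sh-style negation in action (assumes a POSIX filesystem; the file names are made up for the demo):

```perl
use strict;
use warnings;
use File::Glob ':bsd_glob';
use File::Temp ();

# Create a throwaway directory with two visible files and one dotfile.
my $dir = File::Temp->newdir;
for my $name (qw(alpha beta .hidden)) {
    open my $fh, '>', "$dir/$name" or die "open: $!";
    close $fh;
}

# [!.]* is sh negation: entries whose first character is NOT a dot.
my @visible = bsd_glob("$dir/[!.]*");
print scalar(@visible), "\n";   # 2 - .hidden is excluded
```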

CPAN Dependencies, static and dynamic

r/perl

Adopting one from Zefram

Back in 2011, Andrew Main, known to the Perl community as Zefram, released Devel::CallParser. It was a quiet piece of infrastructure: a C API that let XS modules attach custom argument parsers to Perl subroutines. Where the core's PL_keyword_plugin API was awkward to work with directly, CallParser gave you a structured way to extend Perl's syntax from C.

Zefram maintained it through 2013, fixing compatibility issues with indirect and Sub::StrictDecl, working around padrange optimiser changes, and shipping version 0.002. Then silence. The module sat on CPAN, unmaintained, while Perl kept moving.

By the current year and Perl version it was breaking: I could not install it locally on my hardened macOS runtimes, there were reports of failures on threaded builds and against shifted qerror internals, and CPAN Testers showed a wall of red. I needed CallParser for Object::Proto::Sugar, so I adopted it, and so far I have shipped six dev releases (0.003_01 through 0.003_06) to try to get it passing green again in every environment. Not glamorous work, but Zefram built something worth preserving.

(RIP Zefram... I didn't know them personally but the infrastructure they left behind is still making new things possible.)

The Idea

With CallParser working again, I decided to implement an idea I'd thought about for a long time: give Perl a proper enum keyword.

Not a hash. Not a bunch of use constant lines. Not a class/object pretending to be an enumeration. An actual keyword that declares an enum at compile time, generates real constants, and gives you a meta object for introspection.

Enum::Declare

Here's what it looks like:

use Enum::Declare;

enum Colour {
    Red,
    Green,
    Blue
}

say Red;    # 0
say Green;  # 1
say Blue;   # 2

That's it. enum is a real keyword, parsed at compile time by an XS callback wired through cv_set_call_parser. The constants are true constants, not subroutine calls, not tied variables. The compiler sees them.
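"The compiler sees them" is the same property that core use constant gives: inlinable constant subs that participate in constant folding. A quick core-Perl demonstration of that folding, illustrative only and nothing Enum::Declare-specific:

```perl
use strict;
use warnings;
use constant ANSWER => 42;   # an inlinable constant sub, like enum constants
use B::Deparse ();

# Because ANSWER is a true constant, the compiler folds ANSWER + 1
# into 43 before the code ever runs; deparsing the sub shows the
# folded value, not an addition.
my $deparsed = B::Deparse->new->coderef2text(sub { return ANSWER + 1 });
print $deparsed =~ /43/ ? "folded\n" : "not folded\n";
```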

Explicit Values

enum HttpStatus {
    OK        = 200,
    Created   = 201,
    NotFound  = 404,
    Internal  = 500
}

String Enums

enum LogLevel :Str {
    Debug,
    Info,
    Warn = "warning",
    Error,
    Fatal
}

say Debug;  # "debug"
say Warn;   # "warning"

Without an explicit value, :Str lowercases the constant name. With one, it uses what you gave it.

Bitflags

enum Perms :Flags {
    Read,
    Write,
    Execute
}

my $rw = Read | Write;
say "can read"  if $rw & Read;
say "can write" if $rw & Write;

:Flags assigns powers of two automatically. Combine them with bitwise operators as you'd expect.
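For comparison, here is the same flag set spelled out by hand with core use constant, a sketch of what :Flags presumably generates:

```perl
use strict;
use warnings;

# Hand-rolled equivalent of the :Flags assignment:
# successive powers of two, combinable with bitwise ops.
use constant {
    Read    => 1,   # 1 << 0
    Write   => 2,   # 1 << 1
    Execute => 4,   # 1 << 2
};

my $rw = Read | Write;            # 3
print "can read\n"  if $rw & Read;
print "can write\n" if $rw & Write;
print "can exec\n"  if $rw & Execute;   # not printed: bit is unset
```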

Exporting

# In your module:
enum StatusCode :Export {
    OK       = 200,
    NotFound = 404
}

# Consumers get the constants automatically, or use tags:
use MyModule qw(:StatusCode);
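Tag exports are standard Exporter machinery; presumably :Export populates @EXPORT_OK and %EXPORT_TAGS along these lines (a core-Perl sketch with a made-up package name, not Enum::Declare's actual code):

```perl
use strict;
use warnings;

# A core-Exporter sketch of what :Export would need to set up.
package My::Status {
    use Exporter 'import';
    use constant { OK => 200, NotFound => 404 };
    our @EXPORT_OK   = qw(OK NotFound);
    our %EXPORT_TAGS = (StatusCode => [qw(OK NotFound)]);
}

# What `use MyModule qw(:StatusCode)` does under the hood:
My::Status->import(':StatusCode');

print OK(), " ", NotFound(), "\n";   # 200 404
```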

Meta Objects

Every enum gets a meta object accessible by calling the enum name as a function:

my $meta = Colour();

say $meta->count;           # 3
say $meta->name(0);         # "Red"
say $meta->value('Blue');   # 2
say $meta->valid(1);        # true

my @pairs = $meta->pairs;  # (Red => 0, Green => 1, Blue => 2)

Exhaustive Matching

Colour()->match($val, {
    Red   => sub { "stop" },
    Green => sub { "go" },
    Blue  => sub { "sky" },
});

Miss a variant and it dies. Every key/branch must be covered.
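The exhaustiveness check is easy to picture in plain Perl. A toy sketch of the idea (my illustration, not the module's implementation):

```perl
use strict;
use warnings;

# A fixed variant set, standing in for the enum's metadata.
my %colour = (Red => 0, Green => 1, Blue => 2);

sub match {
    my ($val, $arms) = @_;
    # Exhaustiveness: every variant must have an arm, or we die.
    for my $name (keys %colour) {
        die "non-exhaustive match: missing '$name'\n"
            unless exists $arms->{$name};
    }
    my ($name) = grep { $colour{$_} == $val } keys %colour;
    return $arms->{$name}->();
}

print match(1, {
    Red   => sub { "stop" },
    Green => sub { "go" },
    Blue  => sub { "sky" },
}), "\n";                                      # go

eval { match(0, { Red => sub { "stop" } }) };  # Green/Blue uncovered
print $@;                                      # non-exhaustive match: ...
```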

How It Works

Under the hood, use Enum::Declare installs an XS stub named enum into the calling package, then attaches a custom parser via cv_set_call_parser. When Perl encounters enum during compilation, the parser callback fires and:

  1. Reads the enum name (lex_read_ident)
  2. Reads optional attributes - :Str, :Flags, :Export, :Type
  3. Reads the { Name = Value, ... } variant block
  4. Builds the constant subs and enum data structures
  5. Installs the meta object
  6. Optionally wires up @EXPORT / @EXPORT_OK

All of this happens at compile time. By the time Perl starts executing your code, the constants exist, the meta object is ready, and the exports are in place.

Enum::Declare::Common

Once the keyword worked, the obvious next step was a library of common enums:

use Enum::Declare::Common::HTTP qw(:StatusCode :Method);
use Enum::Declare::Common::Calendar qw(:Weekday :Month);
use Enum::Declare::Common::Color qw(:CSS);

say OK;        # 200
say GET;       # "get"
say Monday;    # 1
say January;   # 1

Enum::Declare::Common ships 20 submodules covering HTTP status codes and methods, ISO country and currency codes, MIME types, 148 named CSS hex colours, timezone offsets, Unix permissions, log levels, and more. All built on the same enum keyword, all with meta objects, all exportable.

Integration with Object::Proto

Every enum in the Common collection is declared with the :Type attribute:

enum StatusCode :Type :Export {
    OK      = 200,
    Created = 201,
    ...
}

This registers the enum as a type in Object::Proto at load time, so you can use enum names directly as slot types:

use Enum::Declare::Common::HTTP qw(:StatusCode :Method);
use Enum::Declare::Common::LogLevel qw(:Level);
use Object::Proto;

object 'APIRequest',
    'method:Method:required',
    'status:StatusCode',
    'log_level:Level:default(' . Info . ')',
;

my $req = new APIRequest method => GET;
$req->status(OK);       # valid
$req->status(9999);     # dies - not a valid StatusCode
$req->status(200);      # coercion - resolves to OK

The type checks and coercions run in C via object_register_type_xs_ex. No Perl callback overhead. A single pair of C functions serves every enum type, only the data pointer differs.

If you're writing Perl and you've been using hashes or use constant blocks to fake enums, give Enum::Declare a try. In my opinion it's enums the way they should have always worked.

As always if you have any questions just post below.


Tony writes:

```
[Hours] [Activity]

2026/03/02 Monday
1.55 #24228 follow-up comment, check updates, research and comment
0.75 #24187 review updates, mark comment resolved, research
0.97 #24242 review, research
0.40 #24242 debugging and comment
1.02 #24001 debugging, research, testing

4.69

2026/03/03 Tuesday
0.15 #24242 review discussion
0.10 #24211 review discussion and apply to blead
0.53 #24242 comment
0.23 #24239 review and comment
0.18 #24223 review and approve
0.40 #24244 review and comment
0.58 #24245 review and approve
0.07 #24247 review, existing comments seem fine
0.50 #24187 review more, comments
0.08 #24244 review update and approve
0.23 #24195 research

3.05

2026/03/04 Wednesday
0.88 #24252 review, research and comments
0.75 #24251 review, research and comments
0.90 #24253 review, comments
0.12 #24239 review updates and approve
0.28 #24208 comment with guide to update
0.15 #24208 review update and approve

3.08

2026/03/05 Thursday
0.68 #24254 review and comments
0.18 #24256 review and approve
0.13 #24247 check CI results and restart an apparent spurious failure
0.18 #24241 review CI failures and comment
0.40 #24228 compare to #24252 behaviour, testing

1.57

2026/03/09 Monday
0.33 #24254 review updates and approve
0.40 #24253 review updates and comment
1.38 #24252 review updates and comments, research, testing and follow-up
0.68 #24105 rebase, testing

2.79

2026/03/10 Tuesday
0.57 test 5.42.1 on fedora, looks ok, message on list indicates likely a local problem
2.70 #24105 check everything covered, various fixes, testing, push for CI

3.27

2026/03/11 Wednesday
0.88 #24105 check CI results, fixes, push for more CI
0.57 #24187 review discussion, research and comment
0.13 #24253 review updates and approve
0.12 #24252 review updates and approve with comment
0.23 #24228 review updates and approve
0.15 #24252 approve with perldelta update
1.15 #24001 debugging (what is PL_curcopdb?)
1.20 #24001 debugging, research

4.43

2026/03/12 Thursday
1.12 #24265 review, research and comment
1.33 #24001 research, testing, needs some thought

2.45

2026/03/13 Friday
0.82 research, email to list about benchmarking

0.82

2026/03/16 Monday
0.10 #24208 review updates and apply to blead
2.47 #24272 profiling, benchmarking, comment and work on bisect
0.75 #24272 review bisect results, confirm bisect results, briefly try to work out cause, long comment with results
0.32 #24287 review and approve

3.64

2026/03/17 Tuesday
0.23 #24265 recheck and approve
0.92 #24001 re-work, research and testing
0.57 #24105 rebase and testing, minor fix and push for CI
1.13 #24272 try to diagnose
0.45 #24056 re-work commit message

3.30

2026/03/18 Wednesday
0.40 #24105 check CI results, re-check, make PR #24294
0.75 #24099 review, research and comment
0.22 #24296/#24295 research and comment (both have the same problem)
1.37 #24277 review, testing, comment

2.74

2026/03/19 Thursday
2.05 #24227 research and comments
1.22 #24272 debugging

3.27

2026/03/20 Friday
0.53 #24251 research and follow-up

0.53

2026/03/23 Monday
0.40 #24251 review updates, research and approve with comment
1.22 #24304 review, comment
0.15 #24313 review, research and apply to blead
0.10 #24310 review (nothing to say)
0.17 #24309 review, research and approve
0.13 #24305 review and approve
0.53 #24290 review
1.32 #24056 more update commit message, simplify perldelta note, push and update OP on PR

4.02

2026/03/24 Tuesday
0.32 #24318 review and review ticket, start workflow, research and comment
0.08 #24301 review and approve
0.37 #24290 more review and comments
0.53 #24289 review, research current PSC and approve with comment
0.38 #24288 review, research and comment
0.18 #24285 review, research and approve
0.72 #24282 review, research and comment
1.23 #24290 review updates, testing and more comment

3.81

2026/03/25 Wednesday
0.15 #24056 check rules, apply to blead
0.30 #24308 review, research and comments
0.82 #24304 review, research and comment, consider Paul’s reply
1.55 #23918 string comparison APIs, research, open #24319
0.13 #24290 review updates and follow-up
1.50 #24005 start on perldebapi, research

4.45

2026/03/26 Thursday
1.25 #24326 review and comment
0.47 #24290 review updates, comment and approve with comment
0.40 #24326 review, comment on side issue and approve
0.28 #24323 review, try to find the referenced documentation, comment
0.10 #24324 review and approve
0.13 #24323 review update and approve

2.63

2026/03/30 Monday
0.80 #24308 review updates and comments
0.08 #24290 review discussion and apply to blead
1.05 #24304 review updates and comment, long comment
1.55 #23676 research, make APIs public and document, testing and push for CI
0.60 #24187 review updates
1.08 #24187 testing, comment

5.16

2026/03/31 Tuesday
1.22 #23676 comment, comment on PR regarding qerror() name, research, work on perldelta
0.47 github notifications, minor updates
0.53 #24332 review original ticket discussion and the change, approve with comment
0.23 #24329 review, research and apply to blead
0.32 #24281 review, try to get a decent view, given github’s tab mis-handling, comment
0.12 #24280 review, comments
0.35 #23995 research and comment
0.08 #24105 follow-up on PR 24294
0.50 #24251 follow-up comment

3.82

Which I calculate is 63.52 hours.

Approximately 51 tickets were reviewed or worked on, and 6 patches were applied.
```


Paul writes:

A couple of bugfixes in March, combined with starting to line up a few development ideas to open 5.45 with.

  • 2 = Bugfix for field refalias memory leak
    • https://github.com/Perl/perl5/pull/24254
  • 2 = Improved field performance
    • https://github.com/Perl/perl5/pull/24265
  • 3 = Continue progress on implementing PPC0030
    • https://github.com/Perl/perl5/pull/24304 (draft)
  • 2 = Bugfix for deferred class seal
    • https://github.com/Perl/perl5/pull/24326

Total: 9 hours

Besides working up to the 5.44 release, my main focus now will be getting things like PPC0030, magic-v2, attributes-v2, and various class feature improvements lined up ready for the 5.45 development cycle.


Dave writes:

Last month was spent looking into race conditions in threads and threads::shared. I initially started looking at a specific ticket, where (with effort) I could reproduce a specific crash by running many instances in parallel for several hours. I think I have fixed that specific bug, but it led me to dynamic thread-safety checkers such as helgrind, and I am currently plunging down the rabbit hole of issues that tool is flagging up.

Nothing has been pushed yet.

Summary:

  • 17:06 GH #24258 dist/threads/t/free.t: Rare test failure in debugging build on FreeBSD

Total:

  • 17:06 TOTAL (HH::MM)

This week in PSC (221) | 2026-04-13

blogs.perl.org

All three of us attended this long meeting consisting entirely of dealing with release blockers.

  • We found no blockers among the issues and patches new since last week.

  • We weighed up #23676 again and decided that it merits blocker status even though the breakage was the result of a C compiler change, not of a change to perl (in this dev cycle or otherwise).

  • We then reviewed all of the blockers we were already tracking and made decisions on how to proceed with all of them. Of particular note,

  • We agreed that #23131 simply cannot be pursued in this form – not just in this cycle but at all. We may eventually be able to virtualize the stash entirely, allowing much deeper optimization by using a more rational internal data structure while maintaining the Perl-land illusion that nothing has changed (effectively turning the stash hash tree into an API in the same way that a tied hash is an API that looks like a data structure). But until such time, Perl-visible changes to the stash data structure are simply too disruptive.

  • #24340 is part of a series that constitutes a promising-looking fix which we want to pursue, but it didn’t get fully stabilized soon enough in this cycle, and because it’s in a dark and tricky corner of the codebase, we don’t want to take the risk of shipping with undiscovered new breakage there rather than a longstanding bug. Reapplying this patch series early in the next cycle should give it ample time to bake and settle down, and we hope it will eventually ship successfully.

  • We spent some time debating #24341, tracing commit and issue tracker history to try to assess the correctness of the changes. We lean toward the view that this is breakage that core should fix, but discussion in the comments is ongoing.

[P5P posting of this summary]

Happy sharing

blogs.perl.org

So you've got a bunch of Perl worker processes and they need to share state. A work queue, a counter, a lookup table - the usual. What do you reach for?

Perl has solid options here, and they've been around for a while. File::Map gives you clean zero-copy access to mmap'd files - substr, index, even regexes run directly on mapped memory. LMDB_File wraps the Lightning Memory-Mapped Database - mature, ACID-compliant, lock-free concurrent readers via MVCC, crash-safe persistence. Hash::SharedMem offers a purpose-built concurrent hash with lock-free reads and atomic copy-on-write updates. Cache::FastMmap is the workhorse for shared caching - mmap-backed pages with per-page fcntl locking, LRU eviction, and optional compression.

These are all good, proven tools. But they have something in common: they're about storage. You put data in, you get data out. They don't give you a queue that consumers can block on. They don't give you a pub/sub channel, a ring buffer, a semaphore, a priority heap, or a lock-free MPMC algorithm. They don't do atomic counters or futex-based blocking with timeouts.

That's the gap the Data::*::Shared family fills - fourteen Perl modules that give you proper, typed, concurrent data structures backed by mmap. Not better storage - concurrent data structures that happen to live in shared memory. Queues, hash maps, pub/sub, stacks, ring buffers, heaps, graphs, sync primitives - the works. All written in XS/C, all designed to work across fork()'d processes with zero serialization overhead.

Let me walk you through what's in the box.

The Approach

Every module in the family uses the same core recipe:

  • mmap(MAP_SHARED) for the actual shared memory - no serialization, no copies, just raw memory visible to all processes
  • Linux futex for blocking/waiting - when a queue is empty and you want to wait for data, you sleep in the kernel, not in a spin loop
  • CAS (compare-and-swap) for lock-free operations where possible - no mutex, no contention, just atomic CPU instructions
  • PID-based crash recovery - if a process dies holding a lock, other processes detect the stale PID and recover automatically
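That last point rests on a standard trick: the lock record stores the holder's PID, and other processes probe it with kill 0, which checks for existence without sending a signal. A minimal core-Perl sketch of the probe (my illustration, not the modules' code):

```perl
use strict;
use warnings;

# Simulate a lock holder that dies: fork a child that exits immediately.
my $pid = fork() // die "fork: $!";
if ($pid == 0) { exit 0 }   # child "holding the lock" is gone at once
waitpid($pid, 0);           # reap it, so the PID no longer exists

# kill(0, $pid) delivers no signal; it only reports whether the process
# exists. (A real checker would also treat EPERM as "alive".)
if (kill 0, $pid) {
    print "holder $pid alive - wait\n";
} else {
    print "holder $pid gone - recover the lock\n";
}
```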

Requires Linux (futex, memfd), 64-bit Perl 5.22+. A deliberate tradeoff - portable it isn't, but fast it is.

Three ways to create the backing memory:

# File-backed - persistent, survives restarts
my $q = Data::Queue::Shared::Int->new('/tmp/myq.shm', 1024);

# Anonymous - fork-inherited, no filesystem footprint
my $q = Data::Queue::Shared::Int->new(undef, 1024);

# memfd - passable via Unix socket fd, no filesystem visibility
my $q = Data::Queue::Shared::Int->new_memfd("my_queue", 1024);

The Modules

Here's the full roster, grouped by use case.

Message Passing

Data::Queue::Shared - Your bread-and-butter MPMC (multi-producer, multi-consumer) bounded queue. Integer variants use the Vyukov lock-free algorithm; string variant uses a mutex with a circular arena. Blocking and non-blocking modes, batch operations, the whole deal.

use Data::Queue::Shared;

my $q = Data::Queue::Shared::Int->new(undef, 4096);

# In producer
$q->push(42);
$q->push_multi(1, 2, 3, 4, 5);

# In consumer
my $val = $q->pop_wait(1.5); # block up to 1.5s
my @batch = $q->pop_multi(100);

Single-process throughput: ~5M ops/s for integers. That's roughly 3x MCE::Queue and 6x POSIX message queues.

Data::PubSub::Shared - Broadcast pub/sub over a ring buffer. Publishers write, subscribers each track their own cursor. If a subscriber falls behind, it auto-recovers to the oldest available message. No back-pressure on writers.

my $ps = Data::PubSub::Shared::Int->new(undef, 8192);
$ps->publish(42);

my $sub = $ps->subscribe;
my $val = $sub->poll_wait(1.0);

Batch publishing hits ~170M msgs/s for integers. Yes, really. It's just writing to mapped memory.

Data::ReqRep::Shared - Request-response pattern with per-request reply routing. Client acquires a response slot, sends a request carrying the slot ID, server replies to that specific slot. Supports both sync and async client styles.

# Server
my ($request, $id) = $rr->recv_wait(1.0);
$rr->reply($id, "processed: $request");

# Client (async)
my $id = $rr->send("do something");
my $response = $rr->get_wait($id, 2.0);

Around 200K req/s cross-process - competitive with Unix domain sockets but with true MPMC support.

Key-Value

Data::HashMap::Shared - This is the big one. Concurrent hash map with elastic capacity, optional LRU eviction (clock algorithm with lock-free reads), optional per-key TTL, atomic counters, sharding, cursors. Eleven type variants from II (int-int) to SS (string-string).

use Data::HashMap::Shared::SS;

my $map = Data::HashMap::Shared::SS->new('/tmp/cache.shm', 100_000);
$map->put("user:123", "alice");
my $name = $map->get("user:123");

# LRU cache with max 10K entries
my $cache = Data::HashMap::Shared::SS->new('/tmp/lru.shm', 100_000, 10_000);

# TTL - entries expire after 60 seconds
my $ttl = Data::HashMap::Shared::II->new('/tmp/ttl.shm', 100_000, 0, 60);

# Atomic counter (lock-free fast path under read lock)
$map->incr("hits:page_a");

Cross-process string reads: 3.25M/s. Integer lookups hit ~10M/s. And you get built-in LRU and TTL without an external cache layer.

Sequential & Positional

Data::Stack::Shared - Lock-free LIFO stack. Push, pop, peek. ~6.4M ops/s.

Data::Deque::Shared - Double-ended queue. Push/pop from both ends. Lock-free CAS. ~6.3M ops/s.

Data::RingBuffer::Shared - Fixed-size circular buffer that overwrites on wrap. No consumer tracking - you just read by position. Great for metrics windows and rolling logs. ~11.7M writes/s.

Data::Log::Shared - Append-only log. Unlike Queue (consumed on read) or RingBuffer (overwritten), Log retains everything until explicitly truncated. CAS-based append, cursor-based reads. ~8.9M appends/s.

Resource Management

Data::Pool::Shared - Object pool with allocate/free. CAS-based bitmap allocation, typed slots (I64, I32, F64, Str), scope guards for automatic cleanup, raw C pointers for FFI integration. PID-tracked slots are auto-recovered when a process dies.

my $pool = Data::Pool::Shared::I64->new(undef, 256);
my $idx = $pool->alloc;
$pool->set($idx, 42);
# ...
$pool->free($idx);

# Or with auto-cleanup
{
    my $guard = $pool->alloc_guard;
    $pool->set($$guard, 99);
}   # auto-freed here
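Scope guards like alloc_guard are conventionally built on DESTROY: the guard object runs a cleanup callback when it falls out of scope. A minimal core-Perl sketch of the pattern (illustrative, not the module's implementation):

```perl
use strict;
use warnings;

# Minimal guard: runs a callback when the object is destroyed,
# i.e. when the enclosing scope exits (normally or via die).
package Guard::Tiny {
    sub new     { my ($class, $cb) = @_; bless { cb => $cb }, $class }
    sub DESTROY { $_[0]{cb}->() }
}

my @freed;
{
    my $guard = Guard::Tiny->new(sub { push @freed, 'slot 7' });
    # ... use the resource ...
}   # guard destroyed here, callback fires

print "freed: @freed\n";   # freed: slot 7
```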

Data::BitSet::Shared - Fixed-size bitset with per-bit atomic CAS operations. Good for flags, membership tracking, allocation bitmaps. ~10.5M ops/s.

Data::Buffer::Shared - Type-specialized arrays (I8 through F64, plus Str) with atomic per-element access. Seqlock for bulk reads, RW lock for bulk writes. Think shared sensor arrays or metric buffers.

Graphs & Priority

Data::Graph::Shared - Directed weighted graph with mutex-protected mutations. Node bitmap pool, adjacency lists, per-node data. ~3.9M node adds/s, ~13.3M lookups/s.

Data::Heap::Shared - Binary min-heap for priority queues. Mutex-protected, futex blocking when empty. ~5.3M pushes/s.

Synchronization Primitives

Data::Sync::Shared - Five cross-process sync primitives in one module: Semaphore, Barrier, RWLock, Condvar, and Once. All futex-based, all with PID-based stale lock recovery, all with scope guards.

use Data::Sync::Shared;

my $sem = Data::Sync::Shared::Semaphore->new(undef, 4); # 4 permits
{
    my $guard = $sem->acquire_guard;
    # at most 4 processes here concurrently
}

my $barrier = Data::Sync::Shared::Barrier->new(undef, $num_workers);
$barrier->wait; # blocks until all workers arrive

my $once = Data::Sync::Shared::Once->new(undef);
if ($once->enter) {
    init_expensive_thing();
    $once->done;
}

At a Glance

Module             | Pattern                         | Concurrency                  | Throughput
-------------------|---------------------------------|------------------------------|-----------------
Queue::Shared      | MPMC queue                      | lock-free (Int), mutex (Str) | ~5M ops/s
PubSub::Shared     | broadcast pub/sub               | lock-free (Int), mutex (Str) | ~170M/s batched
ReqRep::Shared     | request-response                | lock-free (Int), mutex (Str) | ~200K req/s
HashMap::Shared    | hash map + LRU/TTL              | futex RW lock, seqlock reads | ~10M gets/s
Stack::Shared      | LIFO stack                      | lock-free CAS                | ~6.4M ops/s
Deque::Shared      | double-ended queue              | lock-free CAS                | ~6.3M ops/s
RingBuffer::Shared | circular buffer                 | lock-free CAS                | ~11.7M writes/s
Log::Shared        | append-only log                 | lock-free CAS                | ~8.9M appends/s
Pool::Shared       | object pool                     | lock-free bitmap             | ~3.3M alloc/s
BitSet::Shared     | bitset                          | lock-free CAS                | ~10.5M ops/s
Buffer::Shared     | typed arrays                    | atomic + seqlock             | per-type
Graph::Shared      | directed graph                  | mutex                        | ~13.3M lookups/s
Heap::Shared       | priority queue                  | mutex                        | ~5.3M pushes/s
Sync::Shared       | sem/barrier/rwlock/condvar/once | futex                        | -

Type Specialization

Most modules come in typed variants - Int16, Int32, Int64, Str, and so on. This isn't just for type safety. An Int16 queue uses a quarter of the memory of an Int64 queue (an Int32 queue, half), which means correspondingly more elements per cache line on the same hardware. When you're doing millions of operations per second, cache lines matter.
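The sizing behind that claim is plain C element width, visible from Perl with pack:

```perl
use strict;
use warnings;

# Bytes per element behind the typed variants.
my $i16 = length pack 's', 0;   # 16-bit signed: 2 bytes
my $i32 = length pack 'l', 0;   # 32-bit signed: 4 bytes
my $i64 = length pack 'q', 0;   # 64-bit signed: 8 bytes (needs 64-bit Perl)

printf "Int16=%d  Int32=%d  Int64=%d bytes/element\n", $i16, $i32, $i64;
printf "Int16 fits %dx as many elements per cache line as Int64\n",
    $i64 / $i16;
```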

Event Loop Integration

Every module supports eventfd() for integration with event loops like EV, Mojo, or AnyEvent:

my $fd = $q->eventfd;
# register $fd with your event loop
# on readable: $q->eventfd_consume; then poll/pop

Signaling is explicit ($q->notify) so you can batch writes before waking consumers.

Playing Nice with Others: PDL, FFI::Platypus, OpenGL::Modern

One thing I want to highlight is that these aren't isolated islands. Because everything lives in mmap'd memory with known layouts, you get natural interop with other systems that work with raw pointers and packed data.

PDL is the obvious one. If you're doing numerical work in Perl - signal processing, image manipulation, statistics - PDL is your workhorse. The Buffer module's as_scalar returns a zero-copy scalar reference directly over the mmap'd region. Feed that to PDL and you've got an ndarray backed by shared memory:

use Data::Buffer::Shared::F64;
use PDL;

my $buf = Data::Buffer::Shared::F64->new('/tmp/signal.shm', 10000);

# one process fills the buffer with sensor data...
# another process reads it as a PDL:
my $pdl = PDL->new_from_specification(double, 10000);
${$pdl->get_dataref} = ${$buf->as_scalar};
$pdl->upd_data;

printf "mean=%.4f stddev=%.4f\n", $pdl->stats;

For typed arrays you can also use get_raw/set_raw for bulk transfers - a single memcpy under the hood, seqlock-guarded for consistency. That means you can build a multiprocess image pipeline where one process captures frames into a shared U8 buffer, another runs PDL convolutions on it, and a third renders the result - all communicating through shared memory with eventfd notifications, no serialization anywhere.

FFI::Platypus works just as naturally. Pool and Buffer both expose ptr() / data_ptr() - raw C pointers as unsigned integers, ready to hand to any C function through FFI. Need to call libc qsort directly on your shared data? Go ahead:

use Data::Pool::Shared;
use FFI::Platypus;

my $pool = Data::Pool::Shared::I64->new(undef, 1000);
# ... alloc and fill slots ...

my $ffi = FFI::Platypus->new(api => 2);
$ffi->lib(undef); # libc
$ffi->attach([qsort => 'c_qsort'] =>
['opaque', 'size_t', 'size_t', '(opaque,opaque)->int'] => 'void');

c_qsort($pool->data_ptr, 1000, 8, $comparator);
# slots are now sorted in-place, visible to all processes

Pool slots are contiguous in memory (data_ptr + idx * elem_size), so any C library that expects a flat array works out of the box.

OpenGL::Modern is where it gets fun. Buffer::F32 is essentially a shared vertex buffer. One process computes positions, another renders them - connected by a shared mmap region and eventfd:

# Compute process:
my $verts = Data::Buffer::Shared::F32->new('/tmp/verts.shm', 30000);
$verts->set_slice(0, @new_positions);
$verts->notify;

# Render process:
my $ref = $verts->as_scalar;
# on eventfd readable:
glBufferSubData_p(GL_ARRAY_BUFFER, 0, $$ref); # zero-copy upload

Pool goes further - it's a natural fit for particle systems. Particles are dynamically spawned (alloc) and despawned (free), each with a fixed-size state struct. A spawner process allocates particles, a physics process updates them, and the renderer uploads the live slots to a VBO via ptr(). The raw pointer goes straight to glBufferSubData_c - no packing, no intermediate copies.

The common thread here is that the data is already in the format the consuming library expects. F32 buffers are packed floats. I64 pools are packed int64s. There's no Perl-side serialization layer to bypass because there was never one to begin with.

Optional Keyword API

If you install XS::Parse::Keyword, several modules expose lexical keywords that bypass Perl method dispatch entirely:

use Data::Queue::Shared;

q_int_push $q, 42;
my $v = q_int_pop $q;

Zero dispatch overhead. The XS function gets called directly. It's optional - the method API works fine - but it's there when you need every last microsecond.

The Big Picture

Here's how the pieces fit together in a typical system:

  • Data::Queue::Shared distributes work from producers to a pool of workers
  • Data::HashMap::Shared acts as a shared cache or config store that all workers read from
  • Data::PubSub::Shared broadcasts events or status updates to whoever's listening
  • Data::Sync::Shared coordinates startup (Barrier), limits concurrency (Semaphore), and protects shared initialization (Once)
  • Data::Pool::Shared manages reusable resource slots
  • Data::RingBuffer::Shared or Data::Log::Shared holds recent metrics or audit trails

All of this running across fork()'d processes, communicating through shared memory at millions of operations per second, no serialization overhead.

Getting Started

Values are typed C scalars or fixed-length strings - no automatic serialization of arbitrary Perl structures. That's by design: raw mmap'd memory is what makes everything fast and FFI-friendly, but it means you won't be sharing hashrefs or blessed objects directly.

All modules follow the same pattern:

use v5.22;   # the modules require 5.22+; also enables say
use Data::Queue::Shared;

# Pick your backing: file, anonymous, or memfd
my $q = Data::Queue::Shared::Int->new(undef, 4096);

if (fork() == 0) {
    $q->push($$);   # child pushes its PID
    exit;
}

my $child_pid = $q->pop_wait(5.0);
say "Child reported in: $child_pid";

The modules are on GitHub under the vividsnow account. Each one has its own repo, test suite, and benchmarks you can run yourself.

If you've ever wished Perl had something like Go's channels and sync primitives but for fork()'d processes - well, now it does. Fourteen of them, actually.

Happy sharing

Let’s Make a Drum Machine application! Yeah! :D

There are basically two important things to handle: A MIDI “clock” and a groove to play.

Why asynchronous? Well, a simple while (1) { Time::HiRes::sleep($interval); ... } will not do because the time between ticks will fluctuate, often dramatically. IO::Async::Timer::Periodic is a great timer for this purpose. Its default scheduler uses system time, so intervals happen as close to the correct real-world time as possible.
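The drift is easy to see with arithmetic alone. This sketch simulates 1000 ticks with a constant 2 ms oversleep per tick, comparing naive sleep($interval) against absolute-time scheduling, the strategy a periodic timer effectively uses (the jitter figure is made up for illustration):

```perl
use strict;
use warnings;

my $interval = 0.0208;   # roughly one MIDI clock pulse at 120 BPM
my $jitter   = 0.002;    # pretend every sleep oversleeps by 2 ms

my ($naive_t, $abs_next) = (0, 0);
my ($naive_err, $abs_err);

for my $tick (1 .. 1000) {
    # Naive: sleep($interval) measured from whenever we woke up,
    # so every oversleep accumulates forever.
    $naive_t  += $interval + $jitter;
    $naive_err = $naive_t - $tick * $interval;

    # Absolute: sleep until a precomputed target time; an oversleep
    # shortens the next sleep, so the error never accumulates.
    $abs_next += $interval;
    my $now    = $abs_next + $jitter;   # woke late, target unchanged
    $abs_err   = $now - $tick * $interval;
}

printf "naive drift after 1000 ticks: %.3f s\n", $naive_err;   # ~2 s
printf "absolute-time drift:          %.3f s\n", $abs_err;     # ~0.002 s
```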

Clocks

A MIDI clock tells a MIDI device about the tempo. This can be handed to a drum machine or a sequencer. Each clock tick tells the device to advance a step of a measured interval. Usually this is very short, and is often 24 pulses per quarter-note (four quarter-notes to a measure of four beats).

Here is code to do that, followed by an explanation of the parts:

#!/usr/bin/env perl

use v5.36;
use feature 'try';
use IO::Async::Loop ();
use IO::Async::Timer::Periodic ();
use MIDI::RtMidi::FFI::Device ();

my $name = shift || 'usb'; # MIDI sequencer device
my $bpm  = shift || 120; # beats per minute

my $interval = 60 / $bpm / 24; # time / bpm / clocks-per-beat

# open the named midi device for output
my $midi_out = RtMidiOut->new;
try { # this will die on Windows but is needed for Mac
    $midi_out->open_virtual_port('RtMidiOut');
}
catch ($e) {}
$midi_out->open_port_by_name(qr/\Q$name/i);

$midi_out->start; # start the sequencer

$SIG{INT} = sub { # halt gracefully
    say "\nStop";
    try {
        $midi_out->stop; # stop the sequencer
        $midi_out->panic; # make sure all notes are off
    }
    catch ($e) {
        warn "Can't halt the MIDI out device: $e\n";
    }
    exit;
};

my $loop = IO::Async::Loop->new;

my $timer = IO::Async::Timer::Periodic->new(
   interval => $interval,
   on_tick  => sub { $midi_out->clock }, # send a clock tick!
);
$timer->start;

$loop->add($timer);
$loop->run;

The above code does a few things. First it uses modern Perl, then the modules that will make execution asynchronous, and finally the module that makes real-time MIDI possible.

Next up, a $name variable is captured for a unique MIDI device. (And to see what the names of MIDI devices on the system are, use JBARRETT’s little list_devices script.) Also, the beats per minute is taken from the command-line. If neither is given, usb is used for the name, and the BPM is set to “dance tempo.”

The clock needs a time interval to tick off. For us, this is a fraction of a second based on the beats per minute, and is assigned to the $interval variable.

To get the job done, we will need to open the named MIDI device for sending output messages to. This is done with the $name provided.

So that the program doesn't simply die when we want to stop, $SIG{INT} is redefined to halt gracefully. The handler sends a stop message to the open MIDI device, which stops the sequencer from playing, and a panic message to silence any hanging notes.

Now for the meat and potatoes: The asynchronous loop and periodic timer. These tell the program to do its thing, in a non-blocking and event-driven manner. The periodic timer ticks off a clock message every $interval. Pretty simple!

As an example, here is the above code controlling my Volca Drum drum machine on a stock, funky groove. We invoke it on the command-line like this:

perl clock-gen-async.pl

Grooves

What we really want is to make our drum machine actually play something of our own making. So it’s refactor time… Let’s make a 4/4 time groove, with 16th-note resolution, that alternates between two different parts. “4/4” is a “time signature” in music jargon and means that there are four beats per measure (numerator), and a quarter note equals one beat (denominator). Other time signatures like the waltz’s 3/4 are simple, while odd meters like 7/8 are not.

In order to generate syncopated patterns, Math::Prime::XS and Music::CreatingRhythms are added to the use statements. “What are syncopated patterns?”, you may ask. Good question! “Syncopated” means, “characterized by displaced beats.” That is, every beat does not happen evenly, at exactly the same time. Instead, some are displaced. For example, a repeated [1 1 1 1] is even and boring. But when it becomes a repeated [1 1 0 1] things get spicier and more syncopated.
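The Euclidean patterns used below come from Music::CreatingRhythms, but the core idea fits in a few lines. Here is a rough stand-alone sketch (not the module's implementation, just the classic remainder trick) that spreads k onsets as evenly as possible across n steps:

```perl
use strict;
use warnings;

# Spread $k onsets as evenly as possible over $n steps.
# An onset lands on step $i when ($i * $k) % $n wraps below $k.
# euclid_sketch(3, 8) gives the "tresillo": 1 0 0 1 0 0 1 0
sub euclid_sketch {
    my ($k, $n) = @_;
    return [ map { ($_ * $k) % $n < $k ? 1 : 0 } 0 .. $n - 1 ];
}
```

Because the onsets are spaced by the remainder arithmetic rather than evenly, the result is exactly the kind of displaced, syncopated beat described above.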

The desired MIDI channel is added to the command-line inputs. Most commonly, this will be channel 9 (in zero-based numbering). But some drum machines and sequencers are “multi-timbral” and use multiple channels simultaneously for individual sounds.

Next we define the drums to use. This is a hash-reference that includes the MIDI patch number, the channel it’s on, and the pattern to play. The combined patterns of all the drums, when played together at tempo, make a groove.

Now we compute intervals and friends. Previously, there was one $interval. Now there are a whole host of measurements to make before sending MIDI messages.

Then, as before, a named MIDI output device is opened, and a graceful stop is defined.

Next, a Music::CreatingRhythms object is created. And then, again as before, an asynchronous loop and periodic timer are instantiated and set in motion.

The meaty bits are in the timer’s on_tick callback. This contains all the logic needed to trigger our drum grooves.

As in the previous clock code, a clock message is sent on every tick, but now we also keep count of the ticks that have passed. That count is used to trigger the drums: we care about 16th-note resolution, so every six clock ticks (one 16th-note at 24 pulses per quarter-note), we construct and play a queue of events.

Adjusting the drum patterns is where Math::Prime::XS and Music::CreatingRhythms come into play. The subroutine that does that is adjust_drums() and is fired every 4th measure. A measure is four quarter-notes, and each quarter-note is divided into four 16th-notes, making 16 beats per measure. This routine reassigns either Euclidean or manually specified patterns of 16 beats to each drum.

Managing the queue is next. If a drum is to be played at the current beat (as tallied by the $beat_count variable), it is added to the queue at full velocity (127). Then, after all the drums have been accounted for, the queue is played with $midi_out->note_on() messages. Lastly, the queue is “drained” by sending $midi_out->note_off() messages.

#!/usr/bin/env perl

use v5.36;
use feature 'try';
use IO::Async::Loop ();
use IO::Async::Timer::Periodic ();
use Math::Prime::XS qw(primes);
use MIDI::RtMidi::FFI::Device ();
use Music::CreatingRhythms ();

my $name = shift || 'usb'; # MIDI sequencer device
my $bpm  = shift || 120; # beats-per-minute
my $chan = shift // 9; # 0-15, 9=percussion, -1=multi-timbral

my $drums = {
    kick  => { num => 36, chan => $chan < 0 ? 0 : $chan, pat => [] },
    snare => { num => 38, chan => $chan < 0 ? 1 : $chan, pat => [] },
    hihat => { num => 42, chan => $chan < 0 ? 2 : $chan, pat => [] },
};

my $beats = 16; # beats in a measure
my $divisions = 4; # divisions of a quarter-note into 16ths
my $clocks_per_beat = 24; # PPQN
my $clock_interval = 60 / $bpm / $clocks_per_beat; # time / bpm / ppqn
my $sixteenth = $clocks_per_beat / $divisions; # clocks per 16th-note
my %primes = ( # for computing the pattern
    all  => [ primes($beats) ],
    to_5 => [ primes(5) ],
    to_7 => [ primes(7) ],
);
my $ticks = 0; # clock ticks
my $beat_count = 0; # how many beats?
my $toggle = 0; # part A or B?
my @queue; # priority queue for note_on/off messages

# open the named midi output device
my $midi_out = RtMidiOut->new;
try { # this will die on Windows but is needed for Mac
    $midi_out->open_virtual_port('RtMidiOut');
}
catch ($e) {}
$midi_out->open_port_by_name(qr/\Q$name/i);

$SIG{INT} = sub { # halt gracefully
    say "\nStop";
    try {
        $midi_out->stop; # stop the sequencer
        $midi_out->panic; # make sure all notes are off
    }
    catch ($e) {
        warn "Can't halt the MIDI out device: $e\n";
    }
    exit;
};

# for computing the pattern
my $mcr = Music::CreatingRhythms->new;

my $loop = IO::Async::Loop->new;

my $timer = IO::Async::Timer::Periodic->new(
    interval => $clock_interval,
    on_tick  => sub {
        $midi_out->clock;
        $ticks++;
        if ($ticks % $sixteenth == 0) {
            # adjust the drum pattern every 4th measure
            if ($beat_count % ($beats * $divisions) == 0) {
                adjust_drums($mcr, $drums, \%primes, \$toggle);
            }
            # add simultaneous drums to the queue
            for my $drum (keys %$drums) {
                if ($drums->{$drum}{pat}[ $beat_count % $beats ]) {
                    push @queue, { drum => $drum, velocity => 127 };
                }
            }
            # play the queue
            for my $drum (@queue) {
                $midi_out->note_on(
                    $drums->{ $drum->{drum} }{chan},
                    $drums->{ $drum->{drum} }{num},
                    $drum->{velocity}
                );
            }
            $beat_count++;
        }
        else {
            # drain the queue with note_off messages
            while (my $drum = pop @queue) {
                $midi_out->note_off(
                    $drums->{ $drum->{drum} }{chan},
                    $drums->{ $drum->{drum} }{num},
                    0
                );
            }
            @queue = (); # ensure the queue is empty
        }
    },
);
$timer->start;

$loop->add($timer);
$loop->run;

sub adjust_drums($mcr, $drums, $primes, $toggle) {
    # choose random primes to use by the hihat, kick, and snare
    my ($p, $q, $r) = map { $primes->{$_}[ int rand $primes->{$_}->@* ] } sort keys %$primes;
    if ($$toggle == 0) {
        say 'part A';
        $drums->{hihat}{pat} = $mcr->euclid($p, $beats);
        $drums->{kick}{pat}  = $mcr->euclid($q, $beats);
        $drums->{snare}{pat} = $mcr->rotate_n($r, $mcr->euclid(2, $beats));
        $$toggle = 1; # set to part B
    }
    else {
        say 'part B';
        $drums->{hihat}{pat} = $mcr->euclid($p, $beats);
        $drums->{kick}{pat}  = [qw(1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1)];
        $drums->{snare}{pat} = [qw(0 0 0 0 1 0 0 0 0 0 0 0 1 0 1 0)];
        $$toggle = 0; # set to part A
    }
}

(You may notice the inefficiency of attempting to drain an empty queue 23 times every 16th note. Oof! Fortunately, this doesn’t fire anything other than a single while loop condition. A more efficient solution would be to only drain the queue once, but this requires a bit more complexity that we won’t be adding, for brevity’s sake.)
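One way to avoid those repeated empty-queue checks is to drain only on the tick immediately following the one that queued notes. This tiny simulation (a sketch of the counting logic, not a drop-in change to the timer code above) shows the idea:

```perl
use strict;
use warnings;

my $sixteenth = 6;    # clocks per 16th-note, as in the real code
my @queue;
my ($plays, $drains) = (0, 0);

for my $ticks (1 .. 24) {
    if ($ticks % $sixteenth == 0) {
        push @queue, 'hit';    # stand-in for queued note_on events
        $plays++;
    }
    elsif ($ticks % $sixteenth == 1 && @queue) {
        @queue = ();           # stand-in for sending note_off messages
        $drains++;
    }
}
# The drain branch now fires only on the tick right after a play,
# instead of on all 23 intermediate ticks.
```

The note lengths stay the same (one clock tick), but the while loop condition is no longer evaluated on every intermediate tick.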

On Windows, this works fine:

perl clocked-euclidean-drums.pl "gs wavetable" 90

To run with fluidsynth and hear the General MIDI percussion sounds, open a fresh terminal session, and start up fluidsynth like so (macOS syntax):

fluidsynth -a coreaudio -m coremidi -g 2.0 ~/Music/soundfont/FluidR3_GM.sf2

The FluidR3_GM.sf2 is a MIDI “soundfont” file and can be downloaded for free.

Next, enter this on the command-line (back in the previous terminal session):

perl clocked-euclidean-drums.pl fluid 90

You will hear standard kick, snare, and closed hihat cymbal. And here is a poor recording of this with my phone:

To run the code with my multi-timbral drum machine, I enter this on the command-line:

perl clocked-euclidean-drums.pl usb 90 -1

And here is what that sounds like:

The Module

I have coded this logic, and a bit more, into a friendly CPAN module. Check out the eg/euclidean.pl example program in the distribution. It is a work in progress. YMMV.

Credits

Thank you to Andrew Rodland (hobbs), who helped me wrap my head around the “no-sleeping asynchronous” algorithm.

To-do Challenges

  • Make patterns other than prime number based Euclidean phrases.

  • Toggle more than two groove parts.

  • Add snare fills to the (end of the) 4th bars. (here’s my version)

  • Make this code handle odd meter grooves.

Resources

Add 5.OLDSUPPORT placeholder and improve wording

Revise following AP comments

Perl commits on GitHub

How to create a lexical sub from a sub ref

r/perl
Welcome to the Week #369 of The Weekly Challenge.
Thank you Team PWC for your continuous support and encouragement.

CPAN Dependencies, static and dynamic

blogs.perl.org

Dependencies or prerequisites are an integral feature of the CPAN software repository. They define what other CPAN modules are required for a particular CPAN distribution to be built, tested, or ultimately to function, as well as optionally to improve or add functionality. To define them properly for a distribution, it is helpful to understand exactly how they will be used, and what all the different distribution files like Makefile.PL, Build.PL, META.json, and MYMETA.json are for.

sidebar: In this post, I will focus on the "requires" relationship for dependencies, which are hard dependencies that must be satisfied, but "recommends" and "suggests" relationships are also defined that indicate strongly and weakly optional dependencies respectively; CPAN installers may install these based on the options that are passed.

Most commonly and at a basic level, dependencies are defined in a (generated) file called META.json in the root of the CPAN distribution, but this may not be the complete picture. CPAN installers historically would determine what is needed at the time that the user requests to install the distribution ("install time"), and though there is now the formal concept of static prerequisites (the most common case where they are the same for every install environment), some distributions need to determine prerequisites at install time, using the original dynamic configuration process.

The Configure Phase

The very first step any CPAN installer takes is to check the primary metadata file META.json, or if that's not present, META.yml (which represents an older metadata specification). In this file, it finds the prereqs key, and within it the configure sub-key. This contains the prerequisites for the "configure" phase of CPAN installation, which largely consists of running the configure script (a file called Makefile.PL or Build.PL), so these dependencies must be installed before doing anything else. This, like the installation of other prerequisites later, is a recursive process, which starts again at the configure phase for each dependency installed in the chain, until it makes its way back up the tree to complete the installation of the originally requested CPAN distribution.

sidebar: The configure phase was not initially part of the specification, so ancient CPAN installers could only use core modules in the configure script; the introduction of configure phase dependencies enabled the configure script to use alternative installers like Module::Build::Tiny in place of the original ExtUtils::MakeMaker. This also meant that Module::Build could be removed from core, and Module::Install's overly clever bundling mechanism was no longer necessary.

The CPAN installer also can check two other metadata keys at this point: dynamic_config and x_static_install. dynamic_config, as the name implies, was initially meant to indicate whether the configure phase needs to be run to dynamically configure the distribution, but unfortunately in practice only indicates whether the prereqs specifically are dynamic. In other words, if this key is present and set to 0 (false), this indicates that the full list of prerequisites in META.json can be used as-is. It does not, however, allow the CPAN installer to skip the configure phase, as the configure script may do anything else a Perl script can do relevant to configuring the distribution, such as determining whether it can be installed at all in the current environment, or even dynamically generating other source files from .PL scripts.

The newer x_static_install is an unofficial metadata key that some installers are beginning to recognize as a second attempt to allow simple distributions to skip dynamic configuration; if this key is set to 1 (true) then the installer may choose to ignore the configure script and install the distribution with its own standard process. As of this writing, the cpm installer and development releases of the cpanm installer implement their own installation procedure by default if this metadata key is set: installing static prerequisites, running tests if requested, and finally copying the files to the install location. Otherwise, installation proceeds through the dynamic process below.
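Putting these keys together, a minimal, illustrative META.json fragment for a simple pure-Perl distribution might look like this (the distribution name and version bounds are placeholders, not from any real distribution):

```json
{
   "name" : "Example-Dist",
   "version" : "1.00",
   "dynamic_config" : 0,
   "x_static_install" : 1,
   "prereqs" : {
      "configure" : {
         "requires" : {
            "ExtUtils::MakeMaker" : "0"
         }
      },
      "runtime" : {
         "requires" : {
            "perl" : "5.006",
            "Exporter" : "0"
         }
      },
      "test" : {
         "requires" : {
            "Test::More" : "0.96"
         }
      }
   }
}
```

With dynamic_config set to 0 and x_static_install set to 1, an installer that understands these keys can take the prereqs as-is and potentially skip the configure script entirely.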

Once the configure prerequisites are installed, the configure script (Makefile.PL or Build.PL) is run. For historical reasons, Build.PL is run if both are present. This script is expected to do a couple specific things to inform the rest of the installation process: create a Makefile or Build script to handle the testing and installation functionality, and create a MYMETA.json (or MYMETA.yml) secondary metadata file. When dynamic_config has not been disabled, this new metadata file will be consulted for the remaining set of prerequisites, so it is the job of the configure script to dynamically determine this new list of prerequisites. Note that, because this happens during the configure phase itself, configure prerequisites cannot be dynamic.

Everything Else

Once prereqs are gathered either statically or from MYMETA, the CPAN installer will then install the prereqs from the other three installation phases: build, test, and runtime. Though it will only be subsequently building and testing the module, the runtime phase is considered a subset that is needed for both of these operations, so these prereqs are also required at this point. The CPAN installer can then build, test, and install the distribution, using the Makefile or Build script (or its own implementation in the case of x_static_install).

Technically, only the runtime phase dependencies are needed after installation, but configure and build prereqs are usually relatively light and may also be present in the runtime phase, so it's usually not worth the effort to exclude these from the runtime environment, but a CPAN installer could decide to only install them temporarily. Any test phase prereqs can be excluded by doing "notest" installations with most CPAN installers, which will avoid installing them at all.

So what does this all mean, as a CPAN author? It largely depends on your choice of authoring tool, but for static dependencies they will often be defined as a straightforward list in a configuration or metadata file, used by the authoring tool to generate the prereqs key of the shipped META.json file. For dynamic dependencies, instead of the authoring tool, the chosen installer used in the configure script (like ExtUtils::MakeMaker or Module::Build) must assemble the final dependency list when it runs at install time. This is often a gritty manual process, but some tools like Dist::Zilla::Plugin::DynamicPrereqs and Dist::Zilla::Plugin::DynamicPrereqs::Meta can help to structure the necessary logic.
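As a rough sketch of that install-time logic (the module names and the version check here are hypothetical, not taken from any real distribution), a Makefile.PL might assemble its prerequisite hash conditionally before handing it to WriteMakefile as PREREQ_PM:

```perl
use strict;
use warnings;

# Static part: the same on every machine.
my %prereqs = (
    'Exporter' => '0',
);

# Dynamic part: decided on the user's machine at configure time.
# (Win32 and Some::Polyfill are illustrative placeholders.)
if ($^O eq 'MSWin32') {
    $prereqs{'Win32'} = '0';
}
if ($] < 5.020) {
    $prereqs{'Some::Polyfill'} = '1.00';
}

# A real Makefile.PL would now call something like:
#   WriteMakefile(NAME => 'Example::Dist', PREREQ_PM => \%prereqs, ...);
# and ExtUtils::MakeMaker would record %prereqs in MYMETA.json
# for the installer to read.
```

The key point is that the conditional logic runs on the end user's machine, so the resulting MYMETA.json can differ from one install environment to the next.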

sidebar: You may be familiar with the cpanfile format for declaring dependencies, but this is not directly relevant to CPAN dependency specification. A cpanfile may be used as an input file to an authoring tool to assemble a distribution's prerequisites into META.json for shipping, or as input directly to applications like cpanm or carton, but CPAN installers won't reference it while installing a standard CPAN distribution (unless specifically used by the configure script such as with ExtUtils::MakeMaker::CPANfile).
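For reference, a cpanfile expressing these kinds of relationships might look like the following (module names illustrative). Again, this is input for authoring tools or for applications like cpanm and carton, not something installers read from a shipped distribution:

```perl
requires 'Exporter', '0';
recommends 'JSON::XS';    # strongly optional: faster JSON handling

on 'configure' => sub {
    requires 'ExtUtils::MakeMaker';
};

on 'test' => sub {
    requires 'Test::More', '0.96';
};
```

An authoring tool would translate these declarations into the corresponding phase and relationship keys under prereqs in the shipped META.json.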

For further reading, I've written more about the structure and history of CPAN authoring in my Dist::Zilla::Starter tutorial.

Every month, I write a newsletter which (among other things) discusses some of the technical projects I’ve been working on. It’s a useful exercise — partly as a record for other people, but mostly as a way for me to remember what I’ve actually done.

Because, as I’m sure you’ve noticed, it’s very easy to forget.

So this month, I decided to automate it.

(And, if you’re interested in the end result, this is also a good excuse to mention that the newsletter exists. Two birds, one stone.)


The Problem

All of my Git repositories live somewhere under /home/dave/git. Over time, that’s become… less organised than it might be. Some repos are directly under that directory, others are buried a couple of levels down, and I’m fairly sure there are a few I’ve completely forgotten about.

What I wanted was:

  • Given a month and a year
  • Find all Git repositories under that directory
  • Identify which ones had commits in that month
  • Summarise the work done in each repo

The first three are straightforward enough. The last one is where things get interesting.


Finding the Repositories

The first step is walking the directory tree and finding .git directories. This is a classic Perl task — File::Find still does exactly what you need.

use v5.40;
use File::Find;

sub find_repos ($root) {
  my @repos;

  find(
    sub {
      return unless $_ eq '.git';
      push @repos, $File::Find::dir;
    },
    $root
  );

  return @repos;
}

This gives us a list of repository directories to inspect. It’s simple, robust, and doesn’t require any external dependencies.

(There are, of course, other ways to do this — you could shell out to fd or find, for example — but keeping it in Perl keeps everything nicely self-contained.)


Getting Commits for a Month

For each repo, we can run git log with appropriate date filters.

sub commits_for_month ($repo, $since, $until) {
  my $cmd = sprintf(
    q{git -C "%s" log --since="%s" --until="%s" --pretty=format:"%%s"},
    $repo, $since, $until
  );

  my @commits = `$cmd`;
  chomp @commits;

  return @commits;
}

Where $since and $until define the month we’re interested in. I’ve been using something like:

my $since = "$year-$month-01";
my $until = "$year-$month-31"; # good enough for this purpose

Yes, that’s a bit hand-wavy around month lengths. No, it doesn’t matter in practice. Sometimes “good enough” really is good enough.
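If you ever do want the exact month end, Time::Piece (in core since Perl 5.10) can supply it. A small sketch:

```perl
use strict;
use warnings;
use Time::Piece;

my ($year, $month) = (2024, '02');

# Parse the first of the month, then ask for the month's last day.
my $first    = Time::Piece->strptime("$year-$month-01", '%Y-%m-%d');
my $last_day = $first->month_last_day;    # handles leap years for us

my $until = sprintf '%s-%s-%02d', $year, $month, $last_day;
```

That said, git is perfectly happy with an impossible date like the 31st of February, so the hand-wavy version above really is good enough here.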


A Small Gotcha

It turns out I have a few repositories where I never got around to making a first commit. In that case, git log helpfully explodes with:

fatal: your current branch 'master' does not have any commits yet

Which is fair enough — but not helpful in a script that’s supposed to quietly churn through dozens of repositories.

The fix is simply to ignore failures:

my @commits = `$cmd 2>/dev/null`;

If there are no commits, we just get an empty list and move on. No warnings, no noise.

This is one of those little bits of defensive programming that makes the difference between a script you run once and a script you’re happy to run every month.


Summarising the Work

Once we have a list of commit messages, we can summarise them.

And this is where I cheated slightly.

I used OpenAPI::Client::OpenAI to feed the commit messages into an LLM and ask it to produce a short summary.

Something along these lines:

use OpenAPI::Client::OpenAI;

sub summarise_commits ($commits) {
  my $client = OpenAPI::Client::OpenAI->new(
    api_key => $ENV{OPENAI_API_KEY},
  );

  my $text = join "\n", @$commits;

  my $response = $client->chat->completions->create({
    model => 'gpt-4.1-mini',
    messages => [{
      role => 'user',
      content => "Summarise the following commit messages:\n\n$text",
    }],
  });

  return $response->choices->[0]->message->content;
}

Is this overkill? Almost certainly.

Could I have written some heuristics to group and summarise commit messages? Possibly.

Would it have been as much fun? Definitely not.

And in practice, it works remarkably well. Even messy, inconsistent commit messages tend to turn into something that looks like a coherent summary of work.


Putting It Together

For each repo:

  1. Get commits for the month
  2. Skip if there are none
  3. Generate a summary
  4. Print the repo name and summary

The output looks something like:

my-project
-----------
Refactored database layer, added caching, and fixed several edge-case bugs.

another-project
---------------
Initial scaffolding, basic API endpoints, and deployment configuration.

Which is already a pretty good starting point for a newsletter.


A Nice Side Effect

One unexpected benefit of this approach is that it surfaces projects I’d forgotten about.

Because the script walks the entire directory tree, it finds everything — including half-finished experiments, abandoned ideas, and repos I created at 11pm and never touched again.

Sometimes that’s useful. Sometimes it’s mildly embarrassing.

But it’s always interesting.


What Next?

This is very much a first draft.

It works, but it’s currently a script glued together with shell commands and assumptions about my directory structure. The obvious next step is to:

  • Turn it into a proper module
  • Add tests
  • Clean up the API
  • Release it to CPAN

At that point, it becomes something other people might actually want to use — not just a personal tool with hard-coded paths and questionable date handling.


A Future Enhancement

One idea I particularly like is to run this automatically using GitHub Actions.

For example:

  • Run monthly
  • Generate summaries for that month
  • Commit the results to a repository
  • Publish them via GitHub Pages

Over time, that would build up a permanent, browsable record of what I’ve been working on.

It’s a nice combination of:

  • automation
  • documentation
  • and a gentle nudge towards accountability

Which is either a fascinating historical archive…

…or a slightly alarming reminder of how many half-finished projects I have.


Closing Thoughts

This started as a small piece of automation to help me write a newsletter. But it’s turned into a nice example of what Perl is still very good at:

  • Gluing systems together
  • Wrapping command-line tools
  • Handling messy real-world data
  • Adding just enough intelligence to make the output useful

And, occasionally, outsourcing the hard thinking to a machine.

The code (such as it currently is) is on GitHub at https://github.com/davorg/git-month-summary.

If you’re interested in the kind of projects this helps summarise, you can find my monthly newsletter over on Substack.

And if I get round to turning this into a CPAN module, I’ll let you know – well, if you’re subscribed to the newsletter!

The post Summarising a Month of Git Activity with Perl (and a Little Help from AI) first appeared on Perl Hacks.

Updates for great CPAN modules released last week. A module is considered great if its favorites count is greater than or equal to 12.

  1. App::DBBrowser - Browse SQLite/MySQL/PostgreSQL databases and their tables interactively.
    • Version: 2.440 on 2026-04-11, with 18 votes
    • Previous CPAN version: 2.439 was released 1 month, 16 days before
    • Author: KUERBIS
  2. Attean - A Semantic Web Framework
    • Version: 0.036 on 2026-04-06, with 19 votes
    • Previous CPAN version: 0.035_01 was released the same day
    • Author: GWILLIAMS
  3. Bio::EnsEMBL - Ensembl Core API
    • Version: 114.0.0 on 2026-04-07, with 83 votes
    • Previous CPAN version: 114.0.0_50 was released 12 days before
    • Author: TAMARAEN
  4. CPANSA::DB - the CPAN Security Advisory data as a Perl data structure, mostly for CPAN::Audit
    • Version: 20260405.001 on 2026-04-05, with 25 votes
    • Previous CPAN version: 20260329.001 was released 6 days before
    • Author: BRIANDFOY
  5. Exporter - Implements default import method for modules
    • Version: 5.79 on 2026-04-06, with 28 votes
    • Previous CPAN version: 5.78 was released 2 years, 3 months, 6 days before
    • Author: TODDR
  6. Image::ExifTool - Read and write meta information
    • Version: 13.55 on 2026-04-07, with 44 votes
    • Previous CPAN version: 13.50 was released 2 months before
    • Author: EXIFTOOL
  7. JSON::Schema::Modern - Validate data against a schema using a JSON Schema
    • Version: 0.637 on 2026-04-08, with 16 votes
    • Previous CPAN version: 0.636 was released the same day
    • Author: ETHER
  8. Mail::Box - complete E-mail handling suite
    • Version: 4.02 on 2026-04-10, with 16 votes
    • Previous CPAN version: 4.01 was released 3 months, 28 days before
    • Author: MARKOV
  9. PDL - Perl Data Language
    • Version: 2.104 on 2026-04-08, with 102 votes
    • Previous CPAN version: 2.103 was released 1 month, 5 days before
    • Author: ETJ
  10. Pod::Simple - framework for parsing Pod
    • Version: 3.48 on 2026-04-05, with 20 votes
    • Previous CPAN version: 3.48 was released the same day
    • Author: KHW
  11. SPVM - The SPVM Language
    • Version: 0.990156 on 2026-04-08, with 36 votes
    • Previous CPAN version: 0.990155 was released 1 day before
    • Author: KIMOTO
  12. Term::Choose - Choose items from a list interactively.
    • Version: 1.782 on 2026-04-09, with 15 votes
    • Previous CPAN version: 1.781 was released 15 days before
    • Author: KUERBIS
  13. Test2::Harness - A new and improved test harness with better Test2 integration.
    • Version: 1.000170 on 2026-04-10, with 28 votes
    • Previous CPAN version: 1.000169 was released 1 day before
    • Author: EXODIST

Learning XS - Custom Ops

dev.to #perl

This is the next tutorial in my Learning XS series. This post explores one of Perl's most powerful optimisation techniques: Custom Ops. We'll build a real-world example, reimplementing one of my first XS CPAN modules - a Shannon entropy calculator - and explain how to bypass Perl's subroutine call overhead entirely.

So firstly... What are Custom Ops?

When you call a Perl subroutine, Perl executes an entersub op that:

  1. Sets up a new stack frame
  2. Pushes arguments onto the stack
  3. Magical things happen
  4. Jumps to the subroutine code
  5. More magical things can happen
  6. Cleans up and returns

For simple functions called millions of times (like mathematical operations), this overhead dominates execution time. Custom ops let you replace the entire entersub call with a single, specialised op - eliminating that overhead.

Next... What is Shannon Entropy?

Shannon entropy measures the information content (or "randomness") of a string:

  • "aaaa" → 0 bits (no randomness, completely predictable)
  • "abcd" → 2 bits (4 equally likely symbols = log₂(4))
  • Random data → ~8 bits per byte (maximum entropy)

It's used in compression, cryptography, and password strength estimation. A perfect candidate for a small, focused XS module.

In Perl the algorithm for Shannon entropy can look something like:

sub entropy {
    my ($entropy, $len, $p, %t) = (0, length($_[0]));
    $t{$_}++ foreach split '', $_[0];
    foreach (values %t) {
        $p = $_/$len;
        $entropy -= $p * log $p;
    } 
    return $entropy / log 2;
}
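A quick sanity check of the pure-Perl version (the sub is repeated here so the snippet stands alone):

```perl
use strict;
use warnings;

sub entropy {
    my ($entropy, $len, $p, %t) = (0, length($_[0]));
    $t{$_}++ foreach split '', $_[0];
    foreach (values %t) {
        $p = $_/$len;
        $entropy -= $p * log $p;
    }
    return $entropy / log 2;
}

printf "%.3f\n", entropy('aaaa');          # 0.000 - fully predictable
printf "%.3f\n", entropy('abcd');          # 2.000 - four equally likely symbols
printf "%.3f\n", entropy('1223334444');    # 1.846
```

These are the same values the XS tests below will check, so the pure-Perl version doubles as a reference implementation.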

Creating the Distribution

First we will need to create a new Perl distribution. As we have done in the past, we will use Module::Starter.

module-starter --module="Shannon::Entropy::XS" --author="Your Name" --email="your@email.com"

This will create a new directory called Shannon-Entropy-XS with the basic structure of a Perl module.

Next we will need to update the Makefile.PL so that it knows we are using XS. One simple way to do this is to add XSMULTI => 1 to the WriteMakefile call, which tells MakeMaker to look for XS files in the lib/Shannon/Entropy/ directory.

use ExtUtils::MakeMaker;

my %WriteMakefileArgs = (
    NAME             => 'Shannon::Entropy::XS',
    AUTHOR           => q{Your Name <your@email.com>},
    VERSION_FROM     => 'lib/Shannon/Entropy/XS.pm',
    XSMULTI          => 1,
    LIBS             => [''],
    # ... rest of your config
);

WriteMakefile(%WriteMakefileArgs);

Next, let's update the lib/Shannon/Entropy/XS.pm file to load our XS:

package Shannon::Entropy::XS;

use strict;
use warnings;
our $VERSION = '0.01';

require XSLoader;
XSLoader::load("Shannon::Entropy::XS", $VERSION);

use Exporter qw(import);
our @EXPORT_OK = qw(entropy);

1;

Now we can create the XS file. Create a new file called lib/Shannon/Entropy/XS.xs and add the basic structure:

#define PERL_NO_GET_CONTEXT
#include "EXTERN.h"
#include "perl.h"
#include "XSUB.h"

MODULE = Shannon::Entropy::XS  PACKAGE = Shannon::Entropy::XS
PROTOTYPES: ENABLE

That defines the package. If we compile this now, the basic tests added by Module::Starter will pass. To test that:

perl Makefile.PL
make test

Adding the Entropy Function

With the basic structure working, let's write a test for our entropy function. Create or update t/01-test.t:

use Test::More;
use strict;
use warnings;
use Shannon::Entropy::XS qw/entropy/;

is(entropy(''), 0);
is(entropy('0'), 0);
is(sprintf('%.3f', entropy('1223334444')), 1.846);
is(entropy('0123456789abcdef'), 4);
is(sprintf('%.3f', entropy('abcdefghijklmnopqrst123456789!@£[]"')), 5.170);

done_testing();

Now we need to implement the entropy calculation. First, add the C helper functions to our XS file, before the MODULE line:

#define PERL_NO_GET_CONTEXT
#include "EXTERN.h"
#include "perl.h"
#include "XSUB.h"
#include <math.h>

static int makehist(const unsigned char *str, int *hist, int len) {
    int chars[256];
    int histlen = 0, i;
    for (i = 0; i < 256; i++) chars[i] = -1;
    for (i = 0; i < len; i++) {
        int c = (int)str[i];
        if (chars[c] == -1) {
            chars[c] = histlen;
            histlen++;
        }
        hist[chars[c]]++;
    }
    return histlen;
}

static double entropy(const char *str, int len) {
    int hist[256] = {0};
    int histlen, i;
    double out = 0.0;
    if (len == 0) return 0.0;
    histlen = makehist((const unsigned char *)str, hist, len);
    for (i = 0; i < histlen; i++) {
        double p = (double)hist[i] / len;
        out -= p * log2(p);
    }
    return out;
}

MODULE = Shannon::Entropy::XS  PACKAGE = Shannon::Entropy::XS
PROTOTYPES: ENABLE

double
entropy(string)
    SV *string
    CODE:
        STRLEN len;
        const char *str = SvPV(string, len);
        RETVAL = entropy(str, (int)len);
    OUTPUT:
        RETVAL

So what exactly are we doing here? The makehist() function builds a histogram of character frequencies by iterating through the string and mapping each unique character to a slot in the hist array. The chars[256] lookup table tracks which slot each byte value maps to, with -1 meaning "not yet seen". When we encounter a new character, we assign it the next available slot and increment histlen.

The entropy() function then implements the Shannon entropy formula. For each unique character in the histogram, we calculate its probability (count / total length) and accumulate the entropy value. Empty strings return 0 immediately to avoid division by zero.

Finally, the XS wrapper extracts the string from the Perl SV using SvPV() (which gives us both the string pointer and its length) and calls our C function. The result is returned as a double.

Now run make test and our entropy tests should pass. But every call still goes through Perl's entersub op. Let's optimise that with custom ops.

Adding Custom Ops

Custom ops require Perl 5.14+. We'll add three pieces:

  1. An XOP structure (for debugging/introspection)
  2. A PP function (the actual operation)
  3. A call checker (transforms the op tree at compile time)

First we need some backporting, because part of the API we are about to use was only introduced in Perl 5.26. After the last #include, add:

#ifndef OpSIBLING
#  define OpSIBLING(o)              ((o)->op_sibling)
#endif
#ifndef OpMORESIB_set
#  define OpMORESIB_set(o, sib)     ((o)->op_sibling = (sib))
#endif
#ifndef OpLASTSIB_set
#  define OpLASTSIB_set(o, parent)  ((o)->op_sibling = NULL)
#endif

Next, add the XOP declaration after the last #endif from above:

#if PERL_VERSION >= 14
static XOP entropy_xop;
#endif

The XOP (eXtended OP) structure provides metadata - the op's name and description that appear in tools like B::Concise.

The PP Function

Every Perl op has a "pp" (push-pop) function that does the actual work. Add this after your entropy functions, inside a #if PERL_VERSION >= 14 block:

#if PERL_VERSION >= 14

static OP* pp_entropy(pTHX) {
    dSP;
    SV *sv = TOPs;
    STRLEN len;
    const char *str = SvPV(sv, len);
    double result = entropy(str);
    POPs;
    mPUSHn(result);
    RETURN;
}

#endif

A line-by-line explanation of the above code:

  1. dSP - Declares the stack pointer local variable
  2. SV *sv = TOPs - Gets the top of stack (our argument) without popping it
  3. SvPV(sv, len) - Extracts the string and its length
  4. entropy(str) - Calls our C function
  5. POPs - Pops the argument off the stack
  6. mPUSHn(result) - Pushes the result as a mortal (auto-freed) numeric value
  7. RETURN - Returns control to the next op in the chain

The Call Checker

This is where the real magic happens. A call checker is invoked at compile time when Perl sees a call to our function, letting us inspect and transform the op tree. Add this after the pp function, opening another version guard so the #endif at the end of this block has a matching #if:

#if PERL_VERSION >= 14

static OP* entropy_call_checker(pTHX_ OP *entersubop, GV *namegv, SV *ckobj) {
    OP *pushop, *argop, *nextop, *newop;
    PERL_UNUSED_ARG(namegv);
    PERL_UNUSED_ARG(ckobj);

    pushop = cLISTOPx(entersubop)->op_first;
    if (!pushop) return entersubop;

    /* Handle null op wrapper in modern Perl */
    if (pushop->op_type == OP_NULL && cLISTOPx(pushop)->op_first) {
        pushop = cLISTOPx(pushop)->op_first;
    }

    argop = OpSIBLING(pushop);
    if (!argop) return entersubop;

    nextop = OpSIBLING(argop);
    if (!nextop) return entersubop;

    if (OpSIBLING(nextop)) return entersubop;

    OpMORESIB_set(pushop, nextop);
    OpLASTSIB_set(argop, NULL);
    newop = newUNOP(OP_CUSTOM, 0, argop);
    newop->op_ppaddr = pp_entropy;
    op_free(entersubop);
    return newop;
}

#endif

So what is actually happening here? The call checker receives the entersub op that Perl normally uses to call subroutines. Inside an entersub, the child ops form a linked list: pushmark -> arg1 -> arg2 -> ... -> cv. We use cLISTOPx(entersubop)->op_first to get the first child. In modern Perl (5.22+), this is often wrapped in a null op, so we check for that and unwrap it to find the actual pushmark. Then we use OpSIBLING() to navigate to our argument.

Before we do anything destructive, we validate that the call matches our expected signature. The series of null checks and the OpSIBLING(nextop) test ensure we have exactly one argument. If someone called entropy($a, $b) with two arguments, or used the &entropy sigil syntax, we bail out early by returning entersubop unchanged - Perl falls back to the normal subroutine call mechanism.

Once we've validated the call, we surgically detach the argument op from the entersub's child list. OpMORESIB_set(pushop, nextop) rewires pushmark to skip over our argument, linking it directly to the CV. OpLASTSIB_set(argop, NULL) disconnects argop from its siblings entirely, leaving it as a standalone op.

With the argument detached, we create our custom op using newUNOP(OP_CUSTOM, 0, argop). This creates a unary op with our argument as its child. We then assign our pp function to execute when this op runs: newop->op_ppaddr = pp_entropy. Finally, we free the original entersub with op_free() since it's no longer needed, and return our shiny new custom op to take its place in the op tree.

Wiring It Up in BOOT

The BOOT section runs when the module loads. We register our custom op and install the call checker. Add this after your XS entropy function:

BOOT:
#if PERL_VERSION >= 14
{
    CV *entropy_cv;
    XopENTRY_set(&entropy_xop, xop_name, "entropy");
    XopENTRY_set(&entropy_xop, xop_desc, "Shannon entropy calculation");
    Perl_custom_op_register(aTHX_ pp_entropy, &entropy_xop);
    entropy_cv = get_cv("Shannon::Entropy::XS::entropy", 0);
    if (entropy_cv) {
        cv_set_call_checker(entropy_cv, entropy_call_checker, (SV *)entropy_cv);
    }
}
#endif

The first thing we do is set up the XOP metadata using XopENTRY_set(). This registers the name "entropy" and a description "Shannon entropy calculation" with our custom op. These strings appear in debugging tools like B::Concise and Devel::Peek, making it much easier to understand op trees when things do go wrong.

Next, Perl_custom_op_register() connects our pp function to the XOP structure. This tells Perl's introspection system that whenever it encounters an OP_CUSTOM with our pp_entropy address, it should use our XOP metadata for display purposes.

Finally, we wire up the call checker itself. get_cv() retrieves the CV (code value) for our Shannon::Entropy::XS::entropy function - this is the subroutine object that Perl created when it compiled the XSUB. We then call cv_set_call_checker() to install our call checker on that CV. From now on, whenever Perl compiles a call to entropy(), it will invoke our entropy_call_checker function, giving us the chance to transform the op tree.

Now run make clean && make test and all tests should still pass - but now with custom ops!

Verifying It Works

You can use B::Concise to see the optimised op tree. Create a simple test script:

use B::Concise;
use Shannon::Entropy::XS qw(entropy);

B::Concise::compile('-exec', sub { entropy("test") })->();

Without custom ops, you'd see entersub. With them, you'll see our entropy custom op directly in the tree - proof that we've eliminated the subroutine call overhead!

B::Concise::compile(CODE(0x12b02b4f0))
1  <;> nextstate(main 1816 test.pl:4) v
2  <$> const[PV "test"] s
3  <0> entropy K/1
4  <1> leavesub[1 ref] K/REFC,1

Okay, at this point you may be asking why you would replace 9 lines of code with 111. The answer is the performance improvement you've just achieved. Take this quick benchmark as justification:

Benchmark: Short string (5 chars)
----------------------------------------
                           Rate     Shannon::Entropy Shannon::Entropy::XS
Shannon::Entropy       908579/s                   --                 -95%
Shannon::Entropy::XS 18023002/s                1884%                   --

Benchmark: Medium string (43 chars)
----------------------------------------
                          Rate     Shannon::Entropy Shannon::Entropy::XS
Shannon::Entropy      165257/s                   --                 -98%
Shannon::Entropy::XS 8525479/s                5059%                   -- 

Extension: Alternative Approaches to Custom Ops

So far we've covered what I consider the proper way to implement custom ops. But you may encounter other patterns in existing code, and it's worth understanding why some approaches are better than others.

The Proper Way (What We Did)

Our call checker restructures the op tree properly:

  1. Detaches arguments from the entersub
  2. Creates a new OP_CUSTOM unop
  3. Frees the old entersub entirely
newop = newUNOP(OP_CUSTOM, 0, argop);
newop->op_ppaddr = pp_entropy;
op_free(entersubop);
return newop;

This is more code, but it's correct. The op tree is clean, B::Concise shows your custom op, and it works in all contexts.

The Quick Hack (Avoid This)

Some other modules take a shortcut - they just swap the op_ppaddr pointer without restructuring the tree:

static OP* hack_call_checker(pTHX_ OP *entersubop, GV *namegv, SV *protosv) {
    PERL_UNUSED_ARG(namegv);
    PERL_UNUSED_ARG(protosv);
    entersubop->op_ppaddr = my_custom_op;
    return entersubop;  /* Return unchanged! */
}

The problem is that your pp function now has to deal with the entersub's stack frame, using TOPMARK/POPMARK manipulation:

static OP* my_custom_op(pTHX) {
    dSP;
    I32 ax = TOPMARK + 1;
    I32 items = (SP - PL_stack_base - TOPMARK) - 1;
    if (items < 1) croak("requires at least 1 argument");
    SV *sv = PL_stack_base[ax];
    /* ... do work ... */
    SP = PL_stack_base + POPMARK;
    EXTEND(SP, 1);
    PUSHs(newSVnv(result));  /* Memory leak! */
    PUTBACK;
    return NORMAL;
}

Why avoid this?

  • Fragile - Breaks when your function is called inside other expressions
  • Memory leaks - Easy to forget mortalization (note newSVnv() above leaks!)
  • Debugging confusion - B::Concise still shows entersub, not your custom op
  • Complex code - The pp function is harder to understand and maintain

Shared Call Checker (For Multiple Functions)

If you're building a library with many similar functions (like a maths library), you can use a generic call checker that takes a function pointer:

static OP* vec_unary_call_checker(pTHX_ OP *entersubop, GV *namegv, SV *ckobj, OP* (*pp_func)(pTHX)) {
    OP *pushop, *selfop, *nextop, *newop;
    PERL_UNUSED_ARG(namegv);
    PERL_UNUSED_ARG(ckobj);

    pushop = cUNOPx(entersubop)->op_first;
    if (!OpHAS_SIBLING(pushop)) return entersubop;

    selfop = OpSIBLING(pushop);
    if (!selfop) return entersubop;

    nextop = OpSIBLING(selfop);
    if (!nextop) return entersubop;

    if (OpSIBLING(nextop)) return entersubop;

    OpMORESIB_set(pushop, nextop);
    OpLASTSIB_set(selfop, NULL);

    newop = newUNOP(OP_CUSTOM, 0, selfop);
    newop->op_ppaddr = pp_func;
    op_free(entersubop);
    return newop;
}

Then create thin wrappers for each function:

static OP* vec_sum_call_checker(pTHX_ OP *entersubop, GV *namegv, SV *ckobj) {
    return vec_unary_call_checker(aTHX_ entersubop, namegv, ckobj, pp_vec_sum);
}

static OP* vec_mean_call_checker(pTHX_ OP *entersubop, GV *namegv, SV *ckobj) {
    return vec_unary_call_checker(aTHX_ entersubop, namegv, ckobj, pp_vec_mean);
}

static OP* vec_min_call_checker(pTHX_ OP *entersubop, GV *namegv, SV *ckobj) {
    return vec_unary_call_checker(aTHX_ entersubop, namegv, ckobj, pp_vec_min);
}

This keeps your code DRY while still doing proper op tree restructuring.

Binary Ops (Two Arguments)

For functions taking two arguments, use newBINOP instead of newUNOP:

static OP* binary_call_checker(pTHX_ OP *entersubop, GV *namegv, SV *ckobj) {
    OP *pushop, *arg1, *arg2, *nextop, *newop;
    PERL_UNUSED_ARG(namegv);
    PERL_UNUSED_ARG(ckobj);

    pushop = cUNOPx(entersubop)->op_first;
    if (!OpHAS_SIBLING(pushop)) return entersubop;

    arg1 = OpSIBLING(pushop);
    if (!arg1) return entersubop;

    arg2 = OpSIBLING(arg1);
    if (!arg2) return entersubop;

    nextop = OpSIBLING(arg2);
    if (!nextop) return entersubop;

    if (OpSIBLING(nextop)) return entersubop;

    OpMORESIB_set(pushop, nextop);
    OpLASTSIB_set(arg1, NULL);
    OpLASTSIB_set(arg2, NULL);

    newop = newBINOP(OP_CUSTOM, 0, arg1, arg2);
    newop->op_ppaddr = pp_my_binary_func;
    op_free(entersubop);
    return newop;
}

The pp function for a binary op pops two values:

static OP* pp_my_binary_func(pTHX) {
    dSP;
    SV *right = POPs;
    SV *left = POPs;
    double result = SvNV(left) + SvNV(right);  /* combine left and right */
    mPUSHn(result);
    RETURN;
}

If you have any further questions then please do post them below.

Many years ago I wrote a monitoring tool that collects data using RRD (actually using the Perl module RRDs). The OS at that time was SLES 10 (rrdtool 1.2012 and rrdtool 1.3007).

Since upgrading to SLES 15 SP6 (perl-rrdtool-1.8.0-150600.1.4.x86_64) I see that my TICK boxes overlap their labels.

The bottom of the graphs look like this:

Screenshot showing bottom of incorrect RRD graph

The purpose of the TICKs is just a kind of "color legend" for a bar graph (green/yellow/orange/red) classifying a status not shown.

The part responsible for drawing that bottom section looks like this (the @graph_args array from RRDs::graph($fname, @settings, @graph_args)):

...
51  'COMMENT:    Last\\t Minimum\\t Average\\t Maximum\\t  Sample\\n'
52  'GPRINT:S_A:LAST:  %6.2lf\\t\\g'
53  'GPRINT:V_S_I:  %6.2lf\\t\\g'
54  'GPRINT:V_S_A:  %6.2lf\\t\\g'
55  'GPRINT:V_S_X:  %6.2lf\\t\\g'
56  'LINE:S_I'
57  'AREA:S_R#F061::STACK'
58  'LINE1:S_I#F066'
59  'LINE1:S_X#F066'
60  'LINE1:S_A#F06:System  \\n'
61  'GPRINT:P_A:LAST:  %6.2lf\\t\\g'
62  'GPRINT:V_P_I:  %6.2lf\\t\\g'
63  'GPRINT:V_P_A:  %6.2lf\\t\\g'
64  'GPRINT:V_P_X:  %6.2lf\\t\\g'
65  'LINE:P_I'
66  'AREA:P_R#C601::STACK'
67  'LINE1:P_I#C606'
68  'LINE1:P_X#C606'
69  'LINE1:P_A#C60:Peer    \\n'
70  'GPRINT:C_A:LAST:  %6.2lf\\t\\g'
71  'GPRINT:V_C_I:  %6.2lf\\t\\g'
72  'GPRINT:V_C_A:  %6.2lf\\t\\g'
73  'GPRINT:V_C_X:  %6.2lf\\t\\g'
74  'LINE:C_I'
75  'AREA:C_R#3F31::STACK'
76  'LINE1:C_I#3F36'
77  'LINE1:C_X#3F36'
78  'LINE1:C_A#3F3:Clock   \\n'
79  'GPRINT:A_A:LAST:  %6.2lf\\t\\g'
80  'GPRINT:V_A_I:  %6.2lf\\t\\g'
81  'GPRINT:V_A_A:  %6.2lf\\t\\g'
82  'GPRINT:V_A_X:  %6.2lf\\t\\g'
83  'LINE:A_I'
84  'AREA:A_R#06F1::STACK'
85  'LINE1:A_I#06F6'
86  'LINE1:A_X#06F6'
87  'LINE1:A_A#06F:Total   \\n'
88  'TICK:S0X#F00C:0.06'
89  'TICK:S1X#F90C:0.06'
90  'TICK:S2X#FF0C:0.06'
91  'TICK:S3X#0C09:0.06'
92  'TICK:S0A#F00C:0.04:Failed'
93  'TICK:S1A#F90C:0.04:Bad'
94  'TICK:S2A#FF0C:0.04:Marginal'
95  'TICK:S3A#0C09:0.04:Good'
96  'TICK:S0I#F00C:0.02'
97  'TICK:S1I#F90C:0.02'
98  'TICK:S2I#FF0C:0.02'
99  'TICK:S3I#0C09:0.02'

(The command is assembled in some complex way, so showing that code would just distract from the actual problem; therefore I used the Perl debugger to extract the essential settings.)

Here is an example of how the bottom of the graph looked with the older version of rrdtool (the code calling it had not been changed):

bottom of rrdtool graph showing correct TICKs

The manual claims the syntax for TICKs is still:

TICK:vname#rrggbb[aa][:fraction[:legend]]

Package Versions

As the issue may be outside of rrdtool, I'll add more package versions:

> rpm -qa rrdtool librrd libpango\*
libpango-1_0-0-32bit-1.51.1-150600.1.3.x86_64
libpangomm-1_4-1-2.46.3-150600.1.2.x86_64
libpango-1_0-0-1.51.1-150600.1.3.x86_64
rrdtool-1.8.0-150600.1.4.x86_64

I saw that rrdtool-1.9.0-3.fc42.x86_64 in Fedora 42 has the same issue:

$ rpm -qa rrdtool rrd\* pango\* cairo\*
cairo-1.18.2-3.fc42.x86_64
cairo-gobject-1.18.2-3.fc42.x86_64
cairomm-1.14.5-8.fc42.x86_64
cairomm1.16-1.18.0-8.fc42.x86_64
pangomm-2.46.4-3.fc42.x86_64
pangomm2.48-2.56.1-1.fc42.x86_64
pango-1.56.4-2.fc42.x86_64
rrdtool-1.9.0-3.fc42.x86_64

Simplified test case

I managed to create a simplified test case:

First the sample RRD database as dump:

<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE rrd SYSTEM "https://oss.oetiker.ch/rrdtool/rrdtool.dtd">
<!-- Round Robin Database Dump -->
<rrd>
        <version>0003</version>
        <step>60</step> <!-- Seconds -->
        <lastupdate>1751958726</lastupdate> <!-- 2025-07-08 09:12:06 CEST -->

        <ds>
                <name> N_SYS </name>
                <type> GAUGE </type>
                <minimal_heartbeat>105</minimal_heartbeat>
                <min>0.0000000000e+00</min>
                <max>NaN</max>

                <!-- PDP Status -->
                <last_ds>1</last_ds>
                <value>6.0000000000e+00</value>
                <unknown_sec> 0 </unknown_sec>
        </ds>

        <ds>
                <name> N_PEER </name>
                <type> GAUGE </type>
                <minimal_heartbeat>105</minimal_heartbeat>
                <min>0.0000000000e+00</min>
                <max>NaN</max>

                <!-- PDP Status -->
                <last_ds>0.7</last_ds>
                <value>4.2000000000e+00</value>
                <unknown_sec> 0 </unknown_sec>
        </ds>

        <ds>
                <name> N_ALL </name>
                <type> GAUGE </type>
                <minimal_heartbeat>105</minimal_heartbeat>
                <min>0.0000000000e+00</min>
                <max>NaN</max>

                <!-- PDP Status -->
                <last_ds>0.769230769230769</last_ds>
                <value>4.6153846154e+00</value>
                <unknown_sec> 0 </unknown_sec>
        </ds>

        <!-- Round Robin Archives -->
        <rra>
                <cf>AVERAGE</cf>
                <pdp_per_row>1</pdp_per_row> <!-- 60 seconds -->

                <params>
                <xff>9.0000000000e-01</xff>
                </params>
                <cdp_prep>
                        <ds>
                        <primary_value>1.0000000000e+00</primary_value>
                        <secondary_value>1.0000000000e+00</secondary_value>
                        <value>NaN</value>
                        <unknown_datapoints>0</unknown_datapoints>
                        </ds>
                        <ds>
                        <primary_value>7.0000000000e-01</primary_value>
                        <secondary_value>6.0000000000e-01</secondary_value>
                        <value>NaN</value>
                        <unknown_datapoints>0</unknown_datapoints>
                        </ds>
                        <ds>
                        <primary_value>7.6923076923e-01</primary_value>
                        <secondary_value>6.9230769231e-01</secondary_value>
                        <value>NaN</value>
                        <unknown_datapoints>0</unknown_datapoints>
                        </ds>
                </cdp_prep>
                <database>
                        <!-- 2025-07-08 09:12:00 CEST / 1751958720 --> <row><v>1.0000000000e+00</v><v>8.0000000000e-01</v><v>8.4615384615e-01</v></row>
                </database>
        </rra>
</rrd>

The commands to create a demo.png:

$ rrdtool restore /tmp/demo.dump /tmp/demo.rrd
$ rrdtool graph /tmp/demo.png 'DEF:S_A=/tmp/demo.rrd:N_SYS:AVERAGE' 'CDEF:S1A=S_A,0.30,GT,S_A,UNKN,IF' 'TICK:S1A#F90C:0.04:Bad'

This results in this image: Output of test case

How you too can improve Perl 5

blogs.perl.org

tl;dr

Choose a bug that has a reproducing case from the list of open issues. Get the Perl source from GitHub, and put the reproducing case into a test file that we have designed for the purpose, and submit a pull request to GitHub for it. That's it.

The test is marked as expected to fail. Having this test in place helps us in several ways. One is that sometimes having to write a test is the straw that leads a developer to decide it's just too much trouble to tackle the bug; with the test already written, that barrier is gone. Another is that, not all that infrequently, a change made for some other reason fixes one of these bugs without our knowing it. Having this test in place would notify us that the bug is now fixed. It might even be that your work would show that the bug is already fixed.

We're happy to add you to the list of people who have contributed over the years to Perl's development, or to have you remain anonymous if you prefer.

If you've ever thought about contributing to open source, this is a great way to get started, without requiring much effort. Then you can see if doing this suits you.

Even if not, this is a way to easily improve your programming skills. The open bugs are generally things that aren't in the mainstream, so you get to see edge cases, the like of which you may never have thought about.

Details

The Perl interpreter has about 2200 open issues. That number is dauntingly large, and I believe it leads people to avoid tackling them: "What's the use!"

I recently skimmed over the list, and started to look closer at ones I thought I might know something about or have the background knowledge to be able to address. I created my own, much smaller, list of ones to eventually look at. That means I don't have to do anything more right away, just when I have the time and inclination to look at some of them.

I haven't looked in detail at most of the ones I tagged, but even so, this little exercise led to the closing of about 1% of all the open issues. I'm familiar with some portions of the Perl implementation, but have never touched much of it. The number of already-fixed bugs has got to be quite a bit larger than the 1% I quickly found.

This file in Perl's GitHub repository has been designed specifically to contain these kinds of tests. It has detailed instructions at the top about how to add a test to it.

Some of the open issues contain code that would be trivial to add to that file. Others would be more challenging. Still others don't lend themselves to doing this at all (say the ones that wish Perl had some feature or another). Choose whatever you are comfortable with.

perlhack tells you how to work with GitHub and submit your change.

perlhacktips and perlgit go into more detail.

If you run into problems, you can ask on this IRC channel.

And read any comments below that may be added to clarify things.

We're avid Perl programmers but we have been really wanting to get into Haskell or Erlang or something similar, though we don't know where to start. Any languages you guys recommend? If so, send some good tutorials [or give us a rundown yourself :>]

We must add that we're looking for pure-ish functional languages. Lisp syntax doesn't really sit right with us either so we don't really wish to use those.

edit [2026-04-05]: clarified why we don't accept lisp languages as suggestions

I wrote a follow-up to my previous article on dev.to. It is titled Manage the health of your CLI tools at scale and explores how you might approach instrumenting your CLI tools to ensure they run properly.

Caching is one of those things every application needs eventually. Whether it's DNS lookups, database rows, or computed results, keeping frequently accessed data close avoids expensive recomputation. The classic data structure for this is the Least Recently Used (LRU) cache: a fixed-size store that automatically evicts the oldest unused entry when full.

LRU::Cache is a new CPAN module that implements an LRU cache entirely in C via XS. Every operation — get, set, delete, exists — is O(1). No Perl-level hash ties, no linked list objects, no method dispatch on the hot path if you don't want it.

Quick Start

use LRU::Cache;

my $cache = LRU::Cache::new(1000);

$cache->set('user:42', { name => 'Alice', role => 'admin' });
$cache->set('user:43', { name => 'Bob',   role => 'editor' });

my $user = $cache->get('user:42');   # promotes to front
print $user->{name};                 # "Alice"

The constructor takes a single argument: the maximum number of entries. Once the cache hits capacity, the next set evicts whatever was accessed longest ago.

The Eviction Model

Every get or set promotes an entry to the front of the list. The entry at the tail — the one untouched for the longest — is the first to go when the cache is full.

my $cache = LRU::Cache::new(3);

$cache->set('a', 1);
$cache->set('b', 2);
$cache->set('c', 3);

# Access 'a' — it moves to the front
$cache->get('a');

# Cache is full. This evicts 'b' (the least recently used).
$cache->set('d', 4);

# 'b' is gone
print $cache->get('b');   # undef

After the get('a'), the internal order is a → c → b. Adding d evicts b from the tail.

Peeking Without Promoting

Sometimes you want to check a value without affecting the eviction order. peek does exactly that:

my $cache = LRU::Cache::new(3);
$cache->set('a', 1);
$cache->set('b', 2);
$cache->set('c', 3);

# peek returns the value but doesn't promote
my $val = $cache->peek('a');   # 1

# 'a' is still the oldest
my ($oldest_key) = $cache->oldest;   # 'a'

# Now get('a') promotes it
$cache->get('a');
($oldest_key) = $cache->oldest;      # 'b'

This is useful when you're iterating over cache contents for monitoring or debugging without disturbing the access pattern.

Inspecting the Cache

oldest and newest return both the key and value:

my $cache = LRU::Cache::new(5);
$cache->set('first',  'I was first');
$cache->set('second', 'I was second');
$cache->set('third',  'I was third');

my ($newest_key, $newest_val) = $cache->newest;
# $newest_key = 'third', $newest_val = 'I was third'

my ($oldest_key, $oldest_val) = $cache->oldest;
# $oldest_key = 'first', $oldest_val = 'I was first'

keys returns all keys in most-recently-used order:

$cache->get('first');   # promote 'first'
my @keys = $cache->keys;
# ('first', 'third', 'second')

And delete hands back whatever was removed:

my $deleted = $cache->delete('second');
# $deleted = 'I was second'

The Function-Style API

Method dispatch in Perl isn't free. For a cache that sits in a tight loop, the overhead of $cache->get(...) — which goes through entersub, method resolution, and argument shifting — adds up. LRU::Cache offers a custom-op-optimised function-style API that eliminates this entirely:

use LRU::Cache qw(import);

my $cache = LRU::Cache::new(10_000);

lru_set($cache, 'key', $value);
my $v   = lru_get($cache, 'key');
my $hit = lru_exists($cache, 'key');
my $p   = lru_peek($cache, 'key');
lru_delete($cache, 'key');

As I say, these aren't just wrappers. At compile time, Perl's call checker mechanism replaces the entersub op with a custom op that goes directly to the C implementation. No method lookup, no @_ construction, no argument unpacking. The cache pointer is read directly off the pad.

The result is roughly 2x faster than the method-style API, and 5–20x faster than a pure Perl implementation.

Benchmarks

Single-operation rates, no batching loops — each iteration is one cache call:

Operation        Pure Perl     XS Method      XS Function
set (existing)   1,715,893/s   21,445,932/s   45,124,820/s
get (hit)        5,271,976/s   18,302,304/s   33,017,565/s
get (miss)       8,133,594/s   22,125,453/s   46,998,887/s
exists (hit)     7,080,776/s   19,521,882/s   38,745,035/s
peek             6,410,022/s   16,842,291/s   32,437,007/s

A Practical Example: Caching Expensive Lookups

Here's a pattern that comes up constantly — wrapping an expensive operation with a cache:

use LRU::Cache qw(import);

my $dns_cache = LRU::Cache::new(1000);

sub resolve {
    my ($domain) = @_;
    my $ip = lru_get($dns_cache, $domain);
    if (!defined $ip) {
        $ip = expensive_dns_lookup($domain);
        lru_set($dns_cache, $domain, $ip);
    }
    return $ip;
}

The cache transparently sits in front of the expensive call. Hits are served from C-backed memory in constant time. Misses fall through to the real lookup and get cached for next time. When the cache fills up, stale domains are evicted automatically.

Install

cpanm LRU::Cache

One can define a lexically-scoped sub as follows:

use feature qw( lexical_subs );

{
   my sub foo {
      ...
   }

   # This code can call `foo`.
}

# This code can't.

Is there a way of building a lexically-scoped sub from a reference?


I thought the (currently-experimental) refaliasing feature might be the way.

use feature qw( lexical_subs refaliasing declared_refs );

my \&foo = sub { ... };

This doesn't work, nor does any variant I tried.

So What Is A Binary Heap?

A binary heap is really just an array pretending to be a tree. Each element has a parent and children, but instead of pointers you find them with simple array indexes. The trick is that every parent follows one rule relative to its children, and that rule decides what kind of heap you get:

  • Min heap - every parent is less than or equal to its children. The smallest element always lives at the root/top.
  • Max heap - every parent is greater than or equal to its children. The largest element always lives at the root/top.

Because the tree is complete, the parent of element at index i sits at index floor((i−1)/2), and its children sit at 2i+1 and 2i+2. No pointers, no linked lists — just arithmetic on array indices. That cache friendly layout is one reason heaps are so fast in practice.

The key operations:

Operation   Time      What it does
push        O(log n)  Insert a value, then sift it up to restore order
pop         O(log n)  Remove the root, move the last element up, sift down
peek        O(1)      Read the root without removing it
make_heap   O(n)      Turn an existing unsorted array into a valid heap (make_heap_min / make_heap_max, new('min'), new('max'))

This makes a heap the natural backbone of a priority queue — a collection where you always want the next most important item and you need insertions to stay fast.

Why Heap::PQ?

Heap::PQ brings binary heaps to Perl as a C extension. Where a pure Perl heap tops out around 300 operations per second, Heap::PQ's functional interface clocks over 11,000 when pushing 1,000 random integers — and if you are just handling plain integers, there is an optimised NV heap implementation that squeezes out 50% more performance. It ships with three interfaces (object oriented, functional custom ops, or a raw array) so you can trade readability for throughput depending on what your code needs.

Installation

cpanm Heap::PQ

Getting Started

Create a heap, push values in, and they come back out in priority order:

use Heap::PQ;

my $pq = Heap::PQ::new('min');

$pq->push(5);
$pq->push(1);
$pq->push(3);

print $pq->peek, "\n";  # 1 — the smallest value is always at the root

while (!$pq->is_empty) {
    print $pq->pop, "\n";  # 1, 3, 5
}

Switch to 'max' and the largest element comes out first instead.

Three Ways to Use It

OO Interface

The object oriented API is the clearest to read if you're a traditional Perl developer who likes OO. push returns the heap so calls can be chained:

my $heap = Heap::PQ::new('min');
$heap->push(10)->push(20)->push(5);
$heap->push_all(8, 2, 15);

my $top = $heap->pop;    # 2
my $size = $heap->size;  # 5

Functional Interface

Import with 'import' and Heap::PQ installs heap_push, heap_pop, heap_peek, heap_size, and heap_is_empty as custom ops. Perl compiles them down to optimised opcodes so there is no method dispatch overhead at runtime:

use Heap::PQ 'import';

my $h = Heap::PQ::new('min');

heap_push($h, 42);
heap_push($h, 7);

my $val  = heap_pop($h);      # 7
my $size = heap_size($h);     # 1
my $top  = heap_peek($h);     # 42

This is the fastest path: as stated, the functional custom ops remove the method-dispatch overhead.

Raw Array Interface

You can also operate directly on a Perl array. Import with 'raw' to get make_heap_min, push_heap_min, pop_heap_min (and their _max counterparts). This skips the object entirely but the functional interface is still faster thanks to custom op optimisation:

use Heap::PQ 'raw';

my @data = (9, 4, 7, 1, 3);
Heap::PQ::make_heap_min(\@data);  # O(n) heapify in place

my $min = Heap::PQ::pop_heap_min(\@data);   # 1
Heap::PQ::push_heap_min(\@data, 2);

No object, no opaque struct — just an array whose layout satisfies the heap property. Useful when you already have data in an array and want to avoid copying it.

Custom Comparators

By default the heap compares values numerically. Pass a code reference as a second argument to new and you can heap order anything — hash references, objects, strings:

my $tasks = Heap::PQ::new('min', sub {
    $a->{due} <=> $b->{due}
});

$tasks->push({ name => 'Write tests',  due => 3 });
$tasks->push({ name => 'Ship release', due => 5 });
$tasks->push({ name => 'Fix bug',      due => 1 });

while (!$tasks->is_empty) {
    my $t = $tasks->pop;
    print "$t->{name}\n";
}
# Fix bug
# Write tests
# Ship release

The comparator receives two elements in $a and $b — exactly like Perl's sort — and should return a negative, zero, or positive value. String ordering works the same way:

my $alpha = Heap::PQ::new('min', sub { $a cmp $b });
$alpha->push_all('banana', 'apple', 'cherry');
print $alpha->pop, "\n";  # apple

Key Path Comparators

When your heap contains hash references and you just want to compare by a numeric field, pass the field name as a string instead of a code reference. The comparison happens entirely in C — no Perl callback overhead at all:

my $tasks = Heap::PQ::new('min', 'priority');

$tasks->push({ name => 'Write tests',  priority => 3 });
$tasks->push({ name => 'Ship release', priority => 5 });
$tasks->push({ name => 'Fix bug',      priority => 1 });

while (!$tasks->is_empty) {
    my $t = $tasks->pop;
    print "$t->{name}\n";
}
# Fix bug
# Write tests
# Ship release

Dot separated paths reach into nested hashes:

my $heap = Heap::PQ::new('min', 'meta.score');
$heap->push({ id => 'a', meta => { score => 42 } });
$heap->push({ id => 'b', meta => { score => 17 } });
print $heap->pop->{id};  # 'b'

The key is extracted once when you push, so sift operations run at the same speed as a plain numeric heap. In benchmarks, key path heaps match the no-comparator path (~6,400/s) while custom Perl comparators top out around ~1,200/s — a 5× difference.

NV Heaps — Native Doubles, No SV Overhead

When every value is a number, new_nv stores them as raw C doubles instead of Perl scalars. That eliminates SV allocation and reference counting entirely:

my $h = Heap::PQ::new_nv('min');
$h->push_all(3.14, 2.71, 1.41, 1.73);
print $h->pop, "\n";  # 1.41

In benchmarks the NV heap runs roughly 1.5× faster than the functional interface for push-heavy workloads, and about 55× faster than pure Perl.

Searching and Deleting

Both search and delete accept a callback that receives each element in $_. search returns matching elements; delete removes them and returns how many were dropped:

my $pq = Heap::PQ::new('min');
$pq->push_all(1, 5, 10, 15, 20, 25);

# Find all elements greater than 12
my @big = $pq->search(sub { $_ > 12 });
# @big = (15, 20, 25)

# Remove them from the heap
my $removed = $pq->delete(sub { $_ > 12 });
# $removed = 3, heap now contains (1, 5, 10)

After a delete, Heap::PQ rebuilds the heap in O(n) with Floyd's algorithm so the ordering invariant is always intact.
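Floyd's construction sifts down each internal node starting from the last parent, which sums to O(n) rather than the O(n log n) of repeated pushes. A pure-Perl sketch of the idea (illustrative only; Heap::PQ does this in C):

```perl
use strict;
use warnings;

# Illustrative pure-Perl Floyd heapify: sift down every internal node,
# from the last parent back to the root. Total work is O(n).
sub heapify_min {
    my ($a) = @_;
    my $n = @$a;
    for (my $i = int($n / 2) - 1; $i >= 0; $i--) {
        my $p = $i;
        while (1) {
            my $smallest = $p;
            for my $c (2 * $p + 1, 2 * $p + 2) {
                $smallest = $c if $c < $n && $a->[$c] < $a->[$smallest];
            }
            last if $smallest == $p;
            @$a[$p, $smallest] = @$a[$smallest, $p];   # swap, keep sifting down
            $p = $smallest;
        }
    }
}

my @data = (9, 4, 7, 1, 3);
heapify_min(\@data);
print "$data[0]\n";   # 1 (the minimum is now at the root)
```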

Peeking at the Top N

peek_n returns the top N elements in sorted order without removing them:

my $scores = Heap::PQ::new('max');
$scores->push_all(88, 95, 72, 99, 84, 91);

my @top3 = $scores->peek_n(3);  # (99, 95, 91)
# Heap is unchanged — still has all 6 elements

peek_n works with the heap's built-in numeric comparison, so it is best suited to plain numeric heaps. For heaps that use a custom comparator, pop and re-push to inspect the top N.

Putting It Together: An Event Scheduler

A priority queue is a natural fit for an event loop — push events with a timestamp, pop them in chronological order:

use Heap::PQ 'import';

my $scheduler = Heap::PQ::new('min', 'time'); # compare by the 'time' key in C

heap_push($scheduler, { time => 1712500800, action => 'start_server' });
heap_push($scheduler, { time => 1712497200, action => 'load_config' });
heap_push($scheduler, { time => 1712504400, action => 'open_connections' });
heap_push($scheduler, { time => 1712502600, action => 'warm_cache' });

while (!heap_is_empty($scheduler)) {
    my $event = heap_pop($scheduler);
    printf "t=%d  %s\n", $event->{time}, $event->{action};
}

# t=1712497200  load_config
# t=1712500800  start_server
# t=1712502600  warm_cache
# t=1712504400  open_connections

Benchmarks

All numbers below are from pushing 1,000 random integers then popping them all, measured with Benchmark:

Implementation         Rate       vs Pure Perl
Pure Perl                305/s         1x
Custom comparator      1,194/s         4x
Array::Heap            6,087/s        20x
Heap::PQ OO key path   6,359/s        21x
Heap::PQ OO            6,685/s        22x
Heap::PQ raw           8,665/s        28x
Heap::PQ func         11,416/s        37x
Heap::PQ NV           16,747/s        55x

The custom op overhead reduction is clearly visible in the functional row, and the NV heap's custom op implementation plus avoidance of SV allocation pushes throughput higher still.

If you have any questions please do comment.

A side note: I'm currently on the lookout for a new contract or permanent opportunity. If you know of any relevant openings, I'd appreciate hearing from you - email@lnation.org

TL;DR

Searching for CLI modules on MetaCPAN returns 1690 results. And still I wrote another, Yet Another CLI framework. This is about why mine is the one you want to use.

Introduction

CLI clients, we all write them, we all want them, but the boilerplate is just horrible. I wanted to get rid of it: Burn it with 🔥.

At my previous day job we had something I wrote, or co-wrote; my boss wanted a sort of chef-like API. It became zsknife, but it had a huge downside: you needed to register each command in the main module, and I didn't like that at all. I forked the concept internally and made it so you didn't need to register, but it lacked discovery.

Updates for great CPAN modules released last week. A module is considered great if its favorites count is greater than or equal to 12.

  1. App::Staticperl - perl, libc, 100 modules, all in one standalone 500kb file
    • Version: 1.5 on 2026-04-04, with 21 votes
    • Previous CPAN version: 1.46 was released 4 years, 1 month, 16 days before
    • Author: MLEHMANN
  2. Catalyst::Action::REST - Automated REST Method Dispatching
    • Version: 1.22 on 2026-03-30, with 13 votes
    • Previous CPAN version: 1.21 was released 8 years, 3 months, 25 days before
    • Author: ETHER
  3. CPANSA::DB - the CPAN Security Advisory data as a Perl data structure, mostly for CPAN::Audit
    • Version: 20260329.001 on 2026-03-29, with 25 votes
    • Previous CPAN version: 20260327.002 was released 1 day before
    • Author: BRIANDFOY
  4. Devel::NYTProf - Powerful fast feature-rich Perl source code profiler
    • Version: 6.15 on 2026-03-31, with 199 votes
    • Previous CPAN version: 6.14 was released 2 years, 5 months, 12 days before
    • Author: JKEENAN
  5. Devel::Size - Perl extension for finding the memory usage of Perl variables
    • Version: 0.87 on 2026-03-31, with 22 votes
    • Previous CPAN version: 0.86_50 was released 1 month, 19 days before
    • Author: NWCLARK
  6. Dios - Declarative Inside-Out Syntax
    • Version: 0.002014 on 2026-04-01, with 24 votes
    • Previous CPAN version: 0.002013 was released 1 year, 7 months, 14 days before
    • Author: DCONWAY
  7. Inline::Module - Support for Inline-based CPAN Extension Modules
    • Version: 0.35 on 2026-03-30, with 14 votes
    • Previous CPAN version: 0.34 was released 11 years, 1 month, 12 days before
    • Author: INGY
  8. IPC::Run - system() and background procs w/ piping, redirs, ptys (Unix, Win32)
    • Version: 20260402.0 on 2026-04-02, with 39 votes
    • Previous CPAN version: 20260401.0 was released the same day
    • Author: TODDR
  9. LWP - The World-Wide Web library for Perl
    • Version: 6.82 on 2026-03-29, with 212 votes
    • Previous CPAN version: 6.81 was released 5 months, 6 days before
    • Author: OALDERS
  10. Module::CoreList - what modules shipped with versions of perl
    • Version: 5.20260330 on 2026-03-29, with 45 votes
    • Previous CPAN version: 5.20260320 was released 8 days before
    • Author: BINGOS
  11. Module::Metadata - Gather package and POD information from perl module files
    • Version: 1.000039 on 2026-04-03, with 14 votes
    • Previous CPAN version: 1.000038 was released 2 years, 11 months, 5 days before
    • Author: ETHER
  12. Mouse - Moose minus the antlers
    • Version: v2.6.2 on 2026-04-04, with 63 votes
    • Previous CPAN version: v2.6.1 was released 3 months, 14 days before
    • Author: SYOHEX
  13. perl - The Perl 5 language interpreter
    • Version: 5.042002 on 2026-03-29, with 2251 votes
    • Previous CPAN version: 5.042001 was released 21 days before
    • Author: SHAY
  14. Pod::Simple - framework for parsing Pod
    • Version: 3.48 on 2026-04-04, with 20 votes
    • Previous CPAN version: 3.47 was released 10 months, 19 days before
    • Author: KHW
  15. Sidef - The Sidef Programming Language - A modern, high-level programming language
    • Version: 26.04 on 2026-04-01, with 122 votes
    • Previous CPAN version: 26.01 was released 2 months, 18 days before
    • Author: TRIZEN
  16. SPVM - The SPVM Language
    • Version: 0.990153 on 2026-03-28, with 36 votes
    • Previous CPAN version: 0.990152 was released 2 days before
    • Author: KIMOTO
  17. Sys::Virt - libvirt Perl API
    • Version: v12.2.0 on 2026-04-01, with 17 votes
    • Previous CPAN version: v12.1.0 was released 29 days before
    • Author: DANBERR
  18. Test2::Harness - A new and improved test harness with better Test2 integration.
    • Version: 1.000164 on 2026-04-01, with 28 votes
    • Previous CPAN version: 1.000164 was released the same day
    • Author: EXODIST
  19. WebService::Fastly - an interface to most facets of the [Fastly API](https://www.fastly.com/documentation/reference/api/).
    • Version: 14.01 on 2026-03-31, with 18 votes
    • Previous CPAN version: 14.00 was released 1 month, 14 days before
    • Author: FASTLY
  20. YAML::Syck - Fast, lightweight YAML loader and dumper
    • Version: 1.44 on 2026-04-02, with 18 votes
    • Previous CPAN version: 1.43 was released 1 day before
    • Author: TODDR

This is the weekly favourites list of CPAN distributions. Votes count: 65

Week's winner: App::FatPacker (+2)

Build date: 2026/04/04 19:54:08 GMT


Clicked for first time:

  • App::Chit - chat with AI from the command line
  • App::prepare4release - prepare a Perl distribution for release (skeleton)
  • Context::Preserve - Run code after a subroutine call, preserving the context the subroutine would have seen if it were the last statement in the caller
  • Data::HashMap::Shared - Type-specialized shared-memory hash maps for multiprocess access
  • DB::Handy - Pure-Perl flat-file relational database with DBI-like interface
  • EV::Websockets - WebSocket client/server using libwebsockets and EV
  • Rex::LibSSH - Rex connection backend using Net::LibSSH (no SFTP required)
  • Task::Kensho::All - Install all of Task::Kensho
  • WWW::Tracking - universal website visitors tracking

Increasing its reputation:

Living Perl: From Scripting to Geodynamics

Perl on Medium

A Different Path into Scientific Computing.

I'm using the AD DS28EC20 1-wire EEPROM (originally from Maxim I believe), and I'm trying to use the 32 byte Extended Read with CRC function (command 0xA5). I'm hoping to get some help from some of the CRC experts on this board. My interest would be an implementation in Python, Perl, or C.

https://www.analog.com/en/products/ds28ec20.html

EXTENDED READ MEMORY[A5h] command description

CRC16 Hardware Description and Polynomial

I have written some data to the first 32 bytes of memory. The following is the 32 bytes of data read, plus the 0xA5 command and the two 0x00 address bytes, and the calculated CRC. So I believe that 35 bytes altogether, including the command byte 0xA5 and the two 0x00 address bytes, are needed to reproduce the CRC.

0xA5
0x00
0x00
0x03
0x04
0x05
0x06
0x07
0x08
0x09
0x10
0x11
0x12
0x13
0x14
0x15
0x16
0x17
0x18
0x19
0x20
0x21
0x22
0x23
0x24
0x25
0x26
0x27
0x28
0x29
0x30
0x31
0x32
0x33
0x34


CRC: 0x1181 (decimal 4481, binary 0001000110000001)

I created a down-and-dirty Perl script to calculate the CRC; it reads a file containing the 35 bytes of data.

The data to read from the file, and the script:

0xA5
0x00
0x00
0x03
0x04
0x05
0x06
0x07
0x08
0x09
0x10
0x11
0x12
0x13
0x14
0x15
0x16
0x17
0x18
0x19
0x20
0x21
0x22
0x23
0x24
0x25
0x26
0x27
0x28
0x29
0x30
0x31
0x32
0x33
0x34
#!/usr/bin/env perl

use strict;
use warnings;

my($file_old, $file_new) = @ARGV;
my $index = 0;
my $num;
my $num2;
my @number_set_1;

my $crc;
my $crc_init = 0;
#my $crc_init = 0xFFFF;
my $crc_n;
my $poly = hex("0x8005");
my $poly_reflected = hex("0xA001");

die "$file_old name must have at least three chars, $!" if length($file_old) < 3;
die "$file_new name must have at least three chars, $!" if length($file_new) < 3;

open (IN, "<$file_old") or die "Can't open $file_old, $!";
open (OUT, ">$file_new") or die "Can't open $file_new, $!";

print "\nPOLY: $poly \n\n";

while(my $line = <IN>) {
    if ($line =~ /\S/) {
        chomp($line);
        #print "At Line Number $.: $line";
        #print OUT "$line";
        $number_set_1[$index] = hex($line);
        $num = pack("C", $number_set_1[$index]); # "C" = unsigned char (byte)
        $num2 = unpack("C",$num);
        #print "$index $number_set_1[$index] \n";
        print "$index $number_set_1[$index] $num2 \n";
        $index ++;
    }
}

print"\n\n";

printf("POLY = 0X%04X:\n\n", $poly);

$crc = $crc_init;

for(my $i = 0; $i < 34; $i++) {
    $crc ^= $number_set_1[$i];
    for (1 .. 8) {
        $crc = $crc & 1 ? ($crc >> 1) ^ $poly : $crc >> 1;
    }
    print "$i $number_set_1[$i] $crc\n";
}

$crc_n = ~$crc & 0xFFFF;

print"\nCRC:   $crc\nCRC_N: $crc_n\n";
printf("\nCRC:   0X%04X\nCRC_N: 0X%04X\n", $crc, $crc_n);

print"\n\n";

printf("POLY = 0X%04X:\n\n", $poly_reflected);

$crc = $crc_init;

for(my $i = 0; $i < 34; $i++) {
    $crc ^= $number_set_1[$i];
    for (1 .. 8) {
        $crc = $crc & 1 ? ($crc >> 1) ^ $poly_reflected : $crc >> 1;
    }
    print "$i $number_set_1[$i] $crc\n";
}

$crc_n = ~$crc & 0xFFFF;

print"\nCRC:   $crc\nCRC_N: $crc_n\n";
printf("\nCRC:   0X%04X\nCRC_N: 0X%04X\n\n", $crc, $crc_n);


close IN;
close OUT;

Here is the output from the script:

POLY: 32773

0 165 165
1 0 0
2 0 0
3 3 3
4 4 4
5 5 5
6 6 6
7 7 7
8 8 8
9 9 9
10 16 16
11 17 17
12 18 18
13 19 19
14 20 20
15 21 21
16 22 22
17 23 23
18 24 24
19 25 25
20 32 32
21 33 33
22 34 34
23 35 35
24 36 36
25 37 37
26 38 38
27 39 39
28 40 40
29 41 41
30 48 48
31 49 49
32 50 50
33 51 51
34 52 52


POLY = 0X8005:

0 165 43780
1 0 23721
2 0 20318
3 3 49992
4 4 9410
5 5 42272
6 6 37536
7 7 34199
8 8 44417
9 9 14508
10 16 62526
11 17 48624
12 18 36536
13 19 24973
14 20 57190
15 21 59865
16 22 7657
17 23 6685
18 24 51997
19 25 23753
20 32 36697
21 33 59529
22 34 25067
23 35 63591
24 36 31227
25 37 64127
26 38 40959
27 39 34970
28 40 16009
29 41 24637
30 48 29539
31 49 40566
32 50 40091
33 51 55451

CRC:   55451
CRC_N: 10084

CRC:   0XD89B
CRC_N: 0X2764


POLY = 0XA001:

0 165 31680
1 0 20603
2 0 8976
3 3 52578
4 4 10829
5 5 13866
6 6 56631
7 7 5341
8 8 40917
9 9 39326
10 16 25625
11 17 50789
12 18 9862
13 19 28646
14 20 34286
15 21 33732
16 22 23811
17 23 3933
18 24 62414
19 25 24243
20 32 27934
21 33 4141
22 34 1104
23 35 58693
24 36 59428
25 37 49193
26 38 1152
27 39 47685
28 40 60795
29 41 64876
30 48 14845
31 49 21817
32 50 50964
33 51 6791

CRC:   6791
CRC_N: 58744

CRC:   0X1A87
CRC_N: 0XE578

There are 2 passes made, one with the non-reflected poly, and one with the reflected poly. I believe the result that should match the device calculation should be the CRC_N value. As you can see, none of the results match the device output CRC of 0x1181.

Expected to happen: Calculate a CRC of 0x1181 which the device calculated.

What actually happened: CRC calculation of 0x2764 with the non-reflected poly 0x8005, and 0xE578 with the reflected poly 0xA001.

Any help or thoughts would be welcome.
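For what it's worth, the inner shift/XOR sequence above is the classic reflected (LSB-first) CRC-16 step. Factored into a reusable sub, it can be sanity-checked against the standard CRC-16/ARC check value (0xBB3D over the ASCII bytes of "123456789") before pointing it at device data. This is a sketch of the algorithm only, not a diagnosis of the mismatch:

```perl
use strict;
use warnings;

# Reflected CRC-16: poly 0xA001 (the reflection of 0x8005), LSB-first,
# init 0, no final XOR. This parameter set is known as CRC-16/ARC.
sub crc16_arc {
    my (@bytes) = @_;
    my $crc = 0;
    for my $byte (@bytes) {
        $crc ^= $byte;
        for (1 .. 8) {
            $crc = ($crc & 1) ? (($crc >> 1) ^ 0xA001) : ($crc >> 1);
        }
    }
    return $crc;
}

# Standard check value: CRC-16/ARC over the ASCII string "123456789" is 0xBB3D.
printf "0x%04X\n", crc16_arc(map { ord } split //, '123456789');
```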

In Perl regular expressions, (?(DEFINE)(?<NAME>...)) creates a referenceable pattern which can be executed by the "recursion" mechanism, e.g. (?&NAME). This is useful if the "..." is a parameter with unknown internal capture groups. As perlre alludes to, the (DEFINE) has to be placed at the end of the regex if the match is evaluated in list context (because extra undef values corresponding to inner groups in the "..." will be returned).

I'm wondering why this construct exists in Perl because AFAICT ordinary capture groups can be used by just inserting an (*ACCEPT) to prevent directly executing them.

For example, given $_ = "beforeMIDDLEafter" the following seem to do exactly the same:

/ ^ (\w*?) (?&SUB) (\w*) (*ACCEPT) (?<SUB>MI(D)(?1)LE) /x;
/ ^ (\w*?) (?&SUB) (\w*) (?(DEFINE)(?<SUB>MI(D)(?1)LE)) /x;

Both set @{^CAPTURE} = ("before", "after") and both return ("before", "after", undef, undef) in list context.

My question is: Are these exactly equivalent? Does the (DEFINE) construct provide any additional or different capability?
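The claimed equivalence can be checked mechanically; the following simply restates the two patterns from the question and compares their list-context results:

```perl
use strict;
use warnings;

my $s = "beforeMIDDLEafter";

# The (*ACCEPT) form and the (?(DEFINE)) form from the question.
my @a = $s =~ / ^ (\w*?) (?&SUB) (\w*) (*ACCEPT) (?<SUB>MI(D)(?1)LE) /x;
my @b = $s =~ / ^ (\w*?) (?&SUB) (\w*) (?(DEFINE)(?<SUB>MI(D)(?1)LE)) /x;

# Both are expected to capture the same outer groups; captures made inside
# the (?&SUB) recursion are discarded when the recursion returns.
print "a: @{[ map { $_ // 'undef' } @a ]}\n";
print "b: @{[ map { $_ // 'undef' } @b ]}\n";
```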

TPRF Board Announces the 2025 Annual Report

Perl Foundation News

The Board is pleased to share the 2025 Annual Report from The Perl and Raku Foundation.

You can download the full report from the Perl and Raku Foundation website

Strengthening the Foundation

2025 was a year of both challenge and progress. Like many nonprofits, the Foundation faced funding constraints that required careful prioritization of resources. At the same time, increased focus on fundraising and donor engagement helped stabilize support for the work that matters most. A number of processes and tools were overhauled, allowing the Board to manage the funding more effectively, and pay grants more promptly and at lower overhead expense than had been the case previously.

Contributions from sponsors, corporate partners, and individual donors played a critical role in sustaining operations—particularly for core development and infrastructure.

Funding What Matters Most

Financial stewardship remained a top priority throughout the year. The Foundation focused its resources on:

  • Supporting the Perl 5 Core Maintenance Fund
  • Investing in Raku development and ecosystem improvements
  • Maintaining essential infrastructure and services

While some grant activity was reduced during tighter periods, the report describes the Foundation's recovery from those trials and outlines a clear path toward expanding funding as donations grow.

Our total income for the year was $253,744.86, with total expenditures of $233,739.75. 92% of our spending supported grants, events, and infrastructure. Our largest single expenditure remains the Perl Core Maintenance Grants, one of the long-time pillars of the Foundation's programs.

A Community-Funded Future

The Foundation’s work is made possible by the community it serves. Every donation—whether from individuals or organizations—directly supports the developers, tools, and systems that keep Perl and Raku reliable and evolving.

In 2025, we also strengthened our fundraising efforts, building a more sustainable base of recurring and long-term support to ensure continuity in the years ahead.

Looking Ahead

Our focus for the coming year is clear:

  • Grow recurring donations and sponsorships
  • Restore and expand the grants program
  • Continue developing transparent, responsible financial management

We’re grateful to everyone who contributed in 2025. Your support keeps the ecosystem strong.

If you rely on Perl or Raku, we encourage you to take part in sustaining them. Your support is always welcome!

Writing a TOON Module for Perl

Perl Hacks

Every so often, a new data serialisation format appears and people get excited about it. The most recent of these is **TOON** — Token-Oriented Object Notation. As the name suggests, it’s another way of representing the same kinds of data structures that you’d normally store in JSON or YAML: hashes, arrays, strings, numbers, booleans and nulls.

So the obvious Perl question is: *“Ok, where’s the CPAN module?”*

This post explains what TOON is, why some people think it’s useful, and why I decided to write a Perl module for it — with an interface that should feel very familiar to anyone who has used JSON.pm.

I should point out that I knew about [Data::TOON](https://metacpan.org/pod/Data::TOON) but I wanted something with an interface that was more like JSON.pm.

## What TOON Is

TOON stands for **Token-Oriented Object Notation**. It’s a textual format for representing structured data — the same data model as JSON:

* Objects (hashes)
* Arrays
* Strings
* Numbers
* Booleans
* Null

The idea behind TOON is that it is designed to be **easy for both humans and language models to read and write**. It tries to reduce punctuation noise and make the structure of data clearer.

If you think of the landscape like this:

| Format | Human-friendly | Machine-friendly | Very common |
| ------ | -------------- | ---------------- | ----------- |
| JSON   | Medium         | Very             | Yes         |
| YAML   | High           | Medium           | Yes         |
| TOON   | High           | High             | Not yet     |

TOON is trying to sit in the middle: simpler than YAML, more readable than JSON.

Whether it succeeds at that is a matter of taste — but it’s an interesting idea.

## TOON vs JSON vs YAML

It’s probably easiest to understand TOON by comparing it to JSON and YAML. Here’s the same “person” record written in all three formats.

### JSON

{
  "name": "Arthur Dent",
  "age": 42,
  "email": "arthur@example.com",
  "alive": true,
  "address": {
    "street": "High Street",
    "city": "Guildford"
  },
  "phones": [
    "01234 567890",
    "07700 900123"
  ]
}

### YAML

name: Arthur Dent
age: 42
email: arthur@example.com
alive: true
address:
  street: High Street
  city: Guildford
phones:
  - 01234 567890
  - 07700 900123

### TOON

name: Arthur Dent
age: 42
email: arthur@example.com
alive: true
address:
  street: High Street
  city: Guildford
phones[2]: 01234 567890,07700 900123

You can see that TOON sits somewhere between JSON and YAML:

* Less punctuation and quoting than JSON
* More explicit structure than YAML
* Still very easy to parse
* Still clearly structured for machines

That’s the idea, anyway.
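The phones[2] line above is TOON's inline array form: a declared length followed by comma-separated values. Just to illustrate the shape, here is a hypothetical throwaway parser for that one construct (not the real module's internals):

```perl
use strict;
use warnings;

# Hypothetical mini-parser for a single TOON "key[N]: v1,v2,..." line.
sub parse_inline_array {
    my ($line) = @_;
    my ($key, $count, $rest) = $line =~ /^(\w+)\[(\d+)\]:\s*(.*)$/
        or return;
    my @items = split /,/, $rest;
    warn "declared $count items, found " . scalar @items
        unless @items == $count;
    return ($key, \@items);
}

my ($key, $values) = parse_inline_array('phones[2]: 01234 567890,07700 900123');
print "$key: $_\n" for @$values;
```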

## Why People Think TOON Is Useful

The current interest in TOON is largely driven by AI/LLM workflows.

People are using it because:

1. It is easier for humans to read than JSON.
2. It is less ambiguous and complex than YAML.
3. It maps cleanly to the JSON data model.
4. It is relatively easy to parse.
5. It works well in prompts and generated output.

In other words, it’s not trying to replace JSON for APIs, and it’s not trying to replace YAML for configuration files. It’s aiming at the space where humans and machines are collaborating on structured data.

You may or may not buy that argument — but it’s an interesting niche.

## Why I Wrote a Perl Module

I don’t have particularly strong opinions about TOON as a format. It might take off, it might not. We’ve seen plenty of “next big data format” ideas over the years.

But what I *do* have a strong opinion about is this:

> If a data format exists, then Perl should have a CPAN module for it that works the way Perl programmers expect.

Perl already has very good, very consistent interfaces for data serialisation:

* JSON
* YAML
* Storable
* Sereal

They all tend to follow the same pattern, particularly the object-oriented interface:

use JSON;
my $json = JSON->new->pretty->canonical;
my $text = $json->encode($data);
my $data = $json->decode($text);

So I wanted a TOON module that worked the same way.

## Design Goals

When designing the module, I had a few simple goals.

### 1. Familiar OO Interface

The primary interface should be object-oriented and feel like JSON.pm:

use TOON;
my $toon = TOON->new
    ->pretty
    ->canonical
    ->indent(2);
my $text = $toon->encode($data);
my $data = $toon->decode($text);

If you already know JSON, you already know how to use TOON.

There are also convenience functions, but the OO interface is the main one.

### 2. Pure Perl Implementation

Version 0.001 is pure Perl. That means:

* Easy to install
* No compiler required
* Works everywhere Perl works

If TOON becomes popular and performance matters, someone can always write an XS backend later.

### 3. Clean Separation of Components

Internally, the module is split into:

* **Tokenizer** – turns text into tokens
* **Parser** – turns tokens into Perl data structures
* **Emitter** – turns Perl data structures into TOON text
* **Error handling** – reports line/column errors cleanly

This makes it easier to test and maintain.

### 4. Do the Simple Things Well First

Version 0.001 supports:

* Scalars
* Arrayrefs
* Hashrefs
* undef → null
* Pretty printing
* Canonical key ordering

It does **not** (yet) try to serialise blessed objects or do anything clever. That can come later if people actually want it.

## Example Usage (OO Style)

Here’s a simple Perl data structure:

my $data = {
    name   => "Arthur Dent",
    age    => 42,
    drinks => [ "tea", "coffee" ],
    alive  => 1,
};

### Encoding

use TOON;
my $toon = TOON->new->pretty->canonical;
my $text = $toon->encode($data);
print $text;

### Decoding

use TOON;
my $toon = TOON->new;
my $data = $toon->decode($text);
print $data->{name};

### Convenience Functions

use TOON qw(encode_toon decode_toon);
my $text = encode_toon($data);
my $data = decode_toon($text);

But the OO interface is where most of the flexibility lives.

## Command Line Tool

There’s also a command-line tool, toon_pp, similar to json_pp:

cat data.toon | toon_pp

Which will pretty-print TOON data.

## Final Thoughts

I don’t know whether TOON will become widely used. Predicting the success of data formats is a fool’s game. But the cost of supporting it in Perl is low, and the potential usefulness is high enough to make it worth doing.

And fundamentally, this is how CPAN has always worked:

> See a problem. Write a module. Upload it. See if anyone else finds it useful.

So now Perl has a TOON module. And if you already know how to use JSON.pm, you already know how to use it.

That was the goal.

The post Writing a TOON Module for Perl first appeared on Perl Hacks.

Updates for great CPAN modules released last week. A module is considered great if its favorites count is greater than or equal to 12.

  1. Clone - recursively copy Perl datatypes
    • Version: 0.50 on 2026-03-28, with 33 votes
    • Previous CPAN version: 0.49 was released 3 days before
    • Author: ATOOMIC
  2. CPANSA::DB - the CPAN Security Advisory data as a Perl data structure, mostly for CPAN::Audit
    • Version: 20260327.002 on 2026-03-27, with 25 votes
    • Previous CPAN version: 20260318.001 was released 9 days before
    • Author: BRIANDFOY
  3. DBD::Oracle - Oracle database driver for the DBI module
    • Version: 1.95 on 2026-03-24, with 33 votes
    • Previous CPAN version: 1.91_5 was released 8 days before
    • Author: ZARQUON
  4. IPC::Run - system() and background procs w/ piping, redirs, ptys (Unix, Win32)
    • Version: 20260322.0 on 2026-03-22, with 39 votes
    • Previous CPAN version: 20250809.0 was released 7 months, 12 days before
    • Author: TODDR
  5. Mojo::Pg - Mojolicious ♥ PostgreSQL
    • Version: 4.29 on 2026-03-23, with 98 votes
    • Previous CPAN version: 4.28 was released 5 months, 23 days before
    • Author: SRI
  6. Object::Pad - a simple syntax for lexical field-based objects
    • Version: 0.825 on 2026-03-25, with 48 votes
    • Previous CPAN version: 0.824 was released 1 day before
    • Author: PEVANS
  7. PDL::Stats - a collection of statistics modules in Perl Data Language, with a quick-start guide for non-PDL people.
    • Version: 0.856 on 2026-03-22, with 15 votes
    • Previous CPAN version: 0.855 was released 1 year, 16 days before
    • Author: ETJ
  8. SPVM - The SPVM Language
    • Version: 0.990152 on 2026-03-26, with 36 votes
    • Previous CPAN version: 0.990151 was released the same day
    • Author: KIMOTO
  9. Term::Choose - Choose items from a list interactively.
    • Version: 1.781 on 2026-03-25, with 15 votes
    • Previous CPAN version: 1.780 was released 1 month, 20 days before
    • Author: KUERBIS
  10. YAML::Syck - Fast, lightweight YAML loader and dumper
    • Version: 1.42 on 2026-03-27, with 18 votes
    • Previous CPAN version: 1.41 was released 4 days before
    • Author: TODDR

(dcxxvii) metacpan weekly report - Mail::Make

Niceperl

This is the weekly favourites list of CPAN distributions. Votes count: 43

Week's winner: Mail::Make (+2)

Build date: 2026/03/28 20:47:31 GMT


Clicked for first time:


Increasing its reputation:

I just spent another fun and productive week in Marseille at the Koha Hackfest hosted by BibLibre. We (Mark, tadzik, and I) arrived on Sunday (via plane from Vienna or Poland, and I came by train from Berlin via Strasbourg) and left on Friday.

There were the usual interesting discussions on all things Koha, presentations of new features, and of course a lot of socializing. And cheese, so much cheese...

Elasticsearch

On the first day there was a discussion on Elasticsearch and getting rid of Zebra (the old search engine used by Koha). Actually getting rid of Zebra is not an option (for now), because small installations won't want to set up and run Elasticsearch. But Mark proposed using our Marc Normalization Plugin as the basis for a new internal, DB-only search engine (so no need for an external index etc.) and over the course of the week (and with LLM help) implemented a prototype. It would really be amazing if we could get this running!

I worked a bit on improving Elasticsearch indexing:

  • Bulk biblio ES index update after auth change: When merging (or updating) authorities, the Elasticsearch indexing of the linked biblios now will happen in one background job per authority instead of one background job per biblio. So an authority that is used in 100 biblios will now trigger one indexing background job with 100 biblio items instead of 100 background jobs with 1 biblio item each.
  • Zebraqueue should not be added to when only Elasticsearch is used: We added a new syspref "ElasticsearchEnableZebraQueue". If disabled, no data will be written to the zebraqueue table, because usually when using Elasticsearch you don't need to also run Zebra.

I got sign-offs and Pass-QA for both issues during the hackfest, thanks Fridolin, Paul and Baptiste (who owns the coolest tea mug at BibLibre..)

QA

I also did QA on a bunch of other issues: 22639, 35267, 36550, 39158, 40906, 41767, 41967, 42107. Some of them were of interest to me, some I did because other people nicely asked me to :-)

LLM, "AI" and Agentic Coding

This was again a hot topic, with some people using those tools to great effect, some hating them, and some in between. As in my last post on the German Perl Workshop I again want to point out this blog post: I Sold Out for $20 a Month and All I Got Was This Perfectly Generated Terraform, and during the event the post Thoughts on slowing the fuck down dropped (by Mario Zechner, who wrote the coding agent I (currently) use).

Anyway, Koha now has some guidelines on AI and LLM-assisted contributions and on using LLM features inside Koha.

Claude vs domm

While working on unit tests for Bug 40577 I struggled with a test failing only if I ran the whole test script (as opposed to only the one subtest I was working on). It seemed to be a problem with mocked tests, so I asked Joubu (who was by chance just standing next to me). Together we figured out the scoping problem: If you use Test::MockObject/MockModule multiple times on a class from different scopes, the mocked methods/functions might not automatically be removed. You have to call unmock explicitly. After the patch was done, I described the error to Claude and asked for a fix, expecting not to get anything usable. But (to my slight horror) it produced the correct explanation and fix in a very short time. On the one hand: amazing; on the other hand: very scary.
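The underlying behaviour can be demonstrated without Koha or the mocking modules at all: raw symbol-table replacement, the mechanism mocking modules build on, persists past the lexical scope that installed it, so it has to be undone explicitly. A standalone sketch:

```perl
use strict;
use warnings;

package Some::Class;
sub hello { 'real' }

package main;

my $orig = \&Some::Class::hello;
{
    # Install an override inside a block, as a mock would.
    no warnings 'redefine';
    *Some::Class::hello = sub { 'mocked' };
}

# The block has ended, but the override has NOT been removed.
print Some::Class::hello(), "\n";   # mocked

# Restoring (what an explicit unmock does) must be done by hand.
{
    no warnings 'redefine';
    *Some::Class::hello = $orig;
}
print Some::Class::hello(), "\n";   # real
```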

Other random stuff:

  • When it rains and a TGV arrives at the station, more people want a taxi than there are taxis available. So walking the short distance was necessary, but we (Katrin, who I met on the train, and me) still got wet. At least we had cold burgers...
  • Paul showed me a non-Koha tool he has written: mdv - A terminal markdown viewer with vim keybindings. Very nice, I especially like it to view checkouts of gitlab wikis!
  • I was not the only Team Scheisse fan attending! Philip++
  • Philip also pointed out the very detailed and interesting shared notes produced by various attendees during the event.
  • On my third visit to Marseille, I managed to navigate the city center quite well.
  • I finally made it to the Tangerine record store, very nice selection. I still did not let the shop owner talk me into buying a 200€ original UK pressing of Unknown Pleasures by Joy Division.
  • I did not get Moules Frites, but at least some Galette and Cidre.
  • After being in Senegal in February, I now realized that there are a lot of places selling Yassa and Mafe in Marseille. I guess they were there last year too, I just did not see them, having never eaten Yassa or Mafe before.
  • It can get very windy in Marseille.
  • I should do it like Jake(?) and cycle (at least partly) to the next hackfest.

Thanks

Thanks to BibLibre and Paul Poulain for organizing the event, and to all the attendees for making it such a wonderful 5 days!

Looking forward to meeting you all again at the upcoming KohaCon in Karlsruhe!

Updates

  • 2026-03-03: Added link to shared notes.

Make, Bash, and a scripting language of your choice


Creating AWS Resources…let me count the ways

You need to create an S3 bucket, an SQS queue, an IAM policy and a few other AWS resources. But how?…TIMTOWTDI

The Console

  • Pros: visual, immediate feedback, no tooling required, great for exploration
  • Cons: not repeatable, not version controllable, opaque, clickops doesn’t scale, “I swear I configured it the same way”

The AWS CLI

  • Pros: scriptable, composable, already installed, good for one-offs
  • Cons: not idempotent by default, no state management, error handling is manual, scripts can grow into monsters

CloudFormation

  • Pros: native AWS, state managed by AWS, rollback support, drift detection
  • Cons: YAML/JSON verbosity, slow feedback loop, stack update failures are painful, error messages are famously cryptic, proprietary to AWS, subject to change without notice

Terraform

  • Pros: multi-cloud, huge community, mature ecosystem, state management, plan before apply
  • Cons: state file complexity, backend configuration, provider versioning, HCL is yet another language to learn, overkill for small projects, often requires tricks & contortions

Pulumi

  • Pros: real programming languages, familiar abstractions, state management
  • Cons: even more complex than Terraform, another runtime to install and maintain

CDK

  • Pros: real programming languages, generates CloudFormation, good for large organizations
  • Cons: CloudFormation underneath means CloudFormation problems, Node.js dependency

…and the rest of the crew…

Ansible, AWS SAM, Serverless Framework - each with their own opinions, dependencies, and learning curves.

Every option beyond the CLI adds a layer of abstraction, a new language or DSL, a state management story, and a new thing to learn and maintain. For large teams managing hundreds of resources across multiple environments that overhead is justified. For a solo developer or small team managing a focused set of resources it can feel like overkill.

Even in large organizations, not every project should be folded into the corporate infrastructure IaC tool. Moreover, not every project gets the attention from the DevOps team necessary to create or support the application infrastructure.

What if you could get idempotent, repeatable, version-controlled infrastructure management using tools you already have? No new language, no state backend, no provider versioning. Just make, bash, a scripting language you’re comfortable with, and your cloud provider’s CLI.

And yes…my love affair with make is endless.

We’ll use AWS examples throughout, but the patterns apply equally to Google Cloud (gcloud) and Microsoft Azure (az). The CLI tools differ, the patterns don’t.


A word about the AWS CLI --query option

Before you reach for jq, perl, or python to parse CLI output, it’s worth knowing that most cloud CLIs have built-in query support. The AWS CLI’s --query flag implements JMESPath - a query language for JSON that handles the majority of filtering and extraction tasks without any additional tools:

# get a specific field
aws lambda get-function \
    --function-name my-function \
    --query 'Configuration.FunctionArn' \
    --output text

# filter a list
aws sqs list-queues \
    --query 'QueueUrls[?contains(@, `my-queue`)]|[0]' \
    --output text

--query is faster, requires no additional dependencies, and keeps your pipeline simple. Reach for it first. When it falls short - complex transformations, arithmetic, multi-value extraction - that’s when a one-liner earns its place:

# perl
aws lambda get-function --function-name my-function | \
    perl -MJSON -n0 -e '$l=decode_json($_); print $l->{Configuration}{FunctionArn}'

# python
aws lambda get-function --function-name my-function | \
    python3 -c "import json,sys; d=json.load(sys.stdin); print(d['Configuration']['FunctionArn'])"

Both get the job done. Use whichever lives in your shed.


What is Idempotency?

The word comes from mathematics - an operation is idempotent if applying it multiple times produces the same result as applying it once. Sort of like those ID10T errors…no matter how hard or how many times that user clicks on that button they get the same result.

In the context of infrastructure management it means this: running your resource creation script twice should have exactly the same outcome as running it once. The first run creates the resource. The second run detects it already exists and does nothing - no errors, no duplicates, no side effects.

This sounds simple but it’s surprisingly easy to get wrong. A naive script that just calls aws lambda create-function will fail on the second run with a ResourceConflictException. A slightly better script wraps that in error handling. A truly idempotent script never attempts to create a resource it knows already exists.
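As a minimal sketch of the check-then-create shape, with `resource_exists` and `create_resource` as hypothetical stand-ins for a cloud read API and write API (state lives in a scratch directory rather than a cloud account):

```shell
# Hypothetical stand-ins for a read API and a write API.
STATE_DIR="$(mktemp -d)"

resource_exists() { [ -e "$STATE_DIR/resource" ]; }
create_resource() { touch "$STATE_DIR/resource" && echo "created"; }

# Idempotent wrapper: only call the write API when the read says "missing".
ensure_resource() {
    if resource_exists; then
        echo "already exists, nothing to do"
    else
        create_resource
    fi
}

ensure_resource   # first run creates the resource
ensure_resource   # second run detects it and does nothing
```

Run it as many times as you like: after the first run, every call reports the resource exists. Same outcome, no error.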

And it works in both directions. The idempotent bug - running a failing process repeatedly and getting the same error every time - is what happens when your failure path is idempotent too. Consistently wrong, no matter how many times you try. The patterns we’ll show are designed to ensure that success is idempotent while failure always leaves the door open for the next attempt.

Cloud APIs fall into four distinct behavioral categories when it comes to idempotency, and your tooling needs to handle each one differently:

Case 1 - The API is idempotent and produces output

Some APIs can be called repeatedly without error and return useful output each time - whether the resource was just created or already existed. aws events put-rule is a good example - it returns the rule ARN whether the rule was just created or already existed. The pattern: call the read API first, capture the output, call the write API only if the read returned nothing.

Case 2 - The API is idempotent but produces no output

Some write APIs succeed silently - they return nothing on success. aws s3api put-bucket-notification-configuration is a good example. It will happily overwrite an existing configuration without complaint, but returns no output to confirm success. The pattern: call the API, synthesize a value for your sentinel using && echo to capture something meaningful on success.
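A tiny sketch of the synthesis trick, with `silent_api` as a stand-in for a write call that succeeds without producing output:

```shell
# Stand-in for an API like put-bucket-notification-configuration that
# succeeds silently and prints nothing.
silent_api() { return 0; }

# && fires only on success, so $value ends up with something meaningful
# to write to the sentinel even though the API itself was mute.
value="$(silent_api && echo "notification-configured")"
echo "$value"
```

If the call fails, `&&` never fires and `$value` stays empty, which is exactly the signal the sentinel logic needs.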

Case 3 - The API is not idempotent

Some APIs will fail with an error if you try to create a resource that already exists. aws lambda add-permission returns ResourceConflictException if the statement ID already exists. aws lambda create-function returns ResourceConflictException if the function already exists. These APIs give you no choice - you must query first and only call the write API if the resource is missing.

Case 4 - The API call fails

Any of the above can fail - network errors, permission problems, invalid parameters. When a call fails you must not leave behind a sentinel file that signals success. A stale sentinel is worse than no sentinel - it tells Make the resource exists when it doesn’t, and subsequent runs silently skip the creation step. The patterns: || rm -f $@ when writing directly, or else rm -f $@ when capturing to a variable first.


The Sentinel File

Before we look at the four patterns in detail, we need to introduce a concept that ties everything together: the sentinel file.

A sentinel file is simply a file whose existence signals that a task has been completed successfully. It contains no magic - it might hold the output of the API call that created the resource, or it might just be an empty file created with touch. What matters is that it exists when the task succeeded and doesn’t exist when it hasn’t.

make has used this pattern since the 1970s. When you declare a target in a Makefile, make checks whether a file with that name exists before deciding whether to run the recipe. If the file exists and is newer than its dependencies, make skips the recipe entirely. If the file doesn’t exist, make runs the recipe to create it.
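That skip behaviour is easy to watch with a throwaway one-target Makefile (hypothetical target name, built in a scratch directory):

```shell
# Build a minimal Makefile whose target is its own sentinel, then run it twice.
workdir="$(mktemp -d)"; cd "$workdir"
printf 'my-resource:\n\t@echo "recipe ran" && touch $@\n' > Makefile

first="$(make my-resource)"    # file absent: make runs the recipe
second="$(make my-resource)"   # file present: make skips the recipe
echo "$first"
echo "$second"
```

The first run prints the recipe's output; the second, make merely reports the target is up to date and executes nothing.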

For infrastructure management this is exactly the behavior we want:

my-resource:
    @value="$$(aws some-service describe-resource \
            --name $(RESOURCE_NAME) 2>&1)"; \
    if [[ -z "$$value" || "$$value" = "ResourceNotFound" ]]; then \
        value="$$(aws some-service create-resource \
            --name $(RESOURCE_NAME))"; \
    fi; \
    test -e $@ || echo "$$value" > $@

The first time you run make my-resource the file doesn’t exist, the recipe runs, the resource is created, and the API response is written to the sentinel file my-resource. The second time you run it, make sees the file exists and skips the recipe entirely - zero API calls.

When an API call fails we want to be sure we do not create the sentinel file. We’ll cover the failure case in more detail in Pattern 4 of the next section.


The Four Patterns

Armed with the sentinel file concept and an understanding of the four API behavioral categories, let’s look at concrete implementations of each pattern.


Pattern 1 - Idempotent API with output

The simplest case. Query the resource first - if it exists capture the output and write the sentinel. If it doesn’t exist, create it, capture the output, and write the sentinel. Either way you end up with a sentinel containing meaningful content.

The SQS queue creation is a good example:

sqs-queue:
    @queue="$$(aws sqs list-queues \
        --query 'QueueUrls[?contains(@, `$(QUEUE_NAME)`)]|[0]' \
        --output text --profile $(AWS_PROFILE) 2>&1)"; \
    if echo "$$queue" | grep -q 'error\|Error'; then \
        echo "ERROR: list-queues failed: $$queue" >&2; \
        exit 1; \
    elif [[ -z "$$queue" || "$$queue" = "None" ]]; then \
        queue="$(QUEUE_NAME)"; \
        aws sqs create-queue --queue-name $(QUEUE_NAME) \
            --profile $(AWS_PROFILE); \
    fi; \
    test -e $@ || echo "$$queue" > $@

Notice --query doing the filtering work before the output reaches the shell. No jq, no pipeline - the AWS CLI extracts exactly what we need. The result is either a queue URL or empty. If empty we create. Either way $$queue ends up with a value and the sentinel is written exactly once.

The EventBridge rule follows the same pattern:

lambda-eventbridge-rule:
    @rule="$$(aws events describe-rule \
            --name $(RULE_NAME) \
            --profile $(AWS_PROFILE) 2>&1)"; \
    if echo "$$rule" | grep -q 'ResourceNotFoundException'; then \
        rule="$$(aws events put-rule \
            --name $(RULE_NAME) \
            --schedule-expression "$(SCHEDULE_EXPRESSION)" \
            --state ENABLED \
            --profile $(AWS_PROFILE))"; \
    elif echo "$$rule" | grep -q 'error\|Error'; then \
        echo "ERROR: describe-rule failed: $$rule" >&2; \
        exit 1; \
    fi; \
    test -e $@ || echo "$$rule" > $@

Same shape - query, create if missing, write sentinel once.


Pattern 2 - Idempotent API with no output

Some APIs succeed silently. aws s3api put-bucket-notification-configuration is the canonical example - it happily overwrites an existing configuration and returns nothing. No output means nothing to write to the sentinel.

The solution is to synthesize a value using &&:

define notification_configuration =
use JSON;

my $lambda_function = $ENV{lambda_function};
my $function_arn = decode_json($lambda_function)->{Configuration}->{FunctionArn};

my $configuration = {
 LambdaFunctionConfigurations => [ {
   LambdaFunctionArn => $function_arn,
   Events => [ split ' ', $ENV{s3_event} ],
  }
 ]
};

print encode_json($configuration);
endef

export s_notification_configuration = $(value notification_configuration)

lambda-s3-trigger: lambda-s3-permission
        temp="$$(mktemp)"; trap 'rm -f "$$temp"' EXIT; \
        lambda_function="$$(cat lambda-function)"; \
        echo $$(s3_event="$(S3_EVENT)" lambda_function="$$lambda_function" \
          perl -e "$$s_notification_configuration") > $$temp; \
        trigger="$$(aws s3api put-bucket-notification-configuration \
            --bucket $(BUCKET_NAME) \
            --notification-configuration file://$$temp \
            --profile $(AWS_PROFILE) && cat $$temp)"; \
        test -e $@ || echo "$$trigger" > $@

The && cat $$temp is the key. If the API call succeeds the && fires and $$trigger gets the configuration JSON string - something meaningful to write to the sentinel. If the API call fails && doesn’t fire, $$trigger stays empty because the Makefile recipe aborts.

Using a scriptlet (s_notification_configuration) might seem like overkill, but it’s worth not having to fight shell quoting issues!

Writing the JSON that many AWS API calls expect to a temporary file is usually better than passing it as a string on the command line. Unless you wrap the JSON in quotes you’ll be fighting shell quoting and interpolation issues…and of course you can write your scriptlets in Perl or Python!
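For instance, a hypothetical payload written to a temp file and then handed to the CLI via the file:// form (the aws invocation is shown only for shape and not run):

```shell
# A heredoc into a temp file sidesteps the quoting battles of inline JSON.
temp="$(mktemp)"; trap 'rm -f "$temp"' EXIT

cat > "$temp" <<'EOF'
{"LambdaFunctionConfigurations": [{"Events": ["s3:ObjectCreated:*"]}]}
EOF

# The CLI would then read it like this (hypothetical bucket name):
#   aws s3api put-bucket-notification-configuration \
#       --bucket my-bucket --notification-configuration "file://$temp"
cat "$temp"
```

The single-quoted heredoc delimiter (<<'EOF') disables interpolation, so the JSON arrives byte-for-byte as written.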


Pattern 3 - Non-idempotent API

Some APIs are not idempotent - they fail with a ResourceConflictException or similar if the resource already exists. aws lambda add-permission and aws lambda create-function are both in this category. There is no “create or update” variant - you must check existence first and only call the write API if the resource is missing.

The Lambda S3 permission target is a good example:

lambda-s3-permission: lambda-function s3-bucket
        @permission="$$(aws lambda get-policy \
                --function-name $(FUNCTION_NAME) \
                --profile $(AWS_PROFILE) 2>&1)"; \
        if echo "$$permission" | grep -q 'ResourceNotFoundException' || \
           ! echo "$$permission" | grep -q s3.amazonaws.com; then \
            permission="$$(aws lambda add-permission \
                --function-name $(FUNCTION_NAME) \
                --statement-id s3-trigger-$(BUCKET_NAME) \
                --action lambda:InvokeFunction \
                --principal s3.amazonaws.com \
                --source-arn arn:aws:s3:::$(BUCKET_NAME) \
                --profile $(AWS_PROFILE))"; \
        elif echo "$$permission" | grep -q 'error\|Error'; then \
            echo "ERROR: get-policy failed: $$permission" >&2; \
            exit 1; \
        fi; \
        if [[ -n "$$permission" ]]; then \
            test -e $@ || echo "$$permission" > $@; \
        else \
            rm -f $@; \
        fi

A few things worth noting here…

  • get-policy returns the full policy document which may contain multiple statements - we check for the presence of s3.amazonaws.com specifically using ! grep -q rather than just checking for an empty response. This handles the case where a policy exists but doesn’t yet have the S3 permission we need.
  • The sentinel is only written if $$permission is non-empty after the if block. This covers the case where get-policy returns nothing and add-permission also fails - the sentinel stays absent and the next make run will try again.
  • We redirect stderr into the captured variable (2>&1) so we can distinguish "resource does not exist" from any other failure. When other failures are possible, 2>&1 combined with specific error string matching gives you both idempotency and visibility. Swallowing errors silently (2>/dev/null) is how idempotent bugs are born.

Pattern 4 - Failure handling

This isn’t a separate pattern so much as a discipline that applies to all three of the above. There are two mechanisms depending on how the sentinel is written.

Case 1: When the sentinel is written directly by the command:

aws lambda create-function ... > $@ || rm -f $@

|| rm -f $@ ensures that if the command fails the partial or empty sentinel is immediately cleaned up. Without it make sees the file on the next run and silently skips the recipe - an idempotent bug.
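You can watch the cleanup work with `false` standing in for a failing API call: the redirection creates the file, the failure triggers the rm.

```shell
workdir="$(mktemp -d)"; cd "$workdir"

# Stand-in for a failing direct-write: the > redirection creates the
# sentinel file even though the command itself fails.
( false > my-resource ) || rm -f my-resource

if [ -e my-resource ]; then
    echo "stale sentinel left behind"
else
    echo "no sentinel, the next run will retry"
fi
```

Drop the `|| rm -f` and the empty sentinel survives the failure, silently convincing make the resource exists.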

Case 2: When the sentinel is written by capturing output to a variable first:

if [[ -n "$$value" ]]; then \
    test -e $@ || echo "$$value" > $@; \
else \
    rm -f $@; \
fi

The else rm -f $@ serves the same purpose. If the variable is empty - because the API call failed - the sentinel is removed. If the sentinel doesn’t exist yet nothing is written. Either way the next make run will try again.

In both cases the goal is the same: a sentinel file should only exist when the underlying resource exists. A stale sentinel is worse than no sentinel.

Depending on how your recipe is written you may not need to test the variable that captures the output at all. In our Makefiles we set .SHELLFLAGS := -ec, which causes make to abort a recipe immediately if any command in it fails. This means targets that don’t redirect output straight into $@ - like our sqs-queue target above - don’t need explicit failure handling: make will die loudly and the sentinel won’t be written. In that case you don’t even need to test $$value and can simplify writing the sentinel file to this:

test -e $@ || echo "$$value" > $@
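The effect of -e is easy to demonstrate in isolation. Here bash -ec mimics a recipe line run under .SHELLFLAGS := -ec, with `false` standing in for a failing API call:

```shell
# Under -e the first failing command aborts the whole snippet, so the
# "sentinel written" echo never runs; the || branch reports the abort.
result="$(bash -ec 'false; echo "sentinel written"' || echo "recipe aborted, no sentinel")"
echo "$result"
```

Remove the -e and the echo after `false` would run anyway, which is precisely the stale-sentinel failure mode we are trying to avoid.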

Conclusion

Creating AWS resources can be done with several different tools…all of them eventually call AWS APIs and process the returned payloads. Each of these tools has its place. Each adds something. Each also comes with its own complexity, dependencies, and learning curve.

For a small project or a focused set of resources - the kind a solo developer or small team manages for a specific application - you don’t need tools with a high cognitive or resource load. You can use the tools already on your belt: make, bash, [insert favorite scripting language here], and aws. And you can leverage those same tools equally well with gcloud or az.

The four patterns we’ve covered handle every AWS API behavior you’ll encounter:

  • Query first, create only if missing, write a sentinel
  • Synthesize output when the API has none
  • Always check before calling a non-idempotent API
  • Clean up on failure with || rm -f $@

These aren’t new tricks - they’re straightforward applications of tools that have been around for decades. make has been managing file-based dependencies since 1976. The sentinel file pattern predates cloud computing entirely. We’re just applying them to a new problem.

One final thought. The idempotent bug - running a failing process repeatedly and getting the same error every time - is the mirror image of what we’ve built here. Our goal is idempotent success: run it once, it works. Run it again, it still works. Run it a hundred times, nothing changes. || rm -f $@ is what separates idempotent success from idempotent failure - it ensures that a bad run always leaves the door open for the next attempt rather than cementing the failure in place with a stale sentinel.

Your shed is already well stocked. Sometimes the right tool for the job is the one you’ve had hanging on the wall for thirty years.


Further Reading

  • “Advanced Bash-Scripting Guide” - https://tldp.org/LDP/abs/html/index.html
  • “GNU Make” - https://www.gnu.org/software/make/manual/html_node/index.html
  • Dave Oswald, “Perl One Liners for the Shell” (Perl conference presentation): https://www.slideshare.net/slideshow/perl-oneliners/77841913
  • Peteris Krumins, “Perl One-Liners” (No Starch Press): https://nostarch.com/perloneliners
  • Sundeep Agarwal, “Perl One-Liners Guide” (free online): https://learnbyexample.github.io/learn_perl_oneliners/
  • AWS CLI JMESPath query documentation: https://docs.aws.amazon.com/cli/latest/userguide/cli-usage-filter.html

building my todo list app

rjbs forgot what he was saying

For years, I’ve wanted a better daily to-do checklist. I had a good idea what I wanted from it, but I knew it was going to be a pain to produce. It didn’t have any complicated ideas, just a bunch of UI considerations, and that’s not my area of expertise, so I’ve made do with a bunch of worse (for me) options, which has, for me, led to worse outcomes. I accepted the tradeoffs, but I wasn’t thrilled. Now I’ve finally built exactly the app I wanted, and it went great. I call it, for now, NerfJournal.

Project Seven: NerfJournal

That’s right, this is another “stuff Rik did with Claude” post. This one feels like maybe the project that had the greatest impact on me so far, and that’s in three ways: First, the tool I’ve produced is genuinely useful and I use it daily. Secondly, it made clear the ways in which the realm of coding easily available to me was expanded by agents. Finally, it’s been a great way to not just access but also learn those things, which I’ll write about in a follow-up post.

Anyway, the project is called NerfJournal, because it’s like a bullet journal, but wouldn’t really hurt anybody. Except me, if Hasbro decides to complain about the name.

I try to stick to a routine in setting up my work day. I have a “work diary”, a bit like the one Mark Dominus once wrote about, which I got to see in practice when we last worked together. This journal is very simple. There’s a bunch of checkboxes of things I mean to do every day, and then there’s space for notes on what else I actually did. I try to add a new page to this every day, and I’ve got a Monday template and a “rest of the week” template. The Monday template includes stuff I only need to do once a week. Here’s a sample page, not filled in:

Monday agenda in Notion

You’ll see that the 6th item on the morning routine is to post to #cyr-scrum. This is the Slack channel where, every day, the Cyrus team members are each meant to post what we did the previous day and what we’re going to do today. While the Notion page includes “stuff I do every day, but might forget”, the #cyr-scrum post is generally “stuff I won’t do again once it’s done, and might need to carry over until tomorrow”.

That is: if I didn’t fill my water pitcher today, I failed, and tomorrow I’ll get a new instance of that to do. It’s not “still open”, it’s a new one, and it’s interesting (well, to me) whether I kept up a streak. On the other hand, if I post in #cyr-scrum that I’m going to complete ticket CYR-1234, but I don’t do it, I better do it tomorrow. And if I do, there’s no reason to see it again on the next day.

a scrum post

A problem here is that now I have two to-do lists. One is a real todo list that I can tick “done” on, and the other is a post in Slack that I want to refer back to, from time to time, to see whether I’m keeping up with what I said I’d do. GTD rightfully tells us that “more todo lists are worse than fewer todo lists”, generally, and I wanted fewer. But I didn’t want to make Linear tasks every day for things like “drink water”. And putting my scrum in Notion would be tedious. And CalDAV with VTODO has its own problems.

What I wanted was a single todo list that would be easy to use, visually simple enough to just leave on my desktop for quick reference. I’d been thinking about such a program off and on (mostly off) for a year or so, and after some so-so but encouraging experiences having Claude produce SwiftUI applications for me, I thought I’d give this one a go.

The session took place over two days. After a brief false start using VTODO (well, Apple’s EventKit) as a backend, we pivoted to a custom data model and got something working. We iterated on that, adding features, fixing bugs, and tweaking the design for a good while. When I felt like it, I’d take a break to play Xbox or read a book. When I came back, Claude had not context-switched. Meanwhile, I’d had time for that diffuse cognition mode to help me “think” about next steps.

The biggest shifts were about realizing that the data model was subtly wrong. This wouldn’t have been hard to fix by hand, but it would have been fiddly and boring. Instead, I said, “Here’s the new model, do it.” Claude asked some useful questions, then did it. Meanwhile, I read Wikipedia. (I also spent some time reading the Swift source code.)

As things stand now, the app seems very likely to be useful. There are a bunch of things I still want to add. For some of them, I have a good picture of how to get there. For others, I only know the general idea. In both cases, I feel confident that I can get closer to what I want without too much serious effort. Pruning development dead ends is cheap.

You can read the whole development transcript, but it’s long. Firefox says 400 pages. But it’s there in case you want to look.

Here’s the app, loaded with test data. (There’s a Perl program to spit out predictable test data which can be imported into the app for testing.)

today's todo

Here’s today’s page, and you can see what I’ve done and haven’t. At the bottom, if you squint, you might see that one of my code review tasks says “carried over - 1 day ago”, meaning that I first put it on my list yesterday, but still haven’t done it.

If we go back a while, we can see what a “finished” day looks like:

a completed page

Now I can see all the things I did, when I marked them done, their category, and so on. I’m afraid I don’t have any days logged now that show some other things that could happen: things that didn’t get done would be shown in a “not done” section, showing that they were carried over and (maybe) done four days later. Some items could be shown as abandoned – I decided not to do them or carry them over. This is useful for those “fill the water” tasks. If I didn’t do that task on Monday, then when Tuesday starts, Monday’s todo is automatically abandoned. You can see the distinction in the previous screenshot: tasks that will carry over get round ticky bubbles, but tasks that will get auto-abandoned get square ticky boxes.

This is all pretty good, but wasn’t this supposed to help with Scrum? Well, it does! There’s a menu option to generate a “mrkdwn” (Slack’s bogus Markdown for chat) version of the day’s todo list into the macOS clipboard. Then I paste that into Slack. I can configure the report (or multiple versions of a report) so it doesn’t include personal items, for example. All of that - reporting, categories, and so on - is handled in the bundle manager.

the bundle manager

The bundle manager is named for “bundles”, which are groups of tasks that I can dump onto my list with two clicks. I have one for the start of a sprint, and I have another for standard work days. I imagine that I’ll have other bundles later for things like “prepare to travel” or “conference day”. But when I click “start a new day”, I get a blank page, and I know I better start with my daily bundle.

…and one of the items on my daily bundle is “make the code review tasks”. It’s got a hyperlink (you may have noticed that todo items can have a little link icon). The hyperlink is an iterm: URI that, when clicked, prompts me to run a little Perl program. That program fetches all the GitLab and GitHub code review requests waiting on me, turns them into JSON, and passes that to another little program that turns them into todos in NerfJournal. So I click the link, click “yes, run this program”, and then a bunch of specific-to-today tasks show up. Then I mark the first task done. I am getting all my code review done daily, just about. It’s a big process improvement.

wasn’t this post about Claude?

Well, sort of. I did all this with Claude. I described what I wanted, and I said I wanted it in SwiftUI, and Claude got to work building. I’d test, find bugs, realize that I had the wrong design, and iterate. I spent a big hunk of two days on this, and it has been a huge win. I could’ve built this on my own, for sure, but it would’ve taken weeks, at least, including “learn SwiftUI from scratch”. Possible, of course, but a much larger investment on a tool that, in the end, I might not have liked!

Is the code bad? I’m not sure. I don’t think so, but I’m not a Swift expert yet. But also: it only runs on my machine. I can see everything it does, and I can see it’s safe. I do not plan to sell it, support it, or run my business on it. Effectively, I summoned into existence a specialized tool that helps me do the job at which I am an expert, saving my expert time for expert problems. I think I will end up doing a lot of this. And hopefully I’ll pick up some new skills, as I go, from paying close attention to the new code I’m reading.

I had Claude produce a complete daily to-do checklist for me – something like a bullet journal. The results were great, and I’ve been using the program for weeks and it’s definitely helping me stay on track and keep an eye on what I’m doing. The problem was that unlike everything else I’d had Claude write, I was not competent to review this work. I felt confident it wasn’t going to cause me big problems, but what if I wanted to debug it myself? I realized there was a plausible solution waiting for me…

I gave Claude a prompt that went roughly like this:

We have built a really useful program here, and I am using it and enjoy it. Next, I would like to be able to work on it directly and to think more deeply about its architecture. The problem is that I don’t really know Swift.

I am an experienced programmer with decades of experience. I have worked primarily in Perl, but am well-versed in general programming concepts from many other different languages. It seems relevant so: I understand event-driven programming, observers, and MVC, in general.

I would like you to prepare a syllabus for me, which will help me learn SwiftUI, using NerfJournal as a working example. What might be useful here is a set of topics, starting with fundamentals and building to more complex ones, related to how the project actually works.

This began a back and forth that didn’t go on all that long. (You can read the transcript.) Claude produced a syllabus. I proposed that we turn the project into a website. We fought with Jekyll for a while. Claude told me that I wouldn’t need some skills I thought I might want. (Later, I did want them.)

Still, in short order, I had: Unit 1: Swift as a Language. It started like this:

Before touching SwiftUI, you need the language it’s built on. Swift is statically typed, compiled, and designed around a distinction — value types vs. reference types — that will shape every decision in the units that follow.

This unit covers the language features you’ll see constantly in NerfJournal’s source: structs, enums, optionals, protocols, extensions, modules, closures, and computed properties. None of this is SwiftUI-specific; it’s just Swift.

The single most important idea in this unit is that structs are value types. Everything else makes more sense once that has settled in.

I felt that the text was good. It wasn’t confusing. It wasn’t unclear. It also didn’t captivate me or lead me to imagine I was reading a lost work of Edward Gibbon. But I didn’t need that, I just needed something to systematically help me learn SwiftUI, with an eye to working on the project I’d summoned into existence. On that front, the text was good.

Eventually, I did end up creating some skills and standing instructions. First, the standing instruction:

When the user asks a question about Swift or SwiftUI during a learning discussion, log it to learning/questions.md under the appropriate unit heading, then commit it. Do this automatically without being prompted.

As I read the content, I’d do all the things I’d normally do when reading a programming book: I’d close my eyes and think hard. I’d fiddle with the source code to see how things changed. I’d go consult the authoritative documentation. But sometimes, I’d also (or instead), ask Claude to elaborate on something.

At some point, the text said that extensions were “module-scoped”. I had no idea what a module was. The text didn’t say. Rather than consult the docs, I just asked Claude: “You refer to module scope. What is a module? Is this going to be explained later? If so, no problem.”

Claude said that no, its plan hadn’t included modules, and really they belonged in unit one. It provided me a clear and useful explanation and then, without prompting, wrote a commit to add the explanation to the Q&A appendix of the book. More questions like this came up, and Claude would populate the Q&A section.

Later, I added a skill, ‘next-chapter’:

Write the next unit of NerfLearning.

First, rebase this branch on main.

Review the changes between the state of this branch before rebasing and after. If changes to the project suggest that learning/SYLLABUS.md should be updated for future chapters, make those changes and commit it.

Then review the file learning/questions.md, which reflects questions from the reader during the last unit. Merge the material from the questions into the unit they came from. Remove the now-merged questions from the questions file. Commit that.

Then write the next unit from the syllabus. When doing so, reflect on the question-and-answers content you just merged into the previous unit. That reflects the kind of thing that the reader felt was missing from the text.

Commit the new unit.

I asked Claude to write Unit 2, and it did so. “It seems like the user wants more implementation details,” it mused, “I should make sure to cover how @ViewBuilder actually works.” Then it spit out another unit. Was the unit actually better because of those instructions? How the heck should I know!? But it remained good.

I’m up to unit six now, where I’m stalled mostly due to other things taking my time. I actually feel like I can read the whole program and pretty much follow along with what it’s doing, how the syntax works, how the SwiftUI “magic” is suffused through the system, and how I’d change things in significant ways. I’m no expert. At best, I’m a beginner, but I have been given a huge boost in my learning process.

Of course this sort of process could go haywire. I would not want to learn a foreign language or culture this way and then go on a diplomatic mission. Software learning is much more forgiving, because so much of it can be trivially verified by checking authoritative sources or performing experiments. Also, I’ve got a lot of experience to draw on. But even so, it’s clear that this has been valuable and I’ll do something like this again.

There is sometimes an argument that “why will anybody learn anything anymore if the computer can do the work?” I don’t get this argument. Sure, some people will try to get by on the minimum, but that’s already the case. Now there are some longer levers for just skating by. But the same levers can be used to learn more, to achieve more, and to experiment more. I don’t think any of this is an unvarnished good, but it’s also clearly not just spicy autocorrect.

I’m hoping to get back to SwiftUI in a week or two. I’m piling up a number of little features I’d like to implement, and might try a few by hand.

You can read NerfLearning, up to wherever I’ve gotten to, if you like… but it’s targeting a pretty darn small audience.

I'm currently on a train from Berlin to Strasbourg and then onward to Marseille, traveling from the 28th(!) German Perl Workshop to the Koha Hackfest. I spent a few days after the Perl Workshop in Berlin with friends from school who moved to Berlin during/after university, hanging around at their homes and neighborhoods, visiting museums, professional industrial kitchens and other nice and foody places. But I want to review the Perl Workshop, so:

German Perl Workshop

It seems the last time I attended a German Perl Workshop was in 2020 (literally days before the world shut down...), so I've missed a bunch of nice events and possibilities to meet up with old Perl friends. But even after this longish break it felt a bit like returning home :-)

I traveled to Berlin by sleeper train (worked without a problem) arriving on Monday morning a few hours before the workshop started. I went to a friend's place (where I'm staying for the week), dumped my stuff, got a bike, and did a nice morning cycle through Tiergarten to the venue. Which was an actual church! And not even a secularized one.

Day 1

After a short introduction and welcome by Max Maischein (starting with a "Willkommen, liebe Gemeinde" fitting the location) he started the workshop with a talk on Claude Code and Coding-Agents. I only recently started to play around a bit with similar tools, so I could relate to a lot of the topics mentioned. And I (again?) need to point out the blog post I Sold Out for $20 a Month and All I Got Was This Perfectly Generated Terraform which sums up my feelings and experiences with LLMs much better than I could.

Abigail then shared a nice story on how they (Booking.com) sharded a database, twice, using some "interesting" tricks to move the data around and still getting reads from the correct replicas, all with nearly no downtime. Fun, but as "my" projects usually operate on a much smaller scale than Booking I will probably not try to recreate their solution.

For lunch I met with Michael at a nearby market hall for some Vietnamese food to do some planning for the upcoming Perl Toolchain Summit in Vienna.

Lars Dieckow then talked about data types in databases, or actually the lack of more complex types in databases and how one could still implement such types in SQL. Looks interesting, but probably a bit too hackish for me to actually use. I guess I have to continue handling such cases in code (which of course feels ugly, especially as I've learned to move more and more code into the DB using CTEs and window functions).

Next Flavio S. Glock showed his very impressive progress with PerlOnJava, a Perl distribution for the JVM. Cool, but probably not something I will use (mostly because I don't run Java anywhere, so adding it to our stack would make things more complex).

Then Lars showed us some of his beloved tools in Aus dem Nähkästchen, continuing a tradition started by Sven Guckes (RIP). I am already using some of the tools (realias, fzf, zoxide, htop, ripgrep) but now plan to finally clean up my dotfiles using xdg-ninja.

Now it was time for my first talk at this workshop, on Using class, the new-ish feature available in Perl (since 5.38) providing native keywords for object-oriented programming. I also sneaked in some bibliographic data structures (MAB2 and MARCXML) to share my pain with the attendees. I was a tiny bit (more) nervous, as this was the first time I was using my current laptop (a Framework running Sway/Wayland) with an external projector, but wl-present worked like a charm. After the talk Wolfram Schneider showed me his MAB2->MARC online converter, which could maybe have been a basis for our tool, but then writing our own was a "fun" way to learn about MAB2.

The last talk of the day was Lee Johnson with I Bought A Scanner showing us how he got an old (ancient?) high-res photo scanner working again to scan his various film projects. Fun and interesting!

Between the end of the talks and the social event I went for some coffee with Paul Cochrane, and we were joined by Sawyer X and Flavio and some vegan tiramisu. Paul and I then cycled to the Indian restaurant through some light drizzle and along the Spree, and only then did I realize that Paul had cycled all the way from Hannover to Berlin. I was a bit envious (even though I in fact did cycle to Berlin 16 years ago (oh my, so long ago..)). Dinner was nice, but I did not stay too long.

Day 2

Tuesday started with Richard Jelinek first showing us his rather impressive off-grid house (or "A technocrat's house - 2050s standard") and the software used to automate it before moving on to the actual topic of his talk, Perl mit AI, which turned out to be about a Perl implementation in Rust called pperl developed with massive LLM support. Which seems to be rather fast. As with PerlOnJava, I'm not sure I really want to use an alternative implementation (and of course currently pperl is marked as "Research Preview — WORK IN PROGRESS — please do not use in production environments") but maybe I will give it a try when it's more stable. Especially since we now have containers, which make setting up some experimental environments much easier.

Then Alexander Thurow shared his Thoughts on (Modern?) Software Development: lots of inspirational (or depressing) quotes and some LLM criticism that had been lacking at the workshop (until now..)

Next up was Lars (again) with a talk on Hierarchien in SQL where we did a very nice derivation on how to get from some handcrafted SQL to recursive CTEs to query hierarchical graph data (DAG). I used (and even talked about) recursive CTEs a few times, but this was by far the best explanation I've ever seen. And we got to see some geizhals internals :-)

Sören Laird Sörries informed us about Digitale Souveränität und Made in Europe and I'm quite proud to say that I'm already using a lot of the services he showed (mailbox, Hetzner, fairphone, ..) though we could still do better (e.g. one project is still using a bunch of Google services).

Then Salve J. Nilsen (whose name I promise not to mangle anymore) showed us his thoughts on What might a CPAN Steward organization look like?. We already talked about this topic a few weeks ago (in preparation for the Perl Toolchain Summit), so I was not paying a lot of attention (and instead hacked up a few short slides for a lightning talk) - sorry. But in the discussion afterwards Salve clarified that the Cyber Resilience Act applies to all "CE-marked products" and that even a Perl API backend that powers a mobile app running on a smartphone counts as a "CE-marked product". Before that I was under the assumption that only software running on actual physical products needs the attestation. So we should really get this Steward organization going and hopefully even profit from it!

The last slot of the day was filled with the Lightning Talks hosted by R Geoffrey Avery and his gong. I submitted two and got a "double domm" slot, where I hurried through my microblog pipeline (on POSSE and getting not-twitter-tweets from my command line via some gitolite to my self hosted microblog and then on to Mastodon) followed by taking up Lars' challenge to show stuff from my own "Nähkästchen", in my case gopass and tofi (and some bash pipes) for an easy password manager.

We had the usual mixture of fun and/or informative short talks, but the highlight for me was Sebastian Gamaga, who did his first talk at a Perl event on How I learned about the problem differentiating a Hash from a HashRef. Good slides, well executed and showing a problem that I'm quite sure everybody encountered when first learning Perl (and I have to admit I also sometimes mix up hash/ref and regular/curly-braces when setting up a hash). Looking forward to a "proper" talk by Sebastian next year :-)

This evening I skipped having dinner with the Perl people, because I had to finish some slides for Wednesday and wanted to hang out with my non-Perl friends. But I've heard that a bunch of people had fun bouldering!

Day 3

I had a job call at 10:00 and (unfortunately) a bug to fix, so I missed the three talks in the morning session and only arrived at the venue during lunch break and in time for Paul Cochrane talking about Getting FIT in Perl (and fit he did get, too!). I've only recently started to collect exercise data (as I got a sport watch for my birthday) and being able to extract and analyze the data using my own software is indeed something I plan to do.

Next up was Julien Fiegehenn on Turning humans into SysAdmins, where he showed us how he used LLMs to adapt his developer mentorship framework to also work for sysadmins, and how he got them (LLMs, not fresh sysadmins) to differentiate between Julian and Julien (among other things..)

For the final talk it was my turn again: Deploying Perl apps using Podman, make & gitlab. I'm not too happy with the slides, as I had to rush a bit to finish them and did not properly highlight all the important points. But it still went well (enough) and it seemed that a few people found one of the main points (using bash / make in gitlab CI instead of specifying all the steps directly in .gitlab-ci.yml) useful.

Then Max spoke the closing words and announced the location of next year's German Perl Workshop, which will take place in Hannover! Nice, I've never been there and plan to attend (and maybe join Paul on a bike ride there?)

Summary

As usual, a lot of thanks to the sponsors, the speakers, the orgas and the attendees. Thanks for making this nice event possible!

TPRC Talk Submission Deadline extended

Perl Foundation News

We are re-opening the talk submissions with a new deadline of April 21, 2026. Please submit your 20-minute and 50-minute talks at https://tprc.us/. Let us know if you need help with your submission or your talk development, because we have mentors who can listen to your ideas and guide you.

We are also taking submissions for interactive sessions. These are sessions that have a theme, but invite maximum audience participation; sessions which take advantage of the gathering of community members that have a wide range of experience and ideas to share. You would introduce the theme and moderate the session. If you have ideas for interactive sessions, but don’t want to moderate them yourself, please go to our wiki to enter your ideas, and maybe someone else will pick up the ball!

Still on the [b]leading edge

Perl Hacks

About eighteen months ago, I wrote a post called On the Bleading Edge about my decision to start using Perl’s new class feature in real code. I knew I was getting ahead of parts of the ecosystem. I knew there would be occasional pain. I decided the benefits were worth it.

I still think that’s true.

But every now and then, the bleading edge reminds you why it’s called that.

Recently, I lost a couple of days to a bug that turned out not to be in my code, not in the module I was installing, and not even in the module that module depended on — but in the installer’s understanding of modern Perl syntax.

This is the story.

The Symptom

I was building a Docker image for Aphra. As part of the build, I needed to install App::HTTPThis, which depends on Plack::App::DirectoryIndex, which depends on WebServer::DirIndex.

The Docker build failed with this error:

#13 45.66 --> Working on WebServer::DirIndex
#13 45.66 Fetching https://www.cpan.org/authors/id/D/DA/DAVECROSS/WebServer-DirIndex-0.1.3.tar.gz ... OK
#13 45.83 Configuring WebServer-DirIndex-v0.1.3 ... OK
#13 46.21 Building WebServer-DirIndex-v0.1.3 ... OK
#13 46.75 Successfully installed WebServer-DirIndex-v0.1.3
#13 46.84 ! Installing the dependencies failed: Installed version (undef) of WebServer::DirIndex is not in range 'v0.1.0'
#13 46.84 ! Bailing out the installation for Plack-App-DirectoryIndex-v0.2.1.

Now, that’s a deeply confusing error message.

It clearly says that WebServer::DirIndex was successfully installed. And then immediately says that the installed version is undef and not in the required range.

At this point you start wondering if you’ve somehow broken version numbering, or if there’s a packaging error, or if the dependency chain is wrong.

But the version number in WebServer::DirIndex was fine. The module built. The tests passed. Everything looked normal.

So why did the installer think the version was undef?

When This Bug Appears

This only shows up in a fairly specific situation:

  • A module uses modern Perl class syntax
  • The module defines a $VERSION
  • Another module declares a prerequisite with a specific version requirement
  • The installer tries to check the installed version without loading the module
  • It uses Module::Metadata to extract $VERSION
  • And the version of Module::Metadata it is using doesn’t properly understand class

If you don’t specify a version requirement, you’ll probably never see this. Which is why I hadn’t seen it before. I don’t often pin minimum versions of my own modules, but in this case, the modules are more tightly coupled than I’d like, and specific versions are required.

So this bug only appears when you combine:

modern Perl syntax + version checks + older toolchain

Which is pretty much the definition of “bleading edge”.

The Real Culprit

The problem turned out to be an older version of Module::Metadata that had been fatpacked into cpanm.

cpanm uses Module::Metadata to inspect modules and extract $VERSION without loading the module. But the older Module::Metadata didn’t correctly understand the class keyword, so it couldn’t work out which package the $VERSION belonged to.

So when it checked the installed version, it found… nothing.

Hence:

Installed version (undef) of WebServer::DirIndex is not in range ‘v0.1.0’

The version wasn’t wrong. The installer just couldn’t see it.

As an aside, you may find it amusing to hear an anecdote from my attempts to debug this problem.

I spun up a new Ubuntu Docker container, installed cpanm and tried to install Plack::App::DirectoryIndex. Initially, this gave the same error message. At least the problem was easily reproducible.

I then ran code that was very similar to the code cpanm uses to work out what a module’s version is.

$ perl -MModule::Metadata -E'say Module::Metadata->new_from_module("WebServer::DirIndex")->version'

This displayed an empty string. I was really onto something here. Module::Metadata couldn’t find the version.

I was using Module::Metadata version 1.000037 and, looking at the change log on CPAN, I saw this:

1.000038  2023-04-28 11:25:40Z
  - detects "class" syntax

I installed 1.000038 and reran my command:

$ perl -MModule::Metadata -E'say Module::Metadata->new_from_module("WebServer::DirIndex")->version'
0.1.3

That seemed conclusive. Excitedly, I reran the Docker build.

It failed again.

You’ve probably worked out why. But it took me a frustrating half an hour to work it out.

cpanm doesn’t use the installed version of Module::Metadata. It uses its own, fatpacked version. Updating Module::Metadata wouldn’t fix my problem.

The Workaround

I found a workaround. That was to add a redundant package declaration alongside the class declaration, so older versions of Module::Metadata can still identify the package that owns $VERSION.

So instead of just this:

class WebServer::DirIndex {
  our $VERSION = '0.1.3';
  ...
}

I now have this:

package WebServer::DirIndex;

class WebServer::DirIndex {
  our $VERSION = '0.1.3';
  ...
}

It looks unnecessary. And in a perfect world, it would be unnecessary.

But it allows older tooling to work out the version correctly, and everything installs cleanly again.

The Proper Fix

Of course, the real fix was to update the toolchain.

So I raised an issue against App::cpanminus, pointing out that the fatpacked Module::Metadata was too old to cope properly with modules that use class.

Tatsuhiko Miyagawa responded very quickly, and a new release of cpanm appeared with an updated version of Module::Metadata.

This is one of the nice things about the Perl ecosystem. Sometimes you report a problem and the right person fixes it almost immediately.

When Do I Remove the Workaround?

This leaves me with an interesting question.

The correct fix is “use a recent cpanm”.

But the workaround is “add a redundant package line so older tooling doesn’t get confused”.

So when do I remove the workaround?

The answer is probably: not yet.

Because although a fixed cpanm exists, that doesn’t mean everyone is using it. Old Docker base images, CI environments, bootstrap scripts, and long-lived servers can all have surprisingly ancient versions of cpanm lurking in them.

And the workaround is harmless. It just offends my sense of neatness slightly.

So for now, the redundant package line stays. Not because modern Perl needs it, but because parts of the world around modern Perl are still catching up.

Life on the Bleading Edge

This is what life on the bleading edge actually looks like.

Not dramatic crashes. Not language bugs. Not catastrophic failures.

Just a tool, somewhere in the install chain, that looks at perfectly valid modern Perl code and quietly decides that your module doesn’t have a version number.

And then you lose two days proving that you are not, in fact, going mad.

But I’m still using class. And I’m still happy I am.

You just have to keep an eye on the whole toolchain — not just the language — when you decide to live a little closer to the future than everyone else.

The post Still on the [b]leading edge first appeared on Perl Hacks.

Abstract

Even if you’re skeptical about AI writing your code, you’re leaving time on the table.

Many developers have been slow to adopt AI in their workflows, and that’s understandable. As AI coding assistants become more capable, the anxiety is real - nobody wants to feel like they’re training their replacement. But we’re not there yet. Skilled developers who understand logic, mathematics, business needs and user experience will be essential to guide application development for the foreseeable future.

The smarter play is to let AI handle the parts of the job you never liked anyway - the documentation, the release notes, the boilerplate tests - while you stay focused on the work that actually requires your experience and judgment. You don’t need to go all in on day one. Here are six places to start.


1. Unit Test Writing

Writing unit tests is one of those tasks most developers know they should do more of and few enjoy doing. It’s methodical, time-consuming, and the worst time to write them is when the code reviewer asks if they pass.

TDD is a fine theory. In practice, writing tests before you’ve vetted your design means rewriting your tests every time the design evolves - which is often. Most experienced developers write tests after the design has settled, and that’s a perfectly reasonable approach.

The important thing is that they get written at all. Even a test that simply validates use_ok(qw(Foo::Bar)) puts scaffolding in place that can be expanded when new features are added or behavior changes. A placeholder is infinitely more useful than nothing.

This is where AI earns its keep. Feed it a function or a module and it will identify the code paths that need coverage - the happy path, the edge cases, the boundary conditions, the error handling. It will suggest appropriate test data sets including the inputs most likely to expose bugs: empty strings, nulls, negative numbers, off-by-one values - the things a tired developer skips.

You review it, adjust it, own it. AI did the mechanical work of thinking through the permutations. You make sure it reflects how your code is actually used in the real world.


2. Documentation

“Documentation is like sex: when it’s good, it’s very, very good; and when it’s bad, it’s better than nothing.” - said someone somewhere.

Of course, there are developers who justify their disdain for writing documentation with one of two arguments (or both):

  1. The code is the documentation
  2. Documentation is wrong the moment it is written

It is true, the single source of truth regarding what code actually does is the code itself. What it is supposed to do is what documentation should be all about. When they diverge it’s either a defect in the software or a misunderstanding of the business requirement captured in the documentation.

Code that changes rapidly is difficult to document, but the intent of the code is not. Especially now with AI. It is trivial to ask AI to review the current documentation and align it with the code, negating point #2.

Feed AI a module and ask it to generate POD. It will describe what the code does. Your job is to verify that what it does is what it should do - which is a much faster review than writing from scratch.
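For readers who haven't written POD in a while, the kind of skeleton worth asking for (and then verifying against the code) looks something like this; Foo::Bar and its synopsis are illustrative stand-ins:

```perl
=head1 NAME

Foo::Bar - one-line summary of what the module is supposed to do

=head1 SYNOPSIS

    use Foo::Bar;

    my $bar = Foo::Bar->new;

=head1 DESCRIPTION

What the module should do, and why - the part worth checking
against what the code actually does.

=cut
```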


3. Release Notes

If you’ve read this far you may have noticed the irony - this post was written by someone who just published a blog post about automating release notes with AI. So consider this section field-tested.

Release notes sit at the intersection of everything developers dislike: writing prose, summarizing work they’ve already mentally moved on from, and doing it with enough clarity that non-developers can understand what changed and why it matters. It’s the last thing standing between you and shipping.

The problem with feeding a git log to AI is that git logs are written for developers in the moment, not for readers after the fact. “Fix the thing” and “WIP” are not useful release note fodder.

The better approach is to give AI real context - a unified diff, a file manifest, and the actual source of the changed files. With those three inputs AI can identify the primary themes of a release, group related changes, and produce structured notes that actually reflect the architecture rather than just the line changes.

A simple make release-notes target can generate all three assets automatically from your last git tag. Upload them, prompt for your preferred format, and you have a first draft in seconds rather than thirty minutes. Here’s how I built it.
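As a sketch of such a target (target and file names here are illustrative, not taken from the author's actual Makefile), the three assets can be collected from git like this:

```make
# Hypothetical "release-notes" target: gather a unified diff, a file
# manifest, and the current source of every changed file since the
# last git tag, ready to hand to an AI assistant.
LAST_TAG := $(shell git describe --tags --abbrev=0)

release-notes:
	git diff $(LAST_TAG)..HEAD > release-diff.patch
	git diff --name-only $(LAST_TAG)..HEAD > release-manifest.txt
	git diff --name-only --diff-filter=d $(LAST_TAG)..HEAD | xargs cat > release-sources.txt
```

The `--diff-filter=d` on the last line skips deleted files, so `cat` only sees sources that still exist.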

You still edit it. You add color, context, and the business rationale that only you know. But the mechanical work of reading every diff and turning it into coherent prose? Delegated.


4. Bug Triage

Debugging can be the most frustrating and the most rewarding experience for a developer. Most developers are predisposed to love a puzzle and there is nothing more puzzling than a race condition or a dangling pointer. Even though books and posters have been written about debugging, it is sometimes difficult to know exactly where to start.

Describe the symptoms, share the relevant code, toss your theory at it. AI will validate or repudiate without ego - no colleague awkwardly telling you you’re wrong. It will suggest where to look, what telemetry to add, and before you know it you’re instrumenting the code that should have been instrumented from the start.

AI may not find your bug, but it will be a fantastic bug buddy.


5. Code Review

Since I started using AI, I’ve found that one of the most valuable things I can do with it is to give it my first draft of a piece of code. Anything more than a dozen or so lines is fair game.

Don’t waste your time polishing a piece of lava that just spewed from your noggin. There’s probably some gold in there and there’s definitely some ash. That’s ok. You created the framework for a discussion on design and implementation. Before you know it you have settled on a path.

AI’s strength is pattern recognition. It will recognize when your code needs to adopt a different pattern or when you nailed it. Get feedback. Push back. It’s not a one-way conversation. Question the approach, flag the inconsistencies that don’t feel right - your input into that review process is critical in evolving the molten rock into a solid foundation.


6. Legacy Code Deciphering

What defines “Legacy Code?” It’s a great question and hard to answer. And not to get too racy again, but as it has been said of pornography, I can’t exactly define it but I know it when I see it.

Fortunately (and yes, I do mean fortunately) I have been involved in maintaining legacy code since the day I started working for a family-run business in 1998. The code I maintained there was born literally in the late ’70s and still, to this day, generates millions of dollars. You will never learn more about coding than by maintaining legacy code.

These are the major characteristics of legacy code from my experience (in order of visibility):

  1. It generates so much money for a company they could not possibly think of it being unavailable.
  2. It is monolithic and may in fact consist of modules in multiple languages.
  3. It has grown organically over the decades.
  4. It is more than 10 years old.
  5. The business rules are not documented, opaque and can only be discerned by a careful reading of the software. Product managers and users think they know what the software does, but probably do not have the entire picture.
  6. It cannot easily be re-written (by humans) because of #5.
  7. It contains as much dead code, no longer serving any useful purpose, as it does useful code.

I once maintained a C program that searched an ISAM database of legal judgments. The code had been ported from a proprietary in-memory binary tree implementation and was likely older than most of the developers reading this post. The business model was straightforward and terrifying - miss a judgment and we indemnify the client. Every change had to be essentially idempotent. You weren’t fixing code, you were performing surgery on a patient who would sue you if the scar was in the wrong place.

I was fortunate - there were no paydays for a client on my watch. But I wish I’d had AI back then. Not to write the code. To help me read it.

Now, where does AI come in? Points 5, 6, and definitely 7.

Throw a jabberwocky of a function at AI and ask it what it does. Not what it should do - what it actually does. The variable names are cryptic, the comments are either missing or lying, and the original author left the company during the Clinton administration. AI doesn’t care. It reads the code without preconception and gives you a plain English explanation of the logic, the assumptions baked in, and the side effects you never knew existed.

That explanation becomes your documentation. Those assumptions become your unit tests. Those side effects become the bug reports you never filed because you didn’t know they were bugs.

Dead code is where AI particularly shines. Show it a module and ask what’s unreachable. Ask what’s duplicated. Ask what hasn’t been touched in a decade but sits there quietly terrifying anyone who considers deleting it. AI will give you a map of the minefield so you can walk through it rather than around it forever.

Along the way AI will flag security vulnerabilities you never knew were there - input validation gaps, unsafe string handling, authentication assumptions that made sense in 1998 and are a liability today. It will also suggest where instrumentation is missing, the logging and telemetry that would have made every debugging session for the last twenty years shorter. You can’t go back and add it to history, but you can add it now before the next incident.

The irony of legacy code is that the skills required to understand it - patience, pattern recognition, the ability to hold an entire system in your head - are exactly the skills AI complements rather than replaces. You still need to understand the business. AI just helps you read the hieroglyphics.


Conclusion

None of the six items on this list require you to hand over the keys. You are still the architect, the decision maker, the person who understands the business and the user. AI is the tireless assistant who handles the parts of the job that drain your energy without advancing your craft.

The developers who thrive in the next decade won’t be the ones who resisted AI the longest. They’ll be the ones who figured out earliest how to delegate the tedious, the mechanical, and the repetitive - and spent the time they saved on the work that actually requires a human.

You don’t have to go all in. Start with a unit test. Paste some legacy code and ask AI to explain it or document it. Think of AI as that senior developer you go to with the tough problems - the one who has seen everything, judges nothing, and is available at 3am when the production system is on fire.

Only this one never sighs when you knock on the door.


Is all my 2026 blogging going to be about doing more random stuff with Claude? No, I promise. But I’m still working through my backlog of “and the next thing I tried to put it through its paces”, so here it comes. This time, more drudgery.

Project Four: MIME-Lite to Dist::Zilla

Here’s a quick one before I post beefy Project Five tomorrow.

A couple weeks ago, I got a pull request against MIME-Lite. That’s a pretty popular Perl email library that I maintain, and which Fastmail uses at least a little. The contributor had supplied a patch to use GitHub Actions to run the tests on every PR. Great!

Thing was, I already have a standard way of doing that, and I didn’t really want to pick up another way. The other thing was, my standard way relies on the distribution being built with Dist::Zilla. It’s not hard to convert from not using Dist::Zilla to using it, but it’s not trivial. And I don’t love MIME::Lite, so I wasn’t keen to spend a bunch of time on it.

I fired up Claude, pretty certain that I was going to get a crappy result.

rjbs: Convert this distribution to building with Dist::Zilla, in the fashion of most RJBS-authored distributions, like Email::MIME.
claude: I’ll explore both the current MIME-Lite distribution and Email::MIME as a reference for the Dist::Zilla conversion pattern.

In about 20 minutes, it was done. Now, when I say “it”, I mean Claude. Claude wrote a big commit that wasn’t good enough. MIME-Lite had a bunch of little quirks, like optional dependencies and a slightly not-rjbs-standard documentation structure. I spent just a few minutes polishing things up and it was good to go.

This was a good example of needing some expert involvement. Understanding exactly what to do was going to require way too much context that couldn’t be found in the repository or in “understanding” of published copies of things. But some exploration of MIME-Lite and Email-MIME let Claude do a bunch of the work so I didn’t have to grumble my way through it. As I’ve seen a few times, it also did work I would not have bothered with, like converting the changelog to a more consistent format.

One funny nit… the original changelog had:

=item Version 1.101   (1997/03/01)

Baseline code.
Originally created: 11 December 1996.  Ho ho ho.

The new one:

1.101     1997-03-01
        - Baseline code. Originally created: 11 December 1996.

I guess Claude is part of the war on Christmas.

Claude also hallucinated author names weirdly and screwed up the Copyright. You can’t just let it do whatever it wants with no oversight! It was pretty bad, and I can definitely see somebody getting into hot water this way. This was probably the worst Claude experience I had that didn’t end in scrapping things. I think I probably could’ve done this one myself faster and with fewer errors. As the world’s leading Dist::Zilla expert… probably shouldn’t be a surprise.

With the conversion done, I dropped the standard CI workflow in place and… found problems in my conversion. Just what CI is for. Now it’s all sorted and happy and any new PRs will get test runs.

You can read my transcript of this conversation, but it’s a bit odd. I stopped and resumed the session at one point, and Claude waited at a permission prompt for 12 hours while I slept or something.

Originally, this was going to be the last in my series of stuff I did with Claude that I found compelling, but… the news, good or bad, is that I’ll be posting at least one more soon. This one, though, is definitely the piece of work that convinced me that Claude was useful for more than mechanical transformation.

Project Five: ClaudeLog

In some of my previous posts, I posted links to transcripts of my chats with Claude, including its internal monologue, its tools used, and other stuff that you won’t see if you just copy text out of the viewer. Those transcripts were generated by a tool that I built with Claude, which I’m going to talk about, now.

I think that the experience of using Claude in its current form (as opposed to what we saw from coding agents a year ago) is fairly stunning. It sometimes screws up, it sometimes takes four passes at doing things that any normal person would routinely do in one… but on the other hand, it can generate large, complex, readable, correct pieces of software from a standing start. Saying “it’s real good, folks” is not sufficient. I think that to understand what it can do, you should see for yourself. This is not to say that there are no reasons to hesitate or to worry about the serious impact that LLMs are having, and will continue to have. But one criticism I continue to see is “these things spew out unmaintainable nonsense”, which is not a claim that really stands up to much real use.

Also, one friend recently said to me, “I want to be a programmer, not a copy editor.” I had to explain that while, yes, you do need to read and think about possible errors in agent-generated code, the experience is much more one of design and construction than of proofreading.

Since not everybody is going to say, “Hey, Rik, let’s pair and look at Claude,” and because I am not going to pair with every rando who might ask, I thought it would be good to provide a thorough transcript. I knew that Claude keeps a detailed log of its sessions (even though, amusingly, Claude claimed it knew of no such thing).

I had looked through the JSONL files in which sessions were stored, and the data looked a bit messy, but probably sufficient. Without giving it too much investigation, I opened up Claude Code and said…

I want to provide transcripts of my interactions with Claude to colleagues who are learning to use the system. I understand that my ~/.claude directory contains transcripts of the chat sessions. Please build a TUI application that:

  1. lets me drill down through projects to individual sessions, showing previews of the sessions as I go
  2. lets me select a session to be transcribed for my use
  3. produces a Markdown summary of the session that I can embed in a document

While your general instructions say to prefer Perl, in this case I would like you to use whatever language you believe will achieve the results most successfully with a pleasant terminal interface.

You can, of course, read the transcription of this whole conversation, produced by the program that the conversation eventually led to. There’s a lot of experimentation, a fair bit of stumbling, and overall you can see how I discover what it is I want while I’m building it. This is normal development, but…

In normal development with such a significant “discovery” aspect, it’s very common to spend a lot of time upshifting and downshifting. That is, first I imagine the general structure of what I want. I write a pseudo-code outline of the high-level code. Then I begin converting individual pieces into real code. Almost continuously, I’m shifting between design thinking and implementation. These are (for me, and I suspect for others) distinct ways of thinking, and while it’s not “context switching”, there is, I think, an analogous overhead.

Using Claude, I am really focusing on one of those angles at a time. I started with “here is a very rough outline” and within 20 minutes, I had a working program. I never, ever had to switch into implementation thinking to get there. Then I had many tight, quick exchanges in the form, “This is wrong, change it” or “I’m ready to describe the next feature”.

At the top of the transcript, you’ll see this line:

Duration: 7h 50m (3h active)

This means that from the first to last event in the transcript, about eight hours passed on the clock, but I was only engaged in the chat for about three. Probably I took a long lunch in there, or maybe worked on something more pressing for a while. Or I just stopped and thought about it, or spent time reading transcripts and thinking about what could be better.

By the end of the day, I had a really useful program. The HTML it was generating was definitely fit for purpose. On the other hand, I made the mistake of looking at the code…

Look, it wasn’t absolutely impenetrable. It just felt very, very amateurish. It got the job done, and I could read it, but there was very little abstraction. There was nearly no separation of concerns. There were no real “layers” to speak of. And when layers did exist, they often duplicated work.

Hardly surprising: if you imagine Claude as often analogous to a junior programmer (not a great analogy, but often useful), and you imagine me as the lousy boss who kept saying, “Implement this feature immediately and I don’t care about code quality”, of course there was a ton of debt. And of course it racked up fast, because I was racking up features fast. The question I needed to answer was: Will Claude’s code always be unmaintainable after a short while?

I decided to answer that the next day. Instead of adding more features, I’d just look at fixing the code. It was already clear to me how the code should’ve been structured. I just didn’t tell Claude about it, because I was focused on features. Why didn’t Claude architect it well from the beginning? Because nobody asked it to. This seems sort of stupid, but it’s often how this stuff works.

So, the next day, I started a new session. There were two key parts of this:

I am very happy with the output of this program. I would now like to begin some serious refactoring for maintainability. The first step should be fairly simple. I would like you to split out the code that takes a session JSONL file and emits HTML. Put that in its own module, so that the “build HTML from JSONL” can be called without the session browser being involved.

…and…

Next, I would like to separate out the generation of HTML from the interpretation of the JSONL file. I propose the following:

  • a SessionFile object that represents the log itself, and is created with a static method like session_file = SessionFile.from_path(filepath)
  • a SessionLog object that represents a sequence of SessionTurn
  • a SessionFile has a SessionLog, which is computed lazily when requested
  • a SessionTurn is either a UserTurn, a ClaudeTurn, a SystemTurn, or an IdleTurn
  • non-IdleTurn turns have a sequence of TurnEvents, which are things like the chat, the tool use, the thinking, and so on
  • the HTML generator is passed an already-computed SessionLog, which it then just formats, rather than formatting and event-aggregating all at once

This is a significant change. Make a plan, think hard, and ask me any questions you need to ask up front.

There’s a bunch of other stuff that went on, but these are the big ones. You can read the whole refactoring transcript.

Claude thought for a while, then asked me some (very reasonable) questions. I answered them, and Claude produced a plan. Pretty soon, it all worked. The program still did just what it did before, but now the code made sense, and it was much easier to add some new features, because they could be added as small changes to a few layers, instead of changes to the one big ball of mud.

Part of what made this easy was that up front I said, “save a copy of input and output now, so that while you refactor, you can keep checking you didn’t change the output”. Claude did this, checking the output repeatedly as it worked.

Again, a lot of other stuff went on, but I think this is an accurate summary.

One way of looking at this as a two-stage process is “Rik should’ve made sure Claude did it right to start”, and that’s an okay position, but it brings back some of that upshifting and downshifting that I mentioned.

Another option is, “Rik should’ve given Claude standing instructions to pay more attention to architecture.” Yes, probably that would be useful, but also then I’d probably be spending more time thinking about its architecture, because it would be talking about its architecture.

I think the best view might be: because fixing the design later is at least sometimes not as costly (in time, not to mention morale), the human is given more freedom to choose the approach that they will find more satisfying or pleasant. In the end, we want the code to be correct, reliable, readable, and maintainable. The path we take to get there isn’t necessarily all that relevant. (But please leave a good git history.)

I’ll publish the code for ClaudeLog at some point.

Answer

You can configure grub in several ways: to boot one specific kernel, to always use the latest one, or to let you pick one from a selection.

One specific kernel

If you inspect /boot/grub/grub.cfg you’ll see entries like this:

# the \ are mine, these are usually one big line but for blog purposes I
# multilined them
menuentry 'Debian GNU/Linux GNU/Linux, with Linux 6.12.8-amd64' --class debian \
 --class gnu-linux --class gnu --class os $menuentry_id_option \
 'gnulinux-6.12.8-amd64-advanced-5522bbcf-dc03-4d36-a3fe-2902be938ed4' {

You can use either of two identifiers to configure grub: the title 'Debian GNU/Linux GNU/Linux, with Linux 6.12.8-amd64', or the $menuentry_id_option value gnulinux-6.12.8-amd64-advanced-5522bbcf-dc03-4d36-a3fe-2902be938ed4.
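To actually pin grub to one of those entries, the usual Debian approach is to set GRUB_DEFAULT in /etc/default/grub and regenerate the config. A sketch (substitute the identifiers from your own grub.cfg):

```shell
# /etc/default/grub
# Either the menu title:
#GRUB_DEFAULT='Debian GNU/Linux GNU/Linux, with Linux 6.12.8-amd64'
# ... or, more robustly, the $menuentry_id_option value. Entries nested
# under a submenu need the form "submenu-id>entry-id".
GRUB_DEFAULT="gnulinux-6.12.8-amd64-advanced-5522bbcf-dc03-4d36-a3fe-2902be938ed4"
```

Then run update-grub (or grub-mkconfig -o /boot/grub/grub.cfg) so the change takes effect at the next boot.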

The Problem: Generating Release Notes is Boring

You’ve just finished a marathon refactoring - perhaps splitting a monolithic script into proper modules - and now you need to write the release notes. You could feed an AI a messy git log, but if you want high-fidelity summaries that actually understand your architecture, you need to provide better context.

The Solution: AI Loves Boring Tasks

…and is pretty good at them too!

Instead of manually describing changes or hoping the AI can interpret my ChangeLog, I’ve automated the production of three ephemeral “Sidecar” assets. These are generated on the fly, uploaded to the LLM, and then purged after analysis - no storage required.

The Assets

  • The Manifest (.lst): A simple list of every file touched, ensuring the AI knows the exact scope of the release.
  • The Logic (.diffs): A unified diff (using git diff --no-ext-diff) that provides the “what” and “why” of every code change.
  • The Context (.tar.gz): This is the “secret sauce.” It contains the full source of the changed files, allowing the AI to see the final implementation - not just the delta.

The Makefile Implementation

If you’ve read any of my blog posts you know I’m a huge Makefile fan. To automate this I’m naturally going to add a recipe to my Makefile or Makefile.am.

First, we explicitly set the shell to /usr/bin/env bash to ensure features like brace expansion work consistently across all dev environments.

# Ensure a portable bash environment for advanced shell features
SHELL := /usr/bin/env bash

.PHONY: release-notes clean-local

# Default to the version file, but allow command-line overrides
VERSION ?= $(shell cat VERSION)

release-notes:
    @curr_ver=$(VERSION); \
    last_tag=$$(git tag -l '[0-9]*.[0-9]*.[0-9]*' --sort=-v:refname | head -n 1); \
    diffs="release-$$curr_ver.diffs"; \
    diff_list="release-$$curr_ver.lst"; \
    diff_tarball="release-$$curr_ver.tar.gz"; \
    echo "Comparing $$last_tag to current $$curr_ver..."; \
    git diff --no-ext-diff "$$last_tag" "$$curr_ver" > "$$diffs"; \
    git diff --name-only --diff-filter=AMR "$$last_tag" "$$curr_ver" > "$$diff_list"; \
    tar -cf - -T "$$diff_list" --transform "s|^|release-$$curr_ver/|" | gzip > "$$diff_tarball"; \
    ls -alrt release-$$curr_ver*

clean-local:
    @echo "Cleaning ephemeral release assets..."
    rm -f release-*.{tar.gz,lst,diffs}

Breaking Down the Recipe

  • The Shell Choice (/usr/bin/env bash): We avoid hardcoding paths to ensure the script finds the correct Bash path on macOS, Linux, or inside a container.
  • The Version Override (VERSION ?=): This allows the “pre-flight” trick: running make release-notes VERSION=HEAD to iterate on notes before you’ve actually tagged the release.
  • Smart Tag Discovery (--sort=-v:refname): The v:refname sort makes Git compare tags as version numbers (so 1.10.0 correctly follows 1.2.0), while the glob pattern filters out “noisy” non-version tags.
  • The Diff Filter (--diff-filter=AMR): This ensures the tarball only includes files that actually exist (Added, Modified, or Renamed). If a release deleted a file, this filter prevents tar from erroring out when it can’t find the missing file on disk.
  • The Cleanup Crew (clean-local): remove the ephemeral artifacts using bash expansion

The AI Prompt

Once your assets are generated, upload them to AI and use a prompt that enforces your specific formatting standards.

The Prompt:

I have uploaded a file manifest (`.lst`), a unified diff
(`.diffs`), and a tarball of the source code (`.tar.gz`). Please
analyze these to identify the primary themes of this
release. Generate a release entry using this protocol:

1. Separate versions with 71 dashes.
2. Include a high-level executive summary paragraph.
3. Indent bullet points exactly 21 spaces to maintain a vertical column.
4. Group changes under category labels like `Core Bedrock:` or `Miniserver:`.
5. Provide plain-text ASCII only inside a code block.

The Result

In a recent dry run for Bedrock 3.8.8, this exact workflow allowed the AI to correctly identify that bedrock-miniserver.pl.in was refactored into Bedrock::MiniServer and Bedrock::MiniServer::CLI, while accurately noting that Bedrock::Error moved to the Bedrock-Core distribution.

In actuality, version 3.8.8 focuses on the primary goals of MiniServer maturation (more batteries included), Apache decoupling, and distribution hygiene. Here’s AI’s version:

-----------------------------------------------------------------------
Bedrock 3.8.8
-----------------------------------------------------------------------
This release focuses on the maturation of the MiniServer component, 
decoupling core Bedrock logic from Apache-specific dependencies, and 
improving distribution hygiene. Key architectural changes include 
refactoring the MiniServer into modular components and moving 
exception handling into the Bedrock-Core distribution.

2026-03-17 - 3.8.8 - MiniServer Maturation and Apache Decoupling
                     Miniserver:
                     - Refactored bedrock-miniserver.pl into modular 
                       Bedrock::MiniServer and Bedrock::MiniServer::CLI.
                     - Implemented zero-config scaffolding to 
                       automatically create application trees.
                     - Integrated full Bedrock configuration pipeline 
                       for parity with Apache environments.
                     - Updated bedrock_server_config to support both 
                       getter and setter operations.

                     Core:
                     - Moved Bedrock::Error and Bedrock::Exception to 
                       the Bedrock-Core distribution.
                     - Introduced Bedrock::FauxHandler as a production-
                       ready alias for test handlers.
                     - Added dist_dir() to BLM::Startup::Bedrock to 
                       expose distribution paths to templates.

                     Fixes:
                     - Demoted Apache-specific modules (mod_perl2, 
                       Apache2::Request) to optional recommendations.
                     - Improved Bedrock::Test::FauxHandler to handle 
                       caller-supplied loggers and safe destruction.

Conclusion

As I mentioned in a response to a recent Medium article, AI can be an accelerator for seasoned professionals. You’re not cheating. You did the work. AI does the wordsmithing. You edit, add color, and ship. What used to take 30 minutes now takes 3. Now that’s working smarter, not harder!

Pro-Tip

Add this to the top of your Makefile

SHELL := /usr/bin/env bash

# Default to the version file, but allow command-line overrides
VERSION ?= $(shell cat VERSION)

Copy this to a file named release-notes.mk

.PHONY: release-notes clean-local

release-notes:
    @curr_ver=$(VERSION); \
    last_tag=$$(git tag -l '[0-9]*.[0-9]*.[0-9]*' --sort=-v:refname | head -n 1); \
    diffs="release-$$curr_ver.diffs"; \
    diff_list="release-$$curr_ver.lst"; \
    diff_tarball="release-$$curr_ver.tar.gz"; \
    echo "Comparing $$last_tag to current $$curr_ver..."; \
    git diff --no-ext-diff "$$last_tag" "$$curr_ver" > "$$diffs"; \
    git diff --name-only --diff-filter=AMR "$$last_tag" "$$curr_ver" > "$$diff_list"; \
    tar -cf - -T "$$diff_list" --transform "s|^|release-$$curr_ver/|" | gzip > "$$diff_tarball"; \
    ls -alrt release-$$curr_ver*

clean-local:
    @echo "Cleaning ephemeral release assets..."
    rm -f release-*.{tar.gz,lst,diffs}

Then include release-notes.mk from your Makefile

include release-notes.mk

Here’s another post in which I fiddle around with Claude Code and determine that it is not just spewing out nonsense and bugs, but instead is doing kinda what I would’ve done, at least enough to reduce my total effort. This time, writing tests.

Project Three: JMAP-Tester test coverage

Okay, I progressed from code I’d throw away to code I would keep but not look at. I was progressing up the totem pole of how much cultural value we put on code. What was the next least respected code? Tests, of course.

Now, I actually love tests, and like treating them like first-class code, and building libraries to make testing better. One of those libraries is JMAP::Tester, which we use in tons of our testing. Until pretty recently, it didn’t have all that much testing of its own. That is: JMAP-Tester was used to test things, but was not itself tested. In December, as part of adding some features to JMAP::Tester, I started to expand its test coverage. This was rewarding and useful, but I didn’t get to 100% coverage. I used to strive for 100% (well, 95%) coverage on my code, but these days… well, who has the time?

Turns out, Claude has the time. This one was pretty darn impressive. You can read the whole transcript, but here’s an accurate summary:

rjbs: This project is a CPAN distribution. Produce a coverage report, which you can do with “dzil cover”. Find low-hanging fruit to add test coverage and draft the tests.
(time passes)
claude: I’ve increased test coverage from about 50% to about 95%.

That was it! You can read the pull request.

Well, there were a couple more bits, mostly me saying, “Make it look like how I’d have written it” (literally), and some git faff. I did go through and tweak the code to be more readable and better organized. I could’ve given Claude clearer instructions to avoid most of that, or had standing instructions about it… but really, it was a good way to keep my mind engaged while I reviewed that the code was testing what it should be, anyway.

If “write more coverage tests” was the only thing I could get out of Claude, it’d still be huge, but obviously there’s more. I’ll keep posting…

Let’s talk about music programming! There are a million aspects to this subject, but today, we’ll touch on generating rhythmic patterns with mathematical and combinatorial techniques. These include the generation of partitions, necklaces, and Euclidean patterns.

Stefan and J. Richard Hollos wrote an excellent little book called “Creating Rhythms”, whose algorithms have been implemented in C, Perl, and Python. It features a number of algorithms that produce or modify lists of numbers or bit-vectors (of ones and zeroes). These can be the beat onsets (the ones) and rests (the zeroes) of a rhythm. We’ll check out these concepts with Perl.

For each example, we’ll save the MIDI with the MIDI::Util module. Also, in order to actually hear the rhythms, we will need a MIDI synthesizer. For these illustrations, fluidsynth will work. Of course, any MIDI-capable synth will do! I often control my eurorack analog synthesizer with code (and a MIDI interface module).

Here’s how I start fluidsynth on my mac in the terminal, in a separate session. It uses a generic soundfont file (sf2) that can be downloaded here (124MB zip).

fluidsynth -a coreaudio -m coremidi -g 2.0 ~/Music/soundfont/FluidR3_GM.sf2

So, how does Perl know what output port to use? There are a few ways, but with JBARRETT’s MIDI::RtMidi::FFI::Device, you can do this:

use MIDI::RtMidi::FFI::Device ();

my $midi_in = RtMidiIn->new;
my $midi_out = RtMidiOut->new;

print "Input devices:\n";
$midi_in->print_ports;
print "\n";

print "Output devices:\n";
$midi_out->print_ports;
print "\n";

This shows that fluidsynth is alive and ready for interaction.

Okay, on with the show!

First-up, let’s look at partition algorithms. With the part() function, we can generate all partitions of n, where n is 5, and the “parts” all add up to 5. Then taking one of these (say, the third element), we convert it to a binary sequence that can be interpreted as a rhythmic phrase, and play it 4 times.

#!/usr/bin/env perl
use strict;
use warnings;

use Music::CreatingRhythms ();

my $mcr = Music::CreatingRhythms->new;

my $parts = $mcr->part(5);
# [ [ 1, 1, 1, 1, 1 ], [ 1, 1, 1, 2 ], [ 1, 2, 2 ], [ 1, 1, 3 ], [ 2, 3 ], [ 1, 4 ], [ 5 ] ]

my $p = $parts->[2]; # [ 1, 2, 2 ]

my $seq = $mcr->int2b([$p]); # [ [ 1, 1, 0, 1, 0 ] ]
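The int2b conversion itself is straightforward: each interval m becomes an onset followed by m - 1 rests. A minimal sketch of the idea in Python (for illustration only; Music::CreatingRhythms works on lists of sequences, and this flat version is not its actual code):

```python
def int2b(intervals):
    """Each interval m becomes an onset (1) followed by m - 1 rests (0)."""
    bits = []
    for m in intervals:
        bits += [1] + [0] * (m - 1)
    return bits

print(int2b([1, 2, 2]))  # [1, 1, 0, 1, 0]
```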

Now we render and save the rhythm:

use MIDI::Util qw(setup_score);

my $score = setup_score(bpm => 120, channel => 9);

for (1 .. 4) {
    for my $bit ($seq->[0]->@*) {
        if ($bit) {
            $score->n('en', 40);
        }
        else {
            $score->r('en');
        }
    }
}

$score->write_score('perldotcom-1.mid');

In order to play the MIDI file that is produced, we can use fluidsynth like this:

fluidsynth -i ~/Music/soundfont/FluidR3_GM.sf2 perldotcom-1.mid

Not terribly exciting yet.

Let’s see what the “compositions” of a number reveal. According to the Music::CreatingRhythms docs, a composition of a number is “the set of combinatorial variations of the partitions of n with the duplicates removed.”

Okay. Well, the 7 partitions of 5 are:

[[1, 1, 1, 1, 1], [1, 1, 1, 2], [1, 1, 3], [1, 2, 2], [1, 4], [2, 3], [5]]

And the 16 compositions of 5 are:

[[1, 1, 1, 1, 1], [1, 1, 1, 2], [1, 1, 2, 1], [1, 1, 3], [1, 2, 1, 1], [1, 2, 2], [1, 3, 1], [1, 4], [2, 1, 1, 1], [2, 1, 2], [2, 2, 1], [2, 3], [3, 1, 1], [3, 2], [4, 1], [5]]

That is, the list of compositions has, not only the partition [1, 2, 2], but also its variations: [2, 1, 2] and [2, 2, 1]. Same with the other partitions. Selections from this list will produce possibly cool rhythms.
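If you want to convince yourself of the counts, a brute-force sketch in Python (purely for illustration; Music::CreatingRhythms computes these for you): a composition of n is an ordered sequence of positive integers summing to n.

```python
def compositions(n):
    """All ordered sequences of positive integers summing to n."""
    if n == 0:
        return [[]]
    result = []
    for first in range(1, n + 1):
        for rest in compositions(n - first):
            result.append([first] + rest)
    return result

comps = compositions(5)
print(len(comps))  # 16 compositions of 5
# Keeping only those with 3 parts mirrors compm(5, 3) in the article.
print([c for c in comps if len(c) == 3])
```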

Here are the compositions of 5 turned into sequences, played by a snare drum, and written to the disk:

use Music::CreatingRhythms ();
use MIDI::Util qw(setup_score);

my $mcr = Music::CreatingRhythms->new;

my $comps = $mcr->compm(5, 3); # compositions of 5 with 3 elements

my $seq = $mcr->int2b($comps);

my $score = setup_score(bpm => 120, channel => 9);

for my $pattern ($seq->@*) {
    for my $bit (@$pattern) {
        if ($bit) {
            $score->n('en', 40); # snare patch
        }
        else {
            $score->r('en');
        }
    }
}

$score->write_score('perldotcom-2.mid');

A little better. Like a syncopated snare solo.

Another way to play the MIDI file is to use timidity. On my mac, with the soundfont specified in the timidity.cfg configuration file, this would be:

timidity -c ~/timidity.cfg -Od perldotcom-2.mid

To convert a MIDI file to an mp3 (or other audio formats), I do this:

timidity -c ~/timidity.cfg perldotcom-2.mid -Ow -o - | ffmpeg -i - -acodec libmp3lame -ab 64k perldotcom-2.mp3

Okay. Enough technical details! What if we want a kick bass drum and hi-hat cymbals, too? Refactor time…

use MIDI::Util qw(setup_score);
use Music::CreatingRhythms ();

my $mcr = Music::CreatingRhythms->new;

my $s_comps = $mcr->compm(4, 2); # snare
my $s_seq = $mcr->int2b($s_comps);

my $k_comps = $mcr->compm(4, 3); # kick
my $k_seq = $mcr->int2b($k_comps);

my $score = setup_score(bpm => 120, channel => 9);

for (1 .. 8) { # repeats
    my $s_choice = $s_seq->[ int rand @$s_seq ];
    my $k_choice = $k_seq->[ int rand @$k_seq ];

    for my $i (0 .. $#$s_choice) { # pattern position
        my @notes = (42); # hi-hat every time
        if ($s_choice->[$i]) {
            push @notes, 40;
        }
        if ($k_choice->[$i]) {
            push @notes, 36;
        }
        $score->n('en', @notes);
    }
}

$score->write_score('perldotcom-3.mid');

Here we play generated kick and snare patterns, along with a steady hi-hat.

Next up, let’s look at rhythmic “necklaces.” Here we find many grooves of the world.

World rhythms

Image from The Geometry of Musical Rhythm

Rhythm necklaces are circular diagrams of equally spaced, connected nodes. A necklace is a canonical representative of a set of rotationally equivalent sequences, so the list contains no rotational duplicates. For instance, the necklaces of 3 beats are [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]. Notice that there is no [1, 0, 1] or [0, 1, 1]; both are rotations of [1, 1, 0]. Likewise, there are no rotated versions of [1, 0, 0].

So, how many 16 beat rhythm necklaces are there?

my $necklaces = $mcr->neck(16);
print scalar @$necklaces, "\n"; # 4116 of 'em!
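That count is easy to sanity-check with a brute-force sketch in Python (for illustration; certainly not how the module does it): enumerate every binary sequence and keep one canonical representative per rotation class.

```python
from itertools import product

def necklaces(n):
    """One representative per rotation class of binary sequences of length n."""
    result = []
    for bits in product([1, 0], repeat=n):
        rotations = [bits[i:] + bits[:i] for i in range(n)]
        if bits == max(rotations):  # keep the lexicographically largest rotation
            result.append(list(bits))
    return result

print(necklaces(3))        # [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
print(len(necklaces(16)))  # 4116
```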

Okay. Let’s generate necklaces of 8 instead, pull a random choice, and play the pattern with a percussion instrument.

use MIDI::Util qw(setup_score);
use Music::CreatingRhythms ();

my $patch = shift || 75; # claves

my $mcr = Music::CreatingRhythms->new;

my $necklaces = $mcr->neck(8);
my $choice = $necklaces->[ int rand @$necklaces ];

my $score = setup_score(bpm => 120, channel => 9);

for (1 .. 4) { # repeats
    for my $bit (@$choice) { # pattern position
        if ($bit) {
            $score->n('en', $patch);
        }
        else {
            $score->r('en');
        }
    }
}

$score->write_score('perldotcom-4.mid');

Here we choose from all necklaces. But note that this also includes the sequence with all ones and the sequence with all zeroes. More sophisticated code might skip these.

More interesting would be playing simultaneous beats.

use MIDI::Util qw(setup_score);
use Music::CreatingRhythms ();

my $mcr = Music::CreatingRhythms->new;

my $necklaces = $mcr->neck(8);

my $x_choice = $necklaces->[ int rand @$necklaces ];
my $y_choice = $necklaces->[ int rand @$necklaces ];
my $z_choice = $necklaces->[ int rand @$necklaces ];

my $score = setup_score(bpm => 120, channel => 9);

for (1 .. 4) { # repeats
    for my $i (0 .. $#$x_choice) { # pattern position
        my @notes;
        if ($x_choice->[$i]) {
            push @notes, 75; # claves
        }
        if ($y_choice->[$i]) {
            push @notes, 63; # hi_conga
        }
        if ($z_choice->[$i]) {
            push @notes, 64; # low_conga
        }
        $score->n('en', @notes);
    }
}

$score->write_score('perldotcom-5.mid');

And that sounds like:

How about Euclidean patterns? What are they, and why are they named for a geometer?

Euclidean patterns take a set number of positions P and fill them with a number of beats Q, where Q is less than or equal to P, spread as evenly as possible. They are named for Euclid because they are generated by applying the “Euclidean algorithm”, originally devised to find the greatest common divisor (GCD) of two numbers, to the problem of distributing musical beats evenly.
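The even-spreading idea can be sketched in a few lines of Python (for illustration; this Bresenham-style formula is not necessarily the module’s implementation, and for some arguments it yields a rotation of the textbook Euclidean pattern): position i gets an onset whenever i * k mod n wraps around below k.

```python
def euclid(k, n):
    """Spread k onsets as evenly as possible over n positions."""
    return [1 if (i * k) % n < k else 0 for i in range(n)]

print(euclid(3, 8))   # [1, 0, 0, 1, 0, 0, 1, 0] -- the classic tresillo
print(euclid(2, 16))  # onsets at positions 0 and 8
```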

use MIDI::Util qw(setup_score);
use Music::CreatingRhythms ();

my $mcr = Music::CreatingRhythms->new;

my $beats = 16;

my $s_seq = $mcr->rotate_n(4, $mcr->euclid(2, $beats)); # snare
my $k_seq = $mcr->euclid(2, $beats); # kick
my $h_seq = $mcr->euclid(11, $beats); # hi-hats

my $score = setup_score(bpm => 120, channel => 9);

for (1 .. 4) { # repeats
    for my $i (0 .. $beats - 1) { # pattern position
        my @notes;
        if ($s_seq->[$i]) {
            push @notes, 40; # snare
        }
        if ($k_seq->[$i]) {
            push @notes, 36; # kick
        }
        if ($h_seq->[$i]) {
            push @notes, 42; # hi-hats
        }
        if (@notes) {
            $score->n('en', @notes);
        }
        else {
            $score->r('en');
        }
    }
}

$score->write_score('perldotcom-6.mid');

Now we’re talkin’ - an actual drum groove! To reiterate, the euclid() method distributes a number of beats, like 2 or 11, as evenly as possible over a number of positions, here 16. The kick and snare use the same arguments, but the snare pattern is rotated by 4 beats, so that they alternate.

So what have we learned today?

  1. That you can use mathematical functions to generate sequences to represent rhythmic patterns.

  2. That you can play an entire sequence or simultaneous notes with MIDI.

References: