
I am on a Windows machine trying to cross-compile for an x86_64 Linux target. I am using a cross compiler to build msquic, which builds OpenSSL as a submodule; however, the OpenSSL configuration step fails because it cannot locate Pod/Usage.pm.

I have added the line use lib "/c/Strawberry/perl//lib/"; to the OpenSSL Configure file, and added a print statement showing @INC: print "DEBUG: \@INC = @INC\n";. My Strawberry folder does contain Pod/Usage.pm. However, I get the following error:

[ 32%] OpenSSL configure
DEBUG: @INC = /c/Strawberry/perl//lib/ /c/msquic-2.5.6/submodules/openssl/util/perl /usr/lib/perl5/site_perl /usr/share/perl5/site_perl /usr/lib/perl5/vendor_perl /usr/share/perl5/vendor_perl /usr/lib/perl5/core_perl /usr/share/perl5/core_perl /c/msquic-2.5.6/submodules/openssl/external/perl/Text-Template-1.56/lib
Configuring OpenSSL version 3.5.6-dev for target mingw64
Using os-specific seed configuration
Created configdata.pm
Running configdata.pm
Can't locate Pod/Usage.pm in @INC (you may need to install the Pod::Usage module) (@INC entries checked: /usr/lib/perl5/site_perl /usr/share/perl5/site_perl /usr/lib/perl5/vendor_perl /usr/share/perl5/vendor_perl /usr/lib/perl5/core_perl /usr/share/perl5/core_perl) at configdata.pm line 22487.

As we can see, by the time configdata.pm runs, @INC no longer contains /c/Strawberry/perl/lib.

I have tried setting the PERL5LIB environment variable, which then produces this error:

Cwd.c: loadable library and perl binaries are mismatched (got second handshake key 0000000a00000890, needed 0000000000000000)

I believe what is happening is that sub-make invocations are falling back to my Git-for-Windows perl instead, and I am not sure how to fix this.

Random Scrambles

dev.to #perl

Week 370

My Solutions
Task 1: Popular Word (By M. Anwar)

You are given a string paragraph and an array of the banned words. Write a script to return the most popular word that is not banned. It is guaranteed there is at least one word that is not banned and the answer is unique. The words in paragraph are case-insensitive and the answer should be in lowercase. The words can not contain punctuation symbols.

For the word frequencies I immediately planned to use a hash. My only hangup was the case where the input paragraph was not space-separated but used punctuation marks as separators instead. This could be more robust, but that's abnormal input. I lowercase everything, check for spaces, strip out the punctuation marks, and put the word counts in the hash %h. Then we ignore banned words.

sub strip($w) {
    my $out = "";
    for my $letter (split '', $w) {
        $out .= $letter if ($letter =~ /\w/);
    }
    return $out;
}

sub proc($paragraph, @banned) {
    say "Input:  \$paragraph = $paragraph\n\t\@banned = @banned";
    my @words;
    if ($paragraph =~ /\s/) {
        @words = split ' ', lc $paragraph;
    } else {
        my $word = "";
        for my $letter (split '', lc $paragraph) {
            if ($letter =~ /[a-z]/) {
                $word .= $letter;
            } else {
                push @words, $word if length $word;
                $word = "";
            }
        }
        push @words, $word if length $word;    # don't drop the final word
    }
    my %h;
    foreach my $word (@words) {
        $word = strip($word);
        $h{$word}++ if length $word;
    }
    my $max = 0;
    my $max_word;
    for my $w (keys %h) {
        my $ban = 0;
        for my $banned_word (@banned) {
            if ($w eq $banned_word) {
                $ban = 1;
                last;
            }
        }
        next if $ban;
        if ($max < $h{$w}) {
            $max_word = $w;
            $max = $h{$w};
        }
    }
    say "Output: $max_word";
}
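For comparison, a more compact variant (my sketch, not the post's code) sidesteps the space-vs-punctuation branching entirely by extracting runs of letters with a global regex, so any separator is handled uniformly:

```perl
use strict;
use warnings;
use feature 'say';

# Sketch: extract words with a global match instead of splitting,
# count them in a hash, and pick the most frequent unbanned word.
sub popular_word {
    my ($paragraph, @banned) = @_;
    my %banned = map { lc $_ => 1 } @banned;
    my %count;
    $count{$_}++ for grep { !$banned{$_} } lc($paragraph) =~ /([a-z]+)/g;
    my ($best) = sort { $count{$b} <=> $count{$a} } keys %count;
    return $best;
}

say popular_word("Joe hit a ball, the hit BALL flew far after it was hit.", "hit");  # ball
```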

Task 2: Scramble String (By R. B-W)

You are given two strings $str1 and $str2 of the same length. Write a script to return true if $str2 is a scramble of $str1 otherwise return false. String B is a scramble of string A if A can be transformed into B by a single (recursive) scramble operation.

Mr. Roger Bell-West then goes on to explain what a scramble is.

  • If the string consists of only one character, return the string.
  • Divide the string X into two non-empty parts.
  • Optionally, exchange the order of those parts.
  • Optionally, scramble each of those parts.
  • Concatenate the scrambled parts to return a single string.

When the task was first described, he had suggested choosing a random location for the split, and randomly deciding to perform the scrambles or swaps. I decided to embrace that approach.

I loop the scramble 10,000 times to increase the odds of success.

sub scramble($s) {
    my $len = length($s);
    if ($len == 1) {
        return $s;
    } else {
        my $pt = 1 + int(($len - 1) * rand());    # split into two non-empty parts
        my $a = substr $s, 0, $pt;
        my $b = substr $s, $pt;
        my $a_new = (int(2 * rand()) == 0) ? $a : scramble($a);
        my $b_new = (int(2 * rand()) == 0) ? $b : scramble($b);
        return (int(2 * rand()) == 0) ? $a_new . $b_new : $b_new . $a_new;
    }
}
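The 10,000-try driver itself isn't shown above; a minimal sketch of how I read it (hypothetical sub name, with a self-contained copy of the scramble so the example runs on its own):

```perl
use strict;
use warnings;
use feature 'say';

# Self-contained copy of the randomized scramble (split point
# truncated to an integer so both parts are non-empty).
sub scramble {
    my ($s) = @_;
    my $len = length $s;
    return $s if $len == 1;
    my $pt = 1 + int(($len - 1) * rand());
    my $a = substr $s, 0, $pt;
    my $b = substr $s, $pt;
    $a = scramble($a) if int(2 * rand());    # optionally scramble each part
    $b = scramble($b) if int(2 * rand());
    return int(2 * rand()) ? $b . $a : $a . $b;    # optionally swap
}

# Monte Carlo driver: declare success if any of 10,000 random
# scrambles of $str1 reproduces $str2.
sub is_scramble {
    my ($str1, $str2) = @_;
    for (1 .. 10_000) {
        return 1 if scramble($str1) eq $str2;
    }
    return 0;
}

say is_scramble("great", "rgeat") ? "true" : "false";  # true (with overwhelming probability)
say is_scramble("ab", "ac") ? "true" : "false";        # false: scrambling only permutes characters
```

The false direction is exact (a scramble can never change the multiset of characters); the true direction is probabilistic, which is the trade-off the randomized approach accepts.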

Originally published at Perl Weekly 769

Hi there,

Every week I see a post declaring something dead. Agile is dead! Testing is dead! Algol-68 is dead! I am so fed up with this that I am not going to link to the article discussing 5 dead programming languages.

Last week I finally got home, and because of the flight I had to postpone the Testing in Perl event, so it will be held this Thursday. You are invited to watch the previous sessions (for now free of charge) and join the next one.

The Perl Maven WhatsApp group already has more than 70 members. Unfortunately recently we got a few spammers so I had to turn on registration-approval. This means that when you try to join I'll send you a private message asking who you are. This is the little extra step we have to do to avoid spam. Anyway, you are invited to join us!

Enjoy your week!

--
Your editor: Gabor Szabo.

Announcements

TPRC Talk Submission Deadline in 2 days!

Articles

Faster UTF-8 Validation

Way more information about UTF-8 than I can fit in my head.

Enums for Perl: Adopting Devel::CallParser and Building Enum::Declare

Compiling Google::ProtocolBuffers::Dynamic on Debian Trixie

For a long time I have been trying to encourage Perl module authors to include installation instructions when external libraries are needed, even if only for one or two Linux distributions. This information should be in the README of the project.

Happy sharing

How to share memory between processes? A survey of a bunch of Data::* modules.

Making an Asynchronous Clocking Drum Machine App in Perl

PDL in Rust -- Part Two

"The current PDL implementation in pperl covers roughly 3,000 assertions end-to-end: about 1,400 on the Perl-facing connector side and about 1,600 on the engine side. As of this writing roughly 98% of the connector assertions match upstream PDL 2.103 exactly, and most of the remaining couple of dozen we already know why they fail. By the time you read this the numbers will have drifted a little in our favour - give or take - but the shape is the point, not the decimal."

Discussion

parsing a csv with boms in every line

What kind of strange things people have to deal with?

A curious case of an autovivified env var

Should the documentation of autovivification be comprehensive?

Grants

Maintaining Perl 5 Core (Dave Mitchell): March 2026

PEVANS Core Perl 5: Grant Report for March 2026

Maintaining Perl (Tony Cook) March 2026

Perl

This week in PSC (221) | 2026-04-13

The Weekly Challenge

The Weekly Challenge by Mohammad Sajid Anwar will help you step out of your comfort-zone. You can even win prize money of $50 by participating in the weekly challenge. We pick one champion at the end of the month from among all of the contributors during the month, thanks to the sponsor Marc Perry.

The Weekly Challenge - 370

Welcome to a new week with a couple of fun tasks, "Popular Word" and "Scramble String". If you are new to the weekly challenge, why not join us and have fun every week? For more information, please read the FAQ.

RECAP - The Weekly Challenge - 369

Enjoy a quick recap of last week's contributions by Team PWC dealing with the "Valid Tag" and "Group Division" tasks in Perl and Raku. You will find plenty of solutions to keep you busy.

Perl Weekly Challenge 369: Valid Tag

This post is an impressive display of technical versatility, presenting clean, direct solutions to the valid tag problem in many different programming languages, each in that language's own idiom. Abigail shows expert-level skill with Perl's advanced character-class arithmetic; these string-manipulation techniques make for code that is both efficient and visually appealing.

Perl Weekly Challenge 369: Group Division

This post offers a polished and versatile engineering design for the Group Division task. Abigail applies a 'chunk-and-fill' method across many different programming languages (Perl, C, and less common languages such as sed), including details on how string slices and fill-up strings can be done with the least amount of impact. It also highlights a creative use of string replication operators and efficient loops that guarantee the final incomplete group still gets the correct amount of padding required by the challenge.

Tag Division

In an idiomatic Raku implementation of the Group Division problem, as shown pretty clearly here by Arne Sommer, the gather/take construct is utilised nicely to collect the data clearly, and with the use of substr-rw for in-place string manipulation and the replication operator (x) to add padding, the solution is both easily readable and aesthetically pleasing.

Perl Weekly Challenge: Week 369

By taking advantage of mathematical precision and the crispness of concise syntax through the use of "one liners", Jaldhar has developed an efficient method for solving this problem, no matter if you're using Raku or Perl. Calculating the required amount of padding to add to a split string before actually splitting it, allows for quick and accurate results. Furthermore, the clever application of native string manipulation functions adds an additional level of efficiency and clarity to handling the grouping logic.

Divided Tags

This article offers a detailed examination of the many aspects of the "Valid Tag" challenge and provides a well-defined "word" in order to enhance the accuracy of processing. The body of this technical paper describes Jorg's unique application of the Perl programming language's ability to utilise global regular expressions (regex) to solve Task 2; and also the excellent "Shape" verb from the J programming language that has provided an efficient and generalised way to reshape and pad multi-dimensional arrays.

The Weekly Challenge 369

In this post, we look at the approach to both challenges in a disciplined and structured manner. The focus is on code that is easy to maintain and easy to read, with clean, modular Perl and Python examples showing how the "Group Division" challenge is solved using list slicing and generator expressions to partition and pad strings in a manner worthy of professional-quality code.

string indexes

Luca Ferrari exhibits an incredible degree of technical ability by solving the Week 369 challenges in five distinct environments: Raku, Python, and PostgreSQL (PL/Perl, PL/PgSQL, and PL/Java). His elegant use of Raku's rotor method, alongside Python's list slicing, achieves the same string padding and partitioning logic for the Group Division challenge, showing how diverse languages can reach the same technically precise result by very different means.

Perl Weekly Challenge 369

The post demonstrates exceptionally compact Perl programming by distilling complex string processing into efficient "1.5-liners". For Task 1, he builds camelCase tags from a string of input values using split, map, and join in a single pass. For Task 2, Luis uses a clever "alternation" regex (.{$size}|.+) to capture both full and partial segments of an input string, then applies direct array-index padding, resulting in code that is both concise and technically accurate.

Good Tags and Good Chunks

Matthias Muth has written an impressive article touching on internationalisation (I18N) and pragmatic problem solving, with thoughtful design choices running through both his "Valid Tag" and "Group Division" solutions. Using the Text::Unidecode module lets his solution accommodate non-ASCII character sets while still adhering to the rules of the challenge (e.g., what counts as a valid tag). His "Group Division" solution is equally strong: he computes the padding mathematically up front so that a single global regex match, one line of functional code, does what would otherwise take several iterations.

Strings Will Tear Us Apart

In this post, Packy gives a thorough, contemporary tour of string handling in Raku, Perl, Python, and Elixir. His solution to the "Group Division" challenge uses Raku's .comb with an integer argument to divide the string into chunks automatically, and Perl's unpack with a template constructed per call, demonstrating how creatively employed language idioms can solve a common data-partitioning problem efficiently and with minimal resources.

Fun with strings

In his article, Peter presents a practical and polished approach to developing an order of operations for the sanitisation of strings. By organising the procedure so that lower case, regular expression character removal, and space & character combination are completed before the creation of camelCase, the end product meets the requirements for both camel case formatting as well as length requirements while still producing clean, effective code.

The Weekly Challenge - 369: Valid Tag

In Reinier's method, a model for defensive programming has been developed that features validation of input for real alphabetic values before processing. Also, he has taken a somewhat technical approach (transforming input to remove non-letter characters by converting them to spaces in order to apply camelCase correctly while keeping word boundaries intact) through his use of multiple accurately readable regular expressions.

The Weekly Challenge - 369: Group Division

Reinier has created a very good tutorial solution showcasing Perl's 4-argument substr to extract and remove data from a string iteratively in a while loop, with string replication for the final padding. The result is extremely readable and a great example of efficient in-place string manipulation.

The Weekly Challenge #369

The work done by Robbie within the "Valid Tag" review indicates a very well thought out way to approach hyphenated compound words as a single entity for case adjustment, in addition to providing an innovative solution for "Group Division", by utilising the four-argument form of Perl's substr in order to easily "chop and fill" strings, while also demonstrating his superior knowledge of high-performance string manipulation.

Divided Validity

Roger's technical review offers an interesting side-by-side comparison of various string handling paradigms from multiple programming languages. The "Valid Tag" part of the review shows how Crystal's highly performant state machine implementation allows for case conversion to be accomplished in a single pass. The "Group Division" analysis of in_groups_of() in Crystal is very interesting as well, as it illustrates just how compact that library function is compared to typical iterative slicing found in Typst, demonstrating that using built-in library functions can greatly simplify the implementation of algorithmic logic through less code complexity.

Group Tag

A wonderful illustration of test-driven development is Simon's critique of Challenge 369. He found through tests that the sanitisation step of "Valid Tag" had to be completed before case formatting so that example 5 is handled properly. He also provides useful comparisons of the two languages' ecosystem strengths, such as Python using more_itertools.grouper where Perl iterates manually.

Weekly collections

NICEPERL's lists

Great CPAN modules released last week.

Events

Perl Maven online: Testing in Perl - part 4

April 23, 2026

Perl Toolchain Summit 2026

April 23-26, 2026

Boston Perl Mongers virtual monthly

May 12, 2026

The Perl and Raku Conference 2026

June 26-29, 2026, Greenville, SC, USA

You joined the Perl Weekly to get weekly e-mails about the Perl programming language and related topics.

Want to see more? See the archives of all the issues.

Not yet subscribed to the newsletter? Join us free of charge!

(C) Copyright Gabor Szabo
The articles are copyright the respective authors.

Perl commits on GitHub

  • Its a mystery
  • Archive perldelta 5.43.10
  • tick off 5.43.10
  • Add epigraph for 5.43.10


PDL in Rust -- Part Two

blogs.perl.org




Two weeks ago we posted "PDL in Rust - A Native Reimplementation of the Perl Data Language". At the time the score was 45 tests, all green. That was enough to say "it compiles, it runs, here is the arithmetic surface." It was not enough to say "you can use this."

We are now on the second number. The current PDL implementation in pperl covers roughly 3,000 assertions end-to-end: about 1,400 on the Perl-facing connector side and about 1,600 on the engine side. As of this writing roughly 98% of the connector assertions match upstream PDL 2.103 exactly, and most of the remaining couple of dozen we already know why they fail. By the time you read this the numbers will have drifted a little in our favour - give or take - but the shape is the point, not the decimal.

Everything described below is shipping, today, at https://perl.petamem.com.

What Part One Was

Part One was a flag-planting exercise. Someone on Reddit asked whether pperl would support PDL; we said yes; we then had to actually do it. Forty-five tests covering construction, arithmetic, reductions, a handful of transcendentals, and operator overloading. The point was to prove the mechanism - that a pure-Rust engine behind a pperl native module could stand up to Perl-side code that expects `use PDL; my $x = pdl([1,2,3]);` to just work. It did. But forty-five tests is not a usable PDL.

This post is about what happened when we turned the heat up.

What's in the Box

The Perl-facing module hierarchy now mirrors upstream PDL's own, file-for-file where it matters:

PDL              - top-level boot, exports pdl/zeroes/ones/...
PDL::Core        - constructors, accessors, DESTROY, magic
PDL::Ops         - arithmetic, comparison, bitwise, unary
PDL::Ufunc       - reductions, sorting, statistics
PDL::Math        - trig, hyperbolic, Bessel, erf, special functions
PDL::Primitive   - matmult, which, where, clip, append, convolve
PDL::Slices      - slice, xchg, mv, reshape, clump, dummy, range
PDL::Basic       - sequence, xvals, yvals, zvals, rvals, linvals
PDL::MatrixOps   - det, inv, LU, eigens, trace, norm
PDL::Bad         - bad-value flags, setbadat, copybad, locf
PDL::FFT         - Cooley-Tukey radix-2 FFT/IFFT
PDL::NiceSlice   - source filter: $x(:,0) syntax
PDL::Lite        - lightweight import variant
PDL::LiteF       - lightweight import with fewer functions

Fifteen data types. Broadcasting. Bad values. Dataflow. Affine slicing. Storable round-trip. FFT. 2D image convolutions. Matrix decompositions. Source-filter support for PDL::NiceSlice's $x(:,0) syntax. Operator overloading for ~40 operators. It is PDL, not a demo subset. Whether it walks quite like the grown-up yet is for the user to judge.

The Connector, in One Example

Part One described the split at a high level - a standalone Rust crate for the engine, a pperl native module for the bridge. This time let us walk one operation end to end, because the connector is where the Perl audience will want the detail.

Consider $A x $B, the matrix-multiply overload:

use PDL;
my $A = pdl([[1,2],[3,4]]);
my $B = pdl([[5,6],[7,8]]);
my $C = $A x $B;

Here is what the x overload looks like on our side. The handler is a plain Rust function registered into the PDL stash at boot. No use overload block, no BEGIN-time aliasing, no XSUB wrapper file:

// At boot, overload entries go straight into the PDL stash.
let overloads: &[(&[u8], unsafe extern "C" fn(*mut CV))] = &[
    (b"(+\0",    xs_plus_overload),
    (b"(*\0",    xs_mult_overload),
    (b"(x\0",    xs_matmult_overload),   // matrix multiply
    // ~40 more operators …
];
for &(name, func) in overloads {
    newXS(/* PDL::(X */, Some(func), file);
}

The handler itself extracts the PDLs from their SVs, validates dimensions, calls into the engine, wraps the result:

unsafe extern "C" fn xs_matmult_overload(_cv: *mut CV) {
    let mark  = xs_pop_mark();
    let items = xs_items(mark);
    let (a_sv, b_sv, _swap) = binop_args(mark, items);
    xs_sp_set(mark);
    let mut a_temp = false;
    let mut b_temp = false;
    let a = sv_to_pdl_or_scalar(a_sv, &mut a_temp);
    let b = sv_to_pdl_or_scalar(b_sv, &mut b_temp);
    let a_dims = core_vtable::pdl_dims(a);
    let b_dims = core_vtable::pdl_dims(b);
    if a_dims[0] != b_dims[1] {
        Perl_croak(b"PDL: matmult: dimension mismatch\0".as_ptr() as *const c_char);
    }
    let c = core_vtable::pdl_alloc_output();
    // --- THE ENTIRE DISPATCH ---
    // One call into Rust-PDL. No XS wrapping, no interpreter
    // round-trip, no hidden allocations.
    let err = rust_pdl::primitive::pp_matmult
                 ::pdl_run_matmult(a, b, c);
    if a_temp { core_vtable::pdl_destroy(a); }
    if b_temp { core_vtable::pdl_destroy(b); }
    xs_push(sv_2mortal(pdl_to_sv(c)));
}

Three things about this example are worth drawing out for a Perl audience.

The overload dispatch is a function-pointer table. $A x $B resolves as: stash lookup → function-pointer call → Rust. Upstream PDL's equivalent path is: use overload table lookup → dispatch to a Perl sub → XSUB boundary → PP-generated C → inner kernel. pperl removes the two middle layers entirely, because the handler is the kernel-caller, compiled into the interpreter binary.

sv_to_pdl is a magic walk, not a method call. PDL SVs in pperl store the engine's *mut c_pdl as PERL_MAGIC_ext on the inner SV - the same mechanism Perl itself uses for tied variables, dualvars, and a dozen other features. Extracting the pointer is a flag check plus a linked-list walk. No method dispatch, no can(), no string comparisons:

let flags = SvFLAGS(inner);
if flags & (SVs_GMG | SVs_SMG | SVs_RMG) != 0 {
    let mut mg = (*sv_any).xmg_u.xmg_magic;
    while !mg.is_null() {
        if (*mg).mg_virtual == &PDL_MAGIC_VTBL {
            return (*mg).mg_ptr as *mut c_pdl;  // ← the c_pdl pointer, raw
        }
        mg = (*mg).mg_moremagic;
    }
}

So $A->at(0) in pperl is: one stash method lookup, one magic walk, one pointer-arithmetic offset inside a Rust function. We did not invent a new mechanism for binding Rust objects to Perl variables - we reused Perl's own magic infrastructure, because that is what it is for.

Type promotion: upstream's rule, reimplemented from scratch. Matrix multiply with mixed types - long x float - must promote to double for numerical stability. Upstream's rule is "an NV scalar forces minimum promotion to double; an NV PDL keeps its wider type if already ≥ double." We ported the rule verbatim; the implementation is thirty lines of Rust match arms. No C preprocessor switch tables. No .xs file. No PDL::PP code-generation templates. When we discovered that pperl was keeping float instead of promoting to double for float + 0.2, the fix was two match arms, one rebuild, twenty-five tests recovered. In the upstream world the equivalent change would touch PDL::PP's templates and trigger a rebuild of forty op files.
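The rule as stated is compact enough to sketch. This is a hypothetical Perl illustration of my reading of it, not the project's code, and the type-rank table is an assumption for the example:

```perl
use strict;
use warnings;

# Hypothetical rank table: wider types rank higher.
my %rank = (byte => 0, short => 1, long => 2, longlong => 3,
            float => 4, double => 5, ldouble => 6);

# An NV Perl scalar operand forces promotion to at least double;
# a PDL already at or above double keeps its own (wider) type.
sub result_type {
    my ($pdl_type, $other_is_nv_scalar) = @_;
    return $pdl_type unless $other_is_nv_scalar;
    return $rank{$pdl_type} >= $rank{double} ? $pdl_type : 'double';
}

print result_type('float',   1), "\n";   # float + 0.2  promotes to double
print result_type('double',  1), "\n";   # double stays double
print result_type('ldouble', 1), "\n";   # ldouble keeps its wider type
print result_type('long',    0), "\n";   # no NV scalar involved: no forced promotion
```

The float + 0.2 bug described above was exactly the first branch misfiring: returning the PDL's own float type instead of falling through to double.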

pd2multi - We Build a Compiler

Upstream PDL has PDL::PP: a macro/template processor that weaves .pd definitions with C fragments and hands the output to cc. It has served PDL for thirty years. It works. But the output is always C, and it is a text-substitution engine, not a compiler.

We could not extend it. We needed something that could emit Rust today, and potentially other targets later - a C backend to validate against upstream's own output, a GPU backend for OpenCL, a Perl backend to let pperl's JIT see through the operation into the surrounding loop. So we built a proper parser → AST → emitter pipeline. pd2multi ingests the same input language, builds an explicit intermediate representation, and delegates output to pluggable backend emitters.

This is internal infrastructure. Nothing user-facing ships from it yet; it is the scaffolding under the engine. It is roughly two orders of magnitude more work than the template-substitution approach it replaces. The payoff is optionality: when we want to add a GPU backend, it is one more emitter walking the AST, not a new compiler.

Correctness

The connector test suite runs every assertion against both upstream perl5+PDL and pperl+Rust-PDL, compares outputs, and categorizes each result. A representative snapshot from the harness at the time of writing:

Tests: ~1380 matched, ~25 mismatched, ~110 extra / ~1400 total

Matched - the large majority, give or take: pperl produces the same output as upstream.

Mismatched - on the order of two dozen: pperl differs from upstream. Most of these are real work in progress and we know where the gaps are. The biggest cluster is in forward-flow dataflow - when a parent PDL is mutated, downstream slice children should see the update eagerly. Our implementation is currently lazy, which catches most cases via dirty-flag refresh but misses the paths that read child data without going through the normal accessor. A handful of tests in 220-broadcasting fail on that. The rest are distributed drift: error message text, one edge case in unbroadcast, STL round-trip issues. None are conceptual; all are bounded.

Extra - around a hundred: tests that pperl passes and upstream does not. Some of these are test-surface coverage - we wrote more thorough tests for overload dispatch, import semantics, and native-function dispatch than upstream did. Others are legitimate behavioural differences where we produce a correct result and upstream does not; the precision and diagnostics subsections below cover the interesting cases.

There is a separate, smaller number worth singling out.

Where pperl and Upstream Disagree, and pperl is Right

A handful of tests produce different results between the two implementations where pperl's result is more correct. Six in the first sweep; we are no longer surprised when we find another. Three patterns are worth naming.

Precision. pctover computes the k-th percentile along a dimension:

my $x = pdl(0,0,6,3,5,0,7,14,94,5,5,8,7,7,1,6,7,13,10,2,101,19,7,7,5);
my $r = $x->pctover(0.9);
printf "%.20f\n", $r->at();

Upstream says 17.00000000000000710543. pperl says 17.00000000000000000000. Both stringify to 17, so to the eye they are identical. But $r->at() == 17 is false on upstream (the 7e-15 drift catches the numeric comparison) and true on pperl. The test was written as a plain == 17 assertion, assuming the kernel would return an exact integer.

Why does pperl get it right? Same numerical recipe, different float accounting. Upstream's PP-generated C coerces intermediates into PDL_Double early and rounds once per sample. The Rust kernel, passing through LLVM's optimizer, keeps a wider accumulator and rounds once at the final divide. It is not cleverness on our side - it is what the modern toolchain does with the same numerical recipe when nothing in the code forces early narrowing. The same pattern appears in xlogvals and a couple of other reductions.

Perl context. lgamma returns two values: the log-gamma value, and the sign. Upstream's PP-generated XS returns both as a two-element list regardless of context. In scalar context, Perl takes the last element - so my $y = lgamma($x) assigns the sign (always 1), not the actual gamma. That is not what a Perl programmer expects. pperl respects GIMME_V: scalar context returns the value, list context returns the pair.

This is not a numerical-methods win. It is a Perl-semantics win - the kind that only a Perl-aware reimplementation can notice. We didn't set out to behave differently from upstream; we set out to behave the way Perl specifies.

Diagnostics. pdl("nonsense blurb") should reject its input. Upstream does reject it - but the error message reads found 'e' as part of a larger word in nonsense blurb, because the parser is reporting that it saw e (Euler's constant) embedded in a longer token, and that is what it complains about. pperl reports found disallowed character(s) 'nonsense,blurb', which is what an actual user wanted to know. A smaller case of the same pattern: $x->slice('0:-14') on a 10-element PDL silently returns a garbage view upstream; pperl croaks with an out-of-bounds error. Good diagnostics are a correctness property.

And in the interest of honest accounting: we have also found places where our implementation was wrong and upstream was right, and we conformed. The float + 0.2 promotion rule mentioned earlier was one of those: we were returning float, upstream correctly promoted to double, we fixed ourselves to match. The flow of corrections is bidirectional and that is the point - two independent implementations of the same spec are a stronger guarantee of correctness than either one alone.

Performance

The startup story from Part One holds and has tightened up. Here is the harness output from the current connector test suite, picking a representative sample:

PDL/010-constructor.t     ... ok  (17/17)   perl5: 84ms  / pperl: 9ms
PDL/110-core.t            ... ok  (121/121) perl5: 88ms  / pperl: 13ms
PDL/120-ops.t             ... ok  (75/75)   perl5: 94ms  / pperl: 12ms
PDL/150-primitive.t       ... ok  (60/60)   perl5: 86ms  / pperl: 11ms
PDL/190-matrixops.t       ... ok  (35/35)   perl5: 90ms  / pperl: 11ms
PDL/260-pdl-from-string.t ... ok  (60/60)   perl5: 86ms  / pperl: 13ms
PDL/300-niceslice.t       ... ok  (30/30)   perl5: 137ms / pperl: 75ms
PDL/330-flexraw-fastraw.t ... ok  (30/30)   perl5: 96ms  / pperl: 20ms

Across the full PDL suite, pperl's per-test wall time sits at 9–13ms for the typical path and climbs to 20–75ms for tests that do real I/O or source filtering. Upstream's corresponding numbers are 82–137ms. The typical startup advantage is 8–10×; the fastest cases reach 13×. This is not a compute-throughput benchmark - it is the cost of use PDL plus a few operations. And it holds because the entire PDL stack is compiled into the pperl binary. There is no module loading, no PP compilation, no dynamic linking.

Compute-throughput numbers are what most readers will actually care about, so here is a table from the current benchmark suite - release build, microseconds per call, averaged over a warm loop. Worth stating up front: these are dry numbers. No JIT loop-fusion, no Rayon parallelism, no GPU offload - none of the mechanisms that define pperl's architectural advantage are engaged here. What is being measured is the raw kernel-versus-kernel cost and the per-call dispatch overhead. The ceiling is considerably higher than what these numbers show; the floor is what they are.

op                    size    upstream µs    pperl µs    ratio
---------------   --------    -----------    --------    -----
pdl_from_str            10           66.6         3.4    20x
pdl_from_str           100          583.4        18.3    32x
pdl_from_str          1000         6667.7       183.4    36x

stringify               10            8.6         1.5    5.6x
stringify              100           60.0        13.1    4.6x
stringify             1000          620.6       140.8    4.3x

add_scalar             100            3.4         1.2    2.8x
matmult_sq              10            5.4         2.3    2.3x

slice                  100            2.0         1.3    1.5x
slice                10000            2.1         1.3    1.7x
slice              1000000            2.2         1.4    1.5x

add_vec            1000000         1860.4      1939.1    0.96x
mul_vec            1000000         1879.3      1790.6    1.05x
sum                1000000         1468.3      1344.1    1.09x
exp                1000000         8174.9      8110.6    1.01x
matmult_sq             200         6329.4      7070.6    0.90x

sumover            1000000         4723.4     13214.5    0.36x

Three things stand out.

Parsing and formatting win cleanly. pdl_from_str - the pdl("1 2 3; 4 5 6") literal parser - is 20-36Ɨ faster because ours is a native Rust tokenizer where upstream's is a pure-Perl regex chain. Stringify is 4-6Ɨ faster for the same structural reason: a native formatter rather than Perl-side string assembly. These gains are not tuning artefacts; they will persist.

Small-op dispatch wins. On small arrays (100-element add_scalar, 10Ɨ10 matmult), pperl is 2-3Ɨ faster because the XS overhead is gone. This is the function-pointer overload dispatch from the connector walkthrough, now with numbers: Perl → stash lookup → Rust kernel, no intervening XSUB layer. On large arrays the kernel dominates and the dispatch win disappears into the noise - which is the correct outcome, not a disappointment.

Slice is zero-copy and measurably so. A slice operation on a 1M-element PDL takes about 1.4 µs on pperl, independent of input size - a constant-time affine view construction, as it should be. Upstream is within the same order of magnitude (about 2 µs); both implementations do the right thing here.
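Why a slice can be constant-time is easy to see in miniature: a zero-copy slice only constructs a new view descriptor (offset, length, stride) over the shared buffer. The following is an illustrative C sketch of that idea, not pperl's actual Rust internals; all names are made up for this example.

```c
#include <stddef.h>

/* Illustrative sketch (not pperl's actual internals): a zero-copy slice is
 * just a new view descriptor over the same buffer, so constructing it costs
 * O(1) regardless of how many elements it covers. */
typedef struct {
    const double *data;  /* shared buffer, not owned by the view */
    size_t offset;       /* index of the view's first element */
    size_t len;          /* number of elements visible through the view */
    size_t stride;       /* distance between consecutive view elements */
} view;

/* Build the sub-view [start, stop) with a positive step: a few additions
 * and multiplications - no element is touched or copied. */
static view slice(view v, size_t start, size_t stop, size_t step) {
    view s;
    s.data   = v.data;
    s.offset = v.offset + start * v.stride;
    s.len    = (stop - start + step - 1) / step;
    s.stride = v.stride * step;
    return s;
}

static double view_get(view v, size_t i) {
    return v.data[v.offset + i * v.stride];
}
```

Slicing a million-element array and a ten-element array performs the same few arithmetic operations, which is why the measured slice cost stays flat across input sizes.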

Outliers do happen. sumover at 1M elements is currently slower than upstream; it was detected, it is understood, and it is being worked on. That is the expected shape of a young engine against a thirty-year-old one.

These numbers are just a baseline - the "dry" performance without JIT, auto-parallelization, or GPU; compiler auto-vectorization happens for both. The numbers for JIT+Rayon and/or GPU will land in a dedicated post once the relevant parts of the JIT are ready.

How pperl PDL Compares to Upstream

Area | pperl + Rust-PDL | upstream + C PDL
Core numerics | ~98% parity with upstream; occasionally more precise | The reference implementation
Startup (use PDL + trivial op) | ~9–13ms typical | ~82–90ms typical
Build dependency | cargo build, no C toolchain | C compiler, XS, PP code-gen, Makefile toolchain
Thread model | Rayon (prototypes working) | Perl ithreads + pthread-guarded globals
Source filters | Yes - NiceSlice works | Yes
Forward-flow dataflow | Partial; a handful of known test failures | Full
PDL::PP compiler | No - replaced by pd2multi (internal) | Yes - runs at build time
Inline::Pdlpp | No (no in-process C compiler) | Yes
FITS / astronomy I/O | Out of scope (user-space, on top of PDL) | Yes, via Astro::FITS
Perl ithreads | No - Rayon replaces the model | Yes

On Compatibility

Our commitment is to upstream PDL's specified behaviour. When the specification and the current implementation disagree, we side with the specification. lgamma in scalar context is the cleanest illustration: Perl's GIMME_V contract is unambiguous; upstream's PP-generated XS happens not to honour it; ours does. We did not set out to deviate - we set out to mirror Perl.

Everywhere else, upstream is the reference. Thirty years of PDL carry numerical decisions and edge-case wisdom that the documentation does not always explain. Whenever we disagreed with upstream during implementation, our default assumption was that we were missing context. The float + 0.2 promotion case earlier in this post is exactly that: we thought we were right. However, upstream was right - we conformed.

What's Next

Three threads, in rough priority order.

Rayon-based parallelism. First prototypes are running: PDL operations inside pperl's JIT-compiled while-loops distribute across cores via work-stealing. Early results look promising. Mandelbrot-scale workloads already parallelize cleanly; the interesting work is in making parallelism show up for realistic PDL pipelines without per-call overhead eating the win.

OpenCL / GPU acceleration. This is a priority goal, and it is the reason pd2multi exists as a proper compiler rather than a template system. A GPU backend is one more emitter walking the AST, not a separate codegen project. The engineering cost is real but bounded.

JIT fusion across PDL boundaries. Cranelift compilation of surrounding Perl loops that call PDL operations, so the JIT can see through the op and fuse it with the loop body. This is the structural payoff of having a pure-Rust engine - the JIT can look inside - and it is the thing that will make the case for reimplementation in performance terms, not just compatibility terms. Serious work on this is pending the Rayon and GPU threads stabilizing.

No dates. We will post numbers when there are numbers to post.

In the meantime: use PDL; in pperl, today, at https://perl.petamem.com. Roughly three thousand assertions of work, about ninety-eight per cent of them matching the upstream beacon, a double-digit startup win, and a handful of cases where independent implementation caught things that a single implementation had missed.

- Richard C. Jelinek, PetaMem s.r.o.

Thank you Team PWC for your continuous support and encouragement.
Welcome to Week #370 of The Weekly Challenge.

TPRC Talk Submission Deadline in 2 days!

Perl Foundation News

There are only 2 days left to submit a talk for TPRC! The cutoff is April 21. If you have an idea for a talk, it is definitely time to get it submitted. We need a wide variety of speakers and topics, so give it a try! Go to https://tprc.us/ to make your submission.

add 5.43.10 to perlhist

Perl commits on GitHub

I am confused.

cat << EOF > test.txt
this
is
a
text
file
EOF
cat test.txt | perl -0pe 's/.\n^text/hellohello/smg'

works, but

cat test.txt | perl -0pe 's/.$\n^text/hellohello/smg'

doesn't.

Annoyingly https://regex101.com/ doesn't detect this quirk. Is it to do with -0? And why would $\n^ syntax (which is usually perfectly fine) be wrong anyway?

Thank you Team PWC for your continuous support and encouragement.
As you know, The Weekly Challenge primarily focuses on Perl and Raku. During Week #018, we received solutions to The Weekly Challenge - 018 by Orestis Zekai in Python. It was a pleasant surprise to receive solutions in something other than Perl and Raku. Ever since, regular team members have also been contributing in other languages like Ada, APL, Awk, BASIC, Bash, Bc, Befunge-93, Bourne Shell, BQN, Brainfuck, C3, C, CESIL, Chef, COBOL, Coconut, C Shell, C++, Clojure, Crystal, CUDA, D, Dart, Dc, Elixir, Elm, Emacs Lisp, Erlang, Excel VBA, F#, Factor, Fennel, Fish, Forth, Fortran, Gembase, Gleam, GNAT, Go, GP, Groovy, Haskell, Haxe, HTML, Hy, Idris, IO, J, Janet, Java, JavaScript, Julia, K, Kap, Korn Shell, Kotlin, Lisp, Logo, Lua, M4, Maxima, Miranda, Modula 3, MMIX, Mumps, Myrddin, Nelua, Nim, Nix, Node.js, Nuweb, Oberon, Octave, OCaml, Odin, Ook, Pascal, PHP, PicoLisp, Python, PostgreSQL, Postscript, PowerShell, Prolog, R, Racket, Rexx, Ring, Roc, Ruby, Rust, Scala, Scheme, Sed, Smalltalk, SQL, Standard ML, SVG, Swift, Tcl, TypeScript, Typst, Uiua, V, Visual BASIC, WebAssembly, Wolfram, XSLT, YaBasic and Zig.

(dxcvi) 9 great CPAN modules released last week

r/perl
Updates for great CPAN modules released last week. A module is considered great if its favorites count is greater than or equal to 12.

  1. App::Netdisco - An open source web-based network management tool.
    • Version: 2.098001 on 2026-04-16, with 859 votes
    • Previous CPAN version: 2.098000 was released the same day
    • Author: OLIVER
  2. Authen::Passphrase - hashed passwords/passphrases as objects
    • Version: 0.009 on 2026-04-15, with 14 votes
    • Previous CPAN version: 0.008 was released 14 years, 2 months, 11 days before
    • Author: LEONT
  3. Convert::Pheno - A module to interconvert common data models for phenotypic data
    • Version: 0.31 on 2026-04-17, with 15 votes
    • Previous CPAN version: 0.30 was released 2 days before
    • Author: MRUEDA
  4. CPANSA::DB - the CPAN Security Advisory data as a Perl data structure, mostly for CPAN::Audit
    • Version: 20260412.001 on 2026-04-12, with 25 votes
    • Previous CPAN version: 20260405.001 was released 7 days before
    • Author: BRIANDFOY
  5. Finance::Quote - Get stock and mutual fund quotes from various exchanges
    • Version: 1.69 on 2026-04-18, with 149 votes
    • Previous CPAN version: 1.68_02 was released 1 month, 5 days before
    • Author: BPSCHUCK
  6. Imager - Perl extension for Generating 24 bit Images
    • Version: 1.030 on 2026-04-13, with 68 votes
    • Previous CPAN version: 1.029 was released 6 months, 7 days before
    • Author: TONYC
  7. JSON::Schema::Modern - Validate data against a schema using a JSON Schema
    • Version: 0.638 on 2026-04-18, with 16 votes
    • Previous CPAN version: 0.637 was released 10 days before
    • Author: ETHER
  8. SPVM - The SPVM Language
    • Version: 0.990162 on 2026-04-18, with 36 votes
    • Previous CPAN version: 0.990161 was released the same day
    • Author: KIMOTO
  9. version - Structured version objects
    • Version: 0.9934 on 2026-04-12, with 22 votes
    • Previous CPAN version: 0.9933 was released 1 year, 7 months, 17 days before
    • Author: LEONT

parsing a csv with boms in every line

r/perl

Hi,

I need to parse a csv-file that contains a BOM at the beginning of every line (i.e. every line starts with 0xefbbbf).

At the moment I can't quite figure out if Text::CSV can handle this - any tips?

submitted by /u/ghiste

Faster UTF-8 Validation

blogs.perl.org

A while back, I received a pull request suggesting that I update the performance comparison with Encode.pm in my module, Unicode::UTF8. When I originally wrote Unicode::UTF8, Encode.pm used its own UTF-8 validation implementation. Since then, Karl Williamson has done extensive work improving Perl, and Encode.pm now relies on those validation routines based on Bjƶrn Hƶhrmann’s UTF-8 DFA decoder. It’s an elegant piece of code, widely adopted across many projects.

That said, I wasn’t particularly motivated to revisit the comparison, so I decided instead to look into faster scalar C implementations. While it has been suggested that Unicode::UTF8 should gain a SIMD implementation, and that may well be worthwhile, I wanted to first explore improvements to the scalar path, which would still be required as a fallback.

After some searching, I came across a tweet by Russ Cox showing performance numbers for a shift-based DFA implementation in Go, along with a link to a gist by Per Vognsen describing the technique.

It turned out that this shift-based DFA approach was quite popular a few years ago. Several implementations appeared in different programming languages and even in some RDBMSs. However, I couldn’t find any reusable code, so I decided to implement it as a header-only C library, updating a UTF-8 validator I originally wrote in 2017.

I won’t go into the details of shift-based DFA encoding here. For a thorough explanation, I recommend Per Vognsen’s article, and I cover my own implementation in more detail here.

The main difference from a traditional DFA, such as Bjƶrn Hƶhrmann’s implementation, lies in how table lookups are handled. A conventional byte-class + transition-table DFA requires two lookups per byte: one to map the byte to a character class, and another to determine the next state. The shift-based DFA combines both steps into a single lookup, at the cost of a larger table. Both DFAs have a serial chain dependency where each byte’s state transition depends on the previous one, which limits instruction-level parallelism.
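The single-lookup step is easiest to see on a deliberately tiny example. Below is a toy shift-based DFA that recognizes strings of the form (ab)* - far smaller than the real UTF-8 tables, and every name here is invented for illustration - but the mechanics are the same: states are encoded as bit offsets, each table entry packs every transition for one input byte into a single 64-bit word, and a step is one load plus one shift.

```c
#include <stdint.h>

/* Toy shift-based DFA recognizing (ab)*.  States are bit offsets into a
 * 64-bit word, 6 bits per state. */
enum { EXPECT_A = 0, EXPECT_B = 6, ERR = 12 };

static uint64_t tbl[256];

/* For each input byte, pack the next-state offset for every current state
 * into one word: field at offset S holds the successor of state S. */
static void build_table(void) {
    uint64_t reject = (uint64_t)ERR | ((uint64_t)ERR << EXPECT_B)
                                    | ((uint64_t)ERR << ERR);
    for (int i = 0; i < 256; i++)
        tbl[i] = reject;
    /* 'a': EXPECT_A -> EXPECT_B, any other state -> ERR */
    tbl['a'] = (uint64_t)EXPECT_B | ((uint64_t)ERR << EXPECT_B)
                                  | ((uint64_t)ERR << ERR);
    /* 'b': EXPECT_B -> EXPECT_A, any other state -> ERR */
    tbl['b'] = (uint64_t)ERR | ((uint64_t)EXPECT_A << EXPECT_B)
                             | ((uint64_t)ERR << ERR);
}

static int matches(const char *s) {
    uint64_t state = EXPECT_A;
    for (; *s; s++)
        state = tbl[(unsigned char)*s] >> (state & 63); /* one load, one shift */
    return (state & 63) == EXPECT_A; /* mask once, after the loop */
}
```

A conventional DFA would do two dependent lookups per byte (byte to class, then class plus state to next state); here both are fused into the one table, at the cost of 256 eight-byte entries per table.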

Two functions are provided for UTF-8 validation:

utf8_valid has no data-dependent branches. Each DFA step is a table lookup combined with a bitwise shift, with no conditional branches on the input byte value. Execution time scales with byte count, not byte values.

utf8_valid_ascii adds an ASCII fast path. There are different approaches to this: Perl, for instance, scans a leading ASCII prefix and falls back to DFA validation upon encountering a non-ASCII byte, never re-entering the fast path. I am not particularly fond of that approach, since even predominantly ASCII content today tends to contain non-ASCII bytes. Instead, I opted for a 16-byte chunk fast path that skips the DFA for chunks consisting entirely of ASCII bytes, and resumes DFA validation in a non-clean state when a non-ASCII byte is encountered.
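One common way to implement such a 16-byte ASCII check - illustrative, not necessarily this library's exact code - is to fold the chunk into two 64-bit words and test the high bit of every byte with a single mask:

```c
#include <stdint.h>
#include <string.h>

/* Sketch of a 16-byte ASCII fast-path test: true iff no byte in the chunk
 * has its high bit set, i.e. the chunk is pure ASCII. */
static int chunk_is_ascii16(const unsigned char *p) {
    uint64_t a, b;
    memcpy(&a, p, 8);       /* memcpy keeps the loads alignment-safe */
    memcpy(&b, p + 8, 8);
    return ((a | b) & UINT64_C(0x8080808080808080)) == 0;
}
```

When the validator is in a clean state and this returns true, the whole chunk is valid on its own and the DFA can be skipped for those 16 bytes; when it returns false, validation drops back into the DFA and resumes in a non-clean state, as described above.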

The implementation showed promising results, but can we go faster? Yes! By splitting the UTF-8 stream in two, we can break the serial chain dependency and run two independent DFA chains in a single interleaved loop. Since the chains share no data dependencies, the CPU’s out-of-order engine can overlap their shift operations on cores with multiple shift-capable execution ports. I did try three interleaved streams but saw no improvement on the hardware available to me, the added overhead of splitting the stream further likely offsets any potential gain.

Whether utf8_valid_ascii is profitable depends heavily on the content. The following benchmark output illustrates this with ASCII-heavy content.

The benchmark includes two reference implementations: hoehrmann and utf8_valid_old, the previous scalar implementation in the C library.

en.txt: 80 KB; 82K code points; 1.00 units/point
  U+0000..U+007F          82K  99.9%
  U+0080..U+07FF           18   0.0%
  U+0800..U+FFFF           49   0.1%
  hoehrmann                    572 MB/s
  utf8_valid_old              3677 MB/s  (6.42x)
  utf8_valid                  6466 MB/s  (11.30x)
  utf8_valid_ascii           41001 MB/s  (71.63x)

sv.txt: 94 KB; 93K code points; 1.04 units/point
  U+0000..U+007F          90K  96.4%
  U+0080..U+07FF           3K   3.5%
  U+0800..U+FFFF          171   0.2%
  hoehrmann                    572 MB/s
  utf8_valid_old              1536 MB/s  (2.68x)
  utf8_valid                  6600 MB/s  (11.53x)
  utf8_valid_ascii            8581 MB/s  (15.00x)

units/point is a rough content-mix indicator: 1.00 is near-pure ASCII, ~1.7-1.9 is common for 2-byte-heavy text, and ~2.7-3.0 for CJK-heavy text. The code point distribution breaks down the input by Unicode range. Results are sorted slowest to fastest; the multiplier shows throughput relative to the slowest implementation.
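The units/point figure is easy to reproduce: UTF-8 code units (bytes) divided by code points, where code points can be counted by skipping continuation bytes. A minimal sketch, assuming well-formed input:

```c
#include <stddef.h>
#include <string.h>

/* units/point = UTF-8 bytes per code point.  Code points are counted by
 * skipping continuation bytes, which match the bit pattern 10xxxxxx. */
static double units_per_point(const char *s) {
    size_t bytes = strlen(s), points = 0;
    for (size_t i = 0; i < bytes; i++)
        if (((unsigned char)s[i] & 0xC0) != 0x80)  /* lead or ASCII byte */
            points++;
    return points ? (double)bytes / (double)points : 0.0;
}
```

Pure ASCII gives 1.00; a single hiragana character ("\xe3\x81\x82", three bytes) gives 3.00, which is why CJK-heavy text lands near 2.7-3.0.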

Output from multibyte-heavy content:

ja.txt: 176 KB; 65K code points; 2.79 units/point
  U+0000..U+007F           7K  10.7%
  U+0080..U+07FF           30   0.0%
  U+0800..U+FFFF          58K  89.3%
  hoehrmann                    572 MB/s
  utf8_valid_old              2046 MB/s  (3.58x)
  utf8_valid_ascii            4200 MB/s  (7.34x)
  utf8_valid                  6492 MB/s  (11.34x)

ar.txt: 25 KB; 14K code points; 1.81 units/point
  U+0000..U+007F           3K  18.9%
  U+0080..U+07FF          12K  81.1%
  hoehrmann                    572 MB/s
  utf8_valid_old               984 MB/s  (1.72x)
  utf8_valid_ascii            4163 MB/s  (7.28x)
  utf8_valid                  6486 MB/s  (11.34x)

utf8_valid reaches approximately 0.71 cycles per byte on wide-issue cores - pretty good for a C scalar implementation. See the Performance section in the GitHub repository for a more in-depth analysis and comparison across different architectures and compiler optimizations.

I couldn’t stop at just a validator. With spare bits left in the DFA table, I went ahead and implemented a complete UTF-8 library covering validation, decoding, navigation, and transcoding. The code is available on GitHub.

Unicode::UTF8 internally uses utf8_valid_ascii in its encode and decode routines.

Next steps: I have no further ideas for how to improve the scalar DFA implementation, and I am satisfied with the performance. A SIMD implementation may be worth exploring in the future, but the more immediate goal would be to get this incorporated into Perl core.

Enjoy! PS Now I can disregard the PR and tune up the numbers in the comparison ;o)

I found this enough of an obstacle that I wanted to post about it for posterity.

I was able to cpanm Google::ProtocolBuffers::Dynamic after installing these packages on Debian Trixie.

  • build-essential (unsurprisingly)
  • cmake
  • libprotobuf-dev
  • libprotoc-dev

The last library eluded me and caused the most frustration. Anyway on to other things.

Weekly Challenge: Group Tag

dev.to #perl

Weekly Challenge 369

Each week Mohammad S. Anwar sends out The Weekly Challenge, a chance for all of us to come up with solutions to two weekly tasks. My solutions are written in Python first, and then converted to Perl. Unless otherwise stated, Copilot (and other AI tools) have NOT been used to generate the solution. It's a great way for us all to practice some coding.

Challenge, My solutions

Task 1: Valid Tag

Submitted by: Mohammad Sajid Anwar

You are given a string caption for a video.

Write a script to generate a tag for the given string caption in three steps, as mentioned below:

  1. Format as camelCase: starting with a lower-case letter and capitalising the first letter of each subsequent word, merge all words in the caption into a single string starting with a #.
  2. Sanitise the String: Strip out all characters that are not English letters (a-z or A-Z).
  3. Enforce Length: If the resulting string exceeds 100 characters, truncate it so it is exactly 100 characters long.

My solution

The last example - where Hour starts with an upper-case letter - would suggest that the second step needs to be done before the first one. As I used TDD when writing these solutions, this was picked up when running the tests.

Nothing overly tricky about the solution. I take input_string and perform the necessary operations on it.

import re

def upper_case_letter(match: re.Match) -> str:
    # capitalise the letter that followed the removed run of spaces
    return match.group(1).upper()

def valid_tag(input_string: str) -> str:
    input_string = re.sub(r"[^\sa-zA-Z]", "", input_string)
    input_string = input_string.strip().lower()
    input_string = re.sub(r" +([a-z])", upper_case_letter, input_string)
    return "#" + input_string[:99]

The Perl solution is similar.

sub main ($input_string) {
    $input_string =~ s/[^\sa-zA-Z]//g;
    $input_string =~ s/^\s+//;
    $input_string =~ s/\s+$//;
    $input_string = lc $input_string;
    $input_string =~ s/ +([a-z])/uc $1/eg;
    say "#" . substr($input_string, 0, 99);
}

Examples

$ ./ch-1.py "Cooking with 5 ingredients!"
#cookingWithIngredients

$ ./ch-1.py the-last-of-the-mohicans
#thelastofthemohicans

$ ./ch-1.py "  extra spaces here"
#extraSpacesHere

$ ./ch-1.py "iPhone 15 Pro Max Review"
#iphoneProMaxReview

$ ./ch-1.py "Ultimate 24-Hour Challenge: Living in a Smart Home controlled entirely by Artificial Intelligence and Voice Commands in the year 2026!"
#ultimateHourChallengeLivingInASmartHomeControlledEntirelyByArtificialIntelligenceAndVoiceCommandsIn

Task 2: Group Division

Task

You are given a string, group size and filler character.

Write a script to divide the string into groups of the given size. If the last group doesn’t have enough characters remaining, fill it with the given filler character.

My solution

This is one task where the Python and Perl solutions are completely different. Python's more_itertools module has the grouper function, so I use this to separate the string into parts with the filler if required. No need to reinvent a perfectly round wheel :)

from more_itertools import grouper

def group_division(input_string: str, size: int, filler: str) -> list[str]:
    result = []
    for g in grouper(input_string, size, fillvalue=filler):
        result.append("".join(g))
    return result

Perl does not have a similar function, as far as I'm aware. I use a traditional for loop to split the string into the different parts. After that, I have a command that adds the filler to the last item in the @result array if required. The x operator is the replication operator in Perl.

sub main ( $input_string, $size, $filler ) {
    my @result = ();
    for ( my $i = 0 ; $i < length($input_string) ; $i += $size ) {
        push @result, substr( $input_string, $i, $size );
    }

    $result[-1] .= $filler x ( $size - length( $result[-1] ) );

    say "(" . join( ", ", @result ) . ")";
}

Examples

$ ./ch-2.py RakuPerl 4 "#"
['Raku', 'Perl']

$ ./ch-2.py Python 5 0
['Pytho', 'n0000']

$ ./ch-2.py 12345 3 x
['123', '45x']

$ ./ch-2.py HelloWorld 3 _
['Hel', 'loW', 'orl', 'd__']

$ ./ch-2.py AI 5 "!"
['AI!!!']

Zsh has regexp-replace, you don’t need sed:

autoload regexp-replace
regexp-replace line '^pick' 'f'

Also, instead of hard coding values, use zstyle:

# in your .zshrc
zstyle ':xyz:xyz:terminal' terminal "kitty"

# in your script/function
local myterm
zstyle -s ':xyz:xyz:terminal' terminal myterm || myterm="Eterm"

How to get past Sub::Defer in the Perl debugger

dev.to #perl

YOU ARE IN A MAZE OF TWISTY LITTLE PASSAGES, ALL ALIKE

perl -d professional_adventure_2024.pl

Sub::Defer::CODE(0x8ab8938)(/epsi/perlbrew/perls/perl-5.16.3/lib/perl5/Sub/Defer.pm:55):
55:         $undeferred ||= undefer_sub($deferred_info->[3]);
auto(-1)  DB<2> v
52:       my $undeferred;
53:       my $deferred_info;
54        my $deferred = sub {
55==>       $undeferred ||= undefer_sub($deferred_info->[3]);
56:         goto &$undeferred;
57:       };
58:       $deferred_info = [ $target, $maker, \$undeferred, $deferred ];
59:       weaken($deferred_info->[3]);
60:       weaken($DEFERRED{$deferred} = $deferred_info);
61:       _install_coderef($target => $deferred) if defined $target;
  DB<2> n

Walkthrough:

  • 7n
  • 3s

Adopting one from Zefram

Back in 2011, Andrew Main, known to the Perl community as Zefram, released Devel::CallParser. It was a quiet piece of infrastructure: a C API that let XS modules attach custom argument parsers to Perl subroutines. Where the core's PL_keyword_plugin API was awkward to work with directly, CallParser gave you a structured way to extend Perl's syntax from C.

Zefram maintained it through 2013, fixing compatibility issues with indirect and Sub::StrictDecl, working around padrange optimiser changes, and shipping version 0.002. Then silence. The module sat on CPAN, unmaintained, while Perl kept moving.

On current Perl versions it was breaking: I personally could not install it on my hardened macOS runtimes, and there were many reports of issues on threaded builds, breakage from the shifted qerror internals, and plenty of red reports on CPAN Testers. I needed CallParser for Object::Proto::Sugar, so I adopted it and have so far shipped six dev releases (0.003_01 through 0.003_06) to try to get it passing green again in all environments. Not glamorous work, but Zefram built something worth preserving.

(RIP Zefram... I didn't know them personally but the infrastructure they left behind is still making new things possible.)

The Idea

With CallParser working again, I decided to implement an idea I'd thought about for a long time: give Perl a proper enum keyword.

Not a hash. Not a bunch of use constant lines. Not a class/object pretending to be an enumeration. An actual keyword that declares an enum at compile time, generates real constants, and gives you a meta object for introspection.

Enum::Declare

Here's what it looks like:

use Enum::Declare;

enum Colour {
    Red,
    Green,
    Blue
}

say Red;    # 0
say Green;  # 1
say Blue;   # 2

That's it. enum is a real keyword, parsed at compile time by an XS callback wired through cv_set_call_parser. The constants are true constants, not subroutine calls, not tied variables. The compiler sees them.

Explicit Values

enum HttpStatus {
    OK        = 200,
    Created   = 201,
    NotFound  = 404,
    Internal  = 500
}

String Enums

enum LogLevel :Str {
    Debug,
    Info,
    Warn = "warning",
    Error,
    Fatal
}

say Debug;  # "debug"
say Warn;   # "warning"

Without an explicit value, :Str lowercases the constant name. With one, it uses what you gave it.

Bitflags

enum Perms :Flags {
    Read,
    Write,
    Execute
}

my $rw = Read | Write;
say "can read"  if $rw & Read;
say "can write" if $rw & Write;

:Flags assigns powers of two automatically. Combine them with bitwise operators as you'd expect.

Exporting

# In your module:
enum StatusCode :Export {
    OK      = 200,
    NotFound = 404
}

# Consumers get the constants automatically, or use tags:
use MyModule qw(:StatusCode);

Meta Objects

Every enum gets a meta object accessible by calling the enum name as a function:

my $meta = Colour();

say $meta->count;           # 3
say $meta->name(0);         # "Red"
say $meta->value('Blue');   # 2
say $meta->valid(1);        # true

my @pairs = $meta->pairs;  # (Red => 0, Green => 1, Blue => 2)

Exhaustive Matching

Colour()->match($val, {
    Red   => sub { "stop" },
    Green => sub { "go" },
    Blue  => sub { "sky" },
});

Miss a variant and it dies. Every key/branch must be covered.

How It Works

Under the hood, use Enum::Declare installs an XS stub named enum into the calling package, then attaches a custom parser via cv_set_call_parser. When Perl encounters enum during compilation, the parser callback fires and:

  1. Reads the enum name (lex_read_ident)
  2. Reads optional attributes - :Str, :Flags, :Export, :Type
  3. Reads the { Name = Value, ... } variant block
  4. Builds the constant subs and enum data structures
  5. Installs the meta object
  6. Optionally wires up @EXPORT / @EXPORT_OK

All of this happens at compile time. By the time Perl starts executing your code, the constants exist, the meta object is ready, and the exports are in place.

Enum::Declare::Common

Once the keyword worked, the obvious next step was a library of common enums:

use Enum::Declare::Common::HTTP qw(:StatusCode :Method);
use Enum::Declare::Common::Calendar qw(:Weekday :Month);
use Enum::Declare::Common::Color qw(:CSS);

say OK;        # 200
say GET;       # "get"
say Monday;    # 1
say January;   # 1

Enum::Declare::Common ships 20 submodules covering HTTP status codes and methods, ISO country and currency codes, MIME types, 148 named CSS hex colours, timezone offsets, Unix permissions, log levels, and more. All built on the same enum keyword, all with meta objects, all exportable.

Integration with Object::Proto

Every enum in the Common collection is declared with the :Type attribute:

enum StatusCode :Type :Export {
    OK      = 200,
    Created = 201,
    ...
}

This registers the enum as a type in Object::Proto at load time, so you can use enum names directly as slot types:

use Enum::Declare::Common::HTTP qw(:StatusCode :Method);
use Enum::Declare::Common::LogLevel qw(:Level);
use Object::Proto;

object 'APIRequest',
    'method:Method:required',
    'status:StatusCode',
    'log_level:Level:default(' . Info . ')',
;

my $req = new APIRequest method => GET;
$req->status(OK);       # valid
$req->status(9999);     # dies - not a valid StatusCode
$req->status(200);      # coercion - resolves to OK

The type checks and coercions run in C via object_register_type_xs_ex. No Perl callback overhead. A single pair of C functions serves every enum type, only the data pointer differs.
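The "single pair of C functions, only the data pointer differs" design can be sketched like this. All names here are hypothetical, for illustration only - the real registration goes through object_register_type_xs_ex:

```c
#include <stddef.h>

/* Hypothetical sketch of the shared-validator design: one C function checks
 * membership for every enum type; the per-enum data rides on a void pointer,
 * so no per-type validation code is generated. */
typedef struct {
    const int *values;   /* the enum's valid values */
    size_t     count;
} enum_type_data;

static int enum_type_check(int candidate, const void *opaque) {
    const enum_type_data *d = opaque;
    for (size_t i = 0; i < d->count; i++)
        if (d->values[i] == candidate)
            return 1;    /* valid member of this enum */
    return 0;
}
```

Registering StatusCode and Method both point at enum_type_check; only the enum_type_data they carry differs.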

If you're writing Perl and you've been using hashes or use constant blocks to fake enums, give Enum::Declare a try. In my opinion it's enums the way they should have always worked.

As always if you have any questions just post below.


Tony writes:

``` [Hours] [Activity] 2026/03/02 Monday 1.55 #24228 follow-up comment, check updates, research and comment 0.75 #24187 review updates, mark comment resolved, research 0.97 #24242 review, research 0.40 #24242 debugging and comment

1.02 #24001 debugging, research, testing

4.69

2026/03/03 Tuesday 0.15 #24242 review dicsussion 0.10 #24211 review discussion and apply to blead 0.53 #24242 comment 0.23 #24239 review and comment 0.18 #24223 review and approve 0.40 #24244 review and comment 0.58 #24245 review and approve 0.07 #24247 review, existing comments seem fine 0.50 #24187 review more, comments 0.08 #24244 review update and approve

0.23 #24195 research

3.05

2026/03/04 Wednesday 0.88 #24252 review, research and comments 0.75 #24251 review, research and comments 0.90 #24253 review, comments 0.12 #24239 review updates and approve 0.28 #24208 comment with guide to update

0.15 #24208 review update and approve

3.08

2026/03/05 Thursday 0.68 #24254 review and comments 0.18 #24256 review and approve 0.13 #24247 check CI results and restart an apparent spurious failure 0.18 #24241 review CI failures and comment

0.40 #24228 compare to #24252 behaviour, testing

1.57

2026/03/09 Monday 0.33 #24254 review updates and approve 0.40 #24253 review updates and comment 1.38 #24252 review updates and comments, research, testing and follow-up

0.68 #24105 rebase, testing

2.79

2026/03/10 Tuesday 0.57 test 5.42.1 on fedora, looks ok, message on list indicates likely a local problem 2.70 #24105 check everything covered, various fixes, testing,

push for CI

3.27

2026/03/11 Wednesday 0.88 #24105 check CI results, fixes, push for more CI 0.57 #24187 review discussion, research and comment 0.13 #24253 review updates and approve 0.12 #24252 review updates and approve with comment 0.23 #24228 review updates and approve 0.15 #24252 approve with perldelta update 1.15 #24001 debugging (what is PL_curcopdb?)

1.20 #24001 debugging, research

4.43

2026/03/12 Thursday 1.12 #24265 review, research and comment

1.33 #24001 research, testing, needs some thought

2.45

2026/03/13 Friday

0.82 research, email to list about benchmarking

0.82

2026/03/16 Monday 0.10 #24208 review updates and apply to blead 2.47 #24272 profiling, benchmarking, comment and work on bisect 0.75 #24272 review bisect results, confirm bisect results, briefly try to work out cause, long comment with results

0.32 #24287 review and approve

3.64

2026/03/17 Tuesday 0.23 #24265 recheck and approve 0.92 #24001 re-work, research and testing 0.57 #24105 rebase and testing, minor fix and push for CI 1.13 #24272 try to diagnose

0.45 #24056 re-work commit message

3.30

2026/03/18 Wednesday 0.40 #24105 check CI results, re-check, make PR #24294 0.75 #24099 review, research and comment 0.22 #24296/#24295 research and comment (both have the same problem)

1.37 #24277 review, testing, comment

2.74

2026/03/19 Thursday 2.05 #24227 research and comments

1.22 #24272 debugging

3.27

2026/03/20 Friday

0.53 #24251 research and follow-up

0.53

2026/03/23 Monday 0.40 #24251 review updates, research and approve with comment 1.22 #24304 review, comment 0.15 #24313 review, research and apply to blead 0.10 #24310 review (nothing to say) 0.17 #24309 review, research and approve 0.13 #24305 review and approve 0.53 #24290 review 1.32 #24056 more update commit message, simplify perldelta

note, push and update OP on PR

4.02

2026/03/24 Tuesday 0.32 #24318 review and review ticket, start workflow, research and comment 0.08 #24301 review and approve 0.37 #24290 more review and comments 0.53 #24289 review, research current PSC and approve with comment 0.38 #24288 review, research and comment 0.18 #24285 review, research and approve 0.72 #24282 review, research and comment

1.23 #24290 review updates, testing and more comment

3.81

2026/03/25 Wednesday

0.15 #24056 check rules, apply to blead
0.30 #24308 review, research and comments
0.82 #24304 review, research and comment, consider Paul’s reply
1.55 #23918 string comparison APIs, research, open #24319
0.13 #24290 review updates and follow-up

1.50 #24005 start on perldebapi, research

4.45

2026/03/26 Thursday

1.25 #24326 review and comment
0.47 #24290 review updates, comment and approve with comment
0.40 #24326 review, comment on side issue and approve
0.28 #24323 review, try to find the referenced documentation, comment
0.10 #24324 review and approve

0.13 #24323 review update and approve

2.63

2026/03/30 Monday

0.80 #24308 review updates and comments
0.08 #24290 review discussion and apply to blead
1.05 #24304 review updates and comment, long comment
1.55 #23676 research, make APIs public and document, testing and push for CI
0.60 #24187 review updates

1.08 #24187 testing, comment

5.16

2026/03/31 Tuesday

1.22 #23676 comment, comment on PR regarding qerror() name, research, work on perldelta
0.47 github notifications, minor updates
0.53 #24332 review original ticket discussion and the change, approve with comment
0.23 #24329 review, research and apply to blead
0.32 #24281 review, try to get a decent view, given github’s tab mis-handling, comment
0.12 #24280 review, comments
0.35 #23995 research and comment
0.08 #24105 follow-up on PR 24294

0.50 #24251 follow-up comment

3.82

Which I calculate is 63.52 hours.

Approximately 51 tickets were reviewed or worked on, and 6 patches were applied.
```


Paul writes:

A couple of bugfixes in March, combined with starting to line up a few development ideas to open 5.45 with.

  • 2 = Bugfix for field refalias memory leak
    • https://github.com/Perl/perl5/pull/24254
  • 2 = Improved field performance
    • https://github.com/Perl/perl5/pull/24265
  • 3 = Continue progress on implementing PPC0030
    • https://github.com/Perl/perl5/pull/24304 (draft)
  • 2 = Bugfix for deferred class seal
    • https://github.com/Perl/perl5/pull/24326

Total: 9 hours

Besides working up to the 5.44 release, my main focus now will be getting things like PPC0030, magic-v2, attributes-v2, and various class feature improvements lined up ready for the 5.45 development cycle.


Dave writes:

Last month was spent looking into race conditions in threads and threads::shared. I initially started looking at a specific ticket, where (with effort) I could reproduce a crash by running many instances in parallel for several hours. I think I have fixed that specific bug, but it led me to dynamic thread-safety checkers such as helgrind, and I am currently plunging down the rabbit hole of issues which that tool is flagging up.

Nothing has been pushed yet.

Summary:

  • 17:06 GH #24258 dist/threads/t/free.t: Rare test failure in debugging build on FreeBSD

Total:

  • 17:06 TOTAL (HH:MM)

This week in PSC (221) | 2026-04-13

blogs.perl.org

All three of us attended this long meeting consisting entirely of dealing with release blockers.

  • We found no blockers among the issues and patches new since last week.

  • We weighed up #23676 again and decided that it merits blocker status even though the breakage was the result of a C compiler change, not of a change to perl (in this dev cycle or otherwise).

  • We then reviewed all of the blockers we were already tracking and made decisions on how to proceed with all of them. Of particular note,

  • We agreed that #23131 simply cannot be pursued in this form – not just in this cycle but at all. We may eventually be able to virtualize the stash entirely, allowing much deeper optimization by using a more rational internal data structure while maintaining the Perl-land illusion that nothing has changed (effectively turning the stash hash tree into an API in the same way that a tied hash is an API that looks like a data structure). But until such time, Perl-visible changes to the stash data structure are simply too disruptive.

  • #24340 is part of a series that constitutes a promising-looking fix which we want to pursue, but it didn’t get fully stabilized soon enough in this cycle, and because it’s in a dark and tricky corner of the codebase, we don’t want to take the risk of shipping with undiscovered new breakage there rather than a longstanding bug. Reapplying this patch series early in the next cycle should give it ample time to bake and settle down, and we hope it will eventually ship successfully.

  • We spent some time debating #24341, tracing commit and issue tracker history to try to assess the correctness of the changes. We lean toward the view that this is breakage that core should fix, but discussion in the comments is ongoing.

[P5P posting of this summary]

Happy sharing

blogs.perl.org

So you've got a bunch of Perl worker processes and they need to share state. A work queue, a counter, a lookup table - the usual. What do you reach for?

Perl has solid options here, and they've been around for a while. File::Map gives you clean zero-copy access to mmap'd files - substr, index, even regexes run directly on mapped memory. LMDB_File wraps the Lightning Memory-Mapped Database - mature, ACID-compliant, lock-free concurrent readers via MVCC, crash-safe persistence. Hash::SharedMem offers a purpose-built concurrent hash with lock-free reads and atomic copy-on-write updates. Cache::FastMmap is the workhorse for shared caching - mmap-backed pages with per-page fcntl locking, LRU eviction, and optional compression.

These are all good, proven tools. But they have something in common: they're about storage. You put data in, you get data out. They don't give you a queue that consumers can block on. They don't give you a pub/sub channel, a ring buffer, a semaphore, a priority heap, or a lock-free MPMC algorithm. They don't do atomic counters or futex-based blocking with timeouts.

That's the gap the Data::*::Shared family fills - fourteen Perl modules that give you proper, typed, concurrent data structures backed by mmap. Not better storage - concurrent data structures that happen to live in shared memory. Queues, hash maps, pub/sub, stacks, ring buffers, heaps, graphs, sync primitives - the works. All written in XS/C, all designed to work across fork()'d processes with zero serialization overhead.

Let me walk you through what's in the box.

The Approach

Every module in the family uses the same core recipe:

  • mmap(MAP_SHARED) for the actual shared memory - no serialization, no copies, just raw memory visible to all processes
  • Linux futex for blocking/waiting - when a queue is empty and you want to wait for data, you sleep in the kernel, not in a spin loop
  • CAS (compare-and-swap) for lock-free operations where possible - no mutex, no contention, just atomic CPU instructions
  • PID-based crash recovery - if a process dies holding a lock, other processes detect the stale PID and recover automatically

Requires Linux (futex, memfd), 64-bit Perl 5.22+. A deliberate tradeoff - portable it isn't, but fast it is.

Three ways to create the backing memory:

# File-backed - persistent, survives restarts
my $q = Data::Queue::Shared::Int->new('/tmp/myq.shm', 1024);

# Anonymous - fork-inherited, no filesystem footprint
my $q = Data::Queue::Shared::Int->new(undef, 1024);

# memfd - passable via Unix socket fd, no filesystem visibility
my $q = Data::Queue::Shared::Int->new_memfd("my_queue", 1024);

The Modules

Here's the full roster, grouped by use case.

Message Passing

Data::Queue::Shared - Your bread-and-butter MPMC (multi-producer, multi-consumer) bounded queue. Integer variants use the Vyukov lock-free algorithm; string variant uses a mutex with a circular arena. Blocking and non-blocking modes, batch operations, the whole deal.

use Data::Queue::Shared;

my $q = Data::Queue::Shared::Int->new(undef, 4096);

# In producer
$q->push(42);
$q->push_multi(1, 2, 3, 4, 5);

# In consumer
my $val = $q->pop_wait(1.5); # block up to 1.5s
my @batch = $q->pop_multi(100);

Single-process throughput: ~5M ops/s for integers. That's roughly 3x MCE::Queue and 6x POSIX message queues.

Data::PubSub::Shared - Broadcast pub/sub over a ring buffer. Publishers write, subscribers each track their own cursor. If a subscriber falls behind, it auto-recovers to the oldest available message. No back-pressure on writers.

my $ps = Data::PubSub::Shared::Int->new(undef, 8192);
$ps->publish(42);

my $sub = $ps->subscribe;
my $val = $sub->poll_wait(1.0);

Batch publishing hits ~170M msgs/s for integers. Yes, really. It's just writing to mapped memory.

Data::ReqRep::Shared - Request-response pattern with per-request reply routing. Client acquires a response slot, sends a request carrying the slot ID, server replies to that specific slot. Supports both sync and async client styles.

# Server
my ($request, $id) = $rr->recv_wait(1.0);
$rr->reply($id, "processed: $request");

# Client (async)
my $id = $rr->send("do something");
my $response = $rr->get_wait($id, 2.0);

Around 200K req/s cross-process - competitive with Unix domain sockets but with true MPMC support.

Key-Value

Data::HashMap::Shared - This is the big one. Concurrent hash map with elastic capacity, optional LRU eviction (clock algorithm with lock-free reads), optional per-key TTL, atomic counters, sharding, cursors. Eleven type variants from II (int-int) to SS (string-string).

use Data::HashMap::Shared::SS;

my $map = Data::HashMap::Shared::SS->new('/tmp/cache.shm', 100_000);
$map->put("user:123", "alice");
my $name = $map->get("user:123");

# LRU cache with max 10K entries
my $cache = Data::HashMap::Shared::SS->new('/tmp/lru.shm', 100_000, 10_000);

# TTL - entries expire after 60 seconds
my $ttl = Data::HashMap::Shared::II->new('/tmp/ttl.shm', 100_000, 0, 60);

# Atomic counter (lock-free fast path under read lock)
$map->incr("hits:page_a");

Cross-process string reads: 3.25M/s. Integer lookups hit ~10M/s. And you get built-in LRU and TTL without an external cache layer.

Sequential & Positional

Data::Stack::Shared - Lock-free LIFO stack. Push, pop, peek. ~6.4M ops/s.

Data::Deque::Shared - Double-ended queue. Push/pop from both ends. Lock-free CAS. ~6.3M ops/s.

Data::RingBuffer::Shared - Fixed-size circular buffer that overwrites on wrap. No consumer tracking - you just read by position. Great for metrics windows and rolling logs. ~11.7M writes/s.

Data::Log::Shared - Append-only log. Unlike Queue (consumed on read) or RingBuffer (overwritten), Log retains everything until explicitly truncated. CAS-based append, cursor-based reads. ~8.9M appends/s.

Resource Management

Data::Pool::Shared - Object pool with allocate/free. CAS-based bitmap allocation, typed slots (I64, I32, F64, Str), scope guards for automatic cleanup, raw C pointers for FFI integration. PID-tracked slots are auto-recovered when a process dies.

my $pool = Data::Pool::Shared::I64->new(undef, 256);
my $idx = $pool->alloc;
$pool->set($idx, 42);
# ...
$pool->free($idx);

# Or with auto-cleanup
{
    my $guard = $pool->alloc_guard;
    $pool->set($$guard, 99);
} # auto-freed here

Data::BitSet::Shared - Fixed-size bitset with per-bit atomic CAS operations. Good for flags, membership tracking, allocation bitmaps. ~10.5M ops/s.

Data::Buffer::Shared - Type-specialized arrays (I8 through F64, plus Str) with atomic per-element access. Seqlock for bulk reads, RW lock for bulk writes. Think shared sensor arrays or metric buffers.

Graphs & Priority

Data::Graph::Shared - Directed weighted graph with mutex-protected mutations. Node bitmap pool, adjacency lists, per-node data. ~3.9M node adds/s, ~13.3M lookups/s.

Data::Heap::Shared - Binary min-heap for priority queues. Mutex-protected, futex blocking when empty. ~5.3M pushes/s.

Synchronization Primitives

Data::Sync::Shared - Five cross-process sync primitives in one module: Semaphore, Barrier, RWLock, Condvar, and Once. All futex-based, all with PID-based stale lock recovery, all with scope guards.

use Data::Sync::Shared;

my $sem = Data::Sync::Shared::Semaphore->new(undef, 4); # 4 permits
{
    my $guard = $sem->acquire_guard;
    # at most 4 processes here concurrently
}

my $barrier = Data::Sync::Shared::Barrier->new(undef, $num_workers);
$barrier->wait; # blocks until all workers arrive

my $once = Data::Sync::Shared::Once->new(undef);
if ($once->enter) {
    init_expensive_thing();
    $once->done;
}

At a Glance

Module               Pattern                           Concurrency                     Throughput
Queue::Shared        MPMC queue                        lock-free (Int), mutex (Str)    ~5M ops/s
PubSub::Shared       broadcast pub/sub                 lock-free (Int), mutex (Str)    ~170M/s batched
ReqRep::Shared       request-response                  lock-free (Int), mutex (Str)    ~200K req/s
HashMap::Shared      hash map + LRU/TTL                futex RW lock, seqlock reads    ~10M gets/s
Stack::Shared        LIFO stack                        lock-free CAS                   ~6.4M ops/s
Deque::Shared        double-ended queue                lock-free CAS                   ~6.3M ops/s
RingBuffer::Shared   circular buffer                   lock-free CAS                   ~11.7M writes/s
Log::Shared          append-only log                   lock-free CAS                   ~8.9M appends/s
Pool::Shared         object pool                       lock-free bitmap                ~3.3M alloc/s
BitSet::Shared       bitset                            lock-free CAS                   ~10.5M ops/s
Buffer::Shared       typed arrays                      atomic + seqlock                per-type
Graph::Shared        directed graph                    mutex                           ~13.3M lookups/s
Heap::Shared         priority queue                    mutex                           ~5.3M pushes/s
Sync::Shared         sem/barrier/rwlock/condvar/once   futex                           -

Type Specialization

Most modules come in typed variants - Int16, Int32, Int64, Str, and so on. This isn't just for type safety. An Int16 queue uses a quarter of the memory of an Int64 queue, which means four times the element density in the same cache space. When you're doing millions of operations per second, cache lines matter.

Event Loop Integration

Every module supports eventfd() for integration with event loops like EV, Mojo, or AnyEvent:

my $fd = $q->eventfd;
# register $fd with your event loop
# on readable: $q->eventfd_consume; then poll/pop

Signaling is explicit ($q->notify) so you can batch writes before waking consumers.
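As a concrete sketch, here is how that might wire into AnyEvent. The eventfd/eventfd_consume calls follow the API described above; that a plain pop() is non-blocking and returns undef when empty is my assumption - check the module's own docs:

```perl
# Hedged sketch: draining a shared queue from AnyEvent.
# Assumes the eventfd()/eventfd_consume() API described above, and
# that pop() is non-blocking and returns undef when the queue is empty.
use strict;
use warnings;
use AnyEvent;
use Data::Queue::Shared;

my $q  = Data::Queue::Shared::Int->new(undef, 4096);
my $fd = $q->eventfd;    # raw descriptor to hand to the event loop

my $watcher = AnyEvent->io(
    fh   => $fd,         # AnyEvent accepts a naked file descriptor
    poll => 'r',
    cb   => sub {
        $q->eventfd_consume;              # reset the eventfd counter
        while (defined(my $v = $q->pop)) {
            print "got $v\n";             # drain everything available
        }
    },
);

AnyEvent->condvar->recv; # run the loop
```

On the producer side you would batch your pushes and then call $q->notify once to wake the consumer's loop.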

Playing Nice with Others: PDL, FFI::Platypus, OpenGL::Modern

One thing I want to highlight is that these aren't isolated islands. Because everything lives in mmap'd memory with known layouts, you get natural interop with other systems that work with raw pointers and packed data.

PDL is the obvious one. If you're doing numerical work in Perl - signal processing, image manipulation, statistics - PDL is your workhorse. The Buffer module's as_scalar returns a zero-copy scalar reference directly over the mmap'd region. Feed that to PDL and you've got an ndarray backed by shared memory:

use Data::Buffer::Shared::F64;
use PDL;

my $buf = Data::Buffer::Shared::F64->new('/tmp/signal.shm', 10000);

# one process fills the buffer with sensor data...
# another process reads it as a PDL:
my $pdl = PDL->new_from_specification(double, 10000);
${$pdl->get_dataref} = ${$buf->as_scalar};
$pdl->upd_data;

printf "mean=%.4f stddev=%.4f\n", $pdl->stats;

For typed arrays you can also use get_raw/set_raw for bulk transfers - a single memcpy under the hood, seqlock-guarded for consistency. That means you can build a multiprocess image pipeline where one process captures frames into a shared U8 buffer, another runs PDL convolutions on it, and a third renders the result - all communicating through shared memory with eventfd notifications, no serialization anywhere.

FFI::Platypus works just as naturally. Pool and Buffer both expose ptr() / data_ptr() - raw C pointers as unsigned integers, ready to hand to any C function through FFI. Need to call libc qsort directly on your shared data? Go ahead:

use Data::Pool::Shared;
use FFI::Platypus;

my $pool = Data::Pool::Shared::I64->new(undef, 1000);
# ... alloc and fill slots ...

my $ffi = FFI::Platypus->new(api => 2);
$ffi->lib(undef); # libc
$ffi->attach([qsort => 'c_qsort'] =>
    ['opaque', 'size_t', 'size_t', '(opaque,opaque)->int'] => 'void');

c_qsort($pool->data_ptr, 1000, 8, $comparator);
# slots are now sorted in-place, visible to all processes

Pool slots are contiguous in memory (data_ptr + idx * elem_size), so any C library that expects a flat array works out of the box.

OpenGL::Modern is where it gets fun. Buffer::F32 is essentially a shared vertex buffer. One process computes positions, another renders them - connected by a shared mmap region and eventfd:

# Compute process:
my $verts = Data::Buffer::Shared::F32->new('/tmp/verts.shm', 30000);
$verts->set_slice(0, @new_positions);
$verts->notify;

# Render process:
my $ref = $verts->as_scalar;
# on eventfd readable:
glBufferSubData_p(GL_ARRAY_BUFFER, 0, $$ref); # zero-copy upload

Pool goes further - it's a natural fit for particle systems. Particles are dynamically spawned (alloc) and despawned (free), each with a fixed-size state struct. A spawner process allocates particles, a physics process updates them, and the renderer uploads the live slots to a VBO via ptr(). The raw pointer goes straight to glBufferSubData_c - no packing, no intermediate copies.

The common thread here is that the data is already in the format the consuming library expects. F32 buffers are packed floats. I64 pools are packed int64s. There's no Perl-side serialization layer to bypass because there was never one to begin with.

Optional Keyword API

If you install XS::Parse::Keyword, several modules expose lexical keywords that bypass Perl method dispatch entirely:

use Data::Queue::Shared;

q_int_push $q, 42;
my $v = q_int_pop $q;

Zero dispatch overhead. The XS function gets called directly. It's optional - the method API works fine - but it's there when you need every last microsecond.

The Big Picture

Here's how the pieces fit together in a typical system:

  • Data::Queue::Shared distributes work from producers to a pool of workers
  • Data::HashMap::Shared acts as a shared cache or config store that all workers read from
  • Data::PubSub::Shared broadcasts events or status updates to whoever's listening
  • Data::Sync::Shared coordinates startup (Barrier), limits concurrency (Semaphore), and protects shared initialization (Once)
  • Data::Pool::Shared manages reusable resource slots
  • Data::RingBuffer::Shared or Data::Log::Shared holds recent metrics or audit trails

All of this running across fork()'d processes, communicating through shared memory at millions of operations per second, no serialization overhead.

Getting Started

Values are typed C scalars or fixed-length strings - no automatic serialization of arbitrary Perl structures. That's by design: raw mmap'd memory is what makes everything fast and FFI-friendly, but it means you won't be sharing hashrefs or blessed objects directly.

All modules follow the same pattern:

use Data::Queue::Shared;

# Pick your backing: file, anonymous, or memfd
my $q = Data::Queue::Shared::Int->new(undef, 4096);

if (fork() == 0) {
    $q->push($$); # child pushes its PID
    exit;
}

my $child_pid = $q->pop_wait(5.0);
say "Child reported in: $child_pid";

The modules are on GitHub under the vividsnow account. Each one has its own repo, test suite, and benchmarks you can run yourself.

If you've ever wished Perl had something like Go's channels and sync primitives but for fork()'d processes - well, now it does. Fourteen of them, actually.

Happy sharing

Let’s Make a Drum Machine application! Yeah! :D

There are basically two important things to handle: A MIDI “clock” and a groove to play.

Why asynchronous? Well, a simple while (1) { Time::HiRes::sleep($interval); ... } will not do because the time between ticks will fluctuate, often dramatically. IO::Async::Timer::Periodic is a great timer for this purpose. Its default scheduler uses system time, so intervals happen as close to the correct real-world time as possible.
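You can see the drift for yourself with a tiny standalone experiment (core Perl only, nothing MIDI-specific); the exact numbers depend on your OS scheduler and load:

```perl
# Demonstrates why a naive sleep loop drifts: each sleep() overshoots
# slightly, and the error accumulates because the loop never
# re-anchors itself to the wall-clock start time.
use strict;
use warnings;
use Time::HiRes qw(sleep time);

my $interval = 0.02;   # roughly one MIDI clock tick at 125 BPM
my $ticks    = 50;
my $start    = time();

sleep($interval) for 1 .. $ticks;

my $elapsed  = time() - $start;
my $expected = $ticks * $interval;

printf "expected %.3fs, naive loop took %.3fs (drift %+.3fs)\n",
    $expected, $elapsed, $elapsed - $expected;
```

A periodic timer avoids this by scheduling each tick relative to when it should have fired, not relative to when the last one happened to finish.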

Clocks

A MIDI clock tells a MIDI device about the tempo. This can be handed to a drum machine or a sequencer. Each clock tick tells the device to advance a step of a measured interval. Usually this is very short, and is often 24 pulses per quarter-note (four quarter-notes to a measure of four beats).
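The arithmetic for the tick interval is worth seeing with real numbers (the tempos below are arbitrary examples):

```perl
# Seconds per MIDI clock tick at 24 PPQN, for a few tempos.
use strict;
use warnings;

my $ppqn = 24; # pulses (clock ticks) per quarter-note
for my $bpm (60, 120, 174) {
    my $interval = 60 / $bpm / $ppqn; # seconds per tick
    printf "%3d BPM -> %.5f s/tick (%.2f ms)\n",
        $bpm, $interval, $interval * 1000;
}
# e.g. 120 BPM -> 0.02083 s/tick (20.83 ms)
```

So at dance tempo the timer has to fire roughly every 21 ms, which is exactly why accumulated sleep error is unacceptable.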

Here is code to do that, followed by an explanation of the parts:

#!/usr/bin/env perl

use v5.36;
use feature 'try';
use IO::Async::Loop ();
use IO::Async::Timer::Periodic ();
use MIDI::RtMidi::FFI::Device ();

my $name = shift || 'usb'; # MIDI sequencer device
my $bpm  = shift || 120; # beats per minute

my $interval = 60 / $bpm / 24; # time / bpm / clocks-per-beat

# open the named midi device for output
my $midi_out = RtMidiOut->new;
try { # this will die on Windows but is needed for Mac
    $midi_out->open_virtual_port('RtMidiOut');
}
catch ($e) {}
$midi_out->open_port_by_name(qr/\Q$name/i);

$midi_out->start; # start the sequencer

$SIG{INT} = sub { # halt gracefully
    say "\nStop";
    try {
        $midi_out->stop; # stop the sequencer
        $midi_out->panic; # make sure all notes are off
    }
    catch ($e) {
        warn "Can't halt the MIDI out device: $e\n";
    }
    exit;
};

my $loop = IO::Async::Loop->new;

my $timer = IO::Async::Timer::Periodic->new(
   interval => $interval,
   on_tick  => sub { $midi_out->clock }, # send a clock tick!
);
$timer->start;

$loop->add($timer);
$loop->run;

The above code does a few things. First it uses modern Perl, then the modules that will make execution asynchronous, and finally the module that makes real-time MIDI possible.

Next up, a $name variable is captured for a unique MIDI device. (And to see what the names of MIDI devices on the system are, use JBARRETT’s little list_devices script.) Also, the beats per minute is taken from the command-line. If neither is given, usb is used for the name, and the BPM is set to “dance tempo.”

The clock needs a time interval to tick off. For us, this is a fraction of a second based on the beats per minute, and is assigned to the $interval variable.

To get the job done, we will need to open the named MIDI device for sending output messages to. This is done with the $name provided.

In order to not just die when we want to stop, $SIG{INT} is redefined to gracefully halt. This also sends a stop message to the open MIDI device. This stops the sequencer from playing.

Now for the meat and potatoes: The asynchronous loop and periodic timer. These tell the program to do its thing, in a non-blocking and event-driven manner. The periodic timer ticks off a clock message every $interval. Pretty simple!

As an example, here is the above code controlling my Volca Drum drum machine on a stock, funky groove. We invoke it on the command-line like this:

perl clock-gen-async.pl

Grooves

What we really want is to make our drum machine actually play something of our own making. So it’s refactor time… Let’s make a 4/4 time groove, with 16th-note resolution, that alternates between two different parts. “4/4” is a “time signature” in music jargon and means that there are four beats per measure (numerator), and a quarter note equals one beat (denominator). Other time signatures like the waltz’s 3/4 are simple, while odd meters like 7/8 are not.

In order to generate syncopated patterns, Math::Prime::XS and Music::CreatingRhythms are added to the use statements. “What are syncopated patterns?”, you may ask. Good question! “Syncopated” means, “characterized by displaced beats.” That is, every beat does not happen evenly, at exactly the same time. Instead, some are displaced. For example, a repeated [1 1 1 1] is even and boring. But when it becomes a repeated [1 1 0 1] things get spicier and more syncopated.
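Music::CreatingRhythms handles the generation for us below, but the core idea of a Euclidean pattern - spread k onsets as evenly as possible over n slots - can be sketched in a few lines. This floor-based construction is not the module's actual algorithm, but it produces a rotation of the same patterns:

```perl
# Sketch of Euclidean rhythm generation: k onsets spread as evenly
# as possible over n slots. Produces a rotation of the canonical
# Bjorklund-style patterns.
use strict;
use warnings;
use POSIX qw(floor);

sub euclid_sketch {
    my ($k, $n) = @_;
    my @pattern = (0) x $n;
    $pattern[ floor($_ * $n / $k) ] = 1 for 0 .. $k - 1;
    return \@pattern;
}

print "@{ euclid_sketch(3, 8) }\n";  # 1 0 1 0 0 1 0 0
print "@{ euclid_sketch(5, 16) }\n";
```

Three onsets over eight slots gives a rotation of the familiar tresillo; rotating it (as rotate_n does below) shifts which beat carries the accent.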

The desired MIDI channel is added to the command-line inputs. Most commonly, this will be channel 9 (in zero-based numbering). But some drum machines and sequencers are “multi-timbral” and use multiple channels simultaneously for individual sounds.

Next we define the drums to use. This is a hash-reference that includes the MIDI patch number, the channel it’s on, and the pattern to play. The combined patterns of all the drums, when played together at tempo, make a groove.

Now we compute intervals and friends. Previously, there was one $interval. Now there are a whole host of measurements to make before sending MIDI messages.

Then, as before, a named MIDI output device is opened, and a graceful stop is defined.

Next, a Music::CreatingRhythms object is created. And then, again as before, an asynchronous loop and periodic timer are instantiated and set in motion.

The meaty bits are in the timer’s on_tick callback. This contains all the logic needed to trigger our drum grooves.

As was done in the previous clock code, a clock message is sent, but we also keep track of the number of clock ticks that have passed. This tick count is used to trigger the drums. We care about 16th-note resolution, so on every 16th-note we construct and play a queue of events.

Adjusting the drum patterns is where Math::Prime::XS and Music::CreatingRhythms come into play. The subroutine that does that is adjust_drums() and is fired every 4th measure. A measure is equal to four quarter-notes, and we use four pulses for each, to make 16 beats per measure. This routine reassigns either Euclidean or manual patterns of 16 beats to each drum pattern.

Managing the queue is next. If a drum is to be played at the current beat (as tallied by the $beat_count variable), it is added to the queue at full velocity (127). Then, after all the drums have been accounted for, the queue is played with $midi_out->note_on() messages. Lastly, the queue is “drained” by sending $midi_out->note_off() messages.

#!/usr/bin/env perl

use v5.36;
use feature 'try';
use IO::Async::Loop ();
use IO::Async::Timer::Periodic ();
use Math::Prime::XS qw(primes);
use MIDI::RtMidi::FFI::Device ();
use Music::CreatingRhythms ();

my $name = shift || 'usb'; # MIDI sequencer device
my $bpm  = shift || 120; # beats-per-minute
my $chan = shift // 9; # 0-15, 9=percussion, -1=multi-timbral

my $drums = {
    kick  => { num => 36, chan => $chan < 0 ? 0 : $chan, pat => [] },
    snare => { num => 38, chan => $chan < 0 ? 1 : $chan, pat => [] },
    hihat => { num => 42, chan => $chan < 0 ? 2 : $chan, pat => [] },
};

my $beats = 16; # beats in a measure
my $divisions = 4; # divisions of a quarter-note into 16ths
my $clocks_per_beat = 24; # PPQN
my $clock_interval = 60 / $bpm / $clocks_per_beat; # time / bpm / ppqn
my $sixteenth = $clocks_per_beat / $divisions; # clocks per 16th-note
my %primes = ( # for computing the pattern
    all  => [ primes($beats) ],
    to_5 => [ primes(5) ],
    to_7 => [ primes(7) ],
);
my $ticks = 0; # clock ticks
my $beat_count = 0; # how many beats?
my $toggle = 0; # part A or B?
my @queue; # priority queue for note_on/off messages

# open the named midi output device
my $midi_out = RtMidiOut->new;
try { # this will die on Windows but is needed for Mac
    $midi_out->open_virtual_port('RtMidiOut');
}
catch ($e) {}
$midi_out->open_port_by_name(qr/\Q$name/i);

$SIG{INT} = sub { # halt gracefully
    say "\nStop";
    try {
        $midi_out->stop; # stop the sequencer
        $midi_out->panic; # make sure all notes are off
    }
    catch ($e) {
        warn "Can't halt the MIDI out device: $e\n";
    }
    exit;
};

# for computing the pattern
my $mcr = Music::CreatingRhythms->new;

my $loop = IO::Async::Loop->new;

my $timer = IO::Async::Timer::Periodic->new(
    interval => $clock_interval,
    on_tick  => sub {
        $midi_out->clock;
        $ticks++;
        if ($ticks % $sixteenth == 0) {
            # adjust the drum pattern every 4th measure
            if ($beat_count % ($beats * $divisions) == 0) {
                adjust_drums($mcr, $drums, \%primes, \$toggle);
            }
            # add simultaneous drums to the queue
            for my $drum (keys %$drums) {
                if ($drums->{$drum}{pat}[ $beat_count % $beats ]) {
                    push @queue, { drum => $drum, velocity => 127 };
                }
            }
            # play the queue
            for my $drum (@queue) {
                $midi_out->note_on(
                    $drums->{ $drum->{drum} }{chan},
                    $drums->{ $drum->{drum} }{num},
                    $drum->{velocity}
                );
            }
            $beat_count++;
        }
        else {
            # drain the queue with note_off messages
            while (my $drum = pop @queue) {
                $midi_out->note_off(
                    $drums->{ $drum->{drum} }{chan},
                    $drums->{ $drum->{drum} }{num},
                    0
                );
            }
            @queue = (); # ensure the queue is empty
        }
    },
);
$timer->start;

$loop->add($timer);
$loop->run;

sub adjust_drums($mcr, $drums, $primes, $toggle) {
    # choose random primes to use by the hihat, kick, and snare
    my ($p, $q, $r) = map { $primes->{$_}[ int rand $primes->{$_}->@* ] } sort keys %$primes;
    if ($$toggle == 0) {
        say 'part A';
        $drums->{hihat}{pat} = $mcr->euclid($p, $beats);
        $drums->{kick}{pat}  = $mcr->euclid($q, $beats);
        $drums->{snare}{pat} = $mcr->rotate_n($r, $mcr->euclid(2, $beats));
        $$toggle = 1; # set to part B
    }
    else {
        say 'part B';
        $drums->{hihat}{pat} = $mcr->euclid($p, $beats);
        $drums->{kick}{pat}  = [qw(1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1)];
        $drums->{snare}{pat} = [qw(0 0 0 0 1 0 0 0 0 0 0 0 1 0 1 0)];
        $$toggle = 0; # set to part A
    }
}

(You may notice the inefficiency of attempting to drain an empty queue 23 times every 16th note. Oof! Fortunately, this doesn’t fire anything other than a single while loop condition. A more efficient solution would be to only drain the queue once, but this requires a bit more complexity that we won’t be adding, for brevity’s sake.)

On Windows, this works fine:

perl clocked-euclidean-drums.pl "gs wavetable" 90

To run with fluidsynth and hear the General MIDI percussion sounds, open a fresh new terminal session, and start up fluidsynth like so (mac syntax):

fluidsynth -a coreaudio -m coremidi -g 2.0 ~/Music/soundfont/FluidR3_GM.sf2

The FluidR3_GM.sf2 is a MIDI “soundfont” file and can be downloaded for free.

Next, enter this on the command-line (back in the previous terminal session):

perl clocked-euclidean-drums.pl fluid 90

You will hear standard kick, snare, and closed hihat cymbal. And here is a poor recording of this with my phone:

To run the code with my multi-timbral drum machine, I enter this on the command-line:

perl clocked-euclidean-drums.pl usb 90 -1

And here is what that sounds like:

The Module

I have coded this logic, and a bit more, into a friendly CPAN module. Check out the eg/euclidean.pl example program in the distribution. It is a work in progress. YMMV.

Credits

Thank you to Andrew Rodland (hobbs), who helped me wrap my head around the “no-sleeping asynchronous” algorithm.

To-do Challenges

  • Make patterns other than prime number based Euclidean phrases.

  • Toggle more than two groove parts.

  • Add snare fills to the (end of the) 4th bars. (here’s my version)

  • Make this code handle odd meter grooves.

Resources

Every month, I write a newsletter which (among other things) discusses some of the technical projects I’ve been working on. It’s a useful exercise — partly as a record for other people, but mostly as a way for me to remember what I’ve actually done.

Because, as I’m sure you’ve noticed, it’s very easy to forget.

So this month, I decided to automate it.

(And, if you’re interested in the end result, this is also a good excuse to mention that the newsletter exists. Two birds, one stone.)


The Problem

All of my Git repositories live somewhere under /home/dave/git. Over time, that’s become… less organised than it might be. Some repos are directly under that directory, others are buried a couple of levels down, and I’m fairly sure there are a few I’ve completely forgotten about.

What I wanted was:

  • Given a month and a year
  • Find all Git repositories under that directory
  • Identify which ones had commits in that month
  • Summarise the work done in each repo

The first three are straightforward enough. The last one is where things get interesting.


Finding the Repositories

The first step is walking the directory tree and finding .git directories. This is a classic Perl task — File::Find still does exactly what you need.

use v5.40;
use File::Find;

sub find_repos ($root) {
  my @repos;

  find(
    sub {
      return unless $_ eq '.git';
      push @repos, $File::Find::dir;
    },
    $root
  );

  return @repos;
}

This gives us a list of repository directories to inspect. It’s simple, robust, and doesn’t require any external dependencies.

(There are, of course, other ways to do this — you could shell out to fd or find, for example — but keeping it in Perl keeps everything nicely self-contained.)
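To see it in action, here is a self-contained check. The `alpha` and `beta` directory names are made up for the demo, and `find_repos` is repeated from above so the snippet runs standalone:

```perl
use v5.40;
use File::Find;
use File::Path qw(make_path);
use File::Temp qw(tempdir);

# Same find_repos as above, repeated so this snippet runs standalone.
sub find_repos ($root) {
  my @repos;

  find(
    sub {
      return unless $_ eq '.git';
      push @repos, $File::Find::dir;
    },
    $root
  );

  return @repos;
}

# Build a throwaway tree containing two fake repositories.
my $root = tempdir( CLEANUP => 1 );
make_path("$root/projects/alpha/.git", "$root/beta/.git");

my @repos = sort { $a cmp $b } find_repos($root);
say for @repos;    # "$root/beta", then "$root/projects/alpha"
```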


Getting Commits for a Month

For each repo, we can run git log with appropriate date filters.

sub commits_for_month ($repo, $since, $until) {
  my $cmd = sprintf(
    q{git -C %s log --since="%s" --until="%s" --pretty=format:"%%s"},
    $repo, $since, $until
  );

  my @commits = `$cmd`;
  chomp @commits;

  return @commits;
}

Where $since and $until define the month we’re interested in. I’ve been using something like:

my $since = "$year-$month-01";
my $until = "$year-$month-31"; # good enough for this purpose

Yes, that’s a bit hand-wavy around month lengths. No, it doesn’t matter in practice. Sometimes “good enough” really is good enough.
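If you ever do want the real month end, the core Time::Piece module gets it right with no special cases. A sketch (the `month_end` name is mine, not the script’s):

```perl
use v5.40;
use Time::Piece;
use Time::Seconds qw(ONE_DAY);

# Last day of a month: jump to the first day of the *next* month,
# then step back one day. Leap years come out right for free.
sub month_end ($year, $month) {
  ($year, $month) = $month == 12 ? ($year + 1, 1) : ($year, $month + 1);
  my $first_of_next =
    Time::Piece->strptime(sprintf('%04d-%02d-01', $year, $month), '%Y-%m-%d');
  return ($first_of_next - ONE_DAY)->ymd;
}

say month_end(2024, 2);    # 2024-02-29
```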


A Small Gotcha

It turns out I have a few repositories where I never got around to making a first commit. In that case, git log helpfully explodes with:

fatal: your current branch 'master' does not have any commits yet

Which is fair enough — but not helpful in a script that’s supposed to quietly churn through dozens of repositories.

The fix is simply to ignore failures:

my @commits = `$cmd 2>/dev/null`;

If there are no commits, we just get an empty list and move on. No warnings, no noise.

This is one of those little bits of defensive programming that makes the difference between a script you run once and a script you’re happy to run every month.
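If shell quoting ever becomes a concern (a repository path with a space in it, say), the backticks can be swapped for the list form of open, which bypasses the shell entirely. This is a sketch, not the article’s code, and note that it no longer redirects git’s stderr:

```perl
use v5.40;

# Run git log without a shell, so $repo needs no quoting or escaping.
# (Unlike the backtick version, stderr is not redirected here.)
sub commits_for_month ($repo, $since, $until) {
  open my $fh, '-|', 'git', '-C', $repo, 'log',
      "--since=$since", "--until=$until", '--pretty=format:%s'
    or return;
  my @commits = <$fh>;
  close $fh;
  chomp @commits;
  return @commits;
}
```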


Summarising the Work

Once we have a list of commit messages, we can summarise them.

And this is where I cheated slightly.

I used OpenAPI::Client::OpenAI to feed the commit messages into an LLM and ask it to produce a short summary.

Something along these lines:

use OpenAPI::Client::OpenAI;

sub summarise_commits ($commits) {
  # The constructor picks up $ENV{OPENAI_API_KEY} itself
  my $client = OpenAPI::Client::OpenAI->new;

  my $text = join "\n", @$commits;

  my $tx = $client->createChatCompletion({
    body => {
      model    => 'gpt-4.1-mini',
      messages => [{
        role    => 'user',
        content => "Summarise the following commit messages:\n\n$text",
      }],
    },
  });

  return $tx->res->json->{choices}[0]{message}{content};
}

Is this overkill? Almost certainly.

Could I have written some heuristics to group and summarise commit messages? Possibly.

Would it have been as much fun? Definitely not.

And in practice, it works remarkably well. Even messy, inconsistent commit messages tend to turn into something that looks like a coherent summary of work.


Putting It Together

For each repo:

  1. Get commits for the month
  2. Skip if there are none
  3. Generate a summary
  4. Print the repo name and summary
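Glued together, that loop looks something like this. It is a self-contained sketch: the first three subs are trivial stand-ins for the real `find_repos`, `commits_for_month` and `summarise_commits` shown earlier, and the paths and dates are examples.

```perl
use v5.40;
use File::Basename qw(basename);

# Stand-ins for the real subroutines, just so the glue below runs.
sub find_repos ($root)                        { ("$root/my-project") }
sub commits_for_month ($repo, $since, $until) { ('Add caching', 'Fix edge cases') }
sub summarise_commits ($commits)              { join '; ', @$commits }

sub report ($root, $since, $until) {
  my $out = '';
  for my $repo (find_repos($root)) {
    my @commits = commits_for_month($repo, $since, $until);
    next unless @commits;    # skip repos with no activity this month

    my $name = basename($repo);
    $out .= $name . "\n"
          . ('-' x length $name) . "\n"
          . summarise_commits(\@commits) . "\n\n";
  }
  return $out;
}

print report('/home/dave/git', '2026-04-01', '2026-04-31');
```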

The output looks something like:

my-project
-----------
Refactored database layer, added caching, and fixed several edge-case bugs.

another-project
---------------
Initial scaffolding, basic API endpoints, and deployment configuration.

Which is already a pretty good starting point for a newsletter.


A Nice Side Effect

One unexpected benefit of this approach is that it surfaces projects I’d forgotten about.

Because the script walks the entire directory tree, it finds everything — including half-finished experiments, abandoned ideas, and repos I created at 11pm and never touched again.

Sometimes that’s useful. Sometimes it’s mildly embarrassing.

But it’s always interesting.


What Next?

This is very much a first draft.

It works, but it’s currently a script glued together with shell commands and assumptions about my directory structure. The obvious next step is to:

  • Turn it into a proper module
  • Add tests
  • Clean up the API
  • Release it to CPAN

At that point, it becomes something other people might actually want to use — not just a personal tool with hard-coded paths and questionable date handling.


A Future Enhancement

One idea I particularly like is to run this automatically using GitHub Actions.

For example:

  • Run monthly
  • Generate summaries for that month
  • Commit the results to a repository
  • Publish them via GitHub Pages

Over time, that would build up a permanent, browsable record of what I’ve been working on.

It’s a nice combination of:

  • automation
  • documentation
  • and a gentle nudge towards accountability

Which is either a fascinating historical archive…

…or a slightly alarming reminder of how many half-finished projects I have.


Closing Thoughts

This started as a small piece of automation to help me write a newsletter. But it’s turned into a nice example of what Perl is still very good at:

  • Gluing systems together
  • Wrapping command-line tools
  • Handling messy real-world data
  • Adding just enough intelligence to make the output useful

And, occasionally, outsourcing the hard thinking to a machine.

The code (such as it currently is) is on GitHub at https://github.com/davorg/git-month-summary.

If you’re interested in the kind of projects this helps summarise, you can find my monthly newsletter over on Substack.

And if I get round to turning this into a CPAN module, I’ll let you know – well, if you’re subscribed to the newsletter!

The post Summarising a Month of Git Activity with Perl (and a Little Help from AI) first appeared on Perl Hacks.

Updates for great CPAN modules released last week. A module is considered great if its favorites count is greater than or equal to 12.

  1. App::DBBrowser - Browse SQLite/MySQL/PostgreSQL databases and their tables interactively.
    • Version: 2.440 on 2026-04-11, with 18 votes
    • Previous CPAN version: 2.439 was released 1 month, 16 days before
    • Author: KUERBIS
  2. Attean - A Semantic Web Framework
    • Version: 0.036 on 2026-04-06, with 19 votes
    • Previous CPAN version: 0.035_01 was released the same day
    • Author: GWILLIAMS
  3. Bio::EnsEMBL - Bio::EnsEMBL - Ensembl Core API
    • Version: 114.0.0 on 2026-04-07, with 83 votes
    • Previous CPAN version: 114.0.0_50 was released 12 days before
    • Author: TAMARAEN
  4. CPANSA::DB - the CPAN Security Advisory data as a Perl data structure, mostly for CPAN::Audit
    • Version: 20260405.001 on 2026-04-05, with 25 votes
    • Previous CPAN version: 20260329.001 was released 6 days before
    • Author: BRIANDFOY
  5. Exporter - Implements default import method for modules
    • Version: 5.79 on 2026-04-06, with 28 votes
    • Previous CPAN version: 5.78 was released 2 years, 3 months, 6 days before
    • Author: TODDR
  6. Image::ExifTool - Read and write meta information
    • Version: 13.55 on 2026-04-07, with 44 votes
    • Previous CPAN version: 13.50 was released 2 months before
    • Author: EXIFTOOL
  7. JSON::Schema::Modern - Validate data against a schema using a JSON Schema
    • Version: 0.637 on 2026-04-08, with 16 votes
    • Previous CPAN version: 0.636 was released the same day
    • Author: ETHER
  8. Mail::Box - complete E-mail handling suite
    • Version: 4.02 on 2026-04-10, with 16 votes
    • Previous CPAN version: 4.01 was released 3 months, 28 days before
    • Author: MARKOV
  9. PDL - Perl Data Language
    • Version: 2.104 on 2026-04-08, with 102 votes
    • Previous CPAN version: 2.103 was released 1 month, 5 days before
    • Author: ETJ
  10. Pod::Simple - framework for parsing Pod
    • Version: 3.48 on 2026-04-05, with 20 votes
    • Previous CPAN version: 3.48 was released the same day
    • Author: KHW
  11. SPVM - The SPVM Language
    • Version: 0.990156 on 2026-04-08, with 36 votes
    • Previous CPAN version: 0.990155 was released 1 day before
    • Author: KIMOTO
  12. Term::Choose - Choose items from a list interactively.
    • Version: 1.782 on 2026-04-09, with 15 votes
    • Previous CPAN version: 1.781 was released 15 days before
    • Author: KUERBIS
  13. Test2::Harness - A new and improved test harness with better Test2 integration.
    • Version: 1.000170 on 2026-04-10, with 28 votes
    • Previous CPAN version: 1.000169 was released 1 day before
    • Author: EXODIST

Many years ago I wrote a monitoring tool that collects data using RRD (actually using the Perl module RRDs). The OS at that time was SLES 10 (rrdtool 1.2012 and rrdtool 1.3007).

Since upgrading to SLES 15 SP6 (perl-rrdtool-1.8.0-150600.1.4.x86_64) I see that my TICK boxes overlap their labels.

The bottom of the graphs look like this:

Screenshot showing bottom of incorrect RRD graph

The purpose of the TICKs is just a kind of "color legend" for a bar graph (green/yellow/orange/red) classifying a status not shown.

The part responsible for drawing that bottom part looks like this (the @graph_args array from RRDs::graph($fname, @settings, @graph_args);):

...
'COMMENT:    Last\\t Minimum\\t Average\\t Maximum\\t  Sample\\n'
'GPRINT:S_A:LAST:  %6.2lf\\t\\g'
'GPRINT:V_S_I:  %6.2lf\\t\\g'
'GPRINT:V_S_A:  %6.2lf\\t\\g'
'GPRINT:V_S_X:  %6.2lf\\t\\g'
'LINE:S_I'
'AREA:S_R#F061::STACK'
'LINE1:S_I#F066'
'LINE1:S_X#F066'
'LINE1:S_A#F06:System  \\n'
'GPRINT:P_A:LAST:  %6.2lf\\t\\g'
'GPRINT:V_P_I:  %6.2lf\\t\\g'
'GPRINT:V_P_A:  %6.2lf\\t\\g'
'GPRINT:V_P_X:  %6.2lf\\t\\g'
'LINE:P_I'
'AREA:P_R#C601::STACK'
'LINE1:P_I#C606'
'LINE1:P_X#C606'
'LINE1:P_A#C60:Peer    \\n'
'GPRINT:C_A:LAST:  %6.2lf\\t\\g'
'GPRINT:V_C_I:  %6.2lf\\t\\g'
'GPRINT:V_C_A:  %6.2lf\\t\\g'
'GPRINT:V_C_X:  %6.2lf\\t\\g'
'LINE:C_I'
'AREA:C_R#3F31::STACK'
'LINE1:C_I#3F36'
'LINE1:C_X#3F36'
'LINE1:C_A#3F3:Clock   \\n'
'GPRINT:A_A:LAST:  %6.2lf\\t\\g'
'GPRINT:V_A_I:  %6.2lf\\t\\g'
'GPRINT:V_A_A:  %6.2lf\\t\\g'
'GPRINT:V_A_X:  %6.2lf\\t\\g'
'LINE:A_I'
'AREA:A_R#06F1::STACK'
'LINE1:A_I#06F6'
'LINE1:A_X#06F6'
'LINE1:A_A#06F:Total   \\n'
'TICK:S0X#F00C:0.06'
'TICK:S1X#F90C:0.06'
'TICK:S2X#FF0C:0.06'
'TICK:S3X#0C09:0.06'
'TICK:S0A#F00C:0.04:Failed'
'TICK:S1A#F90C:0.04:Bad'
'TICK:S2A#FF0C:0.04:Marginal'
'TICK:S3A#0C09:0.04:Good'
'TICK:S0I#F00C:0.02'
'TICK:S1I#F90C:0.02'
'TICK:S2I#FF0C:0.02'
'TICK:S3I#0C09:0.02'

(The command is assembled in some complex way, so showing that code would just distract from the actual problem. Therefore I used the Perl debugger to extract the essential settings)

Here is an example of how the bottom of the graph looked with the older version of rrdtool (the calling code had not been changed):

bottom of rrdtool graph showing correct TICKs

The manual claims the syntax for TICKs still is:

TICK:vname#rrggbb[aa][:fraction[:legend]]

Package Versions

As the issue may be outside of rrdtool, I'll add more package versions:

> rpm -qa rrdtool librrd libpango\*
libpango-1_0-0-32bit-1.51.1-150600.1.3.x86_64
libpangomm-1_4-1-2.46.3-150600.1.2.x86_64
libpango-1_0-0-1.51.1-150600.1.3.x86_64
rrdtool-1.8.0-150600.1.4.x86_64

I saw that rrdtool-1.9.0-3.fc42.x86_64 in Fedora 42 has the same issue:

$ rpm -qa rrdtool rrd\* pango\* cairo\*
cairo-1.18.2-3.fc42.x86_64
cairo-gobject-1.18.2-3.fc42.x86_64
cairomm-1.14.5-8.fc42.x86_64
cairomm1.16-1.18.0-8.fc42.x86_64
pangomm-2.46.4-3.fc42.x86_64
pangomm2.48-2.56.1-1.fc42.x86_64
pango-1.56.4-2.fc42.x86_64
rrdtool-1.9.0-3.fc42.x86_64

Simplified test case

I managed to create a simplified test case:

First the sample RRD database as dump:

<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE rrd SYSTEM "https://oss.oetiker.ch/rrdtool/rrdtool.dtd">
<!-- Round Robin Database Dump -->
<rrd>
        <version>0003</version>
        <step>60</step> <!-- Seconds -->
        <lastupdate>1751958726</lastupdate> <!-- 2025-07-08 09:12:06 CEST -->

        <ds>
                <name> N_SYS </name>
                <type> GAUGE </type>
                <minimal_heartbeat>105</minimal_heartbeat>
                <min>0.0000000000e+00</min>
                <max>NaN</max>

                <!-- PDP Status -->
                <last_ds>1</last_ds>
                <value>6.0000000000e+00</value>
                <unknown_sec> 0 </unknown_sec>
        </ds>

        <ds>
                <name> N_PEER </name>
                <type> GAUGE </type>
                <minimal_heartbeat>105</minimal_heartbeat>
                <min>0.0000000000e+00</min>
                <max>NaN</max>

                <!-- PDP Status -->
                <last_ds>0.7</last_ds>
                <value>4.2000000000e+00</value>
                <unknown_sec> 0 </unknown_sec>
        </ds>

        <ds>
                <name> N_ALL </name>
                <type> GAUGE </type>
                <minimal_heartbeat>105</minimal_heartbeat>
                <min>0.0000000000e+00</min>
                <max>NaN</max>

                <!-- PDP Status -->
                <last_ds>0.769230769230769</last_ds>
                <value>4.6153846154e+00</value>
                <unknown_sec> 0 </unknown_sec>
        </ds>

        <!-- Round Robin Archives -->
        <rra>
                <cf>AVERAGE</cf>
                <pdp_per_row>1</pdp_per_row> <!-- 60 seconds -->

                <params>
                <xff>9.0000000000e-01</xff>
                </params>
                <cdp_prep>
                        <ds>
                        <primary_value>1.0000000000e+00</primary_value>
                        <secondary_value>1.0000000000e+00</secondary_value>
                        <value>NaN</value>
                        <unknown_datapoints>0</unknown_datapoints>
                        </ds>
                        <ds>
                        <primary_value>7.0000000000e-01</primary_value>
                        <secondary_value>6.0000000000e-01</secondary_value>
                        <value>NaN</value>
                        <unknown_datapoints>0</unknown_datapoints>
                        </ds>
                        <ds>
                        <primary_value>7.6923076923e-01</primary_value>
                        <secondary_value>6.9230769231e-01</secondary_value>
                        <value>NaN</value>
                        <unknown_datapoints>0</unknown_datapoints>
                        </ds>
                </cdp_prep>
                <database>
                        <!-- 2025-07-08 09:12:00 CEST / 1751958720 --> <row><v>1.0000000000e+00</v><v>8.0000000000e-01</v><v>8.4615384615e-01</v></row>
                </database>
        </rra>
</rrd>

The commands to create a demo.png:

$ rrdtool restore /tmp/demo.dump /tmp/demo.rrd
$ rrdtool graph /tmp/demo.png 'DEF:S_A=/tmp/demo.rrd:N_SYS:AVERAGE' 'CDEF:S1A=S_A,0.30,GT,S_A,UNKN,IF' 'TICK:S1A#F90C:0.04:Bad'

This results in this image: Output of test case

We're avid Perl programmers, but we have been really wanting to get into Haskell or Erlang or something similar, though we don't know where to start. Any languages you guys recommend? If so, send some good tutorials [or give us a rundown yourself :>]

We must add that we're looking for pure-ish functional languages. Lisp syntax doesn't really sit right with us either, so we don't really wish to use those.

edit [2026-04-05]: clarified why we don't accept lisp languages as suggestions

One can define a lexically-scoped sub as follows:

use feature qw( lexical_subs );

{
   my sub foo {
      ...
   }

   # This code can call `foo`.
}

# This code can't.

Is there a way of building a lexically-scoped sub from a reference?


I thought the (currently-experimental) refaliasing feature might be the way.

use feature qw( lexical_subs refaliasing declared_refs );

my \&foo = sub { ... };

This doesn't work, nor does any variant I tried.

TL;DR

Searching for CLI modules on MetaCPAN returns 1690 results. And still I wrote another: Yet Another CLI framework. This is about why mine is the one you want to use.

Introduction

CLI clients: we all write them, we all want them, but the boilerplate is just horrible. I wanted to get rid of it. Burn it with 🔥.

At my previous day job we had something I wrote, or co-wrote; my boss wanted a sort of chef-like API. It became zsknife, but it had a huge downside: you needed to register each command in the main module, and I didn’t like that at all. I forked the concept internally and made it so you didn’t need to register, but it lacked discovery.

Updates for great CPAN modules released last week. A module is considered great if its favorites count is greater than or equal to 12.

  1. App::Staticperl - perl, libc, 100 modules, all in one standalone 500kb file
    • Version: 1.5 on 2026-04-04, with 21 votes
    • Previous CPAN version: 1.46 was released 4 years, 1 month, 16 days before
    • Author: MLEHMANN
  2. Catalyst::Action::REST - Automated REST Method Dispatching
    • Version: 1.22 on 2026-03-30, with 13 votes
    • Previous CPAN version: 1.21 was released 8 years, 3 months, 25 days before
    • Author: ETHER
  3. CPANSA::DB - the CPAN Security Advisory data as a Perl data structure, mostly for CPAN::Audit
    • Version: 20260329.001 on 2026-03-29, with 25 votes
    • Previous CPAN version: 20260327.002 was released 1 day before
    • Author: BRIANDFOY
  4. Devel::NYTProf - Powerful fast feature-rich Perl source code profiler
    • Version: 6.15 on 2026-03-31, with 199 votes
    • Previous CPAN version: 6.14 was released 2 years, 5 months, 12 days before
    • Author: JKEENAN
  5. Devel::Size - Perl extension for finding the memory usage of Perl variables
    • Version: 0.87 on 2026-03-31, with 22 votes
    • Previous CPAN version: 0.86_50 was released 1 month, 19 days before
    • Author: NWCLARK
  6. Dios - Declarative Inside-Out Syntax
    • Version: 0.002014 on 2026-04-01, with 24 votes
    • Previous CPAN version: 0.002013 was released 1 year, 7 months, 14 days before
    • Author: DCONWAY
  7. Inline::Module - Support for Inline-based CPAN Extension Modules
    • Version: 0.35 on 2026-03-30, with 14 votes
    • Previous CPAN version: 0.34 was released 11 years, 1 month, 12 days before
    • Author: INGY
  8. IPC::Run - system() and background procs w/ piping, redirs, ptys (Unix, Win32)
    • Version: 20260402.0 on 2026-04-02, with 39 votes
    • Previous CPAN version: 20260401.0 was released the same day
    • Author: TODDR
  9. LWP - The World-Wide Web library for Perl
    • Version: 6.82 on 2026-03-29, with 212 votes
    • Previous CPAN version: 6.81 was released 5 months, 6 days before
    • Author: OALDERS
  10. Module::CoreList - what modules shipped with versions of perl
    • Version: 5.20260330 on 2026-03-29, with 45 votes
    • Previous CPAN version: 5.20260320 was released 8 days before
    • Author: BINGOS
  11. Module::Metadata - Gather package and POD information from perl module files
    • Version: 1.000039 on 2026-04-03, with 14 votes
    • Previous CPAN version: 1.000038 was released 2 years, 11 months, 5 days before
    • Author: ETHER
  12. Mouse - Moose minus the antlers
    • Version: v2.6.2 on 2026-04-04, with 63 votes
    • Previous CPAN version: v2.6.1 was released 3 months, 14 days before
    • Author: SYOHEX
  13. perl - The Perl 5 language interpreter
    • Version: 5.042002 on 2026-03-29, with 2251 votes
    • Previous CPAN version: 5.042001 was released 21 days before
    • Author: SHAY
  14. Pod::Simple - framework for parsing Pod
    • Version: 3.48 on 2026-04-04, with 20 votes
    • Previous CPAN version: 3.47 was released 10 months, 19 days before
    • Author: KHW
  15. Sidef - The Sidef Programming Language - A modern, high-level programming language
    • Version: 26.04 on 2026-04-01, with 122 votes
    • Previous CPAN version: 26.01 was released 2 months, 18 days before
    • Author: TRIZEN
  16. SPVM - The SPVM Language
    • Version: 0.990153 on 2026-03-28, with 36 votes
    • Previous CPAN version: 0.990152 was released 2 days before
    • Author: KIMOTO
  17. Sys::Virt - libvirt Perl API
    • Version: v12.2.0 on 2026-04-01, with 17 votes
    • Previous CPAN version: v12.1.0 was released 29 days before
    • Author: DANBERR
  18. Test2::Harness - A new and improved test harness with better Test2 integration.
    • Version: 1.000164 on 2026-04-01, with 28 votes
    • Previous CPAN version: 1.000164 was released the same day
    • Author: EXODIST
  19. WebService::Fastly - an interface to most facets of the [Fastly API](https://www.fastly.com/documentation/reference/api/).
    • Version: 14.01 on 2026-03-31, with 18 votes
    • Previous CPAN version: 14.00 was released 1 month, 14 days before
    • Author: FASTLY
  20. YAML::Syck - Fast, lightweight YAML loader and dumper
    • Version: 1.44 on 2026-04-02, with 18 votes
    • Previous CPAN version: 1.43 was released 1 day before
    • Author: TODDR

This is the weekly favourites list of CPAN distributions. Votes count: 65

Week's winner: App::FatPacker (+2)

Build date: 2026/04/04 19:54:08 GMT


Clicked for first time:

  • App::Chit - chat with AI from the command line
  • App::prepare4release - prepare a Perl distribution for release (skeleton)
  • Context::Preserve - Run code after a subroutine call, preserving the context the subroutine would have seen if it were the last statement in the caller
  • Data::HashMap::Shared - Type-specialized shared-memory hash maps for multiprocess access
  • DB::Handy - Pure-Perl flat-file relational database with DBI-like interface
  • EV::Websockets - WebSocket client/server using libwebsockets and EV
  • Rex::LibSSH - Rex connection backend using Net::LibSSH (no SFTP required)
  • Task::Kensho::All - Install all of Task::Kensho
  • WWW::Tracking - universal website visitors tracking

Increasing its reputation:

Living Perl: From Scripting to Geodynamics

Perl on Medium

A Different Path into Scientific Computing.

TPRF Board Announces the 2025 Annual Report

Perl Foundation News

The Board is pleased to share the 2025 Annual Report from The Perl and Raku Foundation.

You can download the full report from the Perl and Raku Foundation website

Strengthening the Foundation

2025 was a year of both challenge and progress. Like many nonprofits, the Foundation faced funding constraints that required careful prioritization of resources. At the same time, increased focus on fundraising and donor engagement helped stabilize support for the work that matters most. A number of processes and tools were overhauled, allowing the Board to manage the funding more effectively, and pay grants more promptly and at lower overhead expense than had been the case previously.

Contributions from sponsors, corporate partners, and individual donors played a critical role in sustaining operations—particularly for core development and infrastructure.

Funding What Matters Most

Financial stewardship remained a top priority throughout the year. The Foundation focused its resources on:

  • Supporting the Perl 5 Core Maintenance Fund
  • Investing in Raku development and ecosystem improvements
  • Maintaining essential infrastructure and services

While some grant activity was reduced during tighter periods, the report describes the Foundation’s recovery from those trials and outlines a clear path toward expanding funding as donations grow.

Our total income for the year was $253,744.86, with total expenditures of $233,739.75. 92% of our spending supported grants, events, and infrastructure. Our largest single expenditure remains the Perl Core Maintenance Grants, one of the long-time pillars of the Foundation's programs.

A Community-Funded Future

The Foundation’s work is made possible by the community it serves. Every donation—whether from individuals or organizations—directly supports the developers, tools, and systems that keep Perl and Raku reliable and evolving.

In 2025, we also strengthened our fundraising efforts, building a more sustainable base of recurring and long-term support to ensure continuity in the years ahead.

Looking Ahead

Our focus for the coming year is clear:

  • Grow recurring donations and sponsorships
  • Restore and expand the grants program
  • Continue developing transparent, responsible financial management

We’re grateful to everyone who contributed in 2025. Your support keeps the ecosystem strong.

If you rely on Perl or Raku, we encourage you to take part in sustaining them. Your support is always welcome!

Writing a TOON Module for Perl

Perl Hacks

Every so often, a new data serialisation format appears and people get excited about it. Recently, one of those formats is **TOON** — Token-Oriented Object Notation. As the name suggests, it’s another way of representing the same kinds of data structures that you’d normally store in JSON or YAML: hashes, arrays, strings, numbers, booleans and nulls.

So the obvious Perl question is: *“Ok, where’s the CPAN module?”*

This post explains what TOON is, why some people think it’s useful, and why I decided to write a Perl module for it — with an interface that should feel very familiar to anyone who has used JSON.pm.

I should point out that I knew about [Data::Toon](https://metacpan.org/pod/Data::TOON) but I wanted something with an interface that was more like JSON.pm.

## What TOON Is

TOON stands for **Token-Oriented Object Notation**. It’s a textual format for representing structured data — the same data model as JSON:

* Objects (hashes)
* Arrays
* Strings
* Numbers
* Booleans
* Null

The idea behind TOON is that it is designed to be **easy for both humans and language models to read and write**. It tries to reduce punctuation noise and make the structure of data clearer.

If you think of the landscape like this:

| Format | Human-friendly | Machine-friendly | Very common |
| ------ | -------------- | ---------------- | ----------- |
| JSON   | Medium         | Very             | Yes         |
| YAML   | High           | Medium           | Yes         |
| TOON   | High           | High             | Not yet     |

TOON is trying to sit in the middle: simpler than YAML, more readable than JSON.

Whether it succeeds at that is a matter of taste — but it’s an interesting idea.

## TOON vs JSON vs YAML

It’s probably easiest to understand TOON by comparing it to JSON and YAML. Here’s the same “person” record written in all three formats.

### JSON

{
  "name": "Arthur Dent",
  "age": 42,
  "email": "arthur@example.com",
  "alive": true,
  "address": {
    "street": "High Street",
    "city": "Guildford"
  },
  "phones": [
    "01234 567890",
    "07700 900123"
  ]
}

### YAML

name: Arthur Dent
age: 42
email: arthur@example.com
alive: true
address:
  street: High Street
  city: Guildford
phones:
  - 01234 567890
  - 07700 900123

### TOON

name: Arthur Dent
age: 42
email: arthur@example.com
alive: true
address:
  street: High Street
  city: Guildford
phones[2]: 01234 567890,07700 900123

You can see that TOON sits somewhere between JSON and YAML:

* Less punctuation and quoting than JSON
* More explicit structure than YAML
* Still very easy to parse
* Still clearly structured for machines

That’s the idea, anyway.

## Why People Think TOON Is Useful

The current interest in TOON is largely driven by AI/LLM workflows.

People are using it because:

1. It is easier for humans to read than JSON.
2. It is less ambiguous and complex than YAML.
3. It maps cleanly to the JSON data model.
4. It is relatively easy to parse.
5. It works well in prompts and generated output.

In other words, it’s not trying to replace JSON for APIs, and it’s not trying to replace YAML for configuration files. It’s aiming at the space where humans and machines are collaborating on structured data.

You may or may not buy that argument — but it’s an interesting niche.

## Why I Wrote a Perl Module

I don’t have particularly strong opinions about TOON as a format. It might take off, it might not. We’ve seen plenty of “next big data format” ideas over the years.

But what I *do* have a strong opinion about is this:

> If a data format exists, then Perl should have a CPAN module for it that works the way Perl programmers expect.

Perl already has very good, very consistent interfaces for data serialisation:

* JSON
* YAML
* Storable
* Sereal

They all tend to follow the same pattern, particularly the object-oriented interface:

use JSON;
my $json = JSON->new->pretty->canonical;
my $text = $json->encode($data);
my $data = $json->decode($text);

So I wanted a TOON module that worked the same way.

## Design Goals

When designing the module, I had a few simple goals.

### 1. Familiar OO Interface

The primary interface should be object-oriented and feel like JSON.pm:

use TOON;
my $toon = TOON->new
               ->pretty
               ->canonical
               ->indent(2);
my $text = $toon->encode($data);
my $data = $toon->decode($text);

If you already know JSON, you already know how to use TOON.

There are also convenience functions, but the OO interface is the main one.

### 2. Pure Perl Implementation

Version 0.001 is pure Perl. That means:

* Easy to install
* No compiler required
* Works everywhere Perl works

If TOON becomes popular and performance matters, someone can always write an XS backend later.

### 3. Clean Separation of Components

Internally, the module is split into:

* **Tokenizer** – turns text into tokens
* **Parser** – turns tokens into Perl data structures
* **Emitter** – turns Perl data structures into TOON text
* **Error handling** – reports line/column errors cleanly

This makes it easier to test and maintain.

### 4. Do the Simple Things Well First

Version 0.001 supports:

* Scalars
* Arrayrefs
* Hashrefs
* undef → null
* Pretty printing
* Canonical key ordering

It does **not** (yet) try to serialise blessed objects or do anything clever. That can come later if people actually want it.

## Example Usage (OO Style)

Here’s a simple Perl data structure:

my $data = {
  name   => "Arthur Dent",
  age    => 42,
  drinks => [ "tea", "coffee" ],
  alive  => 1,
};

### Encoding

use TOON;
my $toon = TOON->new->pretty->canonical;
my $text = $toon->encode($data);
print $text;

### Decoding

use TOON;
my $toon = TOON->new;
my $data = $toon->decode($text);
print $data->{name};

### Convenience Functions

use TOON qw(encode_toon decode_toon);
my $text = encode_toon($data);
my $data = decode_toon($text);

But the OO interface is where most of the flexibility lives.

## Command Line Tool

There’s also a command-line tool, toon_pp, similar to json_pp:

cat data.toon | toon_pp

Which will pretty-print TOON data.

## Final Thoughts

I don’t know whether TOON will become widely used. Predicting the success of data formats is a fool’s game. But the cost of supporting it in Perl is low, and the potential usefulness is high enough to make it worth doing.

And fundamentally, this is how CPAN has always worked:

> See a problem. Write a module. Upload it. See if anyone else finds it useful.

So now Perl has a TOON module. And if you already know how to use JSON.pm, you already know how to use it.

That was the goal.

The post Writing a TOON Module for Perl first appeared on Perl Hacks.

Updates for great CPAN modules released last week. A module is considered great if its favorites count is greater than or equal to 12.

  1. Clone - recursively copy Perl datatypes
    • Version: 0.50 on 2026-03-28, with 33 votes
    • Previous CPAN version: 0.49 was released 3 days before
    • Author: ATOOMIC
  2. CPANSA::DB - the CPAN Security Advisory data as a Perl data structure, mostly for CPAN::Audit
    • Version: 20260327.002 on 2026-03-27, with 25 votes
    • Previous CPAN version: 20260318.001 was released 9 days before
    • Author: BRIANDFOY
  3. DBD::Oracle - Oracle database driver for the DBI module
    • Version: 1.95 on 2026-03-24, with 33 votes
    • Previous CPAN version: 1.91_5 was released 8 days before
    • Author: ZARQUON
  4. IPC::Run - system() and background procs w/ piping, redirs, ptys (Unix, Win32)
    • Version: 20260322.0 on 2026-03-22, with 39 votes
    • Previous CPAN version: 20250809.0 was released 7 months, 12 days before
    • Author: TODDR
  5. Mojo::Pg - Mojolicious 🐘 PostgreSQL
    • Version: 4.29 on 2026-03-23, with 98 votes
    • Previous CPAN version: 4.28 was released 5 months, 23 days before
    • Author: SRI
  6. Object::Pad - a simple syntax for lexical field-based objects
    • Version: 0.825 on 2026-03-25, with 48 votes
    • Previous CPAN version: 0.824 was released 1 day before
    • Author: PEVANS
  7. PDL::Stats - a collection of statistics modules in Perl Data Language, with a quick-start guide for non-PDL people.
    • Version: 0.856 on 2026-03-22, with 15 votes
    • Previous CPAN version: 0.855 was released 1 year, 16 days before
    • Author: ETJ
  8. SPVM - The SPVM Language
    • Version: 0.990152 on 2026-03-26, with 36 votes
    • Previous CPAN version: 0.990151 was released the same day
    • Author: KIMOTO
  9. Term::Choose - Choose items from a list interactively.
    • Version: 1.781 on 2026-03-25, with 15 votes
    • Previous CPAN version: 1.780 was released 1 month, 20 days before
    • Author: KUERBIS
  10. YAML::Syck - Fast, lightweight YAML loader and dumper
    • Version: 1.42 on 2026-03-27, with 18 votes
    • Previous CPAN version: 1.41 was released 4 days before
    • Author: TODDR

I just spent another fun and productive week in Marseille at the Koha Hackfest hosted by BibLibre. We (Mark, tadzik, and I) arrived on Sunday (the others by plane from Vienna and Poland; I came by train from Berlin via Strasbourg) and left on Friday.

There were the usual interesting discussions on all things Koha, presentations of new features, and of course a lot of socializing. And cheese, so much cheese...

Elasticsearch

On the first day there was a discussion on Elasticsearch and getting rid of Zebra (the old search engine used by Koha). Actually getting rid of Zebra is not an option (for now), because small installations won't want to set up and run Elasticsearch. But Mark proposed using our Marc Normalization Plugin as the basis for a new internal, DB-only search engine (so no need for an external index etc.), and over the course of the week (and with LLM help) he implemented a prototype. It would really be amazing if we could get this running!

I worked a bit on improving Elasticsearch indexing:

  • Bulk biblio ES index update after auth change: When merging (or updating) authorities, the Elasticsearch indexing of the linked biblios will now happen in one background job per authority instead of one background job per biblio. So an authority that is used in 100 biblios will now trigger one indexing background job with 100 biblio items instead of 100 background jobs with 1 biblio item each.
  • Zebraqueue should not be added to when only Elasticsearch is used: We added a new syspref "ElasticsearchEnableZebraQueue". If disabled, no data will be written to the zebraqueue table, because usually when using Elasticsearch you don't need to also run Zebra.

I got sign-offs and Pass-QA for both issues during the hackfest, thanks Fridolin, Paul and Baptiste (who owns the coolest tea mug at BibLibre..)

QA

I also did QA on a bunch of other issues: 22639, 35267, 36550, 39158, 40906, 41767, 41967, 42107. Some of them were of interest to me, some I did because other people nicely asked me to :-)

LLM, "AI" and Agentic Coding

This was again a hot topic, with some people using those tools to great effect, some hating them, and some in between. As in my last post on the German Perl Workshop I again want to point out this blog post: I Sold Out for $20 a Month and All I Got Was This Perfectly Generated Terraform, and during the event the post Thoughts on slowing the fuck down dropped (by Mario Zechner, who wrote the coding agent I (currently) use).

Anyway, Koha now has some guidelines on AI and LLM-assisted contributions and on using LLM features inside Koha.

Claude vs domm

While working on unit tests for Bug 40577 I struggled with a test failing only when I ran the whole test script (as opposed to only the one subtest I was working on). It seemed to be a problem with mocked tests, so I asked Joubu (who was by chance just standing next to me). Together we figured out the scoping problem: If you use Test::MockObject/MockModule multiple times on a class from different scopes, the mocked methods/functions might not automatically be removed. You have to call unmock explicitly. After the patch was done, I described the error to Claude and asked for a fix, expecting not to get anything usable. But (to my slight horror) it produced the correct explanation and fix in a very short time. On the one hand: amazing; on the other hand: very scary.
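That takeaway is worth a few lines of code. The sketch below is a minimal illustration (My::Service is invented for the example, and it assumes Test::MockModule, which is not a core module): when the same class gets mocked from more than one scope, call unmock explicitly rather than trusting scope-end cleanup alone.

```perl
use strict;
use warnings;

# Skip gracefully if Test::MockModule is not installed.
BEGIN {
    unless (eval { require Test::MockModule; 1 }) {
        print "Test::MockModule not installed; skipping\n";
        exit 0;
    }
}

# A tiny class to mock, invented for this sketch.
package My::Service;
our $VERSION = '0.01';    # keeps Test::MockModule from trying to require us
sub greet { 'real' }

package main;

{
    # First scope: a single mock is restored when $mock is destroyed.
    my $mock = Test::MockModule->new('My::Service');
    $mock->mock(greet => sub { 'mocked' });
    print My::Service::greet(), "\n";    # mocked
}

{
    # Second scope, same class: once mocks pile up across scopes,
    # relying on scope-end alone is what bit us -- restore explicitly.
    my $mock = Test::MockModule->new('My::Service');
    $mock->mock(greet => sub { 'mocked again' });
    $mock->unmock('greet');              # explicit restore
    print My::Service::greet(), "\n";    # real
}
```

unmock_all() is the bulk variant when several methods on the class were mocked.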

Other random stuff:

  • When it rains and a TGV arrives at the station, more people have the idea to take a taxi than taxis are available. So walking the short distance was necessary, but we (Katrin, who I met on the train, and me) still got wet. At least we had cold burgers...
  • Paul showed me a non-Koha tool he has written: mdv - A terminal markdown viewer with vim keybindings. Very nice, I especially like it to view checkouts of gitlab wikis!
  • I was not the only Team Scheisse fan attending! Philip++
  • Philip also pointed out the very detailed and interesting shared notes produced by various attendees during the event.
  • On my third visit to Marseille, I managed to navigate the city center quite well.
  • I finally made it to the Tangerine record store, very nice selection. I still did not let the shop owner talk me into buying a 200€ original UK pressing of Unknown Pleasures by Joy Division.
  • I did not get Moules Frites, but at least some Galette and Cidre.
  • Having been to Senegal in February, I now notice that there are a lot of places selling Yassa and Mafe in Marseille. I guess they were there last year too, I just did not see them, having never eaten Yassa or Mafe before.
  • It can get very windy in Marseille.
  • I should do it like Jake(?) and cycle (at least partly) to the next hackfest.

Thanks

Thanks to BibLibre and Paul Poulain for organizing the event, and to all the attendees for making it such a wonderful 5 days!

Looking forward to meeting you all again at the upcoming KohaCon in Karlsruhe!

Updates

  • 2026-03-03: Added link to shared notes.

building my todo list app

rjbs forgot what he was saying

For years, I’ve wanted a better daily to-do checklist. I had a good idea what I wanted from it, but I knew it was going to be a pain to produce. It didn’t have any complicated ideas, just a bunch of UI considerations, and that’s not my area of expertise, so I’ve made do with a bunch of options that were worse for me, which has led to worse outcomes. I accepted the tradeoffs, but I wasn’t thrilled. Now I’ve finally built exactly the app I wanted, and it went great. I call it, for now, NerfJournal.

Project Seven: NerfJournal

That’s right, this is another “stuff Rik did with Claude” post. This one feels like maybe the project that has had the greatest impact on me so far, and that’s in three ways: First, the tool I’ve produced is genuinely useful and I use it daily. Second, it made clear the ways in which the realm of coding easily available to me was expanded by agents. Finally, it’s been a great way to not just access but also learn those things, which I’ll write about in a follow-up post.

Anyway, the project is called NerfJournal, because it’s like a bullet journal, but wouldn’t really hurt anybody. Except me, if Hasbro decides to complain about the name.

I try to stick to a routine in setting up my work day. I have a “work diary”, a bit like Mark Dominus once wrote about, and which I got to see in practice when we last worked together. This journal is very simple. There’s a bunch of checkboxes of things I mean to do every day, and then there’s space for notes on what else I actually did. I try to add a new page to this every day, and I’ve got a Monday template and a “rest of the week” template. The Monday template includes stuff I only need to do once a week. Here’s a sample page, not filled in:

Monday agenda in Notion

You’ll see that the 6th item on the morning routine is to post to #cyr-scrum. This is the Slack channel where, every day, the Cyrus team members are each meant to post what we did the previous day and what we’re going to do today. While the Notion page includes “stuff I do every day, but might forget”, the #cyr-scrum post is generally “stuff I won’t do again once it’s done, and might need to carry over until tomorrow”.

That is: if I didn’t fill my water pitcher today, I failed, and tomorrow I’ll get a new instance of that to do. It’s not “still open”, it’s a new one, and it’s interesting (well, to me) whether I kept up a streak. On the other hand, if I post in #cyr-scrum that I’m going to complete ticket CYR-1234, but I don’t do it, I better do it tomorrow. And if I do, there’s no reason to see it again on the next day.

a scrum post

A problem here is that now I have two to-do lists. One is a real todo list that I can tick “done” on, and the other is a post in Slack that I want to refer back to, from time to time, to see whether I’m keeping up with what I said I’d do. GTD rightfully tells us that “more todo lists is worse than fewer todo lists”, generally, and I wanted fewer. But I didn’t want to make Linear tasks every day for things like “drink water”. And putting my scrum in Notion would be tedious. And CalDAV with VTODO has its own problems.

What I wanted was a single todo list that would be easy to use, and visually simple enough to just leave on my desktop for quick reference. I’d been thinking about such a program off and on (mostly off) for a year or so, and after some so-so but encouraging experiences having Claude produce SwiftUI applications for me, I thought I’d give this one a go.

The session took place over two days. After a brief false start using VTODO (well, Apple’s EventKit) as a backend, we pivoted to a custom data model and got something working. We iterated on that, adding features, fixing bugs, and tweaking the design for a good while. When I felt like it, I’d take a break to play Xbox or read a book. When I came back, Claude had not context switched. Meanwhile, I’d had time for that diffuse cognition mode to help me “think” about next steps.

The biggest shifts were about realizing that the data model was subtly wrong. This wouldn’t have been hard to fix by hand, but it would have been fiddly and boring. Instead, I said, “Here’s the new model, do it.” Claude asked some useful questions, then did it. Meanwhile, I read Wikipedia. (I also spent some time reading the Swift source code.)

As things stand now, the app seems very likely to be useful. There are a bunch of things I still want to add. Some of them, I have a good picture of how to get them. Others, I only know the general idea. In both cases, I feel confident that I can get closer to what I want without too much serious effort. Pruning development dead ends is cheap.

You can read the whole development transcript, but it’s long. Firefox says 400 pages. But it’s there in case you want to look.

Here’s the app, loaded with test data. (There’s a Perl program to spit out predictable test data which can be imported into the app for testing.)

today's todo

Here’s today’s page, and you can see what I’ve done and haven’t. At the bottom, if you squint, you might see that one of my code review tasks says “carried over - 1 day ago”, meaning that I first put it on my list yesterday, but still haven’t done it.

If we go back a while, we can see what a “finished” day looks like:

a completed page

Now I can see all the things I did, when I marked them done, their category, and so on. I’m afraid I don’t have any days logged now that show some other things that could happen: things that didn’t get done would be shown in a “not done” section, showing that they were carried over and (maybe) done four days later. Some items could be shown as abandoned – I decided not to do them or carry them over. This is useful for those “fill the water” tasks. If I didn’t do that task on Monday, then when Tuesday starts, Monday’s todo is automatically abandoned. You can see the distinction in the previous screenshot: tasks that will carry over get round ticky bubbles, but tasks that will get auto-abandoned get square ticky boxes.

This is all pretty good, but wasn’t this supposed to help with Scrum? Well, it does! There’s a menu option to generate a “mrkdwn” (Slack’s bogus Markdown for chat) version of the day’s todo list into the macOS clipboard. Then I paste that into Slack. I can configure the report (or multiple versions of a report) so it doesn’t include personal items, for example. All of that (reporting, categories, and so on) is handled in the bundle manager.

the bundle manager

The bundle manager is named for “bundles”, which are groups of tasks that I can dump onto my list with two clicks. I have one for the start of a sprint, and I have another for standard work days. I imagine that I’ll have other bundles later for things like “prepare to travel” or “conference day”. But when I click “start a new day”, I get a blank page, and I know I better start with my daily bundle.

…and one of the items on my daily bundle is “make the code review tasks”. It’s got a hyperlink (you may have noticed that todo items can have a little link icon). The hyperlink is an iterm: URI that, when clicked, prompts me to run a little Perl program. That program fetches all the GitLab and GitHub code review requests waiting on me, turns them into JSON, and passes that to another little program that turns them into todos in NerfJournal. So I click the link, click “yes, run this program”, and then a bunch of specific-to-today tasks show up. Then I mark the first task done. I am getting all my code review done daily, just about. It’s a big process improvement.
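The middle hop of that pipeline (review requests in, todos out) is easy to picture. Everything in the sketch below is invented for illustration (the real scripts are private); it only shows the shape of the hand-off, using the core JSON::PP module:

```perl
use strict;
use warnings;
use JSON::PP;

# Hypothetical review requests, as a fetcher might report them.
my @reviews = (
    { title => 'Review MR !123', url => 'https://gitlab.example/mr/123' },
    { title => 'Review PR #456', url => 'https://github.example/pr/456' },
);

# Turn each pending review into a todo item for the journal app.
my @todos = map {
    { title => $_->{title}, link => $_->{url}, category => 'code-review' }
} @reviews;

# Emit JSON for the next program in the chain to import.
print JSON::PP->new->canonical->pretty->encode(\@todos);
```

The canonical flag just keeps key order stable, which makes the output diff-friendly.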

wasn’t this post about Claude?

Well, sort of. I did all this with Claude. I described what I wanted, and I said I wanted it in SwiftUI, and Claude got to work building. I’d test, find bugs, realize that I had the wrong design, and iterate. I spent a big hunk of two days on this, and it has been a huge win. I could’ve built this on my own, for sure, but it would’ve taken weeks, at least, including “learn SwiftUI from scratch”. Possible, of course, but a much larger investment in a tool that, in the end, I might not have liked!

Is the code bad? I’m not sure. I don’t think so, but I’m not a Swift expert yet. But also: it only runs on my machine. I can see everything it does, and I can see it’s safe. I do not plan to sell it, support it, or run my business on it. Effectively, I summoned into existence a specialized tool that helps me do the job at which I am an expert, saving my expert time for expert problems. I think I will end up doing a lot of this. And hopefully I’ll pick up some new skills, as I go, from paying close attention to the new code I’m reading.

I had Claude produce a complete daily to-do checklist for me – something like a bullet journal. The results were great, and I’ve been using the program for weeks and it’s definitely helping me stay on track and keep an eye on what I’m doing. The problem was that unlike everything else I’d had Claude write, I was not competent to review this work. I felt confident it wasn’t going to cause me big problems, but what if I wanted to debug it myself? I realized there was a plausible solution waiting for me…

I gave Claude a prompt that went roughly like this:

We have built a really useful program here, and I am using it and enjoy it. Next, I would like to be able to work on it directly and to think more deeply about its architecture. The problem is that I don’t really know Swift.

I am an experienced programmer with decades of experience. I have worked primarily in Perl, but am well-versed in general programming concepts from many other different languages. It seems relevant so: I understand event-driven programming, observers, and MVC, in general.

I would like you to prepare a syllabus for me, which will help me learn SwiftUI, using NerfJournal as a working example. What might be useful here is a set of topics, starting with fundamentals and building to more complex ones, related to how the project actually works.

This began a back and forth that didn’t go on all that long. (You can read the transcript.) Claude produced a syllabus. I proposed that we turn the project into a website. We fought with Jekyll for a while. Claude told me that I wouldn’t need some skills I thought I might want. (Later, I did want them.)

Still, in short order, I had: Unit 1: Swift as a Language. It started like this:

Before touching SwiftUI, you need the language it’s built on. Swift is statically typed, compiled, and designed around a distinction — value types vs. reference types — that will shape every decision in the units that follow.

This unit covers the language features you’ll see constantly in NerfJournal’s source: structs, enums, optionals, protocols, extensions, modules, closures, and computed properties. None of this is SwiftUI-specific; it’s just Swift.

The single most important idea in this unit is that structs are value types. Everything else makes more sense once that has settled in.

I felt that the text was good. It wasn’t confusing. It wasn’t unclear. It also didn’t captivate me or lead me to imagine I was reading a lost work of Edward Gibbon. But I didn’t need that, I just needed something to systematically help me learn SwiftUI, with an eye to working on the project I’d summoned into existence. On that front, the text was good.

Eventually, I did end up creating some skills and standing instructions. First, the standing instruction:

When the user asks a question about Swift or SwiftUI during a learning discussion, log it to learning/questions.md under the appropriate unit heading, then commit it. Do this automatically without being prompted.

As I read the content, I’d do all the things I’d normally do when reading a programming book: I’d close my eyes and think hard. I’d fiddle with the source code to see how things changed. I’d go consult the authoritative documentation. But sometimes, I’d also (or instead), ask Claude to elaborate on something.

At some point, the text said that extensions were “module-scoped”. I had no idea what a module was. The text didn’t say. Rather than consult the docs, I just asked Claude: “You refer to module scope. What is a module? Is this going to be explained later? If so, no problem.”

Claude said that no, its plan hadn’t included modules, and really they belonged in unit one. It provided me a clear and useful explanation and then, without prompting, wrote a commit to add the explanation to the Q&A appendix of the book. More questions like this came up, and Claude would populate the Q&A section.

Later, I added a skill, ‘next-chapter’:

Write the next unit of NerfLearning.

First, rebase this branch on main.

Review the changes between the state of this branch before rebasing and after. If changes to the project suggest that learning/SYLLABUS.md should be updated for future chapters, make those changes and commit it.

Then review the file learning/questions.md, which reflects questions from the reader during the last unit. Merge the material from the questions into the unit they came from. Remove the now-merged questions from the questions file. Commit that.

Then write the next unit from the syllabus. When doing so, reflect on the question-and-answers content you just merged into the previous unit. That reflects the kind of thing that the reader felt was missing from the text.

Commit the new unit.

I asked Claude to write Unit 2, and it did so. “It seems like the user wants more implementation details,” it mused. “I should make sure to cover how @ViewBuilder actually works.” Then it spit out another unit. Was the unit actually better because of those instructions? How the heck should I know!? But it remained good.

I’m up to unit six now, where I’m stalled mostly due to other things taking my time. I actually feel like I can read the whole program and pretty much follow along what it’s doing, how the syntax works, how the SwiftUI ā€œmagicā€ is suffused through the system, and how I’d change things in significant ways. I’m no expert. At best, I’m a beginner, but I have been given a huge boost in my learning process.

Of course this sort of process could go haywire. I would not want to learn a foreign language or culture this way and then go on a diplomatic mission. Software learning is much more forgiving, because so much of it can be trivially verified by checking authoritative sources or performing experiments. Also, I’ve got a lot of experience to draw on. But even so, it’s clear that this has been valuable and I’ll do something like this again.

There is sometimes an argument that “why will anybody learn anything anymore if the computer can do the work?” I don’t get this argument. Sure, some people will try to get by on the minimum, but that’s already the case. Now there are some longer levers for just skating by. But the same levers can be used to learn more, to achieve more, and to experiment more. I don’t think any of this is an unvarnished good, but it’s also clearly not just spicy autocorrect.

I’m hoping to get back to SwiftUI in a week or two. I’m piling up a number of little features I’d like to implement, and might try a few by hand.

You can read NerfLearning, up to wherever I’ve gotten to, if you like… but it’s targeting a pretty darn small audience.

I'm currently on a train from Berlin to Strasbourg and then onward to Marseille, traveling from the 28th(!) German Perl Workshop to the Koha Hackfest. I spent a few days after the Perl Workshop in Berlin with friends from school who moved to Berlin during/after university, hanging around at their homes and neighborhoods, visiting museums, professional industrial kitchens and other nice and foody places. But I want to review the Perl Workshop, so:

German Perl Workshop

It seems the last time I attended a German Perl Workshop was in 2020 (literally days before the world shut down...), so I've missed a bunch of nice events and possibilities to meet up with old Perl friends. But even after this longish break it felt a bit like returning home :-)

I traveled to Berlin by sleeper train (worked without a problem), arriving on Monday morning a few hours before the workshop started. I went to a friend's place (where I'm staying for the week), dumped my stuff, got a bike, and did a nice morning cycle through Tiergarten to the venue. Which was an actual church! And not even a secularized one.

Day 1

After a short introduction and welcome by Max Maischein (starting with a "Willkommen, liebe Gemeinde" - "welcome, dear congregation" - fitting the location) he started the workshop with a talk on Claude Code and Coding-Agents. I only recently started to play around a bit with similar tools, so I could relate to a lot of the topics mentioned. And I (again?) need to point out the blog post I Sold Out for $20 a Month and All I Got Was This Perfectly Generated Terraform, which sums up my feelings and experiences with LLMs much better than I could.

Abigail then shared a nice story on how they (Booking.com) sharded a database, twice, using some "interesting" tricks to move the data around while still getting reads from the correct replicas, all with nearly no downtime. Fun, but as "my" projects usually operate on a much smaller scale than Booking's, I will probably not try to recreate their solution.

For lunch I met with Michael at a nearby market hall for some Vietnamese food and to do some planning for the upcoming Perl Toolchain Summit in Vienna.

Lars Dieckow then talked about data types in databases, or actually the lack of more complex types in databases, and how one could still implement such types in SQL. Looks interesting, but probably a bit too hackish for me to actually use. I guess I have to continue handling such cases in code (which of course feels ugly, especially as I've learned to move more and more code into the DB using CTEs and window functions).

Next Flavio S. Glock showed his very impressive progress with PerlOnJava, a Perl distribution for the JVM. Cool, but probably not something I will use (mostly because I don't run Java anywhere, so adding it to our stack would make things more complex).

Then Lars showed us some of his beloved tools in Aus dem Nähkästchen, continuing a tradition started by Sven Guckes (RIP). I am already using some of the tools (realias, fzf, zoxide, htop, ripgrep) but now plan to finally clean up my dotfiles using xdg-ninja.

Now it was time for my first talk at this workshop, on Using class, the new-ish feature (available since Perl 5.38) that provides native keywords for object-oriented programming. I also sneaked in some bibliographic data structures (MAB2 and MARCXML) to share my pain with the attendees. I was a tiny bit (more) nervous, as this was the first time I was using my current laptop (a Framework running Sway/Wayland) with an external projector, but wl-present worked like a charm. After the talk Wolfram Schneider showed me his MAB2->MARC online converter, which could maybe have been a basis for our tool, but then writing our own was a "fun" way to learn about MAB2.

The last talk of the day was Lee Johnson with I Bought A Scanner, showing us how he got an old (ancient?) high-res photo scanner working again to scan his various film projects. Fun and interesting!

Between the end of the talks and the social event I went for some coffee with Paul Cochrane, and we were joined by Sawyer X and Flavio and some vegan tiramisu. Paul and I then cycled to the Indian restaurant through some light drizzle and along the Spree, and only then did I realize that Paul had cycled all the way from Hannover to Berlin. I was a bit envious (even though I in fact did cycle to Berlin 16 years ago (oh my, so long ago..)). Dinner was nice, but I did not stay too long.

Day 2

Tuesday started with Richard Jelinek first showing us his rather impressive off-grid house (or "A technocrat's house - 2050s standard") and the software used to automate it, before moving on to the actual topic of his talk, Perl mit AI, which turned out to be about a Perl implementation in Rust called pperl, developed with massive LLM support. Which seems to be rather fast. As with PerlOnJava, I'm not sure I really want to use an alternative implementation (and of course pperl is currently marked as "Research Preview — WORK IN PROGRESS — please do not use in production environments"), but maybe I will give it a try when it's more stable. Especially since we now have containers, which make setting up experimental environments much easier.

Then Alexander Thurow shared his Thoughts on (Modern?) Software Development: lots of inspirational (or depressing) quotes and some LLM criticism, which had been lacking at the workshop (until now..)

Next up was Lars (again) with a talk on Hierarchien in SQL, a very nice derivation of how to get from some handcrafted SQL to recursive CTEs for querying hierarchical graph data (DAGs). I have used (and even talked about) recursive CTEs a few times, but this was by far the best explanation I've ever seen. And we got to see some geizhals internals :-)

Sören Laird Sörries informed us on Digitale Souveränität und Made in Europe ("digital sovereignty and made in Europe"), and I'm quite proud to say that I'm already using a lot of the services he showed (mailbox, Hetzner, fairphone, ..), though we could still do better (e.g. one project is still using a bunch of Google services).

Then Salve J. Nilsen (whose name I promise to not mangle anymore) showed us his thoughts on What might a CPAN Steward organization look like?. We already talked about this topic a few weeks ago (in preparation for the Perl Toolchain Summit), so I was not paying a lot of attention (and instead hacked up a few short slides for a lightning talk) - sorry. But in the discussion afterwards Salve clarified that the Cyber Resilience Act applies to all "CE-marked products", and that even a Perl API backend that powers a mobile app running on a smartphone counts as part of a "CE-marked product". Before that I was under the assumption that only software running on actual physical products needs the attestation. So we should really get this Steward organization going and hopefully even profit from it!

The last slot of the day was filled with the Lightning Talks, hosted by R Geoffrey Avery and his gong. I submitted two and got a "double domm" slot, where I hurried through my microblog pipeline (on POSSE and getting not-twitter-tweets from my command line via some gitolite to my self-hosted microblog and then on to Mastodon), followed by taking up Lars' challenge to show stuff from my own "Nähkästchen", in my case gopass and tofi (and some bash pipes) for an easy password manager.

We had the usual mixture of fun and/or informative short talks, but the highlight for me was Sebastian Gamaga, who gave his first talk at a Perl event, on How I learned about the problem differentiating a Hash from a HashRef. Good slides, well executed, and showing a problem that I'm quite sure everybody encountered when first learning Perl (and I have to admit I also sometimes mix up hash/ref and regular/curly braces when setting up a hash). Looking forward to a "proper" talk by Sebastian next year :-)

This evening I skipped having dinner with the Perl people, because I had to finish some slides for Wednesday and wanted to hang out with my non-Perl friends. But I've heard that a bunch of people had fun bouldering!

Day 3

I had a job call at 10:00 and (unfortunately) a bug to fix, so I missed the three talks of the morning session and only arrived at the venue during the lunch break, just in time for Paul Cochrane talking about Getting FIT in Perl (and fit he did get, too!). I've only recently started to collect exercise data (as I got a sports watch for my birthday), and being able to extract and analyze the data using my own software is indeed something I plan to do.

Next up was Julien Fiegehenn on Turning humans into SysAdmins, where he showed us how he used LLMs to adapt his developer mentorship framework to also work for sysadmins, and how he got them (the LLMs, not the fresh sysadmins) to differentiate between Julian and Julien (among other things..)

For the final talk it was my turn again: Deploying Perl apps using Podman, make & gitlab. I'm not too happy with the slides, as I had to rush a bit to finish them and did not properly highlight all the important points. But it still went well (enough), and it seemed that a few people found one of the main points (using bash / make in gitlab CI instead of specifying all the steps directly in .gitlab-ci.yml) useful.

Then Max spoke the closing words and announced the location of next year's German Perl Workshop, which will take place in Hannover! Nice, I've never been there and plan to attend (and maybe join Paul on a bike ride there?)

Summary

As usual, a lot of thanks to the sponsors, the speakers, the organizers and the attendees. Thanks for making this nice event possible!

Still on the [b]leading edge

Perl Hacks

About eighteen months ago, I wrote a post called On the Bleading Edge about my decision to start using Perl’s new class feature in real code. I knew I was getting ahead of parts of the ecosystem. I knew there would be occasional pain. I decided the benefits were worth it.

I still think that’s true.

But every now and then, the bleading edge reminds you why it’s called that.

Recently, I lost a couple of days to a bug that turned out not to be in my code, not in the module I was installing, and not even in the module that module depended on — but in the installer’s understanding of modern Perl syntax.

This is the story.

The Symptom

I was building a Docker image for Aphra. As part of the build, I needed to install App::HTTPThis, which depends on Plack::App::DirectoryIndex, which depends on WebServer::DirIndex.

The Docker build failed with this error:

#13 45.66 --> Working on WebServer::DirIndex
#13 45.66 Fetching https://www.cpan.org/authors/id/D/DA/DAVECROSS/WebServer-DirIndex-0.1.3.tar.gz ... OK
#13 45.83 Configuring WebServer-DirIndex-v0.1.3 ... OK
#13 46.21 Building WebServer-DirIndex-v0.1.3 ... OK
#13 46.75 Successfully installed WebServer-DirIndex-v0.1.3
#13 46.84 ! Installing the dependencies failed: Installed version (undef) of WebServer::DirIndex is not in range 'v0.1.0'
#13 46.84 ! Bailing out the installation for Plack-App-DirectoryIndex-v0.2.1.

Now, that’s a deeply confusing error message.

It clearly says that WebServer::DirIndex was successfully installed. And then immediately says that the installed version is undef and not in the required range.

At this point you start wondering if you’ve somehow broken version numbering, or if there’s a packaging error, or if the dependency chain is wrong.

But the version number in WebServer::DirIndex was fine. The module built. The tests passed. Everything looked normal.

So why did the installer think the version was undef?

When This Bug Appears

This only shows up in a fairly specific situation:

  • A module uses modern Perl class syntax
  • The module defines a $VERSION
  • Another module declares a prerequisite with a specific version requirement
  • The installer tries to check the installed version without loading the module
  • It uses Module::Metadata to extract $VERSION
  • And the version of Module::Metadata it is using doesn’t properly understand class

If you don’t specify a version requirement, you’ll probably never see this. Which is why I hadn’t seen it before. I don’t often pin minimum versions of my own modules, but in this case, the modules are more tightly coupled than I’d like, and specific versions are required.

So this bug only appears when you combine:

modern Perl syntax + version checks + older toolchain

Which is pretty much the definition of "bleading edge".

The Real Culprit

The problem turned out to be an older version of Module::Metadata that had been fatpacked into cpanm.

cpanm uses Module::Metadata to inspect modules and extract $VERSION without loading the module. But the older Module::Metadata didn’t correctly understand the class keyword, so it couldn’t work out which package the $VERSION belonged to.

So when it checked the installed version, it found… nothing.

Hence:

Installed version (undef) of WebServer::DirIndex is not in range 'v0.1.0'

The version wasn’t wrong. The installer just couldn’t see it.

As an aside, you may find an anecdote from my attempts to debug this problem amusing.

I spun up a new Ubuntu Docker container, installed cpanm and tried to install Plack::App::DirectoryIndex. Initially, this gave the same error message. At least the problem was easily reproducible.

I then ran code that was very similar to the code cpanm uses to work out what a module’s version is.

$ perl -MModule::Metadata -E'say Module::Metadata->new_from_module("WebServer::DirIndex")->version'

This displayed an empty string. I was really onto something here. Module::Metadata couldn’t find the version.

I was using Module::Metadata version 1.000037 and, looking at the change log on CPAN, I saw this:

1.000038  2023-04-28 11:25:40Z
  - detects "class" syntax

I installed 1.000038 and reran my command:

$ perl -MModule::Metadata -E'say Module::Metadata->new_from_module("WebServer::DirIndex")->version'
0.1.3

That seemed conclusive. Excitedly, I reran the Docker build.

It failed again.

You’ve probably worked out why. But it took me a frustrating half an hour to work it out.

cpanm doesn’t use the installed version of Module::Metadata. It uses its own, fatpacked version. Updating Module::Metadata wouldn’t fix my problem.

The Workaround

I found a workaround. That was to add a redundant package declaration alongside the class declaration, so older versions of Module::Metadata can still identify the package that owns $VERSION.

So instead of just this:

class WebServer::DirIndex {
  our $VERSION = '0.1.3';
  ...
}

I now have this:

package WebServer::DirIndex;

class WebServer::DirIndex {
  our $VERSION = '0.1.3';
  ...
}

It looks unnecessary. And in a perfect world, it would be unnecessary.

But it allows older tooling to work out the version correctly, and everything installs cleanly again.

The Proper Fix

Of course, the real fix was to update the toolchain.

So I raised an issue against App::cpanminus, pointing out that the fatpacked Module::Metadata was too old to cope properly with modules that use class.

Tatsuhiko Miyagawa responded very quickly, and a new release of cpanm appeared with an updated version of Module::Metadata.

This is one of the nice things about the Perl ecosystem. Sometimes you report a problem and the right person fixes it almost immediately.

When Do I Remove the Workaround?

This leaves me with an interesting question.

The correct fix is "use a recent cpanm".

But the workaround is "add a redundant package line so older tooling doesn't get confused".

So when do I remove the workaround?

The answer is probably: not yet.

Because although a fixed cpanm exists, that doesn’t mean everyone is using it. Old Docker base images, CI environments, bootstrap scripts, and long-lived servers can all have surprisingly ancient versions of cpanm lurking in them.

And the workaround is harmless. It just offends my sense of neatness slightly.

So for now, the redundant package line stays. Not because modern Perl needs it, but because parts of the world around modern Perl are still catching up.

Life on the Bleading Edge

This is what life on the bleading edge actually looks like.

Not dramatic crashes. Not language bugs. Not catastrophic failures.

Just a tool, somewhere in the install chain, that looks at perfectly valid modern Perl code and quietly decides that your module doesn’t have a version number.

And then you lose two days proving that you are not, in fact, going mad.

But I’m still using class. And I’m still happy I am.

You just have to keep an eye on the whole toolchain — not just the language — when you decide to live a little closer to the future than everyone else.
