Published by Perl Steering Council on Wednesday 02 July 2025 01:23
All three of us attended.
I had the pleasure of attending The Perl & Raku Conference (TPRC) 2025 in Greenville, SC as a volunteer. As always, opinions are my own.
The Conference
The conference went quite well. Unfortunately, a major weather event disrupted flights across the US, particularly around Atlanta, causing travel issues for some attendees and speakers. This led to a few talk cancellations.
We adapted by consolidating the two talk tracks into one. There was still a diverse range of topics, and judging by the audience reactions, some of the talks were very well received.
The conference was attended by 40-50 people.
The Venue
The event was hosted at a Holiday Inn Express in Greenville, which turned out to be an excellent choice. The hotel was clean, recently renovated (following flood repairs last year), and very reasonably priced: $139 + tax per night for a suite. The staff were quite friendly and accommodating. It also proved to be a great low-cost venue for hosting a conference - more on that below.
The location was practical. Although rural, there were a few walkable restaurants nearby. Downtown Greenville was just a 15-minute drive away, and Lyft was inexpensive and reliable. As a bonus, the hotel is within walking distance of a Flixbus stop, which made my return trip to Atlanta easy.
Special thanks to Ruth, our conference chair, for offering airport rides to several of us.
Supporting the Local Community
This year, we also made an effort to give back to the local community. Ruth partnered with a local food bank, and through a raffle event, we raised $1,040, which equates to over 4,160 meals!
Conference dinner was catered from a local restaurant nearby.
We also featured local beer and wine, which was a nice addition. I tried Cheerwine, a local cherry soda, for the first time.
Local Beer Selection (Find Raku)
A Leaner, Smarter Conference Model
Warning: opinion ahead.
As I wrote before, financial sustainability has been a challenge for The Perl and Raku Foundation in recent years. In particular, the foundation has been spending more on conferences than it earns overall. For example, in 2024 the foundation spent $50k on the conference while its annual income was only $40k, which isn't sustainable given that it also spent $100k on grants.
This year, Ruth did a fantastic job transforming the event into a break-even conference. We kept costs low in several key ways:
1. The hotel provided the conference space at almost no cost; it also provided breakfast and coffee.
2. We were allowed to bring our own food and drink, so volunteers sourced everything affordably from places like Walmart.
3. We avoided typical hotel surcharges we've seen in the past (like $40k for food or $10k for A/V).
4. All the equipment was rented externally and managed in-house.
Of course, a more economical setup meant more labor, but thanks to Rich, a wonderful local volunteer with professional sandwich-making experience, we managed well! He led the team in preparing custom lunches on-site.
We also restructured the event to a 3-day format (2 days of talks + 1 day of classes/hackathon), instead of the previous 5-day model. I recall it was my idea, and while it was controversial, it worked smoothly in the end.
Attendance
The number of attendees is declining. The conference had ~100 attendees in 2023, ~80 in 2024, and we had around 40-50 in 2025.
There are many contributing factors: Perl's decline across industries, an increased preference for virtual events, and challenges with international travel this year. Unlike previous locations such as Houston, Greenville doesn't have a strong base of local Perl companies.
While a smaller crowd has its trade-offs, such as smaller hallway tracks and a narrower pool of speakers, optimistically it also allowed for deeper connections among attendees.
Looking Ahead to 2026
I have little idea what our 2026 conference will look like. There is discussion about holding it in Michigan, and I have also heard a suggestion to make it a Perl-only event. We also liked this venue, so a return to Greenville is definitely on the table.
The foundation should not lose money hosting conferences, so naturally Greenville is a good choice. I'm eager to learn whether there are other hotels that would enable us to hold a low-cost conference while being close enough to an airport.
Whatever we decide, one thing is clear: the key to a successful conference is great volunteers, especially during the event itself. If you're interested in helping out, I encourage you to reach out.
Lastly, I appreciate these sponsors who support TPRF: DuckDuckGo, Webpros, Proxmox, Fastmail, SUSE, e-card, and Grant Street Group.
Published by jhnc on Tuesday 01 July 2025 21:43
I thought I'd try using mmap to search a multi-gigabyte file without running out of memory. I tested on a file that did actually fit in memory: the File::Slurp version took less than a minute, but the File::Map version was still running after many minutes, so I killed it.
I tested smaller files and found that the File::Map version got progressively slower as file size increased (2x size => 4x time) while the File::Slurp performance remained fairly constant (2x size => 2x time).
Am I not using the module correctly, or does File::Map always get slow on large files?
for n in 1 4 16 32 64 128 256 512 4096; do
seq $n | xargs -I@ seq 100000 > data
ls -l data
time perl -MFile::Slurp -e '
$s = read_file("data");
$re = qr/^(99999|12345|4325|11111|50000)$/m;
while ($s =~ m/$re/g){ ++$matches }
print $matches;
'
time perl -MFile::Map=:all -e '
map_file $s, "data";
advise $s, "sequential";
$re = qr/^(99999|12345|4325|11111|50000)$/m;
while ($s =~ m/$re/g){ ++$matches }
print $matches;
'
done
n | size | matches | usr(slurp) | usr(slurp)/n | sys(slurp) | sys(slurp)/n | usr(map) | usr(map)/n | sys(map) | sys(map)/n |
---|---|---|---|---|---|---|---|---|---|---|
1 | 588895 | 5 | 0.033 | 0.033 | 0.007 | 0.007 | 0.014 | 0.014 | 0.001 | 0.001 |
4 | 2355580 | 20 | 0.051 | 0.013 | 0.007 | 0.002 | 0.032 | 0.008 | 0.005 | 0.001 |
16 | 9422320 | 80 | 0.109 | 0.007 | 0.015 | 0.001 | 0.138 | 0.009 | 0.012 | 0.001 |
32 | 18844640 | 160 | 0.184 | 0.005 | 0.024 | 0.001 | 0.400 | 0.013 | 0.021 | 0.001 |
64 | 37689280 | 320 | 0.328 | 0.005 | 0.049 | 0.001 | 2.666 | 0.042 | 4.305 | 0.067 |
128 | 75378560 | 640 | 0.629 | 0.005 | 0.079 | 0.001 | 10.014 | 0.078 | 17.638 | 0.138 |
256 | 150757120 | 1280 | 1.220 | 0.005 | 0.162 | 0.001 | 40.237 | 0.157 | 73.829 | 0.288 |
512 | 301514240 | 2560 | 2.423 | 0.005 | 0.323 | 0.001 | 158.729 | 0.310 | 302.041 | 0.590 |
4096 | 2412113920 | 20480 | 19.468 | 0.005 | 2.424 | 0.001 | ? | ? | ? | ? |
Instead of manually calculating the table from ls and time output, following @TLP's suggestion, here's a Perl Benchmark version (warning output elided) that also indicates that File::Slurp's performance is independent of file size but File::Map gets slower:
#!/bin/bash
for n in 1 4 16 32 64 128 256 512; do
seq $n | xargs -I@ seq 100000 > data$n
done
perl -MBenchmark=cmpthese -MFile::Slurp -MFile::Map=:all -e '
@n = (1,4,16,32,64,128,256,512);
sub test_slurp {
my ($s,$re,$matches);
$s = read_file($f);
$re = qr/^(99999|12345|4325|11111|50000)$/m;
while ($s =~ m/$re/g){ ++$matches }
}
sub test_map {
my ($mm,$re,$matches);
map_file $mm, $f;
advise $mm, "sequential";
$re = qr/^(99999|12345|4325|11111|50000)$/m;
while ($mm =~ m/$re/g){ ++$matches }
}
for $n (@n) {
$f = "data$n";
cmpthese(-1, { "map($n)" => \&test_map, "slurp($n)" => \&test_slurp });
}
'
Rate map(1) slurp(1)
map(1) 198/s -- -1%
slurp(1) 200/s 1% --
Rate map(4) slurp(4)
map(4) 38.3/s -- -20%
slurp(4) 48.1/s 26% --
Rate map(16) slurp(16)
map(16) 6.60/s -- -48%
slurp(16) 12.6/s 91% --
Rate map(32) slurp(32)
map(32) 1.98/s -- -62%
slurp(32) 5.17/s 161% --
s/iter map(64) slurp(64)
map(64) 7.93 -- -96%
slurp(64) 0.350 2166% --
s/iter map(128) slurp(128)
map(128) 31.6 -- -98%
slurp(128) 0.730 4233% --
s/iter map(256) slurp(256)
map(256) 129 -- -99%
slurp(256) 1.55 8244% --
s/iter map(512) slurp(512)
map(512) 521 -- -99%
slurp(512) 2.82 18372% --
Published by thibaultduponchelle on Tuesday 01 July 2025 20:46
Add epigraph for 5.42.0-RC3
Published by thibaultduponchelle on Tuesday 01 July 2025 20:38
Disarm RC3 in patchlevel.h - Arm release_status in META.json
Published by thibaultduponchelle on Tuesday 01 July 2025 18:21
perldelta/perlhist: small update of metrics
Published by thibaultduponchelle on Tuesday 01 July 2025 18:07
Update perlhist
Published by thibaultduponchelle on Tuesday 01 July 2025 18:04
Arm RC3 in patchlevel.h - release_status is testing in META.json
Published by /u/aanzeijar on Tuesday 01 July 2025 16:37
Take this as a frustrated rant, but maybe the resident core contributors know something I don't know.
I'm currently trying to clean up some old code that relies on Params::Validate for runtime type checking, and I catch myself wishing for something like TypeScript's or Python's type hint system. Yes I know Moose exists. Yes I know Corinna exists. And Type::Params, and Params::Check, and Func::Params, and Type::Tiny and a dozen source filters I won't touch.
And you know what: all of them are fucking ugly. I just want to be able to say:
sub do_stuff :returns(Int) ($number : Int)
and have an IDE yell at me if I plug in something that is annotated as a string or an arrayref. Is that too much to ask? The semantics can even be pluggable for all I care! Just have something that can be optionally statically analysed. And the syntax is already there! Perl has had attributes on nearly everything for ages. All that is missing is a little bit of glue code, and a way to express what I mean with a type expression. I don't even need the runtime checks that Params::Validate does if the static analysis passes.
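For contrast, this is roughly what the runtime-check style the post complains about looks like today with Type::Tiny's Type::Params (a sketch, not a recommendation; the sub body is made up, and none of this is visible to an IDE or a static analyser):
use strict;
use warnings;
use Types::Standard qw(Int);
use Type::Params    qw(compile);
my $check_do_stuff = compile(Int);    # build the argument validator once
sub do_stuff {
    my ($number) = $check_do_stuff->(@_);   # croaks at runtime if the argument is not an Int
    return $number * 2;                     # hypothetical body
}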
I know roughly why this never happened (I think it was bikeshedding on p5p between different people not being able to agree which flavour it should be), but even then - we have entire type systems in Moose for fields. We have rigid class hierarchies in Corinna but I can't tell the IDE of the consumer of my function that I want a bloody int? What is this madness?
/rant
How do you feel about substitution regexes without a replacement list?
'Cause I had an idea that instead it could be:
d/foo/
That would be nice.
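For anyone skimming: the existing spelling of "delete whatever matches" is a substitution with an empty replacement, which the proposed d/foo/ would merely abbreviate (the commented line below is the hypothetical syntax, not valid Perl today):
my $text = "foo bar foo baz";
$text =~ s/foo//g;        # today: substitute matches with nothing
# $text =~ d/foo/g;       # the proposed shorthand; does not exist
print "$text\n";          # " bar  baz"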
However, adding such an abstraction to the core wouldn't be worth the gain of two characters :D
What are your opinions? Also, if I somehow missed that such a feature already exists (which somewhat feels like a replacement, pun intended), please enlighten me!
Published by chrisarg on Tuesday 01 July 2025 08:01
I had created the library in C as part of a bigger project to create a multithreaded and hardware (GPU, and soon TPU) accelerated library to manipulate fingerprints for text. At some point, I figured one can have fun vibe coding the interface to Perl. The first post in the series just dropped; it provides the background, rationale, the prompt and the first output by Claude 3.7. Subsequent posts will critique the solution and document subsequent interactions with the chatbot.
Part 2 will be about the alienfile (a task that was botched by the LLM). Suggestions for subsequent prompts are welcome; as I said, this is a project whose C backend (except the TPU part) is nearly complete, so I am just having fun with the Perl part.
Published by chrisarg on Tuesday 01 July 2025 03:10
In this multipart series we will explore the benefits (and pitfalls) of vibe coding a Perl interface to an external (or foreign) library through a large language model. Those of you who follow me on X/Twitter (as @ChristosArgyrop and @ArgyropChristos), Bluesky, mast.hpc, mstdn.science, and mstdn.social know that I have been very critical of the hype behind AI and the hallucinations of both the models and the human AI influencers (informally known as botlickers in some corners of the web).
However, there are application areas of vibe coding with AI, e.g. semi-automating the task of creating an API from one language to another, in which the chatbots may actually deliver well and act as productivity boosters. In this application area, we will be leveraging the AI tools as more or less enhanced auto-complete tools that can help a developer navigate the less familiar, more technical, and possibly more boring aspects of the target language's 'guts'. If AI were to deliver in this area, then meaningless language wars could be averted, and wrong (at least in my opinion) reasons to prefer one language, such as the availability of some exotic library, could be avoided.
For my foray in this area, I chose to interface to the library Bit that I wrote to support text fingerprinting for some research applications. The library, based on David Hanson's Bit_T library discussed in Chapter 13 of "C Interfaces and Implementations", has been extended to incorporate additional operations on bitsets (such as counts on unions/differences/intersections of sets) and fast population counts in both CPU and GPU (with TPU implementations coming down the road). Hence, this is a test case that can illustrate the utility of Perl in using code that executes transparently on various hardware assets.
Similar to Hanson's original implementation (after all, my work is based on his!), the library interface is implemented as an Abstract Data Type in C; a crucial aspect of the implementation is to manage memory and avoid leaks without looking (at all, or as little as possible!) under the hood.
For our experiments we will use Claude 3.7 Thinking Sonnet through the GitHub Copilot VS Code interface. This is going to be a multipart series that will be published throughout the summer. Our focus will be on the interactions between me and the bot, and in particular on critiquing the responses it has given me for both high-level (e.g. choice of approach) and technical aspects of the project.
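Before getting to the prompt, here is the shape of the pattern that "not looking under the hood" implies on the Perl side: the C constructor and destructor get tied to a Perl object's lifetime. This is only a hand-written sketch with assumed names, paths, and signatures (FFI::Platypus is just one of the interfacing options weighed below), not the wrapper developed in this series:
package Bit::Set::Sketch;    # illustrative only
use strict;
use warnings;
use FFI::Platypus 1.00;
my $ffi = FFI::Platypus->new( api => 1 );
$ffi->lib('./build/libbit.so');                        # assumed location of the compiled library
$ffi->attach( Bit_new  => ['int']     => 'opaque' );   # assumed signatures
$ffi->attach( Bit_free => ['opaque*'] => 'void'   );
sub new {
    my ( $class, $length ) = @_;
    return bless { handle => Bit_new($length) }, $class;
}
sub DESTROY {                                          # the C resource is released with the Perl object
    my ($self) = @_;
    Bit_free( \$self->{handle} ) if defined $self->{handle};
}
1;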
For the prompt I provided the "bit.c", "bit.h", makefile and the readme file of the Bit github repository as context to Claude and then I issued the following:
Look at the bit.h file that defines two abstract data types Bit_T and Bit_T_DB
and their interfaces in C. I would like to wrap these interfaces in Perl.
The build process will consist of using the Alien package in Perl to make bit available to Perl.
Explore possible options for the creation of the interface:
A. Creation of the bit C library :
1) compiling bit.c and bit.h as dynamic libraries
2) compiling bit.c nd bit.h into a static library
B. Interfacing with the bit C library from Perl:
1) using SWIG i files
2) using Platypus::FFI
3) using Inline::C
4) using XS
Since you are an expert with deep knowledge of the Perl ecosystem, consider pros and cons of
each approach. and then recommend the optimal Consider lightweight options for encapsulating
the Bit_T and Bit_T_DB abstract data types as Perl objects.
Provide wrapping recommendations targetting a beginner, intermediate and experienced Perl
programmer since those may differ. When providing recommendations , consider the dependencies
that are required for each implementation and highlight them in your analysis.
Once you have provided the recommendation, provide the implementation of the wrapper for
these 3 programmer levels. To reduce the token output (and your risk of hallucinating),
provide a full implementation of the wrapper for the beginner level and only the
changes (if any) that are required for the intermediate and advanced programmer levels.
In my prompt specification, I clearly specified that
Note that I gave the bot considerable freedom over the "low"-level aspects of wrapping (type of library, choice of the cross-language interface), but restricted it in terms of how I wanted the library served. Furthermore, I guided it to consider alternative viewpoints and levels of expertise, providing a unique opportunity to test Claude's representation[1] of conversations about wrappers. Finally, I alerted the chatbot to consider the token-sequence-length-dependent risk of hallucinations and tried to limit the length of the output of the alternative implementations.
The output based on this prompt is shown below (inside the horizontal lines); the formatting was also part of the output.
Let me analyze the options for creating a Perl interface to the Bit_T and Bit_T_DB C library and provide implementation recommendations for programmers at different experience levels.
I see the Makefile already creates a dynamic library (libbit.so) in the build directory with this rule:
$(TARGET): $(OBJ)
$(CC) $(CFLAGS) -shared -o $@ $^ $(LDFLAGS)
Pros:
Cons:
To compile as a static library, you would modify the Makefile:
TARGET = $(BUILD_DIR)/libbit.a
$(TARGET): $(OBJ)
ar rcs $@ $^
Pros:
Cons:
Pros:
Cons:
Pros:
Cons:
Pros:
Cons:
Pros:
Cons:
FFI::Platypus provides the easiest approach for beginners with minimal C knowledge. It works with your existing dynamic library build.
Inline::C offers more control for intermediate users who understand some C code but want to avoid complex build processes.
XS provides the highest performance and greatest control, suitable for experienced Perl programmers.
Let's create the Alien::Bit module first to handle library discovery, then implement Bit::Set and Bit::Set::DB modules using FFI::Platypus.
# /media/chrisarg/Software-Dev/Bit/perl-wrapper/lib/Alien/Bit.pm
package Alien::Bit;
use strict;
use warnings;
use parent 'Alien::Base';
use File::Spec;
sub dynamic_libs {
my $class = shift;
# Try to find the library in common locations
my @potential_paths = (
# From the build directory
'/media/chrisarg/Software-Dev/Bit/bit/build/libbit.so',
# System paths
'/usr/local/lib/libbit.so',
'/usr/lib/libbit.so',
# Environment variable if set
$ENV{BIT_LIB_PATH} ? File::Spec->catfile($ENV{BIT_LIB_PATH}, 'libbit.so') : ()
);
foreach my $path (@potential_paths) {
return $path if -f $path && -r $path;
}
die "Could not find libbit.so library. Please set BIT_LIB_PATH environment variable to its location.\n";
}
sub include_dirs {
return (
'/media/chrisarg/Software-Dev/Bit/bit/include',
'/usr/local/include',
'/usr/include',
$ENV{BIT_INCLUDE_PATH} || ()
);
}
1;
# /media/chrisarg/Software-Dev/Bit/perl-wrapper/lib/Bit/Set.pm
package Bit::Set;
use strict;
use warnings;
use FFI::Platypus;
use FFI::Platypus::Memory qw(malloc free);
use FFI::Platypus::Buffer qw(scalar_to_buffer buffer_to_scalar);
use Alien::Bit;
# Create FFI::Platypus object
my $ffi = FFI::Platypus->new(api => 1);
# Add path to our dynamic library
$ffi->lib(Alien::Bit->dynamic_libs);
# Define opaque types for our bitset pointers
$ffi->type('opaque' => 'Bit_T');
$ffi->type('opaque*' => 'Bit_T_Ptr');
# Define our Bit_T functions
$ffi->attach(Bit_new => ['int'] => 'Bit_T' => sub {
my ($xsub, $self, $length) = @_;
die "Length must be a positive integer" unless defined $length && $length > 0;
my $bit_set = $xsub->($length);
die "Failed to create bit set" unless $bit_set;
return bless { _handle => $bit_set }, $self;
});
$ffi->attach(Bit_free => ['Bit_T_Ptr'] => 'opaque' => sub {
my ($xsub, $self) = @_;
my $ptr = \$self->{_handle};
return $xsub->($ptr);
});
$ffi->attach(Bit_length => ['Bit_T'] => 'int' => sub {
my ($xsub, $self) = @_;
return $xsub->($self->{_handle});
});
$ffi->attach(Bit_count => ['Bit_T'] => 'int' => sub {
my ($xsub, $self) = @_;
return $xsub->($self->{_handle});
});
$ffi->attach(Bit_buffer_size => ['int'] => 'int');
$ffi->attach(Bit_bset => ['Bit_T', 'int'] => 'void' => sub {
my ($xsub, $self, $index) = @_;
die "Index must be non-negative" unless defined $index && $index >= 0;
$xsub->($self->{_handle}, $index);
});
$ffi->attach(Bit_bclear => ['Bit_T', 'int'] => 'void' => sub {
my ($xsub, $self, $index) = @_;
die "Index must be non-negative" unless defined $index && $index >= 0;
$xsub->($self->{_handle}, $index);
});
$ffi->attach(Bit_get => ['Bit_T', 'int'] => 'int' => sub {
my ($xsub, $self, $index) = @_;
die "Index must be non-negative" unless defined $index && $index >= 0;
return $xsub->($self->{_handle}, $index);
});
$ffi->attach(Bit_set => ['Bit_T', 'int', 'int'] => 'void' => sub {
my ($xsub, $self, $lo, $hi) = @_;
die "Low index must be non-negative" unless defined $lo && $lo >= 0;
die "High index must be greater than or equal to low index" unless defined $hi && $hi >= $lo;
$xsub->($self->{_handle}, $lo, $hi);
});
$ffi->attach(Bit_clear => ['Bit_T', 'int', 'int'] => 'void' => sub {
my ($xsub, $self, $lo, $hi) = @_;
die "Low index must be non-negative" unless defined $lo && $lo >= 0;
die "High index must be greater than or equal to low index" unless defined $hi && $hi >= $lo;
$xsub->($self->{_handle}, $lo, $hi);
});
# Comparison operations
$ffi->attach(Bit_eq => ['Bit_T', 'Bit_T'] => 'int' => sub {
my ($xsub, $self, $other) = @_;
die "Other bitset must be a Bit::Set object" unless ref $other eq ref $self;
return $xsub->($self->{_handle}, $other->{_handle});
});
# Set operations
$ffi->attach(Bit_union => ['Bit_T', 'Bit_T'] => 'Bit_T' => sub {
my ($xsub, $self, $other) = @_;
die "Other bitset must be a Bit::Set object" unless ref $other eq ref $self;
my $result_handle = $xsub->($self->{_handle}, $other->{_handle});
return bless { _handle => $result_handle }, ref $self;
});
$ffi->attach(Bit_inter => ['Bit_T', 'Bit_T'] => 'Bit_T' => sub {
my ($xsub, $self, $other) = @_;
die "Other bitset must be a Bit::Set object" unless ref $other eq ref $self;
my $result_handle = $xsub->($self->{_handle}, $other->{_handle});
return bless { _handle => $result_handle }, ref $self;
});
# Count operations
$ffi->attach(Bit_inter_count => ['Bit_T', 'Bit_T'] => 'int' => sub {
my ($xsub, $self, $other) = @_;
die "Other bitset must be a Bit::Set object" unless ref $other eq ref $self;
return $xsub->($self->{_handle}, $other->{_handle});
});
# Constructor and destructor
sub new {
my ($class, $length) = @_;
return $class->Bit_new($length);
}
sub DESTROY {
my ($self) = @_;
$self->Bit_free() if defined $self->{_handle};
}
# Convenient accessor methods
sub length {
my ($self) = @_;
return $self->Bit_length();
}
sub count {
my ($self) = @_;
return $self->Bit_count();
}
sub get {
my ($self, $index) = @_;
return $self->Bit_get($index);
}
sub set {
my ($self, $index) = @_;
$self->Bit_bset($index);
return $self;
}
sub clear {
my ($self, $index) = @_;
$self->Bit_bclear($index);
return $self;
}
sub set_range {
my ($self, $lo, $hi) = @_;
$self->Bit_set($lo, $hi);
return $self;
}
sub clear_range {
my ($self, $lo, $hi) = @_;
$self->Bit_clear($lo, $hi);
return $self;
}
sub equals {
my ($self, $other) = @_;
return $self->Bit_eq($other) ? 1 : 0;
}
sub union {
my ($self, $other) = @_;
return $self->Bit_union($other);
}
sub intersection {
my ($self, $other) = @_;
return $self->Bit_inter($other);
}
sub intersection_count {
my ($self, $other) = @_;
return $self->Bit_inter_count($other);
}
1;
# /media/chrisarg/Software-Dev/Bit/perl-wrapper/lib/Bit/Set/DB.pm
package Bit::Set::DB;
use strict;
use warnings;
use FFI::Platypus;
use FFI::Platypus::Memory qw(malloc free);
use FFI::Platypus::Buffer qw(scalar_to_buffer buffer_to_scalar);
use Bit::Set;
use Alien::Bit;
# Create FFI::Platypus object
my $ffi = FFI::Platypus->new(api => 1);
# Add path to our dynamic library
$ffi->lib(Alien::Bit->dynamic_libs);
# Define opaque types for our bitset pointers
$ffi->type('opaque' => 'Bit_T');
$ffi->type('opaque' => 'Bit_T_DB');
$ffi->type('opaque*' => 'Bit_T_DB_Ptr');
# Define struct for SETOP_COUNT_OPTS
$ffi->type('record(SETOP_COUNT_OPTS)' => [
num_cpu_threads => 'int',
device_id => 'int',
upd_1st_operand => 'bool',
upd_2nd_operand => 'bool',
release_1st_operand => 'bool',
release_2nd_operand => 'bool',
release_counts => 'bool'
]);
# Define Bit_T_DB functions
$ffi->attach(BitDB_new => ['int', 'int'] => 'Bit_T_DB' => sub {
my ($xsub, $self, $length, $num_of_bitsets) = @_;
die "Length must be a positive integer" unless defined $length && $length > 0;
die "Number of bitsets must be a positive integer" unless defined $num_of_bitsets && $num_of_bitsets > 0;
my $db = $xsub->($length, $num_of_bitsets);
die "Failed to create bitset DB" unless $db;
return bless { _handle => $db }, $self;
});
$ffi->attach(BitDB_free => ['Bit_T_DB_Ptr'] => 'opaque' => sub {
my ($xsub, $self) = @_;
my $ptr = \$self->{_handle};
return $xsub->($ptr);
});
$ffi->attach(BitDB_length => ['Bit_T_DB'] => 'int' => sub {
my ($xsub, $self) = @_;
return $xsub->($self->{_handle});
});
$ffi->attach(BitDB_nelem => ['Bit_T_DB'] => 'int' => sub {
my ($xsub, $self) = @_;
return $xsub->($self->{_handle});
});
$ffi->attach(BitDB_count_at => ['Bit_T_DB', 'int'] => 'int' => sub {
my ($xsub, $self, $index) = @_;
die "Index must be non-negative" unless defined $index && $index >= 0;
return $xsub->($self->{_handle}, $index);
});
$ffi->attach(BitDB_clear => ['Bit_T_DB'] => 'void' => sub {
my ($xsub, $self) = @_;
$xsub->($self->{_handle});
});
$ffi->attach(BitDB_clear_at => ['Bit_T_DB', 'int'] => 'void' => sub {
my ($xsub, $self, $index) = @_;
die "Index must be non-negative" unless defined $index && $index >= 0;
$xsub->($self->{_handle}, $index);
});
$ffi->attach(BitDB_get_from => ['Bit_T_DB', 'int'] => 'Bit_T' => sub {
my ($xsub, $self, $index) = @_;
die "Index must be non-negative" unless defined $index && $index >= 0;
my $bit_handle = $xsub->($self->{_handle}, $index);
return bless { _handle => $bit_handle }, 'Bit::Set';
});
$ffi->attach(BitDB_put_at => ['Bit_T_DB', 'int', 'Bit_T'] => 'void' => sub {
my ($xsub, $self, $index, $bitset) = @_;
die "Index must be non-negative" unless defined $index && $index >= 0;
die "Bitset must be a Bit::Set object" unless ref $bitset eq 'Bit::Set';
$xsub->($self->{_handle}, $index, $bitset->{_handle});
});
# CPU-specific intersection count function
$ffi->attach(BitDB_inter_count_cpu => ['Bit_T_DB', 'Bit_T_DB', 'SETOP_COUNT_OPTS'] => 'int*' => sub {
my ($xsub, $self, $other, $opts) = @_;
die "Other must be a Bit::Set::DB object" unless ref $other eq ref $self;
$opts ||= {
num_cpu_threads => 1,
device_id => -1,
upd_1st_operand => 0,
upd_2nd_operand => 0,
release_1st_operand => 0,
release_2nd_operand => 0,
release_counts => 0
};
return $xsub->($self->{_handle}, $other->{_handle}, $opts);
});
# Constructor and destructor
sub new {
my ($class, $length, $num_of_bitsets) = @_;
return $class->BitDB_new($length, $num_of_bitsets);
}
sub DESTROY {
my ($self) = @_;
$self->BitDB_free() if defined $self->{_handle};
}
# Convenient accessor methods
sub length {
my ($self) = @_;
return $self->BitDB_length();
}
sub num_of_bitsets {
my ($self) = @_;
return $self->BitDB_nelem();
}
sub count_at {
my ($self, $index) = @_;
return $self->BitDB_count_at($index);
}
sub clear {
my ($self) = @_;
$self->BitDB_clear();
return $self;
}
sub clear_at {
my ($self, $index) = @_;
$self->BitDB_clear_at($index);
return $self;
}
sub get_bitset {
my ($self, $index) = @_;
return $self->BitDB_get_from($index);
}
sub put_bitset {
my ($self, $index, $bitset) = @_;
$self->BitDB_put_at($index, $bitset);
return $self;
}
sub intersection_count_cpu {
my ($self, $other, $opts) = @_;
return $self->BitDB_inter_count_cpu($other, $opts);
}
1;
#!/usr/bin/perl
use strict;
use warnings;
use Bit::Set;
use Bit::Set::DB;
# Create a new bitset with 1024 bits
my $bitset1 = Bit::Set->new(1024);
my $bitset2 = Bit::Set->new(1024);
# Set some bits
$bitset1->set(42);
$bitset1->set(100);
$bitset2->set(42);
$bitset2->set(200);
# Check if a bit is set
print "Bit 42 in bitset1: ", $bitset1->get(42), "\n";
print "Bit 100 in bitset1: ", $bitset1->get(100), "\n";
print "Bit 200 in bitset1: ", $bitset1->get(200), "\n";
# Count the number of bits set
print "Number of bits set in bitset1: ", $bitset1->count(), "\n";
print "Number of bits set in bitset2: ", $bitset2->count(), "\n";
# Calculate intersection count
my $intersection_count = $bitset1->intersection_count($bitset2);
print "Intersection count: $intersection_count\n";
# Create a new bitset as the union of the two bitsets
my $union = $bitset1->union($bitset2);
print "Number of bits set in union: ", $union->count(), "\n";
# Create a BitDB with 3 bitsets of length 1024
my $db = Bit::Set::DB->new(1024, 3);
# Put our bitsets into the DB
$db->put_bitset(0, $bitset1);
$db->put_bitset(1, $bitset2);
$db->put_bitset(2, $union);
# Count bits in each bitset in the DB
print "Bits set in DB at index 0: ", $db->count_at(0), "\n";
print "Bits set in DB at index 1: ", $db->count_at(1), "\n";
print "Bits set in DB at index 2: ", $db->count_at(2), "\n";
For intermediate-level Perl programmers, here are the changes required to implement the wrapper using Inline::C:
# Bit::Set using Inline::C
package Bit::Set;
use strict;
use warnings;
use Inline C => Config =>
BUILD_NOISY => 1,
CLEAN_AFTER_BUILD => 0,
LIBS => '-L/media/chrisarg/Software-Dev/Bit/bit/build -lbit',
INC => '-I/media/chrisarg/Software-Dev/Bit/bit/include';
use Inline C => <<'END_C';
#include "bit.h"
typedef struct Bit_T* Bit_T;
// Wrapper functions
Bit_T create_bitset(int length) {
return Bit_new(length);
}
void free_bitset(Bit_T set) {
Bit_free(&set);
}
int get_length(Bit_T set) {
return Bit_length(set);
}
int get_count(Bit_T set) {
return Bit_count(set);
}
void set_bit(Bit_T set, int index) {
Bit_bset(set, index);
}
void clear_bit(Bit_T set, int index) {
Bit_bclear(set, index);
}
int get_bit(Bit_T set, int index) {
return Bit_get(set, index);
}
// Add more wrapper functions as needed
END_C
# Perl OO interface
sub new {
my ($class, $length) = @_;
my $handle = create_bitset($length);
return bless { _handle => $handle }, $class;
}
# Add more methods similar to the FFI version
For advanced Perl programmers, here are the key changes required to implement using XS:
# Bit::Set.xs
#include "EXTERN.h"
#include "perl.h"
#include "XSUB.h"
#include "bit.h"
typedef struct Bit_T* Bit_T;
typedef struct Bit_T_DB* Bit_T_DB;
MODULE = Bit::Set PACKAGE = Bit::Set
PROTOTYPES: DISABLE
Bit_T
Bit_new(length)
int length
void
Bit_free(set)
Bit_T set
CODE:
Bit_free(&set);
int
Bit_length(set)
Bit_T set
int
Bit_count(set)
Bit_T set
void
Bit_bset(set, index)
Bit_T set
int index
# Add more functions as needed
And update the Makefile.PL:
use ExtUtils::MakeMaker;
WriteMakefile(
NAME => 'Bit::Set',
VERSION_FROM => 'lib/Bit/Set.pm',
LIBS => ['-L/media/chrisarg/Software-Dev/Bit/bit/build -lbit'],
INC => '-I/media/chrisarg/Software-Dev/Bit/bit/include',
OBJECT => '$(O_FILES)',
);
For most users, especially beginners, the FFI::Platypus approach offers the best balance of simplicity, performance, and maintainability. It works well with your existing dynamic library build process and provides a clean, object-oriented interface to the Bit_T and Bit_T_DB types. The implementation provided covers all the core functionality while being extensible for additional features as needed.
In this first post we will not consider the implementation(s), which will be covered in separate posts, but will restrict ourselves to Claude's output with respect to the other questions. I give Claude bonus points for clearly identifying the niches for each potential choice:
However:
I alerted the bot to the (substantial) risk of hallucinations and decreased
[1] Note, I did not use the word understanding, as I do not think that LLMs can understand: they are merely noisy statistical pattern generators that can be tasked to create rough solutions for refining.
Published by /u/ReplacementSlight413 on Tuesday 01 July 2025 02:52
I had created the library in C as part of a bigger project to create a multithreaded and hardware (GPU, and soon TPU) accelerated library to manipulate fingerprints for text. At some point, I figured one can have fun vibe coding the interface to Perl. The first post in the series just dropped; it provides the background, rationale, the prompt and the first output by Claude 3.7. Subsequent posts will critique the solution and document subsequent interactions with the chatbot.
Part 2 will be about the alienfile (a task that was botched by the LLM). Suggestions for subsequent prompts are welcome; as I said, this is a project whose C backend (except the TPU part) is nearly complete, so I am just having fun with the Perl part.
Published by /u/niceperl on Monday 30 June 2025 19:03
This simple script, which fetches a MetaCPAN page:
use strict;
use warnings;
use LWP::UserAgent;

my $ua  = LWP::UserAgent->new;
my $url = 'https://metacpan.org/release/GBROWN/App-rdapper-1.14';

my $response = $ua->get($url);

# Check the response
if ($response->is_success) {
    print "OK: $url\n";
}
else {
    print "KO: ", $response->status_line, "\n";
}
Prints at console:
KO: 402 Payment Required
For other $url values it works fine. I'm just curious about that response message; does anyone know anything about it?
Published by /u/briandfoy on Monday 30 June 2025 11:31
Published by Gabor Szabo on Monday 30 June 2025 05:45
Originally published at Perl Weekly 727
Hi there!
Recently I ran a poll asking Which versions of Perl do you use at work?. I asked it both in the The Perl Community group on Facebook and in the Perl Maven community Telegram Channel and the Perl 5 Telegram channel.
There were 75 votes in the Facebook group: 1% use 5.42; 17% use 5.40; 30% use 5.38; 5% use 5.36; 29% use 5.22-5.34; 13% use 5.12-5.20; 5% use 5.10 or older.
There were 29 votes in the Telegram group(s): 14% use 5.42; 31% use 5.40; 34% use 5.38; 7% use 5.36; 38% use 5.22-5.34; 17% use 5.12-5.20; 7% use 5.10 or older. Yes, people could select multiple answers.
You can still go and vote in either of those polls.
Many people commented that they use the version of perl that comes with the OS, to which Dave Cross responded with Stop using your system Perl. I don't fully agree, but he has a point.
I don't recall what exactly prompted me to do this, but a few days ago I started to write a book about OOP - Object Oriented Perl. I took some of the slides I had for my advanced Perl course, started to update them and started to write explanations around the examples. As I write them I post the articles in the Perl Community group on Facebook and on my LinkedIn page. If you use Facebook I'd recommend you join that group and if you use LinkedIn I'd recommend you follow me. If you would like to go further and connect with me on LinkedIn, please include a message saying that you are a reader of the Perl Weekly newsletter so I'll have some background.
Besides publishing them on the social sites I also collect the writings about Perl OOP on my web site and I also started to publish the book on Leanpub.
As I am sure you know, editing the Perl Weekly, writing these examples and articles, and working on the book take a lot of time that I should otherwise spend earning money by helping my clients with their Perl code-base. Luckily there are some people who support me financially via Patreon and GitHub. (Hint: it would be awesome if you'd also sponsor me with $10/month.)
There are many people who like to get something 'tangible' to really justify their support. If you are such a person, I created the Leanpub book especially for you. You can buy the Perl OOP book and some of my other books, getting a PDF and an ePub (for Kindle) version. You will both support my work and get an e-book. You will also get all the new versions of the book as I update it.
Enjoy your week!
--
Your editor: Gabor Szabo.
For years Dave has been telling us to use our own version of Perl and not to rely on the one that comes with our version of the Operating System. There is a lot of merit in what he says and what he writes in this article. I personally would go to the most extreme and use a Docker container for better separation and to make the development, testing, and production environments as similar as possible. With that said I am not 100% sold on the idea. I do understand the value in using the perl that comes with the OS and the perl modules that can be installed with the package-management system of the operating system. (e.g. apt/yum). See an extensive discussion on the topic.
FIT files record the activities of people using devices such as sports watches and bike head units.
There are all kinds of new and nice ways to write Perl code. Reading this article you'll learn about the new, experimental 'class' feature a bit more.
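For readers who have not tried it yet, here is a minimal sketch of what the experimental class feature looks like on Perl 5.38+ (illustrative, not taken from the linked article):
use v5.38;
use experimental 'class';

class Point {
    field $x :param = 0;
    field $y :param = 0;

    method describe { "($x, $y)" }
}

my $p = Point->new( x => 3, y => 4 );
say $p->describe;    # prints "(3, 4)"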
AWS DynamoDB is a fully managed NoSQL database service provided by AWS.
How do you manage the dependencies of a Perl module that depends on another perl module (distribution) developed in the same monorepo?
What happens when the same module is required both by your code and by one of your dependencies? Which version of that shared dependency will be installed?
Just a few days ago I asked on our Telegram channel which companies use Perl for application development and now I see this post. Apparently there is a new web-site listing 'Perl jobs'. I don't know who is behind that site, but the background image has 'use strict' and 'use warnings' so it isn't that bad.
A discussion with some good (and some bad) suggestions there.
The Weekly Challenge by Mohammad Sajid Anwar will help you step out of your comfort-zone. You can even win prize money of $50 by participating in the weekly challenge. We pick one champion at the end of the month from among all of the contributors during the month, thanks to the sponsor Lance Wicks.
Welcome to a new week with a couple of fun tasks "Replace all ?" and "Good String". If you are new to the weekly challenge then why not join us and have fun every week. For more information, please read the FAQ.
Enjoy a quick recap of last week's contributions by Team PWC dealing with the "Missing Integers" and "MAD" tasks in Perl and Raku. You will find plenty of solutions to keep you busy.
Takes advantage of Perl's O(1) hash lookups, which keeps the implementation readable and concise.
A compact, correct, and Perl-savvy set of solutions. The post’s minimalist style reflects a strong grasp of both Perl idioms and the challenge requirements.
A well-structured, educational post that highlights the expressive power of Raku while staying true to challenge constraints. The inclusion of verbose output, clean modularity, and idiomatic constructs makes it an excellent read for both Raku learners and seasoned scripters.
Offers elegant solutions in Raku and also provides working Perl equivalents. It balances code with commentary.
A technically sharp, creative post—especially with the use of PDL.
Practical, ready-to-run code with clear explanation and includes both straightforward Perl and more advanced PDL solutions.
It provides concise and elegant solutions to both challenges with clear explanations of the reasoning behind them. The use of Perl builtins (e.g., map, grep, keys) and smart use of hashes is idiomatic and effective.
It demonstrates good teaching style by explaining the problem, providing example inputs/outputs, and showing step-by-step approaches. The inclusion of multiple languages (Raku, Perl, Python, Elixir) is very valuable for readers interested in cross-language algorithm implementations.
Both solutions are clean, efficient, and well-documented. It prioritises readability over one-liners—a good choice for maintainability.
Using none to check if a number is missing is idiomatic and easy to read. Setting $"=', ' for array printing is a good touch.
Away week, still got Rust for us. As always, a detailed and descriptive approach makes the reading fun.
This post is a well-written, thoughtful exploration of two classic array problems, each solved elegantly in both Python and Perl, with a particular focus on bitarray usage and iteration techniques.
A section for newbies and for people who need some refreshing of their Perl knowledge. If you have questions or suggestions about the articles, let me know and I'll try to make the necessary changes. The included articles are from the Perl Maven Tutorial and are part of the Perl Maven eBook.
Great CPAN modules released last week.
July 9, 2025
You joined the Perl Weekly to get weekly e-mails about the Perl programming language and related topics.
Want to see more? See the archives of all the issues.
Not yet subscribed to the newsletter? Join us free of charge!
(C) Copyright Gabor Szabo
The articles are copyright the respective authors.
Published on Monday 30 June 2025 00:00
Published on Monday 30 June 2025 00:00
In this multipart series we will explore the benefits (and pitfalls) of vibe coding a Perl interface to an external (or foreign) library through a large language model. Those of you who follow me on X/Twitter (as @ChristosArgyrop and @ArgyropChristos), Bluesky , mast.hpc, mstdn.science, mstdn.social know that I have been very critical of the hype behind AI and the hallucinations of both models and the human AI influencers (informally known as botlickers in some corners of the web).
However, there are application areas of vibe coding with AI, e.g. semi-automating the task of creating API from one one language to another, in which the chatbots may actually deliver well and act as productivity boosters. In this application area, we will be leveraging the AI tools as more or less enhanced auto-complete tools that can help a developer navigate less familiar, more technical and possibly more boring aspects of the target language’s ‘guts’. If AI were to deliver in this area, then meaningless language wars can be averted, and wrong (at least in my opinion) reasons to prefer one language, i.e. the availability of some exotic library, may be avoided.
For my foray in this area, I chose to interface to the library Bit that I wrote to support text fingerprinting for some research applications. The library based on David Hanson’s Bit_T library discussed in Chapter 13 of “C Interfaces and Implementations” has been extended to incorporate additional operations on bitsets (such as counts on unions/differences/intersections of sets) and fast population counts in both CPU and GPU (with TPU implementations coming down the road). Hence, this is a test case that can illustrate the utility of Perl in using code that executes transparently in various hardware assets. Similar to Hanson’s original implementation (after all, my work is based on his!) the library interface is implemented to an Abstract Data Type in C; a crucial aspect of the implementation is to manage memory and avoid leaks without looking (at all, or as little as possible!) under the hood. For our experiments we will use Claude 3.7 Thinking Sonnet through the Github Copilot vscode interface. This is going to be a multipart series post that will be published throughout the summer. Our focus will be on interactions between me and the bot, and in particular critiquing the responses it has given me for both high level (e.g. choice of approach) and technical aspects of the project.
For the prompt I provided the “bit.c”, “bit.h”, makefile and the readme file of the Bit github repository as context to Claude and then I issued the following:
Look at the bit.h file that defines two abstract data types Bit_T and Bit_T_DB
and their interfaces in C. I would like to wrap these interfaces in Perl.
The build process will consist of using the Alien package in Perl to make bit available to Perl.
Explore possible options for the creation of the interface:
A. Creation of the bit C library :
1) compiling bit.c and bit.h as dynamic libraries
2) compiling bit.c nd bit.h into a static library
B. Interfacing with the bit C library from Perl:
1) using SWIG i files
2) using Platypus::FFI
3) using Inline::C
4) using XS
Since you are an expert with deep knowledge of the Perl ecosystem, consider pros and cons of
each approach. and then recommend the optimal Consider lightweight options for encapsulating
the Bit_T and Bit_T_DB abstract data types as Perl objects.
Provide wrapping recommendations targetting a beginner, intermediate and experienced Perl
programmer since those may differ. When providing recommendations , consider the dependencies
that are required for each implementation and highlight them in your analysis.
Once you have provided the recommendation, provide the implementation of the wrapper for
these 3 programmer levels. To reduce the token output (and your risk of hallucinating),
provide a full implementation of the wrapper for the beginner level and only the
changes (if any) that are required for the intermediate and advanced programmer levels.
In my prompt specification, I clearly specified that
Note that I gave the bot considerable freedom over the “low” level aspects of wrapping (type of library, choice of the cross-language interface), but restricted it in terms of how I wanted the library served. Furthermore, I guided it to consider alternative viewpoints and level of expertise, providing a unique opportunity to test Claude’s representation1 of conversations about wrappers. Finally, I alerted the chatbot to consider the token-sequence length dependent risk of hallucinations and tried to limit the length of the output of the alternative implementations.
The output based on this prompt is shown below (inside the horizontal lines); the formatting was also part of the output
Let me analyze the options for creating a Perl interface to the Bit_T and Bit_T_DB C library and provide implementation recommendations for programmers at different experience levels.
I see the Makefile already creates a dynamic library (libbit.so
) in the build
directory with this rule:
$(TARGET): $(OBJ)
$(CC) $(CFLAGS) -shared -o $@ $^ $(LDFLAGS)
Pros:
Cons:
To compile as a static library, you would modify the Makefile:
TARGET = $(BUILD_DIR)/libbit.a
$(TARGET): $(OBJ)
ar rcs $@ $^
Pros:
Cons:
Pros:
Cons:
Pros:
Cons:
Pros:
Cons:
Pros:
Cons:
FFI::Platypus provides the easiest approach for beginners with minimal C knowledge. It works with your existing dynamic library build.
Inline::C offers more control for intermediate users who understand some C code but want to avoid complex build processes.
XS provides the highest performance and greatest control, suitable for experienced Perl programmers.
Let’s create the Alien::Bit module first to handle library discovery, then implement Bit::Set and Bit::Set::DB modules using FFI::Platypus.
# /media/chrisarg/Software-Dev/Bit/perl-wrapper/lib/Alien/Bit.pm
package Alien::Bit;
use strict;
use warnings;
use parent 'Alien::Base';
use File::Spec;
sub dynamic_libs {
my $class = shift;
# Try to find the library in common locations
my @potential_paths = (
# From the build directory
'/media/chrisarg/Software-Dev/Bit/bit/build/libbit.so',
# System paths
'/usr/local/lib/libbit.so',
'/usr/lib/libbit.so',
# Environment variable if set
$ENV{BIT_LIB_PATH} ? File::Spec->catfile($ENV{BIT_LIB_PATH}, 'libbit.so') : ()
);
foreach my $path (@potential_paths) {
return $path if -f $path && -r $path;
}
die "Could not find libbit.so library. Please set BIT_LIB_PATH environment variable to its location.\n";
}
sub include_dirs {
return (
'/media/chrisarg/Software-Dev/Bit/bit/include',
'/usr/local/include',
'/usr/include',
$ENV{BIT_INCLUDE_PATH} || ()
);
}
1;
# /media/chrisarg/Software-Dev/Bit/perl-wrapper/lib/Bit/Set.pm
package Bit::Set;
use strict;
use warnings;
use FFI::Platypus;
use FFI::Platypus::Memory qw(malloc free);
use FFI::Platypus::Buffer qw(scalar_to_buffer buffer_to_scalar);
use Alien::Bit;
# Create FFI::Platypus object
my $ffi = FFI::Platypus->new(api => 1);
# Add path to our dynamic library
$ffi->lib(Alien::Bit->dynamic_libs);
# Define opaque types for our bitset pointers
$ffi->type('opaque' => 'Bit_T');
$ffi->type('opaque*' => 'Bit_T_Ptr');
# Define our Bit_T functions
$ffi->attach(Bit_new => ['int'] => 'Bit_T' => sub {
my ($xsub, $self, $length) = @_;
die "Length must be a positive integer" unless defined $length && $length > 0;
my $bit_set = $xsub->($length);
die "Failed to create bit set" unless $bit_set;
return bless { _handle => $bit_set }, $self;
});
$ffi->attach(Bit_free => ['Bit_T_Ptr'] => 'opaque' => sub {
my ($xsub, $self) = @_;
my $ptr = \$self->{_handle};
return $xsub->($ptr);
});
$ffi->attach(Bit_length => ['Bit_T'] => 'int' => sub {
my ($xsub, $self) = @_;
return $xsub->($self->{_handle});
});
$ffi->attach(Bit_count => ['Bit_T'] => 'int' => sub {
my ($xsub, $self) = @_;
return $xsub->($self->{_handle});
});
$ffi->attach(Bit_buffer_size => ['int'] => 'int');
$ffi->attach(Bit_bset => ['Bit_T', 'int'] => 'void' => sub {
my ($xsub, $self, $index) = @_;
die "Index must be non-negative" unless defined $index && $index >= 0;
$xsub->($self->{_handle}, $index);
});
$ffi->attach(Bit_bclear => ['Bit_T', 'int'] => 'void' => sub {
my ($xsub, $self, $index) = @_;
die "Index must be non-negative" unless defined $index && $index >= 0;
$xsub->($self->{_handle}, $index);
});
$ffi->attach(Bit_get => ['Bit_T', 'int'] => 'int' => sub {
my ($xsub, $self, $index) = @_;
die "Index must be non-negative" unless defined $index && $index >= 0;
return $xsub->($self->{_handle}, $index);
});
$ffi->attach(Bit_set => ['Bit_T', 'int', 'int'] => 'void' => sub {
my ($xsub, $self, $lo, $hi) = @_;
die "Low index must be non-negative" unless defined $lo && $lo >= 0;
die "High index must be greater than or equal to low index" unless defined $hi && $hi >= $lo;
$xsub->($self->{_handle}, $lo, $hi);
});
$ffi->attach(Bit_clear => ['Bit_T', 'int', 'int'] => 'void' => sub {
my ($xsub, $self, $lo, $hi) = @_;
die "Low index must be non-negative" unless defined $lo && $lo >= 0;
die "High index must be greater than or equal to low index" unless defined $hi && $hi >= $lo;
$xsub->($self->{_handle}, $lo, $hi);
});
# Comparison operations
$ffi->attach(Bit_eq => ['Bit_T', 'Bit_T'] => 'int' => sub {
my ($xsub, $self, $other) = @_;
die "Other bitset must be a Bit::Set object" unless ref $other eq ref $self;
return $xsub->($self->{_handle}, $other->{_handle});
});
# Set operations
$ffi->attach(Bit_union => ['Bit_T', 'Bit_T'] => 'Bit_T' => sub {
my ($xsub, $self, $other) = @_;
die "Other bitset must be a Bit::Set object" unless ref $other eq ref $self;
my $result_handle = $xsub->($self->{_handle}, $other->{_handle});
return bless { _handle => $result_handle }, ref $self;
});
$ffi->attach(Bit_inter => ['Bit_T', 'Bit_T'] => 'Bit_T' => sub {
my ($xsub, $self, $other) = @_;
die "Other bitset must be a Bit::Set object" unless ref $other eq ref $self;
my $result_handle = $xsub->($self->{_handle}, $other->{_handle});
return bless { _handle => $result_handle }, ref $self;
});
# Count operations
$ffi->attach(Bit_inter_count => ['Bit_T', 'Bit_T'] => 'int' => sub {
my ($xsub, $self, $other) = @_;
die "Other bitset must be a Bit::Set object" unless ref $other eq ref $self;
return $xsub->($self->{_handle}, $other->{_handle});
});
# Constructor and destructor
sub new {
my ($class, $length) = @_;
return $class->Bit_new($length);
}
sub DESTROY {
my ($self) = @_;
$self->Bit_free() if defined $self->{_handle};
}
# Convenient accessor methods
sub length {
my ($self) = @_;
return $self->Bit_length();
}
sub count {
my ($self) = @_;
return $self->Bit_count();
}
sub get {
my ($self, $index) = @_;
return $self->Bit_get($index);
}
sub set {
my ($self, $index) = @_;
$self->Bit_bset($index);
return $self;
}
sub clear {
my ($self, $index) = @_;
$self->Bit_bclear($index);
return $self;
}
sub set_range {
my ($self, $lo, $hi) = @_;
$self->Bit_set($lo, $hi);
return $self;
}
sub clear_range {
my ($self, $lo, $hi) = @_;
$self->Bit_clear($lo, $hi);
return $self;
}
sub equals {
my ($self, $other) = @_;
return $self->Bit_eq($other) ? 1 : 0;
}
sub union {
my ($self, $other) = @_;
return $self->Bit_union($other);
}
sub intersection {
my ($self, $other) = @_;
return $self->Bit_inter($other);
}
sub intersection_count {
my ($self, $other) = @_;
return $self->Bit_inter_count($other);
}
1;
# /media/chrisarg/Software-Dev/Bit/perl-wrapper/lib/Bit/Set/DB.pm
package Bit::Set::DB;
use strict;
use warnings;
use FFI::Platypus;
use FFI::Platypus::Memory qw(malloc free);
use FFI::Platypus::Buffer qw(scalar_to_buffer buffer_to_scalar);
use Bit::Set;
use Alien::Bit;
# Create FFI::Platypus object
my $ffi = FFI::Platypus->new(api => 1);
# Add path to our dynamic library
$ffi->lib(Alien::Bit->dynamic_libs);
# Define opaque types for our bitset pointers
$ffi->type('opaque' => 'Bit_T');
$ffi->type('opaque' => 'Bit_T_DB');
$ffi->type('opaque*' => 'Bit_T_DB_Ptr');
# Define struct for SETOP_COUNT_OPTS
$ffi->type('record(SETOP_COUNT_OPTS)' => [
num_cpu_threads => 'int',
device_id => 'int',
upd_1st_operand => 'bool',
upd_2nd_operand => 'bool',
release_1st_operand => 'bool',
release_2nd_operand => 'bool',
release_counts => 'bool'
]);
# Define Bit_T_DB functions
$ffi->attach(BitDB_new => ['int', 'int'] => 'Bit_T_DB' => sub {
my ($xsub, $self, $length, $num_of_bitsets) = @_;
die "Length must be a positive integer" unless defined $length && $length > 0;
die "Number of bitsets must be a positive integer" unless defined $num_of_bitsets && $num_of_bitsets > 0;
my $db = $xsub->($length, $num_of_bitsets);
die "Failed to create bitset DB" unless $db;
return bless { _handle => $db }, $self;
});
$ffi->attach(BitDB_free => ['Bit_T_DB_Ptr'] => 'opaque' => sub {
my ($xsub, $self) = @_;
my $ptr = \$self->{_handle};
return $xsub->($ptr);
});
$ffi->attach(BitDB_length => ['Bit_T_DB'] => 'int' => sub {
my ($xsub, $self) = @_;
return $xsub->($self->{_handle});
});
$ffi->attach(BitDB_nelem => ['Bit_T_DB'] => 'int' => sub {
my ($xsub, $self) = @_;
return $xsub->($self->{_handle});
});
$ffi->attach(BitDB_count_at => ['Bit_T_DB', 'int'] => 'int' => sub {
my ($xsub, $self, $index) = @_;
die "Index must be non-negative" unless defined $index && $index >= 0;
return $xsub->($self->{_handle}, $index);
});
$ffi->attach(BitDB_clear => ['Bit_T_DB'] => 'void' => sub {
my ($xsub, $self) = @_;
$xsub->($self->{_handle});
});
$ffi->attach(BitDB_clear_at => ['Bit_T_DB', 'int'] => 'void' => sub {
my ($xsub, $self, $index) = @_;
die "Index must be non-negative" unless defined $index && $index >= 0;
$xsub->($self->{_handle}, $index);
});
$ffi->attach(BitDB_get_from => ['Bit_T_DB', 'int'] => 'Bit_T' => sub {
my ($xsub, $self, $index) = @_;
die "Index must be non-negative" unless defined $index && $index >= 0;
my $bit_handle = $xsub->($self->{_handle}, $index);
return bless { _handle => $bit_handle }, 'Bit::Set';
});
$ffi->attach(BitDB_put_at => ['Bit_T_DB', 'int', 'Bit_T'] => 'void' => sub {
my ($xsub, $self, $index, $bitset) = @_;
die "Index must be non-negative" unless defined $index && $index >= 0;
die "Bitset must be a Bit::Set object" unless ref $bitset eq 'Bit::Set';
$xsub->($self->{_handle}, $index, $bitset->{_handle});
});
# CPU-specific intersection count function
$ffi->attach(BitDB_inter_count_cpu => ['Bit_T_DB', 'Bit_T_DB', 'SETOP_COUNT_OPTS'] => 'int*' => sub {
my ($xsub, $self, $other, $opts) = @_;
die "Other must be a Bit::Set::DB object" unless ref $other eq ref $self;
$opts ||= {
num_cpu_threads => 1,
device_id => -1,
upd_1st_operand => 0,
upd_2nd_operand => 0,
release_1st_operand => 0,
release_2nd_operand => 0,
release_counts => 0
};
return $xsub->($self->{_handle}, $other->{_handle}, $opts);
});
# Constructor and destructor
sub new {
my ($class, $length, $num_of_bitsets) = @_;
return $class->BitDB_new($length, $num_of_bitsets);
}
sub DESTROY {
my ($self) = @_;
$self->BitDB_free() if defined $self->{_handle};
}
# Convenient accessor methods
sub length {
my ($self) = @_;
return $self->BitDB_length();
}
sub num_of_bitsets {
my ($self) = @_;
return $self->BitDB_nelem();
}
sub count_at {
my ($self, $index) = @_;
return $self->BitDB_count_at($index);
}
sub clear {
my ($self) = @_;
$self->BitDB_clear();
return $self;
}
sub clear_at {
my ($self, $index) = @_;
$self->BitDB_clear_at($index);
return $self;
}
sub get_bitset {
my ($self, $index) = @_;
return $self->BitDB_get_from($index);
}
sub put_bitset {
my ($self, $index, $bitset) = @_;
$self->BitDB_put_at($index, $bitset);
return $self;
}
sub intersection_count_cpu {
my ($self, $other, $opts) = @_;
return $self->BitDB_inter_count_cpu($other, $opts);
}
1;
#!/usr/bin/perl
use strict;
use warnings;
use Bit::Set;
use Bit::Set::DB;
# Create a new bitset with 1024 bits
my $bitset1 = Bit::Set->new(1024);
my $bitset2 = Bit::Set->new(1024);
# Set some bits
$bitset1->set(42);
$bitset1->set(100);
$bitset2->set(42);
$bitset2->set(200);
# Check if a bit is set
print "Bit 42 in bitset1: ", $bitset1->get(42), "\n";
print "Bit 100 in bitset1: ", $bitset1->get(100), "\n";
print "Bit 200 in bitset1: ", $bitset1->get(200), "\n";
# Count the number of bits set
print "Number of bits set in bitset1: ", $bitset1->count(), "\n";
print "Number of bits set in bitset2: ", $bitset2->count(), "\n";
# Calculate intersection count
my $intersection_count = $bitset1->intersection_count($bitset2);
print "Intersection count: $intersection_count\n";
# Create a new bitset as the union of the two bitsets
my $union = $bitset1->union($bitset2);
print "Number of bits set in union: ", $union->count(), "\n";
# Create a BitDB with 3 bitsets of length 1024
my $db = Bit::Set::DB->new(1024, 3);
# Put our bitsets into the DB
$db->put_bitset(0, $bitset1);
$db->put_bitset(1, $bitset2);
$db->put_bitset(2, $union);
# Count bits in each bitset in the DB
print "Bits set in DB at index 0: ", $db->count_at(0), "\n";
print "Bits set in DB at index 1: ", $db->count_at(1), "\n";
print "Bits set in DB at index 2: ", $db->count_at(2), "\n";
For intermediate-level Perl programmers, here are the changes required to implement the wrapper using Inline::C:
# Bit::Set using Inline::C
package Bit::Set;
use strict;
use warnings;
use Inline C => Config =>
BUILD_NOISY => 1,
CLEAN_AFTER_BUILD => 0,
LIBS => '-L/media/chrisarg/Software-Dev/Bit/bit/build -lbit',
INC => '-I/media/chrisarg/Software-Dev/Bit/bit/include';
use Inline C => <<'END_C';
#include "bit.h"
typedef struct Bit_T* Bit_T;
// Wrapper functions
Bit_T create_bitset(int length) {
return Bit_new(length);
}
void free_bitset(Bit_T set) {
Bit_free(&set);
}
int get_length(Bit_T set) {
return Bit_length(set);
}
int get_count(Bit_T set) {
return Bit_count(set);
}
void set_bit(Bit_T set, int index) {
Bit_bset(set, index);
}
void clear_bit(Bit_T set, int index) {
Bit_bclear(set, index);
}
int get_bit(Bit_T set, int index) {
return Bit_get(set, index);
}
// Add more wrapper functions as needed
END_C
# Perl OO interface
sub new {
my ($class, $length) = @_;
my $handle = create_bitset($length);
return bless { _handle => $handle }, $class;
}
# Add more methods similar to the FFI version
For advanced Perl programmers, here are the key changes required to implement using XS:
# Bit::Set.xs
#include "EXTERN.h"
#include "perl.h"
#include "XSUB.h"
#include "bit.h"
typedef struct Bit_T* Bit_T;
typedef struct Bit_T_DB* Bit_T_DB;
MODULE = Bit::Set PACKAGE = Bit::Set
PROTOTYPES: DISABLE
Bit_T
Bit_new(length)
int length
void
Bit_free(set)
Bit_T set
CODE:
Bit_free(&set);
int
Bit_length(set)
Bit_T set
int
Bit_count(set)
Bit_T set
void
Bit_bset(set, index)
Bit_T set
int index
# Add more functions as needed
And update the Makefile.PL:
use ExtUtils::MakeMaker;
WriteMakefile(
NAME => 'Bit::Set',
VERSION_FROM => 'lib/Bit/Set.pm',
LIBS => ['-L/media/chrisarg/Software-Dev/Bit/bit/build -lbit'],
INC => '-I/media/chrisarg/Software-Dev/Bit/bit/include',
OBJECT => '$(O_FILES)',
);
For most users, especially beginners, the FFI::Platypus approach offers the best balance of simplicity, performance, and maintainability. It works well with your existing dynamic library build process and provides a clean, object-oriented interface to the Bit_T and Bit_T_DB types. The implementation provided covers all the core functionality while being extensible for additional features as needed.
In this first post, we will not consider the implementation(s), which will be covered in separate posts; instead, we will restrict ourselves to Claude’s output with respect to the other questions. I give Claude bonus points for clearly identifying the niche for each potential choice:
However:
I alerted the bot to the (substantial) risk of hallucinations and decreased
Note: I did not use the word understanding, as I do not think that LLMs can understand; they are merely noisy statistical pattern generators that can be tasked with creating rough solutions for refining. ↩
Published on Sunday 29 June 2025 12:09
The examples used here are from the weekly challenge problem statement and demonstrate the working solution.
You are given an array of n integers. Write a script to find all the missing integers in the range 1..n in the given array.
The core of the solution is a single subroutine, and the resulting code fits in a single file.
The approach we take is to use the given array as hash keys. Then we’ll iterate over the range 1..n and see which hash keys are missing.
sub find_missing{
    my %h = ();
    my @missing = ();
    do{ $h{$_} = -1 } for @_;
    @missing = grep { !exists($h{$_}) } 1 .. @_;
    return @missing;
}
◇
Fragment referenced in 1.
Just to make sure things work as expected we’ll define a few short tests.
MAIN:{
    say q/(/ . join(q/, /, find_missing 1, 2, 1, 3, 2, 5) . q/)/;
    say q/(/ . join(q/, /, find_missing 1, 1, 1) . q/)/;
    say q/(/ . join(q/, /, find_missing 2, 2, 1) . q/)/;
}
◇
Fragment referenced in 1.
$ perl perl/ch-1.pl
(4, 6)
(2, 3)
(3)
You are given an array of distinct integers. Write a script to find all pairs of elements with minimum absolute difference (MAD) of any two elements.
We’ll use a hash-based approach like we did in Part 1. The amount of code is small, just a single subroutine.
Since we need a nested loop to access all pairs, we’ll make an effort to only do it once. What we’ll do is store the pairs in a list keyed by the differences. We’ll also track the minimum difference in a variable to avoid sorting to find it later.
sub mad_pairs{
    my %mad = ();
    my $mad = ~0;
    for my $i (0 .. @_ - 1){
        for my $j ($i + 1 .. @_ - 1){
            my $d = abs($_[$i] - $_[$j]);
            $mad = $d if $d < $mad;
            push @{$mad{$d}}, [$_[$i], $_[$j]];
        }
    }
    return @{$mad{$mad}};
}
◇
Fragment referenced in 4.
The main section is just some basic tests. Yeah, we’ll do lazy string formatting with chop!
MAIN:{
    my $s = q//;
    do{
        $s .= q/[/ . join(q/, /, @{$_}) . q/], /;
    } for mad_pairs 4, 1, 2, 3;
    chop $s;
    chop $s;
    say $s;
    $s = q//;
    do{
        $s .= q/[/ . join(q/, /, @{$_}) . q/], /;
    } for mad_pairs 1, 3, 7, 11, 15;
    chop $s;
    chop $s;
    say $s;
    $s = q//;
    do{
        $s .= q/[/ . join(q/, /, @{$_}) . q/], /;
    } for mad_pairs 1, 5, 3, 8;
    chop $s;
    chop $s;
    say $s;
    $s = q//;
}
◇
Fragment referenced in 4.
$ perl perl/ch-2.pl
[4, 3], [1, 2], [2, 3]
[1, 3]
[1, 3], [5, 3]
Published by Ae Ep on Sunday 29 June 2025 03:17
I am trying to find a few distinct byte sequences within a large file using Perl's pack function and reading the file in binary mode.
An example of what I am hoping to achieve is to find sequences like:
00 03 76 72 B0 04
64 2E 92 10 02 05
9F 3D 20 1A 1A DD
I've put together something that is working using
...
binmode($fh)
...
$/ = pack ("C*", (0x00, 0x03, 0x76, 0x72, 0xB0, 0x04));
while ($data = <$fh>) {
#...
}
After getting all occurrences I add them to an array. Next, I reset the file position back to 0 and repeat the same steps on the same file with the next byte pattern. This is tedious and inefficient.
I have already tried using the pack function, but it doesn't seem possible to use "multiple delimiters" while seeking through a file. I expect to search for 2 to 5 byte sequences in one "go", but would like some advice on the most efficient way to do this.
Previous questions relate more to strings than to byte sequences searched for in binary mode.
Side question: Is there a way to add a logical operator to match similar sequences that differ by 1 byte using the pack function? e.g.
$/ = pack ("C*", (0x00, 0x03, 0x76, 0x72, 0x10, 0x04 or 0x01));
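One way to avoid re-reading the file for every pattern (a sketch, not the asker's code; the file name is hypothetical) is to slurp the file in raw mode and let a single regex alternation find all the sequences in one pass. For files too large to hold in memory you would read overlapping chunks instead, but the matching idea is the same. The side question can be handled with a character class in place of the fixed byte:
use strict;
use warnings;

my $file = 'data.bin';    # hypothetical input file
open my $fh, '<:raw', $file or die "open '$file': $!";
my $data = do { local $/; <$fh> };    # slurp the whole file as bytes
close $fh;

# the example sequences from the question
my @sequences = (
    pack("C*", 0x00, 0x03, 0x76, 0x72, 0xB0, 0x04),
    pack("C*", 0x64, 0x2E, 0x92, 0x10, 0x02, 0x05),
    pack("C*", 0x9F, 0x3D, 0x20, 0x1A, 0x1A, 0xDD),
);

# one alternation, so every sequence is found in a single pass
my $pattern = join '|', map { quotemeta } @sequences;
while ($data =~ /($pattern)/gs) {
    printf "found %s at offset %d\n", unpack("H*", $1), $-[0];
}

# side question: let one byte vary by using a character class for it
my $fuzzy = quotemeta(pack "C*", 0x00, 0x03, 0x76, 0x72, 0x10) . '[\x01\x04]';
while ($data =~ /$fuzzy/gs) {
    printf "fuzzy match at offset %d\n", $-[0];
}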
Q: What is the State of the Art for creating Artificial Neural Networks with Perl?
Why would I want to use an ANN in the first place? Well, maybe I have some crime/unusual incident data that I want to correlate with the Phases of the Moon to test the Lunar Effect, but the data is noisy, the effect is non-linear or confounded by weather. For whatever reason you want to “learn” a general pattern going from input to output, neural networks are one more method in your data science toolbox.
A search of CPAN for Neural Networks yields one page of results for you to sift through. The back propagation algorithm is a nice exercise in programming and it attracted a few attempts at the beginning of the century, starting with Statistics::LTU in 1997 before there was an AI namespace in CPAN. Neural networks then get their own namespace, leading to AI::NeuralNet::BackProp, AI::NeuralNet::Mesh, AI::NeuralNet::Simple (for those wanting a gentle introduction to AI). Perl isn’t one for naming rigidity, so there’s also AI::Perceptron, AI::NNFlex, AI::NNEasy and AI::Nerl::Network (love the speeling). AI::LibNeural is the first module in this list to wrap an external C++ library for use with Perl.
Most of these have been given the thumbs up (look for ++ icons near the name) by interested Perl users to indicate that they’ve been of some use to them. It means the documentation is there, it installs, and it works for them. Is it right for you? NeilB puts a lot of work into his reviews, but hasn’t scratched the AI itch yet, so I’ll have to give one a try.
Sometimes trawling the CPAN dredges up interesting results you weren’t thinking about. I had no idea we had AI::PSO for running Particle Swarm Optimizations, AI::DecisionTree or AI::Categorizer to help with categorization tasks and AI::PredictionClient for TensorFlow Serving. Maybe I’ll come back to these one day. Searching specifically for [Py]Torch gets you almost nothing, but I did find AI::TensorFlow::Libtensorflow which provides bindings for the libtensorflow
deep learning library.
A flexible and efficient library for Deep Learning
AI::MXNet gets lots of love from users (not surprising given the popularity of convolutional neural networks). With a recent update for recurrent neural networks (RNN) in June 2023 and the weight of an Apache project behind the underlying library, it should be the obvious choice. But checking out the project page and decision-making disaster strikes!
MXNet had a lot of work on it, but then was retired in Sep 2023 because the Project Management Committee were unresponsive over several months, having uploaded their consciousnesses to a datacube in Iceland or maybe they just went on to other things because of … reasons.
It should still be perfectly fine to use. That Apache project had 87 contributors, so I expect it to be feature-rich and generally bug-free. Any bugs in the Perl module could be reported/fixed and you always have the source code for the library to hack on to suit your needs. I’ll skip it this time because I’m really only after a simple ANN, not the whole Deep Learning ecosystem, and I couldn’t find the package in the Fedora repository (adding the extra friction of building it myself).
A Fast Artificial Neural Network
FANN has been around for over 15 years and is generally faster to train and run than either TensorFlow or PyTorch. Its speed and lightweight nature make it ideal for embedded systems. Its smaller community may have an impact on your choice. From my 10-minute inspection, AI::FANN seemed the easiest to get up to speed with. It had a short, simple example at the top of the docs that I could understand and run without much fuss.
In contrast, AI::MXNet leads with a Convolutional Neural Net (CNN) for recognizing hand-written digits in the MNIST dataset. It gives you a feel for the depth of the feature set, at the risk of intimidating the casual reader. Mind you, if I was looking for image classification (where CNNs shine) or treating history as an input (using RNNs as mentioned above), I’d put the time in going through AI::MXNet.
The downside to the original FANN site is that the documentation consists of a series of blog posts that tell you all the things you can do, but not how to do them. Your best bet is to read the examples’ source code, like all the other C programmers out there.
Installation was easy. You just need the FANN development libraries (header files, etc.) and the Perl module that interfaces with them. You could build from source or get libfann-dev on Ubuntu. For me on Fedora, it was just a matter of
dnf install fann-devel
cpanm AI::FANN
(See Tools for using cpanm)
To get started, I tried out the XOR example in the docs. XOR is a classic example of how a multi-layered perceptron (MLP) can tackle problems that are not linearly separable. The hidden layers of the MLP can solve problems inaccessible to single layer perceptrons. It gave me confidence in using a data structure to initialize the network and importing data from a file. An hour later, I was already scratching the itch that drew me to neural networks in the first place.
A nice introduction is FANN’s step-by-step guide which will take you through a bit about learning rates and activation functions as you consider how to build and tweak your first neural network. There are few heuristics to go by, so just start playing around until you get a result.
Be careful that too many neurons in the hidden layers will lead to overfitting of your data. You’ll end up with a network that can reproduce the training data perfectly, but fail to learn the underlying signal you wanted to discover. You might start with something between the number of input and output neurons. And be aware that machine learning algorithms are data-hungry.
Activation functions can affect how long it takes to train your network. Previous experience with other neural network tools way back in 2005 taught us the importance of normalizing the input, ideally to a range of [-1, 1], because most of the training time was spent just adjusting the weights to the point where the real learning could begin. Use your own judgement.
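As a concrete illustration of that normalization step (a minimal sketch, not part of AI::FANN itself; the data and subroutine name are made up), each input column can be rescaled to [-1, 1] before it is written to the training file:
use strict;
use warnings;
use List::Util qw(min max);

# scale every input column to the range [-1, 1]
# @rows is an array of arrayrefs of raw input values
sub normalize_columns {
    my @rows = @_;
    for my $c (0 .. $#{ $rows[0] }) {
        my @col = map { $_->[$c] } @rows;
        my ($lo, $hi) = (min(@col), max(@col));
        my $span = ($hi - $lo) || 1;                        # avoid division by zero
        $_->[$c] = 2 * ($_->[$c] - $lo) / $span - 1 for @rows;
    }
    return @rows;
}

my @scaled = normalize_columns([0, 10], [5, 20], [10, 30]);
print "@$_\n" for @scaled;    # each column now runs from -1 to 1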
While we see the train_on_data and run methods in the example, you have to look down in the docs for the test method, which you’ll need to evaluate the trained network. The MSE method will tell you the Mean Squared Error for your model; lower values are better. There’s no documentation for it yet, but it should do what it says on the tin.
A network that gives you rubbish is no good, so we need to evaluate how well it has learned on the training data. The usual process is to split the dataset into training and testing sets, reserving 20-30% of the data for testing. Once the network has finished training, its weights are fixed and then run on the testing data with the network’s output compared with the expected output given in the dataset.
Cross-validation is another popular method of evaluation, splitting the dataset into 10 subsets where you train on 9 sets and test on the 10th, rotating the sets to improve the network’s response. Once you are satisfied with the performance of your network, you are ready to run it on live data. Just remember to sanity check the results while you build trust in the responses.
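For the plain train/test split described above, a minimal sketch (the 80/20 ratio and the toy data are only examples) could look like this:
use strict;
use warnings;
use List::Util qw(shuffle);

# @dataset is an array of [ \@inputs, \@outputs ] pairs, as in the XOR example
my @dataset = ( [[-1, -1], [-1]], [[-1, 1], [1]], [[1, -1], [1]], [[1, 1], [-1]] );

my @shuffled = shuffle @dataset;          # randomise before splitting
my $cut      = int(0.8 * @shuffled);      # keep 80% for training
my @train    = @shuffled[0 .. $cut - 1];
my @test     = @shuffled[$cut .. $#shuffled];

printf "training on %d rows, testing on %d rows\n", scalar @train, scalar @test;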
Going back every time and manually creating networks with different sizes of layers sounds tedious. Ideally, I’d have a script that takes the network layers and sizes as arguments and returns the evaluation score. Couple this with the Minion job queue from Mojolicious (it’s nice!) and you’d have a great tool for finding the best available neural network for the given data while you’re doing other things.
The one thing not easy to find on the website is the file format specification for the data files, so this is what I worked out. They are space-separated files of integers or floats, like this:
Number_of_runs Number_of_inputs Number_of_outputs
Input row 1
Output row 1
Input row 2
Output row 2
...
This is a script that will turn an array of arrayrefs from the XOR example into the file format used by libfann.
use v5.24; # postfix dereferencing is cool

my @xor_data = ( [ [-1, -1], [-1] ],
                 [ [-1,  1], [ 1] ],
                 [ [ 1, -1], [ 1] ],
                 [ [ 1,  1], [-1] ] );

write_datafile('xor.data', @xor_data);

sub write_datafile {
    my ($filename, @data) = @_;
    open my $fh, '>', $filename;
    my ($in, $out) = $data[0]->@*;
    say $fh join q{ }, scalar @data, scalar @$in, scalar @$out;
    for my $test (@data) {
        say $fh join q{ }, $test->[0]->@*;
        say $fh join q{ }, $test->[1]->@*;
    }
    close $fh;
}
Have you used any of these modules? Share your experience to help the next person choose. Have I missed anything or got something wrong? Let us know in the comments below.
Image credit: “Perceptron” by fdecomite is licensed under CC BY 2.0
Published by Steve Waring on Saturday 28 June 2025 03:24
my $modifiers = 'i';
my $string = "abcabc\n";
$string =~ s/(?$modifiers)B/Z/;
print $string;
$modifiers = 'g';
$string =~ s/(?$modifiers)a/q/;
print $string;
The first regex works and changes the first b to Z.
The second regex gives:
Useless (?g) - use /g modifier in regex; marked by <-- HERE in m/(?g <-- HERE
Why is the error message saying it's in m/ when it is in s/?
I don't know in advance whether the /g is going to be needed. Is the only way to deal with this to remove the g from $modifiers and use an if statement to choose between two almost identical regex statements, one having a /g?
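One possible approach (a sketch, not from the question itself): since g is not valid as an inline (?...) modifier, strip it from the modifier string first and decide at run time whether to substitute globally:
use strict;
use warnings;

my $modifiers = 'ig';                    # e.g. supplied at run time
my $string    = "abcabc\n";

my $global = ($modifiers =~ s/g//g);     # remove 'g', remember whether it was there
my $prefix = length($modifiers) ? "(?$modifiers)" : '';

if ($global) {
    $string =~ s/${prefix}a/q/g;
}
else {
    $string =~ s/${prefix}a/q/;
}
print $string;                           # prints "qbcqbc" here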
Published by U. Windl on Friday 27 June 2025 22:59
I wrote some server providing XML output with some debugging comments, like this (just a short extract):
<Result>
<Channel>DB1.PagesFreeMinPct</Channel>
<Value>55.578</Value>
<CustomUnit>%</CustomUnit>
<!--BEG: thresholds-->
<!--(igoring value: end="~")-->
<LimitMinWarning>15</LimitMinWarning>
<!--(igoring value: end="~")-->
<LimitMinError>10</LimitMinError>
<!--END: thresholds-->
</Result>
When using the output in some software named PRTG, I got this error message:
XML Parser mismatch: Wanted </!--(igoring>, got
So obviously that "XML Parser" has no idea what XML comments are; so (while waiting for the product to be fixed some day) I decided to add a quick fix to my server that removes the comments (the comments are actually created by XSLT, which delivers the result in $XML).
Using $XML =~ s/<!--.*-->//g; I was able to remove the XML comments, but I still have lines containing only indent spaces.
So I tried to remove those lines consisting only of spaces with $XML =~ s/^\s+\n$//mg;, but that doesn't work.
Output would look like
<Result>
<Channel>DB1.PagesFreeMinPct</Channel>
<Value>55.578</Value>
<CustomUnit>%</CustomUnit>
<LimitMinWarning>15</LimitMinWarning>
<LimitMinError>10</LimitMinError>
</Result>
I guess my multi-line substitution is wrong.
Using $XML =~ s/^\s+$//mg;
instead still leaves some blank lines in the output.
<Result>
<Channel>DB1.PagesFreeMaxPct</Channel>
<Value>57.688</Value>
<CustomUnit>%</CustomUnit>
<LimitMinWarning>15</LimitMinWarning>
<LimitMinError>10</LimitMinError>
</Result>
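One way to get there (a sketch, not the poster's code; it assumes the comments never nest) is to make the comment match non-greedy and then delete whitespace-only lines in a second pass. The original s/^\s+\n$//mg only fires when the whitespace-only line is immediately followed by another line ending (an empty line or the end of the string), so most indented blank lines survive; a simpler line-oriented pattern does the job:
use strict;
use warnings;

# $XML would normally come from the XSLT step; a short literal stands in here
my $XML = <<'END_XML';
<Result>
  <Channel>DB1.PagesFreeMinPct</Channel>
  <!--BEG: thresholds-->
  <LimitMinWarning>15</LimitMinWarning>
  <!--END: thresholds-->
</Result>
END_XML

# non-greedy, /s so a comment may span lines and two comments on one line stay separate
$XML =~ s/<!--.*?-->//gs;

# now drop lines that contain nothing but spaces or tabs
$XML =~ s/^[ \t]*\n//mg;

print $XML;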
Published by Zilore Mumba on Friday 27 June 2025 16:01
I have data file1, which starts from the beginning of the data recording. Data file2 contains data from later than the beginning of the record but goes beyond file1. I am trying to update file1 so that it contains the whole record, from the beginning in file1 to the end of file2.
My attempt so far is not successful. Below are parts of the sample files
file1
'LIVING01' 2022 01 0.0 3.6 2.0 8.0 5.6 51.0 62.0 73.6 5.9 29.6 11.5 40.3 2.4 5.6 0.7 0.0 5.4 5.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 35.4 12.9 1.0 10.8 1.0 17.1
'LIVING01' 2022 05 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
'LIVING01' 2022 06 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 -99.0
'LIVING01' 2022 09 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 -99.0
'LIVING01' 2022 10 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
'LIVING01' 2023 02 8.3 0.0 0.0 3.0 11.7 0.0 0.0 0.0 1.9 0.0 0.0 0.0 2.8 1.2 0.0 3.9 32.3 72.8 14.1 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 -99.0 -99.0 -99.0
'LIVING01' 2023 06 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 -99.0
'LIVING01' 2023 08 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
'LIVING01' 2023 09 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 -99.0
'LIVING01' 2023 10 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.9 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 22.5 0.0
'LIVING01' 2023 11 0.0 0.0 0.0 0.0 0.0 3.3 0.0 0.0 0.0 0.0 0.0 0.0 0.0 27.9 0.0 0.0 0.0 0.0 0.0 0.5 3.0 0.8 0.0 2.5 11.0 1.0 0.0 0.0 0.0 0.5 -99.0
'LIVING01' 2023 12 0.0 0.0 0.0 0.0 0.0 0.0 4.1 0.0 0.5 0.0 0.0 0.0 5.5 9.1 1.5 0.0 0.0 0.0 0.0 0.0 0.0 4.6 5.9 0.0 3.2 57.0 44.1 0.0 0.0 0.0 0.0
file2
'LIVING01' 2023 11 0.0 0.0 0.0 0.0 0.0 3.3 0.0 0.0 0.0 0.0 0.0 0.0 0.0 27.9 0.0 0.0 0.0 0.0 0.0 0.5 3.0 0.8 0.0 2.5 11.0 1.0 0.0 0.0 0.0 0.5 -99.0
'LIVING01' 2023 12 0.0 0.0 0.0 0.0 0.0 0.0 4.1 0.0 0.5 0.0 0.0 0.0 5.5 9.1 1.5 0.0 0.0 0.0 0.0 0.0 0.0 4.6 5.9 0.0 3.2 57.0 44.1 0.0 0.0 0.0 0.0
'LIVING01' 2024 01 2.0 0.0 0.0 0.0 13.0 0.0 0.0 0.0 0.0 0.0 7.4 9.9 3.4 1.1 22.0 6.3 5.1 36.3 0.0 1.1 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
'LIVING01' 2024 02 1.8 0.0 0.0 0.0 0.0 0.0 16.5 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 4.8 0.0 0.0 0.0 -99.0 -99.0
'LIVING01' 2024 03 0.0 0.0 0.0 0.0 0.0 1.9 5.9 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 14.6 0.0 1.0 0.0
'LIVING01' 2024 04 0.0 0.0 0.0 0.0 0.0 9.0 20.5 10.4 4.6 9.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 -99.0
'LIVING01' 2024 05 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
'LIVING01' 2024 06 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 -99.0
'LIVING01' 2024 07 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
'LIVING01' 2024 08 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
'LIVING01' 2024 09 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 -99.0
'LIVING01' 2024 10 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
'LIVING01' 2024 11 0.0 0.0 0.0 0.0 0.0 0.0 5.4 0.0 0.0 0.0 2.5 0.0 6.6 0.0 18.0 6.2 0.0 0.0 0.0 0.0 0.0 4.5 0.0 0.0 2.0 0.0 0.0 0.0 0.0 0.0 -99.0
'LIVING01' 2024 12 0.0 0.0 0.0 0.0 0.0 0.0 0.9 0.0 0.0 2.0 0.0 0.0 0.0 0.0 13.0 3.0 0.0 1.5 0.0 0.0 0.0 12.9 0.0 0.0 0.0 0.2 18.8 0.0 2.0 4.5 10.0
In my code below, I am able to write file1 to a temporary file, then I get the line number of the last line in file1. I then try to skip all lines in file2, up to the last line number in file1.
open(my $FILE1, "< file1") or die "open 'file1': $! ($^E)";
open(my $TEMPF, "> tempfile.txt") or die "open 'tempfile.txt': $! ($^E)";
while (my $line = readline $FILE1){
my($stn_name, $yr, $mn, $dat)=split(' ', $line, 4);
my @data=split(' ', $dat, 31);
my $ts = sprintf("%08s %04d %02d", $stn_name, $yr, $mn);
my $format = "%s".( " %6.1f" x 31 )."\n";
printf $TEMPF $format, $ts, @data;
printf $format, $ts, @data;
if(eof){
my $endpoint=$.;
}
}
close($FILE1);
print "$endpoint\n"; #Note, not recognised!!!
open(my $FILE2, "< file2") or die "open 'file2': $! ($^E)";
while (my $lines = <$FILE2>){
next unless($. <= $endpoint);
if($. > $endpoint){
goto label; #Read file2 to end line of file 1, then goto label
}
}
label:
while (my $nline = <$FILE2>){
next if($. <= $entpoint);
if($. > $entpoint){
.....
process the data;
printf $TEMPF $format, $ts, @data;
printf $format, $ts, @data;
}
}
close($FILE2);
close($TEMPF);
What can I try next?
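A different way to attack this (a sketch, not the poster's code) is to skip the line counting entirely and merge on the station/year/month key: read both files into a hash, let file2 overwrite or extend whatever file1 already has, and write the records back out in sorted order. The field reformatting from the original script is left out here:
#!/usr/bin/perl
use strict;
use warnings;

my %record;
for my $file ('file1', 'file2') {
    open my $fh, '<', $file or die "open '$file': $! ($^E)";
    while (my $line = <$fh>) {
        my ($stn_name, $yr, $mn) = split ' ', $line;
        $record{"$stn_name $yr $mn"} = $line;    # later files win for duplicate months
    }
    close $fh;
}

open my $out, '>', 'tempfile.txt' or die "open 'tempfile.txt': $! ($^E)";
print {$out} $record{$_} for sort keys %record;
close $out;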
Recently, Gabor ran a poll in a Perl Facebook community asking which version of Perl people used in their production systems. The results were eye-opening—and not in a good way. A surprisingly large number of developers replied with something along the lines of “whatever version is included with my OS.”
If that’s you, this post is for you. I don’t say that to shame or scold—many of us started out this way. But if you’re serious about writing and running Perl in 2025, it’s time to stop relying on the system Perl.
Let’s unpack why.
When we talk about the system Perl, we mean the version of Perl that comes pre-installed with your operating system—be it a Linux distro like Debian or CentOS, or even macOS. This is the version used by the OS itself for various internal tasks and scripts. It’s typically located in /usr/bin/perl
and tied closely to system packages.
It’s tempting to just use what’s already there. But that decision brings a lot of hidden baggage—and some very real risks.
The Perl Core Support Policy states that only the two most recent stable release series of Perl are supported by the Perl development team [Update: fixed text in previous sentence]. As of mid-2025, that means:
Perl 5.40 (released May 2024)
Perl 5.38 (released July 2023)
If you’re using anything older—like 5.36, 5.32, or 5.16—you’re outside the officially supported window. That means no guaranteed bug fixes, security patches, or compatibility updates from core CPAN tools like ExtUtils::MakeMaker
, Module::Build
, or Test::More
.
Using an old system Perl often means you’re several versions behind, and no one upstream is responsible for keeping that working anymore.
System Perl is frozen in time—usually the version that was current when the OS release cycle began. Depending on your distro, that could mean Perl 5.10, 5.16, or 5.26—versions that are years behind the latest stable Perl (currently 5.40).
This means you’re missing out on:
New language features (builtin
, class/method/field
, signatures
, try/catch
)
Performance improvements
Bug fixes
Critical security patches
If you’ve ever looked at modern Perl documentation and found your code mysteriously breaking, chances are your system Perl is too old.
System Perl isn’t just a convenience—it’s a dependency. Your operating system relies on it for package management, system maintenance tasks, and assorted glue scripts. If you install or upgrade CPAN modules into the system Perl (especially with cpan
or cpanm
as root), you run the risk of breaking something your OS depends on.
It’s a kind of dependency hell that’s completely avoidable—if you stop using system Perl.
When you use system Perl, your environment is essentially defined by your distro. That’s fine until you want to:
Move your application to another system
Run CI tests on a different platform
Upgrade your OS
Onboard a new developer
You lose the ability to create predictable, portable environments. That’s not a luxury—it’s a requirement for sane development in modern software teams.
perlbrew
or plenv
These tools let you install multiple versions of Perl in your home directory and switch between them easily. Want to test your code on Perl 5.32 and 5.40? perlbrew
makes it a breeze.
You get:
A clean separation from system Perl
The freedom to upgrade or downgrade at will
Zero risk of breaking your OS
It takes minutes to set up and pays for itself tenfold in flexibility.
local::lib
or Carton
Managing CPAN dependencies globally is a recipe for pain. Instead, use:
local::lib
: keeps modules in your home directory.
Carton
: locks your CPAN dependencies (like npm
or pip
) so deployments are repeatable.
Your production system should run with exactly the same modules and versions as your dev environment. Carton helps you achieve that.
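As a concrete illustration (a hypothetical sketch; the module names and versions are only examples), the dependencies live in a cpanfile, carton install resolves and records them in cpanfile.snapshot, and carton exec runs your app against exactly that set:
# cpanfile -- example entries only
requires 'perl',  '5.038';
requires 'Plack', '1.0050';
requires 'DBI',   '1.643';

on 'test' => sub {
    requires 'Test::More', '1.302190';
};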
If you’re building larger apps or APIs, containerising your Perl environment ensures true consistency across dev, test, and production. You can even start from a system Perl inside the container—as long as it’s isolated and under your control.
You never want to be the person debugging a bug that only happens on production, because prod is using the distro’s ancient Perl and no one can remember which CPAN modules got installed by hand.
Once you step away from the system Perl, you gain:
Access to the full language. Use the latest features without backports or compatibility hacks.
Freedom from fear. Install CPAN modules freely without the risk of breaking your OS.
Portability. Move projects between machines or teams with minimal friction.
Better testing. Easily test your code across multiple Perl versions.
Security. Stay up to date with patches and fixes on your schedule, not the distro’s.
Modern practices. Align your Perl workflow with the kinds of practices standard in other languages (think virtualenv
, rbenv
, nvm
, etc.).
I know the argument. You’ve got a handful of scripts, or maybe a cron job or two, and they seem fine. Why bother with all this?
Because “it just works” only holds true until:
You upgrade your OS and Perl changes under you.
A script stops working and you don’t know why.
You want to install a module and suddenly apt
is yelling at you about conflicts.
You realise the module you need requires Perl 5.34, but your system has 5.16.
Don’t wait for it to break. Get ahead of it.
You don’t have to refactor your entire setup overnight. But you can do this:
Install perlbrew
and try it out.
Start a new project with Carton
to lock dependencies.
Choose a current version of Perl and commit to using it moving forward.
Once you’ve seen how smooth things can be with a clean, controlled Perl environment, you won’t want to go back.
Your system Perl is for your operating system—not for your apps. Treat it as off-limits. Modern Perl deserves modern tools, and so do you.
Take the first step. Your future self (and probably your ops team) will thank you.
The post Stop using your system Perl first appeared on Perl Hacks.
Published by Brett Estrade on Friday 27 June 2025 12:13
Paper and talk submissions will be accepted until July 01, 2025 18:59 CDT
In particular, I'd like to invite anyone who regrets not submitting a talk to the TPRC or who has been bitten by the speaking bug. You are welcome to give your talk remotely.
Published by Mohammad Sajid Anwar on Friday 27 June 2025 01:44
Lexical methods arrive in the latest release, Perl v5.42 RC1. For more details, follow the link: https://theweeklychallenge.org/blog/lexical-method-v542
Published on Wednesday 25 June 2025 10:10
Last time, we worked out how to extract, collate, and print statistics about the data contained in a FIT file. Now we’re going to take the next logical step and plot the time series data.
Now that we’ve extracted data from the FIT file, what else can we do with
it? Since this is time series data, the most natural next step is to
visualise the data values over time. Since I know that
Gnuplot handles time series data
well,1 I chose to use
Chart::Gnuplot
to plot the
data.
An additional point in Gnuplot’s favour is that it can plot two datasets on the same graph, each with its own y-axis. Such functionality is handy when searching for correlations between datasets of different y-axis scales and ranges that share the same baseline data series.
Clearly Chart::Gnuplot
relies on Gnuplot, so we need to install it first:
$ sudo apt install -y gnuplot
Now we can install Chart::Gnuplot
with cpanm
:
$ cpanm Chart::Gnuplot
Something I like looking at is how my heart rate evolved throughout a ride; it gives me an idea of how much effort I was putting in. So, we’ll start off by looking at how the heart rate data varied over time. In other words, we want time on the x-axis and heart rate on the y-axis.
One great thing about Gnuplot is that if you give it a format string for the time data, then plotting “just works”. In other words, explicit conversion to datetime data for the x-axis is unnecessary.
Here’s a script to extract the FIT data from our example data file. It
displays some statistics about the activity and plots heart rate versus
time. I’ve given the script the filename geo-fit-plot-data.pl
:
1use strict;
2use warnings;
3
4use Geo::FIT;
5use Scalar::Util qw(reftype);
6use List::Util qw(max sum);
7use Chart::Gnuplot;
8
9
10sub main {
11 my @activity_data = extract_activity_data();
12
13 show_activity_statistics(@activity_data);
14 plot_activity_data(@activity_data);
15}
16
17sub extract_activity_data {
18 my $fit = Geo::FIT->new();
19 $fit->file( "2025-05-08-07-58-33.fit" );
20 $fit->open or die $fit->error;
21
22 my $record_callback = sub {
23 my ($self, $descriptor, $values) = @_;
24 my @all_field_names = $self->fields_list($descriptor);
25
26 my %event_data;
27 for my $field_name (@all_field_names) {
28 my $field_value = $self->field_value($field_name, $descriptor, $values);
29 if ($field_value =~ /[a-zA-Z]/) {
30 $event_data{$field_name} = $field_value;
31 }
32 }
33
34 return \%event_data;
35 };
36
37 $fit->data_message_callback_by_name('record', $record_callback ) or die $fit->error;
38
39 my @header_things = $fit->fetch_header;
40
41 my $event_data;
42 my @activity_data;
43 do {
44 $event_data = $fit->fetch;
45 my $reftype = reftype $event_data;
46 if (defined $reftype && $reftype eq 'HASH' && defined %$event_data{'timestamp'}) {
47 push @activity_data, $event_data;
48 }
49 } while ( $event_data );
50
51 $fit->close;
52
53 return @activity_data;
54}
55
56# extract and return the numerical parts of an array of FIT data values
57sub num_parts {
58 my $field_name = shift;
59 my @activity_data = @_;
60
61 return map { (split ' ', $_->{$field_name})[0] } @activity_data;
62}
63
64# return the average of an array of numbers
65sub avg {
66 my @array = @_;
67
68 return (sum @array) / (scalar @array);
69}
70
71sub show_activity_statistics {
72 my @activity_data = @_;
73
74 print "Found ", scalar @activity_data, " entries in FIT file\n";
75 my $available_fields = join ", ", sort keys %{$activity_data[0]};
76 print "Available fields: $available_fields\n";
77
78 my $total_distance_m = (split ' ', ${$activity_data[-1]}{'distance'})[0];
79 my $total_distance = $total_distance_m/1000;
80 print "Total distance: $total_distance km\n";
81
82 my @speeds = num_parts('speed', @activity_data);
83 my $maximum_speed = max @speeds;
84 my $maximum_speed_km = $maximum_speed*3.6;
85 print "Maximum speed: $maximum_speed m/s = $maximum_speed_km km/h\n";
86
87 my $average_speed = avg(@speeds);
88 my $average_speed_km = sprintf("%0.2f", $average_speed*3.6);
89 $average_speed = sprintf("%0.2f", $average_speed);
90 print "Average speed: $average_speed m/s = $average_speed_km km/h\n";
91
92 my @powers = num_parts('power', @activity_data);
93 my $maximum_power = max @powers;
94 print "Maximum power: $maximum_power W\n";
95
96 my $average_power = avg(@powers);
97 $average_power = sprintf("%0.2f", $average_power);
98 print "Average power: $average_power W\n";
99
100 my @heart_rates = num_parts('heart_rate', @activity_data);
101 my $maximum_heart_rate = max @heart_rates;
102 print "Maximum heart rate: $maximum_heart_rate bpm\n";
103
104 my $average_heart_rate = avg(@heart_rates);
105 $average_heart_rate = sprintf("%0.2f", $average_heart_rate);
106 print "Average heart rate: $average_heart_rate bpm\n";
107}
108
109sub plot_activity_data {
110 my @activity_data = @_;
111
112 my @heart_rates = num_parts('heart_rate', @activity_data);
113 my @times = map { $_->{'timestamp'} } @activity_data;
114
115 my $date = "2025-05-08";
116
117 my $chart = Chart::Gnuplot->new(
118 output => "watopia-figure-8-heart-rate.png",
119 title => "Figure 8 in Watopia on $date: heart rate over time",
120 xlabel => "Time",
121 ylabel => "Heart rate (bpm)",
122 terminal => "png size 1024, 768",
123 timeaxis => "x",
124 xtics => {
125 labelfmt => '%H:%M',
126 },
127 );
128
129 my $data_set = Chart::Gnuplot::DataSet->new(
130 xdata => \@times,
131 ydata => \@heart_rates,
132 timefmt => "%Y-%m-%dT%H:%M:%SZ",
133 style => "lines",
134 );
135
136 $chart->plot2d($data_set);
137}
138
139main();
A lot has happened between this code and the previous scripts. Let’s review it to see what’s changed.
The biggest changes were structural. I’ve moved the code into separate routines, improving encapsulation and making each more focused on one task.
The FIT file data extraction code I’ve put into its own routine
(extract_activity_data()
; lines 17-54). This sub returns the array of
event data that we’ve been
using.
I’ve also created two utility routines num_parts()
(lines 57-62) and
avg()
(lines 65-69). These return the numerical parts of the activity
data and average data series value, respectively.
The ride statistics calculation and display code is now located in the
show_activity_statistics()
routine. Now it’s out of the way, allowing us
to concentrate on other things.
The plotting code is new and sits in a sub called plot_activity_data()
(lines 109-137). We’ll focus much more on that later.
These routines are called from a main()
routine (lines 10-15) giving us a
nice bird’s eye view of what the script is trying to achieve. Running all
the code is now as simple as calling main()
(line 139).
Let’s zoom in on the plotting code in plot_activity_data()
. After having
imported Chart::Gnuplot
at the top of the file (line 7), we need to do a
bit of organising before we can set up the chart. We extract the activity
data with extract_activity_data()
(line 11) and pass this as an argument
to plot_activity_data()
(line 14). At the top of plot_activity_data()
we fetch an array of the numerical heart rate data (line 112) and an array
of all the timestamps (line 113).
The activity’s date (line 115) is assigned as a string variable because I want this to appear in the chart’s title. Although the date is present in the activity data, I’ve chosen not to calculate its value until later. This way we get the plotting code up and running sooner, as there’s still a lot to discuss.
Now we’re ready to set up the chart, which takes place on lines 117-127.
We create a new Chart::Gnuplot
object on line 117 and configure the plot
with various keyword arguments to the object’s constructor.
The parameters are as follows:
output specifies the name of the output file as a string. The name I’ve chosen reflects the activity as well as the data being plotted.
title is a string to use as the plot’s title. To provide context, I mention the name of the route (Figure 8) within Zwift’s main virtual world (Watopia) as well as the date of the activity. To highlight that we’re plotting heart rate over time, I’ve mentioned this in the title also.
xlabel is a string describing the x-axis data.
ylabel is a string describing the y-axis data.
the terminal option tells Gnuplot to use the PNG2 "terminal"3 and to set its dimensions to 1024x768.
timeaxis tells Gnuplot which axis contains time-based data (in this case the x-axis). This enables Gnuplot to space out the data along the axis evenly. Often, the spacing between points in time-based data isn’t regular; for instance, data points can be missing. Hence, naively plotting unevenly-spaced time data can produce a distorted graph. Telling Gnuplot that the x-axis contains time-based data allows it to add appropriate space where necessary.
xtics is a hash of options to configure the behaviour of the ticks on the x-axis. The setting here displays hour and minute information at each tick mark for our time data. We omit the year, month and day information as this is the same for all data points.
Now that the main chart parameters have been set, we can focus on the data
we want to plot. In Chart::Gnuplot
parlance, a Chart::Gnuplot::DataSet
object represents a set of data to plot. Lines 129-134 instantiate such an
object which we later pass to the Chart::Gnuplot
object when plotting the
data (line 136). One configures Chart::Gnuplot::DataSet
objects similarly
to how Chart::Gnuplot
objects are constructed: by passing various options
to its constructor. These options include the data to plot and how this
data should be styled on the graph.
The options used here have the following meanings:
xdata is an array reference to the data to use for the x-axis. If this option is omitted, then Gnuplot uses the array indices of the y-data as the x-axis values.
ydata is an array reference to the data to use for the y-axis.
timefmt specifies the format string Gnuplot should use when reading the time data in the xdata array. Timestamps are strings and we need to inform Gnuplot how to parse them into a form useful for x-axis data. Were the x-axis data a numerical data type, this option wouldn’t be necessary.
style is a string specifying the style to use for plotting the data. In this example, we’re plotting the data points as a set of connected lines. Check out the Chart::Gnuplot documentation for a full list of the available style options.
We finally get to plot the data on line 136. The data set gets passed to
the Chart::Gnuplot
object as the argument to its plot2d()
method. As
its name suggests, this plots 2D data, i.e. y versus x. Gnuplot can also
display 3D data, in which case we’d call plot3d()
. When plotting 3D data
we’d have to include a z dimension when setting up the data set.
Running this code
$ perl geo-fit-plot-data.pl
generates this plot:
A couple of things are apparent when looking at this graph. It took me a while to get going (my pulse rose steadily over the first ~15 minutes of the ride) and the time is weird (6 am? Me? Lol, no). We’ll try to explain the heart rate behaviour later.
But first, what’s up with the time data? Did I really start riding at 6 o’clock in the morning? I’m not a morning person, so that’s not right. Also, I’m pretty sure my neighbours wouldn’t appreciate me coughing and wheezing at 6 am while trying to punish myself on Zwift. So what’s going on?
For those following carefully, you might have noticed the trailing Z
on
the timestamp data. This means that the time zone is UTC. Given that this
data is from May and I live in Germany, this implies that the local time
would have been 8 am. Still rather early for me, but not too early to
disturb the neighbours too much.4 In other
words, we need to fix the time zone to get the time data right.
How do we fix the time zone? I’m glad you asked! We need to parse the
timestamp into a DateTime
object, set the time zone, and then pass the
fixed time data to Gnuplot. It turns out that the standard DateTime
library doesn’t parse date/time strings.
Instead, we need to use
DateTime::Format::Strptime
.
This module parses date/time strings much like the strptime(3)
POSIX
function
does and returns DateTime
objects.
Since the module isn’t part of the core Perl distribution, we need to install it:
$ cpanm DateTime::Format::Strptime
Most of the code changes that follow take place only within the plotting
routine (plot_activity_data()
). So, I’m going to focus on that from now
on and won’t create a new script for the new version of the code.
The first thing to do is to import the DateTime::Format::Strptime
module:
use Scalar::Util qw(reftype);
use List::Util qw(max sum);
use Chart::Gnuplot;
+use DateTime::Format::Strptime;
Extending plot_activity_data()
to set the correct time zone, we get this
code:
1sub plot_activity_data {
2 my @activity_data = @_;
3
4 # extract data to plot from full activity data
5 my @heart_rates = num_parts('heart_rate', @activity_data);
6 my @timestamps = map { $_->{'timestamp'} } @activity_data;
7
8 # fix time zone in time data
9 my $date_parser = DateTime::Format::Strptime->new(
10 pattern => "%Y-%m-%dT%H:%M:%SZ",
11 time_zone => 'UTC',
12 );
13
14 my @times = map {
15 my $dt = $date_parser->parse_datetime($_);
16 $dt->set_time_zone('Europe/Berlin');
17 my $time_string = $dt->strftime("%H:%M:%S");
18 $time_string;
19 } @timestamps;
20
21 # determine date from timestamp data
22 my $dt = $date_parser->parse_datetime($timestamps[0]);
23 my $date = $dt->strftime("%Y-%m-%d");
24
25 # plot data
26 my $chart = Chart::Gnuplot->new(
27 output => "watopia-figure-8-heart-rate.png",
28 title => "Figure 8 in Watopia on $date: heart rate over time",
29 xlabel => "Time",
30 ylabel => "Heart rate (bpm)",
31 terminal => "png size 1024, 768",
32 timeaxis => "x",
33 xtics => {
34 labelfmt => '%H:%M',
35 },
36 );
37
38 my $data_set = Chart::Gnuplot::DataSet->new(
39 xdata => \@times,
40 ydata => \@heart_rates,
41 timefmt => "%H:%M:%S",
42 style => "lines",
43 );
44
45 $chart->plot2d($data_set);
46}
The timestamp data is no longer read straight into the @times
array; it’s
stored in the @timestamps
temporary array (line 6). This change also
makes the variable naming a bit more consistent, which is nice.
To parse a timestamp string into a DateTime
object, we need to tell
DateTime::Format::Strptime
how to format the timestamp (lines 8-12). This
is the purpose of the pattern
argument in the DateTime::Format::Strptime
constructor (line 10). You might have noticed that we used the same pattern
when telling Gnuplot what format the time data was in. We also specify the
time zone (line 11) to ensure that the date/time data is parsed as UTC.
Next, we fix the time zone in all elements of the @timestamps
array (lines
14-19). I’ve chosen to do this within a map
here. I could extract this
code into a well-named routine, but it does the job for now. The map
parses the date/time string into a Date::Time
object (line 15) and sets
the time zone to Europe/Berlin
5 (line 16). We only need to
plot the time data,6 hence we format the DateTime
object as a string including only hour, minute and second information (line
17). Even though we only use hours and minutes for the x-axis tick labels
later, the time data is resolved down to the second, hence we retain the
seconds information in the @times
array.
One could write a more compact version of the time zone correction code like this:
my @times = map {
$date_parser->parse_datetime($_)
->set_time_zone('Europe/Berlin')
->strftime("%H:%M:%S");
} @timestamps;
yet, in this case, I find giving each step a name (via a variable) helps the code explain itself. YMMV.
The next chunk of code (lines 22-23) isn’t related to the time zone fix. It
generalises working out the current date from the activity data. This way I
can use a FIT file from a different activity without having to update the
$date
variable by hand. The process is simple: all elements of the
@timestamps
array have the same date, so we choose to parse only the first
one (line 22)7. This gives us a DateTime
object
which we convert into a formatted date string (via the strftime()
method)
composed of the year, month and day (line 23). We don’t need to fix the
time zone because UTC is sufficient in this case to extract the date
information. Of course, if you’re in a time zone close to the international
date line you might need to set the time zone to get the correct date.
The last thing to change is the timefmt
option to the
Chart::Gnuplot::DataSet
object on line 41. Because we now only have hour,
minute and second information, we need to update the time format string to
reflect this.
Now we’re ready to run the script again! Doing so
$ perl geo-fit-plot-data.pl
creates this graph
where we see that the time information is correct. Yay! 🎉
Now that I look at the graph again, I realise something: it doesn’t matter
when the data was taken (at least, not for this use case). What matters
more is the elapsed time from the start of the activity until the end. It
looks like we need to munge the time
data again. The job now is to convert the timestamp information into
seconds elapsed since the ride began. Since we’ve parsed the timestamp data
into DateTime
objects (in line 15 above), we can convert that value into
the number of seconds since the epoch (via the epoch()
method). As soon as we
know the epoch value for each element in the @timestamps
array, we can
subtract the first element’s epoch value from each element in the array.
This will give us an array containing elapsed seconds since the beginning of
the activity. Elapsed seconds are a bit too fine-grained for an activity
extending over an hour, so we’ll also convert seconds to minutes.
Making these changes to the plot_activity_data()
code, we get:
1sub plot_activity_data {
2 my @activity_data = @_;
3
4 # extract data to plot from full activity data
5 my @heart_rates = num_parts('heart_rate', @activity_data);
6 my @timestamps = map { $_->{'timestamp'} } @activity_data;
7
8 # parse timestamp data
9 my $date_parser = DateTime::Format::Strptime->new(
10 pattern => "%Y-%m-%dT%H:%M:%SZ",
11 time_zone => 'UTC',
12 );
13
14 # get the epoch time for the first point in the time data
15 my $first_epoch_time = $date_parser->parse_datetime($timestamps[0])->epoch;
16
17 # convert timestamp data to elapsed minutes from start of activity
18 my @times = map {
19 my $dt = $date_parser->parse_datetime($_);
20 my $epoch_time = $dt->epoch;
21 my $elapsed_time = ($epoch_time - $first_epoch_time)/60;
22 $elapsed_time;
23 } @timestamps;
24
25 # determine date from timestamp data
26 my $dt = $date_parser->parse_datetime($timestamps[0]);
27 my $date = $dt->strftime("%Y-%m-%d");
28
29 # plot data
30 my $chart = Chart::Gnuplot->new(
31 output => "watopia-figure-8-heart-rate.png",
32 title => "Figure 8 in Watopia on $date: heart rate over time",
33 xlabel => "Elapsed time (min)",
34 ylabel => "Heart rate (bpm)",
35 terminal => "png size 1024, 768",
36 );
37
38 my $data_set = Chart::Gnuplot::DataSet->new(
39 xdata => \@times,
40 ydata => \@heart_rates,
41 style => "lines",
42 );
43
44 $chart->plot2d($data_set);
45}
The main changes occur in lines 14-23. We parse the date/time information
from the first timestamp (line 15), chaining the epoch()
method call to
find the number of seconds since the epoch. We store this result in a
variable for later use; it holds the epoch time at the beginning of the data
series. After parsing the timestamps into DateTime
objects (line 19), we
find the epoch time for each time point (line 20). Line 21 calculates the
elapsed time from the time stored in $first_epoch_time
and converts
seconds to minutes by dividing by 60. The map
returns this value (line
22) and hence @times
now contains a series of elapsed time values in
minutes.
It’s important to note here that we’re no longer plotting a date/time value
on the x-axis; the elapsed time is a purely numerical value. Thus, we
update the x-axis label string (line 33) to highlight this fact and remove
the timeaxis
and xtics
/labelfmt
options from the Chart::Gnuplot
constructor. The timefmt
option to the Chart::Gnuplot::DataSet
constructor is also no longer necessary and it too has been removed.
The script is now ready to go!
Running it
$ perl geo-fit-plot-data.pl
gives
That’s better!
Our statistics output from earlier told us that the maximum heart rate was 165 bpm with an average of 142 bpm. Looking at the graph, an average of 142 bpm seems about right. It also looks like the maximum pulse value occurred at an elapsed time of just short of 50 minutes. We can check that guess more closely later.
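A quick way to check that guess (a small sketch reusing the @heart_rates and @times arrays already built in plot_activity_data()):
use List::Util qw(max);

# find when the maximum heart rate occurred
my $max_hr = max @heart_rates;
my ($max_idx) = grep { $heart_rates[$_] == $max_hr } 0 .. $#heart_rates;
printf "Maximum heart rate of %d bpm at %.1f minutes elapsed\n", $max_hr, $times[$max_idx];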
What’s intriguing me now is what caused this pattern in the heart rate data.
What could have caused the values to go up and down like that? Is there a
correlation with other data fields? We know from earlier that there’s an
altitude
field, so we can try plotting that along with the heart rate data
and see how (or if) they’re related.
Careful readers might have noticed something: how can you have a variation in altitude when you’re sitting on an indoor trainer? Well, Zwift simulates going up and downhill by changing the resistance in the smart trainer. The resistance is then correlated to a gradient and, given time and speed data, one can work out a virtual altitude gain or loss. Thus, for the data set we’re analysing here, altitude is a sensible parameter to consider. Even if you had no vertical motion whatsoever!
As I mentioned earlier, one of the things I like about Gnuplot is that one can plot two data sets with different y-axes on the same plot. Plotting heart rate and altitude on the same graph is one such use case.
To plot an extra data set on our graph, we need to set up another Chart::Gnuplot::DataSet object, this time for the altitude data. Before we can do that, we’ll have to extract the altitude data from the full activity data set. Gnuplot also needs to know which data to plot on the primary and secondary y-axes (i.e. on the left- and right-hand sides of the graph). And we must remember to label our axes properly. That’s a fair bit of work, so I’ve done the hard yakka for ya. 😉
Here’s the updated plot_activity_data() code:
1sub plot_activity_data {
2 my @activity_data = @_;
3
4 # extract data to plot from full activity data
5 my @heart_rates = num_parts('heart_rate', @activity_data);
6 my @timestamps = map { $_->{'timestamp'} } @activity_data;
7 my @altitudes = num_parts('altitude', @activity_data);
8
9 # parse timestamp data
10 my $date_parser = DateTime::Format::Strptime->new(
11 pattern => "%Y-%m-%dT%H:%M:%SZ",
12 time_zone => 'UTC',
13 );
14
15 # get the epoch time for the first point in the time data
16 my $first_epoch_time = $date_parser->parse_datetime($timestamps[0])->epoch;
17
18 # convert timestamp data to elapsed minutes from start of activity
19 my @times = map {
20 my $dt = $date_parser->parse_datetime($_);
21 my $epoch_time = $dt->epoch;
22 my $elapsed_time = ($epoch_time - $first_epoch_time)/60;
23 $elapsed_time;
24 } @timestamps;
25
26 # determine date from timestamp data
27 my $dt = $date_parser->parse_datetime($timestamps[0]);
28 my $date = $dt->strftime("%Y-%m-%d");
29
30 # plot data
31 my $chart = Chart::Gnuplot->new(
32 output => "watopia-figure-8-heart-rate-and-altitude.png",
33 title => "Figure 8 in Watopia on $date: heart rate and altitude over time",
34 xlabel => "Elapsed time (min)",
35 ylabel => "Heart rate (bpm)",
36 terminal => "png size 1024, 768",
37 xtics => {
38 incr => 5,
39 },
40 y2label => 'Altitude (m)',
41 y2range => [-10, 70],
42 y2tics => {
43 incr => 10,
44 },
45 );
46
47 my $heart_rate_ds = Chart::Gnuplot::DataSet->new(
48 xdata => \@times,
49 ydata => \@heart_rates,
50 style => "lines",
51 );
52
53 my $altitude_ds = Chart::Gnuplot::DataSet->new(
54 xdata => \@times,
55 ydata => \@altitudes,
56 style => "boxes",
57 axes => "x1y2",
58 );
59
60 $chart->plot2d($altitude_ds, $heart_rate_ds);
61}
Line 7 extracts the altitude data from the full activity data. This code also strips the unit information from the altitude data so that we only have the numerical part, which is what Gnuplot needs. We store the altitude data in the @altitudes array. This we use later to create a Chart::Gnuplot::DataSet object on lines 53-58. An important line to note here is the axes setting on line 57; it tells Gnuplot to use the secondary y-axis on the right-hand side for this data set. I’ve chosen to use the boxes style for the altitude data (line 56) so that the output looks a bit like the hills and valleys that it represents.
To make the time data a bit easier to read and analyse, I’ve set the increment for the ticks on the x-axis to 5 (lines 37-39). This way it’ll be easier to refer to specific changes in altitude and heart rate data.
The settings for the secondary y-axis use the same names as for the primary y-axis, with the exception that the string y2 replaces y. For instance, to set the axis label for the secondary y-axis, we specify the y2label value, as in line 40 above.
I’ve set the range on the secondary y-axis explicitly (line 41) because the output looks better than what the automatic range was able to make in this case. Similarly, I’ve set the increment on the secondary y-axis ticks (lines 42-44) because the automatic output wasn’t as good as what I wanted.
I’ve also renamed the variable for the heart rate data set on line 47 to be more descriptive; the name $data_set was much too generic.
We specify the altitude data set first in the call to plot2d() (line 60) because we want the heart rate data plotted “on top” of the altitude data. Had we used $heart_rate_ds first in this call, the altitude data would have obscured part of the heart rate data.
Running our script in the now familiar way
$ perl geo-fit-plot-data.pl
gives this plot
Cool! Now it’s a bit clearer why the heart rate evolved the way it did.
At the beginning of the graph (in the first ~10 minutes) it looks like I was getting warmed up and my pulse was finding a kind of base level (~130 bpm). Then things started going uphill at about the 10-minute mark and my pulse also kept going upwards. This makes sense because I was working harder. Between about 13 minutes and 19 minutes came the first hill climb on the route and here I was riding even harder. The effort is reflected in the heart rate data which rose to around 160 bpm at the top of the hill. That explains why the heart rate went up from the beginning to roughly the 18-minute mark.
Looking back over the Zwift data for that particular ride, it seems that I took the KOM8 for that climb at that time, so no wonder my pulse was high!9 Note that this wasn’t a special record or anything like that; it was a short-term live result10 and someone else took the jersey with a faster time not long after I’d done my best time up that climb.
It was all downhill shortly after the hill climb, which also explains why the heart rate went down straight afterwards. We also see similar behaviour on the second hill climb (from about 37 minutes to 42 minutes). Although my pulse rose throughout the hill climb, it didn’t rise as high this time. This indicates that I was getting tired and wasn’t able to put as much effort in.
Just in case you’re wondering how the altitude can go negative,11 part of the route goes through “underwater tunnels”. This highlights the flexibility of the virtual worlds within Zwift: the designers have enormous room to let their imaginations run wild. There are all kinds of fun things to discover along the various routes and many that don’t exist in the Real World™. Along with the underwater tunnels (where it’s like riding through a giant aquarium, with sunken shipwrecks, fish, and whales), there is a wild west style town complete with a steam train from that era chugging past. There are also Mayan ruins with llamas (or maybe alpacas?) wandering around and even a section with dinosaurs grazing at the side of the road.
Here’s what it looks like riding through an underwater tunnel:
I think that’s pretty cool.
At the end of the ride (at ~53 minutes) my pulse dropped sharply. Since this was the warm-down phase of the ride, this also makes sense.
There are two peaks in the heart rate data that don’t correlate with altitude (one at ~25 minutes and another at ~48 minutes). The altitude change at these locations would suggest that things are fairly flat. What’s going on there?
One other parameter that we could consider for correlations is power output. Going uphill requires more power than riding on the flat, so we’d expect to see higher power values (and therefore higher heart rates) when climbing. If flat roads require less power, what’s causing the peaks in the pulse? Maybe there’s another puzzle hiding in the data.
Let’s combine the heart rate data with power output and see what other relationships we can discover. To do this we need to extract power output data instead of altitude data. Then we need to change the secondary y-axis data set and configuration to produce a nice plot of power output. Making these changes gives this code:
1sub plot_activity_data {
2 my @activity_data = @_;
3
4 # extract data to plot from full activity data
5 my @heart_rates = num_parts('heart_rate', @activity_data);
6 my @timestamps = map { $_->{'timestamp'} } @activity_data;
7 my @powers = num_parts('power', @activity_data);
8
9 # parse timestamp data
10 my $date_parser = DateTime::Format::Strptime->new(
11 pattern => "%Y-%m-%dT%H:%M:%SZ",
12 time_zone => 'UTC',
13 );
14
15 # get the epoch time for the first point in the time data
16 my $first_epoch_time = $date_parser->parse_datetime($timestamps[0])->epoch;
17
18 # convert timestamp data to elapsed minutes from start of activity
19 my @times = map {
20 my $dt = $date_parser->parse_datetime($_);
21 my $epoch_time = $dt->epoch;
22 my $elapsed_time = ($epoch_time - $first_epoch_time)/60;
23 $elapsed_time;
24 } @timestamps;
25
26 # determine date from timestamp data
27 my $dt = $date_parser->parse_datetime($timestamps[0]);
28 my $date = $dt->strftime("%Y-%m-%d");
29
30 # plot data
31 my $chart = Chart::Gnuplot->new(
32 output => "watopia-figure-8-heart-rate-and-power.png",
33 title => "Figure 8 in Watopia on $date: heart rate and power over time",
34 xlabel => "Elapsed time (min)",
35 ylabel => "Heart rate (bpm)",
36 terminal => "png size 1024, 768",
37 xtics => {
38 incr => 5,
39 },
40 ytics => {
41 mirror => "off",
42 },
43 y2label => 'Power (W)',
44 y2range => [0, 1100],
45 y2tics => {
46 incr => 100,
47 },
48 );
49
50 my $heart_rate_ds = Chart::Gnuplot::DataSet->new(
51 xdata => \@times,
52 ydata => \@heart_rates,
53 style => "lines",
54 );
55
56 my $power_ds = Chart::Gnuplot::DataSet->new(
57 xdata => \@times,
58 ydata => \@powers,
59 style => "lines",
60 axes => "x1y2",
61 );
62
63 $chart->plot2d($power_ds, $heart_rate_ds);
64}
On line 7, I swapped out the altitude data extraction code with power output. Then, I updated the output filename (line 32) and plot title (line 33) to highlight that we’re now plotting heart rate and power data.
The mirror option to the ytics setting (lines 40-42) isn’t an obvious change. Its purpose is to stop the ticks from the primary y-axis from being mirrored to the secondary y-axis (on the right-hand side). We want to stop these mirrored ticks from appearing because they’ll clash with the secondary y-axis tick marks. The reason we didn’t need this before is that all the y-axis ticks happened to line up and the issue wasn’t obvious until now.
I’ve updated the secondary axis label setting to mention power (line 43). Also, I’ve set the range to match the data we’re plotting (line 44) and to space out the data nicely via the incr option to the y2tics setting (lines 45-47). It seemed more appropriate to use lines to plot power output as opposed to the bars we used for the altitude data, hence the change to the style option on line 59.
As we did when plotting altitude, we pass the power data set ($power_ds) to the plot2d() call before $heart_rate_ds (line 63).
Running the script again
$ perl geo-fit-plot-data.pl
produces this plot:
This plot shows the correlation between heart rate and power output that we expected for the first hill climb. The power output increases steadily from the 3-minute mark up to about the 18-minute mark. After that, it dropped suddenly once I’d reached the top of the climb. This makes sense: I’d just done a personal best up that climb and needed a bit of respite!
However, now we can see clearly what caused the spikes in heart rate at 25 minutes and 48 minutes: there are two large spikes in power output. The first spike maxes out at 1023 W;12 what value the other peak has, it’s hard to tell. We’ll try to work out what that value is later. These spikes in power result from sprints. In Zwift, not only can one try to go up hills as fast as possible, but flatter sections have sprints where one also tries to go as fast as possible, albeit for shorter distances (say 200m or 500m).
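As a rough check (this is a sketch of my own, not code from the article), one could estimate the second spike’s value by taking the maximum power within a window around the 48-minute mark, reusing the @times and @powers arrays built inside plot_activity_data(); the window bounds here are guesses.
use List::Util qw(max);

# indices of samples that fall between 46 and 50 elapsed minutes
my @window = grep { $times[$_] >= 46 && $times[$_] <= 50 } 0 .. $#times;

# maximum power output within that window
my $second_spike_power = max @powers[@window];
print "Peak power near 48 min: $second_spike_power W\n";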
Great! We’ve worked out another puzzle in the data!
Zwift produces what they call timelines of a given ride, which is much the same as what we’ve been plotting here. For instance, for the FIT file we’ve been looking at, this is the timeline graph:
Zwift plots several datasets on this graph that have very different value ranges. The plot above shows power output, cadence, heart rate, and altitude data all on one graph! A lot is going on here and because of the different data values and ranges, Zwift doesn’t display values on the y-axes. Their solution is to show all four values at a given time point when the user hovers their mouse over the graph. This solution only works within a web browser and needs lots of JavaScript to work, hence this is something I like to avoid. That (and familiarity) is largely the reason why I prefer PNG output for my graphs.
If you take a close look at the timeline graph, you’ll notice that the maximum power is given as 937 W and not 1023 W, which we worked out from the FIT file data. I don’t know what’s going on here, as the same graph in the Zwift Companion App shows the 1023 W that we got. The graph above is a screenshot from the web application in a browser on my laptop and, at least theoretically, it’s supposed to display the same data. I’ve noticed a few inconsistencies between the web browser view and that from the Zwift Companion App, so maybe this discrepancy is one bug that still needs shaking out.
Y’know what’d also be cool beyond plotting this data? Playing around with it interactively.
That’s also possible with Perl, but it’s another story.
I’ve been using Gnuplot since the late 90’s. Back then, it was the only freely available plotting software which handled time data well. ↩︎
By default, Gnuplot will generate Postscript output. ↩︎
One can interpret the word “terminal” as a kind of “screen” or “canvas” that the plotting library draws its output on. ↩︎
I’ve later found out that they haven’t heard anything, so that’s good! ↩︎
I live in Germany, so this is the relevant time zone for me. ↩︎
All dates are the same and displaying them would be redundant, hence we omit the date information. ↩︎
All elements in the array have the same date, so using the first one does the job. ↩︎
KOM stands for “king of the mountains”. ↩︎
Yes, I am stoked that I managed to take that jersey! Even if it was only for a short time. ↩︎
A live result that makes it onto a leaderboard is valid only for one hour. ↩︎
Around the 5-minute mark and again shortly before the 35-minute mark. ↩︎
One thing that this value implies is that I could power a small bar heater for one second. But not for very much longer! ↩︎
If you’re a Perl developer, you’ve probably heard it before: “Is Perl still a thing?”
The short answer? Yes. Absolutely.
The longer answer? It’s evolving—quietly, but purposefully—and there’s still real demand for skilled Perl developers across a number of industries.
Let’s explore where the opportunities are today and how to find them.
Despite not being the trendiest language, Perl continues to power core infrastructure in fields like:
The reality is, companies with decades of code running in Perl aren’t eager to rip and replace something that still works flawlessly.
One major shift in recent years is the rise of remote-first hiring. More companies are hiring global developers to work on existing Perl systems—whether it's maintaining codebases, modernizing legacy apps, or integrating Perl into cloud workflows.
These roles aren’t always posted on major job boards. That’s why using niche platforms is key.
General job boards often bury Perl listings under unrelated content or make it difficult to filter accurately.
That’s why developers increasingly rely on specialized platforms like
Perl-Jobs.com — a focused job board built specifically
for the Perl community, offering remote, freelance, and full-time listings from companies that actually want your Perl skills.
It saves time and connects you with opportunities that are actually relevant.
Perl isn’t dead—it’s just not loud. There are still solid, high-paying roles out there for developers who know how to write clean, efficient Perl code. And with the right tools and platforms, you don’t have to hunt blindly to find them.
So whether you're actively job hunting or just keeping an eye on the market, it’s a good time to dust off the resume and see where Perl can take you.
Published by Ronak Bhatt on Monday 23 June 2025 14:39
The tech world moves quickly — some languages just can’t keep up. Are you clinging to one that’s quietly dying? By Ronak Bhatt
FIT files record the activities of people using devices such as sports watches and bike head units. Platforms such as Strava and Zwift understand this now quasi-standard format. So does Perl! Here I discuss how to parse FIT files and calculate some basic statistics from the extracted data.
I love data. Geographical data, time series data, simulation data, whatever. Whenever I get my hands on a new dataset, I like to have a look at it and visualise it. This way I can get a feel for what’s available and to see what kind of information I can extract from the long lists of numbers. I guess this comes with having worked in science for so long: there’s always some interesting dataset to look at and analyse and try to understand.
I began collecting lots of data recently when I started riding my bike more. Bike head units can save all sorts of information about one’s ride. There are standard parameters such as time, position, altitude, temperature, and speed. If you have extra sensors then you can also measure power output, heart rate, and cadence. This is a wealth of information just waiting to be played with!
I’ve also recently started using Zwift1 and there I can get even more data than on my road bike. Now I can get power and cadence data along with the rest of the various aspects of a normal training ride.
My head unit is from Garmin2 and thus saves ride data in their standard FIT format. Zwift also allows you to save ride data in FIT format, so you don’t have to deal with multiple formats when reading and analysing ride data. FIT files can also be uploaded to Strava3 where you can track all the riding you’re doing in one location.
But what if you don’t want to use an online service to look at your ride data? What if you want to do this yourself, using your own tools? That’s what I’m going to talk about here: reading ride data from FIT files and analysing the resulting information.
Because I like Perl, I wondered if there are any modules available to read FIT files. It turns out that there are two: Geo::FIT and Parser::FIT. I chose to use Geo::FIT because Parser::FIT is still in alpha status. Also, Geo::FIT is quite mature with its last release in 2024, so it is still up-to-date.
The Garmin developer site explains all the gory details of the FIT format. The developer docs give a good high-level overview of what the format is for:
The Flexible and Interoperable Data Transfer (FIT) protocol is a format designed specifically for the storing and sharing of data that originates from sport, fitness and health devices. It is specifically designed to be compact, interoperable and extensible.
A FIT file has a well-defined structure and contains a series of records of different types. There are definition messages which describe the data appearing in the file. There are also data messages which contain the data fields storing a ride’s various parameters. Header fields contain such things as CRC information which one can use to check a file’s integrity.
As noted above, to extract the data, I’m going to use the Geo::FIT module. It’s based on the Garmin::FIT module originally by Kiyokazu Suto and later expanded upon by Matjaz Rihtar. Unfortunately, neither was ever released to CPAN. The latest releases of the Garmin::FIT code (either version) were in 2017. In contrast, Geo::FIT’s most recent release is from 2024-07-13 and it’s available on CPAN, making it easy to install. It’s great to see that someone has taken on the mantle of maintaining this codebase!
To install Geo::FIT, we’ll use cpanm:
$ cpanm Geo::FIT
Now we’re ready to start parsing FIT files and extracting their data.
As mentioned earlier, FIT files store event data in data messages. Each event has various fields, depending upon the kind of device (e.g. watch or head unit) used to record the activity. More fields are possible if other peripherals are attached to the main device (e.g. power meter or heart rate monitor). We wish to extract all available event data.
To extract (and, if we want to, process) the event data, Geo::FIT requires that we define a callback function and register it. Geo::FIT calls this function each time it detects a data message, allowing us to process the file in small bites as a stream of data rather than one giant blob.
A simple example should explain the process. I’m going to adapt the example mentioned in the module’s synopsis. Here’s the code (which I’ve put into a file called geo-fit-basic-data-extraction.pl):
1use strict;
2use warnings;
3
4use Geo::FIT;
5
6my $fit = Geo::FIT->new();
7$fit->file( "2025-05-08-07-58-33.fit" );
8$fit->open or die $fit->error;
9
10my $record_callback = sub {
11 my ($self, $descriptor, $values) = @_;
12 my $time= $self->field_value( 'timestamp', $descriptor, $values );
13 my $lat = $self->field_value( 'position_lat', $descriptor, $values );
14 my $lon = $self->field_value( 'position_long', $descriptor, $values );
15 print "Time was: ", join("\t", $time, $lat, $lon), "\n"
16};
17
18$fit->data_message_callback_by_name('record', $record_callback ) or die $fit->error;
19
20my @header_things = $fit->fetch_header;
21
221 while ( $fit->fetch );
23
24$fit->close;
The only changes I’ve made from the original example code have been to include the strict and warnings strictures on lines 1 and 2, and to replace the $fname variable with the name of a FIT file exported from one of my recent Zwift rides (line 7).
After having imported the module (line 4), we instantiate a Geo::FIT object (line 6). We then tell Geo::FIT the name of the file to process by calling the file() method on line 7. This method returns the name of the file if it’s called without an argument. We open the file on line 8 and barf with an error if anything went wrong.
Lines 10-16 define the callback function, which must accept the given argument list. Within the callback, the field_value() method extracts the value with the given field name from the FIT data message (lines 12-14). I’ll talk about how to find out what field names are available later. In this example, we extract the timestamp as well as the latitude and longitude of where the event happened. Considering that Garmin is a company that has focused on GPS sensors, it makes sense that such data is the minimum we would expect to find in a FIT file.
On line 18 we register the callback with the Geo::FIT object. We tell it that the callback should be run whenever Geo::FIT sees a data message with the name record4. Again, the code barfs with an error if anything goes wrong.
The next line (line 20) looks innocuous but is actually necessary. The fetch_header() method must be called before we can fetch any data from the FIT file. Calling this method also returns header information, part of which we can use to check the file’s integrity. This is something we might want to use in a robust application, as opposed to a simple script like the one here.
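For example, something along these lines could be used in a more robust program. This is a hedged sketch based on my reading of the Geo::FIT documentation, which describes fetch_header() as returning the file size, the protocol and profile versions, any extra header bytes, the CRC recorded in the header and the CRC it calculated; check the documentation for your installed version before relying on the exact positions.
my ($file_size, $proto_ver, $prof_ver, $extra, $crc_expected, $crc_calculated)
    = $fit->fetch_header;

# compare the recorded and calculated header CRCs; a zero CRC usually
# means the header simply didn't record one, so only complain when both
# values are present and disagree
if ($crc_expected && $crc_calculated && $crc_expected != $crc_calculated) {
    die "FIT header CRC mismatch: file may be corrupt\n";
}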
The main action takes place on line 22. We read each data message from the FIT file and–if it’s a data message with the name record–process it with our callback.
At the end (line 24), we’re good little developers and close the file.
Running this code, you’ll see lots of output whiz past. It’ll look something like this:
$ perl geo-fit-basic-data-extraction.pl
<snip>
Time was: 2025-05-08T06:53:10Z -11.6379448 deg 166.9560685 deg
Time was: 2025-05-08T06:53:11Z -11.6379450 deg 166.9560904 deg
Time was: 2025-05-08T06:53:12Z -11.6379451 deg 166.9561073 deg
Time was: 2025-05-08T06:53:13Z -11.6379452 deg 166.9561185 deg
Time was: 2025-05-08T06:53:14Z -11.6379452 deg 166.9561232 deg
Time was: 2025-05-08T06:53:15Z -11.6379452 deg 166.9561233 deg
Time was: 2025-05-08T06:53:16Z -11.6379452 deg 166.9561233 deg
Time was: 2025-05-08T06:53:17Z -11.6379452 deg 166.9561233 deg
This tells us that, at the end of my ride on Zwift, I was at a position of roughly 11°S, 167°E shortly before 07:00 UTC on the 8th of May 2025.5 Because Zwift has virtual worlds, this position tells little of my actual physical location at the time. Hint: my spare room (where I was riding my indoor trainer) isn’t located at this position. 😉
We want to get serious, though, and not only extract position and timestamp data. There’s more in there to discover! So how do we find out what fields are available? For this task, we need to use the fields_list() method.
To extract the list of available field names, I wrote the following script, which I called geo-fit-find-field-names.pl:
1use strict;
2use warnings;
3
4use Geo::FIT;
5use Scalar::Util qw(reftype);
6
7my $fit = Geo::FIT->new();
8$fit->file( "2025-05-08-07-58-33.fit" );
9$fit->open or die $fit->error;
10
11my $record_callback = sub {
12 my ($self, $descriptor, $values) = @_;
13 my @all_field_names = $self->fields_list($descriptor);
14
15 return \@all_field_names;
16};
17
18$fit->data_message_callback_by_name('record', $record_callback ) or die $fit->error;
19
20my @header_things = $fit->fetch_header;
21
22my $found_field_names = 0;
23do {
24 my $field_names = $fit->fetch;
25 my $reftype = reftype $field_names;
26 if (defined $reftype && $reftype eq 'ARRAY') {
27 print "Number of field names found: ", scalar @{$field_names}, "\n";
28
29 while (my @next_field_names = splice @{$field_names}, 0, 5) {
30 my $joined_field_names = join ", ", @next_field_names;
31 print $joined_field_names, "\n";
32 }
33 $found_field_names = 1;
34 }
35} while ( !$found_field_names );
36
37$fit->close;
This script extracts and prints the field names from the first data message it finds. Here, I’ve changed the callback (lines 11-16) to only return the list of all available field names by calling the fields_list() method. We return the list of field names as an array reference (line 15). While this particular change to the callback (in comparison to geo-fit-basic-data-extraction.pl, above) will do the job, it’s not very user-friendly. It will print the field names for all data messages in the FIT file, which is a lot. The list of all available field names would be repeated thousands of times! So, I changed the while loop to a do-while loop (lines 23-35), exiting as soon as the callback finds a data message containing field names.
To actually grab the field name data, I had to get a bit tricky. This is because fetch() returns different values depending upon whether the callback was called. For instance, when the callback isn’t called, the return value is 1 on success or undef. If the callback function is called, fetch() returns the callback’s return value, which in our case is the array reference to the list of field names. Hence, I’ve assigned the return value to a variable, $field_names (line 24). To ensure that we’re only processing data returned when the callback is run, we check that $field_names is defined and has a reference type of ARRAY (line 26). This we do with the help of the reftype function from Scalar::Util (line 25).
It turns out that there are 49 field names available (line 27). To format the output more nicely I spliced the array, extracting five elements at a time (line 29) and printing them as a comma-separated string (lines 30 and 31). I adapted the while (splice) pattern from the example in the Perl documentation for splice.
Note that I could have printed the field names from within the callback. It
doesn’t make much of a difference if we return data from the callback first
before processing it or doing the processing within the callback. In this
case, I chose to do the former.
Running the script gives the following output:
$ perl geo-fit-find-field-names.pl
Use of uninitialized value $emsg in string ne at /home/vagrant/perl5/perlbrew/perls/perl-5.38.3/lib/site_perl/5.38.3/Geo/FIT.pm line 7934.
Use of uninitialized value $emsg in string ne at /home/vagrant/perl5/perlbrew/perls/perl-5.38.3/lib/site_perl/5.38.3/Geo/FIT.pm line 7992.
Number of field names found: 49
timestamp, position_lat, position_long, distance, time_from_course
total_cycles, accumulated_power, enhanced_speed, enhanced_altitude, altitude
speed, power, grade, compressed_accumulated_power, vertical_speed
calories, vertical_oscillation, stance_time_percent, stance_time, ball_speed
cadence256, total_hemoglobin_conc, total_hemoglobin_conc_min, total_hemoglobin_conc_max, saturated_hemoglobin_percent
saturated_hemoglobin_percent_min, saturated_hemoglobin_percent_max, heart_rate, cadence, compressed_speed_distance
resistance, cycle_length, temperature, speed_1s, cycles
left_right_balance, gps_accuracy, activity_type, left_torque_effectiveness, right_torque_effectiveness
left_pedal_smoothness, right_pedal_smoothness, combined_pedal_smoothness, time128, stroke_type
zone, fractional_cadence, device_index, 1_6_target_power
Note that the uninitialized value warnings are from Geo::FIT. Unfortunately, I don’t know what’s causing them. They appear whenever we fetch data from the FIT file. From now on, I’ll omit these warnings from program output in this article.
As you can see, there’s potentially a lot of information one can obtain from FIT files. I say “potentially” here because not all these fields contain valid data, as we’ll see soon. I was quite surprised at the level of detail. For instance, there are various pedal smoothness values, stroke type, and torque effectiveness parameters. Also, there’s haemoglobin information,6 which I guess is something one can collect given the appropriate peripheral device. What things like enhanced speed and compressed accumulated power mean, I’ve got no idea. For me, the interesting parameters are: timestamp, position_lat, position_long, distance, altitude, speed, power, calories, heart_rate, and cadence. We’ll get around to extracting and looking at these values soon.
Let’s see what values are present in each of the fields. To do this, we’ll change the callback to collect the values in a hash with the field names as the hash keys. Then we’ll return the hash from the callback. Here’s the script I came up with (I called it geo-fit-show-single-values.pl):
1use strict;
2use warnings;
3
4use Geo::FIT;
5use Scalar::Util qw(reftype);
6
7my $fit = Geo::FIT->new();
8$fit->file( "2025-05-08-07-58-33.fit" );
9$fit->open or die $fit->error;
10
11my $record_callback = sub {
12 my ($self, $descriptor, $values) = @_;
13 my @all_field_names = $self->fields_list($descriptor);
14
15 my %event_data;
16 for my $field_name (@all_field_names) {
17 my $field_value = $self->field_value($field_name, $descriptor, $values);
18 $event_data{$field_name} = $field_value;
19 }
20
21 return \%event_data;
22};
23
24$fit->data_message_callback_by_name('record', $record_callback ) or die $fit->error;
25
26my @header_things = $fit->fetch_header;
27
28my $found_event_data = 0;
29do {
30 my $event_data = $fit->fetch;
31 my $reftype = reftype $event_data;
32 if (defined $reftype && $reftype eq 'HASH' && defined %$event_data{'timestamp'}) {
33 for my $key ( sort keys %$event_data ) {
34 print "$key = ", $event_data->{$key}, "\n";
35 }
36 $found_event_data = 1;
37 }
38} while ( !$found_event_data );
39
40$fit->close;
The main changes here (in comparison to the previous script) involve collecting the data into a hash (lines 15-19) and later, after fetching the event data, printing it (lines 32-35).
To collect data from an individual event, we first find out what the available fields are (line 13). Then we loop over each field name (line 16), extracting the values via the field_value() method (line 17). To pass the data outside the callback, we store the value in the %event_data hash using the field name as a key (line 18). Finally, we return the event data as a hash ref (line 21).
When printing the key and value information, we again only want to print the first event that we come across. Hence we use a do-while loop and exit as soon as we’ve found appropriate event data (line 38).
Making sure that we’re only printing relevant event data is again a bit tricky. Not only do we need to make sure that the callback has returned a reference type, but we also need to check that it’s a hash. Plus, we have an extra check to make sure that we’re getting time series data by looking for the presence of the timestamp key (line 32). Without the timestamp key check, we receive data messages unrelated to the ride activity, which we obviously don’t want.
Running this new script gives this output:
$ perl geo-fit-show-single-values.pl
1_6_target_power = 0
accumulated_power = 4294967295
activity_type = 255
altitude = 4.6 m
ball_speed = 65535
cadence = 84 rpm
cadence256 = 65535
calories = 65535
combined_pedal_smoothness = 255
compressed_accumulated_power = 65535
compressed_speed_distance = 255
cycle_length = 255
cycles = 255
device_index = 255
distance = 0.56 m
enhanced_altitude = 4294967295
enhanced_speed = 4294967295
fractional_cadence = 255
gps_accuracy = 255
grade = 32767
heart_rate = 115 bpm
left_pedal_smoothness = 255
left_right_balance = 255
left_torque_effectiveness = 255
position_lat = -11.6387709 deg
position_long = 166.9487493 deg
power = 188 watts
resistance = 255
right_pedal_smoothness = 255
right_torque_effectiveness = 255
saturated_hemoglobin_percent = 65535
saturated_hemoglobin_percent_max = 65535
saturated_hemoglobin_percent_min = 65535
speed = 1.339 m/s
speed_1s = 255
stance_time = 65535
stance_time_percent = 65535
stroke_type = 255
temperature = 127
time128 = 255
time_from_course = 2147483647
timestamp = 2025-05-08T05:58:45Z
total_cycles = 4294967295
total_hemoglobin_conc = 65535
total_hemoglobin_conc_max = 65535
total_hemoglobin_conc_min = 65535
vertical_oscillation = 65535
vertical_speed = 32767
zone = 255
That’s quite a list!
What’s immediately obvious (at least, to me) is that many of the values look like maximum integer range values. For instance, activity_type = 255 suggests that this value ranges from 0 to 255, implying that it’s an 8-bit integer. Also, the numbers 65535 and 4294967295 are the maximum values of 16-bit and 32-bit integers, respectively. This “smells” of dummy values being used to fill the available keys with something other than 0. Thus, I get the feeling that we can ignore such values.
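A quick sanity check of where those numbers come from (not part of the original script): they are the largest values an unsigned 8-bit, 16-bit and 32-bit integer can hold.
# maximum unsigned 8-, 16- and 32-bit integer values
printf "%u %u %u\n", 2**8 - 1, 2**16 - 1, 2**32 - 1;
# prints: 255 65535 4294967295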
Further, most of the values that aren’t only an integer have units attached. For instance, the speed is given as 1.339 m/s and the latitude coordinate as -11.6379448 deg. Note the units associated with these values. The only value without a unit–yet is still a sensible value–is timestamp. This makes sense, as a timestamp doesn’t have a unit.
This is the next part of the puzzle to solve: we need to work out how to extract relevant event data and filter out anything containing a dummy value.
To filter out the dummy values and hence focus only on real event data, we use the fact that real event data contains a string of letters denoting the value’s unit. Thus, the event data we’re interested in has a value containing numbers and letters. Fortunately, this is also the case for the timestamp because it contains timezone information, denoted by the letter Z, meaning UTC. In other words, we can solve our problem with a regex.7
Another way of looking at the problem would be to realise that all the irrelevant data contains only numbers. Thus, if a data value contains a letter, we should select it. Either way, the easiest approach is to look for a letter by using a regex.
I’ve modified the script above to filter out the dummy event data and to collect valid event data into an array for the entire activity.8 Here’s what the code looks like now (I’ve called the file geo-fit-full-data-extraction.pl):
1use strict;
2use warnings;
3
4use Geo::FIT;
5use Scalar::Util qw(reftype);
6
7my $fit = Geo::FIT->new();
8$fit->file( "2025-05-08-07-58-33.fit" );
9$fit->open or die $fit->error;
10
11my $record_callback = sub {
12 my ($self, $descriptor, $values) = @_;
13 my @all_field_names = $self->fields_list($descriptor);
14
15 my %event_data;
16 for my $field_name (@all_field_names) {
17 my $field_value = $self->field_value($field_name, $descriptor, $values);
18 if ($field_value =~ /[a-zA-Z]/) {
19 $event_data{$field_name} = $field_value;
20 }
21 }
22
23 return \%event_data;
24};
25
26$fit->data_message_callback_by_name('record', $record_callback ) or die $fit->error;
27
28my @header_things = $fit->fetch_header;
29
30my $event_data;
31my @activity_data;
32do {
33 $event_data = $fit->fetch;
34 my $reftype = reftype $event_data;
35 if (defined $reftype && $reftype eq 'HASH' && defined %$event_data{'timestamp'}) {
36 push @activity_data, $event_data;
37 }
38} while ( $event_data );
39
40$fit->close;
41
42print "Found ", scalar @activity_data, " entries in FIT file\n";
43my $available_fields = join ", ", sort keys %{$activity_data[0]};
44print "Available fields: $available_fields\n";
The primary difference here with respect to the previous script is the check within the callback for a letter in the field value (line 18). If that’s true, we store the field value in the %event_data hash under a key corresponding to the field name (line 19).
Later, if we have a hash and it has a timestamp key, we push the $event_data hash reference onto an array. This way we store all events related to our activity (line 36). Also, instead of checking that we got only one instance of event data, we’re now looping over all event data in the FIT file, exiting the do-while loop if $event_data is a falsey value.9 Note that $event_data has to be declared outside the do block. Otherwise, it won’t be in scope for the while statement and Perl will barf with a compile-time error. We also declare the @activity_data array outside the do-while loop because we want to use it later.
After processing all records in the FIT file, we display the number of data entries found (line 42) and show a list of the available (valid) fields (lines 43-44).
Running this script gives this output:10
$ perl geo-fit-full-data-extraction.pl
Found 3273 entries in FIT file
Available fields: altitude, cadence, distance, heart_rate, position_lat, position_long, power, speed, timestamp
We now have the full dataset to play with! So what can we do with it? One thing that springs to mind is to calculate the maximum and average values of each data series.
Given the list of available fields, my instincts tell me that it’d be nice to know what the following parameters are:
Let’s calculate them now.
Finding the total distance is very easy. Since this is a cumulative quantity, we only need to select the value in the final data point. Then we convert it to kilometres by dividing by 1000, because the distance data is in units of metres. I.e.:
my $total_distance_m = (split ' ', ${$activity_data[-1]}{'distance'})[0];
my $total_distance = $total_distance_m/1000;
print "Total distance: $total_distance km\n";
Note that since the distance field value also contains its unit, we have to split on spaces and take the first element to extract the numerical part.
To get maximum values (e.g. for maximum speed), we use the max function from List::Util:
1my @speeds = map { (split ' ', $_->{'speed'})[0] } @activity_data;
2my $maximum_speed = max @speeds;
3my $maximum_speed_km = $maximum_speed*3.6;
4print "Maximum speed: $maximum_speed m/s = $maximum_speed_km km/h\n";
Here, I’ve extracted all speed values from the activity data, selecting only the numerical part (line 1). I then found the maximum speed on line 2 (which is in m/s) and converted this into km/h (line 3), displaying both at the end.
Getting average values is a bit more work because List::Util doesn’t provide an arithmetic mean function, commonly known as an “average”. Thus, we have to calculate this ourselves. It’s not much work, though. Here’s the code for the average speed:
1my $average_speed = (sum @speeds) / (scalar @speeds);
2my $average_speed_km = sprintf("%0.2f", $average_speed*3.6);
3$average_speed = sprintf("%0.2f", $average_speed);
4print "Average speed: $average_speed m/s = $average_speed_km km/h\n";
In this code, I’ve used the sum function from List::Util to find the sum of all speed values in the entry data (line 1). Dividing this value by the length of the array (i.e. scalar @speeds) gives the average value. Because this value will have lots of decimal places, I’ve used sprintf to show only two decimal places (this is what the "%0.2f" format statement does on line 3). Again, I’ve calculated the value in km/h (line 2) and show the average speed in both m/s and km/h.
Extending the code to calculate and display all parameters I mentioned above, we get this:
my $total_distance_m = (split ' ', ${$activity_data[-1]}{'distance'})[0];
my $total_distance = $total_distance_m/1000;
print "Total distance: $total_distance km\n";
my @speeds = map { (split ' ', $_->{'speed'})[0] } @activity_data;
my $maximum_speed = max @speeds;
my $maximum_speed_km = $maximum_speed*3.6;
print "Maximum speed: $maximum_speed m/s = $maximum_speed_km km/h\n";
my $average_speed = (sum @speeds) / (scalar @speeds);
my $average_speed_km = sprintf("%0.2f", $average_speed*3.6);
$average_speed = sprintf("%0.2f", $average_speed);
print "Average speed: $average_speed m/s = $average_speed_km km/h\n";
my @powers = map { (split ' ', $_->{'power'})[0] } @activity_data;
my $maximum_power = max @powers;
print "Maximum power: $maximum_power W\n";
my $average_power = (sum @powers) / (scalar @powers);
$average_power = sprintf("%0.2f", $average_power);
print "Average power: $average_power W\n";
my @heart_rates = map { (split ' ', $_->{'heart_rate'})[0] } @activity_data;
my $maximum_heart_rate = max @heart_rates;
print "Maximum heart rate: $maximum_heart_rate bpm\n";
my $average_heart_rate = (sum @heart_rates) / (scalar @heart_rates);
$average_heart_rate = sprintf("%0.2f", $average_heart_rate);
print "Average heart rate: $average_heart_rate bpm\n";
If you’re following along at home–and assuming that you’ve added this code to the end of geo-fit-full-data-extraction.pl–when you run the file, you should see output like this:
$ perl geo-fit-full-data-extraction.pl
Found 3273 entries in FIT file
Available fields: altitude, cadence, distance, heart_rate, position_lat,
position_long, power, speed, timestamp
Total distance: 31.10591 km
Maximum speed: 18.802 m/s = 67.6872 km/h
Average speed: 9.51 m/s = 34.23 km/h
Maximum power: 1023 W
Average power: 274.55 W
Maximum heart rate: 165 bpm
Average heart rate: 142.20 bpm
Nice! That gives us more of a feel for the data and what we can learn from it. We can also see that I was working fairly hard on this bike ride as seen from the average power and average heart rate data.
One thing to highlight about these numbers, from my experience riding both indoors and outdoors, is that the average speed on Zwift is too high. Were I riding my bike outside on the road, I’d be more likely to have an average speed of ~25 km/h, not the 34 km/h shown here. I think this discrepancy comes from Zwift not accurately converting power output into speed within the game.11 I’m not sure where the discrepancy comes from. Perhaps I don’t go as hard when out on the road? Dunno.
From experience, I know that it’s easier to put in more effort over shorter periods. Thus, I’d expect the average speed to be a bit higher indoors when doing shorter sessions. Another factor is that when riding outside one has to contend with stopping at intersections and traffic lights etc. Stopping and starting brings down the average speed on outdoor rides. These considerations might explain part of the discrepancy, but I don’t think it explains it all.
There’s some duplication in the above code that I could remove. For instance, the code for extracting the numerical part of a data entry’s value should really be in its own function. I don’t need to map over a split each time; those are just implementation details that should hide behind a nicer interface. Also, the average value calculation would be better in its own function.
A possible refactoring to reduce this duplication could look like this:
# extract and return the numerical parts of an array of FIT data values
sub num_parts {
my $field_name = shift;
my @activity_data = @_;
return map { (split ' ', $_->{$field_name})[0] } @activity_data;
}
# return the average of an array of numbers
sub avg {
my @array = @_;
return (sum @array) / (scalar @array);
}
which one would use like so:
my @speeds = num_parts('speed', @activity_data);
my $average_speed = avg(@speeds);
Seeing numerical values of ride statistics is all well and good, but it’s much nicer to see a picture of the data. To do this, we need to plot it.
But that’s a story for another time.
Note that I’m not affiliated with Zwift. I use the platform for training, especially for short rides, when the weather’s bad and in the winter. ↩︎
Note that I’m not affiliated with Garmin. I own a Garmin Edge 530 head unit and find that it works well for my needs. ↩︎
Note that I’m not affiliated with Strava. I’ve found the platform to be useful for individual ride analysis and for collating a year’s worth of training. ↩︎
There are different kinds of data messages. We usually want records as these messages contain event data from sporting activities. ↩︎
For those wondering: these coordinates would put me on the island of Teanu, which is part of the Santa Cruz Islands. This island group is north of Vanuatu and east of the Solomon Islands in the Pacific Ocean. ↩︎
I expected this field to be spelled ‘haemoglobin’ rather than hemoglobin. Oh well. ↩︎
Jeff Atwood wrote an interesting take on the use of regular expressions. ↩︎
Garmin calls a complete ride (or run, if you’re that way inclined) an “activity”. Hence I’m using their nomenclature here. ↩︎
Remember that fetch() returns undef on failure or EOF. ↩︎
Note that I’ve removed the uninitialized value warnings from the script output. ↩︎
Even though Zwift is primarily a training platform, it is also a game. There are power-ups and other standard gaming features such as experience points (XP). Accumulating XP allows you to climb up a ladder of levels which then unlocks other features and in-game benefits. This is the first computer game I’ve ever played where strength and fitness in real life play a major role in the in-game success. ↩︎
Published by alh on Sunday 22 June 2025 17:38
Tony writes: ``` [Hours] [Activity] 2025/04/01 Tuesday 0.22 #23151 check CI results, fix minitest and re-push 1.77 #23160 try to decode how the NEED_ stuff works, try leont’s suggestion and test, push for CI 0.82 #22125 check smoke results, rebase and push 0.50 #21878 consider how to implement this
3.84
2025/04/02 Wednesday 0.23 #23075 rebase and squash some, push for CI 0.98 test-dist-modules threaded testing: check CI results, remove 5.8, clean up commits, push for CI 0.10 #23075 check CI results and apply to blead
1.59
2025/04/03 Thursday 0.37 #23151 check CI results, open PR 23171 1.60 #17601 side-issue: check history, testing, find an unrelated problem, work on a fix, testing 0.20 #17601 side-issue: push fix for CI, comment and mark
2.17
2025/04/07 Monday 0.15 #22120 follow-up 1.57 #23151 add suggested change, testing and push 0.62 #23172 review and comment 0.20 #23177 review, research and apply to blead 0.37 #17601 side-issue: check CI results, add perldelta, cleanup commit message, open PR 23178 0.60 #23022 clean up, add perldelta, push for CI
4.24
2025/04/08 Tuesday 0.53 #17601 research, minor fix and comment 0.08 #22125 fix test failure 0.48 #17601 side-issue: testing, research and comment 0.55 #16608 reproduce, code review
3.26
2025/04/09 Wednesday 1.23 #17601 side issue: add a panic message, research and comment 2.40 #16608 research, try to reproduce some other cases, comment, work on fixes, tests, work class initfields similar bug 1.83 #16608 fix an issue with smartmatch fix, work on initfields fix, testing, perldelta, push for CI, smoke-me 0.33 #17601 test another build configuration, minor fix and push 0.28 #23151 testing
6.30
2025/04/10 Thursday 0.32 #16608 fix a minor issue and re-push 0.13 #23165 review updates and approve 2.28 look into smoke test failures, ASAN detected leak from op/signatures, debugging, make #23187 2.28 op/signatures leak: debugging, work it out (I think), work
5.01
2025/06/14 Saturday 3.45 #23022 re-check, minor re-work, testing, push
3.80
2025/04/15 Tuesday 1.15 #23187 consider re-work, minor fix, testing, perldelta, push for CI 0.70 document that TARG isn’t pristine and the implications, open #23196 0.60 #16608 check smoke results, debugging and fix, push for CI/smoke 1.13 #22125 clean up commit history, testing, perldelta, more
3.58
2025/04/16 Wednesday 0.23 #23196 edits as suggested and push 1.50 #23187 check CI results, investigate ASAN results, which appear unrelated, open PR 23203 0.67 #23201 review, research a lot, approve 0.20 #16608 check CI results, make PR 23204 0.63 #1674 review history and research, comment since I’m
3.23
2025/04/22 Tuesday 0.17 #23207 review, research and approve 0.92 #23208 review, testing and comment 1.80 #23202 review, testing 0.67 #23202 more review, testing 0.37 #23202 more review, comments 0.25 #23208 research and comment
4.61
2025/04/23 Wednesday 0.30 #23202 review responses 0.80 #23172 review updates, approve 0.22 #1674 research 1.63 #1674 more research, minor change, testing, push for CI 0.45 #3965 testing 0.23 #3965 more testing, comment and mark “Closable?” 0.10 #1674 review CI results and make PR 23219
4.95
2025/04/24 Thursday 0.22 #23216 review and approve 0.08 #23217 review and approve 0.08 #23220 review and approve 1.10 #23215 testing, look if we can eliminate the conditional from cSVOPx_sv() on threads (we can’t directly, the non- pad sv is used at compile-time), approve 0.35 #23208 review, research, comments 1.27 #4106 research 2.70 #4106 testing for potential bugs and misbehaviour, chainsaw for w32_fdpid and make it like everyone else,
5.80
2025/04/28 Monday 0.35 #20841 comment 2.38 #22374 minor fixes, testing, force push to update, comments 0.13 #23226 review and approve 0.70 #23227 review, research, check build logs and comment
4.01
2025/04/29 Tuesday 0.42 #23228 check updates and approve 0.63 #23227 testing and comment 1.07 #23225 start review
3.35
2025/04/30 Wednesday 1.28 #23227 review, testing, research and approve with comment 0.68 #4106 check results, look for existing tests that might test this, testing 2.23 #4106 review history, work on a new test, testing, push for CI 0.83 #23232 review docs, open Dual-Life/experimental#22 which
5.02
Which I calculate is 64.76 hours.
Approximately 33 tickets were reviewed or worked on, and 2 patches were applied. ```
Published on Sunday 22 June 2025 16:45
The examples used here are from the weekly challenge problem statement and demonstrate the working solution.
You are given a date in the format YYYY-MM-DD. Write a script to find the day number of the year that the given date represents.
The core of the solution is contained in a main loop. The resulting code can be contained in a single file.
The answer is arrived at via a fairly straightforward calculation.
sub day_of_year {
my ($date) = @_;
my $day_of_year = 0;
my ($year, $month, $day) = split /-/, $date;
⟨determine if this is a leap year 3 ⟩
my @days_in_month = (31, $february_days, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31);
$day_of_year += $days_in_month[$_] for (0 .. $month - 2);
$day_of_year += $day;
return $day_of_year;
}
◇
Let’s break the logic for computing a leap year into its own section. A leap year occurs every 4 years, except for years that are divisible by 100, unless they are also divisible by 400.
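The fragment itself isn’t reproduced in this excerpt, but a minimal version following that rule might look like this; it just needs to leave $february_days set for the @days_in_month list above.
# determine if this is a leap year: divisible by 4, except for
# centuries that aren't divisible by 400
my $february_days = 28;
$february_days = 29
    if ($year % 4 == 0 && $year % 100 != 0) || ($year % 400 == 0);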
Just to make sure things work as expected we’ll define a few short tests. The double chop is just a lazy way to make sure there aren’t any trailing commas in the output.
MAIN:{
say day_of_year q/2025-02-02/;
say day_of_year q/2025-04-10/;
say day_of_year q/2025-09-07/;
}
◇
Fragment referenced in 1.
$ perl perl/ch-1.pl
33
100
250
You are given an array of positive integers having even elements. Write a script to return the decompressed list. To decompress, pick an adjacent pair (i, j) and replace it with j, i times.
For fun let’s use recursion!
Sometimes when I write a recursive subroutine in Perl I use a reference variable to set the return value. Other times I just use an ordinary return. In some cases, for convenience, I’ll do this with two subroutines. One of these is a wrapper which calls the main recursion.
For this problem I’ll do something a little different. I’ll have one subroutine and for each recursive call I’ll add in an array reference to hold the accumulating return value.
Note that we take advantage of Perl’s automatic list flattening when pushing to the array reference holding the new list we are building.
sub decompress_list{
my $r = shift @_;
if(
!ref($r) || ref($r) ne q/ARRAY/){
unshift @_, $r;
$r = [];
}
unless(@_ == 0){
my $i = shift @_;
my $j = shift @_;
push @{$r}, ($j) x $i;
decompress_list($r, @_);
}
else{
return @{$r};
}
}
◇
Fragment referenced in 5.
The main section is just some basic tests.
MAIN:{
say join q/, /, decompress_list 1, 3, 2, 4;
say join q/, /, decompress_list 1, 1, 2, 2;
say join q/, /, decompress_list 3, 1, 3, 2;
}
◇
Fragment referenced in 5.
$ perl perl/ch-2.pl
3, 4, 4
1, 2, 2
1, 1, 1, 2, 2, 2
Published by prz on Saturday 21 June 2025 22:13
Published by alh on Tuesday 17 June 2025 16:07
Paul writes:
As earlier reported, I managed to make some progress on the
faster-signatures
work, as well as some other things.
^^=
operator
Total: 9 hours
Published by alh on Tuesday 17 June 2025 15:57
Dave writes:
A bit of a quiet month.
I checked blead for any performance regressions compared with 5.40.0, using Porting/bench.pl. I found only one significant one: UTF8 string literals were inadvertently no longer being created Copy-on-Write.
I created a PR which improves how OPs are dumped on threaded builds. This will make certain types of debugging easier in the future.
Fixed a bug.
Tweaked my ParseXS AST PR.
Summary:
Total:
Published on Tuesday 17 June 2025 00:00
Earlier this week, I read a post from someone who failed a job interview because they used a hash slice in some sample code and the interviewer didn’t believe it would work.
That’s not just wrong — it’s a teachable moment. Perl has several kinds of slices, and they’re all powerful tools for writing expressive, concise, idiomatic code. If you’re not familiar with them, you’re missing out on one of Perl’s secret superpowers.
In this post, I’ll walk through all the main types of slices in Perl — from the basics to the modern conveniences added in recent versions — using a consistent, real-world-ish example. Whether you’re new to slices or already slinging %hash{...} like a pro, I hope you’ll find something useful here.
Let’s imagine you’re writing code to manage employees in a company. You’ve got an array of employee names and a hash of employee details.
my @employees = qw(alice bob carol dave eve);

my %details = (
    alice => 'Engineering',
    bob   => 'Marketing',
    carol => 'HR',
    dave  => 'Engineering',
    eve   => 'Sales',
);
We’ll use these throughout to demonstrate each kind of slice.
List slices are slices from a literal list. They let you pick multiple values from a list in a single operation:
my @subset = (qw(alice bob carol dave eve))[1, 3]; # @subset = ('bob', 'dave')
You can also destructure directly:
my ($employee1, $employee2) = (qw(alice bob carol))[0, 2]; # $employee1 = 'alice', $employee2 = 'carol'
Simple, readable, and no loop required.
Array slices are just like list slices, but from an array variable:
my @subset = @employees[0, 2, 4]; # @subset = ('alice', 'carol', 'eve')
You can also assign into an array slice to update multiple elements:
@employees[1, 3] = ('beatrice', 'daniel'); # @employees = ('alice', 'beatrice', 'carol', 'daniel', 'eve')
Handy for bulk updates without writing explicit loops.
This is where some people start to raise eyebrows — but hash slices are perfectly valid Perl and incredibly useful.
Let’s grab departments for a few employees:
my @departments = @details{'alice', 'carol', 'eve'}; # @departments = ('Engineering', 'HR', 'Sales')
The @ sigil here indicates that we’re asking for a list of values, even though %details is a hash.
You can assign into a hash slice just as easily:
@details{'bob', 'carol'} = ('Support', 'Legal');
This kind of bulk update is especially useful when processing structured data or transforming API responses.
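As a small illustration of that kind of use (my example, not from the original post), a hash slice lets you copy just the keys you care about out of a larger structure, such as a decoded API response, in a single assignment; the data below is invented.
my %api_response = (
    id    => 42,
    name  => 'alice',
    role  => 'engineer',
    token => 's3cr3t',
);

# copy only the wanted fields into a new hash in one go
my %summary;
@summary{ qw(id name) } = @api_response{ qw(id name) };
# %summary = (id => 42, name => 'alice')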
Starting in Perl 5.20, you can use %array[...] to return index/value pairs — a very elegant way to extract and preserve positions in a single step.
my @indexed = %employees[1, 3]; # @indexed = (1 => 'bob', 3 => 'dave')
You get a flat list of index/value pairs. This is particularly helpful when mapping or reordering data based on array positions.
You can even delete from an array this way:
my @removed = delete %employees[0, 4]; # @removed = (0 => 'alice', 4 => 'eve')
And afterwards you’ll have this:
# @employees = (undef, 'bob', 'carol', 'dave', undef)
The final type of slice — also added in Perl 5.20 — is the %hash{...} key/value slice. This returns a flat list of key/value pairs, perfect for passing to functions that expect key/value lists.
my @kv = %details{'alice', 'dave'}; # @kv = ('alice', 'Engineering', 'dave', 'Engineering')
You can construct a new hash from this easily:
my %engineering = (%details{'alice', 'dave'});
This avoids intermediate looping and makes your code clear and declarative.
Type | Syntax | Returns | Added in |
---|---|---|---|
List slice | (list)[@indices] | Values | Ancient |
Array slice | @array[@indices] | Values | Ancient |
Hash slice | @hash{@keys} | Values | Ancient |
Index/value array slice | %array[@indices] | Index-value pairs | Perl 5.20 |
Key/value hash slice | %hash{@keys} | Key-value pairs | Perl 5.20 |
If someone tells you that @hash{...} or %array[...] doesn’t work — they’re either out of date or mistaken. These forms are standard, powerful, and idiomatic Perl.
Slices make your code cleaner, clearer, and more concise. They let you express what you want directly, without boilerplate. And yes — they’re perfectly interview-appropriate.
So next time you’re reaching for a loop to pluck a few values from a hash or an array, pause and ask: could this be a slice?
If the answer’s yes — go ahead and slice away.
The post A Slice of Perl first appeared on Perl Hacks.
Published by prz on Saturday 14 June 2025 23:27
Published on Thursday 12 June 2025 22:13
The examples used here are from the weekly challenge problem statement and demonstrate the working solution.
You are given a binary array containing only 0 or/and 1. Write a script to find out the maximum consecutive 1 in the given array.
The core of the solution is contained in a main loop. The resulting code can be contained in a single file.
We’ll use a recursive procedure, which we’ll call from a subroutine which sets up some variables. We’ll pass scalar references to a recursive subroutine. When the recursion completes the $max_consecutive variable will hold the final answer.
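The wrapper subroutine itself isn’t shown in this fragment; a minimal sketch of what it might look like (treat the details as an assumption) is a subroutine that copies the input, initialises the two counters, kicks off the recursion with references to them, and returns the maximum:

sub consecutive_one{
    my @i = @_;
    my($consecutive, $max_consecutive) = (0, 0);
    consecutive_one_r(\@i, \$consecutive, \$max_consecutive);
    return $max_consecutive;
}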
Now, let’s define our recursion. We’ll terminate the recursion when we’ve exhausted the input array.
sub consecutive_one_r{
my($i, $consecutive, $max_consecutive) = @_;
my $x;
if(@{$i} > 1){ # more than one element remains to process
$x = pop @{$i};
if($x == 0){
$$max_consecutive = $$consecutive if $$consecutive > $$max_consecutive;
$$consecutive = 0;
}
if($x == 1){
$$consecutive++;
}
consecutive_one_r($i, $consecutive, $max_consecutive);
}
elsif(@{$i} == 1){
$x = pop @{$i};
if($x == 0){
$$max_consecutive = $$consecutive if $$consecutive > $$max_consecutive;
}
if($x == 1){
$$consecutive++;
$$max_consecutive = $$consecutive if $$consecutive > $$max_consecutive;
}
consecutive_one_r($i, $consecutive, $max_consecutive);
}
}
◇
Just to make sure things work as expected we’ll define a few short tests.
MAIN:{
say consecutive_one(0, 1, 1, 0, 1, 1, 1);
say consecutive_one(0, 0, 0, 0);
say consecutive_one(1, 0, 1, 0, 1, 1);
}
◇
Fragment referenced in 1.
$ perl perl/ch-1.pl
3
0
2
You are given an array of item prices. Write a script to find out the final price of each items in the given array. There is a special discount scheme going on. If there’s an item with a lower or equal price later in the list, you get a discount equal to that later price (the first one you find in order).
Hey, let’s use recursion again for this too!
The main section is just some basic tests.
MAIN:{
say join q/, /, calculate_lowest_prices 8, 4, 6, 2, 3;
say join q/, /, calculate_lowest_prices 1, 2, 3, 4, 5;
say join q/, /, calculate_lowest_prices 7, 1, 1, 5;
}
◇
Fragment referenced in 5.
First, let’s introduce a recursive subroutine that scans ahead and finds the next lowest price in the list. As in part one we’ll use a scalar reference.
With that subroutine defined we can use it to solve the main task at hand.
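Those two fragments aren’t reproduced here, so the following is only an illustrative sketch of how they might fit together: a recursive helper that scans ahead for the first later price that is lower or equal, and the calculate_lowest_prices subroutine (that name comes from the tests above) which applies the discount to each item in turn. The helper name and both bodies are assumptions:

sub next_lowest_price_r{
    my($prices, $price, $discount) = @_;
    return if @{$prices} == 0;
    if($prices->[0] <= $price){
        $$discount = $prices->[0];
        return;
    }
    next_lowest_price_r([@{$prices}[1 .. @{$prices} - 1]], $price, $discount);
}

sub calculate_lowest_prices{
    my @prices = @_;
    my @final_prices;
    for my $i (0 .. @prices - 1){
        my $discount = 0;
        next_lowest_price_r([@prices[$i + 1 .. @prices - 1]], $prices[$i], \$discount);
        push @final_prices, $prices[$i] - $discount;
    }
    return @final_prices;
}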
$ perl perl/ch-2.pl
4, 2, 4, 2, 3
1, 2, 3, 4, 5
6, 0, 1, 5
Back in January, I wrote a blog post about adding JSON-LD to your web pages to make it easier for Google to understand what they were about. The example I used was my ReadABooker site, which encourages people to read more Booker Prize shortlisted novels (and to do so by buying them using my Amazon Associate links).
I’m slightly sad to report that in the five months since I implemented that change, visits to the website have remained pretty much static and I have yet to make my fortune from Amazon kickbacks. But that’s ok, we just use it as an excuse to learn more about SEO and to apply more tweaks to the website.
I’ve been using the most excellent Ahrefs site to get information about how good the on-page SEO is for many of my sites. Every couple of weeks, Ahrefs crawls the site and will give me a list of suggestions of things I can improve. And for a long time, I had been putting off dealing with one of the biggest issues – because it seemed so difficult.
The site didn’t have enough text on it. You could get lists of Booker years, authors and books. And, eventually, you’d end up on a book page where, hopefully, you’d be tempted to buy a book. But the book pages were pretty bare – just the title, author, year they were short-listed and an image of the cover. Oh, and the all-important “Buy from Amazon” button. Ahrefs was insistent that I needed more text (at least a hundred words) on a page in order for Google to take an interest in it. And given that my database of Booker books included hundreds of books by hundreds of authors, that seemed like a big job to take on.
But, a few days ago, I saw a solution to that problem – I could ask ChatGPT for the text.
I wrote a blog post in April about generating a daily-updating website using ChatGPT. This would be similar, but instead of writing the text directly to a Jekyll website, I’d write it to the database and add it to the templates that generate the website.
Adapting the code was very quick. Here’s the finished version for the book blurbs.
#!/usr/bin/env perl

use strict;
use warnings;
use builtin qw[trim];
use feature 'say';

use OpenAPI::Client::OpenAI;
use Time::Piece;
use Encode qw[encode];

use Booker::Schema;

my $sch = Booker::Schema->get_schema;

my $count = 0;
my $books = $sch->resultset('Book');

while ($count < 20 and my $book = $books->next) {
    next if defined $book->blurb;
    ++$count;
    my $blurb = describe_title($book);
    $book->update({ blurb => $blurb });
}

sub describe_title {
    my ($book) = @_;

    my ($title, $author) = ($book->title, $book->author->name);

    my $debug = 1;

    my $api_key = $ENV{"OPENAI_API_KEY"} or die "OPENAI_API_KEY is not set\n";
    my $client = OpenAPI::Client::OpenAI->new;

    my $prompt = join " ",
        'Produce a 100-200 word description for the book',
        "'$title' by $author",
        'Do not mention the fact that the book was short-listed for (or won)',
        'the Booker Prize';

    my $res = $client->createChatCompletion({
        body => {
            model => 'gpt-4o',
            # model => 'gpt-4.1-nano',
            messages => [
                { role => 'system', content => 'You are someone who knows a lot about popular literature.' },
                { role => 'user', content => $prompt },
            ],
            temperature => 1.0,
        },
    });

    my $text = $res->res->json->{choices}[0]{message}{content};
    $text = encode('UTF-8', $text);
    say $text if $debug;

    return $text;
}
There are a couple of points to note:
I then produced a similar program that did the same thing for authors. It’s similar enough that the next time I need something like this, I’ll spend some time turning it into a generic program.
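To give an idea of the direction that generic version might take (this is a hypothetical sketch, not code from either script; the helper name and its parameters are mine), the OpenAI call could be factored into a single subroutine that takes a prompt and a few options:

use strict;
use warnings;
use Encode qw[encode];
use OpenAPI::Client::OpenAI;

sub generate_description {
    my ($prompt, %opt) = @_;

    # Assumes OPENAI_API_KEY is set in the environment, as in the script above
    my $client = OpenAPI::Client::OpenAI->new;

    my $res = $client->createChatCompletion({
        body => {
            model    => $opt{model} // 'gpt-4o',
            messages => [
                { role => 'system',
                  content => $opt{system} // 'You are someone who knows a lot about popular literature.' },
                { role => 'user', content => $prompt },
            ],
            temperature => $opt{temperature} // 1.0,
        },
    });

    return encode('UTF-8', $res->res->json->{choices}[0]{message}{content});
}

# e.g. for an author rather than a book:
# my $bio = generate_description("Produce a 100-200 word biography of the novelist Damian Conway");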
I then added the new database fields to the book and author templates and re-published the site. You can see the results in, for example, the pages for Salman Rushdie and Midnight’s Children.
I had one more slight concern going into this project. I pay for access to the ChatGPT API. I usually have about $10 in my pre-paid account and I really had no idea how much this was going to cost me. I needn’t have worried. Here’s a graph showing the bump in my API usage on the day I ran the code for all books and authors:
But you can also see that my total costs for the month so far are $0.01!
So, all-in-all, I call that a success and I’ll be using similar techniques to generate content for some other websites.
The post Generating Content with ChatGPT first appeared on Perl Hacks.
Published on Sunday 08 June 2025 12:36
The examples used here are from the weekly challenge problem statement and demonstrate the working solution.
You are given an array of integers and two integers $r and $c. Write a script to create two dimension array having $r rows and $c columns using the given array.
The core of the solution is contained in a main loop. The resulting code can be contained in a single file.
sub create_array{
my($i, $r, $c) = @_;
my @a = ();
for (0 .. $r - 1){
my $row = [];
for (0 .. $c - 1){
push @{$row}, shift @{$i};
}
push @a, $row;
}
return @a;
}
◇
Fragment referenced in 1.
Just to make sure things work as expected we’ll define a few short tests. The double chop is just a lazy way to make sure there aren’t any trailing commas in the output.
MAIN:{
my $s = q//;
$s .= q/(/;
do{
$s.= (q/[/ . join(q/, /, @{$_}) . q/], /);
} for create_array [1, 2, 3, 4], 2, 2;
chop $s;
chop $s;
$s .= q/)/;
say $s;
$s = q//;
$s .= q/(/;
do{
$s.= (q/[/ . join(q/, /, @{$_}) . q/], /);
} for create_array [1, 2, 3], 1, 3;
chop $s;
chop $s;
$s .= q/)/;
say $s;
$s = q//;
$s .= q/(/;
do{
$s.= (q/[/ . join(q/, /, @{$_}) . q/], /);
} for create_array [1, 2, 3, 4], 4, 1;
chop $s;
chop $s;
$s .= q/)/;
say $s;
}
◇
Fragment referenced in 1.
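As an aside, the same output could be produced without the chopping, for example with a small helper like this (not part of the original solution, just an alternative sketch):

sub format_array{
    my(@rows) = @_;
    return q/(/ . join(q/, /, map { q/[/ . join(q/, /, @{$_}) . q/]/ } @rows) . q/)/;
}

# say format_array(create_array([1, 2, 3, 4], 2, 2));   # prints ([1, 2], [3, 4])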
$ perl perl/ch-1.pl
([1, 2], [3, 4])
([1, 2, 3])
([1], [2], [3], [4])
You are given an array of integers. Write a script to return the sum of total XOR for every subset of given array.
This is another short one, but with a slightly more involved solution. We are going to compute the Power Set (set of all subsets) of the given array of integers and then for each of these sub-arrays compute and sum the XOR results.
The main section is just some basic tests.
MAIN:{
say calculate_total_xor 1, 3;
say calculate_total_xor 5, 1, 6;
say calculate_total_xor 3, 4, 5, 6, 7, 8;
}
◇
Fragment referenced in 4.
sub calculate_total_xor{
my $total = 0;
for my $a (power_set @_){
my $t = 0;
$t = eval join q/ ^ /, ($t, @{$a});
$total += $t;
}
return $total;
}
◇
Fragment referenced in 4.
The Power Set can be computed by using a binary counter. Let’s say we have N elements in the set. We start with a counter of N zero bits ('0' x N) and count up to N one bits ('1' x N). At each iteration we compose a subarray by including the ith element from the original array if the ith bit is set. Actually, we aren’t going to start at all zeros, because we want to exclude the empty set for the purposes of the later XOR computation.
sub power_set{
my @a = ();
for my $i (1 .. 2 ** @_- 1){
my @digits = ();
for my $j (0 .. @_ - 1){
push @digits, $_[$j] if 1 == ($i >> $j & 1);
}
push @a, \@digits;
}
return @a;
}
◇
Fragment referenced in 4.
$ perl perl/ch-2.pl
6
28
480
Power Set Defined
Power Set Calculation (C++) from TWC 141
The Weekly Challenge 324
Generated Code
Published by prz on Saturday 07 June 2025 22:56
This is the weekly favourites list of CPAN distributions. Votes count: 22
This week there isn't any remarkable distribution
Build date: 2025/06/07 20:53:44 GMT
Clicked for first time:
Increasing its reputation:
Published by Robert Rothenberg on Friday 06 June 2025 15:00
The 2-argument open function is insecure, because the filename can include the mode. If it is not properly validated, then files can be modified, truncated or in the case of a pipe character, run an external command.
$file = "| echo Aha";
open my $fh, $file; # <-- THIS IS BAD
This will execute the command embedded in $file.
Even when the filename is generated by your code, you can run into unexpected edge cases. For example, in a Unix shell run the command
touch '| echo Aha'
and in the same directory run the script
opendir( my $dh, ".");
while ( my $file = readdir $dh ) {
next if -d $file;
open my $fh, $file; # <-- THIS IS BAD
close $fh;
}
This is more subtle, and will execute the command embedded in that filename.
It is the same bug in File::Find::Rule that became CVE-2011-10007. (If you haven’t already upgraded File::Find::Rule to version 0.35 or later, please do so. That module has more than 1,700 direct or indirect dependents.)
The SEI CERT Perl Coding Standard recommends against using the two-argument form of open().
The fix is simply to use a 3-argument form, where the second argument is the mode and the third is the filename:
open my $fh, '<', $file;
The 3-argument open has been supported since Perl v5.6.0, so there is no worry about supporting older versions of Perl.
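Applied to the directory-scanning example above, the safe version of that loop might look like this:

opendir( my $dh, "." );
while ( my $file = readdir $dh ) {
    next if -d $file;
    open my $fh, '<', $file or die "Cannot open $file: $!";  # mode given explicitly
    close $fh;
}
closedir $dh;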
You can identify this issue in your code using the Perl::Critic ProhibitTwoArgOpen policy. There is a similar policy in Perl::Lint.
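For example, assuming you have the perlcritic command-line tool installed, something along these lines should flag any remaining offenders (the exact invocation may vary with your setup):

$ perlcritic --single-policy InputOutput::ProhibitTwoArgOpen lib/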
I seldom release modules to CPAN; mainly because there’s so much great stuff there already. An answer on StackOverflow about pretty printing DBIx::Class result sets got me thinking. I then climbed onto the shoulders of several giants to create a wee module which does just that. Introducing DBIx::Class::ResultSet::PrettyPrint!
Strangely enough, I’d released DBIx::Class::ResultSet::PrettyPrint in 2024 but had never gotten around to mentioning it anywhere. This post rectifies that omission, gives some background about the module, and discusses a small usage example.
One could say that this is a delta-epsilon1 module in that it doesn’t extend things very much. Although it doesn’t constitute a large change, it does make printing DBIx::Class result sets easier. It stands upon the shoulders of several giants, so all I can claim is to have bundled the idea into a module.
The original impetus for DBIx::Class::ResultSet::PrettyPrint came from wanting to pretty print result sets in a Perl project I’ve been working on.2 I find that by seeing the data within a result set, I can get a feeling for what the data looks like and what kinds of information it contains. Searching for a pretty printing module, I stumbled across an answer on StackOverflow about pretty printing DBIx::Class result sets. I remember thinking that the proposed solution looked nice and I used the pattern a couple of times in my work. I eventually realised that the approach would be easier to use as a module. Since then, I’ve found it handy as a way to get an idea of the shape of the data that I’m playing with.
I made some small changes to the solution proposed on StackOverflow. For instance, it recommended using Text::Table, but I found the table output generated by Text::Table::Tiny nicer.
This is why DBIx::Class::ResultSet::PrettyPrint uses Text::Table::Tiny to generate tables. For instance, the output has + symbols at the table cell corners, which is reminiscent of how Postgres displays tables within psql. This I found to be a nice touch.
Of course, if one has large database tables with many columns and/or rows, this module might not be so useful. Yet, since it operates on result sets, one can create a result set with a subset of a given table and then pretty print that.
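For example (an illustrative snippet using the $books result set and $pp pretty printer objects from the books example developed below), one might narrow the result set before printing it:

my $perl_books = $books->search( { title => { -like => '%Perl%' } } );
$pp->print_table( $perl_books );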
Although one often talks about pretty printing database tables, really the module operates on DBIx::Class::ResultSet objects. Hence, there isn’t a strict one-to-one relationship between database tables and what the pretty printer operates on. This is why the module was useful in one of my current projects: sometimes there wasn’t a database table behind the ResultSet I was investigating. For instance, by querying the database directly with psql, it wasn’t (easily) possible to work out what form the data had and what kinds of information it contained. Using DBIx::Class::ResultSet::PrettyPrint made this investigative work much easier.
So, how to use the module? A small example should make things clear.
Let’s see the module in action. First off, we’ll need to install it:
$ cpanm DBIx::Class::ResultSet::PrettyPrint
This will pull in several CPAN modules, so you’ll need to wait a bit until it’s finished. For instance, on my test system, it took 22 minutes to download, build, test, and install the necessary 79 distributions. It’ll probably take less time if you’ve already got many of the upstream dependencies installed on your system.
Once that’s done, we can set up an example project. We’ll need to set up a DBIx::Class project, so there’s a bit of upfront work to do.
I’m a book fan, so let’s create a project to store metadata about some of my books. We only need one database table in this small example, so it won’t take long to set up.
I’ve got lots of books about Perl and a few about Unix, so let’s call the project “Perl and Unix library”. To give you an idea of what I mean, here’s a “shelfie”:
Create a directory for the project and change into the new directory:
$ mkdir perl-and-unix-library
$ cd perl-and-unix-library
Now we need to create the directory structure for our DBIx::Class schema:
$ mkdir -p lib/Schema/Result/
We’ll need a stub Schema package that we can use later to inspect the database’s contents. So, create a file called lib/Schema.pm and fill it with this code:
package Schema;
use strict;
use warnings;
use base qw(DBIx::Class::Schema);
__PACKAGE__->load_namespaces();
1;
# vim: expandtab shiftwidth=4
We need to tell DBIx::Class about the structure of our books table, so create a file called lib/Schema/Result/Book.pm and fill it with this content:
package Schema::Result::Book;
use strict;
use warnings;
use base qw(DBIx::Class::Core);
use lib '.t/lib';
__PACKAGE__->table('books');
__PACKAGE__->add_columns(
id => {
data_type => 'integer',
size => 16,
is_nullable => 0,
is_auto_increment => 1,
},
title => {
data_type => 'varchar',
size => 128,
is_nullable => 0,
},
author => {
data_type => 'varchar',
size => 128,
is_nullable => 0,
},
pub_date => {
data_type => 'date',
is_nullable => 0,
},
num_pages => {
data_type => 'integer',
size => 16,
is_nullable => 0,
},
isbn => {
data_type => 'varchar',
size => 32,
is_nullable => 0,
},
);
__PACKAGE__->set_primary_key('id');
1;
# vim: expandtab shiftwidth=4
This defines our books database table in which we’re storing title, author, publication date, number of pages, and ISBN information about each of our books.
We’ve now got enough structure for DBIx::Class to create and query a database. That means we can add some books to the database.
Create a file in the project’s root directory called create-books-db.pl and fill it with this content:
use strict;
use warnings;
use lib './lib';
use Schema;
my $schema = Schema->connect("dbi:SQLite:books.db");
$schema->deploy( { add_drop_table => 1 } );
my $books = $schema->resultset('Book');
$books->create(
{
title => "Programming Perl",
author => "Tom Christiansen, brian d foy, Larry Wall, Jon Orwant",
pub_date => "2012-03-18",
num_pages => 1174,
isbn => "9780596004927"
}
);
$books->create(
{
title => "Perl by Example",
author => "Ellie Quigley",
pub_date => "1994-01-01",
num_pages => 200,
isbn => "9780131228399"
}
);
$books->create(
{
title => "Perl in a Nutshell",
author => "Nathan Patwardhan, Ellen Siever and Stephen Spainhour",
pub_date => "1999-01-01",
num_pages => 654,
isbn => "9781565922860"
}
);
$books->create(
{
title => "Perl Best Practices",
author => "Damian Conway",
pub_date => "2005-07-01",
num_pages => 517,
isbn => "9780596001735"
}
);
$books->create(
{
title => "Learning Perl, 7th Edition",
author => "Randal L. Schwartz, brian d foy, Tom Phoenix",
pub_date => "2016-10-05",
num_pages => 369,
isbn => "9781491954324"
}
);
$books->create(
{
title => "UNIX Shell Programming",
author => "Stephen G. Kochan and Patrick H. Wood",
pub_date => "1990",
num_pages => 502,
isbn => "067248448X"
}
);
# vim: expandtab shiftwidth=4
Running this file will create an SQLite database called books.db in the same directory as the script. I.e. after running
$ perl create-books-db.pl
you should see a file called books.db in the project’s root directory.
Now we can query the data in our books database. Create a file called show-books.pl in the project base directory with this content:
use strict;
use warnings;
use lib './lib';
use DBIx::Class::ResultSet::PrettyPrint;
use Schema; # load your DBIx::Class schema
# load your database and fetch a result set
my $schema = Schema->connect( 'dbi:SQLite:books.db' );
my $books = $schema->resultset( 'Book' );
print "Title of first entry: ", $books->find(1)->title, "\n";
print "Authors of UNIX-related titles: ",
$books->search({ title => { -like => "%UNIX%" }})->first->author, "\n";
# vim: expandtab shiftwidth=4
Running this script will give this output:
$ perl show-books.pl
Title of first entry: Programming Perl
Authors of UNIX-related titles: Stephen G. Kochan and Patrick H. Wood
That’s all very well and good, but wouldn’t it be nice to view the database table all in one go? This is the niche task that DBIx::Class::ResultSet::PrettyPrint performs.
Change the print statements in the show-books.pl script to this:
# pretty print the result set
my $pp = DBIx::Class::ResultSet::PrettyPrint->new();
$pp->print_table( $books );
Now, when we run the script, we get this output:
$ perl show-books.pl
+----+----------------------------+-------------------------------------------------------+------------+-----------+---------------+
| id | title | author | pub_date | num_pages | isbn |
+----+----------------------------+-------------------------------------------------------+------------+-----------+---------------+
| 1 | Programming Perl | Tom Christiansen, brian d foy, Larry Wall, Jon Orwant | 2012-03-18 | 1174 | 9780596004927 |
| 2 | Perl by Example | Ellie Quigley | 1994-01-01 | 200 | 9780131228399 |
| 3 | Perl in a Nutshell | Nathan Patwardhan, Ellen Siever and Stephen Spainhour | 1999-01-01 | 654 | 9781565922860 |
| 4 | Perl Best Practices | Damian Conway | 2005-07-01 | 517 | 9780596001735 |
| 5 | Learning Perl, 7th Edition | Randal L. Schwartz, brian d foy, Tom Phoenix | 2016-10-05 | 369 | 9781491954324 |
| 6 | UNIX Shell Programming | Stephen G. Kochan and Patrick H. Wood | 1990 | 502 | 067248448X |
+----+----------------------------+-------------------------------------------------------+------------+-----------+---------------+
Isn’t that nice?
As I mentioned earlier, I’ve found the module quite handy when using Perl to dig around in database tables in my daily work. Maybe it can help make your work easier too!
This is in reference to delta-epsilon proofs in mathematics where the values delta and epsilon are very small. ↩︎
If you need someone who is stubbornly thorough, give me a yell! I’m available for freelance Python/Perl backend development and maintenance work. Contact me at paul@peateasea.de and let’s discuss how I can help solve your business’ hairiest problems. ↩︎
Published on Thursday 05 June 2025 22:52
The examples used here are from the weekly challenge problem statement and demonstrate the working solution.
You are given a list of operations. Write a script to return the final value after performing the given operations in order. The initial value is always 0.
Let’s entertain ourselves with an over engineered solution! We’ll use Parse::Yapp to handle incrementing and decrementing any single letter variable. Or, to put it another way, we’ll define a tiny language which consists of single letter variables that do not require declaration, are only of unsigned integer type, and are automatically initialized to zero. The only operations on these variables are the increment and decrement operations from the problem statement. At the completion of the parser’s execution we will print the final values of each variable.
The majority of the work will be done in the .yp yapp grammar definition file. We’ll focus on this file first.
The declarations section will have some token definitions and a global variable declaration.
For our simple language we’re just going to define a few tokens: the increment and decrement operators, and our single-letter variables.
We’re going to define a single global variable which will be used to track the state of each variable.
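That fragment isn’t shown in full here; a sketch of what the declarations section might contain (treat the exact details as assumptions) is the token names used by the rules below, plus the hash reference that tracks each variable’s value:

%{
my $variable_state = {};
%}

%token LETTER
%token INCREMENT
%token DECREMENT

%%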
The rules section defines the actions of our increment and decrement operations in both prefix and postfix form. We’ll also allow for a completely optional variable declaration, which is just a single-letter variable placed by itself.
program: statement {$variable_state}
| program statement
;
statement: variable_declaration
| increment_variable
| decrement_variable
;
variable_declaration: LETTER {$variable_state->{$_[1]} = 0}
;
increment_variable: INCREMENT LETTER {$variable_state->{$_[2]}++}
| LETTER INCREMENT {$variable_state->{$_[1]}++}
;
decrement_variable: DECREMENT LETTER {$variable_state->{$_[2]}--}
| LETTER DECREMENT {$variable_state->{$_[1]}--}
;
◇
The final section of the grammar definition file is, historically, called programs. This is where we have Perl code for the lexer, error handling, and a parse function which provides the main entry point for code that wants to call the parser generated from the grammar.
The parse function is for the convenience of calling the generated parser from other code. yapp will generate a module and this will be the module’s method used by other code to execute the parser against a given input.
Notice here that we are squashing white space, both tabs and newlines, using tr. This reduces all tabs and newlines to a single space. This eases further processing since extra whitespace is just ignored, according to the rules we’ve been given.
Also notice the return value from parsing. In the rules section we provide a return value, a hash reference, in the final action code block executed.
sub parse{
my($self, $input) = @_;
$input =~ tr/\t/ /s;
$input =~ tr/\n/ /s;
$self->YYData->{INPUT} = $input;
my $result = $self->YYParse(yylex => \&lexer, yyerror => \&error);
return $result;
}
◇
Fragment referenced in 6.
This is really just about the most minimal error handling function there can be! All this does is print “syntax error” when the parser encounters a problem.
sub error{
exists $_[0]->YYData->{ERRMSG}
and do{
print $_[0]->YYData->{ERRMSG};
return;
};
print "syntax error\n";
}
◇
Fragment referenced in 6.
The lexer function is called repeatedly for the entire input. Regular expressions are used to identify tokens (the ones declared at the top of the file) and pass them along for the rules processing.
sub lexer{
my($parser) = @_;
$parser->YYData->{INPUT} or return(q//, undef);
$parser->YYData->{INPUT} =~ s/^[ \t]//g;
##
# send tokens to parser
##
for($parser->YYData->{INPUT}){
s/^(\s+)// and return (q/SPACE/, $1);
s/^([a-z]{1})// and return (q/LETTER/, $1);
s/^(\+\+)// and return (q/INCREMENT/, $1);
s/^(--)// and return (q/DECREMENT/, $1);
}
}
◇
Fragment referenced in 6.
There’s one more function we should add. The reason for it is a little complex. Variables defined in the declarations section are considered static and are stored in the lexical pad of the package. So each new invocation of the parse() method will re-use the same variables. They are not cleared or reset. So, we’ll define a subroutine which will clear this for us manually.
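A minimal version of that subroutine, assuming $variable_state is the file-scoped hash reference declared in the grammar’s header section, might look like this:

sub clear{
    my($self) = @_;
    $variable_state = {};
    return;
}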
Let’s define a small file to drive some tests.
The preamble to the test driver sets the minimum perl version to be the most recent one, to take advantage of all recent changes. We also include the generated module file which yapp creates. For test purposes we’ll define some constants, taken from TWC’s examples.
use constant TEST0 => q/--x x++ x++/;
use constant TEST1 => q/x++ ++x x++/;
use constant TEST2 => q/x++ ++x --x x--/;
use constant COMPLEX_TEST => <<~END_TEST;
a b c
a++ b++ c++
++a ++b ++c
--a --b --c
a-- b-- c--
a++ ++b c++
END_TEST
◇
Fragment referenced in 12.
For printing the results in a nice way we’ll define a small subroutine to display the return value from the parser.
sub print_variables{
my($results) = @_;
for my $k (keys %{$results}){
print $k;
say qq/:\t$results->{$k}/;
}
}
◇
Fragment referenced in 11.
MAIN:{
my $parser = IncrementDecrement->new();
say TEST0;
say print_variables $parser->parse(TEST0);
say TEST1;
$parser->clear();
say print_variables $parser->parse(TEST1);
say TEST2;
$parser->clear();
say print_variables $parser->parse(TEST2);
say COMPLEX_TEST;
$parser->clear();
say print_variables $parser->parse(COMPLEX_TEST);
}
◇
Fragment referenced in 11.
$ yapp -m IncrementDecrement perl/IncrementDecrement.yp; mv IncrementDecrement.pm perl; perl -Iperl perl/ch-1.pl
--x x++ x++
x: 1
x++ ++x x++
x: 3
x++ ++x --x x--
x: 0
a b c
a++ b++ c++
++a ++b ++c
--a --b --c
a-- b-- c--
a++ ++b c++
b: 1
a: 1
c: 1
You are given an income amount and tax brackets. Write a script to calculate the total tax amount.
After overdoing the complexity in the first part, we’ll make this one quite a bit shorter.
The main section is just some basic tests.
MAIN:{
say calculate_tax 10, [[3, 50], [7, 10], [12,25]];
say calculate_tax 2, [[1, 0], [4, 25], [5,50]];
say calculate_tax 0, [[2, 50]];
}
◇
Fragment referenced in 16.
{
my $tax_bracket = shift @{$tax_brackets};
if($tax_bracket->[0] <= $income){
$taxable = $tax_bracket->[0] - $taxable;
$tax += ($taxable * ($tax_bracket->[1]/100));
$taxed += $taxable;
}
else{
$tax += (($income - $taxed) * ($tax_bracket->[1]/100));
$taxed = $income;
}
redo unless $taxed >= $income || @{$tax_brackets} == 0;
}
◇
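The fragments surrounding that block aren’t shown above; assembled with assumed variable set-up (the inner block is the fragment as given), the whole subroutine might look roughly like this:

sub calculate_tax{
    my($income, $tax_brackets) = @_;
    my($tax, $taxed, $taxable) = (0, 0, 0);
    {
        my $tax_bracket = shift @{$tax_brackets};
        if($tax_bracket->[0] <= $income){
            $taxable = $tax_bracket->[0] - $taxable;
            $tax += ($taxable * ($tax_bracket->[1]/100));
            $taxed += $taxable;
        }
        else{
            $tax += (($income - $taxed) * ($tax_bracket->[1]/100));
            $taxed = $income;
        }
        redo unless $taxed >= $income || @{$tax_brackets} == 0;
    }
    return $tax;
}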
$ perl perl/ch-2.pl
2.65
0.25
0
Published on Thursday 05 June 2025 09:09
In the previous post, we created a network close enough to reality so that finding routes between stations was possible and sufficiently interesting. In this final post in the series, we’re going to see how to handle indirect connections between stations.
Not all stations in the Hannover tram network are directly connected. A good example is the line Linie 10, which starts at the bus station next to the main train station and has the station name Hauptbahnhof/ZOB.1 As its name suggests, this station is associated with the station Hauptbahnhof. Although they’re very close to one another, they’re not connected directly. You have to cross a road to get to Hauptbahnhof from the Hauptbahnhof/ZOB tram stop. A routing framework such as Map::Tube should allow such indirect connections, thus joining Linie 10 to the rest of the network.
So how do we connect such indirectly connected stations?
Map::Tube has a solution: the other_link attribute.
To see this attribute in action, let’s add the line Linie 10 to the network and connect Hauptbahnhof to Hauptbahnhof/ZOB with an other_link. Then we can try creating a route from Ahlem (at the end of Linie 10) to Misburg (at the end of Linie 7) and see if our new connection type works as we expect. Let’s get cracking!
Here’s the planned list of stations, IDs and links:
Station | ID | Links |
---|---|---|
Ahlem | H15 | H16 |
Leinaustraße | H16 | H15, H17 |
Hauptbahnhof/ZOB | H17 | H16 |
Ahlem is the westernmost station, hence it’s the “first” station along Linie 10. Therefore, it gets the next logical ID carrying on from where we left off in the map file.
As we’ve done before, we drive these changes by leaning on our test suite. We want to have four lines in the network now, hence we update our number of lines test like so:
my $num_lines = scalar @{$hannover->get_lines};
is( $num_lines, 4, "Number of lines in network correct" );
We can test that we’ve added the line and its stations correctly by checking for the expected route. Our routes tests are now:
my @routes = (
"Route 1|Langenhagen|Sarstedt|Langenhagen,Kabelkamp,Hauptbahnhof,Kroepcke,Laatzen,Sarstedt",
"Route 4|Garbsen|Roderbruch|Garbsen,Laukerthof,Kroepcke,Kantplatz,Roderbruch",
"Route 7|Wettbergen|Misburg|Wettbergen,Allerweg,Kroepcke,Hauptbahnhof,Vier Grenzen,Misburg",
"Route 10|Ahlem|Hauptbahnhof/ZOB|Ahlem,Leinaustraße,Hauptbahnhof/ZOB",
);
ok_map_routes($hannover, \@routes);
where we’ve added the expected list of stations for Linie 10 to the end of the @routes list.
Let’s make sure the tests fail as expected:
$ prove -lr t/map-tube-hannover.t
t/map-tube-hannover.t .. 1/?
# Failed test 'Number of lines in network correct'
# at t/map-tube-hannover.t line 15.
# got: '3'
# expected: '4'
Yup, that looks good. We expect four lines but only have three. Let’s add the line to our maps file now:
{
"id" : "L10",
"name" : "Linie 10",
"color" : "PaleGreen"
}
where I’ve guessed that the line colour used in the Üstra “Netzplan U” is pale green.
Re-running the tests, we have:
$ prove -lr t/map-tube-hannover.t
t/map-tube-hannover.t .. # Line id L10 consists of 0 separate components
# Failed test 'Hannover'
# at /home/cochrane/perl5/perlbrew/perls/perl-5.38.3/lib/site_perl/5.38.3/Test/Map/Tube.pm line 196.
# Line id L10 defined but serves no stations (not even as other_link)
# Failed test 'Hannover'
# at /home/cochrane/perl5/perlbrew/perls/perl-5.38.3/lib/site_perl/5.38.3/Test/Map/Tube.pm line 196.
# Looks like you failed 2 tests of 14.
Again, we expected this as this line doesn’t have any stations yet. Let’s add them to the map file.
{
"id" : "H15",
"name" : "Ahlem",
"line" : "L10",
"link" : "H16"
},
{
"id" : "H16",
"name" : "Leinaustraße",
"line" : "L10",
"link" : "H15,H17"
},
{
"id" : "H17",
"name" : "Hauptbahnhof/ZOB",
"line" : "L10",
"link" : "H16"
}
This time, we expect the tests to tell us that this line isn’t connected to the network. Sure enough:
$ prove -lr t/map-tube-hannover.t
t/map-tube-hannover.t .. # Map has 2 separate components; e.g., stations with ids H1, H15
# Failed test 'Hannover'
# at
/home/cochrane/perl5/perlbrew/perls/perl-5.38.3/lib/site_perl/5.38.3/Test/Map/Tube.pm line 196.
# Looks like you failed 1 test of 14.
The error message
Map has 2 separate components; e.g., stations with ids H1, H15
means that the line isn’t connected to any of the other lines already present because the map contains separate components.
To fix this, let’s change the entry for Hauptbahnhof/ZOB to use the other_link attribute and see if that helps:
{
"id" : "H17",
"name" : "Hauptbahnhof/ZOB",
"line" : "L10",
"link" : "H16",
"other_link" : "Street:H3"
}
Oddly, the tests still raise an error:
$ prove -lr t/map-tube-hannover.t
t/map-tube-hannover.t .. # Map has 2 separate components; e.g., stations with ids H1, H15
# Failed test 'Hannover'
# at /home/cochrane/perl5/perlbrew/perls/perl-5.38.3/lib/site_perl/5.38.3/Test/Map/Tube.pm line 196.
t/map-tube-hannover.t .. 1/? # Looks like you failed 1 test of 14.
# Failed test 'ok_map_data'
# at t/map-tube-hannover.t line 11.
Oh, that’s right! We’ve only linked Hauptbahnhof/ZOB to Hauptbahnhof; we need to add the other_link in the other direction as well. We could have debugged this situation by running bin/map2image.pl and inspecting the generated image. Yet we’ve seen this issue before and can call on experience instead.
We can fix the problem by updating the entry for Hauptbahnhof like so:
{
"id" : "H3",
"name" : "Hauptbahnhof",
"line" : "L1,L7",
"link" : "H2,H8,H12",
"other_link" : "Street:H17"
},
Now the tests still fail, even though we thought we’d fixed everything:
$ prove -lr t/map-tube-hannover.t
t/map-tube-hannover.t .. 1/? Map::Tube::get_node_by_name(): ERROR: Invalid Station Name [Leinaustraße]. (status: 101) file /home/cochrane/perl5/perlbrew/perls/perl-5.38.3/lib/site_perl/5.38.3/Test/Map/Tube.pm on line 1434
# Tests were run but no plan was declared and done_testing() was not seen.
What’s going wrong?
Oh, yeah, the sharp-s (ß) character messes with the routing tests as we saw in the previous article in the series.
Let’s replace ß with the equivalent “double-s” for the Leinaustraße station. First in the map file:
{
"id" : "H16",
"name" : "Leinaustrasse",
"line" : "L10",
"link" : "H15,H17"
},
and then in the routes tests:
my @routes = (
"Route 1|Langenhagen|Sarstedt|Langenhagen,Kabelkamp,Hauptbahnhof,Kroepcke,Laatzen,Sarstedt",
"Route 4|Garbsen|Roderbruch|Garbsen,Laukerthof,Kroepcke,Kantplatz,Roderbruch",
"Route 7|Wettbergen|Misburg|Wettbergen,Allerweg,Kroepcke,Hauptbahnhof,Vier Grenzen,Misburg",
"Route 10|Ahlem|Hauptbahnhof/ZOB|Ahlem,Leinaustrasse,Hauptbahnhof/ZOB",
);
ok_map_routes($hannover, \@routes);
How did we do?
$ prove -lr t/map-tube-hannover.t
t/map-tube-hannover.t .. ok
All tests successful.
Files=1, Tests=4, 0 wallclock secs ( 0.03 usr 0.00 sys + 0.55 cusr 0.05 csys = 0.63 CPU)
Result: PASS
Success! 🎉
We’ve reached the end of the development phase of the HOWTO. At this point, the complete test file (t/map-tube-hannover.t) looks like this:
use strict;
use warnings;
use Test::More;
use Map::Tube::Hannover;
use Test::Map::Tube;
my $hannover = Map::Tube::Hannover->new;
ok_map($hannover);
ok_map_functions($hannover);
my $num_lines = scalar @{$hannover->get_lines};
is( $num_lines, 4, "Number of lines in network correct" );
my @routes = (
"Route 1|Langenhagen|Sarstedt|Langenhagen,Kabelkamp,Hauptbahnhof,Kroepcke,Laatzen,Sarstedt",
"Route 4|Garbsen|Roderbruch|Garbsen,Laukerthof,Kroepcke,Kantplatz,Roderbruch",
"Route 7|Wettbergen|Misburg|Wettbergen,Allerweg,Kroepcke,Hauptbahnhof,Vier Grenzen,Misburg",
"Route 10|Ahlem|Hauptbahnhof/ZOB|Ahlem,Leinaustrasse,Hauptbahnhof/ZOB",
);
ok_map_routes($hannover, \@routes);
done_testing();
with the other Perl files remaining unchanged.
The full JSON content of the map file is too long to display here, but if you’re interested, you can see it in the Git repository accompanying this article series.
To get a feeling for what the network looks like, try running bin/map2image.pl. Doing so, you’ll find a network graph similar to this:
Although the graph doesn’t highlight the indirect link, it does show the connectivity in the entire map and gives us a high-level view of what we’ve achieved.
With our latest map changes in hand, we can find our way from Ahlem to Misburg:
$ perl bin/get_route.pl Ahlem Misburg
Ahlem (Linie 10), Leinaustrasse (Linie 10), Hauptbahnhof/ZOB (Linie 10, Street), Hauptbahnhof (Linie 1, Linie 7, Street), Vier Grenzen (Linie 7), Misburg (Linie 7)
Wicked! It worked! And it got the connection from Hauptbahnhof/ZOB to Hauptbahnhof right. Nice!
We can also plan more complex routes, such as travelling from Ahlem to Roderbruch:
$ perl bin/get_route.pl Ahlem Roderbruch
Ahlem (Linie 10), Leinaustrasse (Linie 10), Hauptbahnhof/ZOB (Linie 10, Street), Hauptbahnhof (Linie 1, Linie 7, Street), Kroepcke (Linie 1, Linie 4, Linie 7), Kantplatz (Linie 4), Roderbruch (Linie 4)
Looking closely, we find that we have to change at Hauptbahnhof and then again at Kroepcke to reach our destination. Comparing this with the Üstra “Netzplan U” we can see (for the simpler map created here) that this matches reality. Brilliant!
Let’s commit that change and give ourselves a pat on the back for a job well done!
$ git ci share/hannover-map.json t/map-tube-hannover.t -m "Add Linie 10 to network
>
> The most interesting part about this change is the use of other_link
> to ensure that Hauptbahnhof/ZOB and Hauptbahnhof are connected to one
> another and hence Linie 10 is connected to the rest of the network
> and routes can be found from Linie 10 to other lines."
[main bc34daa] Add Linie 10 to network
2 files changed, 29 insertions(+), 3 deletions(-)
Welcome to the end of the article series! Thanks for staying until the end. 🙂
Wow, that was quite a lot of work! But it was fun, and we learned a lot along the way. For instance, we’ve learned:
- how a Map::Tube map is structured,
- how to build a Map::Tube map in a test-driven manner,
- how to find routes within a Map::Tube network.

This discussion has hopefully given you the tools you need to create your own Map::Tube map. There’s so much more you can do with Map::Tube, so it’s a good idea to spend some time browsing the documentation. Therein you will find many nuggets of information and hints for ideas of things to play with.
I wish you the best of luck and have fun!
For those wondering who don’t speak German: Hauptbahnhof means “main train station” or equivalently “central train station”. ZOB is the abbreviation of Zentralomnibusbahnhof, which looks like it literally translates as “central omnibus train station”, but really means “central bus station”. ↩︎
Published on Wednesday 04 June 2025 08:00
SlapbirdAPM is a free-software observability platform tailor-made for Perl web applications. [ It is also a Perl web-application :^) ] It has first-class support for Plack, Mojo, Dancer2, and CGI. Slapbird provides developers with comprehensive observability tools to monitor and optimize their applications’ performance.
In this article I will explain how to set up a Plack application with Slapbird. If you want to use another supported framework, please read our Getting Started documentation, or reach out to me on the Perl Foundation’s Slack channel!
SlapbirdAPM is easily installed in your Plack application. Here is a minimal example, using a Dancer2 application that runs under Plack:
Install with
cpan -I SlapbirdAPM::Agent::Plack
#!/usr/bin/env perl
use Dancer2;
use Plack::Builder;
get '/' => sub {
'Hello World!';
};
builder {
enable 'SlapbirdAPM';
app;
};
Now, you can create an account on SlapbirdAPM, and create your application.
Then, simply copy the API key output and add it to your application via the SLAPBIRDAPM_API_KEY environment variable. For example:
SLAPBIRDAPM_API_KEY=<API-KEY> plackup app.pl
or, you can pass your key in to the middleware:
builder {
enable 'SlapbirdAPM', key => <YOUR API KEY>;
...
};
Now when you navigate to /, you will see it logged in your SlapbirdAPM dashboard!
Then, clicking into one of the transactions, you’ll get some more information:
SlapbirdAPM also supports DBI, meaning you can trace your queries. Let’s edit our application to include a few DBI queries:
#!/usr/bin/env perl
use Dancer2;
use DBI;
use Plack::Builder;
my $dbh = DBI->connect( 'dbi:SQLite:dbname=database.db', '', '' );
$dbh->do('create table if not exists users (id integer primary key, name varchar)');
get '/' => sub {
send_as html => 'Hello World!';
};
get '/users/:id' => sub {
my $user_id = route_parameters->get('id');
my ($user) =
$dbh->selectall_array(
'select * from users where id = ?',
{ Slice => {} }, $user_id );
send_as JSON => $user;
};
post '/users' => sub {
my $user_name = body_parameters->get('name');
my ($user) =
$dbh->selectall_array(
'insert into users(name) values ( ? ) returning id, name',
{ Slice => {} }, $user_name );
send_as JSON => $user;
};
builder {
enable 'SlapbirdAPM';
app;
};
Now we can use cURL to add data to our database:
curl -X POST -d 'name=bob' http://127.0.0.1:5000/users
Then, if we go back into Slapbird, we can view our timings for our queries:
This just scratches the surface of what is possible using SlapbirdAPM. You can also generate reports, perform health checks, and get notified if your application is creating too many 5XX responses.
Thanks for reading!