Published by Mohammad Sajid Anwar on Friday 27 June 2025 01:44
Lexical methods in the latest release, Perl v5.42 RC1. For more details, follow the link: https://theweeklychallenge.org/blog/lexical-method-v542
Published by stack0114106 on Thursday 26 June 2025 22:48
I'm trying to extract the tabular data from Apache PySpark logs using a Perl one-liner.
Below is a sample log file; it contains three tabular outputs from Spark:
24/06/19 01:00:00 INFO org.apache.spark.SparkContext: Running Spark version 3.5.1
24/06/19 01:00:01 INFO org.apache.spark.SparkContext: Submitted application: MyPySparkApp
24/06/19 01:00:02 INFO org.apache.spark.scheduler.DAGScheduler: Registering RDD 0 (text at <stdin>:1)
24/06/19 01:00:03 DEBUG pyspark_logging_examples.workloads.sample_logging_job.SampleLoggingJob: This is a debug message from my PySpark code.
+----+----------+-----+---+
|acct| dt| amt| rk|
+----+----------+-----+---+
|ACC3|2010-06-24| 35.7| 2|
|ACC2|2010-06-22| 23.4| 2|
|ACC4|2010-06-21| 21.5| 2|
|ACC5|2010-06-23| 34.9| 2|
|ACC6|2010-06-25|100.0| 1|
+----+----------+-----+---+
24/06/19 01:00:04 INFO pyspark_logging_examples.workloads.sample_logging_job.SampleLoggingJob: Processing data in MyPySparkApp.
24/06/19 01:00:05 WARN org.apache.spark.scheduler.TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0) on host localhost: Executor lost connection, trying to reconnect.
24/06/19 01:00:07 INFO org.apache.spark.scheduler.DAGScheduler: Job 0 finished: collect at <stdin>:1, took 7.0000 s
24/06/19 01:00:08 INFO org.apache.spark.SparkContext: Stopped SparkContext
+----------+-----+
|inc |check|
+----------+-----+
|Australia |true |
|Bangladesh|false|
|England |true |
+----------+-----+
24/06/19 01:00:09 INFO org.apache.spark.scheduler.DAGScheduler: Job 0 finished: collect at <stdin>:1, took 7.0000 s
24/06/19 01:00:09 INFO org.apache.spark.SparkContext: Stopped SparkContext
+-----+------+---------+----+---------+----+----+------+
|empno| ename| job| mgr| hiredate| sal|comm|deptno|
+-----+------+---------+----+---------+----+----+------+
| 7369| SMITH| CLERK|7902|17-Dec-80| 800| 20| 10|
| 7499| ALLEN| SALESMAN|7698|20-Feb-81|1600| 300| 30|
| 7521| WARD| SALESMAN|7698|22-Feb-81|1250| 500| 30|
| 7566| JONES| MANAGER|7839| 2-Apr-81|2975| 0| 20|
| 7654|MARTIN| SALESMAN|7698|28-Sep-81|1250|1400| 30|
| 7698| BLAKE| MANAGER|7839| 1-May-81|2850| 0| 30|
| 7782| CLARK| MANAGER|7839| 9-Jun-81|2450| 0| 10|
| 7788| SCOTT| ANALYST|7566|19-Apr-87|3000| 0| 20|
| 7839| KING|PRESIDENT| 0|17-Nov-81|5000| 0| 10|
| 7844|TURNER| SALESMAN|7698| 8-Sep-81|1500| 0| 30|
| 7876| ADAMS| CLERK|7788|23-May-87|1100| 0| 20|
+-----+------+---------+----+---------+----+----+------+
root
|-- empno: integer (nullable = true)
|-- ename: string (nullable = true)
|-- job: string (nullable = true)
|-- mgr: integer (nullable = true)
|-- hiredate: string (nullable = true)
|-- sal: integer (nullable = true)
|-- comm: integer (nullable = true)
|-- deptno: integer (nullable = true)
24/06/19 01:00:20 INFO org.apache.spark.SparkContext: Running Spark version 3.5.1
24/06/19 01:00:21 INFO org.apache.spark.SparkContext: Submitted application: MyPySparkApp
24/06/19 01:00:22 INFO org.apache.spark.SparkContext: Running Spark version 3.5.1
24/06/19 01:00:23 INFO org.apache.spark.SparkContext: Submitted application: MyPySparkApp2
When there is only one tabular output, the command below works:
perl -0777 -ne ' while(m/^\x2b(.+)\x2b$/gsm) { print "$&\n" } ' spark.log # \x2b="+"
but for multiple tabular outputs, it pulls all the text from the first occurrence to the end of the last one. How do I get all three tables from my sample log?
Expected output:
Table-1:
+----+----------+-----+---+
|acct| dt| amt| rk|
+----+----------+-----+---+
|ACC3|2010-06-24| 35.7| 2|
|ACC2|2010-06-22| 23.4| 2|
|ACC4|2010-06-21| 21.5| 2|
|ACC5|2010-06-23| 34.9| 2|
|ACC6|2010-06-25|100.0| 1|
+----+----------+-----+---+
Table-2:
+----------+-----+
|inc |check|
+----------+-----+
|Australia |true |
|Bangladesh|false|
|England |true |
+----------+-----+
Table-3:
+-----+------+---------+----+---------+----+----+------+
|empno| ename| job| mgr| hiredate| sal|comm|deptno|
+-----+------+---------+----+---------+----+----+------+
| 7369| SMITH| CLERK|7902|17-Dec-80| 800| 20| 10|
| 7499| ALLEN| SALESMAN|7698|20-Feb-81|1600| 300| 30|
| 7521| WARD| SALESMAN|7698|22-Feb-81|1250| 500| 30|
| 7566| JONES| MANAGER|7839| 2-Apr-81|2975| 0| 20|
| 7654|MARTIN| SALESMAN|7698|28-Sep-81|1250|1400| 30|
| 7698| BLAKE| MANAGER|7839| 1-May-81|2850| 0| 30|
| 7782| CLARK| MANAGER|7839| 9-Jun-81|2450| 0| 10|
| 7788| SCOTT| ANALYST|7566|19-Apr-87|3000| 0| 20|
| 7839| KING|PRESIDENT| 0|17-Nov-81|5000| 0| 10|
| 7844|TURNER| SALESMAN|7698| 8-Sep-81|1500| 0| 30|
| 7876| ADAMS| CLERK|7788|23-May-87|1100| 0| 20|
+-----+------+---------+----+---------+----+----+------+
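One approach that avoids the greedy match is to anchor the pattern to the table structure itself: a border line, header rows, a border, data rows, and a closing border. A sketch (tested only against the sample above; the Table-N labels are added to match the expected output):
perl -0777 -ne 'my $n; while (m/^\+[-+]+\+\n(?:\|.*\|\n)+\+[-+]+\+\n(?:\|.*\|\n)+\+[-+]+\+$/gm) { print "Table-", ++$n, ":\n$&\n" }' spark.log
Without /s, the dot no longer crosses newlines, so each match stops at its own table's closing border.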
Quick introduction to AWS DynamoDB using CLI, Python and Perl.
Please check out the link for more information:
https://theweeklychallenge.org/blog/aws-dynamodb
Published by craigberry on Thursday 26 June 2025 11:19
Add explicit rule for class.c to vms/descrip_mms.template. Without the explicit rule, it still gets compiled, but without PERL_CORE defined, which makes class_cleanup_definition get interpreted as a function rather than a macro, causing the link to fail with an undefined symbol error.
Published by thibaultduponchelle on Thursday 26 June 2025 09:02
perldelta: Few edits mainly in Security section
Published by mike joe on Thursday 26 June 2025 02:31
How can I join string content in Perl?
my $string = $q->param("us1"); # this is what was entered: +book +dog +cat
print join(',', $string), "\n";
This is the output I want:
"+book", "+dog", "+cat"
Published by Perl Steering Council on Wednesday 25 June 2025 17:13
Graham couldn’t make it, so only Aristotle and Philippe this week.
Published by karenetheridge on Wednesday 25 June 2025 16:33
fix version use line for keyword_all example
Published by thibaultduponchelle on Wednesday 25 June 2025 13:10
Disarm RC1 in patchlevel.h - Arm release_status in META.json
Published on Wednesday 25 June 2025 10:10
Last time, we worked out how to extract, collate, and print statistics about the data contained in a FIT file. Now we’re going to take the next logical step and plot the time series data.
Now that we’ve extracted data from the FIT file, what else can we do with
it? Since this is time series data, the most natural next step is to
visualise the data values over time. Since I know that Gnuplot handles time series data well,[1] I chose to use Chart::Gnuplot to plot the data.
An additional point in Gnuplot’s favour is that it can plot two datasets on the same graph, each with its own y-axis. Such functionality is handy when searching for correlations between datasets of different y-axis scales and ranges that share the same baseline data series.
Clearly Chart::Gnuplot
relies on Gnuplot, so we need to install it first:
$ sudo apt install -y gnuplot
Now we can install Chart::Gnuplot with cpanm:
$ cpanm Chart::Gnuplot
Something I like looking at is how my heart rate evolved throughout a ride; it gives me an idea of how much effort I was putting in. So, we’ll start off by looking at how the heart rate data varied over time. In other words, we want time on the x-axis and heart rate on the y-axis.
One great thing about Gnuplot is that if you give it a format string for the time data, then plotting “just works”. In other words, explicit conversion to datetime data for the x-axis is unnecessary.
Here’s a script to extract the FIT data from our example data file. It
displays some statistics about the activity and plots heart rate versus
time. I’ve given the script the filename geo-fit-plot-data.pl
:
  1 use strict;
  2 use warnings;
  3
  4 use Geo::FIT;
  5 use Scalar::Util qw(reftype);
  6 use List::Util qw(max sum);
  7 use Chart::Gnuplot;
  8
  9
 10 sub main {
 11     my @activity_data = extract_activity_data();
 12
 13     show_activity_statistics(@activity_data);
 14     plot_activity_data(@activity_data);
 15 }
 16
 17 sub extract_activity_data {
 18     my $fit = Geo::FIT->new();
 19     $fit->file( "2025-05-08-07-58-33.fit" );
 20     $fit->open or die $fit->error;
 21
 22     my $record_callback = sub {
 23         my ($self, $descriptor, $values) = @_;
 24         my @all_field_names = $self->fields_list($descriptor);
 25
 26         my %event_data;
 27         for my $field_name (@all_field_names) {
 28             my $field_value = $self->field_value($field_name, $descriptor, $values);
 29             if ($field_value =~ /[a-zA-Z]/) {
 30                 $event_data{$field_name} = $field_value;
 31             }
 32         }
 33
 34         return \%event_data;
 35     };
 36
 37     $fit->data_message_callback_by_name('record', $record_callback ) or die $fit->error;
 38
 39     my @header_things = $fit->fetch_header;
 40
 41     my $event_data;
 42     my @activity_data;
 43     do {
 44         $event_data = $fit->fetch;
 45         my $reftype = reftype $event_data;
 46         if (defined $reftype && $reftype eq 'HASH' && defined %$event_data{'timestamp'}) {
 47             push @activity_data, $event_data;
 48         }
 49     } while ( $event_data );
 50
 51     $fit->close;
 52
 53     return @activity_data;
 54 }
 55
 56 # extract and return the numerical parts of an array of FIT data values
 57 sub num_parts {
 58     my $field_name = shift;
 59     my @activity_data = @_;
 60
 61     return map { (split ' ', $_->{$field_name})[0] } @activity_data;
 62 }
 63
 64 # return the average of an array of numbers
 65 sub avg {
 66     my @array = @_;
 67
 68     return (sum @array) / (scalar @array);
 69 }
 70
 71 sub show_activity_statistics {
 72     my @activity_data = @_;
 73
 74     print "Found ", scalar @activity_data, " entries in FIT file\n";
 75     my $available_fields = join ", ", sort keys %{$activity_data[0]};
 76     print "Available fields: $available_fields\n";
 77
 78     my $total_distance_m = (split ' ', ${$activity_data[-1]}{'distance'})[0];
 79     my $total_distance = $total_distance_m/1000;
 80     print "Total distance: $total_distance km\n";
 81
 82     my @speeds = num_parts('speed', @activity_data);
 83     my $maximum_speed = max @speeds;
 84     my $maximum_speed_km = $maximum_speed*3.6;
 85     print "Maximum speed: $maximum_speed m/s = $maximum_speed_km km/h\n";
 86
 87     my $average_speed = avg(@speeds);
 88     my $average_speed_km = sprintf("%0.2f", $average_speed*3.6);
 89     $average_speed = sprintf("%0.2f", $average_speed);
 90     print "Average speed: $average_speed m/s = $average_speed_km km/h\n";
 91
 92     my @powers = num_parts('power', @activity_data);
 93     my $maximum_power = max @powers;
 94     print "Maximum power: $maximum_power W\n";
 95
 96     my $average_power = avg(@powers);
 97     $average_power = sprintf("%0.2f", $average_power);
 98     print "Average power: $average_power W\n";
 99
100     my @heart_rates = num_parts('heart_rate', @activity_data);
101     my $maximum_heart_rate = max @heart_rates;
102     print "Maximum heart rate: $maximum_heart_rate bpm\n";
103
104     my $average_heart_rate = avg(@heart_rates);
105     $average_heart_rate = sprintf("%0.2f", $average_heart_rate);
106     print "Average heart rate: $average_heart_rate bpm\n";
107 }
108
109 sub plot_activity_data {
110     my @activity_data = @_;
111
112     my @heart_rates = num_parts('heart_rate', @activity_data);
113     my @times = map { $_->{'timestamp'} } @activity_data;
114
115     my $date = "2025-05-08";
116
117     my $chart = Chart::Gnuplot->new(
118         output => "watopia-figure-8-heart-rate.png",
119         title => "Figure 8 in Watopia on $date: heart rate over time",
120         xlabel => "Time",
121         ylabel => "Heart rate (bpm)",
122         terminal => "png size 1024, 768",
123         timeaxis => "x",
124         xtics => {
125             labelfmt => '%H:%M',
126         },
127     );
128
129     my $data_set = Chart::Gnuplot::DataSet->new(
130         xdata => \@times,
131         ydata => \@heart_rates,
132         timefmt => "%Y-%m-%dT%H:%M:%SZ",
133         style => "lines",
134     );
135
136     $chart->plot2d($data_set);
137 }
138
139 main();
A lot has happened between this code and the previous scripts. Let’s review it to see what’s changed.
The biggest changes were structural. I’ve moved the code into separate routines, improving encapsulation and making each more focused on one task.
The FIT file data extraction code I've put into its own routine (extract_activity_data(); lines 17-54). This sub returns the array of event data that we've been using.
I’ve also created two utility routines num_parts()
(lines 57-62) and
avg()
(lines 65-69). These return the numerical parts of the activity
data and average data series value, respectively.
The ride statistics calculation and display code is now located in the
show_activity_statistics()
routine. Now it’s out of the way, allowing us
to concentrate on other things.
The plotting code is new and sits in a sub called plot_activity_data()
(lines 109-137). We’ll focus much more on that later.
These routines are called from a main()
routine (lines 10-15) giving us a
nice bird’s eye view of what the script is trying to achieve. Running all
the code is now as simple as calling main()
(line 139).
Let’s zoom in on the plotting code in plot_activity_data()
. After having
imported Chart::Gnuplot
at the top of the file (line 7), we need to do a
bit of organising before we can set up the chart. We extract the activity
data with extract_activity_data()
(line 11) and pass this as an argument
to plot_activity_data()
(line 14). At the top of plot_activity_data()
we fetch an array of the numerical heart rate data (line 112) and an array
of all the timestamps (line 113).
The activity’s date (line 115) is assigned as a string variable because I want this to appear in the chart’s title. Although the date is present in the activity data, I’ve chosen not to calculate its value until later. This way we get the plotting code up and running sooner, as there’s still a lot to discuss.
Now we’re ready to set up the chart, which takes place on lines 117-127.
We create a new Chart::Gnuplot
object on line 117 and configure the plot
with various keyword arguments to the object’s constructor.
The parameters are as follows:
- output specifies the name of the output file as a string. The name I've chosen reflects the activity as well as the data being plotted.
- title is a string to use as the plot's title. To provide context, I mention the name of the route (Figure 8) within Zwift's main virtual world (Watopia) as well as the date of the activity. To highlight that we're plotting heart rate over time, I've mentioned this in the title also.
- xlabel is a string describing the x-axis data.
- ylabel is a string describing the y-axis data.
- The terminal option tells Gnuplot to use the PNG[2] "terminal"[3] and to set its dimensions to 1024x768.
- timeaxis tells Gnuplot which axis contains time-based data (in this case the x-axis). This enables Gnuplot to space out the data along the axis evenly. Often, the spacing between points in time-based data isn't regular; for instance, data points can be missing. Hence, naively plotting unevenly-spaced time data can produce a distorted graph. Telling Gnuplot that the x-axis contains time-based data allows it to add appropriate space where necessary.
- xtics is a hash of options to configure the behaviour of the ticks on the x-axis. The setting here displays hour and minute information at each tick mark for our time data. We omit the year, month and day information as this is the same for all data points.

Now that the main chart parameters have been set, we can focus on the data
we want to plot. In Chart::Gnuplot
parlance, a Chart::Gnuplot::DataSet
object represents a set of data to plot. Lines 129-134 instantiate such an
object which we later pass to the Chart::Gnuplot
object when plotting the
data (line 136). One configures Chart::Gnuplot::DataSet
objects similarly
to how Chart::Gnuplot
objects are constructed: by passing various options
to its constructor. These options include the data to plot and how this
data should be styled on the graph.
The options used here have the following meanings:
- xdata is an array reference to the data to use for the x-axis. If this option is omitted, then Gnuplot uses the array indices of the y-data as the x-axis values.
- ydata is an array reference to the data to use for the y-axis.
- timefmt specifies the format string Gnuplot should use when reading the time data in the xdata array. Timestamps are strings and we need to inform Gnuplot how to parse them into a form useful for x-axis data. Were the x-axis data a numerical data type, this option wouldn't be necessary.
- style is a string specifying the style to use for plotting the data. In this example, we're plotting the data points as a set of connected lines. Check out the Chart::Gnuplot documentation for a full list of the available style options.

We finally get to plot the data on line 136. The data set gets passed to the Chart::Gnuplot object as the argument to its plot2d() method. As its name suggests, this plots 2D data, i.e. y versus x. Gnuplot can also display 3D data, in which case we'd call plot3d(). When plotting 3D data we'd have to include a z dimension when setting up the data set.
Running this code
$ perl geo-fit-plot-data.pl
generates this plot:
A couple of things are apparent when looking at this graph. It took me a while to get going (my pulse rose steadily over the first ~15 minutes of the ride) and the time is weird (6 am? Me? Lol, no). We’ll try to explain the heart rate behaviour later.
But first, what’s up with the time data? Did I really start riding at 6 o’clock in the morning? I’m not a morning person, so that’s not right. Also, I’m pretty sure my neighbours wouldn’t appreciate me coughing and wheezing at 6 am while trying to punish myself on Zwift. So what’s going on?
For those following carefully, you might have noticed the trailing Z on the timestamp data. This means that the time zone is UTC. Given that this data is from May and I live in Germany, this implies that the local time would have been 8 am. Still rather early for me, but not too early to disturb the neighbours too much.[4] In other words, we need to fix the time zone to get the time data right.
How do we fix the time zone? I’m glad you asked! We need to parse the
timestamp into a DateTime
object, set the time zone, and then pass the
fixed time data to Gnuplot. It turns out that the standard DateTime
library doesn’t parse date/time strings.
Instead, we need to use DateTime::Format::Strptime. This module parses date/time strings much like the strptime(3) POSIX function does and returns DateTime objects.
Since the module isn’t part of the core Perl distribution, we need to install it:
$ cpanm DateTime::Format::Strptime
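To see the parsing step in isolation before it lands in the plotting code, here's a minimal sketch using one timestamp from the sample output earlier:

```perl
use strict;
use warnings;
use DateTime::Format::Strptime;

# Parse one FIT-style timestamp and shift it to local time.
my $parser = DateTime::Format::Strptime->new(
    pattern   => '%Y-%m-%dT%H:%M:%SZ',
    time_zone => 'UTC',
);
my $dt = $parser->parse_datetime('2025-05-08T06:53:10Z');
$dt->set_time_zone('Europe/Berlin');
print $dt->strftime('%H:%M:%S'), "\n";  # 08:53:10 (CEST in May)
```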
Most of the code changes that follow take place only within the plotting
routine (plot_activity_data()
). So, I’m going to focus on that from now
on and won’t create a new script for the new version of the code.
The first thing to do is to import the DateTime::Format::Strptime
module:
use Scalar::Util qw(reftype);
use List::Util qw(max sum);
use Chart::Gnuplot;
+use DateTime::Format::Strptime;
Extending plot_activity_data()
to set the correct time zone, we get this
code:
 1 sub plot_activity_data {
 2     my @activity_data = @_;
 3
 4     # extract data to plot from full activity data
 5     my @heart_rates = num_parts('heart_rate', @activity_data);
 6     my @timestamps = map { $_->{'timestamp'} } @activity_data;
 7
 8     # fix time zone in time data
 9     my $date_parser = DateTime::Format::Strptime->new(
10         pattern => "%Y-%m-%dT%H:%M:%SZ",
11         time_zone => 'UTC',
12     );
13
14     my @times = map {
15         my $dt = $date_parser->parse_datetime($_);
16         $dt->set_time_zone('Europe/Berlin');
17         my $time_string = $dt->strftime("%H:%M:%S");
18         $time_string;
19     } @timestamps;
20
21     # determine date from timestamp data
22     my $dt = $date_parser->parse_datetime($timestamps[0]);
23     my $date = $dt->strftime("%Y-%m-%d");
24
25     # plot data
26     my $chart = Chart::Gnuplot->new(
27         output => "watopia-figure-8-heart-rate.png",
28         title => "Figure 8 in Watopia on $date: heart rate over time",
29         xlabel => "Time",
30         ylabel => "Heart rate (bpm)",
31         terminal => "png size 1024, 768",
32         timeaxis => "x",
33         xtics => {
34             labelfmt => '%H:%M',
35         },
36     );
37
38     my $data_set = Chart::Gnuplot::DataSet->new(
39         xdata => \@times,
40         ydata => \@heart_rates,
41         timefmt => "%H:%M:%S",
42         style => "lines",
43     );
44
45     $chart->plot2d($data_set);
46 }
The timestamp data is no longer read straight into the @times
array; it’s
stored in the @timestamps
temporary array (line 6). This change also
makes the variable naming a bit more consistent, which is nice.
To parse a timestamp string into a DateTime
object, we need to tell
DateTime::Format::Strptime
how to format the timestamp (lines 8-12). This
is the purpose of the pattern
argument in the DateTime::Format::Strptime
constructor (line 10). You might have noticed that we used the same pattern
when telling Gnuplot what format the time data was in. We also specify the
time zone (line 11) to ensure that the date/time data is parsed as UTC.
Next, we fix the time zone in all elements of the @timestamps array (lines 14-19). I've chosen to do this within a map here. I could extract this code into a well-named routine, but it does the job for now. The map parses the date/time string into a DateTime object (line 15) and sets the time zone to Europe/Berlin[5] (line 16). We only need to plot the time data,[6] hence we format the DateTime object as a string including only hour, minute and second information (line 17). Even though we only use hours and minutes for the x-axis tick labels later, the time data is resolved down to the second, hence we retain the seconds information in the @times array.
One could write a more compact version of the time zone correction code like this:
my @times = map {
$date_parser->parse_datetime($_)
->set_time_zone('Europe/Berlin')
->strftime("%H:%M:%S");
} @timestamps;
yet, in this case, I find giving each step a name (via a variable) helps the code explain itself. YMMV.
The next chunk of code (lines 22-23) isn’t related to the time zone fix. It
generalises working out the current date from the activity data. This way I
can use a FIT file from a different activity without having to update the
$date
variable by hand. The process is simple: all elements of the
@timestamps
array have the same date, so we choose to parse only the first
one (line 22).[7] This gives us a DateTime
object
which we convert into a formatted date string (via the strftime()
method)
composed of the year, month and day (line 23). We don’t need to fix the
time zone because UTC is sufficient in this case to extract the date
information. Of course, if you’re in a time zone close to the international
date line you might need to set the time zone to get the correct date.
The last thing to change is the timefmt
option to the
Chart::Gnuplot::DataSet
object on line 41. Because we now only have hour,
minute and second information, we need to update the time format string to
reflect this.
Now we’re ready to run the script again! Doing so
$ perl geo-fit-plot-data.pl
creates this graph
where we see that the time information is correct. Yay! 🎉
Now that I look at the graph again, I realise something: it doesn’t matter
when the data was taken (at least, not for this use case). What matters
more is the elapsed time from the start of the activity until the end. It
looks like we need to munge the time
data again. The job now is to convert the timestamp information into
seconds elapsed since the ride began. Since we’ve parsed the timestamp data
into DateTime
objects (in line 15 above), we can convert that value into
the number of seconds since the epoch (via the epoch()
method). As soon as we
know the epoch value for each element in the @timestamps
array, we can
subtract the first element’s epoch value from each element in the array.
This will give us an array containing elapsed seconds since the beginning of
the activity. Elapsed seconds are a bit too fine-grained for an activity
extending over an hour, so we’ll also convert seconds to minutes.
Making these changes to the plot_activity_data()
code, we get:
 1 sub plot_activity_data {
 2     my @activity_data = @_;
 3
 4     # extract data to plot from full activity data
 5     my @heart_rates = num_parts('heart_rate', @activity_data);
 6     my @timestamps = map { $_->{'timestamp'} } @activity_data;
 7
 8     # parse timestamp data
 9     my $date_parser = DateTime::Format::Strptime->new(
10         pattern => "%Y-%m-%dT%H:%M:%SZ",
11         time_zone => 'UTC',
12     );
13
14     # get the epoch time for the first point in the time data
15     my $first_epoch_time = $date_parser->parse_datetime($timestamps[0])->epoch;
16
17     # convert timestamp data to elapsed minutes from start of activity
18     my @times = map {
19         my $dt = $date_parser->parse_datetime($_);
20         my $epoch_time = $dt->epoch;
21         my $elapsed_time = ($epoch_time - $first_epoch_time)/60;
22         $elapsed_time;
23     } @timestamps;
24
25     # determine date from timestamp data
26     my $dt = $date_parser->parse_datetime($timestamps[0]);
27     my $date = $dt->strftime("%Y-%m-%d");
28
29     # plot data
30     my $chart = Chart::Gnuplot->new(
31         output => "watopia-figure-8-heart-rate.png",
32         title => "Figure 8 in Watopia on $date: heart rate over time",
33         xlabel => "Elapsed time (min)",
34         ylabel => "Heart rate (bpm)",
35         terminal => "png size 1024, 768",
36     );
37
38     my $data_set = Chart::Gnuplot::DataSet->new(
39         xdata => \@times,
40         ydata => \@heart_rates,
41         style => "lines",
42     );
43
44     $chart->plot2d($data_set);
45 }
The main changes occur in lines 14-23. We parse the date/time information
from the first timestamp (line 15), chaining the epoch()
method call to
find the number of seconds since the epoch. We store this result in a
variable for later use; it holds the epoch time at the beginning of the data
series. After parsing the timestamps into DateTime
objects (line 19), we
find the epoch time for each time point (line 20). Line 21 calculates the
elapsed time from the time stored in $first_epoch_time
and converts
seconds to minutes by dividing by 60. The map
returns this value (line
22) and hence @times
now contains a series of elapsed time values in
minutes.
It’s important to note here that we’re no longer plotting a date/time value
on the x-axis; the elapsed time is a purely numerical value. Thus, we
update the x-axis label string (line 33) to highlight this fact and remove
the timeaxis
and xtics
/labelfmt
options from the Chart::Gnuplot
constructor. The timefmt
option to the Chart::Gnuplot::DataSet
constructor is also no longer necessary and it too has been removed.
The script is now ready to go!
Running it
$ perl geo-fit-plot-data.pl
gives
That’s better!
Our statistics output from earlier told us that the maximum heart rate was 165 bpm with an average of 142 bpm. Looking at the graph, an average of 142 bpm seems about right. It also looks like the maximum pulse value occurred at an elapsed time of just short of 50 minutes. We can check that guess more closely later.
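One quick way to check that guess, assuming the @times and @heart_rates arrays from plot_activity_data() are in scope:

```perl
use List::Util qw(max);

# Find when the maximum heart rate occurred (assumes @times and
# @heart_rates are the parallel arrays built in plot_activity_data()).
my $max_hr = max @heart_rates;
my ($max_idx) = grep { $heart_rates[$_] == $max_hr } 0 .. $#heart_rates;
printf "Max heart rate of %d bpm at %.1f minutes elapsed\n",
    $max_hr, $times[$max_idx];
```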
What’s intriguing me now is what caused this pattern in the heart rate data.
What could have caused the values to go up and down like that? Is there a
correlation with other data fields? We know from earlier that there’s an
altitude
field, so we can try plotting that along with the heart rate data
and see how (or if) they’re related.
Careful readers might have noticed something: how can you have a variation in altitude when you’re sitting on an indoor trainer? Well, Zwift simulates going up and downhill by changing the resistance in the smart trainer. The resistance is then correlated to a gradient and, given time and speed data, one can work out a virtual altitude gain or loss. Thus, for the data set we’re analysing here, altitude is a sensible parameter to consider. Even if you had no vertical motion whatsoever!
As I mentioned earlier, one of the things I like about Gnuplot is that one can plot two data sets with different y-axes on the same plot. Plotting heart rate and altitude on the same graph is one such use case.
To plot an extra data set on our graph, we need to set up another
Chart::Gnuplot::DataSet
object, this time for the altitude data. Before
we can do that, we’ll have to extract the altitude data from the full
activity data set. Gnuplot also needs to know which data to plot on the
primary and secondary y-axes (i.e. on the left- and right-hand sides of the
graph). And we must remember to label our axes
properly. That’s a fair bit of work, so I’ve done
the hard
yakka for
ya. 😉
Here’s the updated plot_activity_data()
code:
 1 sub plot_activity_data {
 2     my @activity_data = @_;
 3
 4     # extract data to plot from full activity data
 5     my @heart_rates = num_parts('heart_rate', @activity_data);
 6     my @timestamps = map { $_->{'timestamp'} } @activity_data;
 7     my @altitudes = num_parts('altitude', @activity_data);
 8
 9     # parse timestamp data
10     my $date_parser = DateTime::Format::Strptime->new(
11         pattern => "%Y-%m-%dT%H:%M:%SZ",
12         time_zone => 'UTC',
13     );
14
15     # get the epoch time for the first point in the time data
16     my $first_epoch_time = $date_parser->parse_datetime($timestamps[0])->epoch;
17
18     # convert timestamp data to elapsed minutes from start of activity
19     my @times = map {
20         my $dt = $date_parser->parse_datetime($_);
21         my $epoch_time = $dt->epoch;
22         my $elapsed_time = ($epoch_time - $first_epoch_time)/60;
23         $elapsed_time;
24     } @timestamps;
25
26     # determine date from timestamp data
27     my $dt = $date_parser->parse_datetime($timestamps[0]);
28     my $date = $dt->strftime("%Y-%m-%d");
29
30     # plot data
31     my $chart = Chart::Gnuplot->new(
32         output => "watopia-figure-8-heart-rate-and-altitude.png",
33         title => "Figure 8 in Watopia on $date: heart rate and altitude over time",
34         xlabel => "Elapsed time (min)",
35         ylabel => "Heart rate (bpm)",
36         terminal => "png size 1024, 768",
37         xtics => {
38             incr => 5,
39         },
40         y2label => 'Altitude (m)',
41         y2range => [-10, 70],
42         y2tics => {
43             incr => 10,
44         },
45     );
46
47     my $heart_rate_ds = Chart::Gnuplot::DataSet->new(
48         xdata => \@times,
49         ydata => \@heart_rates,
50         style => "lines",
51     );
52
53     my $altitude_ds = Chart::Gnuplot::DataSet->new(
54         xdata => \@times,
55         ydata => \@altitudes,
56         style => "boxes",
57         axes => "x1y2",
58     );
59
60     $chart->plot2d($altitude_ds, $heart_rate_ds);
61 }
Line 7 extracts the altitude data from the full activity data. This code
also strips the unit information from the altitude data so that we only have
the numerical part, which is what Gnuplot needs. We store the altitude data
in the @altitudes
array. This we use later to create a
Chart::Gnuplot::DataSet
object on lines 53-58. An important line to note
here is the axes
setting on line 57; it tells Gnuplot to use the secondary
y-axis on the right-hand side for this data set. I’ve chosen to use the
boxes
style for the altitude data (line 56) so that the output looks a bit
like the hills and valleys that it represents.
To make the time data a bit easier to read and analyse, I’ve set the increment for the ticks on the x-axis to 5 (lines 37-39). This way it’ll be easier to refer to specific changes in altitude and heart rate data.
The settings for the secondary y-axis use the same names as for the primary
y-axis, with the exception that the string y2 replaces y. For instance, to set the axis label for the secondary y-axis, we specify the y2label value, as in line 40 above.
I’ve set the range on the secondary y-axis explicitly (line 41) because the output looks better than what the automatic range was able to make in this case. Similarly, I’ve set the increment on the secondary y-axis ticks (lines 42-44) because the automatic output wasn’t as good as what I wanted.
I’ve also renamed the variable for the heart rate data set on line 47 to be
more descriptive; the name $data_set
was much too generic.
We specify the altitude data set first in the call to plot2d()
(line 60)
because we want the heart rate data plotted “on top” of the altitude data.
Had we used $heart_rate_ds
first in this call, the altitude data would
have obscured part of the heart rate data.
Running our script in the now familiar way
$ perl geo-fit-plot-data.pl
gives this plot
Cool! Now it’s a bit clearer why the heart rate evolved the way it did.
At the beginning of the graph (in the first ~10 minutes) it looks like I was getting warmed up and my pulse was finding a kind of base level (~130 bpm). Then things started going uphill at about the 10-minute mark and my pulse also kept going upwards. This makes sense because I was working harder. Between about 13 minutes and 19 minutes came the first hill climb on the route and here I was riding even harder. The effort is reflected in the heart rate data which rose to around 160 bpm at the top of the hill. That explains why the heart rate went up from the beginning to roughly the 18-minute mark.
Looking back over the Zwift data for that particular ride, it seems that I took the KOM[8] for that climb at that time, so no wonder my pulse was high![9] Note that this wasn't a special record or anything like that; it was a short-term live result[10] and someone else took the jersey with a faster time not long after I'd done my best time up that climb.
It was all downhill shortly after the hill climb, which also explains why the heart rate went down straight afterwards. We also see similar behaviour on the second hill climb (from about 37 minutes to 42 minutes). Although my pulse rose throughout the hill climb, it didn’t rise as high this time. This indicates that I was getting tired and wasn’t able to put as much effort in.
Just in case you’re wondering how the altitude can go negative,11 part of the route goes through “underwater tunnels”. This highlights the flexibility of the virtual worlds within Zwift: the designers have enormous room to let their imaginations run wild. There are all kinds of fun things to discover along the various routes and many that don’t exist in the Real World™. Along with the underwater tunnels (where it’s like riding through a giant aquarium, with sunken shipwrecks, fish, and whales), there is a wild west style town complete with a steam train from that era chugging past. There are also Mayan ruins with llamas (or maybe alpacas?) wandering around and even a section with dinosaurs grazing at the side of the road.
Here’s what it looks like riding through an underwater tunnel:
I think that’s pretty cool.
At the end of the ride (at ~53 minutes) my pulse dropped sharply. Since this was the warm-down phase of the ride, this also makes sense.
There are two peaks in the heart rate data that don’t correlate with altitude (one at ~25 minutes and another at ~48 minutes). The altitude change at these locations would suggest that things are fairly flat. What’s going on there?
One other parameter that we could consider for correlations is power output. Going uphill requires more power than riding on the flat, so we’d expect to see higher power values (and therefore higher heart rates) when climbing. If flat roads require less power, what’s causing the peaks in the pulse? Maybe there’s another puzzle hiding in the data.
Let’s combine the heart rate data with power output and see what other relationships we can discover. To do this we need to extract power output data instead of altitude data. Then we need to change the secondary y-axis data set and configuration to produce a nice plot of power output. Making these changes gives this code:
 1 sub plot_activity_data {
 2     my @activity_data = @_;
 3
 4     # extract data to plot from full activity data
 5     my @heart_rates = num_parts('heart_rate', @activity_data);
 6     my @timestamps = map { $_->{'timestamp'} } @activity_data;
 7     my @powers = num_parts('power', @activity_data);
 8
 9     # parse timestamp data
10     my $date_parser = DateTime::Format::Strptime->new(
11         pattern => "%Y-%m-%dT%H:%M:%SZ",
12         time_zone => 'UTC',
13     );
14
15     # get the epoch time for the first point in the time data
16     my $first_epoch_time = $date_parser->parse_datetime($timestamps[0])->epoch;
17
18     # convert timestamp data to elapsed minutes from start of activity
19     my @times = map {
20         my $dt = $date_parser->parse_datetime($_);
21         my $epoch_time = $dt->epoch;
22         my $elapsed_time = ($epoch_time - $first_epoch_time)/60;
23         $elapsed_time;
24     } @timestamps;
25
26     # determine date from timestamp data
27     my $dt = $date_parser->parse_datetime($timestamps[0]);
28     my $date = $dt->strftime("%Y-%m-%d");
29
30     # plot data
31     my $chart = Chart::Gnuplot->new(
32         output => "watopia-figure-8-heart-rate-and-power.png",
33         title => "Figure 8 in Watopia on $date: heart rate and power over time",
34         xlabel => "Elapsed time (min)",
35         ylabel => "Heart rate (bpm)",
36         terminal => "png size 1024, 768",
37         xtics => {
38             incr => 5,
39         },
40         ytics => {
41             mirror => "off",
42         },
43         y2label => 'Power (W)',
44         y2range => [0, 1100],
45         y2tics => {
46             incr => 100,
47         },
48     );
49
50     my $heart_rate_ds = Chart::Gnuplot::DataSet->new(
51         xdata => \@times,
52         ydata => \@heart_rates,
53         style => "lines",
54     );
55
56     my $power_ds = Chart::Gnuplot::DataSet->new(
57         xdata => \@times,
58         ydata => \@powers,
59         style => "lines",
60         axes => "x1y2",
61     );
62
63     $chart->plot2d($power_ds, $heart_rate_ds);
64 }
On line 7, I swapped out the altitude data extraction code with power output. Then, I updated the output filename (line 32) and plot title (line 33) to highlight that we’re now plotting heart rate and power data.
The mirror
option to the ytics
setting (lines 40-42) isn’t an obvious
change. Its purpose is to stop the ticks from the primary y-axis from being
mirrored to the secondary y-axis (on the right-hand side). We want to stop
these mirrored ticks from appearing because they’ll clash with the secondary
y-axis tick marks. The reason we didn’t need this before is that all the
y-axis ticks happened to line up and the issue wasn’t obvious until now.
I’ve updated the secondary axis label setting to mention power (line 43).
Also, I’ve set the range to match the data we’re plotting (line 44) and to
space out the data nicely via the incr
option to the y2tics
setting
(lines 45-47). It seemed more appropriate to use lines to plot power output
as opposed to the bars we used for the altitude data, hence the change to
the style
option on line 59.
As we did when plotting altitude, we pass the power data set ($power_ds) to the plot2d() call before $heart_rate_ds (line 63).
Running the script again
$ perl geo-fit-plot-data.pl
produces this plot:
This plot shows the correlation between heart rate and power output that we expected for the first hill climb. The power output increases steadily from the 3-minute mark up to about the 18-minute mark. After that, it dropped suddenly once I’d reached the top of the climb. This makes sense: I’d just done a personal best up that climb and needed a bit of respite!
However, now we can see clearly what caused the spikes in heart rate at 25 minutes and 48 minutes: there are two large spikes in power output. The first spike maxes out at 1023 W;[12] what value the other peak has is hard to tell. We'll try to work out what that value is later. These spikes in power result from sprints. In Zwift, not only can one try to go up hills as fast as possible, but flatter sections have sprints where one also tries to go as fast as possible, albeit for shorter distances (say 200m or 500m).
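For the impatient, a rough way to read off that second peak now is to take the maximum power in a window around it; a sketch assuming the @times and @powers arrays from plot_activity_data() are in scope (the 45-50 minute bounds are eyeballed from the plot):

```perl
use List::Util qw(max);

# Maximum power in the 45-50 minute window (hypothetical window
# bounds; adjust them to bracket the spike visible in the plot).
my @window = grep { $times[$_] >= 45 && $times[$_] <= 50 } 0 .. $#times;
my $peak   = max map { $powers[$_] } @window;
print "Peak power between 45 and 50 minutes: $peak W\n";
```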
Great! We’ve worked out another puzzle in the data!
Zwift produces what they call timelines of a given ride, which is much the same as what we’ve been plotting here. For instance, for the FIT file we’ve been looking at, this is the timeline graph:
Zwift plots several datasets on this graph that have very different value ranges. The plot above shows power output, cadence, heart rate, and altitude data all on one graph! A lot is going on here and because of the different data values and ranges, Zwift doesn’t display values on the y-axes. Their solution is to show all four values at a given time point when the user hovers their mouse over the graph. This solution only works within a web browser and needs lots of JavaScript to work, hence this is something I like to avoid. That (and familiarity) is largely the reason why I prefer PNG output for my graphs.
If you take a close look at the timeline graph, you’ll notice that the maximum power is given as 937 W and not 1023 W, which we worked out from the FIT file data. I don’t know what’s going on here, as the same graph in the Zwift Companion App shows the 1023 W that we got. The graph above is a screenshot from the web application in a browser on my laptop and, at least theoretically, it’s supposed to display the same data. I’ve noticed a few inconsistencies between the web browser view and that from the Zwift Companion App, so maybe this discrepancy is one bug that still needs shaking out.
Y’know what’d also be cool beyond plotting this data? Playing around with it interactively.
That’s also possible with Perl, but it’s another story.
I’ve been using Gnuplot since the late 90’s. Back then, it was the only freely available plotting software which handled time data well. ↩︎
By default, Gnuplot will generate Postscript output. ↩︎
One can interpret the word “terminal” as a kind of “screen” or “canvas” that the plotting library draws its output on. ↩︎
I’ve later found out that they haven’t heard anything, so that’s good! ↩︎
I live in Germany, so this is the relevant time zone for me. ↩︎
All dates are the same and displaying them would be redundant, hence we omit the date information. ↩︎
All elements in the array have the same date, so using the first one does the job. ↩︎
KOM stands for “king of the mountains”. ↩︎
Yes, I am stoked that I managed to take that jersey! Even if it was only for a short time. ↩︎
A live result that makes it onto a leaderboard is valid only for one hour. ↩︎
Around the 5-minute mark and again shortly before the 35-minute mark. ↩︎
One thing that this value implies is that I could power a small bar heater for one second. But not for very much longer! ↩︎
If you’re a Perl developer, you’ve probably heard it before: “Is Perl still a thing?”
The short answer? Yes. Absolutely.
The longer answer? It’s evolving—quietly, but purposefully—and there’s still real demand for skilled Perl developers across a number of industries.
Let’s explore where the opportunities are today and how to find them.
Despite not being the trendiest language, Perl continues to power core infrastructure across a number of industries.
The reality is, companies with decades of code running in Perl aren’t eager to rip and replace something that still works flawlessly.
One major shift in recent years is the rise of remote-first hiring. More companies are hiring global developers to work on existing Perl systems—whether it's maintaining codebases, modernizing legacy apps, or integrating Perl into cloud workflows.
These roles aren’t always posted on major job boards. That’s why using niche platforms is key.
General job boards often bury Perl listings under unrelated content or make it difficult to filter accurately.
That’s why developers increasingly rely on specialized platforms like
Perl-Jobs.com — a focused job board built specifically
for the Perl community, offering remote, freelance, and full-time listings from companies that actually want your Perl skills.
It saves time and connects you with opportunities that are actually relevant.
Perl isn’t dead—it’s just not loud. There are still solid, high-paying roles out there for developers who know how to write clean, efficient Perl code. And with the right tools and platforms, you don’t have to hunt blindly to find them.
So whether you're actively job hunting or just keeping an eye on the market, it’s a good time to dust off the resume and see where Perl can take you.
Published by thibaultduponchelle on Wednesday 25 June 2025 05:40
Add epigraph for 5.42.0-RC1
Published by /u/scottchiefbaker on Monday 23 June 2025 20:15
I have the following code snippet that prints the word "PASS" in green letters. I want to use printf()
to align the text but printf
reads the raw length of the string, not the printable characters, so the alignment doesn't work.
```perl
my $str = "\033[38;5;2m" . "PASS" . "\033[0m";
printf("Test was '%10s' result\n", $str);
```
Is there any way to make printf()
ANSI aware? Or could I write a wrapper that would do what I want?
The best I've been able to come up with is:
```perl
$str = "Test was '\033[38;5;2m" . sprintf("%10s", "PASS") . "\033[0m' result";
printf("%s\n", $str);
```
While this works, it's much less readable and doesn't leverage the full formatting potential of printf().
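A wrapper is certainly possible: strip the ANSI escape sequences to measure the visible width, then pad manually. A minimal sketch (visible_length and ansi_pad are made-up helper names):

```perl
use strict;
use warnings;

# Visible length: the string length after stripping ANSI SGR sequences.
sub visible_length {
    my ($str) = @_;
    (my $plain = $str) =~ s/\e\[[0-9;]*m//g;
    return length $plain;
}

# ansi_pad: right-align $str in $width columns based on visible width,
# padding with spaces much like sprintf("%*s", $width, $str) would.
sub ansi_pad {
    my ($width, $str) = @_;
    my $pad = $width - visible_length($str);
    return ($pad > 0 ? ' ' x $pad : '') . $str;
}

my $str = "\033[38;5;2m" . "PASS" . "\033[0m";
printf("Test was '%s' result\n", ansi_pad(10, $str));
```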
Published by Ronak Bhatt on Monday 23 June 2025 14:39
The tech world moves quickly — some languages just can’t keep up. Are you clinging to one that’s quietly dying? By Ronak Bhatt
FIT files record the activities of people using devices such as sports watches and bike head units. Platforms such as Strava and Zwift understand this now quasi-standard format. So does Perl! Here I discuss how to parse FIT files and calculate some basic statistics from the extracted data.
I love data. Geographical data, time series data, simulation data, whatever. Whenever I get my hands on a new dataset, I like to have a look at it and visualise it. This way I can get a feel for what’s available and to see what kind of information I can extract from the long lists of numbers. I guess this comes with having worked in science for so long: there’s always some interesting dataset to look at and analyse and try to understand.
I began collecting lots of data recently when I started riding my bike more. Bike head units can save all sorts of information about one’s ride. There are standard parameters such as time, position, altitude, temperature, and speed. If you have extra sensors then you can also measure power output, heart rate, and cadence. This is a wealth of information just waiting to be played with!
I’ve also recently started using Zwift1 and there I can get even more data than on my road bike. Now I can get power and cadence data along with the rest of the various aspects of a normal training ride.
My head unit is from Garmin[2] and thus saves ride data in their standard FIT format. Zwift also allows you to save ride data in FIT format, so you don't have to deal with multiple formats when reading and analysing ride data. FIT files can also be uploaded to Strava[3] where you can track all the riding you're doing in one location.
But what if you don’t want to use an online service to look at your ride data? What if you want to do this yourself, using your own tools? That’s what I’m going to talk about here: reading ride data from FIT files and analysing the resulting information.
Because I like Perl, I wondered if there are any modules available to read
FIT files. It turns out that there are two: Geo::FIT and Parser::FIT. I chose to use Geo::FIT because Parser::FIT is still in alpha status. Also, Geo::FIT is quite mature, with its last release in 2024, so it is still up-to-date.
The Garmin developer site explains all the gory details of the FIT format. The developer docs give a good high-level overview of what the format is for:
The Flexible and Interoperable Data Transfer (FIT) protocol is a format designed specifically for the storing and sharing of data that originates from sport, fitness and health devices. It is specifically designed to be compact, interoperable and extensible.
A FIT file has a well-defined structure and contains a series of records of different types. There are definition messages which describe the data appearing in the file. There are also data messages which contain the data fields storing a ride’s various parameters. Header fields contain such things as CRC information which one can use to check a file’s integrity.
As noted above, to extract the data, I'm going to use the Geo::FIT module. It's based on the Garmin::FIT module originally by Kiyokazu Suto and later expanded upon by Matjaz Rihtar. Unfortunately, neither was ever released to CPAN. The latest releases of the Garmin::FIT code (either version) were in 2017. In contrast, Geo::FIT's most recent release is from 2024-07-13 and it's available on CPAN, making it easy to install. It's great to see that someone has taken on the mantle of maintaining this codebase!
To install Geo::FIT, we'll use cpanm:
$ cpanm Geo::FIT
Now we’re ready to start parsing FIT files and extracting their data.
As mentioned earlier, FIT files store event data in data messages. Each event has various fields, depending upon the kind of device (e.g. watch or head unit) used to record the activity. More fields are possible if other peripherals are attached to the main device (e.g. power meter or heart rate monitor). We wish to extract all available event data.
To extract (and if we want to, process) the event data, Geo::FIT
requires
that we define a callback function and register it. Geo::FIT
calls this
function each time it detects a data message, allowing us to process the
file in small bites as a stream of data rather than one giant blob.
A simple example should explain the process. I’m going to adapt the example
mentioned in the module’s
synopsis. Here’s the code
(which I’ve put into a file called geo-fit-basic-data-extraction.pl
):
 1 use strict;
 2 use warnings;
 3
 4 use Geo::FIT;
 5
 6 my $fit = Geo::FIT->new();
 7 $fit->file( "2025-05-08-07-58-33.fit" );
 8 $fit->open or die $fit->error;
 9
10 my $record_callback = sub {
11     my ($self, $descriptor, $values) = @_;
12     my $time= $self->field_value( 'timestamp', $descriptor, $values );
13     my $lat = $self->field_value( 'position_lat', $descriptor, $values );
14     my $lon = $self->field_value( 'position_long', $descriptor, $values );
15     print "Time was: ", join("\t", $time, $lat, $lon), "\n"
16 };
17
18 $fit->data_message_callback_by_name('record', $record_callback ) or die $fit->error;
19
20 my @header_things = $fit->fetch_header;
21
22 1 while ( $fit->fetch );
23
24 $fit->close;
The only changes I’ve made from the original example code have been to
include the strict
and warnings
strictures on lines 1 and 2, and to
replace the $fname
variable with the name of a FIT file exported from one
of my recent Zwift rides (line 7).
After having imported the module (line 4), we instantiate a Geo::FIT
object (line 6). We then tell Geo::FIT
the name of the file to process by
calling the file()
method on line 7. This method returns the name of the
file if it’s called without an argument. We open the file on line 8 and
barf with an error if anything went wrong.
Lines 10-16 define the callback function, which must accept the given
argument list. Within the callback, the field_value()
method extracts the
value with the given field name from the FIT data message (lines 12-14).
I’ll talk about how to find out what field names are available later. In
this example, we extract the timestamp as well as the latitude and longitude
of where the event happened. Considering that Garmin is a company that has
focused on GPS sensors, it makes sense that such data is the minimum we
would expect to find in a FIT file.
On line 18 we register the callback with the Geo::FIT
object. We tell it
that the callback should be run whenever Geo::FIT
sees a data message with
the name record.[4] Again, the code barfs with an error if anything goes wrong.
The next line (line 20) looks innocuous but is actually necessary. The
fetch_header()
method must be called before we can fetch any data from
the FIT file. Calling this method also returns header information, part of
which we can use to check the file integrity. This is something we might
want to use in a robust application as opposed to a simple script such as
that here.
The main action takes place on line 22. We read each data message from the
FIT file and–if it’s a data message with the name record
–process it with
our callback.
At the end (line 24), we’re good little developers and close the file.
Running this code, you’ll see lots of output whiz past. It’ll look something like this:
$ perl geo-fit-basic-data-extraction.pl
<snip>
Time was: 2025-05-08T06:53:10Z -11.6379448 deg 166.9560685 deg
Time was: 2025-05-08T06:53:11Z -11.6379450 deg 166.9560904 deg
Time was: 2025-05-08T06:53:12Z -11.6379451 deg 166.9561073 deg
Time was: 2025-05-08T06:53:13Z -11.6379452 deg 166.9561185 deg
Time was: 2025-05-08T06:53:14Z -11.6379452 deg 166.9561232 deg
Time was: 2025-05-08T06:53:15Z -11.6379452 deg 166.9561233 deg
Time was: 2025-05-08T06:53:16Z -11.6379452 deg 166.9561233 deg
Time was: 2025-05-08T06:53:17Z -11.6379452 deg 166.9561233 deg
This tells us that, at the end of my ride on Zwift, I was at a position of roughly 11°S, 167°E shortly before 07:00 UTC on the 8th of May 2025.[5] Because Zwift has virtual worlds, this position tells little of my actual physical location at the time. Hint: my spare room (where I was riding my indoor trainer) isn't located at this position. 😉
We want to get serious, though, and not only extract position and timestamp
data. There’s more in there to discover! So how do we find out what fields
are available? For this task, we need to use the fields_list()
method.
To extract the list of available field names, I wrote the following script,
which I called geo-fit-find-field-names.pl:
 1 use strict;
 2 use warnings;
 3
 4 use Geo::FIT;
 5 use Scalar::Util qw(reftype);
 6
 7 my $fit = Geo::FIT->new();
 8 $fit->file( "2025-05-08-07-58-33.fit" );
 9 $fit->open or die $fit->error;
10
11 my $record_callback = sub {
12     my ($self, $descriptor, $values) = @_;
13     my @all_field_names = $self->fields_list($descriptor);
14
15     return \@all_field_names;
16 };
17
18 $fit->data_message_callback_by_name('record', $record_callback ) or die $fit->error;
19
20 my @header_things = $fit->fetch_header;
21
22 my $found_field_names = 0;
23 do {
24     my $field_names = $fit->fetch;
25     my $reftype = reftype $field_names;
26     if (defined $reftype && $reftype eq 'ARRAY') {
27         print "Number of field names found: ", scalar @{$field_names}, "\n";
28
29         while (my @next_field_names = splice @{$field_names}, 0, 5) {
30             my $joined_field_names = join ", ", @next_field_names;
31             print $joined_field_names, "\n";
32         }
33         $found_field_names = 1;
34     }
35 } while ( !$found_field_names );
36
37 $fit->close;
This script extracts and prints the field names from the first data message
it finds. Here, I’ve changed the callback (lines 11-16) to only return the
list of all available field names by calling the fields_list()
method. We
return the list of field names as an array reference (line 15). While this
particular change to the callback (in comparison to geo-fit-basic-data-extraction.pl, above) will do the job, it's not very
user-friendly. It will print the field names for all data messages in the
FIT file, which is a lot. The list of all available field names would be
repeated thousands of times! So, I changed the while
loop to a do-while
loop (lines 23-35), exiting as soon as the callback finds a data message
containing field names.
To actually grab the field name data, I had to get a bit tricky. This is
because fetch()
returns different values depending upon whether the
callback was called. For instance, when the callback isn’t called, the
return value is 1 on success or undef. If the callback function is
called, fetch()
returns the callback’s return value, which in our case is
the array reference to the list of field names. Hence, I’ve assigned the
return value to a variable, $field_names
(line 24). To ensure that we’re
only processing data returned when the callback is run, we check that
$field_names
is defined and has a reference type of ARRAY
(line 26).
This we do with the help of the reftype
function from Scalar::Util
(line
25).
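As a standalone illustration of how reftype distinguishes these return values:

```perl
use strict;
use warnings;
use Scalar::Util qw(reftype);

print reftype([1, 2, 3]) // 'undef', "\n";   # ARRAY
print reftype({ a => 1 }) // 'undef', "\n";  # HASH
print reftype(42)        // 'undef', "\n";   # undef: not a reference
```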
It turns out that there are 49 field names available (line 27). To format
the output more nicely I spliced the array, extracting five elements at a time (line 29) and printing them as a comma-separated string (lines 30 and 31). I adapted the while (splice) pattern from the example in the Perl documentation for splice.
Note that I could have printed the field names from within the callback. It
doesn’t make much of a difference if we return data from the callback first
before processing it or doing the processing within the callback. In this
case, I chose to do the former.
Running the script gives the following output:
$ perl geo-fit-find-field-names.pl
Use of uninitialized value $emsg in string ne at /home/vagrant/perl5/perlbrew/perls/perl-5.38.3/lib/site_perl/5.38.3/Geo/FIT.pm line 7934.
Use of uninitialized value $emsg in string ne at /home/vagrant/perl5/perlbrew/perls/perl-5.38.3/lib/site_perl/5.38.3/Geo/FIT.pm line 7992.
Number of field names found: 49
timestamp, position_lat, position_long, distance, time_from_course
total_cycles, accumulated_power, enhanced_speed, enhanced_altitude, altitude
speed, power, grade, compressed_accumulated_power, vertical_speed
calories, vertical_oscillation, stance_time_percent, stance_time, ball_speed
cadence256, total_hemoglobin_conc, total_hemoglobin_conc_min, total_hemoglobin_conc_max, saturated_hemoglobin_percent
saturated_hemoglobin_percent_min, saturated_hemoglobin_percent_max, heart_rate, cadence, compressed_speed_distance
resistance, cycle_length, temperature, speed_1s, cycles
left_right_balance, gps_accuracy, activity_type, left_torque_effectiveness, right_torque_effectiveness
left_pedal_smoothness, right_pedal_smoothness, combined_pedal_smoothness, time128, stroke_type
zone, fractional_cadence, device_index, 1_6_target_power
Note that the uninitialized value warnings are from Geo::FIT.
Unfortunately, I don’t know what’s causing them. They appear whenever we
fetch data from the FIT file. From now on, I’ll omit these warnings from
program output in this article.
As you can see, there’s potentially a lot of information one can obtain
from FIT files. I say “potentially” here because not all these fields
contain valid data, as we’ll see soon. I was quite surprised at the level
of detail. For instance, there are various pedal smoothness values, stroke
type, and torque effectiveness parameters. Also, there’s haemoglobin
information,6 which I guess is something one can
collect given the appropriate peripheral device. What things like enhanced
speed and compressed accumulated power mean, I’ve got no idea. For me, the interesting parameters are: timestamp, position_lat, position_long, distance, altitude, speed, power, calories, heart_rate, and cadence. We’ll get around to extracting and looking at these values soon.
Let’s see what values are present in each of the fields. To do this, we’ll
change the callback to collect the values in a hash with the field names as
the hash keys. Then we’ll return the hash from the callback. Here’s the
script I came up with (I called it geo-fit-show-single-values.pl
):
1use strict;
2use warnings;
3
4use Geo::FIT;
5use Scalar::Util qw(reftype);
6
7my $fit = Geo::FIT->new();
8$fit->file( "2025-05-08-07-58-33.fit" );
9$fit->open or die $fit->error;
10
11my $record_callback = sub {
12 my ($self, $descriptor, $values) = @_;
13 my @all_field_names = $self->fields_list($descriptor);
14
15 my %event_data;
16 for my $field_name (@all_field_names) {
17 my $field_value = $self->field_value($field_name, $descriptor, $values);
18 $event_data{$field_name} = $field_value;
19 }
20
21 return \%event_data;
22};
23
24$fit->data_message_callback_by_name('record', $record_callback ) or die $fit->error;
25
26my @header_things = $fit->fetch_header;
27
28my $found_event_data = 0;
29do {
30 my $event_data = $fit->fetch;
31 my $reftype = reftype $event_data;
32 if (defined $reftype && $reftype eq 'HASH' && defined $event_data->{'timestamp'}) {
33 for my $key ( sort keys %$event_data ) {
34 print "$key = ", $event_data->{$key}, "\n";
35 }
36 $found_event_data = 1;
37 }
38} while ( !$found_event_data );
39
40$fit->close;
The main changes here (in comparison to the previous script) involve collecting the data into a hash (lines 15-19) and later, after fetching the event data, printing it (lines 32-35).
To collect data from an individual event, we first find out what the
available fields are (line 13). Then we loop over each field name (line
16), extracting the values via the field_value()
method (line 17). To
pass the data outside the callback, we store the value in the %event_data
hash using the field name as a key (line 18). Finally, we return the event
data as a hash ref (line 21).
When printing the key and value information, we again only want to print the
first event that we come across. Hence we use a do-while
loop and exit as
soon as we’ve found appropriate event data (line 38).
Making sure that we’re only printing relevant event data is again a bit
tricky. Not only do we need to make sure that the callback has returned a
reference type, but we also need to check that it’s a hash. Plus, we have
an extra check to make sure that we’re getting time series data by looking
for the presence of the timestamp
key (line 32). Without the timestamp
key check, we receive data messages unrelated to the ride activity, which we
obviously don’t want.
Running this new script gives this output:
$ perl geo-fit-show-single-values.pl
1_6_target_power = 0
accumulated_power = 4294967295
activity_type = 255
altitude = 4.6 m
ball_speed = 65535
cadence = 84 rpm
cadence256 = 65535
calories = 65535
combined_pedal_smoothness = 255
compressed_accumulated_power = 65535
compressed_speed_distance = 255
cycle_length = 255
cycles = 255
device_index = 255
distance = 0.56 m
enhanced_altitude = 4294967295
enhanced_speed = 4294967295
fractional_cadence = 255
gps_accuracy = 255
grade = 32767
heart_rate = 115 bpm
left_pedal_smoothness = 255
left_right_balance = 255
left_torque_effectiveness = 255
position_lat = -11.6387709 deg
position_long = 166.9487493 deg
power = 188 watts
resistance = 255
right_pedal_smoothness = 255
right_torque_effectiveness = 255
saturated_hemoglobin_percent = 65535
saturated_hemoglobin_percent_max = 65535
saturated_hemoglobin_percent_min = 65535
speed = 1.339 m/s
speed_1s = 255
stance_time = 65535
stance_time_percent = 65535
stroke_type = 255
temperature = 127
time128 = 255
time_from_course = 2147483647
timestamp = 2025-05-08T05:58:45Z
total_cycles = 4294967295
total_hemoglobin_conc = 65535
total_hemoglobin_conc_max = 65535
total_hemoglobin_conc_min = 65535
vertical_oscillation = 65535
vertical_speed = 32767
zone = 255
That’s quite a list!
What’s immediately obvious (at least, to me) is that many of the values look
like maximum integer range values. For instance, activity_type = 255
suggests that this value ranges from 0 to 255, implying that it’s an 8-bit
integer. Also, the numbers 65535 and 4294967295 are the maximum values of
16-bit and 32-bit integers, respectively. This “smells” of dummy values
being used to fill the available keys with something other than 0. Thus, I
get the feeling that we can ignore such values.
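To make the sentinel idea concrete, here’s a small sketch (an illustration of mine, not part of the original script) that would skip any field whose value matches one of these integer maxima; the sentinel list is an assumption based on the output above:
my %sentinel = map { $_ => 1 } (255, 32767, 65535, 2147483647, 4294967295);

for my $field_name (sort keys %event_data) {
    my $value = $event_data{$field_name};
    next if exists $sentinel{$value};    # skip "max integer" dummy values
    print "$field_name = $value\n";
}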
Further, most of the values that aren’t only an integer have units attached. For instance, the speed is given as 1.339 m/s and the latitude coordinate as -11.6387709 deg. The only value without a unit, yet still a sensible value, is timestamp. This makes sense, as a timestamp doesn’t have a unit.
This is the next part of the puzzle to solve: we need to work out how to extract relevant event data and filter out anything containing a dummy value.
To filter out the dummy values and hence focus only on real event data, we use the fact that real event data contains a string of letters denoting the value’s unit. Thus, the event data we’re interested in has a value containing numbers and letters. Fortunately, this is also the case for the timestamp because it contains timezone information, denoted by the letter Z, meaning UTC. In other words, we can solve our problem with a regex.7
Another way of looking at the problem would be to realise that all the irrelevant data contains only numbers. Thus, if a data value contains a letter, we should select it. Either way, the easiest approach is to look for a letter by using a regex.
I’ve modified the script above to filter out the dummy event data and to collect valid event data into an array for the entire activity.8 Here’s what the code looks like now (I’ve called the file geo-fit-full-data-extraction.pl):
1use strict;
2use warnings;
3
4use Geo::FIT;
5use Scalar::Util qw(reftype);
6
7my $fit = Geo::FIT->new();
8$fit->file( "2025-05-08-07-58-33.fit" );
9$fit->open or die $fit->error;
10
11my $record_callback = sub {
12 my ($self, $descriptor, $values) = @_;
13 my @all_field_names = $self->fields_list($descriptor);
14
15 my %event_data;
16 for my $field_name (@all_field_names) {
17 my $field_value = $self->field_value($field_name, $descriptor, $values);
18 if ($field_value =~ /[a-zA-Z]/) {
19 $event_data{$field_name} = $field_value;
20 }
21 }
22
23 return \%event_data;
24};
25
26$fit->data_message_callback_by_name('record', $record_callback ) or die $fit->error;
27
28my @header_things = $fit->fetch_header;
29
30my $event_data;
31my @activity_data;
32do {
33 $event_data = $fit->fetch;
34 my $reftype = reftype $event_data;
35 if (defined $reftype && $reftype eq 'HASH' && defined $event_data->{'timestamp'}) {
36 push @activity_data, $event_data;
37 }
38} while ( $event_data );
39
40$fit->close;
41
42print "Found ", scalar @activity_data, " entries in FIT file\n";
43my $available_fields = join ", ", sort keys %{$activity_data[0]};
44print "Available fields: $available_fields\n";
The primary difference here with respect to the previous script is the check
within the callback for a letter in the field value (line 18). If that’s
true, we store the field value in the %event_data
hash under a key
corresponding to the field name (line 19).
Later, if we have a hash and it has a timestamp key, we push the $event_data hash reference onto an array. This way we store all events related to our activity (line 36). Also, instead of checking that we got only one instance of event data, we’re now looping over all event data in the FIT file, exiting the do-while loop if $event_data is a falsey value.9 Note that $event_data has to be declared outside the do block. Otherwise, it won’t be in scope for the while statement and Perl will barf with a compile-time error. We also declare the @activity_data array outside the do-while loop because we want to use it later.
After processing all records in the FIT file, we display the number of data entries found (line 42) and show a list of the available (valid) fields (lines 43-44).
Running this script gives this output:10
$ perl geo-fit-full-data-extraction.pl
Found 3273 entries in FIT file
Available fields: altitude, cadence, distance, heart_rate, position_lat, position_long, power, speed, timestamp
We now have the full dataset to play with! So what can we do with it? One thing that springs to mind is to calculate the maximum and average values of each data series.
Given the list of available fields, my instincts tell me that it’d be nice to know the following parameters:
- the total distance
- the maximum and average speed
- the maximum and average power
- the maximum and average heart rate
Let’s calculate them now.
Finding the total distance is very easy. Since this is a cumulative quantity, we only need to select the value in the final data point. Then we convert it to kilometres by dividing by 1000, because the distance data is in units of metres. I.e.:
my $total_distance_m = (split ' ', ${$activity_data[-1]}{'distance'})[0];
my $total_distance = $total_distance_m/1000;
print "Total distance: $total_distance km\n";
Note that since the distance
field value also contains its unit, we have
to split on spaces and take the first element to extract the numerical part.
To get maximum values (e.g. for maximum speed), we use the max function from List::Util:
1my @speeds = map { (split ' ', $_->{'speed'})[0] } @activity_data;
2my $maximum_speed = max @speeds;
3my $maximum_speed_km = $maximum_speed*3.6;
4print "Maximum speed: $maximum_speed m/s = $maximum_speed_km km/h\n";
Here, I’ve extracted all speed values from the activity data, selecting only the numerical part (line 1). I then found the maximum speed on line 2 (which is in m/s) and converted this into km/h (line 3), displaying both at the end.
Getting average values is a bit more work because List::Util
doesn’t
provide an arithmetic mean function, commonly known as an “average”. Thus,
we have to calculate this ourselves. It’s not much work, though. Here’s
the code for the average speed:
1my $average_speed = (sum @speeds) / (scalar @speeds);
2my $average_speed_km = sprintf("%0.2f", $average_speed*3.6);
3$average_speed = sprintf("%0.2f", $average_speed);
4print "Average speed: $average_speed m/s = $average_speed_km km/h\n";
In this code, I’ve used the sum
function from List::Util
to find the sum
of all speed values in the entry data (line 1). Dividing this value by the
length of the array (i.e. scalar @speeds
) gives the average value.
Because this value will have lots of decimal places, I’ve used sprintf
to
show only two decimal places (this is what the "%0.2f"
format statement
does on line 3). Again, I’ve calculated the value in km/h (line 2) and
show the average speed in both m/s and km/h.
Extending the code to calculate and display all parameters I mentioned above, we get this:
my $total_distance_m = (split ' ', ${$activity_data[-1]}{'distance'})[0];
my $total_distance = $total_distance_m/1000;
print "Total distance: $total_distance km\n";
my @speeds = map { (split ' ', $_->{'speed'})[0] } @activity_data;
my $maximum_speed = max @speeds;
my $maximum_speed_km = $maximum_speed*3.6;
print "Maximum speed: $maximum_speed m/s = $maximum_speed_km km/h\n";
my $average_speed = (sum @speeds) / (scalar @speeds);
my $average_speed_km = sprintf("%0.2f", $average_speed*3.6);
$average_speed = sprintf("%0.2f", $average_speed);
print "Average speed: $average_speed m/s = $average_speed_km km/h\n";
my @powers = map { (split ' ', $_->{'power'})[0] } @activity_data;
my $maximum_power = max @powers;
print "Maximum power: $maximum_power W\n";
my $average_power = (sum @powers) / (scalar @powers);
$average_power = sprintf("%0.2f", $average_power);
print "Average power: $average_power W\n";
my @heart_rates = map { (split ' ', $_->{'heart_rate'})[0] } @activity_data;
my $maximum_heart_rate = max @heart_rates;
print "Maximum heart rate: $maximum_heart_rate bpm\n";
my $average_heart_rate = (sum @heart_rates) / (scalar @heart_rates);
$average_heart_rate = sprintf("%0.2f", $average_heart_rate);
print "Average heart rate: $average_heart_rate bpm\n";
If you’re following along at home–and assuming that you’ve added this code
to the end of geo-fit-full-data-extraction.pl
–when you run the file, you
should see output like this:
$ perl geo-fit-full-data-extraction.pl
Found 3273 entries in FIT file
Available fields: altitude, cadence, distance, heart_rate, position_lat,
position_long, power, speed, timestamp
Total distance: 31.10591 km
Maximum speed: 18.802 m/s = 67.6872 km/h
Average speed: 9.51 m/s = 34.23 km/h
Maximum power: 1023 W
Average power: 274.55 W
Maximum heart rate: 165 bpm
Average heart rate: 142.20 bpm
Nice! That gives us more of a feel for the data and what we can learn from it. We can also see that I was working fairly hard on this bike ride as seen from the average power and average heart rate data.
One thing to highlight about these numbers, from my experience riding both indoors and outdoors, is that the average speed on Zwift is too high. Were I riding my bike outside on the road, I’d be more likely to have an average speed of ~25 km/h, not the 34 km/h shown here. Part of this discrepancy may come from Zwift not accurately converting power output into speed within the game.11 Or perhaps I simply don’t go as hard when out on the road? Dunno.
From experience, I know that it’s easier to put in more effort over shorter periods. Thus, I’d expect the average speed to be a bit higher indoors when doing shorter sessions. Another factor is that when riding outside one has to contend with stopping at intersections and traffic lights etc. Stopping and starting brings down the average speed on outdoor rides. These considerations might explain part of the discrepancy, but I don’t think it explains it all.
There’s some duplication in the above code that I could remove. For
instance, the code for extracting the numerical part of a data entry’s value
should really be in its own function. I don’t need to map
over a split
each time; those are just implementation details that should hide behind a
nicer interface. Also, the average value calculation would be better in its
own function.
A possible refactoring to reduce this duplication could look like this:
# extract and return the numerical parts of an array of FIT data values
sub num_parts {
my $field_name = shift;
my @activity_data = @_;
return map { (split ' ', $_->{$field_name})[0] } @activity_data;
}
# return the average of an array of numbers
sub avg {
my @array = @_;
return (sum @array) / (scalar @array);
}
which one would use like so:
my @speeds = num_parts('speed', @activity_data);
my $average_speed = avg(@speeds);
Seeing numerical values of ride statistics is all well and good, but it’s much nicer to see a picture of the data. To do this, we need to plot it.
But that’s a story for another time.
Note that I’m not affiliated with Zwift. I use the platform for training, especially for short rides, when the weather’s bad and in the winter. ↩︎
Note that I’m not affiliated with Garmin. I own a Garmin Edge 530 head unit and find that it works well for my needs. ↩︎
Note that I’m not affiliated with Strava. I’ve found the platform to be useful for individual ride analysis and for collating a year’s worth of training. ↩︎
There are different kinds of data messages. We usually want records, as these messages contain event data from sporting activities. ↩︎
For those wondering: these coordinates would put me on the island of Teanu, which is part of the Santa Cruz Islands. This island group is north of Vanuatu and east of the Solomon Islands in the Pacific Ocean. ↩︎
I expected this field to be spelled ‘haemoglobin’ rather than hemoglobin. Oh well. ↩︎
Jeff Atwood wrote an interesting take on the use of regular expressions. ↩︎
Garmin calls a complete ride (or run, if you’re that way inclined) an “activity”. Hence I’m using their nomenclature here. ↩︎
Remember that fetch() returns undef on failure or EOF. ↩︎
Note that I’ve removed the uninitialized value warnings from the script output. ↩︎
Even though Zwift is primarily a training platform, it is also a game. There are power-ups and other standard gaming features such as experience points (XP). Accumulating XP allows you to climb up a ladder of levels which then unlocks other features and in-game benefits. This is the first computer game I’ve ever played where strength and fitness in real life play a major role in the in-game success. ↩︎
Published by Gabor Szabo on Monday 23 June 2025 06:22
Originally published at Perl Weekly 726
Hi there,
Most of us have explored and seen the power of ChatGPT. Last week, Dave Cross shared a very interesting tool written in Perl that creates a podcast for the Perl Weekly newsletter. We already have podcasts for Week #723, Week #724 and Week #725. I am very impressed with the content and sound quality. Great job, Dave. You can check out the code in the GitHub repository.
Another post, Slice of Perl by Dave Cross, inspired me to write about Array vs List in Perl.
For all Dancer2 fans, we have good news: very soon there will be a Dancer2 2.0.0 release. I am very excited and looking forward to it.
Enjoy the rest of the newsletter.
--
Your editor: Mohammad Sajid Anwar.
I have been teaching Perl for 25 years (and Python and Rust for a shorter period of time). Most of my courses were geared towards corporations and they are 3-4-5 days long, 8 hours a day covering a lot of subjects. Today it seems we need very short and very focused courses. So I am splitting my long courses into subject and will run those mini-courses. The first 3 I've announced are about Object Oriented Programming in Perl, Functional Programming in Python, and Creating a command line tool in Rust. Check out if any of these courses would be interesting to you or to some of your co-workers! If you have any question, send me an email (szabgab@gmail.com) or contact me via LinkedIn or join our Telegram channel.
The Dancer Core Team is excitedly preparing a major release of Dancer2, 2.0.0. In advance of this, I'd like to give you all a preview of what to expect.
For more fine-grained analysis of FIT file data, it’d be great to be able to investigate it interactively.
Find out the subtle difference between Array and List in Perl.
The Weekly Challenge by Mohammad Sajid Anwar will help you step out of your comfort-zone. You can even win prize money of $50 by participating in the weekly challenge. We pick one champion at the end of the month from among all of the contributors during the month, thanks to the sponsor Lance Wicks.
Welcome to a new week with a couple of fun tasks "Missing Integers" and "MAD". If you are new to the weekly challenge then why not join us and have fun every week. For more information, please read the FAQ.
Enjoy a quick recap of last week's contributions by Team PWC dealing with the "Day of the Year" and "Decompressed List" tasks in Perl and Raku. You will find plenty of solutions to keep you busy.
Well-commented, educational, and test-driven structure. Offers neat and idiomatic Perl solutions that avoid overengineering.
The solutions are clear, concise, and make good use of Perl's standard modules and operators. The second task uses manual iteration, which is easy to follow and avoids recursion or map-heavy constructs.
Solutions are succinct, leveraging Raku's expressive standard library. Emphasizes type and input validation in the signature, promoting reliability.
The post is well-written and shows how Raku and Perl handle the same problem with slight idiomatic differences.
PDL offers elegant ways to handle list transformations via vectorized operations. The post illustrates a concise and advanced Perl style that values reuse and idiomatic tools.
The post not only provides solutions in various programming languages and database systems but also emphasizes the importance of maintaining coding skills and learning new technologies. It showcases the versatility of different languages and tools in solving the same problem, providing valuable insights for developers interested in exploring multiple programming environments.
Solutions are characterized by their simplicity and effective use of Perl's features. The clear explanations and well-structured code make the blog post a valuable resource for Perl enthusiasts.
Concise and idiomatic Perl. Code is well structured and leverages robust date/time modules to avoid reinventing calendar logic.
Emphasizes the use of core Perl modules (Time::Piece) to avoid reinventing date logic. Mentions best practices such as handling context carefully in the decompression task.
Clear demonstration of using CPAN modules to solve common problems. Concise, readable code snippets.
The solutions are complete and well-documented, providing clear explanations of the logic behind each step. It serves as an excellent resource for Perl programmers looking to improve their problem-solving skills and coding practices.
Provides a comparative analysis of how different programming languages approach the same problems, highlighting the diversity in language features and libraries. The post serves as an insightful resource for programmers interested in exploring multiple solutions to common problems across various languages.
It emphasizes the importance of using built-in functions for tasks like calculating the day of the year to avoid errors and simplify code. The solutions in both Python and Perl demonstrate clear and efficient approaches to the challenges.
The solutions demonstrate Raku's expressive syntax and powerful built-in methods, leading to clean and efficient code. The inclusion of input validation and user-friendly comments further enhances the quality of the solutions.
The solution efficiently handles the decompression task using Perl's list manipulation features. Solutions are concise and demonstrate effective use of Perl's modules and list handling capabilities.
Solutions are notable for their clarity and efficiency, demonstrating a deep understanding of Python's capabilities. Exploration of both standard library functions and mathematical approaches provides valuable insights for Python enthusiasts.
A section for newbies and for people who need some refreshing of their Perl knowledge. If you have questions or suggestions about the articles, let me know and I'll try to make the necessary changes. The included articles are from the Perl Maven Tutorial and are part of the Perl Maven eBook.
The first page of my new OOP booklet.
Great CPAN modules released last week.
A couple of entries sneaked in by Gabor.
If you use LinkedIn I'd like to invite you to follow me on that platform or, if you like, send me a connect request. In that case, please include a note that you are a reader of the Perl Weekly newsletter.
Greenville, South Carolina, USA
You joined the Perl Weekly to get weekly e-mails about the Perl programming language and related topics.
Want to see more? See the archives of all the issues.
Not yet subscribed to the newsletter? Join us free of charge!
(C) Copyright Gabor Szabo
The articles are copyright the respective authors.
Published by Gabor Szabo on Monday 23 June 2025 06:20
Originally published at Perl Weekly 725
Hi there,
First of all, I'd like to apologize, I could not get back to every reader of the Perl Weekly who expressed their solidarity and asked if I am in any danger: Thank you for asking! My immediate family and myself are OK, but it is scary to be targeted by half-a-ton ballistic missiles.
A funny thing happened: someone called SophoDave asked if there are any Perl podcasts, just when the 2nd episode of The Underbar was published. One person recommended our newsletter, to which Olaf Alders suggested someone could read it aloud.
Dave Cross wrote about a Perl script Generating Content with ChatGPT. Which made me wonder: would it be possible to have some Perl script that would take the content of the Perl Weekly newsletter and, using some AI tool, generate a podcast out of it? Any volunteers?
I wish you a calm week!
--
Your editor: Gabor Szabo.
The next version of Perl is going to be v5.42.0. Or maybe 42.0? Listen to Perl leaders arguing about what version numbers represent, and what it means to change one.
In an excellent timing SophoDave asked: Are there any Perl related podcasts out there? Not seeing any on iTunes.
How Bartosz created a control center with Perl. discuss
Earlier this week, Dave read a post from someone who failed a job interview because they used a hash slice in some sample code and the interviewer didn't believe it would work.
Continuing the blog series on AWS encryption, this post focuses on Server-Side Encryption using Customer-Provided Keys.
The question was: how long would it take to merge the next big thing, multifactor authentication for PAUSE? Two years, three years, or maybe four years this time?
Using Perl.
An operator in programming is a symbol or keyword that performs a specific operation on one or more operands (values or variables). There are many types of operators, such as arithmetic operators (like +, -, *, /) and comparison operators (like ==, !=, <, >). In Perl, you can overload these operators for your own classes, allowing you to define custom behaviour when these operators are used with objects of that class.
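As a quick illustration of the idea (the Vector2D class below is made up for this example, not taken from the linked post), overloading + and stringification for a small class might look like this:
package Vector2D;
use overload
    '+'  => \&add,
    '""' => \&to_string;

sub new { my ($class, $x, $y) = @_; return bless { x => $x, y => $y }, $class }
sub add { my ($p, $q) = @_; return Vector2D->new($p->{x} + $q->{x}, $p->{y} + $q->{y}) }
sub to_string { my ($self) = @_; return "($self->{x}, $self->{y})" }

package main;
my $sum = Vector2D->new(1, 2) + Vector2D->new(3, 4);
print "$sum\n";    # prints (4, 6) via the overloaded stringification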
In programming file processing is a key skill to master. Files are essential for storing data, reading configurations, and logging information. A file handle is a reference to an open file, allowing you to read from or write to that file.
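As a minimal sketch of the idea (file names invented for illustration), opening one lexical file handle for reading and another for writing might look like this:
use strict;
use warnings;

open my $in,  '<', 'input.log'   or die "Cannot open input.log: $!";
open my $out, '>', 'summary.txt' or die "Cannot open summary.txt: $!";

while (my $line = <$in>) {
    chomp $line;                   # strip the trailing newline
    print {$out} "seen: $line\n";  # write a processed copy of each line
}

close $in;
close $out;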
The Weekly Challenge by Mohammad Sajid Anwar will help you step out of your comfort-zone. You can even win prize money of $50 by participating in the weekly challenge. We pick one champion at the end of the month from among all of the contributors during the month, thanks to the sponsor Lance Wicks.
Welcome to a new week with a couple of fun tasks "Day of the Year" and "Decompressed List". If you are new to the weekly challenge then why not join us and have fun every week. For more information, please read the FAQ.
Enjoy a quick recap of last week's contributions by Team PWC dealing with the "Consecutive One" and "Final Price" tasks in Perl and Raku. You will find plenty of solutions to keep you busy.
The post showcases a masterful blend of programming paradigms. The solutions are elevated by deep mathematical framing.
It approaches the challenges with outside-the-box thinking that yields surprisingly elegant solutions. The solutions exemplify Perl's philosophy of TIMTOWTDI.
Raku's expressive power showcased elegantly. Also demonstrates Raku's unique capabilities through idiomatic and nearly poetic solutions.
Demonstrate production-ready solutions with exceptional attention to defensive programming and comprehensive testing.
Unique mathematical insight with visual explanations. The post excels at breaking down problems using mathematical reasoning.
Unique approach of solving challenges in multiple languages offers rare insights. The side-by-side implementation helps readers think polyglot, understanding how algorithms translate across languages.
Concise and readable code with functional Perl. It is a great blend of brevity, functionality, and interactivity, making it both instructive and practical for Perl enthusiasts.
A detailed, step-by-step breakdown of his thought process for both tasks. Explanations are concise yet thorough, making it easy to follow his logic.
It stands out for its storytelling approach to technical challenges. The solution balances Perl’s classic style with contemporary readability.
Algorithmic purity with mathematical precision. Code clarity is exceptional with pedagogical value.
Rigorous problem analysis with detailed edge-case handling. The solutions prioritize clarity and maintainability.
Deep mathematical insight & algorithmic elegance. Code is compact but highly readable with minimalist syntax.
It is approachable and conversational, making it great for learners while still valuable for experienced developers. Also explains the challenge in simple terms before diving into solutions.
It demonstrates Perl at its most elegant and expressive. It achieves maximum density of Perl idioms without sacrificing readability
Achieves maximum effect with minimum code through Python's unique features. It proves good algorithms can be elegantly expressed in any language.
Great CPAN modules released last week.
A couple of entries sneaked in by Gabor.
A quick post encouraging people to use mdbook.
Greenville, South Carolina, USA
You joined the Perl Weekly to get weekly e-mails about the Perl programming language and related topics.
Want to see more? See the archives of all the issues.
Not yet subscribed to the newsletter? Join us free of charge!
(C) Copyright Gabor Szabo
The articles are copyright the respective authors.
Published by alh on Sunday 22 June 2025 17:38
Tony writes:

```
[Hours] [Activity]

2025/04/01 Tuesday
 0.22 #23151 check CI results, fix minitest and re-push
 1.77 #23160 try to decode how the NEED_ stuff works, try leont’s suggestion and test, push for CI
 0.82 #22125 check smoke results, rebase and push
 0.50 #21878 consider how to implement this

 3.84

2025/04/02 Wednesday
 0.23 #23075 rebase and squash some, push for CI
 0.98 test-dist-modules threaded testing: check CI results, remove 5.8, clean up commits, push for CI
 0.10 #23075 check CI results and apply to blead

 1.59

2025/04/03 Thursday
 0.37 #23151 check CI results, open PR 23171
 1.60 #17601 side-issue: check history, testing, find an unrelated problem, work on a fix, testing
 0.20 #17601 side-issue: push fix for CI, comment and mark

 2.17

2025/04/07 Monday
 0.15 #22120 follow-up
 1.57 #23151 add suggested change, testing and push
 0.62 #23172 review and comment
 0.20 #23177 review, research and apply to blead
 0.37 #17601 side-issue: check CI results, add perldelta, cleanup commit message, open PR 23178
 0.60 #23022 clean up, add perldelta, push for CI

 4.24

2025/04/08 Tuesday
 0.53 #17601 research, minor fix and comment
 0.08 #22125 fix test failure
 0.48 #17601 side-issue: testing, research and comment
 0.55 #16608 reproduce, code review

 3.26

2025/04/09 Wednesday
 1.23 #17601 side issue: add a panic message, research and comment
 2.40 #16608 research, try to reproduce some other cases, comment, work on fixes, tests, work class initfields similar bug
 1.83 #16608 fix an issue with smartmatch fix, work on initfields fix, testing, perldelta, push for CI, smoke-me
 0.33 #17601 test another build configuration, minor fix and push
 0.28 #23151 testing

 6.30

2025/04/10 Thursday
 0.32 #16608 fix a minor issue and re-push
 0.13 #23165 review updates and approve
 2.28 look into smoke test failures, ASAN detected leak from op/signatures, debugging, make #23187
 2.28 op/signatures leak: debugging, work it out (I think), work

 5.01

2025/06/14 Saturday
 3.45 #23022 re-check, minor re-work, testing, push

 3.80

2025/04/15 Tuesday
 1.15 #23187 consider re-work, minor fix, testing, perldelta, push for CI
 0.70 document that TARG isn’t pristine and the implications, open #23196
 0.60 #16608 check smoke results, debugging and fix, push for CI/smoke
 1.13 #22125 clean up commit history, testing, perldelta, more

 3.58

2025/04/16 Wednesday
 0.23 #23196 edits as suggested and push
 1.50 #23187 check CI results, investigate ASAN results, which appear unrelated, open PR 23203
 0.67 #23201 review, research a lot, approve
 0.20 #16608 check CI results, make PR 23204
 0.63 #1674 review history and research, comment since I’m

 3.23

2025/04/22 Tuesday
 0.17 #23207 review, research and approve
 0.92 #23208 review, testing and comment
 1.80 #23202 review, testing
 0.67 #23202 more review, testing
 0.37 #23202 more review, comments
 0.25 #23208 research and comment

 4.61

2025/04/23 Wednesday
 0.30 #23202 review responses
 0.80 #23172 review updates, approve
 0.22 #1674 research
 1.63 #1674 more research, minor change, testing, push for CI
 0.45 #3965 testing
 0.23 #3965 more testing, comment and mark “Closable?”
 0.10 #1674 review CI results and make PR 23219

 4.95

2025/04/24 Thursday
 0.22 #23216 review and approve
 0.08 #23217 review and approve
 0.08 #23220 review and approve
 1.10 #23215 testing, look if we can eliminate the conditional from cSVOPx_sv() on threads (we can’t directly, the non-pad sv is used at compile-time), approve
 0.35 #23208 review, research, comments
 1.27 #4106 research
 2.70 #4106 testing for potential bugs and misbehaviour, chainsaw for w32_fdpid and make it like everyone else,

 5.80

2025/04/28 Monday
 0.35 #20841 comment
 2.38 #22374 minor fixes, testing, force push to update, comments
 0.13 #23226 review and approve
 0.70 #23227 review, research, check build logs and comment

 4.01

2025/04/29 Tuesday
 0.42 #23228 check updates and approve
 0.63 #23227 testing and comment
 1.07 #23225 start review

 3.35

2025/04/30 Wednesday
 1.28 #23227 review, testing, research and approve with comment
 0.68 #4106 check results, look for existing tests that might test this, testing
 2.23 #4106 review history, work on a new test, testing, push for CI
 0.83 #23232 review docs, open Dual-Life/experimental#22 which

 5.02

Which I calculate is 64.76 hours.

Approximately 33 tickets were reviewed or worked on, and 2 patches were applied.
```
Published on Sunday 22 June 2025 16:45
The examples used here are from the weekly challenge problem statement and demonstrate the working solution.
You are given a date in the format YYYY-MM-DD. Write a script to find the day number of the year that the given date represents.
The core of the solution is contained in a main loop. The resulting code can be contained in a single file.
The answer is arrived at via a fairly straightforward calculation.
sub day_of_year {
    my ($date) = @_;
    my $day_of_year = 0;
    my ($year, $month, $day) = split /-/, $date;
    ⟨determine if this is a leap year 3 ⟩
    my @days_in_month = (31, $february_days, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31);
    $day_of_year += $days_in_month[$_] for (0 .. $month - 2);
    $day_of_year += $day;
    return $day_of_year;
}
◇
Let’s break the logic for computing a leap year into its own section. A leap year occurs every 4 years, except for years that are divisible by 100, unless they are also divisible by 400.
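The fragment body itself isn’t reproduced in this copy of the post; a reconstruction consistent with the rule just stated, and with the $february_days variable used in the listing above, would be:
my $february_days = 28;
$february_days = 29 if ($year % 4 == 0 && $year % 100 != 0) || $year % 400 == 0;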
Just to make sure things work as expected we’ll define a few short tests. The double chop is just a lazy way to make sure there aren’t any trailing commas in the output.
MAIN:{
    say day_of_year q/2025-02-02/;
    say day_of_year q/2025-04-10/;
    say day_of_year q/2025-09-07/;
}
◇
Fragment referenced in 1.
$ perl perl/ch-1.pl
33
100
250
You are given an array of positive integers having an even number of elements. Write a script to return the decompressed list. To decompress, pick an adjacent pair (i, j) and replace it with j, i times.
For fun let’s use recursion!
Sometimes when I write a recursive subroutine in Perl I use a reference variable to set the return value. Other times I just use an ordinary return. In some cases, for convenience, I’ll do this with two subroutines. One of these is a wrapper which calls the main recursion.
For this problem I’ll do something a little different. I’ll have one subroutine and for each recursive call I’ll add in an array reference to hold the accumulating return value.
Note that we take advantage of Perl’s automatic list flattening when pushing to the array reference holding the new list we are building.
sub decompress_list {
    my $r = shift @_;
    if (!ref($r) || ref($r) ne q/ARRAY/) {
        unshift @_, $r;
        $r = [];
    }
    unless (@_ == 0) {
        my $i = shift @_;
        my $j = shift @_;
        push @{$r}, ($j) x $i;
        decompress_list($r, @_);
    }
    else {
        return @{$r};
    }
}
◇
Fragment referenced in 5.
The main section is just some basic tests.
MAIN:{
    say join q/, /, decompress_list 1, 3, 2, 4;
    say join q/, /, decompress_list 1, 1, 2, 2;
    say join q/, /, decompress_list 3, 1, 3, 2;
}
◇
Fragment referenced in 5.
$ perl perl/ch-2.pl
3, 4, 4
1, 2, 2
1, 1, 1, 2, 2, 2
Published by Zilore Mumba on Sunday 22 June 2025 06:06
I have data file1, which starts from the beginning of the data recording. Data file2 contains data from later than the beginning of the record, but goes beyond the end of file1. I am trying to update file1 so that it contains the whole record, from the beginning in file1 to the end of file2.
My attempt so far is not successful. Below are parts of the sample files.
file1
'LIVING01' 2022 01 0.0 3.6 2.0 8.0 5.6 51.0 62.0 73.6 5.9 29.6 11.5 40.3 2.4 5.6 0.7 0.0 5.4 5.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 35.4 12.9 1.0 10.8 1.0 17.1
'LIVING01' 2022 05 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
'LIVING01' 2022 06 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 -99.0
'LIVING01' 2022 09 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 -99.0
'LIVING01' 2022 10 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
'LIVING01' 2023 02 8.3 0.0 0.0 3.0 11.7 0.0 0.0 0.0 1.9 0.0 0.0 0.0 2.8 1.2 0.0 3.9 32.3 72.8 14.1 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 -99.0 -99.0 -99.0
'LIVING01' 2023 06 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 -99.0
'LIVING01' 2023 08 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
'LIVING01' 2023 09 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 -99.0
'LIVING01' 2023 10 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.9 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 22.5 0.0
'LIVING01' 2023 11 0.0 0.0 0.0 0.0 0.0 3.3 0.0 0.0 0.0 0.0 0.0 0.0 0.0 27.9 0.0 0.0 0.0 0.0 0.0 0.5 3.0 0.8 0.0 2.5 11.0 1.0 0.0 0.0 0.0 0.5 -99.0
'LIVING01' 2023 12 0.0 0.0 0.0 0.0 0.0 0.0 4.1 0.0 0.5 0.0 0.0 0.0 5.5 9.1 1.5 0.0 0.0 0.0 0.0 0.0 0.0 4.6 5.9 0.0 3.2 57.0 44.1 0.0 0.0 0.0 0.0
file2
'LIVING01' 2023 6 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
'LIVING01' 2023 8 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
'LIVING01' 2023 9 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
'LIVING01' 2023 10 0 0 0 0 0 0 0 0 0 0 0 0 0.9 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 22.5 0
'LIVING01' 2023 11 0 0 0 0 0 3.3 0 0 0 0 0 0 0 27.9 0 0 0 0 0 0.5 3 0.8 0 2.5 11 1 0 0 0 0.5
'LIVING01' 2023 12 0 0 0 0 0 0 4.1 0 0.5 0 0 0 5.5 9.1 1.5 0 0 0 0 0 0 4.6 5.9 0 3.2 57 44.1 0 0 0 0
'LIVING01' 2024 1 2 0 0 0 13 0 0 0 0 0 7.4 9.9 3.4 1.1 22 6.3 5.1 36.3 0 1.1 0 0 0 0 0 0 0 0 0 0 0
'LIVING01' 2024 2 1.8 0 0 0 0 0 16.5 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 4.8 0 0 0
'LIVING01' 2024 3 0 0 0 0 0 1.9 5.9 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 14.6 0 1 0
'LIVING01' 2024 4 0 0 0 0 0 9 20.5 10.4 4.6 9 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
'LIVING01' 2024 5 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
'LIVING01' 2024 6 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
'LIVING01' 2024 7 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
'LIVING01' 2024 8 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
'LIVING01' 2024 9 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
'LIVING01' 2024 10 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
'LIVING01' 2024 11 0 0 0 0 0 0 5.4 0 0 0 2.5 0 6.6 0 18 6.2 0 0 0 0 0 4.5 0 0 2 0 0 0 0 0
'LIVING01' 2024 12 0 0 0 0 0 0 0.9 0 0 2 0 0 0 0 13 3 0 1.5 0 0 0 12.9 0 0 0 0.2 18.8 0 2 4.5 10
In my code below, I am able to write file1 to a temporary file, then I get the line number of the last line in file1. I then try to skip all lines in file2, up to the last line number in file1.
open(my $FILE1, "< file1") or die "open 'file1': $! ($^E)";
open(my $TEMPF, "> tempfile.txt") or die "open 'tempfile.txt': $! ($^E)";
while (my $line = readline $FILE1){
my($stn_name, $yr, $mn, $dat)=split(' ', $line, 4);
my @data=split(' ', $dat, 31);
my $ts = sprintf("%08s %04d %02d", $stn_name, $yr, $mn);
my $format = "%s".( " %6.1f" x 31 )."\n";
printf $TEMPF $format, $ts, @data;
printf $format, $ts, @data;
if(eof){
my $endpoint=$.;
}
}
close($FILE1);
print "$endpoint\n"; #Note, not recognised!!!
open(my $FILE2, "< file2") or die "open 'file2': $! ($^E)";
while (my $lines = <$FILE2>){
next unless($. <= $endpoint);
if($. > $endpoint){
goto label; #Read file2 to end line of file 1, then goto label
}
}
label:
while (my $nline = <$FILE2>){
next if($. <= $entpoint);
if($. > $entpoint){
.....
process the data;
printf $TEMPF $format, $ts, @data;
printf $format, $ts, @data;
}
}
close($FILE2);
close($TEMPF);
What can I try next?
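One possible direction (a sketch, assuming each station/year/month combination should appear once, with file2 winning for overlapping months and supplying the newer ones):
use strict;
use warnings;

my %record;    # "station year month" => formatted line

for my $file ('file1', 'file2') {
    open my $fh, '<', $file or die "open '$file': $! ($^E)";
    while (my $line = <$fh>) {
        my ($stn_name, $yr, $mn, @data) = split ' ', $line;
        next unless defined $mn;
        my $key = sprintf "%s %04d %02d", $stn_name, $yr, $mn;
        # file2 is read second, so its months replace or extend file1's
        $record{$key} = sprintf "%s" . (" %6.1f" x @data) . "\n", $key, @data;
    }
    close $fh;
}

open my $out, '>', 'tempfile.txt' or die "open 'tempfile.txt': $! ($^E)";
print {$out} $record{$_} for sort keys %record;
close $out;
Because the months are stored in a hash, there is no need to track where file1 ended; sorting the zero-padded keys restores chronological order per station.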
Each week Mohammad S. Anwar sends out The Weekly Challenge, a chance for all of us to come up with solutions to two weekly tasks. My solutions are written in Python first, and then converted to Perl. It's a great way for us all to practice some coding.
You are given a date in the format YYYY-MM-DD.
Write a script to find the day number of the year that the given date represents.
There are a couple of ways to solve this challenge. It's pretty much a no-brainer to use the last option, strftime with the %j format. In Python this returns a 3-digit zero-padded string, so I need to convert it to an integer.
I also check that the input is in the expected format.
def day_of_year(input_date: str) -> int:
if not re.match(r'^\d{4}-\d\d?-\d\d?$', input_date):
raise ValueError("Input date must be in 'YYYY-MM-DD' format")
year, month, day = map(int, input_date.split('-'))
return int((date(year, month, day).strftime('%j')))
Perl has the Date::Calc CPAN module, which has a Day_of_Year function built in.
sub main ($date) {
if ($date !~ /^[0-9]{4}-[0-9][0-9]?-[0-9][0-9]?$/) {
die "Usage: $0 YYYY-MM-DD\n";
}
say Day_of_Year(split /-/, $date);
}
It should also be pointed out that neither handles the situation that happened in 2011, when Samoa skipped December 30th entirely. I do suspect this is why they did it at the end of the calendar year.
$ ./ch-1.py 2025-02-02
33
$ ./ch-1.py 2025-04-10
100
$ ./ch-1.py 2025-09-07
250
You are given an array of positive integers having an even number of elements.
Write a script to return the decompressed list. To decompress, pick an adjacent pair (i, j) and replace it with j, i times.
This is relatively straightforward. For this task, I have a loop i that goes from 0 to the length of the list, incrementing by two each iteration. For each iteration I extend the result list (array in Perl) with value, count times.
Using variables like count and value instead of i and j makes it easier to understand what the variables are intended to be used for.
def decompressed_list(ints: list) -> list:
result = []
for i in range(0, len(ints), 2):
count = ints[i]
value = ints[i + 1]
result.extend([value] * count)
return result
The Perl solution is a transliteration of the Python code.
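A direct transliteration along those lines (a sketch, not necessarily the author's exact code) would be:
sub decompressed_list {
    my @ints = @_;
    my @result;
    for (my $i = 0; $i < @ints; $i += 2) {
        my ($count, $value) = @ints[$i, $i + 1];
        push @result, ($value) x $count;    # append value, count times
    }
    return @result;
}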
$ ./ch-2.py 1 3 2 4
[3, 4, 4]
$ ./ch-2.py 1 1 2 2
[1, 2, 2]
$ ./ch-2.py 3 1 3 2
[1, 1, 1, 2, 2, 2]
Published on Sunday 22 June 2025 00:00
Published by prz on Saturday 21 June 2025 22:13
Published by Ron Savage on Friday 20 June 2025 07:51
Remember! Click Continue Reading to see all the text.
I am selling my villa unit and downsizing, probably in a month or so.
There may be a period when I am off-line.
In Australia villa unit means (usually) a stand-alone building in a small block of units.
I have a 2-bedroom unit and am moving into a retirement (Yikes!) village, to a 1-bedroom unit.
There are various reasons, but one is that this month I turned 75, much to my amazement and horror.
I still live independently, drive, have 2 miniature dogs, manage my own medicine, etc. So - all good ATM.
And yes, I am still programming. I more-or-less monthly release https://savage.net.au/misc/Perl.Wiki.html,
my curated compendium of Perl modules, and I am slowly automating the creation of this wiki.
The next step will be to output the wiki as a jsTree (https://www.jstree.com/),
but moving - as you might know - consumes a lot of time.....
Published by Mohammad Sajid Anwar on Wednesday 18 June 2025 18:27
Quick refresher about Array and List in Perl.Please check out the link for more information: https://theweeklychallenge.org/blog/array-vs-list
Published by Paul Cochrane 🇪🇺 on Tuesday 17 June 2025 22:00
Printing statistics to the terminal or plotting data extracted from FIT files is all well and good. The problem is that the feedback loops are too long. Sometimes questions are better answered by playing with the data directly. Enter the Perl Data Language.
For more fine-grained analysis of our FIT file data, it’d be great to be able to investigate it interactively. Other languages such as Ruby, Raku and Python have a built-in REPL.1 Yet Perl doesn’t.2 But help is at hand! PDL (the Perl Data Language) is designed to be used interactively and thus has a REPL we can use to manipulate and investigate our activity data.3
Before we can use PDL, we’ll have to install it:
$ cpanm PDL
After it has finished installing (this can take a while), you’ll be able to start the perlDL shell with the pdl command:
perlDL shell v1.357
PDL comes with ABSOLUTELY NO WARRANTY. For details, see the file
'COPYING' in the PDL distribution. This is free software and you
are welcome to redistribute it under certain conditions, see
the same file for details.
ReadLines, NiceSlice, MultiLines enabled
Reading PDL/default.perldlrc...
Found docs database /home/vagrant/perl5/perlbrew/perls/perl-5.38.3/lib/site_perl/5.38.3/x86_64-linux/PDL/pdldoc.db
Type 'help' for online help
Type 'demo' for online demos
Loaded PDL v2.100 (supports bad values)
Note: AutoLoader not enabled ('use PDL::AutoLoader' recommended)
pdl>
To exit the pdl shell, enter Ctrl-D at the prompt and you’ll be returned to your terminal.
To manipulate the data in the pdl shell, we want to be able to call individual routines from the geo-fit-plot-data.pl script. This way we can use the arrays that some of the routines return to initialise PDL data objects.
It’s easier to manipulate the data if we get ourselves a bit more organised first.4 In other words, we need to extract the routines into a module, which will make calling the code we created earlier from within pdl much easier.
Before we create a module, we need to do some refactoring. One thing that’s been bothering me is the way the plot_activity_data()
subroutine also parses and manipulates date/time data. This routine should be focused on plotting data, not on massaging its requirements into the correct form. Munging the date/time data is something that should happen in its own routine. This way we encapsulate the concepts and abstract away the details. Another way of saying this is that the plotting routine shouldn’t “know” how to manipulate date/time information to do its job.
To this end, I’ve moved the time extraction code into a routine called get_time_data():
sub get_time_data {
my @activity_data = @_;
# get the epoch time for the first point in the time data
my @timestamps = map { $_->{'timestamp'} } @activity_data;
my $first_epoch_time = $date_parser->parse_datetime($timestamps[0])->epoch;
# convert timestamp data to elapsed minutes from start of activity
my @times = map {
my $dt = $date_parser->parse_datetime($_);
my $epoch_time = $dt->epoch;
my $elapsed_time = ($epoch_time - $first_epoch_time)/60;
$elapsed_time;
} @timestamps;
return @times;
}
The main change here in comparison to the previous version of the code is that we pass the activity data as an argument to get_time_data(), returning the @times array to the calling code.
The code creating the date string used in the plot title now also resides in its own function:
sub get_date {
my @activity_data = @_;
# determine date from timestamp data
my @timestamps = map { $_->{'timestamp'} } @activity_data;
my $dt = $date_parser->parse_datetime($timestamps[0]);
my $date = $dt->strftime("%Y-%m-%d");
return $date;
}
Where again, we’re passing the @activity_data array to the function. It then returns the $date string that we use in the plot title.
Both of these routines use the $date_parser object, which I’ve extracted into a constant in the main script scope:
our $date_parser = DateTime::Format::Strptime->new(
pattern => "%Y-%m-%dT%H:%M:%SZ",
time_zone => 'UTC',
);
I’ve also made it our so that both subroutines needing this information have access to it.
It’s time to make our module. I’m not going to create the full Perl module infrastructure here, as it’s not necessary for our current goal. I want to import a module called Geo::FIT::Utils and then access the functions that it provides.5 Thus, in an appropriate project directory, we need to create a file called lib/Geo/FIT/Utils.pm as well as its associated path:
$ mkdir -p lib/Geo/FIT
$ touch lib/Geo/FIT/Utils.pm
Opening the file in an editor and entering this stub module code:
1package Geo::FIT::Utils;
2
3use Exporter 5.57 'import';
4
5
6our @EXPORT_OK = qw(
7    extract_activity_data
8    show_activity_statistics
9    plot_activity_data
10    get_time_data
11    num_parts
12);
13
141;
we now have the scaffolding of a module that (at least, theoretically) exports the functionality we need.
Line 1 specifies the name of the module. Note that the module’s name must match its path on the filesystem, hence why we created the file Geo/FIT/Utils.pm.
We import the Exporter module (line 3) so that we can specify the functions to export. This is the @EXPORT_OK array’s purpose (lines 6-12).
Finally, we end the file on line 14 with the code 1;. This line is necessary so that importing the package (which in this case is also a module) returns a true value. The value 1 is synonymous with Boolean true in Perl, hence why it’s best practice to end module files with 1;.
Copying all the code except the main() routine from geo-fit-plot-data.pl into Utils.pm, we end up with this:
package Geo::FIT::Utils;
use strict;
use warnings;
use Exporter 5.57 'import';
use Geo::FIT;
use Scalar::Util qw(reftype);
use List::Util qw(max sum);
use Chart::Gnuplot;
use DateTime::Format::Strptime;
our $date_parser = DateTime::Format::Strptime->new(
pattern => "%Y-%m-%dT%H:%M:%SZ",
time_zone => 'UTC',
);
sub extract_activity_data {
my $fit = Geo::FIT->new();
$fit->file( "2025-05-08-07-58-33.fit" );
$fit->open or die $fit->error;
my $record_callback = sub {
my ($self, $descriptor, $values) = @_;
my @all_field_names = $self->fields_list($descriptor);
my %event_data;
for my $field_name (@all_field_names) {
my $field_value = $self->field_value($field_name, $descriptor, $values);
if ($field_value =~ /[a-zA-Z]/) {
$event_data{$field_name} = $field_value;
}
}
return \%event_data;
};
$fit->data_message_callback_by_name('record', $record_callback ) or die $fit->error;
my @header_things = $fit->fetch_header;
my $event_data;
my @activity_data;
do {
$event_data = $fit->fetch;
my $reftype = reftype $event_data;
if (defined $reftype && $reftype eq 'HASH' && defined $event_data->{'timestamp'}) {
push @activity_data, $event_data;
}
} while ( $event_data );
$fit->close;
return @activity_data;
}
# extract and return the numerical parts of an array of FIT data values
sub num_parts {
my $field_name = shift;
my @activity_data = @_;
return map { (split ' ', $_->{$field_name})[0] } @activity_data;
}
# return the average of an array of numbers
sub avg {
my @array = @_;
return (sum @array) / (scalar @array);
}
sub show_activity_statistics {
my @activity_data = @_;
print "Found ", scalar @activity_data, " entries in FIT file\n";
my $available_fields = join ", ", sort keys %{$activity_data[0]};
print "Available fields: $available_fields\n";
my $total_distance_m = (split ' ', ${$activity_data[-1]}{'distance'})[0];
my $total_distance = $total_distance_m/1000;
print "Total distance: $total_distance km\n";
my @speeds = num_parts('speed', @activity_data);
my $maximum_speed = max @speeds;
my $maximum_speed_km = $maximum_speed*3.6;
print "Maximum speed: $maximum_speed m/s = $maximum_speed_km km/h\n";
my $average_speed = avg(@speeds);
my $average_speed_km = sprintf("%0.2f", $average_speed*3.6);
$average_speed = sprintf("%0.2f", $average_speed);
print "Average speed: $average_speed m/s = $average_speed_km km/h\n";
my @powers = num_parts('power', @activity_data);
my $maximum_power = max @powers;
print "Maximum power: $maximum_power W\n";
my $average_power = avg(@powers);
$average_power = sprintf("%0.2f", $average_power);
print "Average power: $average_power W\n";
my @heart_rates = num_parts('heart_rate', @activity_data);
my $maximum_heart_rate = max @heart_rates;
print "Maximum heart rate: $maximum_heart_rate bpm\n";
my $average_heart_rate = avg(@heart_rates);
$average_heart_rate = sprintf("%0.2f", $average_heart_rate);
print "Average heart rate: $average_heart_rate bpm\n";
}
sub plot_activity_data {
my @activity_data = @_;
# extract data to plot from full activity data
my @times = get_time_data(@activity_data);
my @heart_rates = num_parts('heart_rate', @activity_data);
my @powers = num_parts('power', @activity_data);
# plot data
my $date = get_date(@activity_data);
my $chart = Chart::Gnuplot->new(
output => "watopia-figure-8-heart-rate-and-power.png",
title => "Figure 8 in Watopia on $date: heart rate and power over time",
xlabel => "Elapsed time (min)",
ylabel => "Heart rate (bpm)",
terminal => "png size 1024, 768",
xtics => {
incr => 5,
},
ytics => {
mirror => "off",
},
y2label => 'Power (W)',
y2range => [0, 1100],
y2tics => {
incr => 100,
},
);
my $heart_rate_ds = Chart::Gnuplot::DataSet->new(
xdata => \@times,
ydata => \@heart_rates,
style => "lines",
);
my $power_ds = Chart::Gnuplot::DataSet->new(
xdata => \@times,
ydata => \@powers,
style => "lines",
axes => "x1y2",
);
$chart->plot2d($power_ds, $heart_rate_ds);
}
sub get_time_data {
my @activity_data = @_;
# get the epoch time for the first point in the time data
my @timestamps = map { $_->{'timestamp'} } @activity_data;
my $first_epoch_time = $date_parser->parse_datetime($timestamps[0])->epoch;
# convert timestamp data to elapsed minutes from start of activity
my @times = map {
my $dt = $date_parser->parse_datetime($_);
my $epoch_time = $dt->epoch;
my $elapsed_time = ($epoch_time - $first_epoch_time)/60;
$elapsed_time;
} @timestamps;
return @times;
}
sub get_date {
my @activity_data = @_;
# determine date from timestamp data
my @timestamps = map { $_->{'timestamp'} } @activity_data;
my $dt = $date_parser->parse_datetime($timestamps[0]);
my $date = $dt->strftime("%Y-%m-%d");
return $date;
}
our @EXPORT_OK = qw(
extract_activity_data
show_activity_statistics
plot_activity_data
get_time_data
num_parts
);
1;
… which is what we had before, but put into a nice package for easier use.
One upside to having put all this code into a module is that the geo-fit-plot-data.pl script is now much simpler:
use strict;
use warnings;
use Geo::FIT::Utils qw(
extract_activity_data
show_activity_statistics
plot_activity_data
);
sub main {
my @activity_data = extract_activity_data();
show_activity_statistics(@activity_data);
plot_activity_data(@activity_data);
}
main();
We’re now ready to investigate our power and heart rate data interactively!
Start pdl and enter use lib 'lib' at the pdl> prompt so that it can find our new module:6
$ pdl
perlDL shell v1.357
PDL comes with ABSOLUTELY NO WARRANTY. For details, see the file
'COPYING' in the PDL distribution. This is free software and you
are welcome to redistribute it under certain conditions, see
the same file for details.
ReadLines, NiceSlice, MultiLines enabled
Reading PDL/default.perldlrc...
Found docs database /home/vagrant/perl5/perlbrew/perls/perl-5.38.3/lib/site_perl/5.38.3/x86_64-linux/PDL/pdldoc.db
Type 'help' for online help
Type 'demo' for online demos
Loaded PDL v2.100 (supports bad values)
Note: AutoLoader not enabled ('use PDL::AutoLoader' recommended)
pdl> use lib 'lib'
Now import the functions we wish to use:
pdl> use Geo::FIT::Utils qw(extract_activity_data get_time_data num_parts)
Since we need the activity data from the FIT file to pass to the other routines, we grab it and put it into a variable:
pdl> @activity_data = extract_activity_data
We also need to load the time data:
pdl> @times = get_time_data(@activity_data)
which we can then read into a PDL array:
pdl> $time = pdl \@times
With the time data in a PDL array, we can manipulate it more easily. For instance, we can display elements of the array with the PDL print
statement in combination with the slice()
method. The following code shows the last five elements of the $time
array:
pdl> print $time->slice("-1:-5")
[54.5333333333333 54.5166666666667 54.5 54.4833333333333 54.4666666666667]
Loading power output and heart rate data into PDL arrays works similarly:
pdl> @powers = num_parts('power', @activity_data)
pdl> $power = pdl \@powers
pdl> @heart_rates = num_parts('heart_rate', @activity_data)
pdl> $heart_rate = pdl \@heart_rates
In the previous article, we wanted to know what the maximum power was for the second sprint. Here’s the graph again for context:
Eyeballing the graph from above, we can see that the second sprint occurred between approximately 47 and 48 minutes elapsed time. We know that the arrays of time and power data all have the same length. Thus, if we find out the indices of the $time
array between these times, we can use them to select the corresponding power data. To get array indices for known data values, we use the PDL which
command:
pdl> $indices = which(47 < $time & $time < 48)
pdl> print $indices
[2821 2822 2823 2824 2825 2826 2827 2828 2829 2830 2831 2832 2833 2834 2835
2836 2837 2838 2839 2840 2841 2842 2843 2844 2845 2846 2847 2848 2849 2850
2851 2852 2853 2854 2855 2856 2857 2858 2859 2860 2861 2862 2863 2864 2865
2866 2867 2868 2869 2870 2871 2872 2873 2874 2875 2876 2877 2878 2879]
We can check that we’ve got the correct range of time values by passing the $indices
array as a slice of the $time
array:
pdl> print $time($indices)
[47.0166666666667 47.0333333333333 47.05 47.0666666666667 47.0833333333333
47.1 47.1166666666667 47.1333333333333 47.15 47.1666666666667
47.1833333333333 47.2 47.2166666666667 47.2333333333333 47.25
47.2666666666667 47.2833333333333 47.3 47.3166666666667 47.3333333333333
47.35 47.3666666666667 47.3833333333333 47.4 47.4166666666667
47.4333333333333 47.45 47.4666666666667 47.4833333333333 47.5
47.5166666666667 47.5333333333333 47.55 47.5666666666667 47.5833333333333
47.6 47.6166666666667 47.6333333333333 47.65 47.6666666666667
47.6833333333333 47.7 47.7166666666667 47.7333333333333 47.75
47.7666666666667 47.7833333333333 47.8 47.8166666666667 47.8333333333333
47.85 47.8666666666667 47.8833333333333 47.9 47.9166666666667
47.9333333333333 47.95 47.9666666666667 47.9833333333333]
The time values lie between 47 and 48, so we can conclude that we’ve selected the correct indices.
Note that we have to use the bitwise logical AND operator here because it operates on an element-by-element basis across the array.
Selecting $power
array values at these indices is as simple as passing the $indices
array as a slice:
pdl> print $power($indices)
[229 231 232 218 210 204 255 252 286 241 231 237 260 256 287 299 318 337 305
276 320 289 280 301 320 303 395 266 302 341 299 287 309 279 294 284 266 281
367 497 578 512 762 932 907 809 821 847 789 740 657 649 722 715 669 657 705
643 647]
Using the max()
method on this output gives us the maximum power:
pdl> print $power($indices)->max
932
In other words, the maximum power for the second sprint was 932 W. Not as good as the first sprint (which achieved 1023 W), but I was getting tired by this stage.
The same procedure allows us to find the maximum power for the first sprint with PDL. Again, eyeballing the graph above, we can see that the peak for the first sprint occurred between 24 and 26 minutes. Constructing the query in PDL, we have
pdl> print $power(which(24 < $time & $time < 26))->max
1023
which gives the maximum power value we expect.
We can also find out the maximum heart rate values around these times. E.g. for the first sprint:
pdl> print $heart_rate(which(24 < $time & $time < 26))->max
157
in other words, 157 bpm. For the second sprint, we have:
pdl> print $heart_rate(which(47 < $time & $time < 49))->max
165
i.e. 165 bpm, which matches the value that we found earlier. Note that I broadened the range of times to search over heart rate data here because its peak occurred a bit after the power peak for the second sprint.
Where to from here? Well, we could extend this code to handle processing multiple FIT files. This would allow us to find trends over many activities and longer periods. Perhaps there are other data sources that one could combine with longer trends. For instance, if one has access to weight data over time, then it’d be possible to work out things like power-to-weight ratios. Maybe looking at power and heart rate trends over a longer time can identify things such as overtraining. I’m not a sport scientist, so I don’t know how to go about that, yet it’s a possibility. Since we’ve got fine-grained, per-ride data, if we can combine this with longer-term analysis, there are probably many more interesting tidbits hiding in there that we can look at and think about.
One thing I haven’t been able to work out is where the calorie information is. As far as I can tell, Zwift calculates how many calories were burned during a given ride. Also, if one uploads the FIT file to a service such as Strava, then it too shows calories burned and the value is the same. This would imply that Strava is only displaying a value stored in the FIT file. So where is the calorie value in the FIT data? I’ve not been able to find it in the data messages that Geo::FIT
reads, so I’ve no idea what’s going on there.
What have we learned? We’ve found out how to read, analyse and plot data from Garmin FIT files all by using Perl modules. Also, we’ve learned how to investigate the data interactively by using the PDL shell. Cool!
One main takeaway that might not be obvious is that you don’t really need online services such as Strava. You should now have the tools to process, analyse and visualise data from your own FIT files. With Geo::FIT, Chart::Gnuplot, and a bit of programming, you can glue together the components to provide much of the same (and in some cases, more) functionality yourself.
I wish you lots of fun when playing around with FIT data!
REPL stands for read-eval-print loop and is an environment where one can interactively enter programming language commands and manipulate data. ↩
It is, however, possible to (ab)use the Perl debugger and use it as a kind of REPL. Enter perl -de0
and you’re in a Perl environment much like REPLs in other languages. ↩
Many thanks to Harald Jörg for pointing this out to me at the recent German Perl and Raku Workshop. ↩
This is an application of “first make the change easy, then make the easy change” (paraphrasing Kent Beck). An important point often overlooked in this quote is that making the change easy can be hard. ↩
Not a particularly imaginative name, I know. ↩
The documented way to add a path to @INC
in pdl
is via the -Ilib
command line option. Unfortunately, this didn’t work in my test environment: the local lib/
path wasn’t added to @INC
and hence using the Geo::FIT::Utils
module failed with the error that it couldn’t be located. ↩
Published by alh on Tuesday 17 June 2025 16:07
Paul writes:
As earlier reported, I managed to make some progress on the faster-signatures work, as well as some other things, including the ^^= operator.
Total: 9 hours
Published by alh on Tuesday 17 June 2025 15:57
Dave writes:
A bit of a quiet month.
I checked blead for any performance regressions compared with 5.40.0, using Porting/bench.pl. I found only one significant one: UTF8 string literals were inadvertently no longer being created Copy-on-Write.
I created a PR which improves how OPs are dumped on threaded builds. This will make certain types of debugging easier in the future.
Fixed a bug.
Tweaked my ParseXS AST PR.
Summary:
Total:
Published on Tuesday 17 June 2025 00:00
Published by Steve Waring on Monday 16 June 2025 15:41
This statement:
open(FIND, "/usr/bin/find /home/steve -type f -print |");
gives:
Insecure $ENV{PATH} while running with -T switch
I got around it by setting
$ENV{PATH} = '/usr/bin';
but I don't see why that is needed.
Published by Joe Casadonte on Monday 16 June 2025 14:10
I use Perlbrew to manage local Perl binary installs. I'd like to clone the current install so that I can test out a new CPAN module, and then revert if I don't like the module. I'd rather not wait the 20+ minutes it would take to create a new Perlbrew install and replicate the current module downloads I already have.
The only thing I can think of to do is to use Git to manage the Perlbrew install, making sure the repo is clean before I do my test, and use Git to revert the Perlbrew install if I don't want to keep the new CPAN module (or commit the changes if I do like it). Is there a better way to do it?
Earlier this week, I read a post from someone who failed a job interview because they used a hash slice in some sample code and the interviewer didn’t believe it would work.
That’s not just wrong — it’s a teachable moment. Perl has several kinds of slices, and they’re all powerful tools for writing expressive, concise, idiomatic code. If you’re not familiar with them, you’re missing out on one of Perl’s secret superpowers.
In this post, I’ll walk through all the main types of slices in Perl — from the basics to the modern conveniences added in recent versions — using a consistent, real-world-ish example. Whether you’re new to slices or already slinging %hash{...}
like a pro, I hope you’ll find something useful here.
Let’s imagine you’re writing code to manage employees in a company. You’ve got an array of employee names and a hash of employee details.
my @employees = qw(alice bob carol dave eve);

my %details = (
    alice => 'Engineering',
    bob   => 'Marketing',
    carol => 'HR',
    dave  => 'Engineering',
    eve   => 'Sales',
);
We’ll use these throughout to demonstrate each kind of slice.
List slices are slices from a literal list. They let you pick multiple values from a list in a single operation:
my @subset = (qw(alice bob carol dave eve))[1, 3]; # @subset = ('bob', 'dave')
You can also destructure directly:
my ($employee1, $employee2) = (qw(alice bob carol))[0, 2]; # $employee1 = 'alice', $employee2 = 'carol'
Simple, readable, and no loop required.
Array slices are just like list slices, but from an array variable:
my @subset = @employees[0, 2, 4]; # @subset = ('alice', 'carol', 'eve')
You can also assign into an array slice to update multiple elements:
@employees[1, 3] = ('beatrice', 'daniel'); # @employees = ('alice', 'beatrice', 'carol', 'daniel', 'eve')
Handy for bulk updates without writing explicit loops.
This is where some people start to raise eyebrows — but hash slices are perfectly valid Perl and incredibly useful.
Let’s grab departments for a few employees:
my @departments = @details{'alice', 'carol', 'eve'}; # @departments = ('Engineering', 'HR', 'Sales')
The @
sigil here indicates that we’re asking for a list of values, even though %details
is a hash.
You can assign into a hash slice just as easily:
@details{'bob', 'carol'} = ('Support', 'Legal');
This kind of bulk update is especially useful when processing structured data or transforming API responses.
Starting in Perl 5.20, you can use %array[...]
to return index/value pairs — a very elegant way to extract and preserve positions in a single step.
my @indexed = %employees[1, 3]; # @indexed = (1 => 'bob', 3 => 'dave')
You get a flat list of index/value pairs. This is particularly helpful when mapping or reordering data based on array positions.
You can even delete from an array this way:
my @removed = delete %employees[0, 4]; # @removed = (0 => 'alice', 4 => 'eve')
And afterwards you’ll have this:
# @employees = (undef, 'bob', 'carol', 'dave', undef)
The final type of slice — also added in Perl 5.20 — is the %hash{...}
key/value slice. This returns a flat list of key/value pairs, perfect for passing to functions that expect key/value lists.
my @kv = %details{'alice', 'dave'}; # @kv = ('alice', 'Engineering', 'dave', 'Engineering')
You can construct a new hash from this easily:
my %engineering = (%details{'alice', 'dave'});
This avoids intermediate looping and makes your code clear and declarative.
Type | Syntax | Returns | Added in
---|---|---|---
List slice | (list)[@indices] | Values | Ancient
Array slice | @array[@indices] | Values | Ancient
Hash slice | @hash{@keys} | Values | Ancient
Index/value array slice | %array[@indices] | Index-value pairs | Perl 5.20
Key/value hash slice | %hash{@keys} | Key-value pairs | Perl 5.20
If someone tells you that @hash{...}
or %array[...]
doesn’t work — they’re either out of date or mistaken. These forms are standard, powerful, and idiomatic Perl.
Slices make your code cleaner, clearer, and more concise. They let you express what you want directly, without boilerplate. And yes — they’re perfectly interview-appropriate.
So next time you’re reaching for a loop to pluck a few values from a hash or an array, pause and ask: could this be a slice?
If the answer’s yes — go ahead and slice away.
The post A Slice of Perl first appeared on Perl Hacks.
Published by prz on Saturday 14 June 2025 23:27
Published on Thursday 12 June 2025 22:13
The examples used here are from the weekly challenge problem statement and demonstrate the working solution.
You are given a binary array containing only 0 or/and 1. Write a script to find out the maximum consecutive 1 in the given array.
The core of the solution is contained in a main loop. The resulting code can be contained in a single file.
We’ll use a recursive procedure, which we’ll call from a subroutine which sets up some variables. We’ll pass scalar references to a recursive subroutine. When the recursion completes the $max_consecutive variable will hold the final answer.
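That set-up subroutine isn’t shown in the fragments below; here’s a minimal sketch of what it might look like, inferred from how the tests call consecutive_one and from the references that consecutive_one_r expects:
sub consecutive_one{
    my @i = @_;
    # track the current run of 1s and the best run seen so far
    my($consecutive, $max_consecutive) = (0, 0);
    consecutive_one_r(\@i, \$consecutive, \$max_consecutive);
    return $max_consecutive;
}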
Now, let’s define our recursion. We’ll terminate the recursion when we’ve exhausted the input array.
sub consecutive_one_r{
    my($i, $consecutive, $max_consecutive) = @_;
    if(@{$i} == 0){
        # input exhausted: capture a run that reaches the start of the array
        $$max_consecutive = $$consecutive if $$consecutive > $$max_consecutive;
        return;
    }
    my $x = pop @{$i};
    if($x == 0){
        $$max_consecutive = $$consecutive if $$consecutive > $$max_consecutive;
        $$consecutive = 0;
    }
    if($x == 1){
        $$consecutive++;
    }
    consecutive_one_r($i, $consecutive, $max_consecutive);
}
◇
Just to make sure things work as expected we’ll define a few short tests.
MAIN:{
say consecutive_one(0, 1, 1, 0, 1, 1, 1);
say consecutive_one(0, 0, 0, 0);
say consecutive_one(1, 0, 1, 0, 1, 1);
}
◇
Fragment referenced in 1.
$ perl perl/ch-1.pl
3
0
2
You are given an array of item prices. Write a script to find out the final price of each items in the given array. There is a special discount scheme going on. If there’s an item with a lower or equal price later in the list, you get a discount equal to that later price (the first one you find in order).
Hey, let’s use recursion again for this too!
The main section is just some basic tests.
MAIN:{
say join q/, /, calculate_lowest_prices 8, 4, 6, 2, 3;
say join q/, /, calculate_lowest_prices 1, 2, 3, 4, 5;
say join q/, /, calculate_lowest_prices 7, 1, 1, 5;
}
◇
Fragment referenced in 5.
First, let’s introduce a recursive subroutine that scans ahead and finds the next lowest price in the list. As in part one we’ll use a scalar reference.
With that subroutine defined we can use it to solve the main task at hand.
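Those two fragments aren’t reproduced in this excerpt. As a rough sketch of the same task — iterative rather than recursive, so it differs from the approach described above — assuming the calculate_lowest_prices name used in the tests:
sub calculate_lowest_prices{
    my @prices = @_;
    my @final;
    for my $i (0 .. $#prices){
        # the discount is the first later price that is lower or equal
        my $discount = 0;
        for my $j ($i + 1 .. $#prices){
            if($prices[$j] <= $prices[$i]){
                $discount = $prices[$j];
                last;
            }
        }
        push @final, $prices[$i] - $discount;
    }
    return @final;
}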
$ perl perl/ch-2.pl
4, 2, 4, 2, 3
1, 2, 3, 4, 5
6, 0, 1, 5
Back in January, I wrote a blog post about adding JSON-LD to your web pages to make it easier for Google to understand what they were about. The example I used was my ReadABooker site, which encourages people to read more Booker Prize shortlisted novels (and to do so by buying them using my Amazon Associate links).
I’m slightly sad to report that in the five months since I implemented that change, visits to the website have remained pretty much static and I have yet to make my fortune from Amazon kickbacks. But that’s ok, we just use it as an excuse to learn more about SEO and to apply more tweaks to the website.
I’ve been using the most excellent Ahrefs site to get information about how good the on-page SEO is for many of my sites. Every couple of weeks, Ahrefs crawls the site and will give me a list of suggestions of things I can improve. And for a long time, I had been putting off dealing with one of the biggest issues – because it seemed so difficult.
The site didn’t have enough text on it. You could get lists of Booker years, authors and books. And, eventually, you’d end up on a book page where, hopefully, you’d be tempted to buy a book. But the book pages were pretty bare – just the title, author, year they were short-listed and an image of the cover. Oh, and the all-important “Buy from Amazon” button. Ahrefs was insistent that I needed more text (at least a hundred words) on a page in order for Google to take an interest in it. And given that my database of Booker books included hundreds of books by hundreds of authors, that seemed like a big job to take on.
But, a few days ago, I saw a solution to that problem – I could ask ChatGPT for the text.
I wrote a blog post in April about generating a daily-updating website using ChatGPT. This would be similar, but instead of writing the text directly to a Jekyll website, I’d write it to the database and add it to the templates that generate the website.
Adapting the code was very quick. Here’s the finished version for the book blurbs.
#!/usr/bin/env perl

use strict;
use warnings;
use builtin qw[trim];
use feature 'say';

use OpenAPI::Client::OpenAI;
use Time::Piece;
use Encode qw[encode];

use Booker::Schema;

my $sch = Booker::Schema->get_schema;

my $count = 0;
my $books = $sch->resultset('Book');

while ($count < 20 and my $book = $books->next) {
  next if defined $book->blurb;
  ++$count;
  my $blurb = describe_title($book);
  $book->update({ blurb => $blurb });
}

sub describe_title {
  my ($book) = @_;
  my ($title, $author) = ($book->title, $book->author->name);

  my $debug = 1;

  my $api_key = $ENV{"OPENAI_API_KEY"} or die "OPENAI_API_KEY is not set\n";
  my $client = OpenAPI::Client::OpenAI->new;

  my $prompt = join " ",
    'Produce a 100-200 word description for the book',
    "'$title' by $author",
    'Do not mention the fact that the book was short-listed for (or won)',
    'the Booker Prize';

  my $res = $client->createChatCompletion({
    body => {
      model => 'gpt-4o',
      # model => 'gpt-4.1-nano',
      messages => [
        { role => 'system', content => 'You are someone who knows a lot about popular literature.' },
        { role => 'user', content => $prompt },
      ],
      temperature => 1.0,
    },
  });

  my $text = $res->res->json->{choices}[0]{message}{content};
  $text = encode('UTF-8', $text);
  say $text if $debug;

  return $text;
}
There are a couple of points to note:
I then produced a similar program that did the same thing for authors. It’s similar enough that the next time I need something like this, I’ll spend some time turning it into a generic program.
I then added the new database fields to the book and author templates and re-published the site. You can see the results in, for example, the pages for Salman Rushdie and Midnight’s Children.
I had one more slight concern going into this project. I pay for access to the ChatGPT API. I usually have about $10 in my pre-paid account and I really had no idea how much this was going to cost me. I needn’t have worried. Here’s a graph showing the bump in my API usage on the day I ran the code for all books and authors:
But you can also see that my total costs for the month so far are $0.01!
So, all-in-all, I call that a success and I’ll be using similar techniques to generate content for some other websites.
The post Generating Content with ChatGPT first appeared on Perl Hacks.
Published on Sunday 08 June 2025 12:36
The examples used here are from the weekly challenge problem statement and demonstrate the working solution.
You are given an array of integers and two integers $r and $c. Write a script to create two dimension array having $r rows and $c columns using the given array.
The core of the solution is contained in a main loop. The resulting code can be contained in a single file.
sub create_array{
my($i, $r, $c) = @_;
my @a = ();
for (0 .. $r - 1){
my $row = [];
for (0 .. $c - 1){
push @{$row}, shift @{$i};
}
push @a, $row;
}
return @a;
}
◇
Fragment referenced in 1.
Just to make sure things work as expected we’ll define a few short tests. The double chop is just a lazy way to make sure there aren’t any trailing commas in the output.
MAIN:{
my $s = q//;
$s .= q/(/;
do{
$s.= (q/[/ . join(q/, /, @{$_}) . q/], /);
} for create_array [1, 2, 3, 4], 2, 2;
chop $s;
chop $s;
$s .= q/)/;
say $s;
$s = q//;
$s .= q/(/;
do{
$s.= (q/[/ . join(q/, /, @{$_}) . q/], /);
} for create_array [1, 2, 3], 1, 3;
chop $s;
chop $s;
$s .= q/)/;
say $s;
$s = q//;
$s .= q/(/;
do{
$s.= (q/[/ . join(q/, /, @{$_}) . q/], /);
} for create_array [1, 2, 3, 4], 4, 1;
chop $s;
chop $s;
$s .= q/)/;
say $s;
}
◇
Fragment referenced in 1.
$ perl perl/ch-1.pl
([1, 2], [3, 4])
([1, 2, 3])
([1], [2], [3], [4])
You are given an array of integers. Write a script to return the sum of total XOR for every subset of given array.
This is another short one, but with a slightly more involved solution. We are going to compute the Power Set (set of all subsets) of the given array of integers and then for each of these sub-arrays compute and sum the XOR results.
The main section is just some basic tests.
MAIN:{
say calculate_total_xor 1, 3;
say calculate_total_xor 5, 1, 6;
say calculate_total_xor 3, 4, 5, 6, 7, 8;
}
◇
Fragment referenced in 4.
sub calculate_total_xor{
my $total = 0;
for my $a (power_set @_){
my $t = 0;
$t = eval join q/ ^ /, ($t, @{$a});
$total += $t;
}
return $total;
}
◇
Fragment referenced in 4.
The Power Set can be computed by using a binary counter. Let’s say we have N elements in the set. We count from 0 to 2**N - 1, and at each iteration we compose a subarray by including the ith element from the original array if the ith bit of the counter is set. Actually, we aren’t going to start at 0, because that would give the empty set, which we want to exclude for the purposes of the later XOR computation.
sub power_set{
my @a = ();
for my $i (1 .. 2 ** @_- 1){
my @digits = ();
for my $j (0 .. @_ - 1){
push @digits, $_[$j] if 1 == ($i >> $j & 1);
}
push @a, \@digits;
}
return @a;
}
◇
Fragment referenced in 4.
$ perl perl/ch-2.pl
6
28
480
Power Set Defined
Power Set Calculation (C++) from TWC 141
The Weekly Challenge 324
Generated Code
Published by prz on Saturday 07 June 2025 22:56
This is the weekly favourites list of CPAN distributions. Votes count: 22
This week there isn't any remarkable distribution
Build date: 2025/06/07 20:53:44 GMT
Clicked for first time:
Increasing its reputation:
Published by Robert Rothenberg on Friday 06 June 2025 15:00
The 2-argument open function is insecure, because the filename can include the mode. If it is not properly validated, then files can be modified, truncated or in the case of a pipe character, run an external command.
$file = "| echo Aha";
open my $fh, $file; # <-- THIS IS BAD
This will execute the command embedded in $file.
Even when the filename is generated by your code, you can run into unexpected edge cases. For example, in a Unix shell run the command
touch '| echo Aha'
and in the same directory run the script
opendir( my $dh, ".");
while ( my $file = readdir $dh ) {
next if -d $file;
open my $fh, $file; # <-- THIS IS BAD
close $fh;
}
This is more subtle, and will execute the command embedded in that filename.
It is the same bug in File::Find::Rule that became CVE-2011-10007. (If you haven’t already upgraded File::Find::Rule to version 0.35 or later, please do so. That module has more than 1,700 direct or indirect dependents.)
The SEI CERT Perl Coding Standard recommends against using the two-argument form of open().
The fix is simply to use a 3-argument form, where the second argument is the mode and the third is the filename:
open my $fh, '<', $file;
The 3-argument open has been supported since Perl v5.6.0, so there is no worry about supporting older versions of Perl.
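Applied to the readdir example above, the loop becomes safe, because the filename can no longer smuggle in a mode or a pipe:
opendir( my $dh, ".");
while ( my $file = readdir $dh ) {
    next if -d $file;
    # the explicit '<' mode means the filename is only ever a filename
    open my $fh, '<', $file or warn "Cannot open $file: $!";
    close $fh if $fh;
}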
You can identify this issue in your code using the Perl::Critic ProhibitTwoArgOpen policy. There is a similar policy in Perl::Lint.
I seldom release modules to CPAN; mainly because
there’s so much great stuff there already. An answer on StackOverflow
about pretty printing DBIx::Class
result
sets got me thinking. I then
climbed onto the shoulders of several giants to create a wee module which
does just that. Introducing
DBIx::Class::ResultSet::PrettyPrint
!
Strangely enough, I’d released
DBIx::Class::ResultSet::PrettyPrint
in 2024 but had never gotten around to mentioning it anywhere. This post
rectifies that omission, gives some background about the module, and
discusses a small usage example.
One could say that this is a delta-epsilon1 module in that it
doesn’t extend things very much. Although it doesn’t constitute a large
change, it does make printing DBIx::Class
result sets easier. It stands
upon the shoulders of several giants, so all I can claim is to have bundled
the idea into a module.
The original impetus for DBIx::Class::ResultSet::PrettyPrint
came from
wanting to pretty print result sets in a Perl project I’ve been working
on.2 I find that by seeing the data within a result set, I can get
a feeling for what the data looks like and what kinds of information it
contains. Searching for a pretty printing module, I stumbled across an
answer on StackOverflow about pretty printing DBIx::Class
result
sets. I remember thinking
that the proposed solution looked nice and I used the pattern a couple of
times in my work. I eventually realised that the approach would be easier
to use as a module. Since then, I’ve found it handy as a way to get an idea
of the shape of the data that I’m playing with.
I made some small changes to the solution proposed on StackOverflow. For
instance, it recommended using
Text::Table
, but I found the table
output generated by
Text::Table::Tiny
nicer.
This is why DBIx::Class::ResultSet::PrettyPrint
uses Text::Table::Tiny
to generate tables. For instance, the output has +
symbols at the table
cell corners, which is reminiscent of how Postgres displays tables within
psql. This I found to be a nice touch.
Of course, if one has large database tables with many columns and/or rows, this module might not be so useful. Yet, since it operates on result sets, one can create a result set with a subset of a given table and then pretty print that.
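For example — a small illustration of my own, assuming the books schema set up later in this post — one might restrict the result set before printing it:
my $perl_books = $schema->resultset('Book')->search(
    { title => { -like => '%Perl%' } },   # only the Perl titles
);
my $pp = DBIx::Class::ResultSet::PrettyPrint->new();
$pp->print_table( $perl_books );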
Although one often talks about pretty printing database tables, really the
module operates on DBIx::Class::ResultSet
objects. Hence, there isn’t a
strict one-to-one relationship between database tables and what the pretty
printer operates on. This is why the module was useful in one of my current
projects: sometimes there wasn’t a database table behind the ResultSet
I
was investigating. For instance, by querying the database directly with
psql
, it wasn’t (easily) possible to work out what form the data had and
what kinds of information it contained. Using
DBIx::Class::ResultSet::PrettyPrint
made this investigative work much
easier.
So, how to use the module? A small example should make things clear.
Let’s see the module in action. First off, we’ll need to install it:
$ cpanm DBIx::Class::ResultSet::PrettyPrint
This will pull in several CPAN modules, so you’ll need to wait a bit until it’s finished. For instance, on my test system, it took 22 minutes to download, build, test, and install the necessary 79 distributions. It’ll probably take less time if you’ve already got many of the upstream dependencies installed on your system.
Once that’s done, we can set up an example project. We’ll need to set up a
DBIx::Class
project, so there’s a bit of upfront work to do.
I’m a book fan, so let’s create a project to store metadata about some of my books. We only need one database table in this small example, so it won’t take long to set up.
I’ve got lots of books about Perl and a few about Unix, so let’s call the project “Perl and Unix library”. To give you an idea of what I mean, here’s a “shelfie”:
Create a directory for the project and change into the new directory:
$ mkdir perl-and-unix-library
$ cd perl-and-unix-library
Now we need to create the directory structure for our DBIx::Class
schema:
$ mkdir -p lib/Schema/Result/
We’ll need a stub Schema
package that we can use later to inspect the
database’s contents. So, create a file called lib/Schema.pm
and fill it
with this code:
package Schema;
use strict;
use warnings;
use base qw(DBIx::Class::Schema);
__PACKAGE__->load_namespaces();
1;
# vim: expandtab shiftwidth=4
We need to tell DBIx::Class
about the structure of our books table, so
create a file called lib/Schema/Result/Book.pm
and fill it with this
content:
package Schema::Result::Book;
use strict;
use warnings;
use base qw(DBIx::Class::Core);
__PACKAGE__->table('books');
__PACKAGE__->add_columns(
id => {
data_type => 'integer',
size => 16,
is_nullable => 0,
is_auto_increment => 1,
},
title => {
data_type => 'varchar',
size => 128,
is_nullable => 0,
},
author => {
data_type => 'varchar',
size => 128,
is_nullable => 0,
},
pub_date => {
data_type => 'date',
is_nullable => 0,
},
num_pages => {
data_type => 'integer',
size => 16,
is_nullable => 0,
},
isbn => {
data_type => 'varchar',
size => 32,
is_nullable => 0,
},
);
__PACKAGE__->set_primary_key('id');
1;
# vim: expandtab shiftwidth=4
This defines our books
database table in which we’re storing title,
author, publication date, number of pages, and ISBN information about each
of our books.
We’ve now got enough structure for DBIx::Class
to create and query a
database. That means we can add some books to the database.
Create a file in the project’s root directory called create-books-db.pl
and fill it with this content:
use strict;
use warnings;
use lib './lib';
use Schema;
my $schema = Schema->connect("dbi:SQLite:books.db");
$schema->deploy( { add_drop_table => 1 } );
my $books = $schema->resultset('Book');
$books->create(
{
title => "Programming Perl",
author => "Tom Christiansen, brian d foy, Larry Wall, Jon Orwant",
pub_date => "2012-03-18",
num_pages => 1174,
isbn => "9780596004927"
}
);
$books->create(
{
title => "Perl by Example",
author => "Ellie Quigley",
pub_date => "1994-01-01",
num_pages => 200,
isbn => "9780131228399"
}
);
$books->create(
{
title => "Perl in a Nutshell",
author => "Nathan Patwardhan, Ellen Siever and Stephen Spainhour",
pub_date => "1999-01-01",
num_pages => 654,
isbn => "9781565922860"
}
);
$books->create(
{
title => "Perl Best Practices",
author => "Damian Conway",
pub_date => "2005-07-01",
num_pages => 517,
isbn => "9780596001735"
}
);
$books->create(
{
title => "Learning Perl, 7th Edition",
author => "Randal L. Schwartz, brian d foy, Tom Phoenix",
pub_date => "2016-10-05",
num_pages => 369,
isbn => "9781491954324"
}
);
$books->create(
{
title => "UNIX Shell Programming",
author => "Stephen G. Kochan and Patrick H. Wood",
pub_date => "1990",
num_pages => 502,
isbn => "067248448X"
}
);
# vim: expandtab shiftwidth=4
Running this file will create an SQLite database called books.db
in the
same directory as the script. I.e. after running
$ perl create-books-db.pl
you should see a file called books.db
in the project’s root directory.
Now we can query the data in our books database. Create a file called
show-books.pl
in the project base directory with this content:
use strict;
use warnings;
use lib './lib';
use DBIx::Class::ResultSet::PrettyPrint;
use Schema; # load your DBIx::Class schema
# load your database and fetch a result set
my $schema = Schema->connect( 'dbi:SQLite:books.db' );
my $books = $schema->resultset( 'Book' );
print "Title of first entry: ", $books->find(1)->title, "\n";
print "Authors of UNIX-related titles: ",
$books->search({ title => { -like => "%UNIX%" }})->first->author, "\n";
# vim: expandtab shiftwidth=4
Running this script will give this output:
$ perl show-books.pl
Title of first entry: Programming Perl
Authors of UNIX-related titles: Stephen G. Kochan and Patrick H. Wood
That’s all very well and good, but wouldn’t it be nice to view the database
table all in one go? This is the niche task that
DBIx::Class::ResultSet::PrettyPrint
performs.
Change the print
statements in the show-books.pl
script to this:
# pretty print the result set
my $pp = DBIx::Class::ResultSet::PrettyPrint->new();
$pp->print_table( $books );
Now, when we run the script, we get this output:
$ perl show-books.pl
+----+----------------------------+-------------------------------------------------------+------------+-----------+---------------+
| id | title | author | pub_date | num_pages | isbn |
+----+----------------------------+-------------------------------------------------------+------------+-----------+---------------+
| 1 | Programming Perl | Tom Christiansen, brian d foy, Larry Wall, Jon Orwant | 2012-03-18 | 1174 | 9780596004927 |
| 2 | Perl by Example | Ellie Quigley | 1994-01-01 | 200 | 9780131228399 |
| 3 | Perl in a Nutshell | Nathan Patwardhan, Ellen Siever and Stephen Spainhour | 1999-01-01 | 654 | 9781565922860 |
| 4 | Perl Best Practices | Damian Conway | 2005-07-01 | 517 | 9780596001735 |
| 5 | Learning Perl, 7th Edition | Randal L. Schwartz, brian d foy, Tom Phoenix | 2016-10-05 | 369 | 9781491954324 |
| 6 | UNIX Shell Programming | Stephen G. Kochan and Patrick H. Wood | 1990 | 502 | 067248448X |
+----+----------------------------+-------------------------------------------------------+------------+-----------+---------------+
Isn’t that nice?
As I mentioned earlier, I’ve found the module quite handy when using Perl to dig around in database tables in my daily work. Maybe it can help make your work easier too!
This is in reference to delta-epsilon proofs in mathematics where the values delta and epsilon are very small. ↩︎
If you need someone who is stubbornly thorough, give me a yell! I’m available for freelance Python/Perl backend development and maintenance work. Contact me at paul@peateasea.de and let’s discuss how I can help solve your business’ hairiest problems. ↩︎
Published on Thursday 05 June 2025 22:52
The examples used here are from the weekly challenge problem statement and demonstrate the working solution.
You are given a list of operations. Write a script to return the final value after performing the given operations in order. The initial value is always 0.
Let’s entertain ourselves with an over-engineered solution! We’ll use Parse::Yapp to handle incrementing and decrementing any single letter variable. Or, to put it another way, we’ll define a tiny language which consists of single letter variables that do not require declaration, are only of unsigned integer type, and are automatically initialized to zero. The only operations on these variables are the increment and decrement operations from the problem statement. At the completion of the parser’s execution we will print the final values of each variable.
The majority of the work will be done in the .yp yapp grammar definition file. We’ll focus on this file first.
The declarations section will have some token definitions and a global variable declaration.
For our simple language we’re just going to define a few tokens: the increment and decrement operators, and our single letter variables.
We’re going to define a single global variable which will be used to track the state of each variable.
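That part of the file isn’t reproduced here; a minimal sketch of what the declarations section might look like, assuming the token names used by the rules below:
%{
my $variable_state = {};
%}

%token INCREMENT
%token DECREMENT
%token LETTER

%%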
The rules section defines the actions of our increment and decrement operations in both prefix and postfix form. We’ll also allow for a completely optional variable declaration which is just placing a single letter variable by itself
program: statement {$variable_state}
| program statement
;
statement: variable_declaration
| increment_variable
| decrement_variable
;
variable_declaration: LETTER {$variable_state->{$_[1]} = 0}
;
increment_variable: INCREMENT LETTER {$variable_state->{$_[2]}++}
| LETTER INCREMENT {$variable_state->{$_[1]}++}
;
decrement_variable: DECREMENT LETTER {$variable_state->{$_[2]}--}
| LETTER DECREMENT {$variable_state->{$_[1]}--}
;
◇
The final section of the grammar definition file is, historically, called programs. This is where we have Perl code for the lexer, error handing, and a parse function which provides the main point of execution from code that wants to call the parser that has been generated from the grammar.
The parse function is for the convenience of calling the generated parser from other code. yapp will generate a module and this will be the module’s method used by other code to execute the parser against a given input.
Notice here that we are squashing white space, both tabs and newlines, using tr. This reduces all tabs and newlines to a single space. This eases further processing since extra whitespace is just ignored, according to the rules we’ve been given.
Also notice the return value from parsing. In the rules section we provide a return value, a hash reference, in the final action code block executed.
sub parse{
my($self, $input) = @_;
$input =~ tr/\t/ /s;
$input =~ tr/\n/ /s;
$self->YYData->{INPUT} = $input;
my $result = $self->YYParse(yylex => \&lexer, yyerror => \&error);
return $result;
}
◇
Fragment referenced in 6.
This is really just about the most minimal error handling function there can be! All this does is print “syntax error”when the parser encounters a problem.
sub error{
exists $_[0]->YYData->{ERRMSG}
and do{
print $_[0]->YYData->{ERRMSG};
return;
};
print "syntax␣error\n";
}
◇
Fragment referenced in 6.
The lexer function is called repeatedly for the entire input. Regular expressions are used to identify tokens (the ones declared at the top of the file) and pass them along for the rules processing.
sub lexer{
my($parser) = @_;
$parser->YYData->{INPUT} or return(q//, undef);
$parser->YYData->{INPUT} =~ s/^[ \t]//g;
##
# send tokens to parser
##
for($parser->YYData->{INPUT}){
s/^(\s+)// and return (q/SPACE/, $1);
s/^([a-z]{1})// and return (q/LETTER/, $1);
s/^(\+\+)// and return (q/INCREMENT/, $1);
s/^(--)// and return (q/DECREMENT/, $1);
}
}
◇
Fragment referenced in 6.
There’s one more function we should add. The reason for it is a little complex. Variables defined in the declarations section are considered static and are stored in the lexical pad of the package. So each new invocation of the parse() method will re-use the same variables. They are not cleared or reset. So, we’ll define a subroutine which will clear this for us manually.
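The fragment itself isn’t shown here; a minimal sketch of what such a method might look like, assuming the $variable_state hash reference from the declarations section:
sub clear{
    my($self) = @_;
    # throw away all variables from the previous parse
    $variable_state = {};
    return;
}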
Let’s define a small file to drive some tests.
The preamble to the test driver sets the minimum perl version to be the most recent one, to take advantage of all recent changes. We also include the generated module file which yapp creates. For test purposes we’ll define some constants, taken from TWC’s examples.
use constant TEST0 => q/--x x++ x++/;
use constant TEST1 => q/x++ ++x x++/;
use constant TEST2 => q/x++ ++x --x x--/;
use constant COMPLEX_TEST => <<~END_TEST;
a b c
a++ b++ c++
++a ++b ++c
--a --b --c
a-- b-- c--
a++ ++b c++
END_TEST
◇
Fragment referenced in 12.
For printing the results in a nice way we’ll define a small subroutine to display the return value from the parser.
sub print_variables{
my($results) = @_;
for my $k (keys %{$results}){
print $k;
say qq/:\t$results->{$k}/;
}
}
◇
Fragment referenced in 11.
MAIN:{
my $parser = IncrementDecrement->new();
say TEST0;
say print_variables $parser->parse(TEST0);
say TEST1;
$parser->clear();
say print_variables $parser->parse(TEST1);
say TEST2;
$parser->clear();
say print_variables $parser->parse(TEST2);
say COMPLEX_TEST;
$parser->clear();
say print_variables $parser->parse(COMPLEX_TEST);
}
◇
Fragment referenced in 11.
$ yapp -m IncrementDecrement perl/IncrementDecrement.yp; mv IncrementDecrement.pm perl; perl -Iperl perl/ch-1.pl
--x x++ x++
x:	1
x++ ++x x++
x:	3
x++ ++x --x x--
x:	0
a b c
a++ b++ c++
++a ++b ++c
--a --b --c
a-- b-- c--
a++ ++b c++
b:	1
a:	1
c:	1
You are given an income amount and tax brackets. Write a script to calculate the total tax amount.
After overdoing the complexity in the first part, we’ll make this one quite a bit shorter.
The main section is just some basic tests.
MAIN:{
say calculate_tax 10, [[3, 50], [7, 10], [12,25]];
say calculate_tax 2, [[1, 0], [4, 25], [5,50]];
say calculate_tax 0, [[2, 50]];
}
◇
Fragment referenced in 16.
sub calculate_tax{
    my($income, $tax_brackets) = @_;
    # running totals: tax owed so far and income accounted for so far
    my($tax, $taxed) = (0, 0);
    {
        my $tax_bracket = shift @{$tax_brackets};
        if($tax_bracket->[0] <= $income){
            # tax the slice of income between the previous bracket's
            # upper bound and this bracket's upper bound
            my $taxable = $tax_bracket->[0] - $taxed;
            $tax += ($taxable * ($tax_bracket->[1]/100));
            $taxed += $taxable;
        }
        else{
            # tax whatever eligible income remains at this bracket's rate
            $tax += (($income - $taxed) * ($tax_bracket->[1]/100));
            $taxed = $income;
        }
        redo unless $taxed >= $income || @{$tax_brackets} == 0;
    }
    return $tax;
}
◇
$ perl perl/ch-2.pl
2.65
0.25
0
Published on Thursday 05 June 2025 09:09
In the previous post, we created a network close enough to reality so that finding routes between stations was possible and sufficiently interesting. In this final post in the series, we’re going to see how to handle indirect connections between stations.
Not all stations in the Hannover tram network are directly connected. A
good example is the line Linie 10
, which starts at the bus station next to
the main train station and has the station name
Hauptbahnhof/ZOB
.1 As its name suggests, this station is
associated with the station Hauptbahnhof
. Although they’re very close to
one another, they’re not connected directly. You have to cross a road to get
to Hauptbahnhof
from the Hauptbahnhof/ZOB
tram stop. A routing
framework such as Map::Tube
should
allow such indirect connections, thus joining Linie 10
to the rest of the
network.
So how do we connect such indirectly connected stations?
Map::Tube
has a solution: the
other_link
attribute.
To see this attribute in action, let’s add the line Linie 10
to the
network and connect Hauptbahnhof
to Hauptbahnhof/ZOB
with an
other_link
. Then we can try creating a route from Ahlem
(at the end of
Linie 10
) to Misburg
(at the end of Linie 7
) and see if our new
connection type works as we expect. Let’s get cracking!
Here’s the planned list of stations, IDs and links:
Station | ID | Links
---|---|---
Ahlem | H15 | H16
Leinaustraße | H16 | H15, H17
Hauptbahnhof/ZOB | H17 | H16
Ahlem
is the westernmost station, hence it’s the “first” station along
Linie 10
. Therefore, it gets the next logical ID carrying on from where we
left off in the map file.
As we’ve done before, we drive these changes by leaning on our test suite. We want to have four lines in the network now, hence we update our number of lines test like so:
my $num_lines = scalar @{$hannover->get_lines};
is( $num_lines, 4, "Number of lines in network correct" );
We can test that we’ve added the line and its stations correctly by checking for the expected route. Our routes tests are now:
my @routes = (
"Route 1|Langenhagen|Sarstedt|Langenhagen,Kabelkamp,Hauptbahnhof,Kroepcke,Laatzen,Sarstedt",
"Route 4|Garbsen|Roderbruch|Garbsen,Laukerthof,Kroepcke,Kantplatz,Roderbruch",
"Route 7|Wettbergen|Misburg|Wettbergen,Allerweg,Kroepcke,Hauptbahnhof,Vier Grenzen,Misburg",
"Route 10|Ahlem|Hauptbahnhof/ZOB|Ahlem,Leinaustraße,Hauptbahnhof/ZOB",
);
ok_map_routes($hannover, \@routes);
where we’ve added the expected list of stations for Linie 10
to the end of
the @routes
list.
Let’s make sure the tests fail as expected:
$ prove -lr t/map-tube-hannover.t
t/map-tube-hannover.t .. 1/?
# Failed test 'Number of lines in network correct'
# at t/map-tube-hannover.t line 15.
# got: '3'
# expected: '4'
Yup, that looks good. We expect four lines but only have three. Let’s add the line to our maps file now:
{
"id" : "L10",
"name" : "Linie 10",
"color" : "PaleGreen"
}
where I’ve guessed that the line colour used in the Üstra “Netzplan U” is pale green.
Re-running the tests, we have:
$ prove -lr t/map-tube-hannover.t
t/map-tube-hannover.t .. # Line id L10 consists of 0 separate components
# Failed test 'Hannover'
# at /home/cochrane/perl5/perlbrew/perls/perl-5.38.3/lib/site_perl/5.38.3/Test/Map/Tube.pm line 196.
# Line id L10 defined but serves no stations (not even as other_link)
# Failed test 'Hannover'
# at /home/cochrane/perl5/perlbrew/perls/perl-5.38.3/lib/site_perl/5.38.3/Test/Map/Tube.pm line 196.
# Looks like you failed 2 tests of 14.
Again, we expected this as this line doesn’t have any stations yet. Let’s add them to the map file.
{
"id" : "H15",
"name" : "Ahlem",
"line" : "L10",
"link" : "H16"
},
{
"id" : "H16",
"name" : "Leinaustraße",
"line" : "L10",
"link" : "H15,H17"
},
{
"id" : "H17",
"name" : "Hauptbahnhof/ZOB",
"line" : "L10",
"link" : "H16"
}
This time, we expect the tests to tell us that this line isn’t connected to the network. Sure enough:
$ prove -lr t/map-tube-hannover.t
t/map-tube-hannover.t .. # Map has 2 separate components; e.g., stations with ids H1, H15
# Failed test 'Hannover'
# at
/home/cochrane/perl5/perlbrew/perls/perl-5.38.3/lib/site_perl/5.38.3/Test/Map/Tube.pm line 196.
# Looks like you failed 1 test of 14.
The error message
Map has 2 separate components; e.g., stations with ids H1, H15
means that the line isn’t connected to any of the other lines already present because the map contains separate components.
To fix this, let’s change the entry for Hauptbahnhof/ZOB
to use the
other_link
attribute and see if that helps:
{
"id" : "H17",
"name" : "Hauptbahnhof/ZOB",
"line" : "L10",
"link" : "H16",
"other_link" : "Street:H3"
}
Oddly, the tests still raise an error:
$ prove -lr t/map-tube-hannover.t
t/map-tube-hannover.t .. # Map has 2 separate components; e.g., stations with ids H1, H15
# Failed test 'Hannover'
# at /home/cochrane/perl5/perlbrew/perls/perl-5.38.3/lib/site_perl/5.38.3/Test/Map/Tube.pm line 196.
t/map-tube-hannover.t .. 1/? # Looks like you failed 1 test of 14.
# Failed test 'ok_map_data'
# at t/map-tube-hannover.t line 11.
Oh, that’s right! We’ve only linked Hauptbahnhof/ZOB to Hauptbahnhof; we need to add the other_link
we need to add the other_link
in the other direction as well. We could
have debugged this situation by running bin/map2image.pl
and inspecting
the generated image. Yet we’ve seen this issue
before
and can call on experience instead.
We can fix the problem by updating the entry for Hauptbahnhof
like so:
{
"id" : "H3",
"name" : "Hauptbahnhof",
"line" : "L1,L7",
"link" : "H2,H8,H12",
"other_link" : "Street:H17"
},
Now the tests still fail, even though we thought we’d fixed everything:
$ prove -lr t/map-tube-hannover.t
t/map-tube-hannover.t .. 1/? Map::Tube::get_node_by_name(): ERROR: Invalid Station Name [Leinaustraße]. (status: 101) file /home/cochrane/perl5/perlbrew/perls/perl-5.38.3/lib/site_perl/5.38.3/Test/Map/Tube.pm on line 1434
# Tests were run but no plan was declared and done_testing() was not seen.
What’s going wrong?
Oh, yeah, the sharp-s (ß) character messes with the routing tests as we saw in the previous article in the series.
Let’s replace ß with the equivalent “double-s” for the Leinaustraße
station. First in the map file:
{
"id" : "H16",
"name" : "Leinaustrasse",
"line" : "L10",
"link" : "H15,H17"
},
and then in the routes tests:
my @routes = (
"Route 1|Langenhagen|Sarstedt|Langenhagen,Kabelkamp,Hauptbahnhof,Kroepcke,Laatzen,Sarstedt",
"Route 4|Garbsen|Roderbruch|Garbsen,Laukerthof,Kroepcke,Kantplatz,Roderbruch",
"Route 7|Wettbergen|Misburg|Wettbergen,Allerweg,Kroepcke,Hauptbahnhof,Vier Grenzen,Misburg",
"Route 10|Ahlem|Hauptbahnhof/ZOB|Ahlem,Leinaustrasse,Hauptbahnhof/ZOB",
);
ok_map_routes($hannover, \@routes);
How did we do?
$ prove -lr t/map-tube-hannover.t
t/map-tube-hannover.t .. ok
All tests successful.
Files=1, Tests=4, 0 wallclock secs ( 0.03 usr 0.00 sys + 0.55 cusr 0.05 csys = 0.63 CPU)
Result: PASS
Success! 🎉
We’ve reached the end of the development phase of the HOWTO. At this point,
the complete test file (t/map-tube-hannover.t
) looks like this:
use strict;
use warnings;
use Test::More;
use Map::Tube::Hannover;
use Test::Map::Tube;
my $hannover = Map::Tube::Hannover->new;
ok_map($hannover);
ok_map_functions($hannover);
my $num_lines = scalar @{$hannover->get_lines};
is( $num_lines, 4, "Number of lines in network correct" );
my @routes = (
"Route 1|Langenhagen|Sarstedt|Langenhagen,Kabelkamp,Hauptbahnhof,Kroepcke,Laatzen,Sarstedt",
"Route 4|Garbsen|Roderbruch|Garbsen,Laukerthof,Kroepcke,Kantplatz,Roderbruch",
"Route 7|Wettbergen|Misburg|Wettbergen,Allerweg,Kroepcke,Hauptbahnhof,Vier Grenzen,Misburg",
"Route 10|Ahlem|Hauptbahnhof/ZOB|Ahlem,Leinaustrasse,Hauptbahnhof/ZOB",
);
ok_map_routes($hannover, \@routes);
done_testing();
with the other Perl files remaining unchanged.
The full JSON content of the map file is too long to display here, but if you’re interested, you can see it in the Git repository accompanying this article series.
To get a feeling for what the network looks like, try running
bin/map2image.pl
. Doing so, you’ll find a network graph similar to this:
Although the graph doesn’t highlight the indirect link, it does show the connectivity in the entire map and gives us a high-level view of what we’ve achieved.
With our latest map changes in hand, we can find our way from Ahlem
to
Misburg
:
$ perl bin/get_route.pl Ahlem Misburg
Ahlem (Linie 10), Leinaustrasse (Linie 10), Hauptbahnhof/ZOB (Linie 10, Street), Hauptbahnhof (Linie 1, Linie 7, Street), Vier Grenzen (Linie 7), Misburg (Linie 7)
Wicked! It worked! And it got the connection from Hauptbahnhof/ZOB
to
Hauptbahnhof
right. Nice!
We can also plan more complex routes, such as travelling from Ahlem
to
Roderbruch
:
$ perl bin/get_route.pl Ahlem Roderbruch
Ahlem (Linie 10), Leinaustrasse (Linie 10), Hauptbahnhof/ZOB (Linie 10, Street), Hauptbahnhof (Linie 1, Linie 7, Street), Kroepcke (Linie 1, Linie 4, Linie 7), Kantplatz (Linie 4), Roderbruch (Linie 4)
Looking closely, we find that we have to change at Hauptbahnhof
and then
again at Kroepcke
to reach our destination. Comparing this with the
Üstra “Netzplan
U”
we can see (for the simpler map created here) that this matches reality.
Brilliant!
Let’s commit that change and give ourselves a pat on the back for a job well done!
$ git ci share/hannover-map.json t/map-tube-hannover.t -m "Add Linie 10 to network
>
> The most interesting part about this change is the use of other_link
> to ensure that Hauptbahnhof/ZOB and Hauptbahnhof are connected to one
> another and hence Linie 10 is connected to the rest of the network
> and routes can be found from Linie 10 to other lines."
[main bc34daa] Add Linie 10 to network
2 files changed, 29 insertions(+), 3 deletions(-)
Welcome to the end of the article series! Thanks for staying until the end. 🙂
Wow, that was quite a lot of work! But it was fun, and we learned a lot along the way. For instance, we’ve learned:
- how a Map::Tube map is structured,
- how to create a Map::Tube map in a test-driven manner,
- and how to visualise a Map::Tube network.

This discussion has hopefully given you the tools you need to create your
own Map::Tube map. There’s so much more you can do with Map::Tube, so it’s a good idea to spend some time browsing the
documentation. Therein you will find
many nuggets of information and hints for ideas of things to play with.
I wish you the best of luck and have fun!
For those wondering who don’t speak German: Hauptbahnhof means “main train station” or equivalently “central train station”. ZOB is the abbreviation of Zentralomnibusbahnhof, which looks like it literally translates as “central omnibus train station”, but really means “central bus station”. ↩︎
Published on Wednesday 04 June 2025 08:00
SlapbirdAPM is a free-software observability platform tailor made for Perl web-applications. [ It is also a Perl web-application :^) ] It has first class support for Plack, Mojo, Dancer2, and CGI. Slapbird provides developers with comprehensive observability tools to monitor and optimize their applications’ performance.
In this article I will explain how to setup a Plack application with Slapbird. If you want to use another supported framework, please read our Getting Started documentation, or reach out to me on the Perl Foundations Slack channel!
SlapbirdAPM is easily installed on your Plack application, here is a minimal example, using a Dancer2 application that runs under Plack:
Install with
cpan -I SlapbirdAPM::Agent::Plack
#!/usr/bin/env perl
use Dancer2;
use Plack::Builder;
get '/' => sub {
'Hello World!';
};
builder {
enable 'SlapbirdAPM';
app;
};
Now, you can create an account on SlapbirdAPM, and create your application.
Then, simply copy the API key output and, add it to your application via the SLAPBIRDAPM_API_KEY
environment variable. For example:
SLAPBIRDAPM_API_KEY=<API-KEY> plackup app.pl
or, you can pass your key in to the middleware:
builder {
enable 'SlapbirdAPM', key => <YOUR API KEY>;
...
};
Now when you navigate to /
, you will see it logged in your SlapbirdAPM dashboard!
Then, clicking into one of the transactions, you’ll get some more information:
SlapbirdAPM also supports DBI, meaning you can trace your queries, let’s edit our application to include a few DBI queries:
#!/usr/bin/env perl
use Dancer2;
use DBI;
use Plack::Builder;
my $dbh = DBI->connect( 'dbi:SQLite:dbname=database.db', '', '' );
$dbh->do('create table if not exists users (id integer primary key, name varchar)');
get '/' => sub {
send_as html => 'Hello World!';
};
get '/users/:id' => sub {
my $user_id = route_parameters->get('id');
my ($user) =
$dbh->selectall_array(
'select * from users where id = ?',
{ Slice => {} }, $user_id );
send_as JSON => $user;
};
post '/users' => sub {
my $user_name = body_parameters->get('name');
my ($user) =
$dbh->selectall_array(
'insert into users(name) values ( ? ) returning id, name',
{ Slice => {} }, $user_name );
send_as JSON => $user;
};
builder {
enable 'SlapbirdAPM';
app;
};
Now we can use cURL to add data to our database:
curl -X POST -d 'name=bob' http://127.0.0.1:5000/users
Then, if we go back into Slapbird, we can view our timings for our queries:
This just scratches the surface of what is possible using SlapbirdAPM. You can also generate reports, perform health-checks, and get notified if your application is creating too many 5XX responses.
Thanks for reading!
Published on Monday 02 June 2025 00:00
Published by prz on Saturday 31 May 2025 16:50
Published by Dave Cross on Friday 30 May 2025 15:45
Last summer, I wrote a couple of posts about my lightweight, roll-your-own approach to deploying PSGI (Dancer) web apps:
In those posts, I described how I avoided heavyweight deployment tools by writing a small, custom Perl script (app_service) to start and manage them. It was minimal, transparent, and easy to replicate.
It also wasn’t great.
The system mostly worked, but it had a number of growing pains:
- The apps weren’t set up as proper services, so they couldn’t be managed with systemctl.
- Checking that an app was still up meant poking it with curl, not journalctl.
psgi-systemd-deploy
So today (with some help from ChatGPT) I wrote psgi-systemd-deploy — a simple, declarative deployment tool for PSGI apps that integrates directly with systemd. It generates .service files for your apps from environment-specific config and handles all the fiddly bits (paths, ports, logging, restart policies, etc.) with minimal fuss.
Key benefits:
- Per-app configuration declared in a .deploy.env file
- .env file support for application-specific settings
- Unit files rendered from a template with envsubst
- Standard systemd units you can inspect and manage yourself
- A --dry-run mode so you can preview changes before deploying
- A run_all helper script for managing all your deployed apps with one command
helper script for managing all your deployed apps with one commandYou may know about my Line of Succession web site (introductory talk). This is one of the Dancer apps I’ve been talking about. To deploy it, I wrote a .deploy.env
file that looks like this:
WEBAPP_SERVICE_NAME=succession
WEBAPP_DESC="British Line of Succession"
WEBAPP_WORKDIR=/opt/succession
WEBAPP_USER=succession
WEBAPP_GROUP=psacln
WEBAPP_PORT=2222
WEBAPP_WORKER_COUNT=5
WEBAPP_APP_PRELOAD=1
And optionally a .env file for app-specific settings (e.g., database credentials). Then I run:
$ /path/to/psgi-systemd-deploy/deploy.sh
And that’s it. The app is now a first-class systemd service, automatically started on boot and restartable with systemctl.
run_all
Once you’ve deployed several PSGI apps using psgi-systemd-deploy, you’ll probably want an easy way to manage them all at once. That’s where the run_all script comes in.
It’s a simple but powerful wrapper around systemctl that automatically discovers all deployed services by scanning for .deploy.env files. That means no need to hard-code service names or paths — it just works, based on the configuration you’ve already provided.
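I won’t reproduce run_all here, but as a rough Perl sketch of the discovery idea only — not the actual script; the /opt root is a made-up example and the WEBAPP_SERVICE_NAME convention is taken from the config above:
use strict;
use warnings;
use File::Find;

my $action = shift // 'status';
my @services;

# look for .deploy.env files and read the service name out of each one
find(sub {
    return unless $_ eq '.deploy.env';
    open my $fh, '<', $File::Find::name or return;
    while (<$fh>) {
        push @services, "$1.service" if /^WEBAPP_SERVICE_NAME=(\S+)/;
    }
    close $fh;
}, '/opt');

system('systemctl', $action, $_) for @services;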
Here’s how you might use it:
# Restart all PSGI apps
$ run_all restart

# Show current status
$ run_all status

# Stop them all (e.g., for maintenance)
$ run_all stop
And if you want machine-readable output for scripting or monitoring, there’s a --json
flag:
$ run_all --json is-active | jq .
[
  {
    "service": "succession.service",
    "action": "is-active",
    "status": 0,
    "output": "active"
  },
  {
    "service": "klortho.service",
    "action": "is-active",
    "status": 0,
    "output": "active"
  }
]
Under the hood, run_all uses the same environment-driven model as the rest of the system — no surprises, no additional config files. It’s just a lightweight helper that understands your layout and automates the boring bits.
It’s not a replacement for systemctl, but it makes common tasks across many services far more convenient — especially during development, deployment, or server reboots.
The goal of psgi-systemd-deploy isn’t to replace Docker, K8s, or full-featured PaaS systems. It’s for the rest of us — folks running VPSes or bare-metal boxes where PSGI apps just need to run reliably and predictably under the OS’s own tools.
If you’ve been rolling your own init scripts, cron jobs, or nohup-based hacks, give it a look. It’s clean, simple, and reliable — and a solid step up from duct tape.
The post Deploying Dancer Apps – The Next Generation first appeared on Perl Hacks.