Published by Mayur Koshti on Saturday 10 August 2024 07:19
A Knight in chess can move from its current position to any square two rows or columns plus one column or row away. Write a script which takes a starting position and an ending position and calculates the least number of moves required.
$start = 'g2', $end = 'a8'
4
$start = 'g2', $end = 'h2'
3
If you didn't immediately start humming "Night Moves" by Bob Seger, well, then you, uh, probably aren't as old as I am. "... How far off, I sat and wondered ... Ain't it funny how the night moves ..."
It's a shortest-path problem. The classic computer science solution is a breadth-first search, mildly complicated by the weird L-shaped moves of the knight and the chess notation. As early answers trickled in, I saw that everyone appeared to be going down this path, and I sighed and thought about all the ways I was going to mess this up until I beat it into submission with the debugger.
But here's a different way to approach the problem. Let's place a knight in the bottom left corner (a1) and label that corner 0. From here, there are two possible knight moves; let's label them with 1. Then, for each of the 1s, let's label the possible knight moves from there with a 2.
|   | a | b | c | d | e | f | g | h |
|---|---|---|---|---|---|---|---|---|
| 5 | 2 |   | 2 |   |   |   |   |   |
| 4 |   | 2 |   | 2 |   |   |   |   |
| 3 | 2 | 1 |   |   | 2 |   |   |   |
| 2 |   |   | 1 | 2 |   |   |   |   |
| 1 | 0 |   | 2 |   | 2 |   |   |   |
If we continue doing this until the 8x8 grid fills up, the board will look like this:

|   | a | b | c | d | e | f | g | h |
|---|---|---|---|---|---|---|---|---|
| 8 | 5 | 4 | 5 | 4 | 5 | 4 | 5 | 6 |
| 7 | 4 | 3 | 4 | 3 | 4 | 5 | 4 | 5 |
| 6 | 3 | 4 | 3 | 4 | 3 | 4 | 5 | 4 |
| 5 | 2 | 3 | 2 | 3 | 4 | 3 | 4 | 5 |
| 4 | 3 | 2 | 3 | 2 | 3 | 4 | 3 | 4 |
| 3 | 2 | 1 | 4 | 3 | 2 | 3 | 4 | 5 |
| 2 | 3 | 4 | 1 | 2 | 3 | 4 | 3 | 4 |
| 1 | 0 | 3 | 2 | 3 | 2 | 3 | 4 | 5 |
We have a lookup table. If we place the start point at a1, then we can read the move count directly out of the table at the end point. Well, mostly, assuming the end point is above and to the right of the start point.
There are symmetries we can exploit. Knight moves are the same forward and backward, so we can interchange start and end. If we think of the start and end as being the ends of a line segment, we can rotate by 90 degrees or reflect horizontally or vertically, and the move distance will remain the same.
So, given this table, we can slide or reflect the start/end pair over the grid until one of the points is at a1 and the other point is somewhere on the grid.
It took all of five minutes to generate the grid on a piece of graph paper, and the easy move here would be to hard-code the grid into a two-dimensional array. To make it more fun, we could write the code to generate the grid. That might be useful if the problem were generalized to board sizes other than 8x8, which I am not going to do, just to be clear.
For even more Perl fun, let's use the class feature to build an object around the board.
So, let's start laying out a program.
sub chessToGrid($chess)
{
    return ( substr($chess, 1, 1) - 1, ord(substr($chess, 0, 1)) - ord('a') )
}
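A quick sanity check of the conversion (the sub is repeated here with a plain @_ signature so the snippet runs standalone):

```perl
use strict;
use warnings;
use feature 'say';

# Rank becomes a 0-based row; the file letter becomes a 0-based column.
sub chessToGrid {
    my ($chess) = @_;
    return ( substr($chess, 1, 1) - 1, ord(substr($chess, 0, 1)) - ord('a') );
}

say join ',', chessToGrid('c2');   # 1,2 -- row 1 (rank 2), column 2 (file c)
say join ',', chessToGrid('g2');   # 1,6
```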
Our board is going to be a class, because I want it to be. I use v5.40, which has class features, but it still gives "experimental" warnings. The attributes of a Board are its size, and the two-dimensional grid. The only public methods we really need are a constructor, and a method to retrieve a value from the grid.
use v5.40;
use feature 'class'; no warnings "experimental::class";

class Board
{
    field $row :param //= 8;
    field $col :param //= 8;
    field $lastRow = $row - 1;
    field $lastCol = $col - 1;
    field @board;

    ADJUST {
        # The board starts out as 8x8 undef values
        push @board, [ (undef) x $col ] for ( 1 .. $row );
        $self->_init();
    }

    method at($r, $c) { ... }

    # Private methods
    method _init() { ... }
    method _knightMoveFrom($r, $c) { ... }
}
Some Perl points:
- field $row :param //= 8 -- The :param declares that this is a parameter that could be passed to the constructor, but the //= 8 declares that it will default to 8 if not given.
- field $lastRow = $row - 1 -- This is a convenience variable, which will be set up in the constructor. It's not a parameter to the constructor.
- ADJUST -- This is a Perl class convention for adding code to the constructor. In this case, we're going to initialize @board to be a two-dimensional array, and then we're going to call a private method to fill in the distance values.
- method at() -- This is an accessor to a square on the board, helping us keep @board private to the class.
- method _init() -- I'm using the ancient, venerable convention that private methods have an underscore prefix.

Let's tackle the sub-problem of figuring out possible knight moves. For a square in the middle of the board, there are 8 possible moves:
These possibilities can be represented in a list of (row,column) delta pairs.
( [-1, 2], [1,2], [-1,-2], [1,-2], [2,1], [2, -1], [-2, 1], [-2, -1 ] )
Now we can find our possible moves by adding our given row and column to each of these deltas:
map { [ $r + $_->[0], $c + $_->[1] ] } ( [-1,2]... )
That yields a list of [row,column] pairs, but some of those pairs might now be off the grid. Let's prune the list:
grep { 0 <= $_->[0] <= $lastRow && 0 <= $_->[1] <= $lastCol }
Since this is a method of the class, it has access to the fields, so $lastRow and $lastCol are available for the bounds check. The whole thing together is a terse couple of lines:

method _knightMoveFrom($r, $c)
{
    grep { 0 <= $_->[0] <= $lastRow && 0 <= $_->[1] <= $lastCol }
    map { [ $r + $_->[0], $c + $_->[1] ] }
    ( [-1, 2], [1, 2], [-1, -2], [1, -2], [2, 1], [2, -1], [-2, 1], [-2, -1] )
}
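To convince ourselves the pruning works, the same grep/map pipeline can be run outside the class (with plain comparisons instead of the 5.32+ chained ones, so it runs on older perls too). From the a1 corner, only two of the eight candidate moves should survive:

```perl
use strict;
use warnings;
use feature 'say';

# Standalone check of the bounds filtering: knight on a1, i.e. (0,0).
my ( $lastRow, $lastCol ) = ( 7, 7 );
my ( $r, $c ) = ( 0, 0 );

my @legal =
    grep { $_->[0] >= 0 && $_->[0] <= $lastRow
        && $_->[1] >= 0 && $_->[1] <= $lastCol }
    map  { [ $r + $_->[0], $c + $_->[1] ] }
    ( [-1, 2], [1, 2], [-1, -2], [1, -2], [2, 1], [2, -1], [-2, 1], [-2, -1] );

say scalar @legal;      # 2
say "@$_" for @legal;   # 1 2, then 2 1
```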
Having this function available, it becomes possible to write the _init() private method. I'm going to omit the code for it, in an already-lost quest for brevity; it's on GitHub.
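The flood-fill idea behind _init() can be sketched as a standalone script; this is my guess at its shape, not the code from GitHub:

```perl
use strict;
use warnings;
use feature 'say';

# BFS flood fill: label a1 (row 0, col 0) with 0, then label each newly
# reachable square with its parent's label plus one.
my @deltas = ( [-1, 2], [1, 2], [-1, -2], [1, -2],
               [2, 1], [2, -1], [-2, 1], [-2, -1] );
my @board;
$board[0][0] = 0;
my @queue = ( [0, 0] );

while ( my $sq = shift @queue ) {
    my ( $r, $c ) = @$sq;
    for my $d (@deltas) {
        my ( $nr, $nc ) = ( $r + $d->[0], $c + $d->[1] );
        next if $nr < 0 || $nr > 7 || $nc < 0 || $nc > 7;
        next if defined $board[$nr][$nc];
        $board[$nr][$nc] = $board[$r][$c] + 1;
        push @queue, [ $nr, $nc ];
    }
}

# Print ranks 8 down to 1.
say join ' ', @{ $board[$_] } for reverse 0 .. 7;
```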
Let's return to the main function of the program. We're given a start and end square in chess notation. We want to convert that to Cartesian coordinates, and then figure out how to translate the points to the origin.
We're going to check the slope of the line between the two points. If it's positive, then one of the points is up and to the right of the other, and all we have to do is slide the points down and to the left until one reaches (0,0).
If the slope is negative, that means that if we slide the two points towards the origin, one of them is going to fall outside of the grid. Fortunately, we can use symmetry to flip the line over and make the slope positive. For example, suppose our start and end points look like this:
| . . . . . .
| . S . . . .
| . . . . . .
| . . . . . .
| . . . . E .
| . . . . . .
+------------------
As far as the knight-move distance between the points is concerned, this is equivalent to flipping the points:
| . . . . . .
| . S------->s' .
| . . . . . .
| . . . . . .
| . e'<------E .
| . . . . . .
+------------------
The reflection means that for S, keep its row, but move it over to E's column; and for E, keep its row, but move it over to S's column. That is, (in programming terms), swap the column coordinates.
Once we've adjusted the points so that they form a positive slope, then we have to slide the one closest to the origin to (0,0), and slide the other point by the same amount. If the end is closer than the start, then we swap the points (again, as far as knight-move distance is concerned, it's going to be the same).
The translation to (0,0) happens by subtracting the start point from the end point. Finally, the number of moves is just a table lookup from our carefully constructed Board object.
sub km($start, $end)
{
    my @start = chessToGrid($start);
    my @end = chessToGrid($end);

    # If the slope is negative, reflect the line so that the slope is positive.
    my $dy = $end[1] - $start[1];
    my $dx = $end[0] - $start[0];
    my $slope = ( $dx == 0 ? 0 : $dy / $dx );
    if ( $slope < 0 )
    {
        ( $start[1], $end[1] ) = ( $end[1], $start[1] );
    }

    # If the end is closer to the origin, swap ends
    if ( $end[0] < $start[0] || $end[1] < $start[1] )
    {
        ( $start[0], $start[1], $end[0], $end[1] ) = ( @end, @start );
    }

    # Shift the end point as if the start is at 0,0
    $end[0] -= $start[0];
    $end[1] -= $start[1];

    return $Board->at(@end);
}
Published by JGNI on Friday 09 August 2024 19:57
Unicode adds new properties to the Unicode database and these are added to later versions of Perl. I'd like to be able to test if a property is available on my current version of Perl.
I have tried
perl -E 'say eval {qr/\p{wibble}/}; say q(OK)'
But the eval doesn't trap the missing property, I suspect because it's being looked up at compile time, and the script dies before it reaches the say q(OK)
Is there some way I can test whether a property exists before using it in a regex?
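One workaround (a sketch; the property names here are just examples) is to move the qr into a string eval, so the property lookup happens when the eval itself is compiled and the failure becomes trappable:

```perl
use strict;
use warnings;

# A block eval is compiled along with the rest of the script, so an
# unknown \p{...} kills compilation before the eval can trap anything.
# A *string* eval compiles at run time, inside the eval, so the error
# is caught and eval returns undef.
sub property_exists {
    my ($prop) = @_;
    return defined eval "qr/\\p{$prop}/";
}

print property_exists('Greek')  ? "OK\n" : "missing\n";   # OK
print property_exists('Wibble') ? "OK\n" : "missing\n";   # missing
```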
Are there new, modern alternatives to PerlNET? I am using a Perl game automation library whose UI is built with Wx using Perl bindings. I want to create my own UI using C# (WPF), so I wanted to know if there are existing solutions for this.
Published by Mayur Koshti on Friday 09 August 2024 14:21
Published by U. Windl on Friday 09 August 2024 12:50
While finding a solution for How to solve Perl's `length 'für' == 4` for `LC_CTYPE="en_US.UTF-8"`? I wrote another little test program:
It seems Perl (5.18.2) does not output UTF-8 encoded strings correctly in a UTF-8 Linux (SLES12 SP5) environment when the strings are interpreted as UTF-8.
The basic problem was that (e.g.) string "Gemäß" being read from a file had a length of 7 instead of 5, so I wrote this test program ("length.pl", the first test is "commented out" in an odd way):
#!/usr/bin/perl
use warnings;
use strict;

=begin debug

use utf8;
print length('Gemäß'), "\n";

=end debug

=cut

if (open(my $fh, "<:encoding(UTF-8)", 'length.txt')) {
    while (<$fh>) {
        chomp;
        print length($_), ':', $_, "\n";
    }
    close($fh);
} else {
    warn "length.txt: $!\n";
}
The input file "length.txt" just contains a single line, like this
> cat length.txt
Gemäß
> hexdump -C length.txt
00000000 47 65 6d c3 a4 c3 9f 0a |Gem.....|
00000008
> ./length.pl
5:Gem▒▒
> locale
LANG=en_US.UTF-8
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=
> vi length.pl # remove the ":encoding(UTF-8)" from open
> ./length.pl
7:Gemäß
So the length is correct, but the output on the screen is wrong.
When dropping :encoding(UTF-8) from the open call, then the string length is wrong, but the output is correct.
I'm using an SSH session via PuTTY with setting "Remote character set:" set to "UTF-8" (just in case someone would ask for that).
Obviously I'd like to have both (for correct UTF-8 input), the string length and correct text output.
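One combination that gives both is to keep :encoding(UTF-8) on the input handle and put a matching layer on STDOUT. A self-contained sketch (reading from an in-memory buffer instead of length.txt; the layers behave the same on a real file handle):

```perl
use strict;
use warnings;

# Give STDOUT an encoding layer so decoded characters are re-encoded
# on the way out; keep :encoding(UTF-8) on the input for a correct length().
binmode STDOUT, ':encoding(UTF-8)';

# In-memory stand-in for length.txt (same bytes as the hexdump above).
my $bytes = "Gem\xc3\xa4\xc3\x9f\n";
open(my $fh, '<:encoding(UTF-8)', \$bytes) or die $!;
while (<$fh>) {
    chomp;
    print length($_), ':', $_, "\n";   # prints "5:Gemäß"
}
close($fh);
```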
(I should ask a new question as it is actually a different problem, but to avoid a cascade of related questions, I am adding it to this one.)
Unfortunately the test case was incomplete for the whole scenario: The real program reads UTF-8 input from a file, processes the contents, then writes the contents to a different file.
While https://stackoverflow.com/a/78851883/6607497 helped to fix the length issue, and a print inside the debugger shows the correct content encoding (I think), the content in the file written is wrong.
For example:
Say $template contains Mit freundlichen Grüßen, and $mf is a temporary file ($mf = File::Temp->new('TEMPLATE' => 'msg-XXXXXX', 'UNLINK' => 0)), then after print $mf $template; the file contains Mit freundlichen Gr▒▒en, ("üß" encoded as 0xfc 0xdf).
Before the "fix", the file content was correct. I had tried all combinations of
use utf8;
use open ':locale';
use open IO => ':locale';
use open IO => 'utf8';
use open ':std' => ':locale';
use open ':std' => 'utf8';
use feature qw(unicode_strings);
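None of those pragmas reach handles that File::Temp opens internally, since use open is lexical; the encoding layer has to be added to the temp handle explicitly. A sketch (with an illustrative $template):

```perl
use strict;
use warnings;
use File::Temp;

# $template holds decoded characters (\x{fc} = ü, \x{df} = ß).
my $template = "Mit freundlichen Gr\x{fc}\x{df}en,\n";

my $mf = File::Temp->new( 'TEMPLATE' => 'msg-XXXXXX', 'UNLINK' => 1 );
binmode $mf, ':encoding(UTF-8)';   # encode on the way out
print $mf $template;
close $mf;
# The file now contains "üß" as the UTF-8 bytes 0xc3 0xbc 0xc3 0x9f,
# not the single bytes 0xfc 0xdf.
```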
Published by /u/briandfoy on Friday 09 August 2024 11:31
Published by LEARN TO CODE on Friday 09 August 2024 00:05
Perl, often dubbed the “Swiss Army chainsaw” of programming languages, is known for its versatility and power in text processing, system…
Published by Perl Steering Council on Thursday 08 August 2024 22:12
Just Graham and Aristotle this time.
=begin/=end sections.

Published by laurent_r on Thursday 08 August 2024 19:33
These are some answers to the Week 281, Task 2, of the Perl Weekly Challenge organized by Mohammad S. Anwar.
Spoiler Alert: This weekly challenge deadline is due in a few days from now (on August 11, 2024, at 23:59). This blog post provides some solutions to this challenge. Please don’t read on if you intend to complete the challenge on your own.
A Knight in chess can move from its current position to any square two rows or columns plus one column or row away. So in the diagram below, if it starts at S, it can move to any of the squares marked E.
Write a script which takes a starting position and an ending position and calculates the least number of moves required.

Example 1
Input: $start = 'g2', $end = 'a8'
Output: 4
g2 -> e3 -> d5 -> c7 -> a8
Example 2
Input: $start = 'g2', $end = 'h2'
Output: 3
g2 -> e3 -> f1 -> h2
This is a classical computer science topic. In fact, I had to implement almost the same task in C and in Pascal as a homework exercise in the beginning of my CS studies many years ago. It is an opportunity to study and understand First-In First-Out (FIFO) data structures such as queues, as opposed to Last-in First Out (LIFO) data structures, aka stacks. It is also an occasion to work on Breadth-First Search (BFS) algorithms (as opposed to Depth-First Search (DFS) algorithms) for creating and traversing a tree.
BFS traverses a tree by visiting all possible moves level by level, so that as soon as we find a solution, we know it is the shortest path (or, rather, one of the shortest paths) and can stop iteration and return the level value. In DFS, by contrast, you would need to explore essentially all possible paths to make sure you've found the shortest one.
A final point before we go on: how do we model the chess board? I considered various solutions and decided to transform the chess notation a to h abscissas to a 0 to 7 range. For the ordinates, we subtract 1 to convert the 1 to 8 range to a 0 to 7 range. For example, the c2 square would be transformed to rectangular or Cartesian coordinates (2, 1). This conversion is performed by the to-coordinates subroutine.

The authorized moves for a knight are modeled by @moves, an array of eight pairs representing the values to be added to the coordinates of one square to find the next square.
The @to-be-explored array contains subarrays describing the next squares to be visited along with the level depth (i.e. the number of moves to reach this square). The @to-be-explored array is initialized with the starting position (and a depth of 0). The %seen hash contains the squares that have already been visited (we simply stringify the pair of coordinates to build the hash key). The %seen hash is also initialized with the starting position.
The process traverses the @to-be-explored array and returns the depth if the current square is the target square. Otherwise, for each item, it computes the next position that would be reached with each of the eight possible moves. This next position is dismissed if it falls outside the chess board (unauthorized move) or if it has already been visited. Otherwise, the next position is added to the %seen hash and to the @to-be-explored array.
my @moves = <2 1>, <2 -1>, <1 2>, <1 -2>,
            <-1 2>, <-1 -2>, <-2 1>, <-2 -1>;

sub to-coordinates ($in) {
    my ($col, $row) = $in.comb;
    return $col.ord - 'a'.ord, $row - 1;
}

sub find-shortest ($st-in, $end-in) {
    # convert input to Cartesian coordinates
    my @start = to-coordinates $st-in;
    my @end = to-coordinates $end-in;

    my @to-be-explored;                 # a queue of squares to be visited
    push @to-be-explored, [0, |@start];
    my %seen = "@start[]" => 1;         # already visited squares

    while @to-be-explored {
        my ($depth, @current) = |shift @to-be-explored;
        return $depth if "@current[]" eq "@end[]";
        for @moves -> @move {
            my @next = @current[0] + @move[0],
                       @current[1] + @move[1];
            # dismiss if computed position not on chessboard
            next if @next.any > 7 or @next.any < 0;
            # dismiss if computed position already visited
            next if %seen{"@next[]"}:exists;
            # update seen hash and to-be-explored queue
            %seen{"@next[]"} = 1;
            push @to-be-explored, [$depth + 1, |@next];
        }
    }
}
my @tests = <g2 a8>, <g2 h2>;
for @tests -> @test {
    printf "%-6s => ", "@test[]";
    say find-shortest @test[0], @test[1];
}
This program displays the following output:
$ raku ./shortest-knight-path.raku
g2 a8 => 4
g2 h2 => 3
This is a port to Perl of the Raku program above. Please refer to the rather large chunks of information provided in the two sections above if you need further information.
use strict;
use warnings;
use feature 'say';

my @moves = ( [<2 1>], [<2 -1>], [<1 2>], [<1 -2>],
              [<-1 2>], [<-1 -2>], [<-2 1>], [<-2 -1>] );

sub to_coordinates {
    my ($col, $row) = split //, shift;
    return ord($col) - ord('a'), $row - 1;
}

sub find_shortest {
    my ($st_in, $end_in) = @_;

    # convert input to Cartesian coordinates
    my @start = to_coordinates $st_in;
    my @end = to_coordinates $end_in;

    my @to_be_explored;                 # a queue of squares to be visited
    push @to_be_explored, [0, @start];
    my %seen = ("@start" => 1);         # already visited squares

    while (@to_be_explored) {
        my $node = shift @to_be_explored;
        my ($depth, @current) = @$node;
        return $depth if "@current" eq "@end";
        for my $move (@moves) {
            my @next = ( $current[0] + $move->[0],
                         $current[1] + $move->[1] );
            # dismiss if computed position not on chessboard
            next if $next[0] > 7 or $next[0] < 0 or
                    $next[1] > 7 or $next[1] < 0;
            # dismiss if computed position already visited
            next if exists $seen{"@next"};
            # update seen hash and to_be_explored queue
            $seen{"@next"} = 1;
            push @to_be_explored, [$depth + 1, @next];
        }
    }
}

my @tests = ([<g2 a8>], [<g2 h2>]);
for my $test (@tests) {
    printf "%-6s => ", "@$test";
    say find_shortest @$test;
}
This program displays the following output:
$ perl ./shortest-knight-path.pl
g2 a8 => 4
g2 h2 => 3
The next week Perl Weekly Challenge will start soon. If you want to participate in this challenge, please check https://perlweeklychallenge.org/ and make sure you answer the challenge before 23:59 BST (British summer time) on August 18, 2024. And, please, also spread the word about the Perl Weekly Challenge if you can.
Published by Konabob on Thursday 08 August 2024 15:47
I have two perl scripts. Let's call them alpha.cgi and beta.cgi. I wish to have alpha.cgi finish its job, then activate beta.cgi, and immediately exit.
I do not need to pass any information between these two scripts. I just want alpha.cgi to trigger beta.cgi with no other interaction between them. Is there some command that I can insert before 'exit();' in alpha.cgi?
I have searched for ways to do this, and have found ways to kill the child and return to the parent, but I want to kill the parent, and let the child run.
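One common pattern for this (a sketch; spawn_detached is my hypothetical helper, and the beta.cgi path is illustrative) is to fork, detach the child from the request, and exec the second script; the parent then falls through to its normal exit:

```perl
use strict;
use warnings;
use POSIX 'setsid';

# Fork, detach the child from alpha.cgi's session, and replace the
# child process with the given command; the parent returns immediately.
sub spawn_detached {
    my @cmd = @_;
    defined( my $pid = fork() ) or die "fork failed: $!";
    return if $pid;                   # parent: carry on to exit()
    setsid();                         # child: leave the server's session
    open STDIN,  '<', '/dev/null';
    open STDOUT, '>', '/dev/null';    # so the server can close the connection
    open STDERR, '>', '/dev/null';
    exec @cmd;
    die "exec @cmd failed: $!";
}

# In alpha.cgi, just before the end:
#   spawn_detached('./beta.cgi');
#   exit();
```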
Thanks to any and all who have experience with this situation. -Konabob
Published by haarg on Thursday 08 August 2024 14:33
Merge branch 'haarg/storable-cleanup' into blead

This cleans up and modernizes various parts of Storable.

* various files are moved around to better match modern practices
* strict is enabled in the module and tests
* warnings are enabled in the tests
* whitespace is normalized at 4 spaces
* removed various minor bits of code and comments that have been irrelevant for decades
Published by haarg on Thursday 08 August 2024 14:32
Storable: change log entries
Published by haarg on Thursday 08 August 2024 14:32
Storable: use strict and warnings in tests
Storable: enable strict
Published by haarg on Thursday 08 August 2024 14:32
Storable: remove some attempted compatibility with ancient perl versions

Storable hasn't been compatible with perl versions older than 5.6 for a long time, so we can remove attempts to keep compatible with earlier versions. Perl 5.6 compatibility may also already be broken, but for now we will leave in place the attempts at compatibility with it.
Published by Crass Spektakel on Thursday 08 August 2024 11:21
A negative pattern matching creates an unused array element.
Given is a file containing lines like these:
FILE:abc LENGTH:123 AUTHOR:Bobby
FILE:xyz LENGTH:987 AUTHOR:Sabine
I need to split this, line by line, into the columns, but ignore the FILE column
while ($line=<>) {
    print "$line";
    foreach $element ( $line=~/(\S+)/g ) {
        print "$element\n"
    }
}
Running it outputs:
FILE:abc LENGTH:123 AUTHOR:Bobby
FILE:abc
LENGTH:123
AUTHOR:Bobby
FILE:xyz LENGTH:987 AUTHOR:Sabine
FILE:xyz
LENGTH:987
AUTHOR:Sabine
But I need:
FILE:abc LENGTH:123 AUTHOR:Bobby
LENGTH:123
AUTHOR:Bobby
FILE:xyz LENGTH:987 AUTHOR:Sabine
LENGTH:987
AUTHOR:Sabine
I already know about negative look ahead/behind and in theory it works well, it looks something like /(?<!FILE:)(\S+)/g but for my case this isn't good enough as it has one big drawback:
It creates another array element for the foreach loop which changes the output to:
FILE:abc LENGTH:123 AUTHOR:Bobby
LENGTH:123
AUTHOR:Bobby
FILE:xyz LENGTH:987 AUTHOR:Sabine
LENGTH:987
AUTHOR:Sabine
As the position of the columns isn't stable I can not just use an array range.
So, is it even possible to create a regex which ignores the FILE column but does not create an additional array element?
Background: I am giving you an extremely simplified version of my loop, please bear with me if my problem seem trivial. Sure, I can use an additional pattern-matching to filter out the unwanted column but it would turn a lot of things upside down later in the code. Also, I consider this an interesting programming exercise.
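For what it's worth, one regex that skips the FILE column without yielding an extra element (an illustration, not necessarily the only way): anchor each match to the start of a whitespace-separated token with (?<!\S), then veto FILE: with a negative lookahead:

```perl
use strict;
use warnings;

my @lines = (
    "FILE:abc LENGTH:123 AUTHOR:Bobby\n",
    "FILE:xyz LENGTH:987 AUTHOR:Sabine\n",
);

# (?<!\S) pins each match to the start of a whitespace-separated token,
# and (?!FILE:) then rejects the FILE column entirely, so no empty or
# partial element is ever captured.
for my $line (@lines) {
    print $line;
    foreach my $element ( $line =~ /(?<!\S)(?!FILE:)(\S+)/g ) {
        print "$element\n";
    }
}
```

On the sample lines this prints only the LENGTH and AUTHOR columns after each echoed line.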
Published by ScoobaQueef on Wednesday 07 August 2024 11:59
I have a large set of numbers, basically 1001 .. 150000 for a database using MySQL
There are a ton of gaps in the IDs on the database, so not all IDs exist. It can go from 10000 - 10500, then the next number will be 10675, and so forth.
I want to make the IDs shortened, such as 1001..3000, x, x, x, 55000..101000, etc.
I'm sure it's simple.
SELECT id FROM table_name WHERE data = x
give me above info.
I used
select group_concat(id) from items where id>1000
to get all ids in a comma-separated list. How do I shrink this to be more clean? Basically to add ".." to a series of sequential numbers.
I am using Perl, but I'm just not sure of the syntax to make it work.
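Once the IDs are in a Perl list, collapsing consecutive runs is a short loop. A sketch (the IDs and the ".." separator here are illustrative):

```perl
use strict;
use warnings;

# Collapse a sorted list of integers into "a..b" runs.
sub collapse_ranges {
    my @ids = sort { $a <=> $b } @_;
    my @ranges;
    while (@ids) {
        my $start = my $end = shift @ids;
        # extend the run while the next ID is consecutive
        $end = shift @ids while @ids && $ids[0] == $end + 1;
        push @ranges, $start == $end ? $start : "$start..$end";
    }
    return join ',', @ranges;
}

print collapse_ranges( 1001 .. 1005, 1007, 1010, 1011, 1015 ), "\n";
# prints "1001..1005,1007,1010..1011,1015"
```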
Published by /u/briandfoy on Wednesday 07 August 2024 11:32
Although this StackOverflow question about "islands and gaps" is titularly about Perl, the SQL answers are very nice. Apparently this is a FAQ for SQL.
However, this has bugged me for years on the CPAN side, but never enough to make me really do anything about it.
I thought there was a Perl module that did this, and it was in the context of a usenet reader that would take a list of article IDs, such as 1, 2, 3, 4, 5, 7, 10, 11, 15 and return something like 1-5,7,10-11,15 as a more space-efficient store of all the articles you had read.
Every time I've looked I've stopped after 15 minutes because I get distracted and I've never really needed this except to answer someone else's question. I'm not asking how to solve this because there are plenty of algorithm tutorials out there. Surely this is on CPAN somewhere.
There are plenty of options to go the other way and to ask if a number is in one of the ranges.
Published by /u/briandfoy on Wednesday 07 August 2024 11:31
Published by /u/briandfoy on Tuesday 06 August 2024 11:31
Published by laurent_r on Monday 05 August 2024 22:33
These are some answers to the Week 281, Task 1, of the Perl Weekly Challenge organized by Mohammad S. Anwar.
Spoiler Alert: This weekly challenge deadline is due in a few days from now (on August 11, 2024, at 23:59). This blog post provides some solutions to this challenge. Please don’t read on if you intend to complete the challenge on your own.
You are given coordinates, a string that represents the coordinates of a square of the chessboard as shown below:

Write a script to return true if the square is light, and false if the square is dark.
Example 1
Input: $coordinates = "d3"
Output: true
Example 2
Input: $coordinates = "g5"
Output: false
Example 3
Input: $coordinates = "e6"
Output: true
We could replace the abscissa letters with numbers from 1 to 8 (or 0 to 7), add the two values of the coordinates and check whether the sum is even or odd. But it is even simpler to assign 0 or 1 to a variable depending on whether the abscissa belongs to the [aceg] or [bdfh] character class. We then add this variable to the ordinate and check whether the sum is even or odd.
sub check-color ($in) {
    my ($abscissa, $ordinate) = $in.comb;
    my $code;
    given $abscissa {
        when /<[aceg]>/ { $code = 0 }
        when /<[bdfh]>/ { $code = 1 }
    }
    return True if ($code + $ordinate) %% 2;
    False;
}

for <a1 d3 g5 e6 h8> -> $coordinates {
    printf "%-2s => ", $coordinates;
    say check-color $coordinates;
}
This program displays the following output:
$ raku ./check-color.raku
a1 => False
d3 => True
g5 => False
e6 => True
h8 => False
This is a port to Perl of the above Raku program. Please refer to the previous section if you need explanations.
use strict;
use warnings;
use feature 'say';

sub check_color {
    my ($abscissa, $ordinate) = split //, shift;
    my $code = 1;
    $code = 0 if $abscissa =~ /[aceg]/;
    return "True" if ($code + $ordinate) % 2 == 0;
    return "False";
}

for my $coordinates (qw<a1 d3 g5 e6 h8>) {
    printf "%-2s => ", $coordinates;
    say check_color $coordinates;
}
This program displays the following output:
$ perl ./check-color.pl
a1 => False
d3 => True
g5 => False
e6 => True
h8 => False
The next week Perl Weekly Challenge will start soon. If you want to participate in this challenge, please check https://perlweeklychallenge.org/ and make sure you answer the challenge before 23:59 BST (British summer time) on August 18, 2024. And, please, also spread the word about the Perl Weekly Challenge if you can.
Published on Monday 05 August 2024 18:48
Published by Gabor Szabo on Monday 05 August 2024 02:47
Originally published at Perl Weekly 680
Hi there,
I know we still have 4 months before the Advent Calendar season kicks in, but I noticed the groundwork has already been started by Olaf Alders with the announcement of the Call for Papers for the 2024 Perl Advent Calendar. The CFP is open until midnight on Friday September 30th EST. I would like to request all Perl fans to share their fun encounters with Perl as articles for the upcoming Perl Advent Calendar 2024. I have already submitted my proposal and am keeping my fingers crossed. While talking about Advent Calendars, I would love to see the Dancer2 Advent Calendar come back in full force. If I remember correctly, we didn't have a complete calendar last year. Being a Dancer2 fan, I would really like it to get back to its peak. Dave Cross is one of those who use Dancer2 for building applications. In a recent post, he shared his experience of deploying Dancer apps. Even my personal website is a Dancer2 REST API application.
It's a bit late to share but better late than never. We have the monthly post about What's new on CPAN - June 2024 by Mathew Korica. Kudos for the effort and thanks for keeping the tradition alive.
Enjoy rest of the newsletter.
--
Your editor: Mohammad Sajid Anwar.
You are cordially invited to write an article for the 2024 Perl Advent Calendar. The CFP is open until midnight on Friday September 30th EST.
Video recordings
In this virtual event you will learn how to use Markdown and GitHub Pages to create a simple web site and then we'll extend our use of GitHub Actions to generate the site using Perl. Register now!
Please check out the monthly post about the cool collection of CPAN modules. Find out more about the new additions.
This is the second post in continuation to what was discussed in earlier post: Deploying Dancer Apps.
Regular updates from Perl Steering Council. Happy to see the progress report. Thank you all for your time and efforts.
An honest opinion and view about Perl. What do you think?
The Weekly Challenge by Mohammad Sajid Anwar will help you step out of your comfort-zone. You can even win prize money of $50 by participating in the weekly challenge. We pick one champion at the end of the month from among all of the contributors during the month, thanks to the sponsor Lance Wicks.
Welcome to a new week with a couple of fun tasks "Check Color" and "Knight's Move". If you are new to the weekly challenge then why not join us and have fun every week. For more information, please read the FAQ.
Enjoy a quick recap of last week's contributions by Team PWC dealing with the "Twice Appearance" and "Count Asterisks" tasks in Perl and Raku. You will find plenty of solutions to keep you busy.
Welcome back to blogging. Surprise to see the do { } for loop construct. Highly recommended.
Time for some clever regex in Perl and you end up with one-line. Cool work, thanks for sharing.
Powerful one-liners in Raku are in demand every week. Keep up the great work.
See how to keep the logic simple and still get an elegant solution. Incredible; keep up the great work.
Simply use split and hash, you end up with nice easy to follow solution. Smart and clever approach.
Well documented extended regex solutions, very brave indeed. Thanks for sharing knowledge with us.
Straightforward and to the point without any gimmicks. The short discussion is very handy too. Great work.
Great show of regex in Perl and Raku. Thanks for sharing knowledge with us.
Mix of Raku, Python, Java and PostgreSQL will keep you busy as always every week. Pleasure to see how implementation differs. Thanks for sharing.
Ideal use case for one-liner in Perl this week by master of one-liners. You really don't want to skip it.
Love the journey to get the desired output. Plenty to learn from the experience. Great work, keep it up.
I liked the Approach section where we get to see the finer details then implementation in varieties of languages. Highly recommended.
Interesting use of regex and we have pretty cool solutions in Perl. Thanks for sharing knowledge with us.
Simple use of hash in Perl is good enough for Twice Appearance task. Smart approach, well done.
Interesting use of a finite state machine with detailed discussion. Loved the use of Unicode characters. Keep up the great work.
Line-by-line discussion is very handy even if the language is alien. This week's choice of languages is Raku and Rust. Thanks for your contributions.
Great CPAN modules released last week.
August 13, 2024, Virtual event
August 15, 2024, in Zoom
August 14, 2024, Virtual event
October 26, 2024, in London, UK
You joined the Perl Weekly to get weekly e-mails about the Perl programming language and related topics.
Want to see more? See the archives of all the issues.
Not yet subscribed to the newsletter? Join us free of charge!
(C) Copyright Gabor Szabo
The articles are copyright the respective authors.
Published on Sunday 04 August 2024 16:57
The examples used here are from the weekly challenge problem statement and demonstrate the working solution.
You are given a string, $str, containing lowercase English letters only. Write a script to print the first letter that appears twice.
The complete solution is contained in one file that has a simple structure.
For this problem we do not need to include very much. We’re just specifying to use the current version of Perl, for all the latest features in the language. This fragment is also used in Part 2.
sub twice_appearance{
    my($s) = @_;
    my @a = ();
    do{
        $a[ord($_)]++;
        return $_ if $a[ord($_)] == 2;
    } for split //, $s;
    return undef;
}
◇
Fragment referenced in 1.
Now all we need are a few lines of code for running some tests.
MAIN:{
    say twice_appearance q/acbddbca/;
    say twice_appearance q/abccd/;
    say twice_appearance q/abcdabbb/;
}
◇
Fragment referenced in 1.
$ perl perl/ch-1.pl
d
c
a
You are given a string, $str, where every two consecutive vertical bars are grouped into a pair. Write a script to return the number of asterisks, *, excluding any between each pair of vertical bars.
This is our principal function. As can be seen, it’s very short! The logic here is simple: peel off pairs and use a regex to find the asterisks.
sub count_asterisks{
    my($s) = @_;
    my @asterisks = ();
    my @s = split /\|/, $s;
    {
        my $x = shift @s;
        my $y = shift @s;    # the segment between the pair of bars, discarded
        my @a = defined $x ? $x =~ m/(\*)/g : ();
        push @asterisks, @a if @a > 0;
        redo if @s >= 1;
    }
    return 0 + @asterisks;
}
◇
Fragment referenced in 5.
Finally, here are a few tests to confirm everything is working right.
MAIN:{
    say count_asterisks q/p|*e*rl|w**e|*ekly|/;
    say count_asterisks q/perl/;
    say count_asterisks q/th|ewe|e**|k|l***ych|alleng|e/;
}
◇
Fragment referenced in 5.
$ perl ch-2.pl
2
0
5
Published by Otmane on Sunday 04 August 2024 08:13
Ah, Perl. The programming language that refuses to die. In a world flooded with shiny new toys like Go, Kotlin, and Python, Perl remains the wise (and slightly eccentric) grandparent at the family reunion, clutching its cherished regular expressions and muttering about the good old days. But before you roll your eyes and dismiss Perl as a relic, let’s dive into why this ancient language is still relevant. Spoiler alert: It involves some serious magic and a whole lot of text processing.
The Power of Text Processing
Perl’s text processing capabilities are legendary. No, seriously, they are. Imagine a Swiss Army knife, but instead of blades and screwdrivers, it has regex patterns and string manipulation functions. While other languages are busy with their fancy syntax and clean code, Perl is out there in the trenches, getting the job done with a regex pattern that looks like someone’s cat walked across the keyboard.
Example: Log File Analysis
System admins rejoice! Perl can plow through log files like a hot knife through butter. Need to find all the error messages in a 10GB log file? Perl’s got your back.
#!/usr/bin/env perl
use strict;
use warnings;
my $log_file = 'system.log';
open my $fh, '<', $log_file or die "Cannot open $log_file: $!";
while (my $line = <$fh>) {
    if ($line =~ /ERROR/) {
        print $line;
    }
}
close $fh;
See? Easy peasy. While Python is off doing yoga and Go is busy with its minimalism, Perl is here, elbows deep in your log files, pulling out the gory details.
DevOps and Automation
In the age of DevOps, automation is king. And who better to automate your mundane, soul-crushing tasks than Perl? Forget spending hours on deployment. With Perl, you can sit back, relax, and watch the magic happen.
Example: Automated Deployment
Perl can automate deployments like a boss. Git pull? Check. Server configuration? Check. Deploy script? Double-check.
#!/usr/bin/env perl
use strict;
use warnings;
use Net::SSH::Perl;
my $host = 'example.com';
my $user = 'deploy';
my $password = 'secret';
my $ssh = Net::SSH::Perl->new($host);
$ssh->login($user, $password);
my $output = $ssh->cmd('cd /var/www/myapp && git pull origin master && ./deploy.sh');
print $output;
While Kotlin is busy figuring out its coroutines, Perl is out there making your life easier, one deployment at a time.
Web Development
Web development, you say? Surely, Perl can’t compete with the likes of JavaScript and Python, right? Wrong. Enter Mojolicious, the web framework that lets you whip up web apps faster than you can say “Node.js”.
Example: Rapid Prototyping with Mojolicious
With Mojolicious, you can have a web app up and running in no time. Minimal boilerplate, maximum fun.
#!/usr/bin/env perl
use Mojolicious::Lite;
get '/' => {text => 'Hello, World!'};
app->start;
Take that, React! While you’re setting up your endless dependencies, Perl just launched a web app. Boom.
Data Science and AI
Sure, Python has pandas, and R has... well, R. But did you know Perl has the Perl Data Language (PDL)? It’s like Perl decided to dabble in data science and accidentally became pretty good at it.
Example: Data Analysis with PDL
PDL handles large datasets with the grace of a ballerina on a sugar rush. Need to calculate the mean? Perl’s got you covered.
use PDL;
use PDL::NiceSlice;
my $data = pdl [1, 2, 3, 4, 5];
my $mean = $data->average;
print "Mean: $mean\n";
While Python is off publishing papers, Perl is quietly crunching numbers in the corner, getting stuff done.
System Administration
System administration is where Perl truly shines. It’s like Perl was born for this stuff. Need to manage user accounts or automate backups? Perl’s your guy.
Example: User Management Script
Perl scripts can handle system admin tasks with the finesse of a ninja.
#!/usr/bin/env perl
use strict;
use warnings;
my @users = qw(user1 user2 user3);
foreach my $user (@users) {
    system("useradd $user");
}
While Go is busy being statically typed, Perl is out there making your sysadmin tasks look easy.
Conclusion
In a world obsessed with the latest and greatest, Perl stands as a testament to the power of simplicity and raw functionality. It may not have the flashiest syntax or the trendiest features, but it gets the job done. Whether it’s text processing, automation, web development, data science, or system administration, Perl is the unsung hero, quietly working behind the scenes. So, next time you’re faced with a daunting task, remember: Perl is still here, and it’s ready to help.
Thank You
And let’s not forget to extend a heartfelt thank you to Larry Wall, the genius behind Perl, and all the alpha nerds and sysadmin ninjas who’ve kept this language not just alive, but thriving. Your dedication and wit have made the tech world a better (and funnier) place. Here’s to many more years of Perl wizardry!
Published by Unknown on Sunday 04 August 2024 08:54
Published by Unknown on Sunday 04 August 2024 08:51
Published by chrisarg on Friday 02 August 2024 05:01
At this point one may wonder how numba, the Python compiler for numpy-centric Python code, delivers a performance premium over numpy itself. To find out, let's inspect the timings individually for all the trigonometric functions (and yes, the exponential and the logarithm are trigonometric functions, if you recall your complex analysis lessons from high school!). The test is not relevant only to those who want to do high school trigonometry: physics engines, e.g. in games, use these functions, and machine learning and statistical calculations lean heavily on log and exp. So getting the trigonometric functions right is one small but important step towards implementing a variety of applications. The table below shows the timings for 50M in-place transformations:
| Function | Library | Execution Time |
|---|---|---|
| Sqrt | numpy | 1.02e-01 seconds |
| Log | numpy | 2.82e-01 seconds |
| Exp | numpy | 3.00e-01 seconds |
| Cos | numpy | 3.55e-01 seconds |
| Sin | numpy | 4.83e-01 seconds |
| Sin | numba | 1.05e-01 seconds |
| Sqrt | numba | 1.05e-01 seconds |
| Exp | numba | 1.27e-01 seconds |
| Log | numba | 1.47e-01 seconds |
| Cos | numba | 1.82e-01 seconds |
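For readers who want to reproduce the numpy rows above, a minimal timing harness might look like the following (a sketch: the array size is reduced from the article's 50M for brevity, and absolute timings depend on the machine):

```python
import time

import numpy as np

n = 1_000_000            # the article uses 50M; smaller here for a quick run
array = np.random.rand(n)

start = time.perf_counter()
np.sin(array, out=array)  # in-place transformation, as in the benchmarks
elapsed = time.perf_counter() - start
print(f"numpy in-place sin over {n} doubles: {elapsed:.2e} seconds")
```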
The table holds the first clue to the performance benefits: the square root, a function that has a dedicated SIMD instruction for vectorization, takes exactly the same time to execute in numba and numpy, while all the other functions are sped up by a factor of 2-2.5, indicating that the code either auto-vectorizes using SIMD or auto-threads. A second clue is provided by examining the difference in results between numba and numpy using the ULP (Unit in the Last Place). The ULP is a measure of accuracy in numerical calculations and can easily be computed for numpy arrays using the following Python function:
def compute_ulp_error(array1, array2):
    ## maxulp set to a very high number to avoid throwing an exception
    return np.testing.assert_array_max_ulp(array1, array2, maxulp=100000)
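As a quick sanity check of this helper (repeated below so the snippet is self-contained), nudging every element of an array to the next representable double should register exactly one ULP per element; the input arrays are made up for illustration:

```python
import numpy as np

def compute_ulp_error(array1, array2):
    ## maxulp set to a very high number to avoid throwing an exception
    return np.testing.assert_array_max_ulp(array1, array2, maxulp=100000)

a = np.array([1.0, 2.0, 3.0])
b = np.nextafter(a, np.inf)   # each element moved one representable step up
ulp = compute_ulp_error(a, b)
print(ulp)                    # each pairwise difference is exactly 1 ULP
```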
These numerical benchmarks indicate that the square root function executes essentially equivalent code in numpy and numba, while for all the other trigonometric functions the mean, median, 99.9th percentile and maximum ULP values over all 50M numbers differ. This is a subtle hint that SIMD is at play: vectorization slightly changes the semantics of floating point code by reassociating operations, and floating point operations are not associative.
| Function | Mean ULP | Median ULP | 99.9th ULP | Max ULP |
|---|---|---|---|---|
| Sqrt | 0.00e+00 | 0.00e+00 | 0.00e+00 | 0.00e+00 |
| Sin | 1.56e-03 | 0.00e+00 | 1.00e+00 | 1.00e+00 |
| Cos | 1.43e-03 | 0.00e+00 | 1.00e+00 | 1.00e+00 |
| Exp | 5.47e-03 | 0.00e+00 | 1.00e+00 | 2.00e+00 |
| Log | 1.09e-02 | 0.00e+00 | 2.00e+00 | 3.00e+00 |
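The non-associativity mentioned above is easy to demonstrate directly: reassociating a sum, which is exactly what a vectorizing compiler does when it accumulates across SIMD lanes, can change the last bits of the result.

```python
# Reordering floating point additions changes the rounding sequence,
# so the two sums below differ even though real arithmetic says they
# are equal; the discrepancy is on the order of one ULP of the result.
left = (0.1 + 0.2) + 0.3
right = 0.1 + (0.2 + 0.3)
print(left == right)   # False
print(left - right)
```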
Finally, we can inspect the code of the numba generated functions for vectorized assembly instructions as detailed here, using the code below:
@njit(nogil=True, fastmath=False, cache=True)
def compute_sqrt_with_numba(array):
    np.sqrt(array, array)

@njit(nogil=True, fastmath=False, cache=True)
def compute_sin_with_numba(array):
    np.sin(array, array)

@njit(nogil=True, fastmath=False, cache=True)
def compute_cos_with_numba(array):
    np.cos(array, array)

@njit(nogil=True, fastmath=False, cache=True)
def compute_exp_with_numba(array):
    np.exp(array, array)

@njit(nogil=True, fastmath=False, cache=True)
def compute_log_with_numba(array):
    np.log(array, array)
## check for vectorization
## code lifted from https://tbetcke.github.io/hpc_lecture_notes/simd.html
def find_instr(func, keyword, sig, limit=5):
    count = 0
    for l in func.inspect_asm(func.signatures[sig]).split("\n"):
        if keyword in l:
            count += 1
            print(l)
            if count >= limit:
                break
    if count == 0:
        print("No instructions found")
# Compile the functions to avoid the overhead of the first call
compute_sqrt_with_numba(np.array([np.random.rand() for _ in range(1, 6)]))
compute_sin_with_numba(np.array([np.random.rand() for _ in range(1, 6)]))
compute_exp_with_numba(np.array([np.random.rand() for _ in range(1, 6)]))
compute_cos_with_numba(np.array([np.random.rand() for _ in range(1, 6)]))
compute_log_with_numba(np.array([np.random.rand() for _ in range(1, 6)]))
And this is how we probe for the presence of the vmovups instruction, which indicates that the YMM AVX2 registers are being used in the calculations:
print("sqrt")
find_instr(compute_sqrt_with_numba, keyword="vsqrtsd", sig=0)
print("\n\n")
print("sin")
find_instr(compute_sin_with_numba, keyword="vmovups", sig=0)
As we can see in the output below, the square root function uses the AVX square root instruction vsqrtsd, while sin (and all the other trigonometric functions) use SIMD instructions to access memory.
sqrt
vsqrtsd %xmm0, %xmm0, %xmm0
vsqrtsd %xmm0, %xmm0, %xmm0
sin
vmovups (%r12,%rsi,8), %ymm0
vmovups 32(%r12,%rsi,8), %ymm8
vmovups 64(%r12,%rsi,8), %ymm9
vmovups 96(%r12,%rsi,8), %ymm10
vmovups %ymm11, (%r12,%rsi,8)
Let's shift attention to Perl and C now (after all, this is a Perl blog!). In Part I we saw that PDL and C gave similar performance when evaluating the nested function cos(sin(sqrt(x))), and in Part II that the single-threaded Perl code was as fast as numba. But what about the individual trigonometric functions in PDL and C without the Inline module? The C code block that will be evaluated in this case is:
#pragma omp for simd
for (int i = 0; i < array_size; i++) {
    double x = array[i];
    array[i] = foo(x);
}
where foo is one of sqrt, sin, cos, log, exp. We will use the omp simd pragma in conjunction with the following compilation flags and the gcc compiler to see if we can get the compiler to pick up the hint and auto-vectorize the generated machine code using SIMD instructions.
CC = gcc
CFLAGS = -O3 -ftree-vectorize -march=native -mtune=native -Wall -std=gnu11 -fopenmp -fstrict-aliasing -fopt-info-vec-optimized -fopt-info-vec-missed
LDFLAGS = -fPIE -fopenmp
LIBS = -lm
During compilation, gcc informs us about all the wonderful missed opportunities to optimize the loop. The performance table below also demonstrates this: note that the standard C implementation and PDL are equivalent, and equal in performance to numba. Perl, through PDL, can deliver performance in our data science world.
| Function | Library | Execution Time |
|---|---|---|
| Sqrt | PDL | 1.11e-01 seconds |
| Log | PDL | 2.73e-01 seconds |
| Exp | PDL | 3.10e-01 seconds |
| Cos | PDL | 3.54e-01 seconds |
| Sin | PDL | 4.75e-01 seconds |
| Sqrt | C | 1.23e-01 seconds |
| Log | C | 2.87e-01 seconds |
| Exp | C | 3.19e-01 seconds |
| Cos | C | 3.57e-01 seconds |
| Sin | C | 4.96e-01 seconds |
To get the compiler to use SIMD, we replace the -O3 flag with -Ofast, and all these wonderful opportunities for performance are no longer missed; the C code now delivers (with the usual caveats that apply to the -Ofast flag).
| Function | Library | Execution Time |
|---|---|---|
| Sqrt | C - Ofast | 1.00e-01 seconds |
| Sin | C - Ofast | 9.89e-02 seconds |
| Cos | C - Ofast | 1.05e-01 seconds |
| Exp | C - Ofast | 8.40e-02 seconds |
| Log | C - Ofast | 1.04e-01 seconds |
With these benchmarks, let's return to our initial Perl benchmarks and contrast the timings obtained with the non-SIMD-aware invocation of the Inline C code:
use Inline (
    C           => 'DATA',
    build_noisy => 1,
    with        => qw/Alien::OpenMP/,
    optimize    => '-O3 -march=native -mtune=native',
    libs        => '-lm'
);
and the SIMD-aware one (in the code below, one has to include the vectorized version of the math library for the code to compile):
use Inline (
    C           => 'DATA',
    build_noisy => 1,
    with        => qw/Alien::OpenMP/,
    optimize    => '-Ofast -march=native -mtune=native',
    libs        => '-lmvec'
);
The non-vectorized version of the code yields the following table:
Inplace in base Python took 11.9 seconds
Inplace in PythonJoblib took 4.42 seconds
Inplace in Perl took 2.88 seconds
Inplace in Perl/mapCseq took 1.60 seconds
Inplace in Perl/mapC took 1.50 seconds
C array in C took 1.42 seconds
Vector in Base R took 1.30 seconds
C array in Perl/C/seq took 1.17 seconds
Inplace in PDL - ST took 0.94 seconds
Inplace in Python Numpy took 0.93 seconds
Inplace in Python Numba took 0.49 seconds
Inplace in Perl/C/OMP took 0.24 seconds
C array in C with OMP took 0.22 seconds
C array in C/OMP/seq took 0.18 seconds
Inplace in PDL - MT took 0.16 seconds
while the vectorized one yields this:
Inplace in base Python took 11.9 seconds
Inplace in PythonJoblib took 4.42 seconds
Inplace in Perl took 2.94 seconds
Inplace in Perl/mapCseq took 1.59 seconds
Inplace in Perl/mapC took 1.48 seconds
Vector in Base R took 1.30 seconds
Inplace in PDL - ST took 0.96 seconds
Inplace in Python Numpy took 0.93 seconds
Inplace in Python Numba took 0.49 seconds
C array in Perl/C/seq took 0.30 seconds
C array in C took 0.26 seconds
Inplace in Perl/C/OMP took 0.24 seconds
C array in C with OMP took 0.23 seconds
C array in C/OMP/seq took 0.19 seconds
Inplace in PDL - MT took 0.17 seconds
To facilitate comparisons against the various flavors of Python and R, we inserted the results presented previously into these two tables.
The take-home points (some of which may be somewhat surprising) are:
These observations generate the following big-picture questions:
Published by chrisarg on Friday 02 August 2024 04:58
The Quest for Performance Part III: C Force
In the two prior installments of this series, we considered the performance of floating point operations in Perl, Python and R in a toy example that computed the function cos(sin(sqrt(x))), where x was a very large array of 50M double precision floating point numbers. Hybrid implementations that delegated the arithmetic-intensive part to C were among the most performant. In this installment, we will digress slightly and look at the performance of a pure C implementation of the toy example.
The C code will provide further insights about the importance of memory locality for performance (by default, elements in a C array are stored in sequential addresses in memory, and numerical APIs such as PDL or numpy interface with such containers) vis-a-vis containers such as Perl arrays, which do not store their values in sequential addresses in memory. Last, but certainly not least, the C implementations will allow us to assess whether flags related to floating point operations for the low-level compiler (in this case gcc) can affect performance.
This point is worth emphasizing: common mortals are entirely dependent on the choice of compiler flags when "pip"-ing their "install" or building their Inline file. If one does not touch these flags, one will be blissfully unaware of what one may be missing, or of the pitfalls one may be avoiding. The humble makefile of a C project allows one to make such performance evaluations explicit.
The C code for our toy example is listed in its entirety below. The code is rather self-explanatory, so we will not spend time explaining it, other than pointing out that it contains four functions for modifying the array in place: a non-sequential version that computes the nested function in one pass, a sequential version that applies sqrt, sin and cos in three separate passes, and an OpenMP counterpart of each.
In this case, one may hope that the compiler is smart enough to recognize that the square root maps to packed (vectorized) floating point operations in assembly, so that one function can be vectorized using the appropriate SIMD instructions (note that we did not use the simd directive for the OpenMP code).
Perhaps the speedup from the vectorization may offset the loss of performance from repeatedly accessing the same memory locations (or not).
#include <stdlib.h>
#include <string.h>
#include <math.h>
#include <stdio.h>
#include <omp.h>
// simulates a large array of random numbers
double* simulate_array(int num_of_elements,int seed);
// OMP environment functions
void _set_openmp_schedule_from_env();
void _set_num_threads_from_env();
// functions to modify C arrays
void map_c_array(double* array, int len);
void map_c_array_sequential(double* array, int len);
void map_C_array_using_OMP(double* array, int len);
void map_C_array_sequential_using_OMP(double* array, int len);
int main(int argc, char *argv[]) {
    if (argc != 2) {
        printf("Usage: %s <array_size>\n", argv[0]);
        return 1;
    }
    int array_size = atoi(argv[1]);
    // print the array size
    printf("Array size: %d\n", array_size);
    double *array = simulate_array(array_size, 1234);

    // Set OMP environment
    _set_openmp_schedule_from_env();
    _set_num_threads_from_env();

    // Perform calculations and collect timing data
    double start_time, end_time, elapsed_time;

    // Non-sequential calculation
    start_time = omp_get_wtime();
    map_c_array(array, array_size);
    end_time = omp_get_wtime();
    elapsed_time = end_time - start_time;
    printf("Non-sequential calculation time: %f seconds\n", elapsed_time);
    free(array);

    // Sequential calculation
    array = simulate_array(array_size, 1234);
    start_time = omp_get_wtime();
    map_c_array_sequential(array, array_size);
    end_time = omp_get_wtime();
    elapsed_time = end_time - start_time;
    printf("Sequential calculation time: %f seconds\n", elapsed_time);
    free(array);

    // Parallel calculation using OMP
    array = simulate_array(array_size, 1234);
    start_time = omp_get_wtime();
    map_C_array_using_OMP(array, array_size);
    end_time = omp_get_wtime();
    elapsed_time = end_time - start_time;
    printf("Parallel calculation using OMP time: %f seconds\n", elapsed_time);
    free(array);

    // Sequential calculation using OMP
    array = simulate_array(array_size, 1234);
    start_time = omp_get_wtime();
    map_C_array_sequential_using_OMP(array, array_size);
    end_time = omp_get_wtime();
    elapsed_time = end_time - start_time;
    printf("Sequential calculation using OMP time: %f seconds\n", elapsed_time);
    free(array);

    return 0;
}
/*
*******************************************************************************
* OMP environment functions
*******************************************************************************
*/
void _set_openmp_schedule_from_env() {
    char *schedule_env = getenv("OMP_SCHEDULE");
    printf("Schedule from env %s\n", getenv("OMP_SCHEDULE"));
    if (schedule_env != NULL) {
        char *kind_str = strtok(schedule_env, ",");
        char *chunk_size_str = strtok(NULL, ",");
        omp_sched_t kind;
        if (strcmp(kind_str, "static") == 0) {
            kind = omp_sched_static;
        } else if (strcmp(kind_str, "dynamic") == 0) {
            kind = omp_sched_dynamic;
        } else if (strcmp(kind_str, "guided") == 0) {
            kind = omp_sched_guided;
        } else {
            kind = omp_sched_auto;
        }
        int chunk_size = atoi(chunk_size_str);
        omp_set_schedule(kind, chunk_size);
    }
}

void _set_num_threads_from_env() {
    char *num = getenv("OMP_NUM_THREADS");
    printf("Number of threads = %s from within C\n", num);
    omp_set_num_threads(atoi(num));
}
/*
*******************************************************************************
* Functions that modify C arrays whose address is passed from Perl in C
*******************************************************************************
*/
double* simulate_array(int num_of_elements, int seed) {
    srand(seed); // Seed the random number generator
    double *array = (double *)malloc(num_of_elements * sizeof(double));
    for (int i = 0; i < num_of_elements; i++) {
        // Generate a random double between 0 and 1
        array[i] = (double)rand() / RAND_MAX;
    }
    return array;
}

void map_c_array(double *array, int len) {
    for (int i = 0; i < len; i++) {
        array[i] = cos(sin(sqrt(array[i])));
    }
}

void map_c_array_sequential(double* array, int len) {
    for (int i = 0; i < len; i++) {
        array[i] = sqrt(array[i]);
    }
    for (int i = 0; i < len; i++) {
        array[i] = sin(array[i]);
    }
    for (int i = 0; i < len; i++) {
        array[i] = cos(array[i]);
    }
}

void map_C_array_using_OMP(double* array, int len) {
    #pragma omp parallel
    {
        #pragma omp for schedule(runtime) nowait
        for (int i = 0; i < len; i++) {
            array[i] = cos(sin(sqrt(array[i])));
        }
    }
}

void map_C_array_sequential_using_OMP(double* array, int len) {
    #pragma omp parallel
    {
        #pragma omp for schedule(runtime) nowait
        for (int i = 0; i < len; i++) {
            array[i] = sqrt(array[i]);
        }
        #pragma omp for schedule(runtime) nowait
        for (int i = 0; i < len; i++) {
            array[i] = sin(array[i]);
        }
        #pragma omp for schedule(runtime) nowait
        for (int i = 0; i < len; i++) {
            array[i] = cos(array[i]);
        }
    }
}
A critical question is whether the use of fast-math compiler flags, a trick that trades accuracy for speed, can affect performance.
Here is the makefile without this compiler flag:
CC = gcc
CFLAGS = -O3 -ftree-vectorize -march=native -Wall -std=gnu11 -fopenmp -fstrict-aliasing
LDFLAGS = -fPIE -fopenmp
LIBS = -lm
SOURCES = inplace_array_mod_with_OpenMP.c
OBJECTS = $(SOURCES:.c=_noffmath_gcc.o)
EXECUTABLE = inplace_array_mod_with_OpenMP_noffmath_gcc

all: $(SOURCES) $(EXECUTABLE)

clean:
	rm -f $(OBJECTS) $(EXECUTABLE)

$(EXECUTABLE): $(OBJECTS)
	$(CC) $(LDFLAGS) $(OBJECTS) $(LIBS) -o $@

%_noffmath_gcc.o : %.c
	$(CC) $(CFLAGS) -c $< -o $@
and here is the one with this flag:
CC = gcc
CFLAGS = -O3 -ftree-vectorize -march=native -Wall -std=gnu11 -fopenmp -fstrict-aliasing -ffast-math
LDFLAGS = -fPIE -fopenmp
LIBS = -lm
SOURCES = inplace_array_mod_with_OpenMP.c
OBJECTS = $(SOURCES:.c=_gcc.o)
EXECUTABLE = inplace_array_mod_with_OpenMP_gcc

all: $(SOURCES) $(EXECUTABLE)

clean:
	rm -f $(OBJECTS) $(EXECUTABLE)

$(EXECUTABLE): $(OBJECTS)
	$(CC) $(LDFLAGS) $(OBJECTS) $(LIBS) -o $@

%_gcc.o : %.c
	$(CC) $(CFLAGS) -c $< -o $@
And here are the results of running these two programs:
OMP_SCHEDULE=guided,1 OMP_NUM_THREADS=8 ./inplace_array_mod_with_OpenMP_noffmath_gcc 50000000
Array size: 50000000
Schedule from env guided,1
Number of threads = 8 from within C
Non-sequential calculation time: 1.12 seconds
Sequential calculation time: 0.95 seconds
Parallel calculation using OMP time: 0.17 seconds
Sequential calculation using OMP time: 0.15 seconds
OMP_SCHEDULE=guided,1 OMP_NUM_THREADS=8 ./inplace_array_mod_with_OpenMP_gcc 50000000
Array size: 50000000
Schedule from env guided,1
Number of threads = 8 from within C
Non-sequential calculation time: 0.27 seconds
Sequential calculation time: 0.28 seconds
Parallel calculation using OMP time: 0.05 seconds
Sequential calculation using OMP time: 0.06 seconds
Note that one can use fastmath in Numba code as follows (the default is fastmath=False):
@njit(nogil=True, fastmath=True)
def compute_inplace_with_numba(array):
    np.sqrt(array, array)
    np.sin(array, array)
    np.cos(array, array)
A few points that are worth noting:
title: " The Quest for Performance Part III : C Force "
In the two prior installments of this series, we considered the performance of floating operations in Perl,
Python and R in a toy example that computed the function cos(sin(sqrt(x))), where x was a very large array of 50M double precision floating numbers.
Hybrid implementations that delegated the arithmetic intensive part to C were among the most performant implementations. In this installment, we will digress slightly and look at the performance of a pure C code implementation of the toy example.
The C code will provide further insights about the importance of memory locality for performance (by default elements in a C array are stored in sequential addresses in memory, and numerical APIs such as PDL or numpy interface with such containers) vis-a-vis containers,
e.g. Perl arrays which do not store their values in sequential addresses in memory. Last, but certainly not least, the C code implementations will allow us to assess whether flags related to floating point operations for the low level compiler (in this case gcc) can affect performance.
This point is worth emphasizing: common mortals are entirely dependent on the choice of compiler flags when "piping" their "install" or building their Inline file. If one does not touch these flags, then one will be blissfully unaware of what they may missing, or pitfalls they may be avoiding.
The humble C file makefile allows one to make such performance evaluations explicitly.
The C code for our toy example is listed in its entirety below. The code is rather self-explanatory, so will not spend time explaining other than pointing out that it contains four functions for
In this case, one may hope that the compiler is smart enough to recognize that the square root maps to packed (vectorized) floating pointing operations in assembly, so that one function can be vectorized using the appropriate SIMD instructions (note we did not use the simd program for the OpenMP codes).
Perhaps the speedup from the vectorization may offset the loss of performance from repeatedly accessing the same memory locations (or not).
#include <stdlib.h>
#include <string.h>
#include <math.h>
#include <stdio.h>
#include <omp.h>
// simulates a large array of random numbers
double* simulate_array(int num_of_elements,int seed);
// OMP environment functions
void _set_openmp_schedule_from_env();
void _set_num_threads_from_env();
// functions to modify C arrays
void map_c_array(double* array, int len);
void map_c_array_sequential(double* array, int len);
void map_C_array_using_OMP(double* array, int len);
void map_C_array_sequential_using_OMP(double* array, int len);
int main(int argc, char *argv[]) {
if (argc != 2) {
printf("Usage: %s <array_size>\n", argv[0]);
return 1;
}
int array_size = atoi(argv[1]);
// printf the array size
printf("Array size: %d\n", array_size);
double *array = simulate_array(array_size, 1234);
// Set OMP environment
_set_openmp_schedule_from_env();
_set_num_threads_from_env();
// Perform calculations and collect timing data
double start_time, end_time, elapsed_time;
// Non-Sequential calculation
start_time = omp_get_wtime();
map_c_array(array, array_size);
end_time = omp_get_wtime();
elapsed_time = end_time - start_time;
printf("Non-sequential calculation time: %f seconds\n", elapsed_time);
free(array);
// Sequential calculation
array = simulate_array(array_size, 1234);
start_time = omp_get_wtime();
map_c_array_sequential(array, array_size);
end_time = omp_get_wtime();
elapsed_time = end_time - start_time;
printf("Sequential calculation time: %f seconds\n", elapsed_time);
free(array);
array = simulate_array(array_size, 1234);
// Parallel calculation using OMP
start_time = omp_get_wtime();
map_C_array_using_OMP(array, array_size);
end_time = omp_get_wtime();
elapsed_time = end_time - start_time;
printf("Parallel calculation using OMP time: %f seconds\n", elapsed_time);
free(array);
// Sequential calculation using OMP
array = simulate_array(array_size, 1234);
start_time = omp_get_wtime();
map_C_array_sequential_using_OMP(array, array_size);
end_time = omp_get_wtime();
elapsed_time = end_time - start_time;
printf("Sequential calculation using OMP time: %f seconds\n", elapsed_time);
free(array);
return 0;
}
/*
*******************************************************************************
* OMP environment functions
*******************************************************************************
*/
void _set_openmp_schedule_from_env() {
char *schedule_env = getenv("OMP_SCHEDULE");
printf("Schedule from env %s\n", getenv("OMP_SCHEDULE"));
if (schedule_env != NULL) {
char *kind_str = strtok(schedule_env, ",");
char *chunk_size_str = strtok(NULL, ",");
omp_sched_t kind;
if (strcmp(kind_str, "static") == 0) {
kind = omp_sched_static;
} else if (strcmp(kind_str, "dynamic") == 0) {
kind = omp_sched_dynamic;
} else if (strcmp(kind_str, "guided") == 0) {
kind = omp_sched_guided;
} else {
kind = omp_sched_auto;
}
int chunk_size = atoi(chunk_size_str);
omp_set_schedule(kind, chunk_size);
}
}
void _set_num_threads_from_env() {
char *num = getenv("OMP_NUM_THREADS");
printf("Number of threads = %s from within C\n", num);
omp_set_num_threads(atoi(num));
}
/*
*******************************************************************************
* Functions that modify C arrays whose address is passed from Perl in C
*******************************************************************************
*/
double* simulate_array(int num_of_elements, int seed) {
srand(seed); // Seed the random number generator
double *array = (double *)malloc(num_of_elements * sizeof(double));
for (int i = 0; i < num_of_elements; i++) {
array[i] =
(double)rand() / RAND_MAX; // Generate a random double between 0 and 1
}
return array;
}
void map_c_array(double *array, int len) {
for (int i = 0; i < len; i++) {
array[i] = cos(sin(sqrt(array[i])));
}
}
void map_c_array_sequential(double* array, int len) {
for (int i = 0; i < len; i++) {
array[i] = sqrt(array[i]);
}
for (int i = 0; i < len; i++) {
array[i] = sin(array[i]);
}
for (int i = 0; i < len; i++) {
array[i] = cos(array[i]);
}
}
void map_C_array_using_OMP(double* array, int len) {
#pragma omp parallel
{
#pragma omp for schedule(runtime) nowait
for (int i = 0; i < len; i++) {
array[i] = cos(sin(sqrt(array[i])));
}
}
}
void map_C_array_sequential_using_OMP(double* array, int len) {
#pragma omp parallel
{
#pragma omp for schedule(runtime) nowait
for (int i = 0; i < len; i++) {
array[i] = sqrt(array[i]);
}
#pragma omp for schedule(runtime) nowait
for (int i = 0; i < len; i++) {
array[i] = sin(array[i]);
}
#pragma omp for schedule(runtime) nowait
for (int i = 0; i < len; i++) {
array[i] = cos(array[i]);
}
}
}
A critical question is whether the use of fast floating compiler flags, a trick that trades speed for accuracy of the code, can affect performance.
Here is the makefile withut this compiler flag
CC = gcc
CFLAGS = -O3 -ftree-vectorize -march=native -Wall -std=gnu11 -fopenmp -fstrict-aliasing
LDFLAGS = -fPIE -fopenmp
LIBS = -lm
SOURCES = inplace_array_mod_with_OpenMP.c
OBJECTS = $(SOURCES:.c=_noffmath_gcc.o)
EXECUTABLE = inplace_array_mod_with_OpenMP_noffmath_gcc
all: $(SOURCES) $(EXECUTABLE)
clean:
rm -f $(OBJECTS) $(EXECUTABLE)
$(EXECUTABLE): $(OBJECTS)
$(CC) $(LDFLAGS) $(OBJECTS) $(LIBS) -o $@
%_noffmath_gcc.o : %.c
$(CC) $(CFLAGS) -c $< -o $@
and here is the one with this flag:
CC = gcc
CFLAGS = -O3 -ftree-vectorize -march=native -Wall -std=gnu11 -fopenmp -fstrict-aliasing -ffast-math
LDFLAGS = -fPIE -fopenmp
LIBS = -lm
SOURCES = inplace_array_mod_with_OpenMP.c
OBJECTS = $(SOURCES:.c=_gcc.o)
EXECUTABLE = inplace_array_mod_with_OpenMP_gcc
all: $(SOURCES) $(EXECUTABLE)
clean:
rm -f $(OBJECTS) $(EXECUTABLE)
$(EXECUTABLE): $(OBJECTS)
$(CC) $(LDFLAGS) $(OBJECTS) $(LIBS) -o $@
%_gcc.o : %.c
$(CC) $(CFLAGS) -c $< -o $@
And here are the results of running these two programs
OMP_SCHEDULE=guided,1 OMP_NUM_THREADS=8 ./inplace_array_mod_with_OpenMP_noffmath_gcc 50000000
Array size: 50000000
Schedule from env guided,1
Number of threads = 8 from within C
Non-sequential calculation time: 1.12 seconds
Sequential calculation time: 0.95 seconds
Parallel calculation using OMP time: 0.17 seconds
Sequential calculation using OMP time: 0.15 seconds
OMP_SCHEDULE=guided,1 OMP_NUM_THREADS=8 ./inplace_array_mod_with_OpenMP_gcc 50000000
Array size: 50000000
Schedule from env guided,1
Number of threads = 8 from within C
Non-sequential calculation time: 0.27 seconds
Sequential calculation time: 0.28 seconds
Parallel calculation using OMP time: 0.05 seconds
Sequential calculation using OMP time: 0.06 seconds
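The speedups above come at a price: -ffast-math licenses the compiler to reorder and reassociate floating-point operations, which can change numerical results. A quick illustration (in Python for convenience, but the effect is language-independent) of why reassociation is not value-preserving:

```python
# Floating-point addition is not associative, which is why flags that
# allow the compiler to reassociate operations (like -ffast-math) can
# change results.
a, b, c = 1e16, -1e16, 1.0

left = (a + b) + c   # (1e16 - 1e16) + 1.0 -> 1.0
right = a + (b + c)  # the 1.0 is absorbed: -1e16 + 1.0 rounds to -1e16
print(left, right)   # 1.0 0.0
```

Whether a particular kernel tolerates this depends on its numerics; for the sqrt/sin/cos pipeline here, the relaxed math is mostly harmless, but it is worth checking.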
Note that one can enable fastmath in Numba code as follows (the default is fastmath=False):
import numpy as np
from numba import njit

@njit(nogil=True, fastmath=True)
def compute_inplace_with_numba(array):
    np.sqrt(array, array)
    np.sin(array, array)
    np.cos(array, array)
A few points that are worth noting:
Published by Andrew Baerg on Friday 02 August 2024 00:00
Next Generation of Perl
I attended The Perl and Raku Conference in Las Vegas, NV, which took place June 25–28, 2024. It was HOT outside (over 40 °C/110 °F) but we stayed cool inside at the Alexis Park Resort.
Curtis Poe (Ovid) got things started with the keynote encouraging us to Party Like It’s 19100+e^iπ, and reminded us that Vegas is lexically scoped (what happens in Vegas stays in Vegas)! More importantly he reminded us that Perl is about people, not just the technology. The Perl community has been meeting all over the world since 1999, with this being the 25th anniversary of the first The Perl Conference (aka YAPC::NA).
Ovid Keynote
Meeting in person with people who you interact with primarily through digital channels, code commits, and MetaCPAN documentation really highlighted the importance of the community. On the first day, I messed up timezones, showed up an hour before registration opened, and witnessed the conference organizers and core members arrive and greet each other with hugs. I also enjoyed visiting with one of the very welcoming board members of The Perl and Raku Foundation (TPRF).
Many of the speakers and attendees put a “Hallway++” sticker on their badge which simply meant “talk to me, I’m here to meet and get to know people”. At breakfast one morning, I had the privilege of sitting with Jason Crome, the core maintainer of Dancer, a framework that I have used extensively. It was amazing to be able to pick the brain of one of the people who has intimate knowledge of the software.
Dancing with Cromedome
The Perl community is large and diverse, which is reflected in the Science Perl Committee and the all-Perl Koha Library Software in use at over 4,000 libraries and with its own annual conference. It was cool to hear about the Glue Photo Project, making algorithmic music, and gaming with the TinyNES.
Every community will experience conflict and this one is no different. The impact of Sawyer’s resignation at TPRC 2023 could be felt at this conference, and in response the community is focused on making things better with new standards of conduct.
It wouldn’t be a conference in 2024 without talk of AI. We had some Musings on Generative AI and an introduction to PerlGPT, A Code Llama LLM Fine-Tuned For Perl.
There were, of course, a lot of talks about actual Perl code!
One of the things I enjoy about attending conferences is discovering things that I wasn’t looking for. Chad Granum gave a lightning talk on goto::file and I noticed the use of line directives which can be extremely helpful in debugging evaled code. For example, let’s say you are evaling subs into a hashref and then calling them like so:
my $sub1 = "sub {\n print 'foo';\n print 'bar';\n print 'baz';\n}";
my $sub2 = "sub {\n print 'foo';\n print 'bar';\n warn 'baz';\n}";
my $Sub = {
sub1 => eval $sub1,
sub2 => eval $sub2,
};
$Sub->{sub1}->();
$Sub->{sub2}->();
Any warn or die output will give you a line number, but no context as to which sub it originated from:
baz at (eval 2) line 4.
Making use of a line directive when evaling the subs like this:
my $sub1 = "sub {\n print 'foo';\n print 'bar';\n print 'baz';\n}";
my $sub2 = "sub {\n print 'foo';\n print 'bar';\n warn 'baz';\n}";
my $Sub = {
sub1 => eval qq(#line 1 "sub1"\n$sub1),
sub2 => eval qq(#line 1 "sub2"\n$sub2),
};
$Sub->{sub1}->();
$Sub->{sub2}->();
Will now result in a much more friendly:
baz at sub2 line 4.
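Python offers the same facility through the filename argument of its built-in compile(): tag each dynamically built snippet with a name and tracebacks will report it, just as the #line directive does for evaled Perl. A small illustrative sketch (the sub1/sub2 names mirror the example above):

```python
# Python analogue of Perl's #line directive: compile() lets you attach
# a synthetic "filename" to dynamically built code, so errors report
# which snippet they came from instead of an anonymous "<string>".
import traceback

snippets = {
    "sub1": "print('foo')\nprint('bar')\nprint('baz')",
    "sub2": "print('foo')\nprint('bar')\nraise RuntimeError('baz')",
}

# Compile each snippet with its name as the pseudo-filename.
compiled = {name: compile(src, name, "exec") for name, src in snippets.items()}

try:
    exec(compiled["sub2"], {})
except RuntimeError as exc:
    frame = traceback.extract_tb(exc.__traceback__)[-1]
    print(f"{exc} at {frame.filename} line {frame.lineno}")
    # -> baz at sub2 line 3
```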
And in true open source fashion, this has been turned into a pull request for Interchange!
The Las Vegas Sphere
I always enjoy hearing from others in the community about CPAN modules that are in their toolbox. I learned about DBIx::QuickDB, which you can use to spin up a database server on the fly, which removes the need for a running database server for tests, and also enables running concurrent tests which require a database server that would otherwise conflict. Combine this with a DBIx::Class schema and DBIx::Class::Fixtures and you have a very nice way to run some tests against fixed data:
use DBIx::QuickDB PSQL_DB => {driver => 'PostgreSQL'};
my $dbh = PSQL_DB->connect;
$schema = Some::Schema->connect( sub { $dbh }, { on_connect_do => ["..."] } );
$schema->deploy();
my $fixtures = DBIx::Class::Fixtures->new({ config_dir => '...' });
$fixtures->populate({ no_deploy => 1, schema => $schema, directory => '...' });
ok($schema->resultset('Foo')->count >= 1, 'database populated');
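The "throwaway database per test run" idea exists in most stacks. DBIx::QuickDB gives you a real PostgreSQL server; for comparison, here is the analogous pattern in a minimal Python sketch using an in-memory SQLite database (standard library only, and deliberately much less capable):

```python
# A throwaway database per test run: create it, deploy a schema, load
# fixture rows, assert against them, and let it vanish with the process.
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE foo (id INTEGER PRIMARY KEY, name TEXT)")
db.executemany("INSERT INTO foo (name) VALUES (?)", [("a",), ("b",)])

(count,) = db.execute("SELECT COUNT(*) FROM foo").fetchone()
assert count >= 1, "database populated"
print("ok - database populated")
```

Because nothing touches a shared server, tests like this can run concurrently without conflicting, which is the same property DBIx::QuickDB buys you for a real database engine.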
Damian Conway gave a keynote on The Once and Future Perl, showing how far Perl has come as a language and how its rich history can be leveraged into the future: “if you can envisage what you could have done better in the past, then you can probably think of ways to make the future brighter!”.
He showed off his new Multi::Dispatch module which you can use now to write incredibly extensible (and beautiful) code. Here it is in action with a simple Data::Dumper clone in 5 lines of code:
use v5.26;
use Multi::Dispatch;
multi dd :before :where(VOID) (@data) { say &next::variant }
multi dd ($k, $v) { dd($k) . ' => ' . dd($v) }
multi dd ($data :where(ARRAY)) { '[' . join(', ', map {dd($_)} @$data) . ']' }
multi dd ($data :where(HASH)) { '{' . join(', ', map {dd($_, $data->{$_})} keys %$data) . '}' }
multi dd ($data) { '"' . quotemeta($data) . '"' }
say dd ['foo', { bar => "baz" }];
With Smartmatch (given/when) scheduled for deprecation in 5.42 he has written a drop-in replacement that uses Multi::Dispatch: Switch::Right, which addresses the issues with the original implementation that was in core.
Also on display was the new class syntax introduced into core in 5.38 with use feature 'class'. Here’s an example of how it looks:
use v5.40;
use feature 'class';
class Point {
field $x :param = 0;
field $y :param = 0;
method describe () {
say "A point at ($x, $y)\n";
}
}
Point->new(x => 5, y => 10)->describe;
If you are waiting for Perl 7, Damian is here to tell you that the future is now; you don’t have to wait for Perl 7 (or 17), it’s Perl 5 with Multi::Dispatch and use feature 'class'!
The last day of the conference provided an opportunity to go deeper into learning the new class syntax with a workshop on building a roguelike adventure game from scratch.
So what’s next? The London Perl & Raku Workshop is taking place on October 26, 2024 and Perl 5.42 is just around the corner!
—JAPH (Just Another Perl Hacker)
Published by Perl Steering Council on Thursday 01 August 2024 21:38
This week, the whole PSC was in attendance.
We merged HTTP-Tiny#6, and then discussed a number of topics:
Published by Dave Cross on Thursday 01 August 2024 14:13
Back in May, I wrote a blog post about how I had moved a number of Dancer2 applications to a new server and had, in the process, created a standardised procedure for deploying Dancer2 apps. It’s been about six weeks since I did that and I thought it would be useful to give a little update on how it all went and talk about a few little changes I’ve made.
I mentioned that I was moving the apps to a new server. What I didn’t say was that I was convinced my old server was overpowered (and overpriced!) for what I needed, so the new server has less RAM and, I think, a slower CPU than the old one. And that turned out to be a bit of a problem. It turned out there was a time early each morning when there were too many requests coming into the server and it ran out of memory. I was waking up most days to a dead server. My previous work meant that fixing it wasn’t hard, but it really wasn’t something that I wanted to do most mornings.
So I wanted to look into reducing the amount of memory used by the apps. And that turned out to be a two-stage approach.
You might recall that the apps were all controlled using a standardised driver program called “app_service”. It looked like this:
#!/usr/bin/env perl
use warnings;
use strict;
use Daemon::Control;
use ENV::Util -load_dotenv;
use Cwd qw(abs_path);
use File::Basename;
Daemon::Control->new({
name => ucfirst lc $ENV{KLORTHO_APP_NAME},
lsb_start => '$syslog $remote_fs',
lsb_stop => '$syslog',
lsb_sdesc => 'Advice from Klortho',
lsb_desc => 'Klortho knows programming. Listen to Klortho',
path => abs_path($0),
program => '/usr/bin/starman',
program_args => [ '--workers', 10, '-l', ":$ENV{KLORTHO_APP_PORT}",
dirname(abs_path($0)) . '/app.psgi' ],
user => $ENV{KLORTHO_OWNER},
group => $ENV{KLORTHO_GROUP},
pid_file => "/var/run/$ENV{KLORTHO_APP_NAME}.pid",
stderr_file => "$ENV{KLORTHO_LOG_DIR}/error.log",
stdout_file => "$ENV{KLORTHO_LOG_DIR}/output.log",
fork => 2,
})->run;

We’re deferring most of the clever stuff to Daemon::Control. But we’re building the parameters to pass to the constructor. And two of the parameters (“program” and “program_args”) control how the service is run. You’ll see I’m using Starman. The first fix was obvious when you look at my code. Starman is a pre-forking server and we always start with 10 copies of the app. Now, I’m very proud of some of my apps, but I think it’s optimistic to think my Klortho server will ever need to respond to 10 simultaneous requests. Honestly, I’m pleasantly surprised if it gets 10 requests in a month. So the first change was to make it easy to change the number of workers.
In the previous article, I talked about using ENV::Util to load environment variables from a “.env” file. And we can continue to use that approach here. I rewrote the “program_args” code to be this:
program_args => [ '--workers', ($ENV{KLORTHO_APP_WORKERS} // 10), '-l', ":$ENV{KLORTHO_APP_PORT}",
dirname(abs_path($0)) . '/app.psgi' ],

Here we look for an environment variable (defined in “.env”) and use either that value or a default of 10.
I made similar changes to all the “app_service” files, added appropriate environment variables to all the “.env” files and restarted all the apps. Immediately, I could see an improvement as I was now running maybe a third of the app processes on the server. But I felt I could do better. So I had a close look at the Starman documentation to see if there was anything else I could tweak. That’s when I found the “--preload-app” command-line option.
Starman works by loading a main driver process which then fires up as many worker processes as you ask it for. Without the “--preload-app” option, each of those processes loads a copy of the application. But with this option, each worker process reads the main driver’s copy of the application and only loads its own copy when it wants to write something. This can be a big memory saving – although it’s important to note that the documentation warns:
Enabling this option can cause bad things happen when resources like sockets or database connections are opened at load time by the master process and shared by multiple children.
I’m pretty sure that most of my apps are not in any danger here, but I’m keeping a close eye on the situation and if I see any problems, it’s easy enough to turn preloading off again.
When adding the preloading option to “app_service”, I realised I should probably completely rewrite the code that builds the program arguments. It now looks like this:
my @program_args;
if ($ENV{KLORTHO_WORKER_COUNT}) {
push @program_args, '--workers', $ENV{KLORTHO_WORKER_COUNT};
}
if ($ENV{KLORTHO_APP_PORT}) {
push @program_args, '-l', ":$ENV{KLORTHO_APP_PORT}";
}
if ($ENV{KLORTHO_APP_PRELOAD}) {
push @program_args, '--preload-app';
}
push @program_args, dirname(abs_path($0)) . '/bin/app.psgi';

The observant among you will notice that I’ve subtly changed the behaviour of the worker count environment variable. Previously, a missing variable would use a default value of 10. Now, it just omits the argument, which uses Starman’s default value of 5.
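The rewritten argument-building logic is a pattern worth naming: only emit an option when its environment variable is set, so unset variables fall through to the server's own defaults. Here is the same idea sketched in Python (the starman_args helper is hypothetical; the KLORTHO_* names follow the post):

```python
# Build a server command line from optional environment variables,
# omitting any argument whose variable is unset so the server's own
# defaults apply -- the same approach as the rewritten app_service.
import os

def starman_args(env=os.environ, psgi="app.psgi"):
    args = []
    if env.get("KLORTHO_WORKER_COUNT"):
        args += ["--workers", env["KLORTHO_WORKER_COUNT"]]
    if env.get("KLORTHO_APP_PORT"):
        args += ["-l", ":" + env["KLORTHO_APP_PORT"]]
    if env.get("KLORTHO_APP_PRELOAD"):
        args.append("--preload-app")
    args.append(psgi)
    return args

print(starman_args({"KLORTHO_APP_PORT": "8080"}))
# ['-l', ':8080', 'app.psgi']
```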
I’ve made similar changes in all my “app_service” programs and set environment variables to turn preloading on. And now my apps use substantially less memory. The server hasn’t died since I implemented this stuff at the start of this week. So that makes me very happy.
But programming is the pursuit of minimisation. I’ve already seen two places where I can make these programs smaller and simpler.
The Klortho service driver program is on GitHub. Can you suggest any more improvements?
The post Deploying Dancer Apps (Addendum) first appeared on Perl Hacks.
Published by laurent_r on Tuesday 30 July 2024 21:12
These are some answers to the Week 280, Task 2, of the Perl Weekly Challenge organized by Mohammad S. Anwar.
Spoiler Alert: This weekly challenge deadline is due in a few days from now (on August 4, 2024, at 23:59). This blog post provides some solutions to this challenge. Please don’t read on if you intend to complete the challenge on your own.
You are given a string, $str, where every two consecutive vertical bars are grouped into a pair.
Write a script to return the number of asterisks, *, excluding any between each pair of vertical bars.
Example 1
Input: $str = "p|*e*rl|w**e|*ekly|"
Output: 2
The characters we are looking at here are "p" and "w**e".
Example 2
Input: $str = "perl"
Output: 0
Example 3
Input: $str = "th|ewe|e**|k|l***ych|alleng|e"
Output: 5
The characters we are looking at here are "th", "e**", "l***ych" and "e".
We'll use a regex substitution to remove the parts of the input string to be excluded from the count (the parts between pairs of vertical bars, or pipe characters). There are many ways of counting the asterisks in the remaining parts of the input string, including various types of loops, but it is simpler to use the tr/// transliteration operator, which essentially returns the number of changes performed. And, at least in Perl, the transliteration operator is reputed to be the fastest way of counting the occurrences of a character in a string.
For our substitution, we need our regex to match a vertical bar, followed by any number of characters other than a pipe, followed by another pipe. This is easily achieved with a frugal (or non-greedy) quantifier, which matches as little as it can while still allowing the overall regex to succeed. This leads to the following possible regex: s:g/'|'.*?'|'//. The *? part is the frugal quantifier.
As mentioned above, we can use the tr/// transliteration operator to count the asterisks. In Raku, the transliteration operator returns not exactly the number of changes performed as informally stated above, but more precisely a StrDistance object, which will numify to the distance (or number of edits) between the original and resulting strings. Numification is performed here with a + sign before the overall expression.
This leads to the following program:
sub count-asterisks ($in is copy) {
$in ~~ s:g/'|'.*?'|'//;
return +($in ~~ tr/*//);
}
my @tests = "p|*e*rl|w**e|*ekly|", "perl",
"th|ewe|e**|k|l***ych|alleng|e";
for @tests -> $test {
printf "%-30s => ", $test;
say count-asterisks $test;
}
This program displays the following output:
$ raku ./count-asterisks.raku
p|*e*rl|w**e|*ekly| => 2
perl => 0
th|ewe|e**|k|l***ych|alleng|e => 5
This is a port to Perl of the above Raku program. The regex syntax is slightly different, but it uses a similar frugal quantifier and leads to the same matches.
use strict;
use warnings;
use feature 'say';
sub count_asterisks {
my $in = shift;
$in =~ s/\|.*?\|//g;
return +($in =~ tr/*//);
}
my @tests = ("p|*e*rl|w**e|*ekly|", "perl",
"th|ewe|e**|k|l***ych|alleng|e");
for my $test (@tests) {
printf "%-30s => ", $test;
say count_asterisks $test;
}
This program displays the following output:
$ perl ./count-asterisks.pl
p|*e*rl|w**e|*ekly| => 2
perl => 0
th|ewe|e**|k|l***ych|alleng|e => 5
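For readers following along in other languages, the same strip-then-count algorithm translates directly. Here is an illustrative Python version (not part of the challenge solutions) using the same frugal regex:

```python
# Remove every |...| pair with a non-greedy regex, then count the
# asterisks left over -- the same algorithm as the Raku and Perl versions.
import re

def count_asterisks(s):
    return re.sub(r"\|.*?\|", "", s).count("*")

for s in ("p|*e*rl|w**e|*ekly|", "perl", "th|ewe|e**|k|l***ych|alleng|e"):
    print(f"{s:<30} => {count_asterisks(s)}")
```

It produces the same counts (2, 0 and 5) for the three test strings.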
The next week Perl Weekly Challenge will start soon. If you want to participate in this challenge, please check https://perlweeklychallenge.org/ and make sure you answer the challenge before 23:59 BST (British summer time) on August 11, 2024. And, please, also spread the word about the Perl Weekly Challenge if you can.
Published by Unknown on Sunday 28 July 2024 10:33
Published by alh on Tuesday 23 July 2024 14:10
Tony writes:
```
[Hours] [Activity]

2024/06/03 Monday
 0.72 #22211 check smoke results, check and re-word one commit message, make PR 22257
 1.95 #22230 debugging

 5.09

2024/06/04 Tuesday
 0.45 #22211 cleanup, testing, update PR
 1.20 #22252 review, testing

 3.12

2024/06/05 Wednesday
 2.20 #22230 review khw-env changes
 1.47 #22230 try an experiment with character encoding, which

 3.67

2024/06/06 Thursday
 1.25 #22230 discussion with khw (zoom, irc)
 0.87 #22169 work on a fix

 3.22

2024/06/11 Tuesday
 0.75 github notifications, look into #22208 warnings and comment
 0.62 #22169 more cleanup
 1.70 #22169 more cleanup, testing
 0.18 #22252 review and approve
 0.32 #22268 review and approve
 0.12 #22269 review, check history and approve
 0.13 #22270 review, research, comment and approve

 3.95

2024/06/12 Wednesday
 2.35 #22169 setup regression tests, debugging

 4.42

2024/06/13 Thursday
 0.18 github notifications
 0.18 #22271 research, work up tiny cast-away-const fix, anything else waits for #22271 to be applied
 0.77 #22270 review re-work

 1.96

2024/06/17 Monday
 0.15 github notifications
 0.23 #22270 review updated PR and approve
 0.22 #22271 more cast-away-const fix, push for CI
 1.52 #22169 re-check, polish
 1.53 #22169 fix an issue, try avoiding any_sv (CVs already have

 3.65

2024/06/18 Tuesday
 0.08 #22300 review CI results and open PR 22300
 0.53 #22273 review discussion, research, comment
 0.08 #22276 review and approve
 0.53 #22280 review, testing and approve
 0.17 #22274 review and apply to blead
 2.23 tick-in-name removal: implementation, working through

 3.62

2024/06/19 Wednesday
 0.52 #22295 review logs, write most of a response, OP, closes the ticket
 1.58 tick-in-name removal: working through tests, some core fixes
 1.82 tick-in-name removal: more test fixes, commits, push for CI
 0.37 #22292 review, research and approve
 0.10 #22287 review and approve

 4.61

2024/05/20 Monday
 0.10 tick-in-name removal: check CI, re-run failed job (appears unrelated to change)
 0.08 #22070 review and comment
 0.47 security list
 0.08 tick-in-name removal: check CI results and open PR 22303
 0.30 #22289 review updates, approve and comment on StructCopy()
 0.22 macos workflow update: updates, push for CI
 0.23 #22296 rebase, minor commit message change, testing, push
 0.08 #22296 check CI results and apply to blead
 0.08 macos workflow update: check CI results, make PR 22306
 0.10 #22070 review latest update, approve

 2.92

2024/06/24 Monday
 0.15 #22257 recheck and apply to blead
 0.32 #22282 review and approve
 0.75 #22283 review and comments
 0.08 #22290 review and approve
 0.17 #22293 review and comment
 0.33 #22298 review and approve
 0.57 #22309 review, comment and approve

 3.85

2024/06/25 Tuesday
 0.30 github notifications
 0.17 #22310 review and approve
 0.08 #22311 review and approve
 0.20 #22312 review, research and approve
 0.35 briefly review coverity scan results, email list about the many new defect reports
 0.08 #22313 review and approve
 0.08 #22314 review and approve
 0.15 #22315 review and approve
 0.12 #22318 review and approve
 0.08 #22319 review and approve
 0.08 #22320 review and comment

 4.36

2024/06/26 Wednesday
 0.32 #22321 review and approve
 0.08 #22322 review and approve
 0.15 #22323 review and comment
 0.22 #22324 review and comment
 0.12 #22325 review and approve
 0.20 #22326 review and approve
 0.15 #22327 review and approve
 0.08 #22328 review and approve
 0.73 #22329 review, research and comment
 0.13 #22331 review and comment
 0.33 #22332 review, research and approve with comment
 0.10 #22333 review and comment
 0.08 #22334 review and approve
 0.18 #22341 review and comment
 1.30 smartmatch removal, working through tests

 5.25

2024/06/27 Thursday
 1.32 #22329 long comment
 0.32 #22344 review and comment
 0.43 #22345 review and approve
 0.17 #22349 review and comment
 0.78 smartmatch removal: get deparse tests passing (need to do docs)

 5.29

Which I calculate is 58.98 hours.

Approximately 53 tickets were reviewed or worked on, and 3 patches were applied.
```
Published by Ted James on Monday 22 July 2024 10:32
Welcome to “What’s new on CPAN”, a curated look at last month’s new CPAN uploads for your reading and programming pleasure. Enjoy!
Published by The Perl and Raku Conference - Las Vegas, NV 2024 on Saturday 20 July 2024 23:19
Published by Unknown on Saturday 20 July 2024 18:01
Published by alh on Monday 15 July 2024 18:03
Paul writes:

```
My time was almost entirely consumed during April and May by running a stage show, and then June by travelling the entire length of the country supporting a long-distance bicycle adventure; as such I didn't get much Perl work done and I've only just got around to writing out what few things I did get done.
Entirely related to a few last pre-release steps to get features nicely lined up for the 5.40 release:
Hours:
5 = Stabilise experiments for perl 5.40 https://github.com/Perl/perl5/pull/22123
1 = Fix perlexperiment.pod; create v5.40 feature bundle https://github.com/Perl/perl5/pull/22141
1 = Bugfix for GH22278 (uninitialised fields during DESTROY) https://github.com/Perl/perl5/pull/22280
Total: 7 hours
I'm now back to the usual schedule, so I hope to have more to work on for July onwards...
```
Published by Unknown on Saturday 13 July 2024 20:41
Published by alh on Thursday 11 July 2024 08:43
Dave writes:
This is my monthly report on work done during June 2024 covered by my TPF perl core maintenance grant.
I spent most of last month continuing to work on understanding XS and improving its documentation, as a precursor to adding reference-counted stack (PERL_RC_STACK) abilities to XS.
This work is a bit frustrating, as I still haven't got anything publicly to show for it. Privately however, I do have about 4000 lines of notes on ways to improve the documentation and the XS parser itself.
I have also been going through the ExtUtils::ParseXS module's code line by line trying to understand it, and have (so far) added about 1000 new lines of code comments to ParseXS.pm, which is nearly ready to be pushed. This has involved a lot of code archaeology, since much of the code is obscure. I have even discovered XS keywords which have been implemented but aren't documented (such as "ATTRS:" and "NOT_IMPLEMENTED_YET").
I have also dug out the xsubpp script from the 1994 perl5.000 release and from time to time run it against some sample XS constructs. Amazingly, it still runs under a modern perl (although I didn't check whether the C code it outputs is still compilable).
SUMMARY:
* 1:38 process p5p mailbox
* 53:30 rework XS documentation

Total: 55:08 (HH:MM)