First Batch of LPW 2024 Talks Accepted

blogs.perl.org

Published by London Perl Workshop on Saturday 27 July 2024 16:54

Yep, that's right - the first dozen talks have been accepted for this year's London Perl and Raku Workshop. This puts our schedule at approximately 50% full, so if you are thinking about talking at the workshop then submit your proposal now!

If you aren't thinking about talking, then have a think about what you've been doing in the Perl and/or Raku space over the last five years, or even just in the general IT and development space. Perhaps there's something interesting you can talk about? If you don't feel it's a full-fat talk, then submit a lightning talk instead.

The London Perl and Raku Workshop will take place on 26th Oct 2024. Thanks to this year's sponsors, without whom LPW would not happen.

If you would like to sponsor LPW then please have a look at the options here: https://act.yapc.eu/lpw2024/sponsoring.html

First Batch of LPW 2024 Talks Accepted

r/perl

Published by /u/leejo on Saturday 27 July 2024 11:55

my $file = '/a/b/c.xxx';
my $path = $file =~ s'/.*?$''r;
print "$path\n";
$path = $file =~ s'/[^/]*$''r;
print "$path\n";

The second substitution gives what I was expecting, /a/b, but the first gives an empty string. Why is the .*? slurping up the whole string?

I understand .*? to be a reluctant match-anything, matching the smallest possible amount. However, in this case, instead of matching c.xxx it matched the whole string. In what circumstances will .*? match more than the smallest possible amount?
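
A short sketch of what is likely happening, assuming standard Perl regex semantics: the engine first finds the leftmost position where a match can start, and only then does the lazy quantifier decide how little to consume from that position. Because the first / sits at offset 0, .*? has to stretch to the end of the string before $ can match. The variable names below are illustrative only.

```perl
use strict;
use warnings;

my $file = '/a/b/c.xxx';

# The match starts at the first "/" (offset 0). From there, .*? grows one
# character at a time until "$" can match, which is at the end of the string.
(my $lazy = $file) =~ s{/.*?$}{};

# Excluding "/" from the repeated class means only the last "/" can succeed.
(my $last = $file) =~ s{/[^/]*$}{};

# A greedy prefix is another way to push the match start to the last "/".
(my $prefix = $file) =~ s{^(.*)/.*$}{$1};

print "lazy:   '$lazy'\n";
print "last:   '$last'\n";
print "prefix: '$prefix'\n";
```

In other words, laziness limits how far a match extends to the right, not where it begins.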

Perl is a Plug-in Hybrid and not a dying Pontiac

r/perl

Published by /u/Itcharlie on Saturday 27 July 2024 05:11

Perl is growing and adapting to modern times, just like a shiny new plug-in hybrid car. Plug-in hybrid cars have the best of both worlds: an electric motor with an average range of 20-60 miles and, once that range is depleted, the good old reliable gas motor for a few hundred miles.

Perl has plenty of shiny new tech (think of this as the electric motor side of the plug-in hybrid car) like Dancer2, Mojolicious, Starman and now COR, which is a new OO system that is part of the language (and many other cool new CPAN modules that I might have missed; feel free to share your favorite new CPAN module in the comments).

Perl has done a good job at keeping backwards compatibility (now think of this as the gas engine side of the plug-in hybrid car), so many companies can still reliably run their Perl code (even after upgrading Perl, and obviously with a few tweaks in the codebase here and there).

The Perl community is still active and its CPAN modules continue to be maintained. Yes, we have experienced a shrinking of the Perl community, but the community has maintained a focus on improving the core modules that are shipped with the language and has paid close attention to widely used CPAN modules (read up on the CPAN river model - https://neilb.org/2015/04/20/river-of-cpan.html).

If you used to write Perl code or you're curious about it, then this is the best time to give Perl 5.40 a try and play around with some of its new web frameworks, CPAN libraries and its new OO system COR.


Using a Perl one-liner, how do I remove the double quotes around all column names in the first row of a file?

I am unable to delete the double quotes around the column names only.

Input file

"FNAME","LNAME"
"A1B1","XYZ"
"A1B2","X12"

Output file without double-quotes around column names.

FNAME,LNAME
"A1B1","XYZ"
"A1B2","X12"

perldelta: Document the new SvTYPE() checks added by 2463f19365 and related work

Perl Weekly Challenge 279: Split String

blogs.perl.org

Published by laurent_r on Friday 26 July 2024 19:24

These are some answers to the Week 279, Task 2, of the Perl Weekly Challenge organized by Mohammad S. Anwar.

Spoiler Alert: The deadline for this weekly challenge is a few days from now (on July 28, 2024 at 23:59). This blog post offers some solutions to this challenge. Please don’t read on if you intend to complete the challenge on your own.

Task 2: Split String

You are given a string, $str.

Write a script to split the given string into two strings containing exactly the same number of vowels, and return true if you can, otherwise false.

Example 1

Input: $str = "perl"
Output: false

Example 2

Input: $str = "book"
Output: true

Two possible strings "bo" and "ok" containing exactly one vowel each.

Example 3

Input: $str = "good morning"
Output: true

Two possible strings "good " and "morning" containing two vowels each or "good m" and "orning" containing two vowels each.

We are asked to say whether the input string can be split into two substrings containing the same number of vowels. This can always be done if the input string contains an even number of vowels, and can never be done if it contains an odd number of vowels. So all we need to do is to count the vowels and return "True" if the count is even, and "False" otherwise.

Split String in Raku

As said above, we want to count the vowels in the input string. We use the comb method (with a regex matching vowels and an ignore-case adverb) to get the vowels, count them with the elems method and find out whether the count can be evenly divided by 2, using the %% operator. We end up with a one-liner subroutine:

sub split-string ($in) {
    return $in.comb(/:i <[aeiou]>/).elems %% 2;
}

for "Perl", "book", "bOok", "good morning" -> $test {
    printf "%-15s => ", $test;
    say split-string $test;
}

This program displays the following output:

$ raku ./split-string.raku
Perl            => False
book            => True
bOok            => True
good morning    => True

Split String in Perl

This is a port to Perl of the above Raku program. Not much to say about this port, except that we return "true" or "false" as strings.

use strict;
use warnings;
use feature 'say';

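sub split_string {
    # Collect the vowels (case-insensitively) from the input string
    my @vowels = grep { /[aeiou]/i } split //, shift;
    # An even vowel count means the string can be split evenly
    return @vowels % 2 == 0 ? "true" : "false";
}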

for my $test ("Perl", "book", "bOok", "good morning") {
    printf "%-12s => ", $test;
    say split_string $test;
}

This program displays the following output:

$ perl ./split-string.pl
Perl         => false
book         => true
bOok         => true
good morning => true
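
The even-count argument can also be checked constructively. The sketch below is not part of the original solutions: balanced_split is a hypothetical helper that walks the string until the left part holds half of the vowels and returns both halves, or an empty list when the vowel count is odd.

```perl
use strict;
use warnings;

# Return the two halves if a balanced split exists, or an empty list otherwise.
sub balanced_split {
    my ($str) = @_;
    my $total = () = $str =~ /[aeiou]/gi;    # count all vowels
    return if $total % 2;                    # odd count: no split possible
    my ($seen, $pos) = (0, 0);
    # Advance until the left part holds half of the vowels.
    while ($seen < $total / 2) {
        $seen++ if substr($str, $pos, 1) =~ /[aeiou]/i;
        $pos++;
    }
    return (substr($str, 0, $pos), substr($str, $pos));
}

for my $test ("Perl", "book", "good morning") {
    my @parts = balanced_split($test);
    printf "%-12s => %s\n", $test, @parts ? "'$parts[0]' + '$parts[1]'" : "no split";
}
```

Note that a string with no vowels at all yields an empty left part here; whether that counts as a valid split is left open by the task statement.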

Wrapping up

Next week's Perl Weekly Challenge will start soon. If you want to participate in this challenge, please check https://perlweeklychallenge.org/ and make sure you answer the challenge before 23:59 BST (British summer time) on August 4, 2024. And, please, also spread the word about the Perl Weekly Challenge if you can.

Sometimes one's code must simply perform, and principles such as aesthetics, "cleverness" or commitment to a single-language solution go out of the window.
At the TPRC I gave a talk (here are the slides) about how this
can be done for bioinformatics applications, but I feel that a simpler example is warranted to illustrate the potential avenues to maximize performance that a Perl
programmer has at their disposal when working in data-intensive applications.

So here is a toy problem to illustrate these options. Given a very large array of double-precision floats, transform them in place with the following function: cos(sin(sqrt(x))).
The function has 3 nested floating-point operations. This is an expensive function to evaluate, especially if one has to calculate it for a large number of values. We can generate the array values reasonably
quickly in Perl (and some copies for the solutions we will be examining) using the following code:

my $num_of_elements = 50_000_000;
my @array0 = map { rand } 1 .. $num_of_elements;    ## generate random numbers
my @array1 = @array0;                               ## copy the array
my @array2 = @array0;                               ## another copy
my @array3 = @array0;                               ## yet another copy
my @array4 = @array0;                               ## the last? copy
my $array_in_PDL      = pdl(@array0);    ## convert the array to a PDL ndarray
my $array_in_PDL_copy = $array_in_PDL->copy;    ## copy the PDL ndarray

The possible solutions include the following:

Inplace modification using a for-loop in Perl.

for my $elem (@array0) {
    $elem = cos( sin( sqrt($elem) ) );
}

Using Inline C code to walk the array and transform it in place in C. Effectively, one does an in-place map using C. Accessing elements of Perl arrays (AV* in C) in C is particularly
performant if one is using perl 5.36 and above because of an optimized fetch function introduced in that version of Perl.

void map_in_C(AV *array) {
  int len = av_len(array) + 1;
  for (int i = 0; i < len; i++) {
    SV **elem = av_fetch_simple(array, i, 0); // perl 5.36 and above
    if (elem != NULL) {
      double value = SvNV(*elem);
      value = cos(sin(sqrt(value))); // Modify the value
      sv_setnv(*elem, value);
    }
  }
}

Using Inline C code to transform the array, but breaking the transformation into 3 sequential C for-loops. This is really an experiment about tradeoffs: modern x86 processors have a specialized,
vectorized square root instruction, so perhaps the compiler can figure out how to use it to speed up at least one part of the calculation. On the other hand, we will be reducing the arithmetic intensity
of each loop and accessing the same data value more than once, so there will likely be a price to pay for these repeated data accesses.

void map_in_C_sequential(AV *array) {
  int len = av_len(array) + 1;
  for (int i = 0; i < len; i++) {
    SV **elem = av_fetch_simple(array, i, 0); // perl 5.36 and above
    if (elem != NULL) {
      double value = SvNV(*elem);
      value = sqrt(value); // Modify the value
      sv_setnv(*elem, value);
    }
  }
  for (int i = 0; i < len; i++) {
    SV **elem = av_fetch_simple(array, i, 0); // perl 5.36 and above
    if (elem != NULL) { // same guard as in the first loop
      double value = SvNV(*elem);
      value = sin(value); // Modify the value
      sv_setnv(*elem, value);
    }
  }
  for (int i = 0; i < len; i++) {
    SV **elem = av_fetch_simple(array, i, 0); // perl 5.36 and above
    if (elem != NULL) { // same guard as in the first loop
      double value = SvNV(*elem);
      value = cos(value); // Modify the value
      sv_setnv(*elem, value);
    }
  }
}

Parallelize the C function loop using OpenMP. In a previous entry we discussed how to control the OpenMP environment from within Perl and compile OpenMP aware Inline::C code for
use by Perl, so let's put this knowledge into action! On the Perl side of the program we will do this:

use v5.38;
use Alien::OpenMP;
use OpenMP::Environment;
use Inline (
    C    => 'DATA',
    with => qw/Alien::OpenMP/,
);
my $env = OpenMP::Environment->new();
my $threads_or_workers = 8; ## or any other value
## modify number of threads and make C aware of the change
$env->omp_num_threads($threads_or_workers);
_set_num_threads_from_env();

## modify runtime schedule and make C aware of the change
$env->omp_schedule("guided,1");    ## modify runtime schedule
_set_openmp_schedule_from_env();

On the C part of the program, we will then do this (the helper functions for the OpenMP environment have been discussed
previously, and are thus not repeated here).

#include <omp.h>
void map_in_C_using_OMP(AV *array) {
  int len = av_len(array) + 1;
#pragma omp parallel
  {
#pragma omp for schedule(runtime) nowait
    for (int i = 0; i < len; i++) {
      SV **elem = av_fetch_simple(array, i, 0); // perl 5.36 and above
      if (elem != NULL) {
        double value = SvNV(*elem);
        value = cos(sin(sqrt(value))); // Modify the value
        sv_setnv(*elem, value);
      }
    }
  }
}

Perl Data Language (PDL) to the rescue. The PDL set of modules is yet another way to speed up operations and can save the programmer from C. It also autoparallelizes given the right directives so why not use it?

use PDL;
## set the minimum size problem for autothreading in PDL
set_autopthread_size(0);
my $threads_or_workers = 8; ## or any other value

## PDL
## use PDL to modify the array - multi threaded
set_autopthread_targ($threads_or_workers);
$array_in_PDL->inplace->sqrt;
$array_in_PDL->inplace->sin;
$array_in_PDL->inplace->cos;


## use PDL to modify the array - single thread
set_autopthread_targ(0);

$array_in_PDL_copy->inplace->sqrt;
$array_in_PDL_copy->inplace->sin;
$array_in_PDL_copy->inplace->cos;

Using 8 threads we get something like this

Inplace benchmarks
Inplace  in         Perl took 2.85 seconds
Inplace  in Perl/mapCseq took 1.62 seconds
Inplace  in    Perl/mapC took 1.54 seconds
Inplace  in   Perl/C/OMP took 0.24 seconds

PDL benchmarks
Inplace  in     PDL - ST took 0.94 seconds
Inplace  in     PDL - MT took 0.17 seconds

Using 16 threads we get this!

Starting the benchmark for 50000000 elements using 16 threads/workers

Inplace benchmarks
Inplace  in         Perl took 3.00 seconds
Inplace  in Perl/mapCseq took 1.72 seconds
Inplace  in    Perl/mapC took 1.62 seconds
Inplace  in   Perl/C/OMP took 0.13 seconds

PDL benchmarks
Inplace  in     PDL - ST took 0.99 seconds
Inplace  in     PDL - MT took 0.10 seconds

A few observations:

  • The OpenMP and the multi-threaded (MT) PDL solutions are responsive to the number of workers, while the other solutions are not. Hence, the differences in the pure Perl and the inline non-OpenMP timings between the two runs give an idea of the natural variability in performance.
  • Writing the map version of the code in C made it run about 1.85 times faster (contrast Perl and Perl/mapC).
  • Using PDL in a single thread made it run about 3 times faster (contrast PDL - ST and Perl timings).
  • There was a price to pay for repeated memory access (contrast Perl/mapC to Perl/mapCseq).
  • OpenMP and multi-threaded PDL operations gave similar performance (though PDL appeared faster in these examples). With 16 threads, the code ran 23-30 times faster than pure Perl.
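
The ratios behind these observations can be recomputed from the timings printed above. This is only an arithmetic sketch: the %runs hash and the speedup helper are illustrative names, and the numbers are copied from the two benchmark runs in this post.

```perl
use strict;
use warnings;

# Timings in seconds, copied from the 8- and 16-thread runs reported above.
my %runs = (
    8  => { 'Perl' => 2.85, 'Perl/mapC' => 1.54, 'Perl/C/OMP' => 0.24,
            'PDL - ST' => 0.94, 'PDL - MT' => 0.17 },
    16 => { 'Perl' => 3.00, 'Perl/mapC' => 1.62, 'Perl/C/OMP' => 0.13,
            'PDL - ST' => 0.99, 'PDL - MT' => 0.10 },
);

# Speedup = pure-Perl time divided by the time of the alternative.
sub speedup {
    my ($threads, $label) = @_;
    return $runs{$threads}{'Perl'} / $runs{$threads}{$label};
}

for my $threads (sort { $a <=> $b } keys %runs) {
    for my $label ('Perl/mapC', 'PDL - ST', 'Perl/C/OMP', 'PDL - MT') {
        printf "%2d threads  %-10s %5.1fx\n",
            $threads, $label, speedup($threads, $label);
    }
}
```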

In summary, there are both native (PDL modules) and foreign (C/OpenMP) solutions to speed up data intensive operations in Perl, so why not use them widely and wisely to make Perl programs performant?
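
The timing code itself is not shown in this post. A minimal sketch of how such wall-clock timings might be collected, assuming the core Time::HiRes module: time_it is a hypothetical helper, and the array here is far smaller than the 50 million elements used in the benchmarks.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);

# Hypothetical helper: run a code ref once and report elapsed wall-clock time.
sub time_it {
    my ($label, $code) = @_;
    my $start = time;
    $code->();
    my $elapsed = time - $start;
    printf "Inplace  in %12s took %.2f seconds\n", $label, $elapsed;
    return $elapsed;
}

my @data = map { rand } 1 .. 100_000;    # small stand-in for the real array

# Time the pure-Perl in-place transformation.
my $elapsed = time_it('Perl', sub {
    $_ = cos( sin( sqrt($_) ) ) for @data;
});
```

A single run like this is noisy; repeating each measurement, or using the core Benchmark module, would give steadier numbers.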

I have the following table

head  v1  v2  v3  v4  v5  v6
stn2   1   4   1   1   4   2
stn2   1   4   1   1   4   2
stn3   1   4   1   1   4   2
stn4   1   4   1   1   4   3
stn4   1   4   1   1   4   2
stn5   1   4   1   1   4   4
stn6   1   3   1   1   4   3
stn7   4   4   1   1   4   4
stn8   4   4   1   1   4   3
stn9   2   4   1   1   4   3

I would like to delete all columns containing only 1's. I have put the six columns into arrays with the following lengthy code:

#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;
use feature 'say';

open(my $RSKF, " < ../risks.txt") || die "open risks.txt: failed $! ($^E)";
my $line;
my $count;

my @one_column;
my @two_column;
my @three_column;
my @four_column;
my @five_column;
my @six_column;

my $one_column;
my $two_column;
my $three_column;
my $four_column;
my $five_column;
my $six_column;

my @remove;
my @keep1;
my $keepcount=0;

while($line = <$RSKF>){
push(@one_column, (split(/\s+/, $line))[2]);
push(@two_column, (split(/\s+/, $line))[3]);
push(@three_column, (split(/\s+/, $line))[4]);
push(@four_column, (split(/\s+/, $line))[5]);
push(@five_column, (split(/\s+/, $line))[6]);
push(@six_column, (split(/\s+/, $line))[7]);
 }

My attempt to loop through the fourth column below does not work.

$count=0;
for (my $i=1; $i < @four_column; ++$i){
if($four_column[$i] ge '2'){
$count++;
}
if($count > 0){
@four_column=@keep1;
$keepcount++;
}
else{
@four_column=@remove;
}

}

Surely there should be an easier way of doing this. Please help.
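
One simpler approach, sketched under the assumption that the table fits in memory and is whitespace-separated: read every row into an array reference, work out which column indices contain something other than 1, and slice each row by those indices. The helper name drop_all_ones_columns and the shortened sample are illustrative; the real script would read ../risks.txt instead.

```perl
use strict;
use warnings;

# Sample rows taken from the table in the question.
my $table = <<'END';
head  v1  v2  v3  v4  v5  v6
stn2   1   4   1   1   4   2
stn7   4   4   1   1   4   4
stn9   2   4   1   1   4   3
END

# Return the rows with every all-1 data column removed.
sub drop_all_ones_columns {
    my @rows = @_;                       # array refs; $rows[0] is the header
    my @keep = (0);                      # always keep the label column
    for my $c (1 .. $#{ $rows[0] }) {
        # Keep the column if any data row holds something other than 1.
        push @keep, $c if grep { $_->[$c] != 1 } @rows[ 1 .. $#rows ];
    }
    return map { [ @{$_}[@keep] ] } @rows;
}

my @rows     = map { [ split ' ', $_ ] } split /\n/, $table;
my @filtered = drop_all_ones_columns(@rows);
print join("\t", @$_), "\n" for @filtered;
```

With split ' ' the label lands at index 0 and v1 at index 1, so no per-column arrays are needed.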

perldelta for GH #22412 / d8935409c9

Perl commits on GitHub

Published by mauke on Friday 26 July 2024 17:30

I'm trying to compile an executable on z/OS. The final step fails with a handful of undefined symbols. Presumably this is because the step before it, which generates a .so, also generates a zero-length .x file. The .so looks sane; executing 'nm' on it shows the missing symbols as being defined.

A typical .o was compiled with

xlclang -c -m64 -fvisibility=default -DOS390 -DZOS -D_EXT=1 -D_ALL_SOURCE -Duserelocatableinc -DMAXSIG=39 -DNSIG=39 -DOEMVS -DYYDYNAMIC -DNO_LOCALE_MESSAGES -D_OPEN_THREADS=3 -D_UNIX03_SOURCE=1 -D_AE_BIMODAL=1 -D_XOPEN_SOURCE_EXTENDED -D_ALL_SOURCE -D_ENHANCED_ASCII_EXT=0xFFFFFFFF -D_OPEN_SYS_FILE_EXT=1 -D_OPEN_SYS_SOCK_IPV6 -D_XOPEN_SOURCE=600 -D_XOPEN_SOURCE_EXTENDED -D_SHR_ENVIRON -DNO_NL_LOCALE_NAME -DDEBUGGING -g -c -m64 util.c 

The step that is yielding the zero length .x looks like

clang -o libperl.so -shared -Wl,-bedit=no -m64 op.o perl.o universal.o av.o builtin.o caretx.o class.o  deb.o doio.o doop.o dquote.o  dump.o globals.o gv.o hv.o  keywords.o locale.o mathoms.o mg.o  mro_core.o numeric.o pad.o peep.o  perlio.o perly.o pp.o pp_ctl.o  pp_hot.o pp_pack.o pp_sort.o pp_sys.o  reentr.o regcomp.o regcomp_debug.o regcomp_invlist.o regcomp_study.o regcomp_trie.o  regexec.o run.o scope.o sv.o  taint.o time64.o toke.o utf8.o  util.o   os390.o /karlw/zopen/usr/local/zopen/zoslib/zoslib/lib/libzoslib.x DynaLoader.o -lm -lc 

Any advice, including further debugging steps I could take, would be appreciated.

I am trying to print a line from a file with interpolation. I have the following code:

open FH, "<", "file.txt";
my $line = <FH>;
print "line = " . $line . "\n";

The file.txt contains one line:

one\ntwo

The line prints as "one\ntwo". The \n is not interpolated.

I have tried opening the file with :raw :crlf :any but nothing works.

I have tried using $'one\ntwo' in file.txt as well as all forms of quoting.

I know the following code works:

my $st = "one\ntwo";
print "line = " . $st . "\n";

It prints:

one
two

How do I do that when reading the line from a file?
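
A sketch of one way to get that behaviour. Perl converts \n to a newline only when it compiles a double-quoted string literal; text read from a file is plain data and is never interpolated, so the backslash sequences have to be translated explicitly. The helper name expand_escapes and the small set of supported escapes are assumptions chosen for illustration.

```perl
use strict;
use warnings;

# Translate a small, explicit set of backslash escapes found in data.
my %escape = ( n => "\n", t => "\t", '\\' => '\\' );

sub expand_escapes {
    my ($text) = @_;
    $text =~ s/\\([nt\\])/$escape{$1}/g;
    return $text;
}

my $line = 'one\ntwo';    # what <FH> would return for the file in the question
print "line = " . expand_escapes($line) . "\n";
```

Restricting the substitution to a known table avoids running eval on file contents, which would execute arbitrary code found in the file.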

Is Perl the dying Pontiac?

r/perl

Published by /u/a430 on Friday 26 July 2024 09:03

Those who've been around long enough know that the use of programming languages was almost a religion a few years ago. For example, the .NET community made no secret of being a sect that branded other technologies as the devil's work. Admittedly, the Llama book was also considered a bible.

Until 20 years ago, Perl was regarded as an elite technology that one could boast about even barely mastering. Getting started with Perl was and still is tough and requires motivation. The reward for building Perl skills often comes years later when you calmly realize that even 10-year-old scripts still perform their duties perfectly - despite multiple system environment updates. Generally, even unoptimized Perl programs run more efficiently than new developments with technologies sold to us as the "hot shit."

One of Perl's top application areas is high-performance and robust web applications in mod_perl/2. To my knowledge, there's no comparable flexible programming language that can interact so closely with the web server and intervene in every layer of the delivery process. The language is mature, balanced, and the syntax is always consistent - at least for the Perl interpreter ;-) If you go to the official mod_perl page (perl.apache.org) in 2024, it recommends a manual written over 20 years ago, and even the link no longer works.

As a Perl enthusiast from the get-go and a full-stack developer, I feel today that - albeit reluctantly - I need to consider a technology switch. Currently, I'm still developing with mod_perl/2 and Perl Mason. As long as I'm working on interface projects, I'm always ahead of the game and can deliver everything in record time. However, when it comes to freelance projects or a new job, it's almost hopeless to bring in Perl experience, especially in Europe.

Throughout my career, I've also used other technologies such as Java Struts, PHP, C/C++, Visual Basic .NET, and I'd better not mention COBOL-85. I've always come back to Perl because of its stability. But I'm noticing that the language is effectively dead and hardly receives any updates or is talked about much. If I were forced to make a technology switch for developing full-stack applications, I would switch to React or Django. It's a shame.


Really stuck getting my app to work on Perl 5.40 w/ macOS 14.5; HELP!

r/perl

Published by /u/kosaromepr on Friday 26 July 2024 06:22

I have been a Perl guy for 30+ years and had the great idea to upgrade the Perl version of my large Perl app to 5.40. After a lot of fiddling around and overriding some Makefile.PL files, I got the code and all required libraries to work on AlmaLinux 9.4. However, I am stuck on getting it to run locally on macOS 14.5.

The two libraries currently roadblocking are Net::SSL and DBD::MariaDB. I am not fluent enough in C to understand how and if I can get things sorted; anyone able to help? Full compile errors below.

Net::SSL

cpanm (App::cpanminus) 1.9018 on perl 5.040000 built for darwin-2level
Work directory is /Users/administrator/.cpanm/work/1721974215.35063
You have make /usr/bin/make
You have LWP: 6.77
You have /usr/bin/tar: bsdtar 3.5.3 - libarchive 3.5.3 zlib/1.2.12 liblzma/5.4.3 bz2lib/1.0.8
You have /usr/bin/unzip
Searching Net::SSL () on cpanmetadb ...
--> Working on Net::SSL
Fetching http://www.cpan.org/authors/id/N/NA/NANIS/Crypt-SSLeay-0.72.tar.gz
-> OK
Unpacking Crypt-SSLeay-0.72.tar.gz
Entering Crypt-SSLeay-0.72
Checking configure dependencies from META.json
Checking if you have Getopt::Long 0 ... Yes (2.57)
Checking if you have ExtUtils::CBuilder 0.280205 ... Yes (0.280240)
Checking if you have Try::Tiny 0.19 ... Yes (0.31)
Checking if you have Path::Class 0.26 ... Yes (0.37)
Configuring Crypt-SSLeay-0.72
Running Makefile.PL
Argument "pro" isn't numeric in numeric ge (>=) at /usr/local/lib/perl5/5.40.0/ExtUtils/MM_Unix.pm line 47.
Found libraries 'ssl, crypto, z'
*** THIS IS NOT AN ERROR, JUST A MESSAGE FOR YOUR INFORMATION ***
Do you really need Crypt::SSLeay?
Starting with version 6.02 of LWP, https support was unbundled into LWP::Protocol::https. This module specifies as one of its prerequisites IO::Socket::SSL which is automatically used by LWP::UserAgent unless this preference is overridden separately. IO::Socket::SSL is a more complete implementation, and, crucially, it allows hostname verification. Crypt::SSLeay does not support this. At this point, Crypt::SSLeay is maintained to support existing software that already depends on it. However, it is possible that your software does not really depend on Crypt::SSLeay, only on the ability of LWP::UserAgent class to communicate with sites over SSL/TLS.
If are using version LWP 6.02 or later, and therefore have installed LWP::Protocol::https and its dependencies, and do not explicitly use Net::SSL before loading LWP::UserAgent, or override the default socket class, you are probably using IO::Socket::SSL and do not really need Crypt::SSLeay. Before installing Crypt::SSLeay, you may want to try specifying a dependency on LWP::Protocol::https.
================================================================================
Output from '/Users/administrator/.cpanm/work/1721974215.35063/Crypt-SSLeay-0.72/openssl-version':
OpenSSL 3.3.1 4 Jun 2024 30300010
================================================================================
Checking if your kit is complete...
Looks good
Generating a Unix-style Makefile
Writing Makefile for Crypt::SSLeay
Writing MYMETA.yml and MYMETA.json
-> OK
Checking dependencies from MYMETA.json ...
Checking if you have Try::Tiny 0.19 ... Yes (0.31)
Checking if you have Test::More 0.19 ... Yes (1.302199)
Checking if you have ExtUtils::MakeMaker 0 ... Yes (7.70)
Checking if you have LWP::Protocol::https 6.02 ... Yes (6.14)
Checking if you have MIME::Base64 0 ... Yes (3.16_01)
Building and testing Crypt-SSLeay-0.72
cp lib/Crypt/SSLeay/Conn.pm blib/lib/Crypt/SSLeay/Conn.pm
cp SSLeay.pm blib/lib/Crypt/SSLeay.pm
cp lib/Crypt/SSLeay/Err.pm blib/lib/Crypt/SSLeay/Err.pm
cp lib/Crypt/SSLeay/MainContext.pm blib/lib/Crypt/SSLeay/MainContext.pm
cp lib/Crypt/SSLeay/Version.pm blib/lib/Crypt/SSLeay/Version.pm
cp lib/Net/SSL.pm blib/lib/Net/SSL.pm
cp lib/Crypt/SSLeay/CTX.pm blib/lib/Crypt/SSLeay/CTX.pm
cp lib/Crypt/SSLeay/X509.pm blib/lib/Crypt/SSLeay/X509.pm
Running Mkbootstrap for SSLeay ()
chmod 644 "SSLeay.bs"
"/usr/local/bin/perl" -MExtUtils::Command::MM -e 'cp_nonempty' -- SSLeay.bs blib/arch/auto/Crypt/SSLeay/SSLeay.bs 644
"/usr/local/bin/perl" "/usr/local/lib/perl5/5.40.0/ExtUtils/xsubpp" -typemap '/usr/local/lib/perl5/5.40.0/ExtUtils/typemap' -typemap '/Users/administrator/.cpanm/work/1721974215.35063/Crypt-SSLeay-0.72/typemap' SSLeay.xs > SSLeay.xsc
mv SSLeay.xsc SSLeay.c
cc -c -fno-common -DPERL_DARWIN -mmacosx-version-min=14.5 -DNO_THREAD_SAFE_QUERYLOCALE -DNO_POSIX_2008_LOCALE -fno-strict-aliasing -pipe -fstack-protector-strong -I/usr/local/include -Wno-error=implicit-function-declaration -O3 -DVERSION=\"0.72\" -DXS_VERSION=\"0.72\" "-I/usr/local/lib/perl5/5.40.0/darwin-2level/CORE" SSLeay.c
SSLeay.xs:152:31: warning: call to undeclared function 'SSLv3_client_method'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
    ctx = SSL_CTX_new(SSLv3_client_method());
    ^
SSLeay.xs:152:31: error: incompatible integer to pointer conversion passing 'int' to parameter of type 'const SSL_METHOD *' (aka 'const struct ssl_method_st *') [-Wint-conversion]
    ctx = SSL_CTX_new(SSLv3_client_method());
    ^~~~~~~~~~~~~~~~~~~~~
/usr/local/include/openssl/ssl.h:1634:47: note: passing argument to parameter 'meth' here
    __owur SSL_CTX *SSL_CTX_new(const SSL_METHOD *meth);
    ^
SSLeay.xs:157:31: warning: call to undeclared function 'SSLv2_client_method'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
    ctx = SSL_CTX_new(SSLv2_client_method());
    ^
SSLeay.xs:157:31: error: incompatible integer to pointer conversion passing 'int' to parameter of type 'const SSL_METHOD *' (aka 'const struct ssl_method_st *') [-Wint-conversion]
    ctx = SSL_CTX_new(SSLv2_client_method());
    ^~~~~~~~~~~~~~~~~~~~~
/usr/local/include/openssl/ssl.h:1634:47: note: passing argument to parameter 'meth' here
    __owur SSL_CTX *SSL_CTX_new(const SSL_METHOD *meth);
    ^
2 warnings and 2 errors generated.
make: *** [SSLeay.o] Error 1
-> FAIL Installing Net::SSL failed. See /Users/administrator/.cpanm/work/1721974215.35063/build.log for details. Retry with --force to force install it.
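
The Makefile.PL message in the log above suggests that Crypt::SSLeay (which provides Net::SSL) may not be needed when LWP::Protocol::https and IO::Socket::SSL are installed. A minimal sketch for checking which of those modules load on a given Perl: module_available is a hypothetical helper, and whether the application can actually drop Net::SSL is an assumption that depends on how it uses LWP.

```perl
use strict;
use warnings;

# Hypothetical helper: true if the named module can be loaded by this perl.
sub module_available {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;
    return eval { require $file; 1 } ? 1 : 0;
}

for my $module (qw(LWP::Protocol::https IO::Socket::SSL Net::SSL)) {
    printf "%-22s %s\n", $module,
        module_available($module) ? 'available' : 'not available';
}
```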

and DBD::MariaDB (mariadb-devel is installed)

cpanm (App::cpanminus) 1.9018 on perl 5.040000 built for darwin-2level
Work directory is /Users/administrator/.cpanm/work/1721974660.39142
You have make /usr/bin/make
You have LWP: 6.77
You have /usr/bin/tar: bsdtar 3.5.3 - libarchive 3.5.3 zlib/1.2.12 liblzma/5.4.3 bz2lib/1.0.8
You have /usr/bin/unzip
Searching DBD::MariaDB () on cpanmetadb ...
--> Working on DBD::MariaDB
Fetching http://www.cpan.org/authors/id/P/PA/PALI/DBD-MariaDB-1.23.tar.gz
-> OK
Unpacking DBD-MariaDB-1.23.tar.gz
Entering DBD-MariaDB-1.23
Checking configure dependencies from META.json
Checking if you have utf8 0 ... Yes (1.25)
Checking if you have DBI 1.608 ... Yes (1.643)
Checking if you have strict 0 ... Yes (1.13)
Checking if you have ExtUtils::MakeMaker 6.58 ... Yes (7.70)
Checking if you have Data::Dumper 0 ... Yes (2.189)
Checking if you have Getopt::Long 0 ... Yes (2.57)
Checking if you have Devel::CheckLib 1.12 ... Yes (1.16)
Checking if you have File::Spec 0 ... Yes (3.90)
Checking if you have Config 0 ... Yes (5.040000)
Checking if you have warnings 0 ... Yes (1.70)
Configuring DBD-MariaDB-1.23
Running Makefile.PL
Argument "pro" isn't numeric in numeric ge (>=) at /usr/local/lib/perl5/5.40.0/ExtUtils/MM_Unix.pm line 47.
PLEASE NOTE: For 'make test' to run properly, you must ensure that the database user 'root' can connect to your MariaDB or MySQL server and has the proper privileges that these tests require such as 'drop table', 'create table', 'drop procedure', 'create procedure' as well as others.
mysql> grant all privileges on test.* to 'root'@'localhost' identified by 's3kr1t';
For MySQL 8 it is needed to use different syntax:
mysql> create user 'root'@'localhost' identified by 's3kr1t';
mysql> grant all privileges on test.* to 'root'@'localhost';
You can also optionally set the user to run 'make test' with:
perl Makefile.PL --testuser=username
I will use the following settings for compiling and testing:
  cflags (mysql_config) = -I/usr/local/Cellar/mariadb/11.4.2/include/mysql -I/usr/local/Cellar/mariadb/11.4.2/include/mysql/mysql
  libs (mysql_config) = -L/usr/local/Cellar/mariadb/11.4.2/lib/ -lmariadb
  mysql_config (guessed ) = mariadb_config
  testauthplugin (default ) =
  testdb (default ) = test
  testhost (default ) =
  testpassword (default ) =
  testport (default ) =
  testsocket (default ) =
  testuser (guessed ) = root
To change these settings, see 'perl Makefile.PL --help' and 'perldoc DBD::MariaDB::INSTALL'.
Checking if libs and header files are available for compiling...
Checking if correct version of MariaDB or MySQL client is present... Looks good.
Embedded server: not supported by client library
WARNING: Older versions of ExtUtils::MakeMaker may errantly install README.pod as part of this distribution. It is recommended to avoid using this path in CPAN modules.
Client library deinitialize OpenSSL library functions: yes
Checking if your kit is complete...
Looks good
Using DBI 1.643 (for perl 5.040000 on darwin-2level) installed in /usr/local/lib/perl5/site_perl/5.40.0/darwin-2level/auto/DBI/
Generating a Unix-style Makefile
Writing Makefile for DBD::MariaDB
Writing MYMETA.yml and MYMETA.json
-> OK
Checking dependencies from MYMETA.json ...
Checking if you have Test::More 0.90 ... Yes (1.302199)
Checking if you have bigint 0 ... Yes (0.67)
Checking if you have lib 0 ... Yes (0.65)
Checking if you have DynaLoader 0 ... Yes (1.56)
Checking if you have Data::Dumper 0 ... Yes (2.189)
Checking if you have ExtUtils::MakeMaker 0 ... Yes (7.70)
Checking if you have warnings 0 ... Yes (1.70)
Checking if you have DBI::Const::GetInfoType 0 ... Yes (2.008697)
Checking if you have Encode 0 ... Yes (3.21)
Checking if you have vars 0 ... Yes (1.05)
Checking if you have constant 0 ... Yes (1.33)
Checking if you have Time::HiRes 0 ... Yes (1.9777)
Checking if you have B 0 ... Yes (1.89)
Checking if you have Test::Deep 0 ... Yes (1.204)
Checking if you have FindBin 0 ... Yes (1.54)
Checking if you have DBI 1.608 ... Yes (1.643)
Checking if you have File::Temp 0 ... Yes (0.2311)
Checking if you have strict 0 ... Yes (1.13)
Checking if you have utf8 0 ... Yes (1.25)
Building and testing DBD-MariaDB-1.23
cp lib/DBD/MariaDB.pm blib/lib/DBD/MariaDB.pm
cp lib/DBD/MariaDB.pod blib/lib/DBD/MariaDB.pod
cp lib/DBD/MariaDB/INSTALL.pod blib/lib/DBD/MariaDB/INSTALL.pod
cp README.pod blib/lib/DBD/MariaDB/README.pod
Running Mkbootstrap for MariaDB ()
chmod 644 "MariaDB.bs"
"/usr/local/bin/perl" -MExtUtils::Command::MM -e 'cp_nonempty' -- MariaDB.bs blib/arch/auto/DBD/MariaDB/MariaDB.bs 644
"/usr/local/bin/perl" -p -e "s/~DRIVER~/MariaDB/g" /usr/local/lib/perl5/site_perl/5.40.0/darwin-2level/auto/DBI/Driver.xst > MariaDB.xsi
"/usr/local/bin/perl" "/usr/local/lib/perl5/5.40.0/ExtUtils/xsubpp" -typemap '/usr/local/lib/perl5/5.40.0/ExtUtils/typemap' MariaDB.xs > MariaDB.xsc
Warning: duplicate function definition 'do' detected in MariaDB.xs, line 104
Warning: duplicate function definition 'rows' detected in MariaDB.xs, line 231
Warning: duplicate function definition 'last_insert_id' detected in MariaDB.xs, line 250
mv MariaDB.xsc MariaDB.c
cc -c -I/usr/local/lib/perl5/site_perl/5.40.0/darwin-2level/auto/DBI -I/usr/local/Cellar/mariadb/11.4.2/include/mysql -I/usr/local/Cellar/mariadb/11.4.2/include/mysql/mysql -DHAVE_DBI_1_634 -DHAVE_DBI_1_642 -DHAVE_PROBLEM_WITH_OPENSSL -fno-common -DPERL_DARWIN -mmacosx-version-min=14.5 -DNO_THREAD_SAFE_QUERYLOCALE -DNO_POSIX_2008_LOCALE -fno-strict-aliasing -pipe -fstack-protector-strong -I/usr/local/include -Wno-error=implicit-function-declaration -O3 -DVERSION=\"1.23\" -DXS_VERSION=\"1.23\" "-I/usr/local/lib/perl5/5.40.0/darwin-2level/CORE" MariaDB.c
In file included from MariaDB.c:186:
/usr/local/lib/perl5/site_perl/5.40.0/darwin-2level/auto/DBI/Driver_xst.h:33:5: warning: '(' and '{' tokens introducing statement expression appear in different macro expansion contexts [-Wcompound-token-split-by-macro]
    EXTEND(SP, params);
    ^~~~~~~~~~~~~~~~~~
/usr/local/lib/perl5/5.40.0/darwin-2level/CORE/pp.h:460:25: note: expanded from macro 'EXTEND'
    # define EXTEND(p,n) STMT_START { \
    ^~~~~~~~~~
/usr/local/lib/perl5/site_perl/5.40.0/darwin-2level/auto/DBI/dbipport.h:4185:31: note: expanded from macro 'STMT_START'
    # define STMT_START (void)( /* gcc supports ``({ STATEMENTS; })'' */
    ^
/usr/local/lib/perl5/site_perl/5.40.0/darwin-2level/auto/DBI/Driver_xst.h:33:5: note: '{' token is here
    EXTEND(SP, params);
    ^~~~~~~~~~~~~~~~~~
/usr/local/lib/perl5/5.40.0/darwin-2level/CORE/pp.h:460:36: note: expanded from macro 'EXTEND'
    # define EXTEND(p,n) STMT_START { \
    ^
In file included from MariaDB.c:186:
/usr/local/lib/perl5/site_perl/5.40.0/darwin-2level/auto/DBI/Driver_xst.h:33:5: warning: '}' and ')' tokens terminating statement expression appear in different macro expansion contexts [-Wcompound-token-split-by-macro]
    EXTEND(SP, params);
    ^~~~~~~~~~~~~~~~~~
/usr/local/lib/perl5/5.40.0/darwin-2level/CORE/pp.h:466:25: note: expanded from macro 'EXTEND'
    } STMT_END
    ^
/usr/local/lib/perl5/site_perl/5.40.0/darwin-2level/auto/DBI/Driver_xst.h:33:5: note: ')' token is here
    EXTEND(SP, params);
    ^~~~~~~~~~~~~~~~~~
/usr/local/lib/perl5/5.40.0/darwin-2level/CORE/pp.h:466:27: note: expanded from macro 'EXTEND'
    } STMT_END
    ^~~~~~~~
/usr/local/lib/perl5/site_perl/5.40.0/darwin-2level/auto/DBI/dbipport.h:4186:25: note: expanded from macro 'STMT_END'
    # define STMT_END )
    ^ ^
./MariaDB.xsi:214:39: warning: '(' and '{' tokens introducing statement expression appear in different macro expansion contexts [-Wcompound-token-split-by-macro]
    if (is_selectrow_array) { XSRETURN_EMPTY; } else { XSRETURN_UNDEF; }
    ^~~~~~~~~~~~~~
/usr/local/lib/perl5/5.40.0/darwin-2level/CORE/XSUB.h:340:27: note: expanded from macro 'XSRETURN_EMPTY'
    #define XSRETURN_EMPTY STMT_START { XSRETURN(0); } STMT_END
    ^~~~~~~~~~
/usr/local/lib/perl5/site_perl/5.40.0/darwin-2level/auto/DBI/dbipport.h:4185:31: note: expanded from macro 'STMT_START'
    # define STMT_START (void)( /* gcc supports ``({ STATEMENTS; })'' */
    ^
./MariaDB.xsi:214:39: note: '{' token is here
    if (is_selectrow_array) { XSRETURN_EMPTY; } else { XSRETURN_UNDEF; }
    ^~~~~~~~~~~~~~
/usr/local/lib/perl5/5.40.0/darwin-2level/CORE/XSUB.h:340:38: note: expanded from macro 'XSRETURN_EMPTY'
    #define XSRETURN_EMPTY STMT_START { XSRETURN(0); } STMT_END
    ^
./MariaDB.xsi:214:39: warning: '(' and '{' tokens introducing statement expression appear in different macro expansion contexts [-Wcompound-token-split-by-macro]
    if (is_selectrow_array) { XSRETURN_EMPTY; } else { XSRETURN_UNDEF; }
    ^~~~~~~~~~~~~~
/usr/local/lib/perl5/5.40.0/darwin-2level/CORE/XSUB.h:340:57: note: expanded from macro 'XSRETURN_EMPTY'
    #define XSRETURN_EMPTY STMT_START { XSRETURN(0); } STMT_END
    ^~~~~~~~~~~
/usr/local/lib/perl5/5.40.0/darwin-2level/CORE/XSUB.h:325:5: note: expanded from macro 'XSRETURN'
    STMT_START { \
    ^~~~~~~~~~
/usr/local/lib/perl5/site_perl/5.40.0/darwin-2level/auto/DBI/dbipport.h:4185:31: note: expanded from macro 'STMT_START'
Failed 6/7 subtests
Can't use an undefined value as a subroutine reference at /usr/local/lib/perl5/5.40.0/TAP/Harness.pm line 612.
make: *** [test_dynamic] Error 255
submitted by /u/kosaromepr
op.c: treat bitwise-{and,xor,or} assignment as lvalue

Previously, `($x &= $y) += $z` was fine (since `&=` returns an lvalue),
but not under feature "bitwise":

    Can't modify numeric bitwise and (&) in addition (+) at ...

Similar for `^=` and `|=`.

Extend the lvalue behavior of the old number/string mixed bitwise
assignment operators (`&= ^= |=`) to the new separate bitwise assignment
operators available under feature "bitwise" (`&= ^= |= &.= ^.= |.=`).

Fixes #22412.

t/op/bop.t: consistently use 4-space indentation

Perl commits on GitHub

Published by mauke on Friday 26 July 2024 05:29

t/op/bop.t: consistently use 4-space indentation

Previously, this block used 2- and even 1-space indents unlike the rest
of the file.

This week in PSC (153) | 2024-07-25

blogs.perl.org

Published by Perl Steering Council on Friday 26 July 2024 02:59

After over a month without a PSC meeting, this was the first meeting of the new, shiny PSC. Today, we:

  • welcomed Aristotle, and said goodbye to Paul
  • discussed the current projects in flight, the PPC process and the onboarding tasks (preparing for next year already!)
  • talked about POD, docstrings, and the Perl release process

autodoc: Vertically align long/short usage names

Perl commits on GitHub

Published by khwilliamson on Friday 26 July 2024 00:15

autodoc: Vertically align long/short usage names

perlapi and perlintern now output both the short and long name
prototypes for every API element that has them.  This commit adds code
so that every element in a group is nicely vertically aligned, so that
any 'Perl_' prefix is outdented, and the basenames are vertically
aligned, along with the arguments.

Creating Custom Functions In PostgreSQL

dev.to #perl

Published by Lawrence Cooke on Thursday 25 July 2024 13:19

In PostgreSQL, custom functions can be created to solve complex problems.

These can be written using the default PL/pgSQL scripting language, or they can be written in another scripting language.

Python, Perl, Tcl and R are some of the scripting languages supported.

While PL/pgSQL comes with any Postgres installation, to use other languages requires some setup.

Installing the extension

Before an extension can be used, the extension package needs to be installed.

On Ubuntu you would run:

Perl

sudo apt-get -y install postgresql-plperl-14

The package name 'postgresql-plperl-14' is specific to PostgreSQL version 14. If you're using a different version of PostgreSQL, you need to change the version number in the package name to match your installed PostgreSQL version.

Python 3

sudo apt-get install postgresql-plpython3-14

Activating the extension

To activate the extension in PostgreSQL the extension must be defined using the CREATE EXTENSION statement.

Perl

CREATE EXTENSION plperl;

Python

CREATE EXTENSION plpython3u;

PL/Python is only available as an untrusted language, which is why the extension name ends in u.

Hello world example

Once the extension has been created, a custom function can be created using the extension.

Perl

CREATE OR REPLACE FUNCTION hello(name text) 
RETURNS text AS $$
    my ($name) = @_;
    return "Hello, $name!";
$$ LANGUAGE plperl;

Python

CREATE OR REPLACE FUNCTION hello(name text)
RETURNS text AS $$
    return "Hello, " + name + "!"
$$ LANGUAGE plpython3u;

Breaking this down line by line

CREATE OR REPLACE FUNCTION hello(name text)

This line is how a function is created in Postgres. By using CREATE OR REPLACE, it will overwrite whatever function is already defined with the name hello with the new function.

Using CREATE FUNCTION hello(name text) will prevent the function from overwriting an existing function and will error if the function already exists.

RETURNS text AS $$

This defines what Postgres data type will be returned. It's important that the data type specified is a type recognized by Postgres. A custom data type can be specified if it has already been defined.

$$ is a delimiter to mark the beginning and end of a block of code. In this line it's marking the start of the code block.

All code between the start and end $$ will be executed by Postgres.

$$ LANGUAGE plperl;

The closing $$ marks the end of the code block, and the LANGUAGE clause tells Postgres which language the code is written in.

Using the function

Functions can be used like any built-in Postgres function:

SELECT hello('world');

This will return a single column with the value Hello, world!

Functions can be part of more complex queries:

SELECT id, title, hello('world') greeting FROM my_table;

More complex example

Here is an example function that accepts text from a field and returns a word count.

CREATE OR REPLACE FUNCTION word_count(paragraph text)
RETURNS json AS $$
use strict;
use warnings;

my ($text) = @_;

my @words = $text =~ /\w+/g;
my $word_count = scalar @words;

my $result = '{' .
    '"word_count":' . $word_count .
'}';
return $result;
$$ LANGUAGE plperl;

This returns a JSON formatted result with the word count.

We can add more detailed statistics to the function.

CREATE OR REPLACE FUNCTION word_count(paragraph text)
RETURNS json AS $$
use strict;
use warnings;

my ($text) = @_;

my @words = $text =~ /\w+/g;

my $word_count = scalar @words;

my $sentence_count = ( $text =~ tr/!?./!?./ ) || 0;

my $average_words_per_sentence =
  $sentence_count > 0 ? $word_count / $sentence_count : 0;

my $result = '{' .
    '"word_count":' . $word_count . ',' .
    '"sentence_count":' . $sentence_count . ',' .
    '"average_words_per_sentence":"' . sprintf("%.2f", $average_words_per_sentence) . '"' .
'}';

return $result;
$$ LANGUAGE plperl SECURITY DEFINER;

Now when we use it in a query:

SELECT word_count(text_field) word_count FROM my_table;

It will return JSON like:

{"word_count":116,"sentence_count":15,"average_words_per_sentence":"7.73"}
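For comparison, the same statistics can be sketched as a standalone Python function. This is only an illustration of the counting logic, not a PL/Python function from the article; the name word_stats and the sample sentence are assumptions made for the example. It uses the standard json module instead of string concatenation, and keeps the average as a formatted string to match the output shown above.

```python
import json
import re


def word_stats(text: str) -> str:
    """Return word and sentence statistics for `text` as a JSON string."""
    # \w+ matches runs of word characters, as in the Perl version.
    words = re.findall(r"\w+", text)
    word_count = len(words)

    # Count sentence-ending punctuation, mirroring tr/!?./!?./ in Perl.
    sentence_count = sum(text.count(mark) for mark in "!?.")

    # Guard against division by zero when no punctuation is present.
    average = word_count / sentence_count if sentence_count else 0

    return json.dumps(
        {
            "word_count": word_count,
            "sentence_count": sentence_count,
            # Kept as a string with two decimals, like the Perl sprintf.
            "average_words_per_sentence": f"{average:.2f}",
        },
        separators=(",", ":"),
    )


print(word_stats("Hello world. How are you? Fine!"))
```

Note that counting punctuation marks is only an approximation of counting sentences: an abbreviation such as "e.g." adds two to the sentence count in both the Perl function and this sketch.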

Security considerations

When using custom functions or external scripting languages, there are additional security considerations to take into account. It can be a juggling act to get the right balance between usability and security.

Security Definer vs Security Invoker

In the previous function, the SECURITY DEFINER option was added to the create function statement.

It's important to think about how you want a function run from a security point of view.

The default behavior is to use SECURITY INVOKER. This will run the function with the privileges of the user who is running the function.

SECURITY DEFINER provides more control over the privileges granted to the function. Using this mode, the function will run with the privileges of the user who owns the function, which by default is the user who created it.

This can be both good and bad. If a function is created by a user with limited privileges, then there is little harm that can be done to the database.

If the function is created by a user with high access privileges, then the function will run with those same privileges. Depending on the type of function, this could allow a user to run the function with more open privileges than they have been granted.

There are times when this is useful. For example, if a user does not have read privileges on a table but the function needs to read it, SECURITY DEFINER can give the function the read privileges it requires.

Trusted and untrusted extensions

The Perl examples above use plperl, which is a trusted extension. Where a trusted variant exists, it is usually the right one to use.

Trusted extensions have limited access to the server's file system and system calls.

Untrusted extensions are named with a u suffix (plperlu, plpython3u). PL/Python is only available in untrusted form.

Untrusted extensions allow more access to the server's file system.

There may be cases where this is required, for example, if you want to use Perl modules, Python Libraries, or use system calls.

In the example above, the JSON output was generated as a string. If desired, the Perl JSON module could have been used to encode the data as JSON instead. Doing this would require the untrusted extension, in order to access the JSON module.
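To illustrate why a JSON encoder can be worth the extra setup, here is a small standalone Python sketch (not PL/Perl, and not from the article). The title field and both function names are hypothetical. Hand-built JSON is fine for the numeric values in word_count, but it breaks as soon as a text value contains a double quote, whereas an encoder escapes it.

```python
import json


def naive_json(title: str) -> str:
    # Hand-built JSON: fine for numbers, fragile for arbitrary text.
    return '{"title":"' + title + '"}'


def encoded_json(title: str) -> str:
    # json.dumps escapes quotes, backslashes and control characters.
    return json.dumps({"title": title}, separators=(",", ":"))


tricky = 'She said "hi"'

try:
    json.loads(naive_json(tricky))
    print("naive output parsed")
except json.JSONDecodeError:
    print("naive output is not valid JSON")

# The encoded version round-trips back to the original text.
print(json.loads(encoded_json(tricky))["title"] == tricky)
```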

It's advisable to not use the untrusted extensions, but if necessary, use with caution and understand the potential risks.

If Perl is being used, Perl will run in taint mode when the untrusted extension is in use.

Final Thoughts

Being able to take advantage of Perl's advanced text processing and memory management, or Python's data analytics libraries, within PostgreSQL can be really powerful.

Passing off complex tasks to tools more suited to handling the task can reduce overhead on the database.

As always, when using custom functions and external scripting languages, take precautions to ensure secure usage.

Perl Weekly Challenge 279: Sort Letters

blogs.perl.org

Published by laurent_r on Wednesday 24 July 2024 19:19

These are some answers to the Week 279, Task 1, of the Perl Weekly Challenge organized by Mohammad S. Anwar.

Spoiler Alert: This weekly challenge deadline is due in a few days from now (on July 28, 2024, at 23:59). This blog post provides some solutions to this challenge. Please don’t read on if you intend to complete the challenge on your own.

Task 1: Sort Letters

You are given two arrays, @letters and @weights.

Write a script to sort the given array @letters based on the @weights.

Example 1

Input: @letters = ('R', 'E', 'P', 'L')
       @weights = (3, 2, 1, 4)
Output: PERL

Example 2

Input: @letters = ('A', 'U', 'R', 'K')
       @weights = (2, 4, 1, 3)
Output: RAKU

Example 3

Input: @letters = ('O', 'H', 'Y', 'N', 'P', 'T')
       @weights = (5, 4, 2, 6, 1, 3)
Output: PYTHON

Sort Letters in Raku

We need some way of combining the data of the two arrays to be able to sort the items of the @letters array in accordance with the values of the @weights array. For example, we could build an intermediate data structure of records, each containing a letter and its weight, sort this data structure by weight, and then extract the letters from the sorted data structure.

We don't need, however, to create a new variable to hold this intermediate data structure. We will use an anonymous array of pairs to host it and perform the required operations in a data pipeline. This solution is often called the Schwartzian Transform, because it was suggested by Randal Schwartz on a Perl newsgroup back in 1994, in the early days of Perl 5.

A literal translation to Raku of the canonical Perl Schwartzian Transform might look like this:

sub sort-letters (@letters, @weights) {
    return join "", map { $_[0] }, 
    sort { $^a[1] <=> $^b[1] }, 
    map { @letters[$_], @weights[$_] }, 0..@letters.end;
}

When trying to understand this construct, it is probably best to read from bottom to top, and from right to left. On the last line, the 0..@letters.end code at the end creates a list of subscripts, which the map at the beginning of that line uses to create an anonymous array of arrays. The next line upwards sorts the data according to the weights and, finally, the map on the first line extracts the letters from the sorted array.
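For readers who find the bottom-to-top pipeline hard to follow, the same three stages can be written out top to bottom. The sketch below uses Python purely as an illustration (it is not part of the original solutions), and the intermediate variable name records is an assumption; the function mirrors the first Raku version, including its explicit two-argument comparison.

```python
from functools import cmp_to_key


def sort_letters(letters, weights):
    # Stage 1 (last line of the Raku code): build [letter, weight] records.
    records = [[letters[i], weights[i]] for i in range(len(letters))]
    # Stage 2: sort with an explicit two-argument comparison on the weights.
    records.sort(key=cmp_to_key(lambda a, b: a[1] - b[1]))
    # Stage 3 (first line): extract the letters and join them.
    return "".join(record[0] for record in records)


print(sort_letters(["R", "E", "P", "L"], [3, 2, 1, 4]))
```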

This is, as I said, a Raku translation of how we would do it in Perl. But Raku offers some opportunities for improvement. In Raku, when the code block or subroutine called by sort takes only one parameter, then it specifies not the comparison subroutine, but the transformation to be applied to each item of the input data before sorting. So, we can simplify our Schwartzian Transform as follows:

sub sort-letters2 (@letters, @weights) {
    return join "", map { $_[0] }, sort { $_[1] }, 
    map { @letters[$_], @weights[$_] }, 0..@letters.end;
}

In addition, the creation of the intermediate data structure can be greatly simplified using the zip routine, leading to a one-line Schwartzian Transform. We now display the full program:

sub sort-letters3 (@let, @w) {
    join "", map { $_[0] }, sort { $_[1] }, zip @let, @w;
}

my @tests = (< R E P L>, <3 2 1 4>),
            (<A U R K>, <2 4 1 3>),
            (<O H Y N P T>, <5 4 2 6 1 3>);
for @tests -> @test {
    printf "%-14s => ", "@test[0]";
    say sort-letters3 @test[0], @test[1];
}

This program (as well as its previous versions) displays the following output:

$ raku ./sort-letters.raku
R E P L        => PERL
A U R K        => RAKU
O H Y N P T    => PYTHON
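The two Raku simplifications (a one-parameter sort block that acts as a key extractor, and zip to pair the arrays) have close counterparts in other languages. As a rough analogy only, here is a Python sketch in which the key= argument of sorted plays the role of the single-parameter block; it is not a translation the author provided.

```python
def sort_letters(letters, weights):
    # zip pairs each letter with its weight; key= names the value to sort by.
    pairs = sorted(zip(letters, weights), key=lambda pair: pair[1])
    return "".join(letter for letter, _ in pairs)


tests = [
    (["R", "E", "P", "L"], [3, 2, 1, 4]),
    (["A", "U", "R", "K"], [2, 4, 1, 3]),
    (["O", "H", "Y", "N", "P", "T"], [5, 4, 2, 6, 1, 3]),
]
for letters, weights in tests:
    # Left-align the input in a 14-character column, like printf "%-14s".
    print(f"{' '.join(letters):<14} => {sort_letters(letters, weights)}")
```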

Sort Letters in Perl

This is a port to Perl of the first Raku program above, with the original implementation of the Schwartzian Transform. Please refer to the above section if you need explanations.

use strict;
use warnings;
use feature 'say';

sub sort_letters {
    my @letters = @{$_[0]};
    my @indices = @{$_[1]};
    return map $_->[0], sort { $$a[1] <=> $$b[1] } 
    map [ $letters[$_], $indices[$_] ], 0..$#letters;
}

# qw() builds the word lists; in Perl, <...> with spaces is a glob() call.
my @tests = ( [ [qw(R E P L)], [qw(3 2 1 4)] ],
              [ [qw(A U R K)], [qw(2 4 1 3)] ],
              [ [qw(O H Y N P T)], [qw(5 4 2 6 1 3)] ] );
for my $test (@tests) {
    printf "%-14s => ", "@{$test->[0]}";
    say sort_letters $test->[0], $test->[1];
}

This program displays the following output:

$ perl  ./sort-letters.pl
R E P L        => PERL
A U R K        => RAKU
O H Y N P T    => PYTHON

Wrapping up

Next week's Perl Weekly Challenge will start soon. If you want to participate in this challenge, please check https://perlweeklychallenge.org/ and make sure you answer the challenge before 23:59 BST (British summer time) on August 4, 2024. And, please, also spread the word about the Perl Weekly Challenge if you can.

Maintaining Perl (Tony Cook) June 2024

Perl Foundation News

Published by alh on Tuesday 23 July 2024 14:10


Tony writes:

[Hours] [Activity]

2024/06/03 Monday
 0.72 #22211 check smoke results, check and re-word one commit message, make PR 22257
 1.95 #22230 debugging
 2.42 #22230 debugging
=====
 5.09

2024/06/04 Tuesday
 0.45 #22211 cleanup, testing, update PR
 1.20 #22252 review, testing
 1.47 #22230 test issues. debugging
=====
 3.12

2024/06/05 Wednesday
 2.20 #22230 review khw-env changes
 1.47 #22230 try an experiment with character encoding, which works, comment
=====
 3.67

2024/06/06 Thursday
 1.25 #22230 discussion with khw (zoom, irc)
 0.87 #22169 work on a fix
 1.10 #22169 debugging, more work on fix
=====
 3.22

2024/06/11 Tuesday
 0.75 github notifications, look into #22208 warnings and comment
 0.62 #22169 more cleanup
 1.70 #22169 more cleanup, testing
 0.18 #22252 review and approve
 0.32 #22268 review and approve
 0.12 #22269 review, check history and approve
 0.13 #22270 review, research, comment and approve
 0.13 #22271 review and comment
=====
 3.95

2024/06/12 Wednesday
 2.35 #22169 setup regression tests, debugging
 2.07 #22169 debugging, fixes
=====
 4.42

2024/06/13 Thursday
 0.18 github notifications
 0.18 #22271 research, work up tiny cast-away-const fix, anything else waits for #22271 to be applied
 0.77 #22270 review re-work
 0.83 #22270 more review, comment
=====
 1.96

2024/06/17 Monday
 0.15 github notifications
 0.23 #22270 review updated PR and approve
 0.22 #22271 more cast-away-const fix, push for CI
 1.52 #22169 re-check, polish
 1.53 #22169 fix an issue, try avoiding any_sv (CVs already have the stash)
=====
 3.65

2024/06/18 Tuesday
 0.08 #22300 review CI results and open PR 22300
 0.53 #22273 review discussion, research, comment
 0.08 #22276 review and approve
 0.53 #22280 review, testing and approve
 0.17 #22274 review and apply to blead
 2.23 tick-in-name removal: implementation, working through tests
=====
 3.62

2024/06/19 Wednesday
 0.52 #22295 review logs, write most of a response, OP, closes the ticket
 1.58 tick-in-name removal: working through tests, some core fixes
 1.82 tick-in-name removal: more test fixes, commits, push for CI
 0.37 #22292 review, research and approve
 0.10 #22287 review and approve
 0.22 #22289 review, research and comment
=====
 4.61

2024/05/20 Monday
 0.10 tick-in-name removal: check CI, re-run failed job (appears unrelated to change)
 0.08 #22070 review and comment
 0.47 security list
 0.08 tick-in-name removal: check CI results and open PR 22303
 0.30 #22289 review updates, approve and comment on StructCopy()
 0.22 macos workflow update: updates, push for CI
 0.23 #22296 rebase, minor commit message change, testing, push
 0.08 #22296 check CI results and apply to blead
 0.08 macos workflow update: check CI results, make PR 22306
 0.10 #22070 review latest update, approve
 1.18 #22169 more cleanup, testing
=====
 2.92

2024/06/24 Monday
 0.15 #22257 recheck and apply to blead
 0.32 #22282 review and approve
 0.75 #22283 review and comments
 0.08 #22290 review and approve
 0.17 #22293 review and comment
 0.33 #22298 review and approve
 0.57 #22309 review, comment and approve
 1.48 smartmatch removal
=====
 3.85

2024/06/25 Tuesday
 0.30 github notifications
 0.17 #22310 review and approve
 0.08 #22311 review and approve
 0.20 #22312 review, research and approve
 0.35 briefly review coverity scan results, email list about the many new defect reports
 0.08 #22313 review and approve
 0.08 #22314 review and approve
 0.15 #22315 review and approve
 0.12 #22318 review and approve
 0.08 #22319 review and approve
 0.08 #22320 review and comment
 2.67 smartmatch removal
=====
 4.36

2024/06/26 Wednesday
 0.32 #22321 review and approve
 0.08 #22322 review and approve
 0.15 #22323 review and comment
 0.22 #22324 review and comment
 0.12 #22325 review and approve
 0.20 #22326 review and approve
 0.15 #22327 review and approve
 0.08 #22328 review and approve
 0.73 #22329 review, research and comment
 0.13 #22331 review and comment
 0.33 #22332 review, research and approve with comment
 0.10 #22333 review and comment
 0.08 #22334 review and approve
 0.18 #22341 review and comment
 1.30 smartmatch removal, working through tests
 1.08 smartmatch removal, more tests, just deparse to go
=====
 5.25

2024/06/27 Thursday
 1.32 #22329 long comment
 0.32 #22344 review and comment
 0.43 #22345 review and approve
 0.17 #22349 review and comment
 0.78 smartmatch removal: get deparse tests passing (need to do docs)
 2.27 smartmatch removal: docs, some cleanup, testing
=====
 5.29

Which I calculate is 58.98 hours.

Approximately 53 tickets were reviewed or worked on, and 3 patches were applied.
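As a quick cross-check of the monthly figure, the fifteen daily totals listed in the report can be added up. The snippet below is only a verification sketch in Python; the list is copied from the daily totals above, and Decimal is used so that the two-decimal values add exactly.

```python
from decimal import Decimal

# Daily totals, in the order they appear in the report.
daily_totals = [
    "5.09", "3.12", "3.67", "3.22", "3.95",
    "4.42", "1.96", "3.65", "3.62", "4.61",
    "2.92", "3.85", "4.36", "5.25", "5.29",
]

total = sum(Decimal(hours) for hours in daily_totals)
print(total)  # 58.98
```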

How I use PostgreSQL's timestamptz fields in my Mojo apps

blogs.perl.org

Published by karjala on Tuesday 23 July 2024 11:39

I created a function in Perl called pg_dt, that will convert PostgreSQL’s datetime values into Perl’s DateTime values and vice versa. This is useful both when you want to store DateTime values into the database, or want to convert the pg datetime/timestamp value from the database into a DateTime object value that Perl can use.

I really can’t seem to include code blocks in my posts on this platform (tried Preview with Markdown and Markdown With SmartyPants without success), so you can read the rest of this article on my blog.

Perl Weekly #678 - Perl Steering Council

dev.to #perl

Published by Gabor Szabo on Monday 22 July 2024 15:43

Originally published at Perl Weekly 678

Hi there,

The release of the latest version of Perl kicked off the process of electing a new Perl Steering Council. I am happy to see the return of Philippe Bruhat and Graham Knop to the fold, along with the new member Aristotle Pagaltzis. I would like to take this opportunity to thank Paul Evans for his quality contributions. Thanks to the PSC, we get regular updates about the work being carried out. If you are interested in knowing more about it, there is a handy website, https://psc.perlhacks.com, created by Dave Cross.

As you all know, we had the big release of Perl v5.40 last month, and within one month we have two more releases, Perl v5.41.0 and Perl v5.41.1. If you check out the changes in the last two releases, you will find mostly updates to Modules and Pragmata. For me, the most significant change is the addition of chdir as a subroutine to the CORE:: namespace.

There is another big change proposed by Curtis Poe about allowing can() to take a list of methods. Having checked the discussion, I see that it was unfortunately closed without being merged.

Have you played with Bitcoin? Well, I haven't yet, but I will soon when I find some spare time. For now, if you are interested in learning more about it from the Perl perlspective, we have a 2-part series, part one and part two, by Bartosz Jarzyna, shared for the Perl Advent Calendar 2023. Once you have finished reading them, we have an update on the development of Bitcoin libraries.

Do you create quick one-liners in Perl? If yes, then please do check out this short video. I started watching but couldn't finish it, as I lost interest halfway through. I have bookmarked it for now and will come back to it soon. If you are looking for cool one-liners in Perl or Raku, then I would highly recommend the contributions by Team PWC below, under the section The Weekly Challenge.

Last but not least, we have another big release, Dancer2 1.1.1, announced in the blog post by Jason Crome. I am very excited about the update and glad to know that it is being looked after. Thanks to the entire Dancer Core Team. Keep up the great work.

Enjoy the summer holiday break and the rest of the newsletter.

--
Your editor: Mohammad Sajid Anwar.

Announcements

Dancer2 1.1.1 Released

The Dancer Core Team is happy to announce that Dancer2 1.1.1 is on its way to CPAN.

Updated, curated, Perl module TiddlyWiki

Virtual presentations for Perl developers

Continuous Integration (CI): GitHub Actions for Perl Projects (Free Virtual presentation on August 4)

This event was postponed to August 4. In this virtual event you will learn why and how to use GitHub Actions as a CI system for your Perl projects. The meeting is free of charge thanks to my supporters via Patreon and GitHub. Besides this event I am running many more, so make sure you check the Code Mavens meetup group and also register for it.

Articles

Sailing the Seven YAPCs

YAPC event report by another attendee, Buddy Burden. Thanks for sharing your experience with us.

Repository of examples using Perl and Assembly together

Sometimes one needs an extra ounce of performance. Why not combine the high level semantics of Perl with the punch of assembly?

Grants

PEVANS Core Perl 5: Grant Report for April, May, June 2024

Maintaining Perl 5 Core (Dave Mitchell): June 2024

The Weekly Challenge

The Weekly Challenge by Mohammad Sajid Anwar will help you step out of your comfort-zone. You can even win prize money of $50 by participating in the weekly challenge. We pick one champion at the end of the month from among all of the contributors during the month, thanks to the sponsor Lance Wicks.

The Weekly Challenge - 279

Welcome to a new week with a couple of fun tasks "Sort Letters" and "Split String". If you are new to the weekly challenge then why not join us and have fun every week. For more information, please read the FAQ.

RECAP - The Weekly Challenge - 278

Enjoy a quick recap of last week's contributions by Team PWC dealing with the "Sort String" and "Reverse Word" tasks in Perl and Raku. You will find plenty of solutions to keep you busy.

TWC278

Another display of Perl's cool features and great solutions. Thanks for sharing the knowledge with us.

Sginrt and droW

Don't you love the brave heart? I liked taking the difficult path and getting the job done with ease. Keep up the great work.

Sort of Reverse

Raku one-liner is showing off method chaining once again. Raku Rocks.

Perl Weekly Challenge: Week 278

Coming from a Raku master, the Perl one-liner is one of my favourites. Great work, keep it up.

Split, Sort and Join

Bullet points converted into working code in Perl. You really don't want to miss the gem. Well done.

Perl Weekly Challenge 278: Sort String

Pure regex solution in Perl and Raku. Please do check out the workings.

Perl Weekly Challenge 278: Reverse Word

Simple yet elegant approach to get the job done. You get a bonus in-depth discussion. Well done and keep it up.

CHALLENGES (almost) IN A ROW

Consistent contributor of PostgreSQL solutions. Highly recommended as you get bonus too.

Perl Weekly Challenge 278

Classic one-liner in Perl, as always, that takes care of every given example. Keep up the great work.

Word Reverse String Sort

I love the musical introduction to a tech blog and then follows the pure tech discussion. Great work, thanks for sharing.

Tangled string and drow

DIY web interface with detailed analysis. Cool quality solutions every week.

The Weekly Challenge - 278

Pure Perl solution without any dependency this week, covering every example. Great work.

The Weekly Challenge #278

Breaking down a big task into smaller tasks makes it easy to implement. Take a look and see for yourself.

Reverse the Sort in Strings of Words

Playing with regex can be a tough task for beginners, but once you get there it becomes a piece of cake. Highly recommended.

Rakudo

2024.29 Intel -exprJIT +5%

Weekly collections

NICEPERL's lists

Great CPAN modules released last week.

Events

Toronto Perl Mongers monthly meeting

July 25, 2024, Virtual event

Continuous Integration (CI): GitHub Actions for Perl Projects

August 4, 2024, in Zoom

GitHub Pages for Perl developers

August 15, 2024, in Zoom

London Perl and Raku Workshop

October 26, 2024, in London, UK

(C) Copyright Gabor Szabo
The articles are copyright the respective authors.

Perl

The Weekly Challenge

Published on Sunday 21 July 2024 23:19

TABLE OF CONTENTS

  01. HEADLINES
  02. STAR CONTRIBUTORS
  03. CONTRIBUTION STATS
  04. GUESTS
  05. LANGUAGES
  06. CENTURION CLUB
  07. DAMIAN CONWAY’s CORNER
  08. ANDREW SHITOV’s CORNER
  09. PERL SOLUTIONS
  10. RAKU SOLUTIONS
  11. PERL & RAKU SOLUTIONS

HEADLINES

Thank you Team PWC for your continuous support and encouragement.

STAR CONTRIBUTORS

Following members shared solutions to both tasks in Perl and Raku as well as blogged about it.

Blog

The Weekly Challenge

Published on Sunday 21 July 2024 23:19


The Weekly Challenge - Perl & Raku

The Weekly Challenge

Published on Sunday 21 July 2024 23:19


Raku

The Weekly Challenge

Published on Sunday 21 July 2024 23:19


Colin Crain › Perl Weekly Review #127

The Weekly Challenge

Published on Sunday 21 July 2024 23:19

( …continues from previous week. )

What's new on CPAN - June 2024

perl.com

Published on Sunday 21 July 2024 22:00

Welcome to “What’s new on CPAN”, a curated look at last month’s new CPAN uploads for your reading and programming pleasure. Enjoy!

APIs & Apps

Data

Development & Version Control

Science & Mathematics

Web

Other

Koha Open Source Ambassadors Initiative and its benefits for Perl - Andrii Nugged - TPRC 2024

The Perl and Raku Conference YouTube channel

Published by The Perl and Raku Conference - Las Vegas, NV 2024 on Saturday 20 July 2024 23:19

(dv) 9 great CPAN modules released last week

Niceperl

Published by Unknown on Saturday 20 July 2024 18:01

Updates for great CPAN modules released last week. A module is considered great if its favorites count is greater than or equal to 12.

  1. CPAN::Audit - Audit CPAN distributions for known vulnerabilities
    • Version: 20240718.001 on 2024-07-18, with 14 votes
    • Previous CPAN version: 20240626.001 was 22 days before
    • Author: BDFOY
  2. Dancer2 - Lightweight yet powerful web application framework
    • Version: 1.1.1 on 2024-07-18, with 136 votes
    • Previous CPAN version: 1.1.0 was 7 months, 6 days before
    • Author: CROMEDOME
  3. DBIx::Class::DeploymentHandler - Extensible DBIx::Class deployment
    • Version: 0.002234 on 2024-07-17, with 21 votes
    • Previous CPAN version: 0.002233 was 4 years, 9 months, 22 days before
    • Author: WESM
  4. IO::Socket::SSL - Nearly transparent SSL encapsulation for IO::Socket::INET.
    • Version: 2.088 on 2024-07-14, with 49 votes
    • Previous CPAN version: 2.087 was 6 days before
    • Author: SULLR
  5. List::Gen - provides functions for generating lists
    • Version: 0.975 on 2024-07-16, with 23 votes
    • Previous CPAN version: 0.974 was 12 years, 8 months, 4 days before
    • Author: SOMMREY
  6. Object::Pad - a simple syntax for lexical field-based objects
    • Version: 0.809 on 2024-07-14, with 43 votes
    • Previous CPAN version: 0.808 was 6 months, 17 days before
    • Author: PEVANS
  7. Pod::Man - Convert POD data to various other formats
    • Version: v6.0.2 on 2024-07-14, with 14 votes
    • Previous CPAN version: v6.0.1 was 1 day before
    • Author: RRA
  8. RxPerl - an implementation of Reactive Extensions / rxjs for Perl
    • Version: v6.29.0 on 2024-07-16, with 13 votes
    • Previous CPAN version: v6.28.0 was 9 months, 18 days before
    • Author: KARJALA
  9. Sys::Virt - libvirt Perl API
    • Version: v10.5.0 on 2024-07-16, with 17 votes
    • Previous CPAN version: v10.2.0 was 3 months, 8 days before
    • Author: DANBERR

TinyNES - Graham Ollis - TPRC 2024 - Lightning Talk

The Perl and Raku Conference YouTube channel

Published by The Perl and Raku Conference - Las Vegas, NV 2024 on Thursday 18 July 2024 18:30

Lessons From an Idaho Potato Farmer - David Laulusa - TPRC 2024 - Lightning Talk

The Perl and Raku Conference YouTube channel

Published by The Perl and Raku Conference - Las Vegas, NV 2024 on Thursday 18 July 2024 01:11

SPVM::Resource::Eigen released

dev.to #perl

Published by Yuki Kimoto - SPVM Author on Wednesday 17 July 2024 23:22

SPVM::Resource::Eigen, a SPVM resource for the C++ library Eigen that can calculate the matrix operations required by AI/Deep Learning, has been released.

SPVM::Resource::Eigen - CPAN

Imitating a Drum Circle - Gene Boggs - TPRC 2024 - Lightning Talk

The Perl and Raku Conference YouTube channel

Published by The Perl and Raku Conference - Las Vegas, NV 2024 on Wednesday 17 July 2024 23:20

Glue - Lee Johnson - TPRC 2024 - Lightning Talk

The Perl and Raku Conference YouTube channel

Published by The Perl and Raku Conference - Las Vegas, NV 2024 on Wednesday 17 July 2024 22:54


Paul writes:

My time was almost entirely consumed during April and May by running a stage show, and then June by travelling the entire length of the country supporting a long-distance bicycle adventure; as such I didn't get much Perl work done and I've only just got around to writing out what few things I did get done.

Entirely related to a few last pre-release steps to get features nicely lined up for the 5.40 release:

Hours:

5 = Stablise experiments for perl 5.40 https://github.com/Perl/perl5/pull/22123

1 = Fix perlexperiment.pod; create v5.40 feature bundle https://github.com/Perl/perl5/pull/22141

1 = Bugfix for GH22278 (uninitialised fields during DESTROY) https://github.com/Perl/perl5/pull/22280

Total: 7 hours

I'm now back to the usual schedule, so I hope to have more to work on for July onwards...

Perl Weekly #677 - Reports from TPRC 2024

dev.to #perl

Published by Gabor Szabo on Monday 15 July 2024 05:22

Originally published at Perl Weekly 677

Hi there!

In case you missed it earlier, there are plenty of videos from The Perl and Raku Conference in Las Vegas that you can watch.

There is also a thread on Reddit answering the question: Perl and why you use it.

The first time I taught Perl was in the year 2000. It was one of the local training companies that hired me, gave me their teaching material, and sent me into the classroom. I remember standing in front of the class for some time that felt like ages without any clue what to say. Then somehow I started to speak. Apparently the course went well enough, as they asked me to teach again. Since then a lot has happened. I created my own training materials. I started to offer my courses directly to the clients, and I taught Perl to more than 1,000 people, both in Israel and in some other countries. It was really nice. It let me travel to Perl conferences and workshops around the world and meet nice people. Unfortunately there are hardly any Perl training courses these days and unless there are some major changes in the language I don't expect this to change.

I am mentioning this because this week is the first time I am teaching an in-person Rust course. Interestingly, it is to a bunch of Python programmers who are switching from Python to Rust. I am both nervous and excited. I am excited as I love learning and explaining new technologies, and there is a lot to learn in Rust. There is also more to teach in Rust as it is much harder to learn than Perl or Python.

Anyway

Enjoy your week!

--
Your editor: Gabor Szabo.

Event reports

The Perl and Raku Conference 2024 - Las Vegas

The report of Keith Carangelo.

Fear and loathing at YAPC

Despite being the worst attended YAPC in recent memory, 2024's show in Vegas had some of the best talks in a long while.

Virtual presentations for Perl developers

Continuous Integration (CI): GitHub Actions for Perl Projects (Free Virtual presentation on August 4)

This event was postponed to August 4. In this virtual event you will learn why and how to use GitHub Actions as a CI system for your Perl projects. The meeting is free of charge thanks to my supporters via Patreon and GitHub. Besides this event I am running many more, so make sure you check the Code Mavens meetup group and also register to it.

GitHub Pages for Perl developers (Free Virtual presentation on August 15)

In this virtual event you will learn how to use Markdown and GitHub Pages to create a simple web site and then we'll extend our use of GitHub Actions to generate the site using Perl. Register now!

Articles

The Quest for Performance Part IV : May the SIMD Force be with you

See discussion on reddit

A p5p discussion about adding :writer to perlclass

Using Coro and AnyEvent Interactively

I have not been able to figure out how to run an async thread in the background while using a REPL like reply. The moment I run the main loop, it takes over the input from the REPL. Here's what a typical failed REPL session might look like.

Migrating from MySQL to PostgreSQL

Perl and why you use it

Perl script to write into the Fediverse (and Nostr)

apparently NUL is mostly whitespace in Perl?

How to use perl v5.40's boolean builtins in Mojo::Pg queries

Grants

Maintaining Perl 5 Core (Dave Mitchell): June 2024

The Weekly Challenge

The Weekly Challenge by Mohammad Sajid Anwar will help you step out of your comfort-zone. You can even win prize money of $50 by participating in the weekly challenge. We pick one champion at the end of the month from among all of the contributors during the month, thanks to the sponsor Lance Wicks.

The Weekly Challenge - 278

Welcome to a new week with a couple of fun tasks "Sort String" and "Reverse Word". If you are new to the weekly challenge then why not join us and have fun every week. For more information, please read the FAQ.

RECAP - The Weekly Challenge - 277

Enjoy a quick recap of last week's contributions by Team PWC dealing with the "Count Common" and "Strong Pair" tasks in Perl and Raku. You will find plenty of solutions to keep you busy.

TWC277

CPAN modules can be very handy for getting you an elegant one-liner. Thanks for sharing the knowledge with us.

Count the Common Ones and the Strong Pairs

Erlang is the surprise guest language this week. I love the simple narrative, it is so easy to follow. Keep sharing.

Strong Count

Another cool use case for Bag of Raku magics. The end result is a one-liner. Great, keep it up.

Strength Uncombined

Bag for Perl can be found in CPAN module Set::Bag. CPAN is the rockstar. Highly recommended.

Perl Weekly Challenge: Week 277

The one liner in the end of the post is the gem of code. Great work, thanks for sharing.

Common Strength

Simple for loop showing the power and getting the job done. Simple yet powerful, keep it up.

Perl Weekly Challenge 277: Count Common

Another example of how to port Bag of Raku in Perl. Great work for spreading the knowledge.

Perl Weekly Challenge 277: Strong Pair

Raku's combinations method is so handy and makes the code compact. In Perl, a simple for loop is enough. Thanks for sharing.

Perl Weekly Challenge 277

Master of inhouse Perl one-liners sharing great example. You really don't want to miss it. Well done.

They call me the count, because I love to count pairs! Ah, ah, ah!

Another cool use of CPAN module, simple and easy interface to get the job done. Thanks for sharing.

Commons and pairs

Cute little solutions in Perl. So simple yet very easy to follow. Keep it up great work.

The Weekly Challenge - 277

Full on demo of CPAN modules. Happy to see the popularity among team members. Well done and keep it up.

The Weekly Challenge #277

No gimmicks, pure Perl solution using just core functions. The end result is still very powerful. Thanks for sharing.

A Strong Count

PostScript is getting regular space these days in the weekly post. I enjoy reading the code and learning too. Thanks for your contributions.

Strong counting

Today, I learnt how to declare type for list of list in Python. Thanks for sharing knowledge every week.

Weekly collections

NICEPERL's lists

Great CPAN modules released last week;
MetaCPAN weekly report;
StackOverflow Perl report.

Events

Continuous Integration (CI): GitHub Actions for Perl Projects

August 4, 2024, in Zoom

Toronto Perl Mongers monthly meeting

July 25, 2024, Virtual event

London Perl and Raku Workshop

October 26, 2024, in London, UK


(C) Copyright Gabor Szabo
The articles are copyright the respective authors.

NewLib porting++

Perl on Medium

Published by Seo Minsang on Monday 15 July 2024 00:25

resolve error for automake 1.11

(div) 4 great CPAN modules released last week

Niceperl

Published by Unknown on Saturday 13 July 2024 20:41

Updates for great CPAN modules released last week. A module is considered great if its favorites count is greater than or equal to 12.

  1. Dumbbench - More reliable benchmarking with the least amount of thinking
    • Version: 0.504 on 2024-07-09, with 17 votes
    • Previous CPAN version: 0.503 was 2 years, 2 months, 18 days before
    • Author: BDFOY
  2. IO::Socket::SSL - Nearly transparent SSL encapsulation for IO::Socket::INET.
    • Version: 2.087 on 2024-07-08, with 49 votes
    • Previous CPAN version: 2.086 was 5 days before
    • Author: SULLR
  3. PerlPowerTools - BSD utilities written in pure Perl
    • Version: 1.046 on 2024-07-11, with 39 votes
    • Previous CPAN version: 1.045 was 2 months, 11 days before
    • Author: BRIANDFOY
  4. Pod::Man - Convert POD data to various other formats
    • Version: v6.0.1 on 2024-07-13, with 14 votes
    • Previous CPAN version: 5.01 was 1 year, 6 months, 19 days before
    • Author: RRA

(dlxxxviii) metacpan weekly report

Niceperl

Published by Unknown on Saturday 13 July 2024 20:39

This is the weekly favourites list of CPAN distributions. Votes count: 36

This week there isn't any remarkable distribution

Build date: 2024/07/13 18:38:42 GMT


Clicked for first time:


Increasing its reputation:

(dcxiv) stackoverflow perl report

Niceperl

Published by Unknown on Saturday 13 July 2024 20:37

I still like printing code listings

rjbs forgot what he was saying

Published by Ricardo Signes on Friday 12 July 2024 12:00

I used to program on paper, then type it in later. Not all the time, but sometimes. Sometimes I’d write pseudocode. Sometimes I wrote just the code I would type in later. Sometimes just flow charts and subroutine signatures. These days, I only really do the last version, because I always have a computer nearby now. I’m not stuck in a boring lecture, for example, with only a legal pad.

Back then, and later, I’d also review code on paper. This was before I was doing formal “code review” for work. Sometimes I just wanted to look at my own code, out of the context of my computer, and think about it. Sometimes I wanted to look at somebody else’s code. Either way, putting it on paper was really useful. I could read it away from the rest of the distractions of my computer, and I could draw circles and arrows in a bunch of different colors.

The way I did it, then, was to use Vim’s :hardcopy command. Like the rest of Vim, it has slightly strange but actually very good documentation. I worked on Windows in the early 2000s, and I could enter :ha and get a printout. Sometimes, though, I’d use the a2ps program instead. That was a bit more work to use, but it produced better listings (as I recall), especially because it would print two pages of code on one piece of paper, while remaining quite legible.

Over time, I printed code less often. This was partly, but not entirely, because I was coding less. Lately, I’ve been coding a bit more. On top of that, I do a lot of it sitting not twenty feet from Mark Dominus, who seems to print out nearly every hunk of code he reviews. This has brought back memories of how much I got out of printing code. It also led to him asking me a question or two about a2ps that left me a little surprised and embarrassed that I had forgotten how to use it well.

Over the last twenty four hours, I’ve tried to get back up to speed, and to find (and simplify) a method for printing code listings on demand. This blog post is a bit of a recounting of that work, and what I found.

PostScript

I wanted to start with a2ps but I have to write a prelude about PostScript.

The “a” in a2ps stands for “any”. The idea is that you feed it basically any kind of source code (or plain text) and it will build a PostScript file for printing. PostScript is a programming language created by Adobe and used (mostly but not entirely) to drive printers. It’s a bit of a weird language, but it’s postfix and stack-based, so I have a soft spot in my heart for it. You can see three books on PostScript on shelf three in my post about my technical bookshelf. Ten years ago, showing the kid different kinds of programming, we wrote this program:

/text {
  /Times-Roman findfont exch scalefont setfont
} def

newpath 24 text 200 400 moveto
        (You're standing on my neck!) show

newpath 284 284 72 0 360 arc stroke
newpath 265 300 12 0 360 arc fill
newpath 303 300 12 0 360 arc fill
newpath 250 225 moveto 275 225 lineto
        275 200 lineto 250 200 lineto
        250 225 lineto stroke
newpath 275 225 moveto 300 225 lineto
        300 200 lineto 275 200 lineto
        275 225 lineto stroke

newpath 300 225 moveto 325 225 lineto
        325 200 lineto 300 200 lineto
        300 225 lineto stroke

It draws a skull with the caption “You’re standing on my neck!”. Try it!

But how? Well, in theory you can send it directly to your printer with lp, but on current macOS (unlike older versions) you will get this error: Unsupported document-format “application/postscript”.

I’m not sure exactly where the problem lies. My suspicion is that it’s in the CUPS service that serves as the mediator between the user and the printer on macOS. Probably I could get around this by using mDNS and IPP, and honestly I am tempted to go learn more. But there was a simpler solution: ps2pdf converts a PostScript program to a PDF file. It’s shipped with Ghostscript and is easy to get from Homebrew.

PDF is actually based on PostScript, but I don’t know the details. PostScript has also been used for rendering graphical displays, using a system called Display PostScript (DPS), which was developed by both Adobe and NeXT, and later became the basis for the MacOS X display system. So, why doesn’t macOS support PostScript well anymore? Honestly, I don’t know.

Anyway: lots of the things I tried using for printing output PostScript, which is meant to be easy to send on to the printer. With ps2pdf installed, printing these files isn’t so hard. It’s just a drag that they can’t be sent right to the lp command.

a2ps

Right, back to a2ps! Given a piece of source code, it will spit out a PostScript program representing a nice code listing. Unfortunately, running it out of the box produced something pretty awful, and I had to fumble around a good bit before I got what I wanted. I didn’t save a copy, so if you want to see it, you can try to reproduce it on your own. The problems included being off center, running off the page margins, using a mix of different typefaces in one source listing, using awful colors, and probably other stuff. So, I had to consult the manual. Unfortunately, I started doing that by running man a2ps, immediately hitting the problem that has infuriated geeks for decades: I got a pretty mediocre man page with a footnote saying the real docs were in Texinfo. And info isn’t installed.

Eventually I found myself reading the a2ps manual as a PDF on the web. With that (and with some help from Mark), I found that much of what I needed would come down to putting this in ~/.a2ps/a2psrc:

Options: --medium=letter
Options: --line-numbers 1
Options: --prologue color

This set my paper size, turned on line numbering on every line, and said that I wanted highlighting to be expressed as color, not just font weight.

There were two problems that I could not get over:

  1. Typeface mixing! Everything was in fixed text except for literal strings, which were (to my horror) represented in a proportional font.
  2. Awful colors. For example, subroutine names were printed in black on a bright yellow background. Probably some people think this is fine. I did not. (The a2ps manual admits: “It is pretty known that satisfying the various human tastes is an NEXPTIME-hard problem.”)

So, how to fix it? By hacking on a PostScript file!

a2ps combines (roughly) two things to build the final PostScript program: the prologue, and the program. (There’s also the header (“hdr”), but that’s included directly by the prologue. Let’s not split hairs.)

The program is a series of PostScript instructions that will print out your listing. The prologue is a set of function definitions that the program will use. The a2ps binary (written in C) reads your source document, tokenizes it for syntax highlighting, and then emits a program. For example, here’s a hunk of output for the Perl code that I’ll be using in all the samples in this post.

0 T () S
(sub) K
( ) p
(richsection) L
( \(@elements\) {) p n
(185) # (  Slack::BlockKit::Block::RichText::Section->new\({) N
0 T (    elements => _rtextify\(@elements\),) N
0 T (  }\);) N
0 T (}) N
0 T () N
(190) # () S

The parentheses are string delimiters, and because PostScript is a postfix language, when you see (richsection) L it’s calling the function L with the string “richsection” on the stack. True, there may be other things on the stack, but I happen to know that L is a one-argument function. It looks like this:

/L {
  0 0 0 FG
  1 1 0 true BG
  false UL
  false BX
  fCourier-Bold bfs scalefont setfont
  Show
} bind def

This prints the string on the stack to the current position on the page in black on bright yellow. Yuck. This function comes from the “color” prologue, which is installed in share/a2ps/ps. There’s no way to change parameters to it, so the recommended practice is to copy it into ~/.a2ps and edit that. This would be more horrifying if there were new versions of a2ps coming out with notable changes, but there aren’t, so it’s … fine.
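To make the postfix convention concrete, here is a minimal sketch in Python (not PostScript, and not how a2ps or any PostScript interpreter is actually implemented). It treats a token stream shaped like the excerpt above as strings that get pushed onto a stack and operators that pop what they need. The `run` function, the token format, and the behaviour given to `p`, `K` and `L` are all assumptions made for illustration.

```python
# Toy model of postfix evaluation: operands are pushed, operators pop them.
# The operator names mirror the a2ps excerpt; their behaviour here is invented.

def run(tokens, operators):
    """Evaluate tokens: ("str", text) pushes, ("op", name) calls an operator."""
    stack = []
    output = []
    for kind, value in tokens:
        if kind == "str":
            stack.append(value)              # like (richsection) in PostScript
        else:
            operators[value](stack, output)  # like L: consumes its argument
    return output

def make_show(style):
    # Each toy operator pops one string and records it with a style label.
    def op(stack, output):
        output.append((style, stack.pop()))
    return op

OPERATORS = {
    "p": make_show("plain"),
    "K": make_show("keyword"),
    "L": make_show("label"),
}

# Equivalent of: (sub) K ( ) p (richsection) L
program = [("str", "sub"), ("op", "K"), ("str", " "), ("op", "p"),
           ("str", "richsection"), ("op", "L")]

for style, text in run(program, OPERATORS):
    print(f"{style:8} {text!r}")
```

The point of the sketch is only the ordering: the argument appears before the operator that consumes it, so no call syntax or parentheses around arguments are needed.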

I hacked up a copy of color.pro and changed the prologue option in my configuration file to rjbs.pro. I renewed my PostScript programmer credentials! While doing this, I also fixed the typefaces, replacing Times-Roman with Courier in the definition of str, the string literal display function.

This got me some pretty decent output, shown here, and linked to a PDF:

a page of code from a2ps

By the way, all the samples in this post will be from formatting this copy of Slack::BlockKit::Sugar.

This was fine, but I had two smaller problems:

  1. The syntax highlighting is a bit anemic (but not so bad).
  2. The line spacing is a little tight for me.

Fixing the first one means editing the “style sheet” for Perl. This isn’t like CSS at all. It doesn’t define the style, it defines how to mark up tokens as being one thing or another. The functions in the prologue will do the styling. I looked at what might be worth putting here as a sample, but I think if you want to see what they look like, you should check out the perl.ssh file itself. It’s fine, but it’s also obvious that making it better would be an ordeal. I bailed.

Fixing line spacing felt like it should be easy, though. Surely there’d be an option for that, right? Sadly, no. I decided to put my PostScript expertise to work. Here’s the N function, which renders a line and moves to the next position:

/N {
  Show
  /y0 y0 bfs sub store
  x0 y0 moveto
} bind def

bfs is the “body font size”. We’re moving a little down the page by reducing the y position of the cursor by the font size. What if we did this?

/y0 y0 bfs 1.1 mul sub store

That should add a 10% line spacing increase, right? Well, yes, but the problem is this: remember how the a2ps binary is responsible for spitting out the PostScript program? That’s where it computes how many lines per page. By mucking with the vertical spacing, we start running off the end of the page. We need to change the number of lines that a2ps puts on the page. No problem, we’d just tweak this code:

job->status->linesperpage =
  (int) ((printing_h / job->fontsize) - BOTTOM_MARGIN_RATIO);

…at which point I realized I had to install automake. I did, and I went a few steps further, and finally gave up. It was too annoying for a Saturday.
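The overflow problem is easy to see with a little arithmetic. The sketch below mirrors the shape of the C expression in Python; the page height, font size, and the value of `BOTTOM_MARGIN_RATIO` are hypothetical numbers chosen for illustration, not values taken from a2ps.

```python
# Worked example of the lines-per-page arithmetic.
# All numbers are hypothetical; only the formula's shape comes from the C snippet.

BOTTOM_MARGIN_RATIO = 0.7   # assumed value, for illustration only

def lines_per_page(printing_h, line_height):
    """Mirror of the C expression: truncate (height / line height) - ratio."""
    return int(printing_h / line_height - BOTTOM_MARGIN_RATIO)

printing_h = 700.0   # printable height in points (assumed)
font_size = 10.0     # body font size in points (assumed)

planned = lines_per_page(printing_h, font_size)        # what the binary plans for
needed_height = planned * font_size * 1.1              # same lines, 10% more spacing
fitting = lines_per_page(printing_h, font_size * 1.1)  # lines that would really fit

print("planned lines:", planned)
print("overflows page:", needed_height > printing_h)
print("lines that fit with 10% extra spacing:", fitting)
```

With these assumed numbers, the line count planned for the original spacing no longer fits once each line takes 10% more vertical space, which is why the prologue tweak alone runs off the page.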

What if instead I changed the default style? I won’t bore you with the PostScript, but I made a new function, vbfs, defined as 0.9 of bfs. I updated all the rendering functions to use that value for size, but the full value was still used for line spacing. This worked! But changing the font size meant that I was ending up with horizontal dead space. I was scaling everything down, when all I wanted to scale up was the vertical space. It was unsatisfactory, and I decided to settle for the tight line spacing.

…for about five minutes. And then I decided to try the other GNU program for turning source code into PostScript.

enscript

GNU Enscript bills itself as “a free replacement for Adobe’s enscript program”. I don’t know what that was, but I can tell you that enscript is basically “a2ps, except different”. It serves the same function. I fed it the same hunk of code, but not before reading the manual and finding --baselineskip, which is exactly what I wanted: a way to control line spacing. I used this invocation:

enscript lib/Slack/BlockKit/Sugar.pm \
  --columns=2       \
  --baselineskip=2  \
  --landscape       \
  --color           \
  --highlight       \
  --line-numbers    \
  --output Sugar.ps

It looked pretty good at first, when looking at the first page of output (not pictured here). On the other hand, here’s page three:

a page of code from enscript

The line spacing is nice (and maybe nicer when cranked up), and the colors aren’t offensive. But they’re all wrong. Part of the issue is that this source is using =func as if it was a real Pod directive, which it isn’t. On the other hand, it’s real enough that Perl will ignore the enclosed documentation. Syntax highlighting should start at =func and end at =cut. The syntax definition for Perl in enscript is very strict, and so this is wrong. And that means that the documentation’s syntax highlighting ends up all wrong all over the place. It’s unusable.

Syntax highlighting in enscript is different than a2ps’s style sheets. Instead, it’s programmed with a little “state” language. You can read the Perl state program, but I’m not sure I recommend it. It’s relatively inscrutable, or at least it is written in terms of some other functionality that doesn’t seem well documented. Fixing the Pod thing seemed trivial, but all I could imagine was an endless stream of further annoyance. Maybe this isn’t fair, but it’s where I ended up.

At this point, settling on a2ps might have been a good idea, but instead I moved on to Vim.

Vim :hardcopy

Way up at the top of this post, I mentioned that I had used Vim in the past. So, why not now? Well, reasons. The first one is that I didn’t remember how, and I knew that “it didn’t work anymore”. But I was in for way more than a penny by now, so I went further down the rabbit hole.

It turned out that “it didn’t work anymore” was trivial. I was getting this error:

E365: Failed to print PostScript file

Right. Because lp doesn’t handle PostScript anymore. I could just write the PostScript file to a file, then apply ps2pdf. I did so, and it was bad. The good news was that getting from bad to okay wasn’t so hard. I had to set some printoptions in Vim.

set printoptions=paper:letter,number:y,left:5pc

This sets my paper size (which I’d already had in my .vimrc actually!), turns on line numbering, and reduces the left margin to 5%. The default left margin was 10%, which was just way too much. It’s nice to have space to write, but I usually do that in the whitespace on the right side of the code. To print to a file in Vim, you can execute :hardcopy > filename.ps. With these settings, I got this output:

a page of code from Vim

The main problem here is that it’s one-up. Only one page of code per sheet of paper. It’s easy to read, but it takes twice as much paper. It’s a waste, and also leads to more desk clutter than necessary. My desk is enough of a mess as it is.

Fortunately, there’s a solution for this! The slightly obscure mpage command reads a PostScript document in, then spits out another one that’s multiple pages per sheet. It hasn’t quite fallen off the web, but it’s not extremely well published. Here’s a link to its man page hosted at a fairly random-seeming location at CMU. Fortunately, it’s in Homebrew. I could take the PostScript file from above and run this:

mpage -2 -bLetter -S Sugar.ps > Sugar2.ps && ps2pdf Sugar2.ps

The -S option there is fairly critical. It says “allow non-square scaling”. In theory this might introduce some distortion, but I can’t notice it, and it gets a better use of space on the page. I also went back to my printoptions in Vim and set all the margins to 0pc. Since mpage will be adding a margin when it produces the two-up pages, I don’t need any margin on the pages it’s combining. I get four more lines per page, plus more space on the right of each listing.

Here’s what we get:

a page of code from Vim, sent through mpage

I wasn’t sure how to get a better set of colors. I’m pretty sure it’s possible, but I’ll have to think about it and play around. There is a very large benefit here, though. The syntax highlighting that I get in the PDF will be based on the same syntax highlighting that I’m used to seeing every day in Vim. I know nothing is critically wrong, and if there was, I’d be very motivated to fix it, because I’d be seeing it every day in my editor!

The real problem, for fixing the colors, is that I use a dark-background color scheme in Vim. Printing tries to emulate your color scheme, but has to correct for the fact that it’s going to print on white paper. The real answer is to have an alternate color scheme ready for printing.

Still, I’m pretty happy with this. All that remained was to make it really easy. So, I wrote a stupid little Perl program that finds a PostScript file on disk, runs it through mpage, then ps2pdf, then puts it on the Desktop and opens it in Preview. Then I updated my Vim configuration to make :hardcopy send print jobs to that instead of lp. It looks like this:

set printexpr=ByzantinePrintFile()
function ByzantinePrintFile()
  call system("/Users/rjbs/bin/libexec/vim-print-helper "
    \.. v:fname_in
    \.. " "
    \.. shellescape(expand('%:t'))
  \)
  call delete(v:fname_in)
  return v:shell_error
endfunc

Vim script is still weird.
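The helper program itself isn’t shown in the post. As a rough idea of what such a helper could look like, here is a hypothetical sketch in Python (the original is described as a Perl program). The function names, the output file naming, and the use of the macOS open command are assumptions; the mpage flags are the ones from the command line shown earlier.

```python
# Sketch of a print helper: PostScript in, two-up PDF out.
# Paths, names and the use of macOS `open` are assumptions, not the author's code.
import subprocess
from pathlib import Path

def build_commands(ps_file, title, out_dir):
    """Return (two_up_ps, pdf, [argv, ...]) describing the conversion steps."""
    ps_file = Path(ps_file)
    two_up = ps_file.with_name(ps_file.stem + "-2up.ps")
    pdf = Path(out_dir) / (Path(title).stem + ".pdf")
    commands = [
        # Same mpage flags as the command line shown earlier in the post.
        ["mpage", "-2", "-bLetter", "-S", str(ps_file)],
        ["ps2pdf", str(two_up), str(pdf)],
        ["open", str(pdf)],  # on macOS, opens the file in the default viewer
    ]
    return two_up, pdf, commands

def run_pipeline(ps_file, title, out_dir):
    """Run the steps; requires mpage and ps2pdf (Ghostscript) to be installed."""
    two_up, pdf, (mpage, ps2pdf, viewer) = build_commands(ps_file, title, out_dir)
    with open(two_up, "wb") as out:
        subprocess.run(mpage, stdout=out, check=True)  # mpage writes to stdout
    subprocess.run(ps2pdf, check=True)
    subprocess.run(viewer, check=True)
    return pdf

# Show the planned commands for a made-up input without executing anything.
for argv in build_commands("/tmp/vim-print.ps", "Sugar.pm", "/tmp")[2]:
    print(" ".join(argv))
```

Keeping command construction separate from execution makes it possible to inspect the pipeline without having the external tools installed; calling `run_pipeline` with the two arguments Vim passes along would perform the actual conversion.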

This is where I’ll stop, feeling content and full of PostScript. I think for many people, all this PostScript nonsense would leave a bad aftertaste. For them, there’s another option…

Vim 2html.vim

Vim ships with a helper file called 2html.vim, which exports the current buffer as HTML, using the settings and colors currently in use. You can enter this in your Vim command line:

runtime! syntax/2html.vim

…and you’ll get a new Vim buffer full of HTML. The parity with the Vim display is impressive.

HTML output and Vim side by side

The problem is that so far, I’ve found going from HTML to a two-up PDF is too much of a pain. Possibly there’s some weird route from HTML to PostScript to piping through mpage but I think I’ll leave that adventure for another day or another dreamer. Me, I’ve got printing to do.

Maintaining Perl 5 Core (Dave Mitchell): June 2024

Perl Foundation News

Published by alh on Thursday 11 July 2024 08:43


Dave writes:

This is my monthly report on work done during June 2024 covered by my TPF perl core maintenance grant.

I spent most of last month continuing to work on understanding XS and improving its documentation, as a precursor to adding reference-counted stack (PERL_RC_STACK) abilities to XS.

This work is a bit frustrating, as I still haven't got anything publicly to show for it. Privately however, I do have about 4000 lines of notes on ways to improve the documentation and the XS parser itself.

I have also been going through the ExtUtils::ParseXS module's code line by line trying to understand it, and have (so far) added about 1000 new lines of code comments to ParseXS.pm, which is nearly ready to be pushed. This has involved a lot of code archaeology, since much of the code is obscure. I have even discovered XS keywords which have been implemented but aren't documented (such as "ATTRS:" and "NOT_IMPLEMENTED_YET").

I have also dug out the xsubpp script from the 1994 perl5.000 release and from time to time run it against some sample XS constructs. Amazingly, it still runs under a modern perl (although I didn't check whether the C code it outputs is still compilable).

SUMMARY:
* 1:38 process p5p mailbox
* 53:30 rework XS documentation

Total:
* 55:08 (HH:MM)

apparently NUL is mostly whitespace in Perl?

rjbs forgot what he was saying

Published by Ricardo Signes on Tuesday 09 July 2024 12:00

This post will be short but baffling.

In the following snippet, the ^@ sequences indicate literal NUL bytes in the source document.

use v5.36.0;
my $i = ^@ 1;
sub foo () {
 say q^@Hell0, w0rld.^@;
 $i^@++;
}

foo(^@);

say ^@$i;

This program will run and print:

Hell0, w0rld.
2

The NUL bytes are all ignored… except for the two that act as the string delimiters for the q-operator. As near as I can tell without reading any of the source for the perl tokenizer (or related code), NUL is treated like whitespace. I am gobsmacked. I know all kinds of weird stuff about Perl, but this one surprised me.

It came up when I tried to run a bunch of small programs through perltidy, and it refused to process just one file, because it was “binary”. There was a NUL after a subroutine’s opening brace. I removed it, but not because it was a syntax error. Just because it was not in good taste to leave it there.
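Literal NUL bytes are awkward to type, so reproducing the snippet by hand is fiddly. One way to generate the file is sketched below in Python; the output filename and location are arbitrary choices, and the script only writes the file, it does not run perl.

```python
# Build the snippet from the post with real NUL bytes in place of "^@".
# The output filename is arbitrary; running the result requires a perl binary.
import os
import tempfile

NUL = b"\x00"

template = b"""use v5.36.0;
my $i = ^@ 1;
sub foo () {
 say q^@Hell0, w0rld.^@;
 $i^@++;
}

foo(^@);

say ^@$i;
"""

# Swap every two-character "^@" marker for a single NUL byte.
source = template.replace(b"^@", NUL)

path = os.path.join(tempfile.gettempdir(), "nul-demo.pl")
with open(path, "wb") as fh:   # binary mode keeps the NUL bytes intact
    fh.write(source)

print(source.count(NUL), "NUL bytes written to", path)
```

Running perl on the resulting file should, going by the post, print the output quoted above; that step has not been verified here.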

At this point one may wonder how numba, the just-in-time compiler for numpy-based Python code, delivers a performance premium over numpy. To find out, let’s inspect the timings individually for all trigonometric functions (and yes, the exponential and the logarithm are trigonometric functions if you recall your complex analysis lessons from high school!). But the test is not relevant only for those who want to do high school trigonometry: physics engines, e.g. in games, will use these functions, and machine learning and statistical calculations heavily use log and exp. So getting the trigonometric functions right is one small but important step towards implementing a variety of applications. The table below shows the timings for 50M in-place transformations:

Function  Library  Execution Time
Sqrt      numpy    1.02e-01 seconds
Log       numpy    2.82e-01 seconds
Exp       numpy    3.00e-01 seconds
Cos       numpy    3.55e-01 seconds
Sin       numpy    4.83e-01 seconds
Sin       numba    1.05e-01 seconds
Sqrt      numba    1.05e-01 seconds
Exp       numba    1.27e-01 seconds
Log       numba    1.47e-01 seconds
Cos       numba    1.82e-01 seconds
The table holds the first clue to the performance benefits: the square root, a function that has a dedicated SIMD instruction for vectorization, takes essentially the same time to execute in numba and numpy, while the other functions are sped up by roughly 2-2.5 times (sin, going by the table, by even more), indicating either that the code auto-vectorizes using SIMD or that it auto-threads. A second clue is provided by examining the difference in results between numba and numpy, using the ULP (Unit in the Last Place). ULP is a measure of accuracy in numerical calculations and can easily be computed for numpy arrays using the following Python function:

import numpy as np

def compute_ulp_error(array1, array2):
    # maxulp is set to a very high number so the assertion does not raise;
    # the call then returns the per-element ULP differences
    return np.testing.assert_array_max_ulp(array1, array2, maxulp=100000)
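For readers unfamiliar with ULPs, the idea can be illustrated for plain Python floats with the standard library alone. This is a sketch of the concept, not of how numpy implements its ULP comparison; `ulp_distance` is a helper invented for this example and it assumes finite values of the same sign. It needs Python 3.9 or later for `math.ulp` and `math.nextafter`.

```python
# What "one ULP" means for doubles, using only the standard library.
import math
import struct

def ulp_distance(a, b):
    """Number of representable doubles between a and b (same sign, finite)."""
    # Reinterpret the 8 bytes of each double as a signed 64-bit integer;
    # for same-sign finite doubles these integers are ordered like the floats.
    ia = struct.unpack("<q", struct.pack("<d", a))[0]
    ib = struct.unpack("<q", struct.pack("<d", b))[0]
    return abs(ia - ib)

x = 1.0
next_up = math.nextafter(x, math.inf)   # the very next double above 1.0

print(math.ulp(x))                           # spacing between doubles at 1.0
print(ulp_distance(x, next_up))              # adjacent doubles are 1 ULP apart
print(ulp_distance(x, x + 4 * math.ulp(x)))  # four steps away
```

A ULP difference of 0 therefore means two results are bit-for-bit identical, and a difference of 1 means they are neighbouring floating point values.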

These numerical benchmarks indicate that the square root function utilizes pretty much equivalent code in numpy and numba (every ULP statistic is zero), while for all the other trigonometric functions the mean, 99.9th percentile and maximum ULP differences over all 50M numbers are nonzero, even though the median stays at zero. This is a subtle hint that SIMD is at play: vectorization slightly changes the semantics of floating point code to make use of associative math, and floating point numerical operations are not associative.

Function Mean ULP Median ULP 99.9th ULP Max ULP
Sqrt 0.00e+00 0.00e+00 0.00e+00 0.00e+00
Sin 1.56e-03 0.00e+00 1.00e+00 1.00e+00
Cos 1.43e-03 0.00e+00 1.00e+00 1.00e+00
Exp 5.47e-03 0.00e+00 1.00e+00 2.00e+00
Log 1.09e-02 0.00e+00 2.00e+00 3.00e+00
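
To make the notion of a ULP difference and the non-associativity of floating point addition concrete, here is a minimal sketch using only the Python standard library. The helper ulp_distance is a hypothetical name introduced for illustration (it is not the numpy routine used above) and assumes finite, non-negative doubles.

```python
import struct

def ulp_distance(a, b):
    """Count the representable doubles separating two finite, non-negative floats."""
    # Reinterpret the IEEE 754 bit patterns as 64-bit integers: for non-negative
    # doubles the integer ordering matches the floating point ordering.
    int_a = struct.unpack("<q", struct.pack("<d", a))[0]
    int_b = struct.unpack("<q", struct.pack("<d", b))[0]
    return abs(int_a - int_b)

# The same three numbers summed in two different orders.
left_to_right = (0.1 + 0.2) + 0.3
right_to_left = 0.1 + (0.2 + 0.3)

print(left_to_right == right_to_left)               # False
print(ulp_distance(left_to_right, right_to_left))   # 1
```

Merely changing the order of the additions moves the result by one ULP, which is the same order of magnitude as the differences reported in the table.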

Finally, we can inspect the code of the numba generated functions for vectorized assembly instructions as detailed here, using the code below:

@njit(nogil=True, fastmath=False, cache=True)
def compute_sqrt_with_numba(array):
    np.sqrt(array, array)


@njit(nogil=True, fastmath=False, cache=True)
def compute_sin_with_numba(array):
    np.sin(array, array)


@njit(nogil=True, fastmath=False, cache=True)
def compute_cos_with_numba(array):
    np.cos(array, array)


@njit(nogil=True, fastmath=False, cache=True)
def compute_exp_with_numba(array):
    np.exp(array, array)


@njit(nogil=True, fastmath=False, cache=True)
def compute_log_with_numba(array):
    np.log(array, array)

## check for vectorization
## code lifted from https://tbetcke.github.io/hpc_lecture_notes/simd.html
def find_instr(func, keyword, sig, limit=5):
    count = 0
    for l in func.inspect_asm(func.signatures[sig]).split("\n"):
        if keyword in l:
            count += 1
            print(l)
            if count >= limit:
                break
    if count == 0:
        print("No instructions found")

# Compile the function to avoid the overhead of the first call
compute_sqrt_with_numba(np.array([np.random.rand() for _ in range(1, 6)]))
compute_sin_with_numba(np.array([np.random.rand() for _ in range(1, 6)]))
compute_exp_with_numba(np.array([np.random.rand() for _ in range(1, 6)]))
compute_cos_with_numba(np.array([np.random.rand() for _ in range(1, 6)]))
compute_log_with_numba(np.array([np.random.rand() for _ in range(1, 6)]))

And this is how we probe for the presence of the vmovups instruction, indicating that the YMM AVX2 registers are being used in calculations:

print("sqrt")
find_instr(compute_sqrt_with_numba, keyword="vsqrtsd", sig=0)
print("\n\n")
print("sin")
find_instr(compute_sin_with_numba, keyword="vmovups", sig=0)

As we can see in the output below, the square root function uses the x86 AVX square root instruction vsqrtsd (the sd suffix denotes the scalar double precision form; the packed form is vsqrtpd), and sin (but also all the other trigonometric functions) uses SIMD instructions to access memory.

sqrt
        vsqrtsd %xmm0, %xmm0, %xmm0
        vsqrtsd %xmm0, %xmm0, %xmm0


sin
        vmovups (%r12,%rsi,8), %ymm0
        vmovups 32(%r12,%rsi,8), %ymm8
        vmovups 64(%r12,%rsi,8), %ymm9
        vmovups 96(%r12,%rsi,8), %ymm10
        vmovups %ymm11, (%r12,%rsi,8)
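
When reading such output it helps to separate scalar from packed instructions and to note the register width. The sketch below is a rough heuristic based on mnemonic suffixes and register names; classify_instruction is a hypothetical helper, not part of numba, and integer SIMD mnemonics (e.g. vpabsd) would be misclassified by it.

```python
def classify_instruction(asm_line):
    """Return (kind, register_width_bits) for one line of AT&T-syntax x86 assembly."""
    tokens = asm_line.split()
    mnemonic = tokens[0] if tokens else ""

    # Heuristic: sd/ss = scalar double/single, pd/ps = packed double/single.
    if mnemonic.endswith(("sd", "ss")):
        kind = "scalar"
    elif mnemonic.endswith(("pd", "ps")):
        kind = "packed"
    else:
        kind = "other"

    # The widest vector register mentioned gives the operand width.
    if "%zmm" in asm_line:
        width = 512
    elif "%ymm" in asm_line:
        width = 256
    elif "%xmm" in asm_line:
        width = 128
    else:
        width = 0
    return kind, width

# Two lines taken from the output above.
print(classify_instruction("        vsqrtsd %xmm0, %xmm0, %xmm0"))
print(classify_instruction("        vmovups (%r12,%rsi,8), %ymm0"))
```

Under this heuristic the vsqrtsd lines are scalar operations on 128-bit registers, while the vmovups lines are packed moves on 256-bit registers.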

Let’s shift attention to Perl and C now (after all this is a Perl blog!). In Part I we saw that PDL and C gave similar performance when evaluating the nested function cos(sin(sqrt(x))), and in Part II that the single-threaded Perl code was as fast as numba. But what about the individual trigonometric functions in PDL and C without the Inline module? The C code block that will be evaluated in this case is:

#pragma omp for simd
  for (int i = 0; i < array_size; i++) {
    double x = array[i];
    array[i] = foo(x);
  }

where foo is one of sqrt, sin, cos, log, exp. We will use the simd omp pragma in conjunction with the following compilation flags and the gcc compiler to see if we can get the code to pick up the hint and auto-vectorize the machine code generated using the SIMD instructions.

CC = gcc
CFLAGS = -O3 -ftree-vectorize  -march=native -mtune=native -Wall -std=gnu11 -fopenmp -fstrict-aliasing -fopt-info-vec-optimized -fopt-info-vec-missed
LDFLAGS = -fPIE -fopenmp 
LIBS =  -lm

During compilation, gcc informs us about all the wonderful missed opportunities to optimize the loop. The performance table below also demonstrates this; note that the standard C implementation and PDL are equivalent and equal in performance to numpy. Perl through PDL can deliver performance in our data science world.

Function Library Execution Time
Sqrt PDL 1.11e-01 seconds
Log PDL 2.73e-01 seconds
Exp PDL 3.10e-01 seconds
Cos PDL 3.54e-01 seconds
Sin PDL 4.75e-01 seconds
Sqrt C 1.23e-01 seconds
Log C 2.87e-01 seconds
Exp C 3.19e-01 seconds
Cos C 3.57e-01 seconds
Sin C 4.96e-01 seconds

To get the compiler to use SIMD, we replace the flag -O3 by -Ofast and all these wonderful opportunities for performance are no longer missed, and the C code now delivers (but with the usual caveats that apply to the -Ofast flag).

Function Library Execution Time
Sqrt C - Ofast 1.00e-01 seconds
Sin C - Ofast 9.89e-02 seconds
Cos C - Ofast 1.05e-01 seconds
Exp C - Ofast 8.40e-02 seconds
Log C - Ofast 1.04e-01 seconds

With these benchmarks, let’s return to our initial Perl benchmarks and contrast the timings obtained with the non-SIMD aware invocation of the Inline C code:

use Inline (
    C    => 'DATA',
    build_noisy => 1,
    with => qw/Alien::OpenMP/,
    optimize => '-O3 -march=native -mtune=native',
    libs => '-lm'
);

and the SIMD aware one (in the code below, one has to include the vectorized version of the mathematics library for the code to compile):

use Inline (
    C    => 'DATA',
    build_noisy => 1,
    with => qw/Alien::OpenMP/,
    optimize => '-Ofast -march=native -mtune=native',
    libs => '-lmvec'
);

The non-vectorized version of the code yields the following table:

Inplace  in  base Python took 11.9 seconds
Inplace  in PythonJoblib took 4.42 seconds
Inplace  in         Perl took 2.88 seconds
Inplace  in Perl/mapCseq took 1.60 seconds
Inplace  in    Perl/mapC took 1.50 seconds
C array  in            C took 1.42 seconds
Vector   in       Base R took 1.30 seconds
C array  in   Perl/C/seq took 1.17 seconds
Inplace  in     PDL - ST took 0.94 seconds
Inplace in  Python Numpy took 0.93 seconds
Inplace  in Python Numba took 0.49 seconds
Inplace  in   Perl/C/OMP took 0.24 seconds
C array  in   C with OMP took 0.22 seconds
C array  in    C/OMP/seq took 0.18 seconds
Inplace  in     PDL - MT took 0.16 seconds

while the vectorized version yields this one:

Inplace  in  base Python took 11.9 seconds
Inplace  in PythonJoblib took 4.42 seconds
Inplace  in         Perl took 2.94 seconds
Inplace  in Perl/mapCseq took 1.59 seconds
Inplace  in    Perl/mapC took 1.48 seconds
Vector   in       Base R took 1.30 seconds
Inplace  in     PDL - ST took 0.96 seconds
Inplace in  Python Numpy took 0.93 seconds
Inplace  in Python Numba took 0.49 seconds
C array  in   Perl/C/seq took 0.30 seconds
C array  in            C took 0.26 seconds
Inplace  in   Perl/C/OMP took 0.24 seconds
C array  in   C with OMP took 0.23 seconds
C array  in    C/OMP/seq took 0.19 seconds
Inplace  in     PDL - MT took 0.17 seconds

To facilitate comparisons against the various flavors of Python and R, we inserted the results we presented previously in these 2 tables.

The take home points (some of which may be somewhat surprising) are:

  1. Performance of base Python was horrendous - nearly 4 times slower than base Perl. Even under parallel processing (Joblib) the 8 threaded Python code was slower than Perl
  2. R performed the best out of the 3 dynamically typed languages considered : nearly 9 times faster than base Python and 2.3 times faster than base Perl.
  3. Inplace modification of Perl arrays by accessing the array containers in C (Perl/mapC) narrowed the gap between Perl and R
  4. Considering variability in timings, the three codes, base R, Perl/mapC and an equivalent inplace modification of a C array are roughly equivalent
  5. The uni-threaded PDL code delivered the same performance as numpy
  6. SIMD aware code, e.g. through numba or, for the case of native C code, through compiler pragmas and flags (contrast the C array in C timings between the vectorized and the non-vectorized versions of the table), delivered the best single thread performance.
  7. OpenMP pragmas can really speed up operations (such as map) of Perl containers.
  8. Adding threads (OMP code) or the multi-threaded versions of the PDL delivered the best performance.

These observations generate the following big picture questions:

  • Observations 1 and 2 make it quite surprising that base Python earned a reputation as a data science language. In fact, with these performance characteristics for a rather simple numerical code, one should approach the replacement of R (or Perl for those of us who use it) by base Python in data analysis codes with caution.
  • Since hybrid implementations that involve C and Perl can deliver real performance benefits using SIMD aware code, even if a single thread is used, should we be upgrading our Inline::C codes to use SIMD friendly compiler flags? Or (for those who are afraid of -Ofast) perhaps a carefully prepared mixology of intrinsics, and even assembly?
  • Should we be looking into upgrading various C/XS modules for Perl, so that they use OpenMP?
  • Why are not more people using the jewel of software that is PDL? The auto-threading property alone should make people think about using it for demanding, data intensive tasks.
  • Is it possible to create hybrid applications that rely on both PDL and OpenMP/Inline::C for different computations?
  • Should we look into special builds for PDL that leverage the SIMD properties and compiler pragmas to allow users to have their cake (SIMD vectorization) and eat it (autothreading) too?

Combining calendars

Perl Hacks

Published by Dave Cross on Sunday 07 July 2024 10:35

One of the most popular posts I’ve written in recent months was the one where I talked about all the pointless personal projects I have. The consensus in the many comments I received was that anything you find useful isn’t pointless. And I can’t really argue with that.

But it’s nice when one of your projects is used by other people. And that happened to me recently.

The initial commit in mergecal is from 2016, but I strongly suspect it existed as code that wasn’t in source code control for several years before that. The idea behind it is simple enough. I wanted to be able to share my calendar with someone, but I didn’t have a single iCal file that I could share. For various complicated and yet dull reasons, my calendar is split across a number of separate iCal files. Initially, I remember thinking there must be an online service that will take a list of iCal calendars and produce a single, combined one. But a few hours on Google didn’t find anything so I did what any hacker would do and wrote my own.

It really wasn’t difficult. As usual, it was just a case of plumbing together a few CPAN modules. In this case, Text::vFile::asData did most of the heavy lifting – with JSON used to parse a configuration file. It can’t have taken more than an hour to write. And, as the commit history shows, very few subsequent changes were required. I just set it up with the correct configuration and a cronjob that rebuilt the combined calendar once a day and published it on my web site.

And then I forgot about it for years. The best kind of software.

Then, in January of this year, I got a pull request against the code. This astonished me. MY SOFTWARE HAD A USER. And in the PR, the user said “It boggles my mind that there is still no simpler free solution, even after all those years”.

So maybe this would be useful to a few more people. Perhaps I should market it better (where “better” means “at all”).

As a first step towards that, I’ve rewritten it and released it to CPAN as App::MergeCal. Maybe I should think about putting it online as some kind of web service.

Anyway, it makes me incredibly happy to know my software is used by even one person. Which reminds me – please take the time to say “thank you” to anyone whose software you find useful. It’s a small thing, but you’ll make someone very happy.

The post Combining calendars first appeared on Perl Hacks.

Having run this toy performance example, we will now digress somewhat and contrast the performance against a few Python implementations. First let’s set up the stage for the calculations, and provide commandline capabilities to the Python script.

import argparse
import time
import math
import numpy as np
import os
from numba import njit
from joblib import Parallel, delayed

parser = argparse.ArgumentParser()
parser.add_argument("--workers", type=int, default=8)
parser.add_argument("--arraysize", type=int, default=100_000_000)
args = parser.parse_args()
# Set the number of threads to 1 for different libraries
print("=" * 80)
print(
    f"\nStarting the benchmark for {args.arraysize} elements "
    f"using {args.workers} threads/workers\n"
)

# Generate the data structures for the benchmark
array0 = [np.random.rand() for _ in range(args.arraysize)]
array1 = array0.copy()
array2 = array0.copy()
array_in_np = np.array(array1)
array_in_np_copy = array_in_np.copy()

And here are our contestants:

  • Base Python
    for i in range(len(array0)):
      array0[i] = math.cos(math.sin(math.sqrt(array0[i])))
    
  • Numpy (Single threaded)
    np.sqrt(array_in_np, out=array_in_np)
    np.sin(array_in_np, out=array_in_np)
    np.cos(array_in_np, out=array_in_np)
    
  • Joblib (note that this example is not a true in-place one, but I have not been able to make it run using the out arguments)

def compute_inplace_with_joblib(chunk):
    return np.cos(np.sin(np.sqrt(chunk))) #parallel function for joblib

chunks = np.array_split(array1, args.workers)  # Split the array into chunks
numresults = Parallel(n_jobs=args.workers)(
        delayed(compute_inplace_with_joblib)(chunk) for chunk in chunks
    )# Process each chunk in a separate thread
array1 = np.concatenate(numresults)  # Concatenate the results
  • Numba
    @njit
    def compute_inplace_with_numba(array):
      np.sqrt(array,array)
      np.sin(array,array)
      np.cos(array,array)
      ## njit will compile this function to machine code
    compute_inplace_with_numba(array_in_np_copy)
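
The split, process and concatenate pattern of the Joblib contestant above can be sketched with the standard library alone. This is not the Joblib API: it swaps in concurrent.futures.ThreadPoolExecutor, all function names are illustrative, and pure Python threads will not show a real speedup for this CPU-bound work. The point is only the data flow.

```python
import math
from concurrent.futures import ThreadPoolExecutor

def split_into_chunks(values, n_chunks):
    """Split a list into n_chunks contiguous pieces of near-equal length."""
    size, remainder = divmod(len(values), n_chunks)
    chunks, start = [], 0
    for index in range(n_chunks):
        stop = start + size + (1 if index < remainder else 0)
        chunks.append(values[start:stop])
        start = stop
    return chunks

def transform_chunk(chunk):
    """Apply the toy function to every element and return a new list."""
    return [math.cos(math.sin(math.sqrt(x))) for x in chunk]

def transform_in_chunks(values, workers):
    chunks = split_into_chunks(values, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns the results in the same order as the input chunks.
        results = list(pool.map(transform_chunk, chunks))
    # Concatenate the per-chunk results back into one list.
    return [y for chunk in results for y in chunk]

data = [i / 10 for i in range(10)]
print(transform_in_chunks(data, workers=3) == transform_chunk(data))  # True
```

As in the Joblib version, the result is a new container rather than a true in-place update.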
    

And here are the timing results:

In place in (  base Python): 11.42 seconds
In place in (Python Joblib): 4.59 seconds
In place in ( Python Numba): 2.62 seconds
In place in ( Python Numpy): 0.92 seconds

Numba is surprisingly slower!? Could it be due to the overhead of compilation, as pointed out by mohawk2 in an IRC exchange about this issue? To test this, we should call compute_inplace_with_numba once before we execute the benchmark. Doing so shows that Numba is now faster than Numpy.

In place in (  base Python): 11.89 seconds
In place in (Python Joblib): 4.42 seconds
In place in ( Python Numpy): 0.93 seconds
In place in ( Python Numba): 0.49 seconds

Finally, I decided to take base R for a ride in the same example:

n<-50000000
x<-runif(n)
start_time <- Sys.time()
result <- cos(sin(sqrt(x)))
end_time <- Sys.time()

# Calculate the time taken
time_taken <- end_time - start_time

# Print the time taken
print(sprintf("Time in base R: %.2f seconds", time_taken))

which yielded the following timing result:

Time in base R: 1.30 seconds

Compared to the Perl results we note the following about this example:

  • Inplace operations in base Python were ~3.5 times slower than Perl
  • Single threaded PDL and numpy gave nearly identical results, followed closely by base R
  • Failure to account for the compilation overhead of Numba yields the false impression that it is slower than Numpy. When accounting for the compilation overhead, Numba is 2x faster than Numpy
  • Parallelization with Joblib did improve upon base Python, but was still inferior to the single thread Perl implementation
  • Multi-threaded PDL (and OpenMP) crushed (not crashed!) every other implementation in all languages

I hope this post and the Perl one provide some food for thought about the language to use for your next data/compute intensive operation. The next part in this series will look into the same example using arrays in C. This final installment will (hopefully) provide some insights about the impact of memory locality and the overhead incurred by using dynamically typed languages.

In the two prior installments of this series, we considered the performance of floating point operations in Perl, Python and R in a toy example that computed the function cos(sin(sqrt(x))), where x was a very large array of 50M double precision floating point numbers. Hybrid implementations that delegated the arithmetic intensive part to C were among the most performant implementations. In this installment, we will digress slightly and look at the performance of a pure C code implementation of the toy example. The C code will provide further insights about the importance of memory locality for performance (by default elements in a C array are stored in sequential addresses in memory, and numerical APIs such as PDL or numpy interface with such containers) vis-a-vis containers, e.g. Perl arrays, which do not store their values in sequential addresses in memory. Last, but certainly not least, the C code implementations will allow us to assess whether flags related to floating point operations for the low level compiler (in this case gcc) can affect performance. This point is worth emphasizing: common mortals are entirely dependent on the choice of compiler flags when “piping” their “install” or building their Inline file. If one does not touch these flags, then one will be blissfully unaware of what they may be missing, or of the pitfalls they may be avoiding. The humble makefile allows one to make such performance evaluations explicitly.

The C code for our toy example is listed in its entirety below. The code is rather self-explanatory, so we will not spend time explaining it, other than pointing out that it contains four functions for

  • Non-sequential calculation of the expensive function : all three floating point operations take place inside a single loop using one thread
  • Sequential calculations of the expensive function : each of the 3 floating point function evaluations takes place inside a separate loop using one thread
  • Non-sequential OpenMP code : threaded version of the non-sequential code
  • Sequential OpenMP code: threaded version of the sequential code

In the sequential case, one may hope that the compiler is smart enough to recognize that the square root maps to packed (vectorized) floating point operations in assembly, so that one function can be vectorized using the appropriate SIMD instructions (note we did not use the simd pragma for the OpenMP codes). Perhaps the speedup from the vectorization may offset the loss of performance from repeatedly accessing the same memory locations (or not).


#include <stdlib.h>
#include <string.h>
#include <math.h>
#include <stdio.h>
#include <omp.h>

// simulates a large array of random numbers
double*  simulate_array(int num_of_elements,int seed);
// OMP environment functions
void _set_openmp_schedule_from_env();
void _set_num_threads_from_env();



// functions to modify C arrays 
void map_c_array(double* array, int len);
void map_c_array_sequential(double* array, int len);
void map_C_array_using_OMP(double* array, int len);
void map_C_array_sequential_using_OMP(double* array, int len);

int main(int argc, char *argv[]) {
    if (argc != 2) {
        printf("Usage: %s <array_size>\n", argv[0]);
        return 1;
    }

    int array_size = atoi(argv[1]);
    // printf the array size
    printf("Array size: %d\n", array_size);
    double *array = simulate_array(array_size, 1234);

    // Set OMP environment
    _set_openmp_schedule_from_env();
    _set_num_threads_from_env();

    // Perform calculations and collect timing data
    double start_time, end_time, elapsed_time;
    // Non-Sequential calculation
    start_time = omp_get_wtime();
    map_c_array(array, array_size);
    end_time = omp_get_wtime();
    elapsed_time = end_time - start_time;
    printf("Non-sequential calculation time: %f seconds\n", elapsed_time);
    free(array);

    // Sequential calculation
    array = simulate_array(array_size, 1234);
    start_time = omp_get_wtime();
    map_c_array_sequential(array, array_size);
    end_time = omp_get_wtime();
    elapsed_time = end_time - start_time;
    printf("Sequential calculation time: %f seconds\n", elapsed_time);
    free(array);

    array = simulate_array(array_size, 1234);
    // Parallel calculation using OMP
    start_time = omp_get_wtime();
    map_C_array_using_OMP(array, array_size);
    end_time = omp_get_wtime();
    elapsed_time = end_time - start_time;
    printf("Parallel calculation using OMP time: %f seconds\n", elapsed_time);
    free(array);

    // Sequential calculation using OMP
    array = simulate_array(array_size, 1234);
    start_time = omp_get_wtime();
    map_C_array_sequential_using_OMP(array, array_size);
    end_time = omp_get_wtime();
    elapsed_time = end_time - start_time;
    printf("Sequential calculation using OMP time: %f seconds\n", elapsed_time);

    free(array);
    return 0;
}



/*
*******************************************************************************
* OMP environment functions
*******************************************************************************
*/
void _set_openmp_schedule_from_env() {
  char *schedule_env = getenv("OMP_SCHEDULE");
  printf("Schedule from env %s\n", schedule_env ? schedule_env : "(not set)");
  if (schedule_env != NULL) {
    char *kind_str = strtok(schedule_env, ",");
    char *chunk_size_str = strtok(NULL, ",");

    omp_sched_t kind;
    if (strcmp(kind_str, "static") == 0) {
      kind = omp_sched_static;
    } else if (strcmp(kind_str, "dynamic") == 0) {
      kind = omp_sched_dynamic;
    } else if (strcmp(kind_str, "guided") == 0) {
      kind = omp_sched_guided;
    } else {
      kind = omp_sched_auto;
    }
    // no chunk size given: pass 0 so that the OpenMP runtime uses its default
    int chunk_size = chunk_size_str ? atoi(chunk_size_str) : 0;
    omp_set_schedule(kind, chunk_size);
  }
}

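void _set_num_threads_from_env() {
  char *num = getenv("OMP_NUM_THREADS");
  if (num == NULL) { // variable not set: keep the OpenMP default
    printf("OMP_NUM_THREADS not set, using the OpenMP default\n");
    return;
  }
  printf("Number of threads = %s from within C\n", num);
  omp_set_num_threads(atoi(num));
}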
/*
*******************************************************************************
* Functions that modify C arrays whose address is passed from Perl in C
*******************************************************************************
*/

double*  simulate_array(int num_of_elements, int seed) {
  srand(seed); // Seed the random number generator
  double *array = (double *)malloc(num_of_elements * sizeof(double));
  for (int i = 0; i < num_of_elements; i++) {
    array[i] =
        (double)rand() / RAND_MAX; // Generate a random double between 0 and 1
  }
  return array;
}

void map_c_array(double *array, int len) {
  for (int i = 0; i < len; i++) {
    array[i] = cos(sin(sqrt(array[i])));
  }
}

void map_c_array_sequential(double* array, int len) {
  for (int i = 0; i < len; i++) {
    array[i] = sqrt(array[i]);
  }
  for (int i = 0; i < len; i++) {
    array[i] = sin(array[i]);
  }
  for (int i = 0; i < len; i++) {
    array[i] = cos(array[i]);
  }
}

void map_C_array_using_OMP(double* array, int len) {
#pragma omp parallel
  {
#pragma omp for schedule(runtime) nowait
    for (int i = 0; i < len; i++) {
      array[i] = cos(sin(sqrt(array[i])));
    }
  }
}

void map_C_array_sequential_using_OMP(double* array, int len) {
#pragma omp parallel
  {
#pragma omp for schedule(runtime) nowait
    for (int i = 0; i < len; i++) {
      array[i] = sqrt(array[i]);
    }
#pragma omp for schedule(runtime) nowait
    for (int i = 0; i < len; i++) {
      array[i] = sin(array[i]);
    }
#pragma omp for schedule(runtime) nowait
    for (int i = 0; i < len; i++) {
      array[i] = cos(array[i]);
    }
  }
}
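
The difference between the non-sequential and the sequential variants in the C listing above comes down to how many passes are made over the same memory. A minimal Python sketch of the two access patterns follows (illustrative names, standard library only; it says nothing about the C timings). Both apply exactly the same operations to each element, so the results are identical.

```python
import math
import random

def map_fused(values):
    """One pass: all three functions are applied while the element is at hand."""
    for i in range(len(values)):
        values[i] = math.cos(math.sin(math.sqrt(values[i])))

def map_sequential(values):
    """Three passes: every element is read and written three times."""
    for i in range(len(values)):
        values[i] = math.sqrt(values[i])
    for i in range(len(values)):
        values[i] = math.sin(values[i])
    for i in range(len(values)):
        values[i] = math.cos(values[i])

random.seed(1234)
fused = [random.random() for _ in range(1000)]
sequential = list(fused)          # identical starting data

map_fused(fused)
map_sequential(sequential)
print(fused == sequential)        # True
```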

A critical question is whether the use of fast floating point compiler flags, a trick that trades accuracy for speed of the code, can affect performance. Here is the makefile without this compiler flag:

CC = gcc
CFLAGS = -O3 -ftree-vectorize  -march=native  -Wall -std=gnu11 -fopenmp -fstrict-aliasing 
LDFLAGS = -fPIE -fopenmp
LIBS =  -lm

SOURCES = inplace_array_mod_with_OpenMP.c
OBJECTS = $(SOURCES:.c=_noffmath_gcc.o)
EXECUTABLE = inplace_array_mod_with_OpenMP_noffmath_gcc

all: $(SOURCES) $(EXECUTABLE)

clean:
	rm -f $(OBJECTS) $(EXECUTABLE)

$(EXECUTABLE): $(OBJECTS)
	$(CC) $(LDFLAGS) $(OBJECTS) $(LIBS) -o $@

%_noffmath_gcc.o : %.c 
	$(CC) $(CFLAGS) -c $< -o $@

and here is the one with this flag:

CC = gcc
CFLAGS = -O3 -ftree-vectorize  -march=native -Wall -std=gnu11 -fopenmp -fstrict-aliasing -ffast-math
LDFLAGS = -fPIE -fopenmp
LIBS =  -lm

SOURCES = inplace_array_mod_with_OpenMP.c
OBJECTS = $(SOURCES:.c=_gcc.o)
EXECUTABLE = inplace_array_mod_with_OpenMP_gcc

all: $(SOURCES) $(EXECUTABLE)

clean:
	rm -f $(OBJECTS) $(EXECUTABLE)

$(EXECUTABLE): $(OBJECTS)
	$(CC) $(LDFLAGS) $(OBJECTS) $(LIBS) -o $@

%_gcc.o : %.c 
	$(CC) $(CFLAGS) -c $< -o $@

And here are the results of running these two programs

  • Without -ffast-math
    OMP_SCHEDULE=guided,1 OMP_NUM_THREADS=8 ./inplace_array_mod_with_OpenMP_noffmath_gcc 50000000
    Array size: 50000000
    Schedule from env guided,1
    Number of threads = 8 from within C
    Non-sequential calculation time: 1.12 seconds
    Sequential calculation time: 0.95 seconds
    Parallel calculation using OMP time: 0.17 seconds
    Sequential calculation using OMP time: 0.15 seconds
    
  • With -ffast-math
    OMP_SCHEDULE=guided,1 OMP_NUM_THREADS=8 ./inplace_array_mod_with_OpenMP_gcc 50000000
    Array size: 50000000
    Schedule from env guided,1
    Number of threads = 8 from within C
    Non-sequential calculation time: 0.27 seconds
    Sequential calculation time: 0.28 seconds
    Parallel calculation using OMP time: 0.05 seconds
    Sequential calculation using OMP time: 0.06 seconds
    

    Note that one can use the fastmath in Numba code as follows (the default is fastmath=False):

    @njit(nogil=True,fastmath=True)
    def compute_inplace_with_numba(array):
      np.sqrt(array,array)
      np.sin(array,array)
      np.cos(array,array)
    

    A few points that are worth noting:

  • The -ffast-math flag gives a major boost in performance (about 300% for both the single threaded and the multi-threaded code), but it can generate erroneous results
  • Fastmath also works in Numba, but should be avoided for the same reasons it should be avoided in any application that strives for accuracy
  • The sequential C single threaded code gives performance similar to the single threaded PDL and Numpy
  • Somewhat surprisingly, the sequential code is about 20% faster than the non-sequential code when the correct (non-fast) math is used.
  • Unsurprisingly, multi-threaded code is faster than single threaded code :)
  • I still cannot explain how numba delivers a 50% performance premium over the C code of this rather simple function.
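
To see why fast math can generate erroneous results, consider what happens when a compiler is allowed to treat floating point arithmetic as if it obeyed real-number algebra. The sketch below is plain Python with IEEE 754 doubles. It does not show what gcc actually emits; it only contrasts the strict evaluation with the algebraically simplified one that such flags permit.

```python
x = 1e16
y = 1.0

# Strict IEEE 754 evaluation: x + y is rounded back to x, because doubles
# near 1e16 are spaced 2 apart, so the subtraction leaves nothing.
ieee_result = (x + y) - x

# Algebraic simplification of (x + y) - x, valid only for real numbers.
algebraic_result = y

print(ieee_result)       # 0.0
print(algebraic_result)  # 1.0
```

Code that depends on the strict rounding behavior (compensated summation is a common example) can therefore change its answers when such simplifications are allowed.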

(diii) 8 great CPAN modules released last week

Niceperl

Published by Unknown on Saturday 06 July 2024 23:16

Updates for great CPAN modules released last week. A module is considered great if its favorites count is greater than or equal to 12.

  1. App::perlimports - Make implicit imports explicit
    • Version: 0.000055 on 2024-07-04, with 18 votes
    • Previous CPAN version: 0.000054 was 5 days before
    • Author: OALDERS
  2. DBD::mysql - A MySQL driver for the Perl5 Database Interface (DBI)
    • Version: 5.007 on 2024-07-01, with 56 votes
    • Previous CPAN version: 5.006 was 27 days before
    • Author: DVEEDEN
  3. Firefox::Marionette - Automate the Firefox browser with the Marionette protocol
    • Version: 1.59 on 2024-06-30, with 16 votes
    • Previous CPAN version: 0.77 was 4 years, 11 months, 23 days before
    • Author: DDICK
  4. IO::Socket::SSL - Nearly transparent SSL encapsulation for IO::Socket::INET.
    • Version: 2.086 on 2024-07-03, with 49 votes
    • Previous CPAN version: 2.085 was 5 months, 12 days before
    • Author: SULLR
  5. Kelp - A web framework light, yet rich in nutrients.
    • Version: 2.17 on 2024-07-06, with 44 votes
    • Previous CPAN version: 2.12 was 10 days before
    • Author: BRTASTIC
  6. Module::CoreList - what modules shipped with versions of perl
    • Version: 5.20240702 on 2024-07-02, with 43 votes
    • Previous CPAN version: 5.20240609 was 23 days before
    • Author: BINGOS
  7. Net::AMQP::RabbitMQ - interact with RabbitMQ over AMQP using librabbitmq
    • Version: 2.40012 on 2024-07-04, with 15 votes
    • Previous CPAN version: 2.40011 was 11 days before
    • Author: MSTEMLE
  8. Syntax::Keyword::Match - a match/case syntax for perl
    • Version: 0.15 on 2024-07-04, with 13 votes
    • Previous CPAN version: 0.14 was 2 months, 4 days before
    • Author: PEVANS

Sometimes, one’s code must simply perform, and principles such as aesthetics, “cleverness” or commitment to a single language solution simply go out of the window. At the TPRC I gave a talk (here are the slides) about how this can be done for bioinformatics applications, but I feel that a simpler example is warranted to illustrate the potential avenues for maximizing performance that a Perl programmer has at their disposal when working on data intensive applications.

So here is a toy problem to illustrate these options. Given a very large array of double precision floats, transform them in place with the following function : cos(sin(sqrt(x))). The function has 3 nested floating point operations. This is an expensive function to evaluate, especially if one has to calculate it for a large number of values. We can generate the array values reasonably quickly in Perl (along with some copies for the solutions we will be examining) using the following code:

my $num_of_elements = 50_000_000;
my @array0 = map { rand } 1 .. $num_of_elements;    ## generate random numbers
my @array1 = @array0;                               ## copy the array
my @array2 = @array0;                               ## another copy
my @array3 = @array0;                               ## yet another copy
my @array4 = @array0;                               ## the last? copy
my $array_in_PDL      = pdl(@array0);    ## convert the array to a PDL ndarray
my $array_in_PDL_copy = $array_in_PDL->copy;    ## copy the PDL ndarray

The possible solutions include the following:

Inplace modification using a for-loop in Perl.

for my $elem (@array0) {
    $elem = cos( sin( sqrt($elem) ) );
}

Using Inline C code to walk the array and transform it in place in C. Effectively one does an in-place map using C. Accessing elements of Perl arrays (AV* in C) in C is particularly performant if one is using perl 5.36 and above because of an optimized fetch function introduced in that version of Perl.

void map_in_C(AV *array) {
  int len = av_len(array) + 1;
  for (int i = 0; i < len; i++) {
    SV **elem = av_fetch_simple(array, i, 0); // perl 5.36 and above
    if (elem != NULL) {
      double value = SvNV(*elem);
      value = cos(sin(sqrt(value))); // Modify the value
      sv_setnv(*elem, value);
    }
  }
}

Using Inline C code to transform the array, but break the transformation in 3 sequential C for-loops. This is an experiment really about tradeoffs: modern x86 processors have a specialized, vectorized square root instruction, so perhaps the compiler can figure how to use it to speed up at least one part of the calculation. On the other hand, we will be reducing the arithmetic intensity of each loop and accessing the same data value twice, so there will likely be a price to pay for these repeated data accesses.

void map_in_C_sequential(AV *array) {
  int len = av_len(array) + 1;
  for (int i = 0; i < len; i++) {
    SV **elem = av_fetch_simple(array, i, 0); // perl 5.36 and above
    if (elem != NULL) {
      double value = SvNV(*elem);
      value = sqrt(value); // Modify the value
      sv_setnv(*elem, value);
    }
  }
  for (int i = 0; i < len; i++) {
    SV **elem = av_fetch_simple(array, i, 0); // perl 5.36 and above
    double value = SvNV(*elem);
    value = sin(value); // Modify the value
    sv_setnv(*elem, value);
  }
  for (int i = 0; i < len; i++) {
    SV **elem = av_fetch_simple(array, i, 0); // perl 5.36 and above
    double value = SvNV(*elem);
    value = cos(value); // Modify the value
    sv_setnv(*elem, value);
  }
}

Parallelize the C function loop using OpenMP. In a previous entry we discussed how to control the OpenMP environment from within Perl and compile OpenMP aware Inline::C code for use by Perl, so let’s put this knowledge into action! On the Perl side of the program we will do this:

use v5.38;
use Alien::OpenMP;
use OpenMP::Environment;
use Inline (
    C    => 'DATA',
    with => qw/Alien::OpenMP/,
);
my $env = OpenMP::Environment->new();
my $threads_or_workers = 8; ## or any other value
## modify number of threads and make C aware of the change
$env->omp_num_threads($threads_or_workers);
_set_num_threads_from_env();

## modify runtime schedule and make C aware of the change
$env->omp_schedule("guided,1");    ## modify runtime schedule
_set_openmp_schedule_from_env();

On the C part of the program, we will then do this (the helper functions for the OpenMP environment have been discussed previously, and are thus not repeated here).

#include <omp.h>
void map_in_C_using_OMP(AV *array) {
  int len = av_len(array) + 1;
#pragma omp parallel
  {
#pragma omp for schedule(runtime) nowait
    for (int i = 0; i < len; i++) {
      SV **elem = av_fetch_simple(array, i, 0); // perl 5.36 and above
      if (elem != NULL) {
        double value = SvNV(*elem);
        value = cos(sin(sqrt(value))); // Modify the value
        sv_setnv(*elem, value);
      }
    }
  }
}

Perl Data Language (PDL) to the rescue. The PDL set of modules is yet another way to speed up operations and can save the programmer from C. It also autoparallelizes given the right directives so why not use it?

use PDL;
## set the minimum size problem for autothreading in PDL
set_autopthread_size(0);
my $threads_or_workers = 8; ## or any other value

## PDL
## use PDL to modify the array - multi threaded
set_autopthread_targ($threads_or_workers);
$array_in_PDL->inplace->sqrt;
$array_in_PDL->inplace->sin;
$array_in_PDL->inplace->cos;


## use PDL to modify the array - single thread
set_autopthread_targ(0);

$array_in_PDL_copy->inplace->sqrt;
$array_in_PDL_copy->inplace->sin;
$array_in_PDL_copy->inplace->cos;

Using 8 threads we get something like this

Inplace benchmarks
Inplace  in         Perl took 2.85 seconds
Inplace  in Perl/mapCseq took 1.62 seconds
Inplace  in    Perl/mapC took 1.54 seconds
Inplace  in   Perl/C/OMP took 0.24 seconds

PDL benchmarks
Inplace  in     PDL - ST took 0.94 seconds
Inplace  in     PDL - MT took 0.17 seconds

Using 16 threads we get this!

Starting the benchmark for 50000000 elements using 16 threads/workers

Inplace benchmarks
Inplace  in         Perl took 3.00 seconds
Inplace  in Perl/mapCseq took 1.72 seconds
Inplace  in    Perl/mapC took 1.62 seconds
Inplace  in   Perl/C/OMP took 0.13 seconds

PDL benchmarks
Inplace  in     PDL - ST took 0.99 seconds
Inplace  in     PDL - MT took 0.10 seconds

A few observations:

  • The OpenMP and the multi-threaded (MT) PDL solutions are responsive to the number of workers, while the other solutions are not. Hence, the timings of the pure Perl and the non-OpenMP Inline C solutions in these benchmarks give an idea of the natural variability in performance.
  • Writing the map version of the code in C made it about 1.85 times faster (contrast Perl and Perl/mapC).
  • Using PDL in a single thread made the code about 3 times faster (contrast PDL - ST and Perl timings).
  • There was a price to pay for repeated memory access (contrast Perl/mapC to Perl/mapCseq).
  • OpenMP and multi-threaded PDL operations gave similar performance (though PDL appeared faster in these examples). With 16 threads the code ran 23-30 times faster than pure Perl (about 12-17 times faster with 8 threads).
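
The ratios quoted in these observations can be recomputed from the timing tables. A minimal C sketch, assuming only the printed timings; the helper names speedup and time_saved_percent are hypothetical and not part of the benchmark code:

```c
#include <assert.h>

/* Ratio of baseline run time to improved run time:
   2.0 means the improved version is twice as fast. */
double speedup(double baseline_seconds, double improved_seconds) {
    assert(baseline_seconds > 0.0 && improved_seconds > 0.0);
    return baseline_seconds / improved_seconds;
}

/* Percentage of the baseline run time that the improved version saves. */
double time_saved_percent(double baseline_seconds, double improved_seconds) {
    assert(baseline_seconds > 0.0 && improved_seconds > 0.0);
    return 100.0 * (1.0 - improved_seconds / baseline_seconds);
}
```

For example, speedup(2.85, 1.54) is about 1.85 for Perl versus Perl/mapC with 8 threads, and speedup(3.00, 0.10) is 30 for Perl versus multi-threaded PDL with 16 threads.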

In summary, there are both native (PDL modules) and foreign (C/OpenMP) solutions to speed up data intensive operations in Perl, so why not use them widely and wisely to make Perl programs performant?

New Standards of Conduct published

Perl Foundation News

Published by D Ruth Holloway on Thursday 04 July 2024 12:20

The Board of The Perl and Raku Foundation has created a new Standard of Conduct to help combat bullying, harassment, and abuse in our communities. It is designed to help protect anyone (paid or volunteer, in-person attendee or remote worker) who is doing the work of the Foundation in promoting and advancing the Perl and Raku programming languages. I gave a talk on the new Standards at TPRC in Las Vegas last week, and you can watch the video on YouTube to learn more about our rationale, scope, and implementation details.

The current official version of the document is visible on GitHub, at https://github.com/tpf/soc/blob/main/Standards_of_Conduct.md. The new Standards will take effect on August 1, and the time between now and then is a public-comment period. There are still minor amendments under way with the Board, and the document will be updated near the end of July with those amendments. If you have constructive feedback on these new Standards, we'd love to hear it! You can reach me via email.

The inaugural Response Team has been created, with these members of our communities stepping up to serve:

  • Ruth Holloway, Board Member, Team Lead
  • Peter Krawczyk, Board Member
  • Abigail
  • T. Alex Beamish
  • Jason Crome
  • Sarah Gray
  • Matthew Stuckwisch

Reports of violations of these new standards after August 1 should be sent via email to soc@perlfoundation.org, as that email address is the only supported reporting mechanism in the new Standards.

How can we happily marry Perl to Assembly ?

Killing-It-with-PERL

Published on Thursday 04 July 2024 00:00

This is probably one of the things that should never be allowed to exist, but why not use Perl and its capabilities to inline foreign code, to FAFO with assembly without a build system? Everything in a single file! In the process one may find ways to use Perl to enhance NASM and vice versa. But for now, I make no such claims: I am just using the perlAssembly git repo to illustrate how one can use Perl to drive (and learn to code!) assembly programs from a single file. (Source code may be found in the perlAssembly repo.)

x86-64 examples

Adding Two Integers

Simple integer addition in Perl - this is the Hello World version of the perlAssembly repo. But if we can add two numbers, why not add many, many more?

The sum of an array of integers

Explore multiple equivalent ways to add large arrays of short integers (e.g. between -100 and 100) in Perl. The Perl and the C source files contain the code for:

  • ASM_blank : tests the speed of calling ASM from Perl (no computations are done)
  • ASM : passes the integers as bytes and then uses conversion operations and scalar floating point addition
  • ASM_doubles : passes the array as a packed string of doubles and does scalar double floating point addition in assembly
  • ASM_doubles_AVX: passes the array as a packed string of doubles and does packed floating point addition in assembly
  • ForLoop : standard for loop in Perl
  • ListUtil: sum function from list utilities
  • PDL : uses summation in PDL

Scenarios: those marked w_alloc allocate memory in each iteration to test the speed of pack, while those marked wo_alloc use a pre-computed data structure to pass the array to the underlying code. Benchmarks of the first scenario give the true cost of offloading the summation of a Perl array to a given function when the source data are in Perl. Timing the second scenario benchmarks the speed of the underlying implementation.

This example illustrates

  • an important (but not the only one!) strategy to create a data structure that is suitable for Assembly to work with, i.e. a standard array of the appropriate type, in which one element is laid adjacent to the previous one in memory
  • the emulation of declaring a pointer as constant in the interface of a C function. In the AVX code, we don’t FAFO with the pointer (RSI in the calling convention) to the array directly, but first load its address to another register that we manipulate at will.
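
The second point can be sketched in C. This is an illustration of the idea, not the assembly from the perlAssembly repo: the incoming pointer is declared const and left untouched, while a local copy (the hypothetical name cursor) is advanced through the contiguous array of doubles.

```c
#include <assert.h>
#include <stddef.h>

/* Sum a contiguous array of doubles without modifying the caller's pointer.
   'array' plays the role of the incoming argument register; 'cursor' is the
   working copy that we are free to advance. */
double sum_with_cursor(const double *array, size_t length) {
    assert(array != NULL);
    const double *cursor = array;        /* working copy of the address */
    const double *end = array + length;  /* one past the last element */
    double sum = 0.0;
    while (cursor < end) {
        sum += *cursor++;                /* elements are adjacent in memory */
    }
    return sum;
}
```

Because only the copy is advanced, the original address is still available after the loop, for instance to walk the same buffer a second time.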

Results

Here are the timings!

                          mean     median   stddev
ASM_blank                 2.3e-06  2.0e-06  1.1e-06
ASM_doubles_AVX_w_alloc   3.6e-03  3.5e-03  4.2e-04
ASM_doubles_AVX_wo_alloc  3.0e-04  2.9e-04  2.7e-05
ASM_doubles_w_alloc       4.3e-03  4.1e-03  4.5e-04
ASM_doubles_wo_alloc      8.9e-04  8.7e-04  3.0e-05
ASM_w_alloc               4.3e-03  4.2e-03  4.5e-04
ASM_wo_alloc              9.2e-04  9.1e-04  4.1e-05
ForLoop                   1.9e-02  1.9e-02  2.6e-04
ListUtil                  4.5e-03  4.5e-03  1.4e-04
PDL_w_alloc               2.1e-02  2.1e-02  6.7e-04
PDL_wo_alloc              9.2e-04  9.0e-04  3.9e-05

Let’s say we wanted to do this toy experiment in pure C (using Inline::C of course!). This code obtains the integers as a packed “string” of doubles and forms the sum in C:

double sum_array_C(char *array_in, size_t length) {
    double sum = 0.0;
    double * array = (double *) array_in;
    for (size_t i = 0; i < length; i++) {
        sum += array[i];
    }
    return sum;
}

Here are the timing results:

                    mean     median   stddev
C_doubles_w_alloc   4.1e-03  4.1e-03  2.3e-04
C_doubles_wo_alloc  9.0e-04  8.7e-04  4.6e-05

What if we used SIMD directives and parallel loop constructs in OpenMP? All three combinations were tested, i.e. SIMD directives alone (the C equivalent of the AVX code), OpenMP parallel loop threads and SIMD+OpenMP. Here are the timings!

                     mean     median   stddev
C_OMP_w_alloc        4.0e-03  3.7e-03  1.4e-03
C_OMP_wo_alloc       3.1e-04  2.3e-04  9.5e-04
C_SIMD_OMP_w_alloc   4.0e-03  3.8e-03  8.6e-04
C_SIMD_OMP_wo_alloc  3.1e-04  2.5e-04  8.5e-04
C_SIMD_w_alloc       4.1e-03  4.0e-03  2.4e-04
C_SIMD_wo_alloc      5.0e-04  5.0e-04  8.9e-05
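
For orientation, the three variants might look roughly like the following in C. This is a sketch under assumptions, not the benchmarked source: the function names are made up here, and the pragmas only take effect when the compiler is invoked with OpenMP support (for example -fopenmp with gcc); otherwise the loops run as plain serial code.

```c
#include <assert.h>
#include <stddef.h>

/* SIMD only: ask the compiler to vectorize the reduction loop. */
double sum_simd(const double *array, size_t length) {
    assert(array != NULL);
    double sum = 0.0;
#pragma omp simd reduction(+ : sum)
    for (size_t i = 0; i < length; i++) {
        sum += array[i];
    }
    return sum;
}

/* Threads only: split the iterations across an OpenMP thread team. */
double sum_threads(const double *array, size_t length) {
    assert(array != NULL);
    double sum = 0.0;
#pragma omp parallel for reduction(+ : sum)
    for (size_t i = 0; i < length; i++) {
        sum += array[i];
    }
    return sum;
}

/* Threads + SIMD: each thread vectorizes its own chunk of iterations. */
double sum_threads_simd(const double *array, size_t length) {
    assert(array != NULL);
    double sum = 0.0;
#pragma omp parallel for simd reduction(+ : sum)
    for (size_t i = 0; i < length; i++) {
        sum += array[i];
    }
    return sum;
}
```

The reduction clause gives every thread or SIMD lane a private partial sum and combines them at the end, so the order of the floating point additions may differ from the serial loop.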

Discussion of the sum of an array of integers example

  • For calculations such as this, the price that must be paid is all in memory currency: it takes time to generate these large arrays, and for code with low arithmetic intensity this time dominates the numeric calculation time.
  • Look how insanely effective sum in List::Util is : even though it has to walk the Perl array whose elements (the doubles, not the AV*) are not stored in a contiguous area in memory, it is no more than 3x slower than the equivalent C code C_doubles_wo_alloc.
  • Look how optimized PDL is compared to the C code in the scenario without memory allocation.
  • Manual SIMD coded in assembly is 40% faster than the equivalent SIMD code in OpenMP (but it is much more painful to write)
  • The threaded OpenMP version achieved equivalent performance to the single thread AVX assembly programs, with no obvious improvement from combining SIMD+parallel loop for pragmas in OpenMP.
  • For the example considered here, it thus makes ZERO sense to offload a calculation as simple as a summation because ListUtil is already within 15% of the assembly solution (in a later iteration we will also test AVX2 and AVX512 packed addition to see if we can improve the results).
  • If, however, one were managing the array not as a Perl array but as an area in memory through a Perl object, then one COULD consider offloading. It may be fun to consider an example in which one adds the output of a function that has an efficient PDL and assembly implementation to see how the calculus changes (in the to-do list for now).

Disclaimer

The code here is NOT meant to be portable. I code in Linux and in x86-64, so if you are looking into the Windows ABI or ARM, you will be disappointed. But as my knowledge of ARM assembly grows, I intend to rewrite some examples in ARM assembly!

List of new CPAN distributions – Jun 2024

Perlancar

Published by perlancar on Monday 01 July 2024 00:07

dist author abstract date
Alien-RtAudio JBARRETT Install RtAudio 2024-06-23T15:44:22
Alien-SunVox JBARRETT Install The SunVox Library – Alexander Zolotov's SunVox modular synthesizer and sequencer 2024-06-20T09:22:51
Alien-libextism EXTISM find or build and install libextism with development dependencies 2024-06-06T03:28:08
Aozora2Epub YOSHIMASA Convert Aozora Bunko XHTML to EPUB 2024-06-14T10:10:23
App-ArticleWrap SCHROEDER word wrap news articles or mail files 2024-06-28T11:16:03
App-BPOMUtils-NutritionLabelRef PERLANCAR Get one or more values from BPOM nutrition label reference (ALG, acuan label gizi) 2024-06-05T00:06:03
App-CommonPrefixUtils PERLANCAR Utilities related to common prefix 2024-06-01T00:05:57
App-CommonSuffixUtils PERLANCAR Utilities related to common suffix 2024-06-02T00:06:21
App-KemenkesUtils-RDA PERLANCAR Get one or more values from Indonesian Ministry of Health's RDA (AKG, angka kecukupan gizi, from Kemenkes) 2024-06-12T00:06:32
App-NKC2MARC SKIM Tool to fetch record from National library of the Czech Republic to MARC file. 2024-06-26T07:14:52
App-Timestamper-Log-Process SHLOMIF various filters and queries for App::Timestamper logs. 2024-06-09T04:42:04
App-runscript SVW Module that implements the runscript utility 2024-06-25T08:11:16
Audio-SunVox-FFI JBARRETT Bindings for the SunVox library – a modular synthesizer and sequencer 2024-06-22T11:40:41
Bio-SeqAlignment-Applications-SequencingSimulators-RNASeq-Polyester CHRISARG Skeleton package that does nothing but reserve the namespace. 2024-06-09T13:20:02
Bio-SeqAlignment-Components-Libraries-edlib CHRISARG basic edlib library 2024-06-13T03:23:35
Bio-SeqAlignment-Components-SeqMapping CHRISARG Imports all modules relevant to sequence mapping 2024-06-10T04:15:38
Bio-SeqAlignment-Components-Sundry CHRISARG Miscellaneous components for building awesome sequencing apps. 2024-06-09T16:30:30
Bio-SeqAlignment-Examples-EnhancingEdlib CHRISARG Parallelizing Edlib with MCE, OpenMP using Inline and FFI::Platypus 2024-06-13T01:02:44
Bio-SeqAlignment-Examples-TailingPolyester CHRISARG "Beefing" up the RNA sequencing simulator polyester with polyA tails 2024-06-12T12:26:38
Bot-Telegram VASYAN A micro^W nano framework for creating telegram bots based on WWW::Telegram::BotAPI 2024-06-17T11:13:27
CXC-DB-DDL-Field-Pg DJERIUS DBD::Pg specific Field class 2024-06-27T14:28:18
DWIM-Block DCONWAY Use AI::Chat without having to write the infrastructure code 2024-06-26T22:36:44
Data-Checks PEVANS XS functions to assist in value constraint checking 2024-06-19T13:26:59
Data-ISO8583 CADE 2024-06-08T22:52:33
Data-Tranco GBROWN An interface to the Tranco domain list. 2024-06-03T13:35:31
DateTime-Schedule TYRRMINAL Determine scheduled days in range based on inclusions/exclusions 2024-06-11T20:45:23
Devel-StatProfiler MBARBON low-overhead sampling code profiler 2024-06-24T21:51:40
ExtUtils-Typemaps-Misc LEONT A collection of miscelaneous typemap templates 2024-06-20T19:43:50
Extism EXTISM Extism Perl SDK 2024-06-06T04:07:08
Filter-Syntactic DCONWAY Source filters based on syntax, instead of luck 2024-06-26T22:39:31
Graphics-ColorNamesCMYK-All PERLANCAR CMYK colors from all Graphics::ColorNamesCMYK::* 2024-06-03T00:06:12
HATX HOEKIT A fluent interface for Hash and Array Transformations 2024-06-28T13:43:31
Kelp-Module-Beam-Wire BRTASTIC Beam::Wire dependency injection container for Kelp 2024-06-02T14:32:03
Kelp-Module-YAML BRTASTIC YAML encoder / decoder for Kelp 2024-06-24T19:18:49
Math-Recaman SIMONW Calculate numbers in Recamán's sequence 2024-06-25T17:25:03
Multi-Dispatch DCONWAY Multiple dispatch for Perl subs and methods 2024-06-26T22:42:28
RT-Extension-SMSWebhook-Twilio BPS RT-Extension-SMSWebhook-Twilio Extension 2024-06-13T22:40:55
Raylib-FFI PERIGRIN Perl FFI bindings for raylib 2024-06-27T05:58:30
SMS-Send-CZ-Smsmanager RADIUSCZ SMS::Send driver for SMS Manager – Czech Republic 2024-06-10T20:58:38
SPVM-R KIMOTO Porting R language Features 2024-06-26T05:10:09
Sah-PSchemaBundle PERLANCAR Convention for Sah-PSchemaBundle-* distribution 2024-06-06T00:06:05
Sah-PSchemaBundle-Array PERLANCAR Parameterized schemas related to array type 2024-06-07T00:06:17
Sah-PSchemaBundle-Perl PERLANCAR Parameterized schemas related to Perl 2024-06-08T00:06:02
Sah-PSchemaBundle-Re PERLANCAR Various regular-expression (parameterized) schemas 2024-06-09T00:05:56
Sah-SchemaBundle-CPAN PERLANCAR Sah schemas related to CPAN 2024-06-15T00:07:21
Sah-SchemaBundle-Chrome PERLANCAR Various Sah schemas related to Google Chrome 2024-06-12T00:06:44
Sah-SchemaBundle-Code PERLANCAR Various schemas related to 'code' type and coderefs 2024-06-10T04:51:45
Sah-SchemaBundle-Collection PERLANCAR Various Sah collection (array/hash) schemas 2024-06-16T00:05:16
Sah-SchemaBundle-Color PERLANCAR Sah schemas related to color codes/names 2024-06-12T00:06:55
Sah-SchemaBundle-ColorScheme PERLANCAR Sah schemas related to color schemes 2024-06-23T15:45:50
Sah-SchemaBundle-ColorTheme PERLANCAR Sah schemas related to ColorTheme 2024-06-30T00:05:31
Sah-SchemaBundle-Nutrient PERLANCAR Sah schemas related to nutrients 2024-06-04T02:10:55
SpeL-Wizard WDAEMS engine to build audio files from the spel files generated by SpeL and maintain their up-to-dateness 2024-06-10T08:18:20
SpeL-Wizard-20240610-TRIAL WDAEMS engine to build audio files from the spel files generated by SpeL and maintain their up-to-dateness 2024-06-10T08:02:44
String-Random-Regexp-regxstring BLIAKO Generate random strings from a regular expression 2024-06-28T23:27:57
Switch-Back DCONWAY given/when for a post-given/when Perl 2024-06-26T22:42:39
Switch-Right DCONWAY Switch and smartmatch done right this time 2024-06-26T22:45:36
Sys-GetRandom-PP MAUKE pure Perl interface to getrandom(2) 2024-06-15T05:21:14
Tags-HTML-Footer SKIM Tags helper for HTML footer. 2024-06-03T15:29:04
Tags-HTML-Message-Board SKIM Tags helper for message board. 2024-06-03T16:01:31
XDR-Gen EHUELS Generate (de)serializers for XDR definitions 2024-06-08T15:05:53

Stats

Number of new CPAN distributions this period: 61

Number of authors releasing new CPAN distributions this period: 29

Authors by number of new CPAN distributions this period:

No Author Distributions
1 PERLANCAR 17
2 CHRISARG 6
3 DCONWAY 5
4 SKIM 3
5 JBARRETT 3
6 EXTISM 2
7 WDAEMS 2
8 BRTASTIC 2
9 LEONT 1
10 GBROWN 1
11 EHUELS 1
12 DJERIUS 1
13 YOSHIMASA 1
14 CADE 1
15 BPS 1
16 BLIAKO 1
17 SVW 1
18 KIMOTO 1
19 VASYAN 1
20 PEVANS 1
21 TYRRMINAL 1
22 SCHROEDER 1
23 HOEKIT 1
24 SIMONW 1
25 SHLOMIF 1
26 MAUKE 1
27 RADIUSCZ 1
28 PERIGRIN 1
29 MBARBON 1

trying to cope with Slack’s BlockKit

rjbs forgot what he was saying

Published by Ricardo Signes on Sunday 30 June 2024 12:00

The other day, a concatenation of circumstances led me to thinking about the lousy state of sending formatted text to Slack. We have a bot called Synergy at work, and the bot posts lots of content. Mostly it’s plain text, but sometimes we have it send text with bold or links. This is for a couple reasons. Our bot supports channels other than Slack (like SMS and Discord and the console), so we can’t express everything in Slack-oriented terms. But even doing so would be hard, because of the APIs involved.

Slack’s APIs are dreadful, and not in the usual ways. At first glance, they look fairly modern and straightforward. Maybe they seem a bit over-engineered, but Slack is a large-scale system, and some of that is to be expected. But when you really start using them, they’re surprisingly painful. Common patterns turn out to have weird exceptions. Things don’t compose because of seemingly-arbitrary restrictions. The documentation seems auto-generated, with all that entails: missing context, lack of an overview of any given abstraction, no helpful hints. Auto-generated documentation is, at least, usually accurate and comprehensive in the methods and types provided, because it comes from the source code. Unhappily, the Slack documentation is frequently inaccurate.

sending text isn’t good enough

So, as I said, the state of the Slack APIs and developer ecosystem is such that I’ve avoided trying to get more out of it. Then again, the kind of formatting we can use easily in Synergy’s messages is not great. It usually uses roughly this:

await $slack->api_call('chat.postMessage', {
  text    => $text,
  channel => $channel,
  as_user => jtrue(),
});

Nice and simple! So, how does this get us rich text? Well, the $text content just gets formatted with Markdown! No, sorry, wait… with mrkdwn. The Slack “mrkdwn” format is like Markdown, but worse. It differs in how bold, italic, and link work, at least. Maybe other things, too. You can disable this, but then you just get plain text, which has its own problems.

This is an actual, practical problem. When we display search results from our work tracker, Linear, imagine we find an issue called “problems with Secure::Key::HSM”. This will get displayed as “problems with Secure:🔑:HSM” because mrkdwn is always on the lookout for emoji colon codes.

mrkdwn is intended for human writers, just like Markdown is intended for human writers. (That said, I don’t actually know whether a human can avoid that problem above.) If you want your software to write rich text output, you should use a less ambiguous, tricky format. For example, HTML. In Slack, the format provided is BlockKit and especially its “rich text” blocks. BlockKit is a (mediocre) object model for describing content that Slack can display in many different contexts (called “surfaces”), like: modal dialogs, messages, canvases, and more. Instead of supplying a hunk of mrkdwn, you can supply an array of “blocks”, and Slack will display them. Blocks are not ambiguous. Take the problematic Linear issue from the last paragraph. The form we want would be expressed in BlockKit like this:

[ {
  "type": "section",
  "text": { "type": "plain_text", "text": "problems with Secure::Key::HSM" }
} ]

Great! (The bad form is actually exactly the same, but replace plain_text with mrkdwn.)

sending rich text is a big pain

But let’s say that in the message above, you wanted to call them “serious problems”, with the italics. You can’t use plain text, because you want italics. You can’t use mrkdwn, because you’d get that emoji. You need a rich text block. Like so:

[
  {
    "type": "rich_text",
    "elements": [
      {
        "type": "rich_text_section",
        "elements": [
          {
            "type": "text",
            "text": "serious",
            "style": { "italics": true }
          },
          {
            "type": "text",
            "text": " problems with Secure::Key::HSM"
          }
        ]
      }
    ]
  }
]

Kinda great, but tedious. Unambiguous, but awful to write.

Also, there’s a bug. I put “italics” when I should’ve put “italic”. No problem, because you can easily expect an error like this:

Unknown property "italics" at /0/elements/0/elements/style

Not great, but gets the job done, right? Well, the problem is that while you can easily expect that, you won’t get it. What you get is this:

invalid_blocks

No matter how serious or trivial the error, and no matter where it is, that’s what you get. That is the entire body of the response to sending any invalid message. Then you try to go re-read the documentation about how it should work, and you pore over each and every element in the structure and the corresponding documentation. Worse, that documentation, as I said, is bad. I found at least a handful of errors (now reported). The combination of (bad docs) and (bad error messages) and (wordy structure) made it easy to sigh and go on using (sigh) mrkdwn.

making it hard to screw up

Well, now I didn’t want to do that anymore, so I wrote a bunch of classes, representing each of the kinds of blocks. (More on “kinds of blocks” below.)

Here’s a piece of rich text I generated for testing:

Here is a safe link: click me

  • it will be fun
  • it will be cool 🙂
  • it will be enough

This is easy to type, and simple, and I wanted it to be easy to program.

Here’s the code for that four line Slack rich text:

my $blocks = Slack::BlockKit::BlockCollection->new({
  blocks => [
    Slack::BlockKit::Block::RichText->new({
      elements => [
        Slack::BlockKit::Block::RichText::Section->new({
          elements => [
            Slack::BlockKit::Block::RichText::Text->new({
              text => "Here is a ",
            }),
            Slack::BlockKit::Block::RichText::Text->new({
              text  => "safe",
              style => { italic => 1 },
            }),
            Slack::BlockKit::Block::RichText::Text->new({
              text => " link: ",
            }),
            Slack::BlockKit::Block::RichText::Link->new({
              text  => "click me",
              unsafe => 1,
              url   => "https://fastmail.com/",
              style => { bold => 1 },
            }),
          ],
        }),
        Slack::BlockKit::Block::RichText::List->new({
          style => 'bullet',
          elements => [
            Slack::BlockKit::Block::RichText::Section->new({
              elements => [
                Slack::BlockKit::Block::RichText::Text->new({
                  text => "it will be fun",
                }),
              ]
            }),
            Slack::BlockKit::Block::RichText::Section->new({
              elements => [
                Slack::BlockKit::Block::RichText::Text->new({
                  text => "it will be cool",
                }),
                Slack::BlockKit::Block::RichText::Emoji->new({
                  name => 'smile',
                }),
              ]
            }),
            Slack::BlockKit::Block::RichText::Section->new({
              elements => [
                Slack::BlockKit::Block::RichText::Text->new({
                  text => "it will be enough",
                }),
              ],
            }),
          ],
        }),
      ],
    })
  ]
});

Okay, the good news is that if I screw up on any property or type, this code will give me an error message that tells me a lot more about what happened, and it happens before we try to send the reply. The error message isn’t great, but it will be something roughly like “Slack::BlockKit::Block::RichText::Text object found where ExpansiveBlockArray expected in Section constructor”. Maybe a little worse. But it’s descriptive, it has a stack trace, and it happens client-side.

The bad news is that it took 57 lines of Perl, mostly fluff, to generate four lines of text. This made me sour, so I needed some sugar:

use Slack::BlockKit::Sugar -all => { -prefix => 'bk_' };
bk_blocks(
  bk_richblock(
    bk_richsection(
      "Here is a ", bk_italic("safe"), " link: ",
      bk_link("https://fastmail.com/", "click me", { style => { bold => 1 } }),
    ),
    bk_ulist(
      "it will be fun",
      bk_richsection("it will be cool", bk_emoji('smile')),
      "it will be enough",
    ),
  )
);

This is 13 lines. Maybe it could be shorter, but not much. All the same type checking applies. Also, each function in there has a reusable return value, so you could do things like:

# provide either an ordered List of 2+ Sections or just one Section
my @reasons = gather_reasons();
return bk_blocks(@reasons > 1 ? bk_olist(@reasons) : @reasons);

how does it work?

It’s practically not worth explaining how it works. It is almost the definition of “a simple matter of programming”. You can read the source for Slack::BlockKit on GitHub. It’s a bunch of Moose classes with type constraints and a few post-construction validation routines. I think it’s a pretty decent example of how Moose can be useful. Moose made it easy to make the whole thing work by letting me decompose the problem into a few distinct parts: a role for elements with “style”, a set of types (like for “which kinds of elements can be children of this block”), and BUILD submethods for validating one-off quirks that can’t be expressed in type constraints.

The thing worth noting, maybe, is how it doesn’t work. It isn’t auto-generated code. It’d be pretty nice if Slack was publishing some data description that could automatically generate an API, maybe with some sugar. I don’t normally enjoy that sort of thing, but it’d probably be better than doing this by hand and then digging for exceptions, right? Anyway, they don’t do that, so I wrote this by hand. It felt silly, but it was only a few hours of work, much of it done while on a road trip.

some of the problems I encountered

I feel a bit childish calling out the problems with the BlockKit API (or its documentation), but I’m going to do it anyway.

First, I’ll say that my biggest problem is the lack of a clear object model. The terms “block” and “object” and “element” are used without much clear distinction. It might be safe to say “anything with a block_id property is a block”. Beyond that, I wouldn’t bet much. Many Block types have an elements property. Some say they contain “objects” and some say “elements”. A rich_text_list thing might be an object, or maybe an element? Or maybe an element is a kind of object? There’s also a kind of object called a “composition object”, which just seems to be another name for “types we will reuse”. I can’t tell if a composition object is substantially different from an “element”. They have their own page in the docs, anyway.

This isn’t a huge deal, but a hierarchy of types makes it easier to build a system that has to represent… well, a hierarchy of types!

non-isolated type definition

Next, the types that do exist are not clearly separated. The most jarring example of this might be:

  1. some of the rich text blocks can contain, in their elements, “link” objects
  2. “link” objects have a set of styles, chosen from { bold, code, italic, strike }
  3. …but the “link” objects contained in a “preformatted” rich text block can only have bold or italic styles

So, the type definition for a link object isn’t a complete definition, it’s contingent on the context. To verify the object, you have to put the verification into the parent: when constructed, a RichText::Preformatted object must scan its children for Link objects and raise an error based on their styling. (The alternative would be to have a distinct PreformattedLink type, which sounds worse.)

But actually the alternative is to do nothing. This restriction isn’t enforced, and in fact using the strike style works! (I can’t say whether the code style works, since by definition a link in a preformatted section will be in code style.)

missing or bogus properties

The text object (not to be confused with the text composition object) exists almost entirely to pass along its text and style properties for rendering. But the text property isn’t documented. It appears in nearly every example, though, so you can figure that one out.

Quite a few blocks are documented as having pixel-count properties, like the border property on the list block. When used at all, these trigger an invalid_blocks error. Possibly their validity is related to the surface onto which they’re rendered, but I can’t find any documentation of this.

Other properties have incorrect type definitions. The preformatted block says its elements can contain channel, emoji, user, and usergroup objects. Trying to use any of those, though, will get an error. Only link and text objects actually work.

what’s next?

I filed a pull request against Slack::Notify, which Rob N★ quickly applied and shipped (thanks!), which makes this code easy to use. So, the iron is hot and I want to strike it. I’ll probably put this in place to replace a couple gross bits in Synergy.

But also, I should write some tests first.

But also, and much less likely, it’d be great to have some form of Markdown Esperanto, where I could write Markdown (or the like) in my code, and then have it formatted into the right output based on the output channel. Given a Markdown DOM, it would be fairly simple to spit out BlockKit rich text. I think. That would be cool, but I’m pretty sure I won’t ever do it. We’ll see, though.

What's New in Perl v5.40?

perl.com

Published on Friday 28 June 2024 09:00

This article was originally published at The Weekly Challenge.


Perl, the most versatile and powerful programming language, continues to evolve. With the addition of Corinna to core Perl, I look forward to every release for new features. On 9th June 2024, we had the latest public release of Perl v5.40. There have been significant enhancements in this release. You can check out the main attraction yourself.

In this post, I would like to share my personal favourites.

1. The new __CLASS__ keyword
2. The :reader attribute for field variables
3. A space is permitted in the -M command-line option
4. The new ^^ logical xor operator
5. The try/catch feature is no longer experimental
6. for iterating over multiple values at a time is no longer experimental

1. The new __CLASS__ keyword


Do you remember our good old friend, __PACKAGE__? Well, it is a special token that returns the name of the package in which it occurs. You will most commonly see it as __PACKAGE__->meta->make_immutable in a Moose class.

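Similar to __PACKAGE__, we now have a special token __CLASS__ for the new core OO. In most cases it behaves the same as __PACKAGE__. Having said that, it shines when you are dealing with subclasses: __PACKAGE__ gives the package the code was compiled in, whereas __CLASS__ gives the run-time class of the object instance.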


use v5.40;
use experimental 'class';

class Example1 {
    field $x = __CLASS__->default_x;
    field $y = __CLASS__->default_y;

    sub default_x { 10 }
    sub default_y { 20 }
    method sum { return $x + $y }
}

class Example2 :isa(Example1) {

    sub default_x { 1 }
    sub default_y { 2 }
}

say Example1->new->sum;  # 30
say Example2->new->sum;  # 3
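
To make the contrast with __PACKAGE__ explicit, here is a small sketch that calls both tokens from the same inherited methods. The Shape and Circle classes and their method names are invented for this illustration.

```perl
use v5.40;
use experimental 'class';

class Shape {
    # __PACKAGE__ is fixed when the code is compiled;
    # __CLASS__ is resolved from the instance at run time.
    method compiled_in { return __PACKAGE__ }
    method built_as    { return __CLASS__ }
}

class Circle :isa(Shape) { }

my $circle = Circle->new;
say $circle->compiled_in;   # Shape
say $circle->built_as;      # Circle
```

Both methods are defined only in Shape, so the difference in output comes purely from which token is used.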

2. The :reader attribute for field variables


With the introduction of the new OO system in Perl v5.38, this is how one can create a class.


use v5.38;
use experimental 'class';

class Employee {
    field $name :param;
    field $age  :param;

    method name    { return $name }
    method get_age { return $age  }
}

my $emp = Employee->new(name => "Joe", age => 40);
say $emp->name;      # Joe
say $emp->get_age;   # 40


As you may have noticed, name() and get_age() are just generic getter methods.

Luckily, in the latest release the same can be achieved with the :reader attribute, without having to define the getter methods explicitly.

I must admit, it makes for a much cleaner class definition.


use v5.40;
use experimental 'class';

class Employee {
    field $name :param :reader;
    field $age  :param :reader(get_age);
}

my $emp = Employee->new(name => "Joe", age => 40);
say $emp->name;      # Joe
say $emp->get_age;   # 40


There are two variants: a bare :reader gives you a getter named after the field, and :reader(NAME) lets you provide your own method name.

You may be wondering, how about setter?

Well I am hoping in the next release we might get that too.


3. A space is permitted in the -M command-line option


Prior to Perl v5.40, this is how you would use the -M switch.


$ p538 -MList::Util=sum -E 'say sum(1, 2, 3, 4)'
10


However, if you put a space after -M in an earlier Perl, you would get the error Missing argument to -M, as shown below:


$ p538 -M List::Util=sum -E 'say sum(1, 2, 3, 4)'
Missing argument to -M


With the release of Perl v5.40, you no longer get the error.


$ p540 -M List::Util=sum -E 'say sum(1, 2, 3, 4)'
10

4. The new ^^ logical xor operator


Prior to Perl v5.40, we had three low-precedence logical operators: and, or and xor. We also had two medium-precedence logical operators: && and ||.

In earlier releases of Perl, this is how one would use the low-precedence xor operator.


use v5.38;

my $x = 1;
my $y = 0;

($x xor $y) and say 'Either $x or $y is true but not both.';


With the addition of the new medium-precedence xor operator ^^, the same can be achieved like below:


use v5.40;

my $x = 1;
my $y = 0;

$x ^^ $y and say 'Either $x or $y is true but not both.';
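
One place where the precedence difference matters is a return statement. The following is a minimal sketch with a hypothetical exactly_one helper, which is not part of Perl itself.

```perl
use v5.40;

# With the low-precedence form, "return $x xor $y" would parse as
# "(return $x) xor $y". ^^ binds tighter, so no parentheses are needed.
sub exactly_one ($x, $y) {
    return $x ^^ $y;
}

# Walk the truth table two values at a time.
for my ($x, $y) (0, 0, 0, 1, 1, 0, 1, 1) {
    say "$x ^^ $y => ", (exactly_one($x, $y) ? 'true' : 'false');
}
# 0 ^^ 0 => false
# 0 ^^ 1 => true
# 1 ^^ 0 => true
# 1 ^^ 1 => false
```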

5. The try/catch feature is no longer experimental


As we all know, try/catch was added to core Perl in v5.34 as an experimental feature.


use v5.34;
use experimental 'try';

try {
    1/0;
} catch ($e) {
    say "try/catch exception: $e";
}


It stayed experimental even in Perl v5.36.


use v5.36;
use experimental 'try';

try {
    1/0;
} catch ($e) {
    say "try/catch exception: $e";
}


However it is no longer experimental in Perl v5.40. Hurrah!!!


use v5.40;

try {
    1/0;
} catch ($e) {
    say "try/catch exception: $e";
}
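
A slightly more practical sketch, assuming a hypothetical safe_divide helper: unlike eval { }, a return inside try leaves the enclosing subroutine, so the block can be used directly as the body of a function.

```perl
use v5.40;

# Hypothetical helper: divide two numbers, giving undef when division fails.
sub safe_divide ($n, $d) {
    try {
        # This return exits safe_divide itself, not just the try block.
        return $n / $d;
    }
    catch ($e) {
        # $e holds the exception, e.g. "Illegal division by zero at ...".
        return undef;
    }
}

say safe_divide(10, 4) // 'undef';   # 2.5
say safe_divide(1, 0)  // 'undef';   # undef
```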

6. for iterating over multiple values at a time is no longer experimental


Do you remember that iterating over multiple values at a time was added to core Perl in v5.36 as an experimental feature?


use v5.36;
use experimental 'for_list';

for my ($p, $q) (1,2,3,4) {
    say $p, $q;
}


It is no longer experimental in Perl v5.40.


use v5.40;

for my ($p, $q) (1,2,3,4) {
    say $p, $q;
}

To show it in action, here are the results with four, five and six arguments:


$ p540 -E 'for my ($p, $q) (@ARGV) { say $p, $q; }' 1 2 3 4
12
34

$ p540 -E 'for my ($p, $q) (@ARGV) { say $p, $q; }' 1 2 3 4 5
12
34
5

$ p540 -E 'for my ($p, $q) (@ARGV) { say $p, $q; }' 1 2 3 4 5 6
12
34
56
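
A common use for this syntax is walking a hash as key/value pairs. Here is a minimal sketch; the %port_for hash and its contents are invented for the example.

```perl
use v5.40;

my %port_for = (http => 80, https => 443, ssh => 22);

# A hash in list context flattens to key, value, key, value, ...
# so taking two items per iteration yields one pair each time.
my @lines;
for my ($service, $port) (%port_for) {
    push @lines, "$service => $port";
}

# Hash order is not predictable, so sort before printing.
@lines = sort @lines;
say for @lines;
# http => 80
# https => 443
# ssh => 22
```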

I have only scratched the surface so far. Maybe in the next post I will try to explore further enhancements.

TPRC 2024 Feedback Form

Perl Foundation News

Published by Amber Krawczyk on Thursday 27 June 2024 17:46

Please help us by filling out our brief TPRC feedback form at https://forms.gle/DcUDX6JzWT72yXTY7. Couldn't make it this year? We still want your feedback!