URL doesn't exist anymore so link to a mirror instead

Perl commits on GitHub

Published by scottchiefbaker on Friday 14 February 2025 04:45

URL doesn't exist anymore so link to a mirror instead

Also add some comments

I can't install a Perl module. I have it installed and working OK on another laptop. Can I copy the whole installation from that laptop? What I tried:

  • I have Strawberry on my current laptop. I tried to install Image::Magick, but it gives an error. I read that in principle it is possible to install Image::Magick into Strawberry from cpan, but it seems it's going to be quite complicated.
  • I tried to install ActiveState Perl instead, but its installation procedure is too complicated. Plus, I would prefer to keep my Strawberry installation alongside ActiveState for legacy programs. So I felt tempted to try a simpler quick-and-dirty shortcut:
  • I have ActiveState Perl on my old laptop, with Image::Magick installed and working OK. So I hoped to copy it completely to my current laptop:
  • I copied the whole tree c:\Perl64 and prepended its \bin to the Path variable. Now perl --version successfully presents itself as ActiveState. However, it still looks for modules under c:\Strawberry, not under c:\Perl64\lib, where Image::Magick really is present (copied from the old laptop).

Is there an environment variable to control where perl will look for modules? Is there anything else that I should copy from my old laptop? I would prefer simple solutions, without compiling things locally, and without changing my Perl code itself. Both systems (new target and old source) are Windows 10.

Very strangely, perl -V says:

@INC: C:/Perl64/site/lib C:/Perl64/lib

And still it does not find Image::Magick, which really is present in C:\Perl64\site\lib\Image\Magick.pm. Strangely, the error message says:

Can't locate Image/Magick.pm in @INC (you may need to install the Image::Magick module) (@INC entries checked: C:/Strawberry/perl/site/lib C:/Strawberry/perl/vendor/lib C:/Strawberry/perl/lib)

Note that these @INC entries do not correspond to the value of @INC reported by perl -V (see above). Interestingly, on the old (source) laptop, perl -V reports the same as on the new (target) one:

@INC: C:/Perl64/site/lib C:/Perl64/lib

And there Image::Magick works OK, with the same tree of c:\Perl64.
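
For reference, PERL5LIB is the standard environment variable for adding directories to @INC, and printing $^X next to @INC from inside a failing run shows which perl binary is actually executing (a minimal diagnostic sketch, not part of the original question):

#!/usr/bin/perl
# Diagnostic: which interpreter is running, and where will it look?
use strict;
use warnings;
print "binary: $^X\n";
print "inc:    $_\n" for @INC;

If the binary printed is the Strawberry one, that would explain why the error message lists the C:/Strawberry @INC entries while perl -V (resolved separately by the shell) reports the C:/Perl64 ones. Setting PERL5LIB, e.g. set PERL5LIB=C:\Perl64\site\lib;C:\Perl64\lib in cmd.exe, prepends directories to @INC without changing any code.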

Perl and assembly : more stuff

r/perl

Published by /u/ReplacementSlight413 on Thursday 13 February 2025 19:46

The non #Perl mind can not comprehend the marriages between Perl and #Assembly that are possible....

https://github.com/nrdvana/perl-CPU-x86_64-InstructionWriter

submitted by /u/ReplacementSlight413
[link] [comments]

How to include a table title in Perl Text::Table::More

Perl questions on StackOverflow

Published by Zilore Mumba on Thursday 13 February 2025 19:35

I have made a simple version of a table using Text::Table::More as given here. I have been able to include a title by having the first row span all the columns.

I tried to remove the top border with "top_border => 0", but it does not work. Can this be done?

Also, the guide for this module (which is unclear to me) suggests that one can color rows. Is this doable? My code is below.

#!perl
use 5.010001;
use strict;
use warnings;
use Text::Table::More qw/generate_table/;

my $rows = [
# header row
[{text=>"Upcoming Program Achievements in Entertainment", align => "left", colspan=>5}],
# first data row
["Year",
"Comedy",
"Drama",
"Variety",
"Lead Comedy Actor"],
# second data row
[1962,
"The Bob Newhart Show (NBC)",
"The Defenders (CBS)",
"The Garry Moore Show (CBS)",
"E. G. Marshall (CBS)"],
# third data row
[1963,
"The Dick Van Dyke Show (CBS)",
"The Dick Van Dyke Show (CBS)",
"The Andy Williams Show (NBC)",
"The Andy Williams Show (NBC)"],
# fourth data row
[1964,
"The Danny Kaye Show (CBS)",
"Dick Van Dyke (CBS)",
"Mary Tyler Moore (CBS)",
"The Andy Williams Show (NBC)"],
];

binmode STDOUT, "utf8";
print generate_table(
rows => $rows,      # required
top_border => 0,    # top border was put here, this is what I said doesn't work
header_row => 1,    # optional, default 0
separate_rows => 1, # optional, default 0
border_style => "UTF8::SingleLineBoldHeader",
row_attrs => [
[0, {align=>'middle', bottom_border=>1}],
 ],
col_attrs => [[2, {valign=>'middle'}],
],
);

vadrer/perl-tvision: perl bindings for the github.com/magiblot/tvision.

r/perl

Published by /u/briandfoy on Thursday 13 February 2025 12:31

.gitattributes: declare files generated from regen/mg_vtable.pl as generated

Perl Monks unresponsive?

r/perl

Published by /u/bazoo513 on Thursday 13 February 2025 10:50

I haven't been able to access perlmonks.org for the last two days. The address responds to pings, but the browser times out. When I try with wget, I see that the site sends a 301 (moved permanently).

Does anybody know anything about this?

submitted by /u/bazoo513
[link] [comments]

How can I split a string on the first occurrence of a digit

Perl questions on StackOverflow

Published by Zilore Mumba on Wednesday 12 February 2025 22:14

I have strings which consist of a name and two digits. I would like to extract the name and the digits into one variable for each. The problem I have is that some names have spaces in them. When I split on /\s+/ the name is split into two.

my (${st_name}, $val1, $val2) = split(/\s+/, $line, 3);

I have tried to split on /\d+/, but then I do not get the digits. I have tried to get the index of the first digit, though I am not sure this is even possible:

my $index = index ($line, \d);

I would appreciate any assistance. The code I tried:

use strict;
use warnings;

while (my $line = <DATA>){
    my (${st_name}, $val1, $val2) = split(/\s+/, $line, 3);   # doesn't work: the name splits too

    # index() takes a literal string, not a pattern, so this doesn't work either
    my $index = index($line, '\d');
    ${st_name} = substr($line, 0, $index);
    ($val1, $val2) = split /\s+/, substr($line, $index);
}


__DATA__
Maputsoe 2       1
Butha-Buthe (Butha-Buthe District) 2       1
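
For what it's worth, a working approach here (a sketch, assuming digits only ever appear in the value columns) is to split on the whitespace immediately before the first digit, using a lookahead:

use strict;
use warnings;

while (my $line = <DATA>) {
    chomp $line;
    # split on whitespace only where a digit follows
    my ($st_name, $val1, $val2) = split /\s+(?=\d)/, $line, 3;
    print "name=$st_name val1=$val1 val2=$val2\n";
}

__DATA__
Maputsoe 2       1
Butha-Buthe (Butha-Buthe District) 2       1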

Musings on Perl Catalyst Chained actions

r/perl

Published by /u/jnapiorkowski on Wednesday 12 February 2025 21:29

A short meditation on things I've been dogfooding with my personal Catalyst projects and considering making public. Feedback welcomed:

https://dev.to/jjn1056/perl-catalyst-thoughts-on-chained-actions-4ck5

submitted by /u/jnapiorkowski
[link] [comments]

How to direct output of Text::Table::More to a text file

Perl questions on StackOverflow

Published by Zilore Mumba on Wednesday 12 February 2025 19:25

I am able to output a table generated with the module Text::Table::Boxed as per the example given here, the first lines being as follows:

binmode STDOUT, ":utf8";

use Text::Table::Boxed;
open(my $tfh, "> /home/zmumba/Output/Rtable.txt") or die "open 'Rtable.txt' failed: $! ($^E)";
my $tb = Text::Table::Boxed->new({
... table contents ...
});

print $tfh $tb;
close($tfh);

Text::Table::More, as given here, does not clearly offer the possibility of directing output to a file. The first lines:

#!perl
use 5.010001;
use strict;
use warnings;
use Text::Table::More qw/generate_table/;

my $rows = [
...Table content ...
];

binmode STDOUT, "utf8";
print generate_table(
.. attributes ...
);

I am able to generate the table to screen. I do not see how I get this table to a file.
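
Since generate_table simply returns the rendered table as a string, one minimal approach (a sketch, with a hypothetical output file name) is to print that string to a lexical filehandle instead of STDOUT:

use Text::Table::More qw/generate_table/;

my $table = generate_table(
    rows       => $rows,
    header_row => 1,
);

open my $tfh, '>:encoding(UTF-8)', 'Rtable.txt'
    or die "open 'Rtable.txt' failed: $!";
print {$tfh} $table;
close $tfh or die "close 'Rtable.txt' failed: $!";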

I have data which comprises a location name and some risk levels from 1 to 4, for each of 4 weather parameters: 1) 24hr rainfall, 2) 5-day total rainfall, 3) minimum temperature and 4) maximum temperature. On a day, there can be risk level 1 (no risk) for all four parameters, in which case there is nothing to process. Or there can be one, two, up to four to process.

I am able to determine if there is no processing required (i.e. risk level 1 everywhere).

A data sample is given in __DATA__ below.

I am able to determine how many weather elements have risk levels higher than 1 (from the file header), and therefore am able to make an array @RiskArray from this header.

What I would like to do is to split each line and write in a separate file for each weather element, two columns (the location and the risk value). This is similar to How can I split a string on the first occurrence of a digit for which I was very ably assisted.

In the code snippet below, I have no idea how to proceed after the while statement.

#!/usr/bin/perl -w

use strict;
use warnings;

my @RiskArray=("H1dR", "H5dR", "Tmin", "Tmax");
my $arraynum=scalar @RiskArray;

my $hdr=<DATA>;
for my $i (0..$#RiskArray){
 while (my $line = <DATA>){

 }
}

__DATA__
Risk    H1dR  H5dR  Tmin  Tmax
Maputsoe, 2, 1, 3,  2
Butha-Buthe (Butha-Buthe District), 4, 3, 2, 1
Mohale's Hoek, 2, 2, 3, 4
Qacha's Nek, 2, 3, 1,  2

sample output will be four files, e.g. file "H1dRrv.txt" will contain

Maputsoe, 2
Butha-Buthe (Butha-Buthe District), 4
Mohale's Hoek, 2
Qacha's Nek, 2

and H5dRrv.txt

Maputsoe, 1
Butha-Buthe (Butha-Buthe District), 3
Mohale's Hoek, 2
Qacha's Nek, 3
regen/HeaderParser: correctly handle backslash line continuations

C says that backslash-newline pairs should be deleted from the source
before doing any parsing (like a source filter).

    #def\
    ine A\
    B /\
    * this is a comment *\
    / "\\
    n"

is thus equivalent to

    #define AB "\n"

(In particular, backslash-newline should never be replaced by a space.)

HeaderParser: spell "preprocessor" consistently

Perl commits on GitHub

Published by mauke on Wednesday 12 February 2025 13:48

HeaderParser: spell "preprocessor" consistently

HeaderParser: properly anchor regexes for cpp directives

Perl commits on GitHub

Published by mauke on Wednesday 12 February 2025 13:48

HeaderParser: properly anchor regexes for cpp directives

Consistently use ^ in front of and \b behind preprocessor directives.

module error message "Can't make loaded symbols global"

r/perl

Published by /u/tseeling on Wednesday 12 February 2025 12:56

I am compiling a bunch of modules for AIX, and afterwards test by printing out the $<module>::VERSION info.

For some modules I receive an error message I don't understand. What does this mean? Is the module unusable on AIX? Is there something to configure at compile time so that it works?

Term::Menus Can't make loaded symbols global on this platform while loading /users/me/perl5/lib/perl5/aix-thread-multi/auto/B/Utils/Utils.so at /usr/opt/perl5/lib/5.38/aix-thread-multi/DynaLoader.pm line 206.

submitted by /u/tseeling
[link] [comments]

My Python Language Solution to Task 2: Decode XOR from The Weekly Challenge 308

dev.to #perl

Published by Robert McIntosh on Tuesday 11 February 2025 20:20

1. Introduction

The Weekly Challenge, organized by Mohammad S. Anwar, is a friendly competition in which developers compete by solving a pair of tasks. It encourages participation from developers of all languages and levels through learning, sharing, and having fun.

Task 2: Decode XOR from The Weekly Challenge encourages developers to decode an array of encoded integers.

In this post I present my Python language solution to Task 2: Decode XOR and finish with a brief conclusion.

2. Task 2: Decode XOR

You are given an encoded array, @encoded, and an initial integer, $initial.

Write a script to find the original array that produced the given encoded array. It was encoded such that encoded[i] = original[i] XOR original[i + 1].

The Weekly Challenge 308, Task 2: Decode XOR

Example 1 and Example 2 present the expected outputs from given inputs.

Example 1

Input: @encoded = (1, 2, 3), $initial = 1
Output: (1, 0, 2, 1)

If the original array is @original = (1, 0, 2, 1), to encode

- $encoded[0] = $original[0] XOR $original[1] = (1 XOR 0) = 1
- $encoded[1] = $original[1] XOR $original[2] = (0 XOR 2) = 2
- $encoded[2] = $original[2] XOR $original[3] = (2 XOR 1) = 3

Example 2

Input: @encoded = (6, 2, 7, 3), $initial = 4
Output: (4, 2, 0, 7, 4)

3. My solution to Task 2: Decode XOR

3.1 Decoding The XOR Coding

An interesting and important property of XOR is that it acts as its own inverse. If the result of an XOR operation and one of the original values is known, then the other original value can be determined from the XOR of these two values.

For example, if c = a XOR b, then if:

  • b is known, a can be calculated from a = c XOR b.
  • a is known, b can be calculated from b = c XOR a.

Because the task encoding uses XOR, its inverse property facilitates the decoding process. With the initial value of the original array, the remaining original values can be decoded from the encoded array.

The encoding process is as follows:

encoded[i] = original[i] XOR original[i + 1]

From which the decoding process follows:

original[i + 1] = encoded[i] XOR original[i]

3.2 My Solution

def decode(encoded: list[int], initial: int) -> list[int]:
    original = [
        initial,
    ]
    for encoded_element in encoded:
        original_element = original[-1] ^ encoded_element
        original.append(original_element)
    return original

Listing 1: Python solution to Task 2

Listing 1 shows my Python solution to Task 2. It uses a list of integers to hold the decoded array elements, a for loop, and the decoding expression described in section 3.1.

  • The list original is initialized with initial as its first element.
  • The for encoded_element in encoded loop iterates through the encoded array. For each encoded_element:
    • original_element is determined using the decoding expression.
    • original_element is appended to the list original.
  • The list original is returned.

4. Conclusion

In this post I presented my Python language solution to Task 2: Decode XOR from The Weekly Challenge 308.

For more information on participating in The Weekly Challenge, please visit theweeklychallenge.org.

Learn more about Python at https://python.org.

Learn more about the "inverse" of XOR at https://stackoverflow.com/questions/14279866/what-is-the-inverse-function-to-xor

When Laziness Isn't

blogs.perl.org

Published by silent11 on Tuesday 11 February 2025 17:00

I just needed a few rows of UUIDs in a column of a spreadsheet, more for esthetics than anything else. uuidgen to the rescue.

At the time I didn't realize that uuidgen natively supports outputting multiple ids like so
uuidgen -C 8


The truly lazy path would have been to read the fine uuidgen manual.

Alas, supposing I needed to make multiple calls to uuidgen, I went with a Perl one-liner with a loop, as I couldn't recall the Bash loop syntax.

Here comes the laziness... I didn't want to write something like this:

perl -e 'print `uuidgen` for @{[1..5]}';


I'm not so fond of Perl's dereference syntax these days; also, that array reference/range was giving me "the ick", as my kids would say. I needed something lazier, cleaner. I wondered if there were any default/exported arrays available to me that don't have too many elements to them.... Ah, I know!



$ perl -e 'print `uuidgen` for @INC';

d2c9c4b9-2126-4eda-ba52-ca30fdc55db0
eac4f86a-04eb-4c1a-aba1-fb1fa5c7dcda
2a2c416c-00bc-46d8-b7ce-c639f73cef26
4cc052cc-6423-4420-bbf5-595a7ad28c51
0bb78a2e-f4e9-44cd-80ae-e463197398f5
37728b6c-69dc-4669-99e7-2814b0d5e2a6
5acf78b2-6938-465b-ad8a-3bf29037e749
87d6d4ef-e85c-40bb-b3c2-acf9dc88f3e1


This is more a case of (ab)using a variable for an unintended purpose, but today it got the job done, even if it wasn't the most lazy approach. Hubris? Maybe.

nicsell supports the German Perl Workshop

blogs.perl.org

Published by Max Maischein on Tuesday 11 February 2025 13:41

You bid, we catch!
nicsell is a domain backorder service, also known as a dropcatcher, which lets you bid on a wide range of expiring domains that are currently in the deletion phase.
Starting from a low opening bid of just €10, you can take part in our auctions and have a chance of getting your desired domain.
By the way: we are looking for dedicated Perl developers (m/f/d) to strengthen our team in Osnabrück. If you are interested, we look forward to your application!

Nicsell

Premium XS Integration, Pt 2

blogs.perl.org

Published by Nerdvana on Tuesday 11 February 2025 07:52

This is a continuation of a series of articles about how to write XS libraries that are more convenient and foolproof for the Perl users, while not blocking them from using the actual C API.

If you spot anything wrong, or want to contribute suggestions, open an issue at the GitHub repo.

Wrapping Transient Objects

One frequent and difficult problem you will encounter when writing XS wrappers around a C library is what to do when the C library exposes a struct which the user needs to see, but the lifespan of that struct is controlled by something other than the reference the user is holding onto.

For example, consider the Display and Screen structs of libX11. When you connect to an X server, the library gives you a Display pointer. Within that Display struct are Screen structs. Some of the X11 API uses those Screen pointers as parameters, and you need to expose them in the Perl interface. But, if you call XCloseDisplay on the Display pointer those Screen structs get freed, and now accessing them will crash the program. The Perl user might still be holding onto a X11::Xlib::Screen Perl object, so how do you stop them from crashing the program when they check an attribute of that object?

Indirect References

For the case of X11 Screens there was an easy workaround: The Screen structs are numbered, and a pair of (Display, ScreenNumber) can refer to the Screen struct without needing the pointer to it. Because the Perl Screen object references the Perl Display object, the methods of Screen can check whether the display is closed before resolving the pointer to a Screen struct, and die with a useful message instead of a crash.

From another perspective, you can think of them like symlinks. You reference one Perl object which has control over its own struct’s lifecycle and then a relative path from that struct to whatever internal data structure you’re wrapping with the current object.

While this sounds like a quick solution, there’s one other detail to worry about: cyclical references. If the sub-object is referring to the parent object, and the parent refers to a collection of sub-objects, Perl will never free these objects. For the case of X11 Screens, the list of screen structs is known at connection-time and is almost always just one Screen, and doesn’t change at runtime. [1] An easy solution for a case like this is to have a strong reference from Display to Screen, and weak references (Scalar::Util::weaken) from Screen to Display, and create all the Screen objects as soon as the Display is connected.

1) this API is from an era before people thought about connecting new monitors while the computer was powered up, and these days can more accurately be thought of as a list of graphics cards rather than “screens”
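
As a minimal Perl sketch of that arrangement (class and attribute names invented for illustration), Moo's weak_ref option gives the Scalar::Util::weaken behavior declaratively:

package Display {
  use Moo;
  has is_closed => ( is => 'rwp', default => 0 );
  sub close($self) { $self->_set_is_closed(1) }
}

package Screen {
  use Moo;
  use Carp 'croak';

  # weak back-reference: no Display <-> Screen reference cycle
  has display => ( is => 'ro', weak_ref => 1, required => 1 );
  has number  => ( is => 'ro', required => 1 );

  # every method resolves the display first and dies cleanly
  sub _live_display($self) {
    my $display = $self->display
      or croak "Display object no longer exists";
    croak "Display is closed" if $display->is_closed;
    return $display;
  }
}

my $display = Display->new;
my $screen  = Screen->new( display => $display, number => 0 );
$display->close;
$screen->_live_display;   # croaks instead of touching freed data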

Lazy Cache of Wrapper Objects

If the list of Screens were dynamic, or if I just didn’t want to allocate them all upfront for some reason, another approach is to wrap the C structs on demand. You could literally create a new wrapper object each time they access the struct, but you’d probably want to return the same Perl object if they access two references to the same struct. One way to accomplish this is with a cache of weak references.

In Perl it would look like:

package MainObject {
  use Moo;
  use Scalar::Util 'weaken';

  has is_closed         => ( is => 'rwp' );

  # MainObject reaches out to invalidate all the SubObjects
  sub close($self) {
    ...
    $self->_set_is_closed(1);
  }

  has _subobject_cache => ( is => 'rw', default => sub {+{}} );

  sub _new_cached_subobject($self, $ptr) {
    my $obj= $self->_subobject_cache->{$ptr};
    unless (defined $obj) {
      $obj= SubObject->new(main_ref => $self, data_ptr => $ptr);
      weaken($self->_subobject_cache->{$ptr}= $obj);
    }
    return $obj;
  }

  sub find_subobject($self, $search_key) {
    my $data_ptr= _xs_find_subobject($self, $search_key);
    return $self->_new_cached_subobject($data_ptr);
  }
}

package SubObject {
  use Moo;

  has main_ref => ( is => 'ro' );
  has data_ptr => ( is => 'ro' );

  sub method1($self) {
    # If main is closed, stop all method calls
    croak "Object is expired"
      if $self->main_ref->is_closed;
    ... # operate on data_ptr
  }

  sub method2($self) {
    # If main is closed, stop all method calls
    croak "Object is expired"
      if $self->main_ref->is_closed;
    ... # operate on data_ptr
  }
}

Now, the caller of find_subobject gets a SubObject, and it has a strong reference to MainObject, and MainObject’s cache holds a weak reference to the SubObject. If we call that same method again with the same search key while the first SubObject still exists, we get the same Perl object back. As long as the user holds onto the SubObject, the MainObject won’t expire, but the SubObjects can get garbage collected as soon as they aren’t needed.

One downside of this exact design is that every method of SubObject which uses data_ptr will need to first check that main_ref isn’t closed (like shown in method1). If you have frequent method calls and you’d like them to be a little more efficient, here’s an alternate version of the same idea:

package MainObject {
  ...

  # MainObject reaches out to invalidate all the SubObjects
  sub close($self) {
    ...
    $_->data_ptr(undef)
      for grep defined, values $self->_subobject_cache->%*;
  }

  ...
}

package SubObject {
  ...

  sub method1($self) {
    my $data_ptr= $self->data_ptr
      // croak "SubObject belongs to a closed MainObject";
    ... # operate on data_ptr
  }

  sub method2($self) {
    my $data_ptr= $self->data_ptr
      // croak "SubObject belongs to a closed MainObject";
    ... # operate on data_ptr
  }

  ...
}

In this pattern, the sub-object doesn’t need to consult anything other than its own pointer before getting to work, which comes in really handy with the XS Typemap. The sub-object also doesn’t need a reference to the main object (unless you want one to prevent the main object from getting freed while a user holds SubObjects) so this design is a little more flexible. The only downside is that closing the main object takes a little extra time as it invalidates all of the SubObject instances, but in XS that time won’t be noticeable.

Lazy Cache of Wrapper Objects, in XS

So, what does the code above look like in XS? Here we go…

/* First, the API for your internal structs */

struct MainObject_info {
  SomeLib_MainObject *obj;
  HV *wrapper;
  HV *subobj_cache;
  bool is_closed;
};

struct SubObject_info {
  SomeLib_SubObject *obj;
  SomeLib_MainObject *parent;
  HV *wrapper;
};

struct MainObject_info*
MainObject_info_create(HV *wrapper) {
  struct MainObject_info *info= NULL;
  Newxz(info, 1, struct MainObject_info);
  info->wrapper= wrapper;
  return info;
}

void MainObject_info_close(struct MainObject_info* info) {
  if (info->is_closed) return;
  /* All SubObject instances are about to be invalid */
  if (info->subobj_cache) {
    HE *pos;
    hv_iterinit(info->subobj_cache);
    while (pos= hv_iternext(info->subobj_cache)) {
      /* each value of the hash is a weak reference,
         which might have become undef at some point */
      SV *subobj_ref= hv_iterval(info->subobj_cache, pos);
      if (subobj_ref && SvROK(subobj_ref)) {
        struct SubObject_info *s_info =
          SubObject_from_magic(subobj_ref, 0);
        if (s_info) {
          /* it's an internal piece of the parent, so
             no need to call a destructor here */
          s_info->obj= NULL;
          s_info->parent= NULL;
        }
      }
    }
  }
  SomeLib_MainObject_close(info->obj);
  info->obj= NULL;
  info->is_closed= true;
}

void MainObject_info_free(struct MainObject_info* info) {
  if (info->obj)
    MainObject_info_close(info);
  if (info->subobj_cache)
    SvREFCNT_dec((SV*) info->subobj_cache);
  /* The lifespan of 'wrapper' is handled by perl,
   * probably in the process of getting freed right now.
   * All we need to do is delete our struct.
   */
  Safefree(info);
}

The gist here is that MainObject has a set of all SubObject wrappers which are still held by the Perl script, and during “close” (which, in this hypothetical library, invalidates all SubObject pointers) it can iterate that set and mark each wrapper as being invalid.

The Magic setup for MainObject goes just like in the previous article:

static int MainObject_magic_free(pTHX_ SV* sv, MAGIC* mg) {
  MainObject_info_free((struct MainObject_info*) mg->mg_ptr);
  return 0;
}
static MGVTBL MainObject_magic_vtbl = {
  ...
};

struct MainObject_info *
MainObject_from_magic(SV *objref, int flags) {
  ...
}

The destructor for the magic will call the destructor for the info struct. The "from_magic" function instantiates the magic according to 'flags', and so on.

Now, the Magic handling for SubObject works a little differently. We don’t get to decide when to create or destroy SubObject, we just encounter these pointers in the return values of the C library functions, and need to wrap them in order to show them to the perl script.

/* Return a new ref to an existing wrapper, or
 * create a new wrapper and cache it.
 */
SV * SubObject_wrap(SomeLib_SubObject *sub_obj) {
  /* If your library doesn't have a way to get the main object
   * from the sub object, this gets more complicated.
   */
  SomeLib_MainObject *main_obj= SomeLib_SubObject_get_main(sub_obj);
  SV **subobj_entry= NULL;
  struct SubObject_info *s_info= NULL;
  HV *wrapper= NULL;
  SV *objref= NULL;
  MAGIC *magic;

  /* lazy-allocate the cache */
  if (!main_obj->subobj_cache)
    main_obj->subobj_cache= newHV();

  /* See if the SubObject has already been wrapped.
   * Use the pointer as the key
   */
  subobj_entry= hv_fetch(
    main_obj->subobj_cache,
    (const char *)&sub_obj, sizeof(void*), 1
  );
  if (!subobj_entry)
    croak("lvalue hv_fetch failed"); /* should never happen */

  /* weak references may have become undef */
  if (*subobj_entry && SvROK(*subobj_entry))
    /* we can re-use the existing wrapper */
    return newRV_inc( SvRV(*subobj_entry) );

  /* Not cached. Create the struct and wrapper. */
  Newxz(s_info, 1, struct SubObject_info);
  s_info->obj= sub_obj;
  s_info->wrapper= newHV();
  s_info->parent= main_obj;
  objref= newRV_noinc((SV*) s_info->wrapper);
  sv_bless(objref, gv_stashpv("YourProject::SubObject", GV_ADD));

  /* Then attach the struct pointer to its wrapper via magic */
  magic= sv_magicext((SV*) s_info->wrapper, NULL, PERL_MAGIC_ext,
      &SubObject_magic_vtbl, (const char*) s_info, 0);
#ifdef USE_ITHREADS
  magic->mg_flags |= MGf_DUP;
#else
  (void)magic; // suppress warning
#endif

  /* Then add it to the cache as a weak reference */
  *subobj_entry= sv_rvweaken( newRV_inc((SV*) s_info->wrapper) );

  /* Then return a strong reference to it */
  return objref;
}

Again, this is roughly equivalent to the Perl implementation of _new_cached_subobject above.

Now, when methods are called on the SubObject wrapper, we want to throw an exception if the SubObject is no longer valid. We can do that in the function that the Typemap uses:

struct SubObject_info *
SubObject_from_magic(SV *objref, int flags) {
  struct SubObject_info *ret= NULL;

  ... /* inspect magic */

  if (flags & OR_DIE) {
    if (!ret)
      croak("Not an instance of SubObject");
    if (!ret->obj)
      croak("SubObject belongs to a closed MainObject");
  }
  return ret;
}

Now, the Typemap:

TYPEMAP
struct MainObject_info *   O_SomeLib_MainObject_info
SomeLib_MainObject*        O_SomeLib_MainObject
struct SubObject_info *    O_SomeLib_SubObject_info
SomeLib_SubObject*         O_SomeLib_SubObject

INPUT
O_SomeLib_MainObject_info
  $var= MainObject_from_magic($arg, OR_DIE);

INPUT
O_SomeLib_MainObject
  $var= MainObject_from_magic($arg, OR_DIE)->obj;

INPUT
O_SomeLib_SubObject_info
  $var= SubObject_from_magic($arg, OR_DIE);

INPUT
O_SomeLib_SubObject
  $var= SubObject_from_magic($arg, OR_DIE)->obj;

OUTPUT
O_SomeLib_SubObject
  sv_setsv($arg, sv_2mortal(SubObject_wrap($var)));

This time I added an “OUTPUT” entry for SubObject, because we can safely wrap any SubObject pointer that we see in any of the SomeLib API calls, and get the desired result.

There’s nothing stopping you from automatically wrapping MainObject pointers with an OUTPUT typemap, but that’s prone to errors because sometimes an API returns a pointer to the already-existing MainObject, and you don’t want perl to put a second wrapper on the same MainObject. This problem doesn’t apply to SubObject, because we re-use any existing wrapper by checking the cache. (of course, you could apply the same trick to MainObject and have a global cache of all the known MainObject instances, and actually I do this in X11::Xlib)

But in general, for objects like MainObject I prefer to special-case my constructor (or whatever method initializes the instance of SomeLib_MainObject) with a call to _from_magic(..., AUTOCREATE) on the INPUT typemap rather than returning the pointer and letting Perl’s typemap wrap it on OUTPUT.

After all that, it pays off when you add a bunch of methods in the rest of the XS file.

Looking back to the find_subobject method of the original Perl example, all you need in the XS is basically the prototype for that function of SomeLib:

SomeLib_SubObject *
find_subobject(main, search_key)
  SomeLib_MainObject *main
  char *search_key

and XS translation handles the rest!

Reduce Redundancy in your Typemap

I should mention that you don't need a new typemap INPUT/OUTPUT macro for every single data type. The macros for a typemap provide you with a $type variable (and others, see perldoc perlxstypemap) which you can use to construct function names, as long as you name your functions consistently. If you have lots of different types of sub-objects, you could extend the previous typemap like this:

TYPEMAP
struct MainObject_info *    O_INFOSTRUCT_MAGIC
SomeLib_MainObject*         O_LIBSTRUCT_MAGIC

struct SubObject1_info *    O_INFOSTRUCT_MAGIC
SomeLib_SubObject1*         O_LIBSTRUCT_MAGIC_INOUT

struct SubObject2_info *    O_INFOSTRUCT_MAGIC
SomeLib_SubObject2*         O_LIBSTRUCT_MAGIC_INOUT

struct SubObject3_info *    O_INFOSTRUCT_MAGIC
SomeLib_SubObject3*         O_LIBSTRUCT_MAGIC_INOUT

INPUT
O_INFOSTRUCT_MAGIC
  $var= @{[ $type =~ / (\w+)/ ]}_from_magic($arg, OR_DIE);

INPUT
O_LIBSTRUCT_MAGIC
  $var= @{[ $type =~ /_(\w*)/ ]}_from_magic($arg, OR_DIE)->obj;

INPUT
O_LIBSTRUCT_MAGIC_INOUT
  $var= @{[ $type =~ /_(\w*)/ ]}_from_magic($arg, OR_DIE)->obj;

OUTPUT
O_LIBSTRUCT_MAGIC_INOUT
  sv_setsv($arg, sv_2mortal(@{[ $type =~ /_(\w*)/ ]}_wrap($var)));

Of course, you can choose your function names and type names to fit more conveniently into these patterns.

Finding the MainObject for a SubObject

Now, you maybe noticed that I made the convenient assumption that the C library has a function that looks up the MainObject of a SubObject:

SomeLib_MainObject *main= SomeLib_SubObject_get_main(sub_obj);

That isn’t always the case. Sometimes the library authors assume you have both pointers handy and don’t bother to give you a function to look one up from the other.

The easiest workaround is if you can assume that any function which returns a SubObject also took a parameter of the MainObject as an input. Then, just standardize the variable name given to the MainObject and use that variable name in the typemap macro.

OUTPUT
O_SomeLib_SubObject
  sv_setsv($arg, sv_2mortal(SubObject_wrap(main, $var)));

This macro blindly assumes that “main” will be in scope where the macro gets expanded, which is true for my example:

SomeLib_SubObject *
find_subobject(main, search_key)
  SomeLib_MainObject *main
  char *search_key

But, what if it isn’t? What if the C API is basically walking a linked list, and you want to expose it to Perl in a way that the user can write:

for (my $subobj= $main->first; $subobj; $subobj= $subobj->next) {
  ...
}

The problem is that the “next” method is acting on one SubObject and returning another SubObject, with no reference to “main” available.

Well, if a subobject wrapper exists, then it knows the main object, so you just need to look at that SubObject info’s pointer to parent (the MainObject) and make that available for the SubObject’s OUTPUT typemap:

SomeLib_SubObject *
next(prev_obj_info)
  struct SubObject_info *prev_obj_info;
  INIT:
    SomeLib_MainObject *main= prev_obj_info->parent;
  CODE:
    RETVAL= SomeLib_SubObject_next(prev_obj_info->obj);
  OUTPUT:
    RETVAL

So, now there is a variable ‘main’ in scope when it’s time for the typemap to construct a wrapper for the SomeLib_SubObject.

Conclusion

In Perl, the lifespan of objects is nicely defined: the destructor runs when the last reference is lost, and you use a pattern of strong and weak references to control the order the destructors run. In C, the lifespan of objects is dictated by the underlying library, and you might need to go to some awkward lengths to track which ones the Perl user is holding onto, and then flag those objects when they become invalid. While somewhat awkward, it’s very possible thanks to weak references and hashtables keyed on the C pointer address, and the users of your XS library will probably be thankful when they get a useful error message about violating the lifecycle of objects, instead of a mysterious segfault.

Premium XS Integration, Pt 1

blogs.perl.org

Published by Nerdvana on Tuesday 11 February 2025 07:06

Intro

There are several competing philosophies for wrapping external C libraries. One is that the XS module should hide all the details of the library and provide a clean “Perlish interface”. The opposite extreme is that the external C functions should be exposed to Perl using an extremely minimal XS layer, or the Foreign Function Interface (FFI) and all the logic for working with the library should be written in Perl.

I advocate something in the middle. I think that a good interface should expose as much of the low level as possible (to make as much of the library as possible usable by other Perl modules) while "padding the sharp edges" so that it is difficult for Perl-side usage to crash the program. Higher-level features can be provided in addition to the low-level API via XS, Perl modules, or both.

If you consider that the average C library is an awkward mess of state machines and lightly-enforced state requirements that will segfault if not carefully obeyed, wrapping that nicely for the Perl developer is going to require a lot of data translation and runtime sanity checks. If you skip those runtime sanity checks in your wrapper library, it drags down the efficiency of your subsequent Perl development to the level of C development, which is to say, sitting around scratching your head for hours wondering why the program keeps segfaulting (or attaching gdb to your debug build of perl). If you write those runtime checks in Perl, like with the FFI approach, your runtime performance can suffer significantly. If you write those runtime checks in XS, you can actually do quite a lot of them before there's any notable decrease in the performance of the script.

Meanwhile, C code runs an order of magnitude faster than Perl opcodes, so if you’re going to require the end user to use a compiled module already, I feel it makes sense to put as much of the higher-level routines into XS as you have time for. But, the higher level routines shouldn’t be at the expense of the lower-level ones, or else you limit what people can do with the library.

This guide will explain all the tricks I know to write safe, fast, convenient, and powerful XS libraries.

(If you spot anything wrong, or want to contribute suggestions, open an issue at the GitHub repo.)

Binding Objects

One of the first things you’ll need to do for any C library which allocates “objects” is to bind them to a matching Perl object, usually a blessed scalar ref or hash ref. (The C language doesn’t have official objects of course, but a library often allocates a struct or opaque pointer with a lifespan and a destructor function that they expect you to call when you’re done with it, which is the same theme as an object.)

If you read through the common tutorials, you’ll probably see a recipe like

SV*
new(class, some_data)
  SV *class;
  IV some_data;
  INIT:
    LibWhatever_obj *obj;
  CODE:
    obj= LibWhatever_create(some_data);
    if (!obj) croak("LibWhatever_create failed");
    RETVAL= (SV*) newRV_noinc(newSViv((IV)obj));
    sv_bless(RETVAL,
             gv_stashpv("YourProject::LibWhatever", GV_ADD));
  OUTPUT:
    RETVAL

void
DESTROY(self)
  SV *self;
  INIT:
    LibWhatever_obj *obj;
  PPCODE:
    obj= (LibWhatever_obj*) SvIV(SvRV(self));
    LibWhatever_destroy(obj);
    XSRETURN(0);

This is about the least effort/overhead you can have for binding a C data structure to a Perl blessed scalar ref, and freeing it when the Perl object goes out of scope. (you can also move some of this code to the typemap, but I’ll come back to that later)

I don’t like this pattern for several reasons:

  • If someone passes the object to Storable’s dclone, it happily makes a copy of your scalar ref and then when the first object goes out of scope it runs the destructor, and the other object is now referring to freed memory and will probably segfault during its next use.
  • When you create a new thread in a threaded Perl, it clones objects, creating the same bug.
  • The pointer is stored as an integer visible to Perl, and could get altered by sloppy/buggy Perl code, and then you get a segfault.
  • A user could subclass the XS object, and write their own DESTROY method that forgets to call $self->SUPER::DESTROY, leaking the C object.
  • Sloppy/buggy Perl code could re-bless the class, also bypassing the DESTROY call.
  • Sloppy/buggy Perl code could call DESTROY on something which isn’t the blessed scalar-ref containing a valid pointer.

While most of these scenarios shouldn’t happen, if by unfortunate circumstances they do happen, someone loses a bunch of hours debugging it, especially if they aren’t the XS author and don’t know about these pitfalls.

Magic

A much more reliable way to link the C structs to the Perl blessed refs is through Perl’s “magic” system. Magic is the name for essentially a pointer within the SV/AV/HV of your object which points to a linked list of C metadata. This metadata describes various things, like operator-overloading or ties or other low-level Perl features. One type of magic is reserved for “extensions” (that’s you!)

There is a fair amount of effort and boilerplate to set up magic on your objects, but consider these benefits:

  • You are guaranteed that only the object your C code created will carry the pointer to your C struct, and no sloppy/buggy Perl-level operations can break that.
  • If the magic-attached pointer isn’t present, you can cleanly die with an error message to the user that somehow they have called your XS method on something that isn’t your object.
  • Your C-function destructor is described by the magic metadata, and does not rely on a DESTROY Perl method. This also makes destruction faster if Perl doesn’t need to call a Perl-level DESTROY function.
  • Magic can be applied equally to any type of ref, so you can use one pattern for whatever you are blessing, or even let the user choose what kind of ref it will be.
  • You can even use Moo or Moose to create the object, then attach your magic to whatever ref the object system created.
  • You get a callback when a new Perl thread starts and attempts to clone your object. (letting you clone it, or throw an exception that it can’t be cloned which is at least nicer to the user than a segfault would be)

With that in mind, let's begin suffering through the details.

Defining Magic

Magic is described with “struct MGVTBL”:

static int
YourProject_LibWhatever_magic_free(pTHX_ SV* sv, MAGIC* mg) {
  LibWhatever_obj *obj= (LibWhatever_obj*) mg->mg_ptr;
  LibWhatever_destroy(obj);
  return 0;
}

#ifdef USE_ITHREADS
static int
YourProject_LibWhatever_magic_dup(pTHX_ MAGIC *mg,
  CLONE_PARAMS *param)
{
  croak("This object cannot be shared between threads");
  return 0;
};
#else
#define YourProject_LibWhatever_magic_dup 0
#endif

// magic table for YourProject::LibWhatever
static MGVTBL YourProject_LibWhatever_magic_vtbl= {
  0, /* get */
  0, /* set */
  0, /* length */
  0, /* clear */
  YourProject_LibWhatever_magic_free, /* free */
#ifdef MGf_COPY
  0, /* copy magic to new variable */
#endif
#ifdef MGf_DUP
  YourProject_LibWhatever_magic_dup /* dup for new threads */
#endif
#ifdef MGf_LOCAL
  ,0 /* local */
#endif
};

You only need one static instance for each type of magic your module creates. It’s just metadata telling Perl how to handle your particular type of extension magic. The ifdefs are from past versions of the struct that had fewer fields, though if your module is requiring Perl 5.8 you can assume ‘copy’ and ‘dup’ exist, and from 5.10 ‘local’ always exists as well.

Next, the recipe to attach it to a new Perl object:

SV * my_wrapper(LibWhatever_obj *cstruct) {
  SV *obj, *objref;
  MAGIC *magic;
  obj= newSV(0); // or newHV() or newAV()
  objref= newRV_noinc(obj);
  sv_bless(objref, gv_stashpv("YourProject::LibWhatever", GV_ADD));
  magic= sv_magicext(
    obj,               // the inner SV/AV/HV, not the ref to it
    NULL,
    PERL_MAGIC_ext,                      // "extension magic"
    &YourProject_LibWhatever_magic_vtbl, // show perl your functions
    (const char*) cstruct,               // your custom pointer
    0);
#ifdef USE_ITHREADS
  magic->mg_flags |= MGf_DUP;
#else
  (void)magic; // suppress warning
#endif
  return objref;
}

The key there is 'sv_magicext'. Note that you're applying it to the thing being referred to, not the scalar ref that you use for the call to sv_bless. The messy ifdef part is due to the 'dup' field of the magic table only being used when perl was compiled with threading support. The reference to YourProject_LibWhatever_magic_vtbl is both an instruction telling Perl what functions to call and a unique value used to distinguish your extension magic from anyone else's.

To read your pointer back from an SV provided to you, the recipe is:

LibWhatever_obj* YourProject_LibWhatever_from_magic(SV *objref) {
  SV *sv;
  MAGIC* magic;

  if (SvROK(objref)) {
    sv= SvRV(objref);
    if (SvMAGICAL(sv)) {
      // Iterate magic attached to this scalar to find our vtable
      for (magic= SvMAGIC(sv); magic; magic = magic->mg_moremagic)
        if (magic->mg_type == PERL_MAGIC_ext
         && magic->mg_virtual == &YourProject_LibWhatever_magic_vtbl)
          // If found, the mg_ptr points to the fields structure.
          return (LibWhatever_obj*) magic->mg_ptr;
    }
  }
  return NULL;
}

This might look a little expensive, but there is likely only one type of magic on your object, so the loop exits on the first iteration, and all you did was “SvROK”, “SvRV”, “SvMAGICAL”, and “SvMAGIC” followed by two comparisons. It’s actually quite a bit faster than verifying the inheritance of the blessed package name.

So there you go - you can now attach your C structs with magic.

In the comments, Leon T. points out that you should really be using mg_findext:

magic= mg_findext(sv, PERL_MAGIC_ext, &YourProject_LibWhatever_magic_vtbl);
if (magic)
  return (LibWhatever_obj*) magic->mg_ptr;

He’s right, you should… but iterating the linked list without a function call will be a tiny bit faster. :-)

Convenience via Typemap

In a typical wrapper around a C library, you're going to be writing a lot of methods that need to call YourProject_LibWhatever_from_magic on the first argument. To make that easier, let's move this decoding step to the typemap.

Without a typemap:

IV
method1(self, param1)
  SV *self
  IV param1
  INIT:
    LibWhatever_obj *obj= YourProject_LibWhatever_from_magic(self);
  CODE:
    if (!obj) croak("Not an instance of LibWhatever");
    RETVAL= LibWhatever_method1(obj, param1);
  OUTPUT:
    RETVAL

With a typemap entry like:

TYPEMAP
LibWhatever_obj*        O_LibWhatever_obj

INPUT
O_LibWhatever_obj
  $var= YourProject_LibWhatever_from_magic($arg);
  if (!$var) croak("Not an instance of LibWhatever");

the XS method becomes

IV
method1(obj, param1)
  LibWhatever_obj *obj
  IV param1
  CODE:
    RETVAL= LibWhatever_method1(obj, param1);
  OUTPUT:
    RETVAL

If you have some functions that take an optional LibWhatever_obj pointer, try this trick:

typedef LibWhatever_obj Maybe_LibWhatever_obj;

...

void
show(obj)
  Maybe_LibWhatever_obj *obj
  PPCODE:
    if (obj) {
      printf("...", LibWhatever_get_attr1(obj));
    }
    else {
      printf("NULL");
    }

TYPEMAP
LibWhatever_obj*        O_LibWhatever_obj
Maybe_LibWhatever_obj*  O_Maybe_LibWhatever_obj

INPUT
O_LibWhatever_obj
  $var= YourProject_LibWhatever_from_magic($arg);
  if (!$var) croak("Not an instance of LibWhatever");

INPUT
O_Maybe_LibWhatever_obj
  $var= YourProject_LibWhatever_from_magic($arg);

If you want to save a bit of compiled .so file size, you can move the error message into the ‘from_magic’ function, with a flag:

#define OR_DIE 1

LibWhatever_obj*
YourProject_LibWhatever_from_magic(SV *objref, int flags) {
  SV *sv;
  MAGIC* magic;

  if (SvROK(objref)) {
    sv= SvRV(objref);
    if (SvMAGICAL(sv)) {
      // Iterate magic attached to this scalar to find our vtable
      for (magic= SvMAGIC(sv); magic; magic = magic->mg_moremagic)
        if (magic->mg_type == PERL_MAGIC_ext
         && magic->mg_virtual == &YourProject_LibWhatever_magic_vtbl)
          // If found, the mg_ptr points to the fields structure.
          return (LibWhatever_obj*) magic->mg_ptr;
    }
  }
  if (flags & OR_DIE)
    croak("Not an instance of LibWhatever");
  return NULL;
}

TYPEMAP
LibWhatever_obj*        O_LibWhatever_obj
Maybe_LibWhatever_obj*  O_Maybe_LibWhatever_obj

INPUT
O_LibWhatever_obj
  $var= YourProject_LibWhatever_from_magic($arg, OR_DIE);

INPUT
O_Maybe_LibWhatever_obj
  $var= YourProject_LibWhatever_from_magic($arg, 0);

You can play further games with this, like automatically initializing the SV to become one of your blessed objects if it wasn’t defined, in the style of Perl’s open my $fh, ..., or maybe an option to add the magic to an existing object created by a pure-perl constructor. Do whatever makes sense for your API.

More Than One Pointer

In all the examples so far, I’m storing a single pointer to a type defined in the external C library being wrapped. Chances are, though, you need to store more than just that one pointer.

Imagine a poorly-written C library where you need to call SomeLib_create to get the object, then a series of SomeLib_setup calls before any other function can be used, then if you want to call SomeLib_go you have to first call SomeLib_prepare or else it segfaults. You could track these states in Perl variables in a hash ref, but it would just be easier if they were all present in a local C struct of your creation.

So, rather than attaching a pointer to the library struct with magic, you can attach your own allocated struct, and your struct can have a pointer to all the library details. For extra convenience, your struct can also have a pointer to the Perl object which it is attached to, which lets you access that object from other methods you write which won’t have access to the Perl stack.

struct YourProject_objinfo {
  SomeLib_obj *obj;
  HV *wrapper;
  bool started_setup, finished_setup;
  bool did_prepare;
};

struct YourProject_objinfo*
YourProject_objinfo_create(HV *wrapper) {
  struct YourProject_objinfo *objinfo= NULL;
  Newxz(objinfo, 1, struct YourProject_objinfo);
  objinfo->wrapper= wrapper;
  /* other setup here ... */
  return objinfo;
}

void
YourProject_objinfo_free(struct YourProject_objinfo *objinfo) {
  if (objinfo->obj) {
    SomeLib_obj_destroy(objinfo->obj);
  }
  /* other cleanup here ... */
  Safefree(objinfo);
}

static int YourProject_objinfo_magic_free(pTHX_ SV* sv, MAGIC* mg) {
  YourProject_objinfo_free(
    (struct YourProject_objinfo *) mg->mg_ptr);
  return 0;
}

One other thing that has changed from the previous scenario is that you can allocate this struct and attach it to the object whenever you want, instead of waiting for the user to call the function that creates the instance of SomeLib_obj. This gives you more flexible ways to deal with creation of the magic.

Here’s a pattern I like:

#define OR_DIE 1
#define AUTOCREATE 2

struct YourProject_objinfo*
YourProject_objinfo_from_magic(SV *objref, int flags) {
  SV *sv;
  MAGIC* magic;

  if (!sv_isobject(objref))
    /* could also check 'sv_derived_from' here, but that's slow */
    croak("Not an instance of YourProject");

  sv= SvRV(objref);
  if (SvMAGICAL(sv)) {
    /* Iterate magic attached to this scalar to find our vtable */
    for (magic= SvMAGIC(sv); magic; magic = magic->mg_moremagic)
      if (magic->mg_type == PERL_MAGIC_ext
       && magic->mg_virtual == &YourProject_objinfo_magic_vtbl)
        /* If found, the mg_ptr points to the fields structure. */
        return (struct YourProject_objinfo*) magic->mg_ptr;
  }
  if (flags & AUTOCREATE) {
    struct YourProject_objinfo *ret;
    if (SvTYPE(sv) != SVt_PVHV)
      croak("Expected blessed hashref");
    ret= YourProject_objinfo_create((HV*)sv);
    magic= sv_magicext(sv, NULL, PERL_MAGIC_ext,
      &YourProject_objinfo_magic_vtbl, (const char*) ret, 0);
#ifdef USE_ITHREADS
    magic->mg_flags |= MGf_DUP;
#else
    (void)magic; // suppress warning
#endif
    return ret;
  }
  if (flags & OR_DIE)
    croak("Not an initialized instance of YourProject");
  return NULL;
}

typedef struct YourProject_objinfo Maybe_YourProject_objinfo;
typedef struct YourProject_objinfo Auto_YourProject_objinfo;

Then in the typemap:

TYPEMAP
struct YourProject_objinfo*  O_YourProject_objinfo
Maybe_YourProject_objinfo*   O_Maybe_YourProject_objinfo
Auto_YourProject_objinfo*    O_Auto_YourProject_objinfo

INPUT
O_YourProject_objinfo
  $var= YourProject_objinfo_from_magic($arg, OR_DIE);

INPUT
O_Maybe_YourProject_objinfo
  $var= YourProject_objinfo_from_magic($arg, 0);

INPUT
O_Auto_YourProject_objinfo
  $var= YourProject_objinfo_from_magic($arg, AUTOCREATE);

(I should note here that you don’t need a new typemap macro for each additional type, you can use the ‘$type’ variable (holding the C type being converted) to create generic rules for multiple types. See next article for an example.)

Then use it in your XS methods to conveniently implement your sanity checks for this annoying C library:

# This is called by the pure-perl constructor, after blessing the hashref
void
_init(objinfo, param1, param2)
  Auto_YourProject_objinfo* objinfo
  IV param1
  IV param2
  PPCODE:
    if (objinfo->obj)
      croak("Already initialized");
    objinfo->obj= SomeLib_create(param1, param2);
    if (!objinfo->obj)
      croak("SomeLib_create failed: %s", SomeLib_get_last_error());
    XSRETURN(0);

bool
_is_initialized(objinfo)
  Maybe_YourProject_objinfo* objinfo
  CODE:
    RETVAL= objinfo != NULL && objinfo->obj != NULL;
  OUTPUT:
    RETVAL

void
setup(objinfo, key, val)
  struct YourProject_objinfo* objinfo
  const char *key
  const char *val
  PPCODE:
    if (objinfo->finished_setup)
      croak("Cannot call 'setup' after 'prepare'");
    if (!SomeLib_setup(objinfo->obj, key, val))
      croak("SomeLib_setup failed: %s", SomeLib_get_last_error());
    objinfo->started_setup= true;
    XSRETURN(0);

void
prepare(objinfo)
  struct YourProject_objinfo* objinfo
  PPCODE:
    if (!objinfo->started_setup)
      croak("Must call setup at least once before 'prepare'");
    objinfo->finished_setup= true;
    if (!SomeLib_prepare(objinfo->obj))
      croak("SomeLib_prepare failed: %s", SomeLib_get_last_error());
    objinfo->did_prepare= true;
    XSRETURN(0);

void
somelib_go(objinfo)
  struct YourProject_objinfo* objinfo
  PPCODE:
    if (!objinfo->did_prepare)
      croak("Must call 'prepare' before 'go'");
    if (!SomeLib_go(objinfo->obj))
      croak("SomeLib_go failed: %s", SomeLib_get_last_error());
    XSRETURN(0);

Like how clean the XS methods got?
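
For completeness, the pure-Perl constructor mentioned in the _init comment could look like this (a sketch; the class name and option names are hypothetical):

package YourProject;
use strict;
use warnings;

sub new {
    my ($class, %opts) = @_;
    my $self = bless {}, $class;
    # the Auto_ typemap on _init attaches the magic struct on first use
    $self->_init($opts{param1}, $opts{param2});
    return $self;
}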

Conclusion

When you use the pattern above, your module becomes almost foolproof against misuse. You provide helpful, easy-to-understand errors that guide the Perl coder toward correct usage of the library (well, depending on how much effort you spend on that), and they don't have to pull their hair out trying to log all the API calls and compare them to the C library documentation to figure out which one happened in the wrong order, resulting in a mysterious crash.

The code above is all assuming that the C library is providing objects whose lifespan you are in control of. Many times, the objects from a C library will have some other lifespan that the user can’t directly control with the Perl objects. I’ll cover some techniques for dealing with that in the next article.

Writing git extensions in Perl

dev.to #perl

Published by Juan Julián Merelo Guervós on Monday 10 February 2025 09:34

Introduction

Most people will tell you git is a source control tool; some people will tell you that git is a content-addressable filesystem. It's all that, but the interesting thing is that it's a single-tool interface to frameworks that allow you to create products as a team.
Enter the absolutely simple extension mechanism that git has: write an executable called git-xxx and git will dutifully call it when you run git xxx. Which is why, to make an easier onramp for students in my 7th-semester class in Computer Science, I created an extension called git iv (IV is the acronym for the class). The extension allows them to create branches with specific names, as well as upload those branches, without needing to remember specific git commands.

You might argue that remembering git commands is what students should do, but in fact they don't, and since this is not part of the core of the class, I prefer to eliminate sources of trouble for them (which eventually become sources of trouble for me) using this.
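
As a minimal illustration of the mechanism (a hypothetical git-hello, not part of git iv), any executable named git-<something> on your PATH becomes a subcommand:

#!/usr/bin/perl
# Save this as "git-hello" somewhere on $PATH and make it
# executable; "git hello" will then run it.
use strict;
use warnings;
print "Hello from a git extension!\n";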

Writing the extension in Perl

There are many good things that can be said about Perl, for this or for anything else. But in this case there's a thing that makes it ideal for writing extensions: git includes a Perl module called Git, which is a Perl interface to all the Git commands. This is distributed with git, so if you've got git, you've got this library.

The whole extension is now hosted in this GitHub repo; it will contain the most up-to-date version as well as documentation and other stuff.

So here's the preamble to the extension:

use strict;
use warnings;
use lib qw( /Library/Developer/CommandLineTools/usr/share/git-core/perl
            /usr/share/perl5 );
use Git;

use v5.14;

my $HELP_FLAG = "-h";
my $USAGE_STRING = <<EOC;
Uso:
    git iv objetivo <número> -- crea una rama para ese objetivo
    git iv sube-objetivo     -- sube al repo remoto la rama

    git iv $HELP_FLAG -- imprime este mensaje
EOC

The main caveat about the extension is that some flags will be handled by git itself. There are probably quite a few of those, but one of them is --help: git xxx --help will try to look up a manual page for git xxx. This is why a different help flag is defined above, and also a usage string, which is helpful when you don't remember the exact shape of the subcommands. In this case, I use git iv as the extension name and as an interface to the stuff that needs to be done; but there are subcommands that will do different things. These are implemented later:

my @subcommands = qw(objetivo sube-objetivo);
push( @subcommands, quotemeta $HELP_FLAG);

die( usage_string() ) unless @ARGV;
my $subcommand = shift;

die "No se reconoce el subcomando $subcommand" unless grep( /\Q$subcommand/, @subcommands );

my @args = @ARGV;

I prefer not to include any dependencies; there are powerful command-line flag libraries out there, but in this case, a single script is best. So you handle whatever comes after iv uniformly, be it a subcommand or a flag. But the issue with the flag is that it includes a dash (-), so we wrap it with quotemeta so that it can be used safely in regexes. Like the one, for instance, 4 lines below: in case the subcommand at the front of the command line is not part of the list, it will bail out showing the usage string.

Anything after the subcommand will be gobbled into @args.

if ( $subcommand eq $HELP_FLAG ) {
  say $USAGE_STRING;
} else {

  my $repo;

  eval {
    $repo = Git->repository;
  } or die "Aparentemente, no estás en un repositorio";

  if ( $subcommand eq "objetivo" ) {
    die $USAGE_STRING unless @args;
    $repo->command( "checkout", "-b", "Objetivo-" . $args[0]);
  }

  if ( $subcommand eq "sube-objetivo" ) {
    my $branch = $repo->command( "rev-parse", "--abbrev-ref", "HEAD" );
    chomp($branch);
    $repo->command ( "push", "-u", "origin", $branch );
  }
}

Now it's a matter of processing the subcommand. If it's the flag -h, print the usage string; if it's any of the other subcommands, we need to work with the git repository.

$repo = Git->repository; creates an object out of the Git library we mentioned before, which we will use to issue the different plumbing or high-level commands. One of the subcommands will do a checkout: $repo->command( "checkout", "-b", "Objetivo-" . $args[0]); will convert itself to the equivalent command. You can even work with plumbing commands such as rev-parse to check the branch you're in and create that branch remotely, as the other subcommand does.
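If you want to poke at the module a bit more, here is a small hedged sketch of a couple of other calls it supports (it assumes only that you run it inside a repository):

use strict;
use warnings;
use Git;

my $repo = Git->repository;

# High-level command: in list context, returns the output lines
my @branches = $repo->command( 'branch', '--list' );

# Plumbing: resolve HEAD to its SHA-1, returned as a single chomped line
my $head = $repo->command_oneline( 'rev-parse', 'HEAD' );

print "HEAD is $head\n";
print "branch: $_\n" for @branches;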

Concluding

Perl saves you a whole lot of trouble when writing this kind of thing. Besides, the fact that it will most probably be installed in any system you use to develop on (Mac, Linux or WSL) saves you the trouble of asking for prerequisites for this script.

perl

dev.to #perl

Published by RAK on Monday 10 February 2025 08:10

perl rest web service client

dev.to #perl

Published by RAK on Monday 10 February 2025 08:07

The Weekly Challenge - 308

The Weekly Challenge

Published on Monday 10 February 2025 04:51

Welcome to the Week #308 of The Weekly Challenge.

RECAP - The Weekly Challenge - 307

The Weekly Challenge

Published on Monday 10 February 2025 03:34

Thank you Team PWC for your continuous support and encouragement.

The Weekly Challenge - Guest Contributions

The Weekly Challenge

Published on Sunday 09 February 2025 18:49

As you know, The Weekly Challenge primarily focuses on Perl and Raku. During Week #018, we received solutions to The Weekly Challenge - 018 by Orestis Zekai in Python. It was a pleasant surprise to receive solutions in something other than Perl and Raku. Ever since, regular team members have also started contributing in other languages like Ada, APL, Awk, BASIC, Bash, Bc, Befunge-93, Bourne Shell, BQN, Brainfuck, C3, C, CESIL, Chef, COBOL, Coconut, C Shell, C++, Clojure, Crystal, D, Dart, Dc, Elixir, Elm, Emacs Lisp, Erlang, Excel VBA, F#, Factor, Fennel, Fish, Forth, Fortran, Gembase, GNAT, Go, GP, Groovy, Haskell, Haxe, HTML, Hy, Idris, IO, J, Janet, Java, JavaScript, Julia, K, Kap, Korn Shell, Kotlin, Lisp, Logo, Lua, M4, Maxima, Miranda, Modula 3, MMIX, Mumps, Myrddin, Nelua, Nim, Nix, Node.js, Nuweb, Oberon, Octave, OCaml, Odin, Ook, Pascal, PHP, Python, PostgreSQL, Postscript, PowerShell, Prolog, R, Racket, Rexx, Ring, Roc, Ruby, Rust, Scala, Scheme, Sed, Smalltalk, SQL, Standard ML, SVG, Swift, Tcl, TypeScript, Uiua, V, Visual BASIC, WebAssembly, Wolfram, XSLT, YaBasic and Zig.

(dxxxiv) 12 great CPAN modules released last week

Niceperl

Published by Unknown on Saturday 08 February 2025 23:42

Updates for great CPAN modules released last week. A module is considered great if its favorites count is greater than or equal to 12.

  1. App::Netdisco - An open source web-based network management tool.
    • Version: 2.083001 on 2025-02-06, with 17 votes
    • Previous CPAN version: 2.082001 was 8 days before
    • Author: OLIVER
  2. Crypt::Passphrase - A module for managing passwords in a cryptographically agile manner
    • Version: 0.021 on 2025-02-04, with 17 votes
    • Previous CPAN version: 0.020 was 25 days before
    • Author: LEONT
  3. CryptX - Cryptographic toolkit
    • Version: 0.085 on 2025-02-08, with 51 votes
    • Previous CPAN version: 0.084 was 3 months, 23 days before
    • Author: MIK
  4. Imager - Perl extension for Generating 24 bit Images
    • Version: 1.026 on 2025-02-08, with 67 votes
    • Previous CPAN version: 1.025 was 2 months, 22 days before
    • Author: TONYC
  5. IO::Prompter - Prompt for input, read it, clean it, return it.
    • Version: 0.005002 on 2025-02-07, with 27 votes
    • Previous CPAN version: 0.005001 was 1 year, 6 months, 22 days before
    • Author: DCONWAY
  6. Mozilla::CA - Mozilla's CA cert bundle in PEM format
    • Version: 20250202 on 2025-02-02, with 19 votes
    • Previous CPAN version: 20240924 was 4 months, 8 days before
    • Author: LWP
  7. PerlPowerTools - BSD utilities written in pure Perl
    • Version: 1.049 on 2025-02-06, with 40 votes
    • Previous CPAN version: 1.048 was 1 month, 28 days before
    • Author: BRIANDFOY
  8. Rex - the friendly automation framework
    • Version: 1.16.0 on 2025-02-05, with 86 votes
    • Previous CPAN version: 1.15.0 was 3 months before
    • Author: FERKI
  9. SPVM - The SPVM Language
    • Version: 0.990043 on 2025-02-07, with 35 votes
    • Previous CPAN version: 0.990042 was 16 days before
    • Author: KIMOTO
  10. Sys::Virt - libvirt Perl API
    • Version: v11.0.0 on 2025-02-07, with 17 votes
    • Previous CPAN version: v10.9.0 was 3 months, 6 days before
    • Author: DANBERR
  11. Test::Warnings - Test for warnings and the lack of them
    • Version: 0.038 on 2025-02-02, with 18 votes
    • Previous CPAN version: 0.037 was 28 days before
    • Author: ETHER
  12. YAML::LibYAML - Perl YAML Serialization using XS and libyaml
    • Version: v0.903.0 on 2025-02-02, with 57 votes
    • Previous CPAN version: v0.902.0 was 4 months, 12 days before
    • Author: TINITA

(dc) metacpan weekly report - Perlmazing

Niceperl

Published by Unknown on Saturday 08 February 2025 23:39

This is the weekly favourites list of CPAN distributions. Votes count: 48

Week's winners (+3): Perlmazing 

Build date: 2025/02/08 22:37:34 GMT


Clicked for first time:


Increasing its reputation:

The Weekly Challenge - 307

The Weekly Challenge

Published on Friday 07 February 2025 13:17

Welcome to the Week #307 of The Weekly Challenge.

RECAP - The Weekly Challenge - 306

The Weekly Challenge

Published on Friday 07 February 2025 13:16

Thank you Team PWC for your continuous support and encouragement.

Announce Perl.Wiki.html V 1.23

blogs.perl.org

Published by Ron Savage on Friday 07 February 2025 03:41

Available now: Perl.Wiki.html V 1.23.

As from today, when I announce a new version of one of my TiddlyWikis, I'll list the 2 most recent parts of the change log. These are just copies of the most recent parts of the para called VersionInfo.

And why 2 parts? To help those who missed the last announcement.

o Author:
- Ron Savage
- Melbourne, Victoria, Australia
- http://savage.net.au
- https://symboliciq.au
- https://quantumiq.au
- https://metacpan.org/author/RSAVAGE

o version:
- Perl extension for Version Objects
- https://metacpan.org/dist/version

o V 1.23:
- 2025-02-07
- Add https://quantumiq.au to the Author block above
- Add Advanced::Config to ConfigFiles
- Add AnyEvent::Net::Curl::Queued to EventStuff
- Add App::Dapper to TemplateStuff
- Add para BusinessApps with Business::ISBN and Business::ISSN
- Add Check::NetworkSpans to NetworkProgramming
- Add Colouring::In::XS to Color
- Add Convert::CEGH, Convert::Number::Coptic, Convert::Number::Digits, Convert::Number::Roman, String::Equivalence::Amharic, Text::DoubleMetaphone, Text::Metaphone, Text::Metaphone::Amharic, Text::Phonetic::Metaphone and Text::TransMetaphone to LinguaStuff
- Add Coro, Coro::LWP, Coro::Multicore and Coro::Select to ParallelProcessing
- Add Crypt::Credentials to CryptoStuff
- Add Data::Random, Data::SimplePassword, Date::Range, Date::Simple, Mock::Data and Mock::Populate to TestingHelp
- Add Data::Record::Serialize, DBD::Firebird and DBIx::Migration to DatabaseAndSQL
- Add Geo::Coder::US::Census and Geo::StreetAddress::US to GeographicStuff
- Add Image::Dot to ChartingAndPlotting
- Add JSON::Schema::Modern to JsonStuff
- Add HTTP::Request::FromCurl and LWP::Protocol::Net::Curl to HTTPHandling
- Add para LargeLanguageModels containing Ollama and WebService::Ollama
- Add Mozilla::CA to BrowserStuff
- Add Net::Domain::Parts to DnsStuff
- Add Parser::FIT to DataTraversal
- Add Statistics::Distributions to StatisticsStuff
- Add Text::Password::Pronounceable to UsernamePassword
- Add Text::Unidecode to UTF8
- Add Time::Local and Time::Precise to DatesAndTimes

o V 1.22:
- 2025-01-20
- Add Algorithm::CurveFit, App::bookmarks, App::remotediff, App::uricolor, Catmandu, Data::Match, Data::DRef, Data::Difference, Data::Peek, Data::Walker and DBIx::Array to DataTraversal
- Add App::geoip and Geo::Spline to GeographicStuff
- Add App::subsort and Sort::Sub to SortStuff
- Add Array::Transpose,Math::MatrixBool and Math::MatrixReal to ArrayHandling
- Add Bit::Vector and Set::CrossProduct to SetStuff
- Add Config::General to ConfigFiles
- Add Control::CLI and Control::CLI::Extreme to NetworkProgramming
- Add CPAN::UnsupportedFinder to CPAN
- Add para CpanFile, on how to generate a cpanfile. Add CPAN::Audit
- Add Crypt::Komihash, Crypt::URandom, Crypt::URandom::Token, Rand::URandom, Random::Any and Sys::GetRandom to RandomStuff
- Add Crypt::Passphrase::Yescrypt, Crypt::SysRandom and Crypt::Yescrypt to CryptoStuff
- Add App::Sqitch, Class::Phrasebook::SQL, Data::Phrasebook, Fey, Fey::ORM, MySQL::Hi, Rose::DB::Object, SQL::Library and WebService::Chroma to DatabaseAndSQL
- Add Data::Identifier to DataTypes
- Add DateTime::Format::Text, DateTime::TimeZone, DateTimeX::Easy and Oxford::Calendar to DatesAndTimes
- Add DBIx::Array::Connect, DBIx::MyPassword, DBIx::Password, DBIx::PasswordIniFile and Pass::OTP to UsernamePassword
- Add Devel::REPL, perldb.pl, Reply and Runtime::Debugger to DebuggingStuff
- Add para DFA with module DFA::Kleene
- Add para EnterpriseApps with EAI::Wrap
- Add Feersum to EventStuff
- Add File::Information to FileHandling
- Add Function::Parameters to ParameterTypes
- Add Genealogy::FindaGrave in GenealogyStuff
- Add para GrammarProcessing with module Grammar::Improver
- Add Hash::Match to HashHandling
- Add HTML::D3 to ChartingAndPlotting
- Add HTTP::Parser::XS to HTTPHandling
- Add para HTTPServers with modules Furl, Plack, PSGI, Starlet and Starman, and mention nginx and picohttpparser
- Add Graph::Kruskal to Graphs
- Add JE to JavaScript
- Add JSON_LD, Schema.org and Rose::HTML::Objects to HTMLHandling
- Add Locale::TextDomain and Locale::TextDomain::OO::Extract to LocaleStuff
- Add Log::Dispatchouli to LoggingStuff
- Add para SecurityPolicy
- Add Module::LicenseScanner to LicenceStuff
- Add Pod::Perldoc::ToToc to PodHandling
- Add Poz to DataValidation
- Add Regexp::Compare and Regex::Range::Number to RegexpStuff. Expand some entries
- Add String::Sprintf to StringStuff
- Add Surveyor::App to BenchmarkingTools
- Add Syntax::Feature::Loop, Syntax::Feature::QwComments and Syntax::Feature::Void to NewOperators
- Add Test::File, Test::HTTPStatus, Test::Mockingbird and Test::Output to TestingHelp
- Add Text::HTML::Turndown to MarkdownHandling
- Add Text::Table::Read::RelationOn::Tiny to TableData
- Add Time::OlsonTZ::Data to DatesAndTimes
- Add Unicode::GCString to UTF8
- Add VAPID to WebPush and add details for HTTP::Request::Webpush
- Add para WorkFlows and module Workflow
- Add XS::Parse::Sublike and Sublike::Extended to XS
- Fold contents of SecretManagement into UsernamePassword
- Rename para RegexpExamples to RegexpStuff
- Update para builtins

A view upward toward the wooden framing of a house under construction against a blue sky.

Recently I’ve been working on a project with a Vue front-end and two back-ends, one in Python using the Django framework and one in Perl using the Mojolicious framework. So, it’s a good time to spend some words to share the experience and do a quick comparison.

Previously I wrote a post about Perl web frameworks, and now I’m expanding the subject into another language.

Django was chosen for this project because it’s been around for almost 20 years now and provides the needed maturity and stability to be long-running and low-budget. In this regard, it has proved a good choice so far. Recently it saw a major version upgrade without any problems to speak of. It could be argued that I should have used the Django REST Framework instead of plain Django. However, at the time the decision was made, adding a framework on top of another seemed a bit excessive. I don’t have many regrets about this, though.

Mojolicious is an old acquaintance. It used to have fast-paced development but seems very mature now, and it’s even been ported to JavaScript.

Both frameworks have just a few dependencies (which is fairly normal in the Python world, but not in the Perl one) and excellent documentation. They both follow the model-view-controller pattern. Let’s examine the components.

Views

Both frameworks come with a built-in template system (which can be swapped out with something else), but in this project we can skip the topic altogether as both frameworks are used only as back-end for transmitting JSON, without any HTML rendering involved.

However, let’s see how the rendering looks for the API we’re writing.

Mojolicious:

use Mojo::Base 'Mojolicious::Controller', -signatures;

sub check ($self) {
    $self->render(json => { status => 'OK' });
}

Django:

from django.http import JsonResponse

def status(request):
    return JsonResponse({ "status": "OK" })

Nothing complicated here, just provide the right call.

Models

Django

Usually a model in the context of web development means a database, and here we are going to keep this assumption.

Django comes with a comprehensive object-relational mapping (ORM) system and it feels like the natural thing to use. I don’t think it makes much sense to use another ORM, or even to use raw SQL queries (though it is possible).

You usually start a Django project by defining the model. The Django ORM gives you the tools to manage the migrations, providing abstraction from the SQL. You need to define the field types and the relationships (joins and foreign keys) using the appropriate class methods.

For example:

from django.db import models
class User(AbstractUser):
    email = models.EmailField(null=False, blank=False)
    site = models.ForeignKey(Site, on_delete=models.CASCADE, related_name="site_users")
    libraries = models.ManyToManyField(Library, related_name="affiliated_users")
    expiration = models.DateTimeField(null=True, blank=True)
    created = models.DateTimeField(auto_now_add=True)
    last_modified = models.DateTimeField(auto_now=True)

These calls provide not only the SQL type to use, but also the validation. For example, the blank parameter is a validation option specifying whether Django will accept an empty value. It is different from the null option, which directly correlates to SQL. You can see we’re quite far from working with SQL, at least two layers of abstraction away.

In the example above, we’re also defining a foreign key between a site and a user (many-to-one), so each user belongs to one site. We also define a many-to-many relationship with the libraries record. I like how these relationships are defined, it’s very concise.

Thanks to these definitions, you get a whole admin console almost for free, which your admin users are sure to like. However, I’m not sure this is a silver bullet for solving all problems. With large tables and relationships the admin pages load slowly and they could become unusable very quickly. Of course, you can tune that by filtering out what you need and what you don’t, but that means things are not as simple as “an admin dashboard for free” — at the very least, there’s some configuring to do.

As for the query syntax, you usually need to call Class.objects.filter(). As you would expect from an ORM, you can chain the calls and finally get objects out of that, representing a database row, which, in turn, you can update or delete.

The syntax for the filter() call is based on the double underscore separator, so you can query over the relationships like this:

for agent in (Agent.objects.filter(canonical_agent_id__isnull=False)
              .prefetch_related('canonical_agent')
              .order_by('canonical_agent__name', 'name')
              .all()):
    agent.name = "Dummy"
    agent.save()

In this case, provided that we defined the foreign keys and the attributes in the model, we can search/​order across the relationship. The __isnull suffix, as you can imagine, results in a WHERE canonical_agent_id IS NOT NULL query, while in the order_by call we sort over the joined table using the name column. Looks nice and readable, with a touch of magic.

Of course things are never so simple, so you can build complex queries with the Q class combined with bitwise operators (&, |).

Here’s an example of a simple case-insensitive search for a name containing multiple words:

import re

from django.db.models import Q

def api_list(request):
    term = request.GET.get('search')
    if term:
        words = [ w for w in re.split(r'\W+', term) if w ]
        if words:
            query = Q(name__icontains=words.pop())
            while words:
                query = query & Q(name__icontains=words.pop())
            # logger.debug(query)
            agents = Agent.objects.filter(query).all()

To sum up, the ORM provides everything you need to stay away from the SQL. In fact, it seems like Django doesn't like you doing raw SQL queries.

Mojolicious and Perl

In the Perl world things are a bit different.

The Mojolicious tutorial doesn’t even mention the database. You can use any ORM or no ORM at all, if you prefer so. However, Mojolicious makes the DB handle available everywhere in the application.

You could use DBIx::Connector, DBIx::Class, Mojo::Pg (which was developed with Mojolicious), or whatever you prefer.

For example, to use Mojo::Pg in the main application class:

package MyApp;
use Mojo::Base 'Mojolicious', -signatures;
use Mojo::Pg;
use Data::Dumper::Concise;

sub startup ($self) {
    my $config = $self->plugin('NotYAMLConfig');
    $self->log->info("Starting up with " . Dumper($config));
    $self->helper(pg => sub {
                      state $pg = Mojo::Pg->new($config->{dbi_connection_string});
                  });

In the routes you can call $self->pg to get the database object.

The three approaches I’ve mentioned here are different.

DBIx::Connector is basically a way to get you a safe DBI handle across forks and DB connection failures.

Mojo::Pg gives you the ability to do abstract queries but also gives some convenient methods to get the results. I wouldn't call it an ORM; from a query you usually get hashes, not objects, you don't need to define the database layout, and it won't produce migrations for you, though there is some migration support.

Here’s an example of standard and abstract queries:

sub list_texts ($self) {
    my @all;
    if (my $sid = $self->param('sid')) {
        my $sql = 'SELECT * FROM texts WHERE sid = ? ORDER BY sorting_index';
        @all = $self->pg->db->query($sql, $sid)->hashes->each;
    }
    $self->render(json => { texts => \@all });
}

The query above can be rewritten with an abstract query, using the same module.

@all = $self->pg->db->select(texts => undef,
                             { sid => $sid },
                             { order_by => 'sorting_index' })->hashes->each;

If it's a simple, static query, it's basically a matter of taste: do you prefer to see the SQL or not? The second version is usually nicer if you want to build a different query depending on the parameters, so you add or remove keys in the hashes that map to the query and finally execute it.
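As a hedged sketch of that pattern (the status parameter and column are invented for illustration), conditionally building the query hash might look like this:

sub list_texts ($self) {
    my %where;
    $where{sid}    = $self->param('sid')    if defined $self->param('sid');
    $where{status} = $self->param('status') if defined $self->param('status');

    my @all = $self->pg->db->select(
        texts => undef,          # all columns
        \%where,                 # WHERE clause built from the parameters
        { order_by => 'sorting_index' },
    )->hashes->each;

    $self->render(json => { texts => \@all });
}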

Now, speaking of taste, for complex queries with a lot of joins I honestly prefer to see the SQL query instead of wondering if the abstract one is producing the correct SQL. This is true regardless of the framework. I have the impression that it is faster, safer, and cleaner to have the explicit SQL in the code rather than leaving future developers (including future me) to wonder if the magic is happening or not.

Finally, nothing stops you from using DBIx::Class, which is the best ORM for Perl, even if it’s not exactly light on dependencies.

It’s very versatile, it can build queries of arbitrary complexity, and you usually get objects out of the queries you make. It doesn’t come with an admin dashboard, it doesn’t enforce the data types and it doesn’t ship any validation by default (of course, you can implement that manually). The query syntax is very close to the Mojo::Pg one (which is basically SQL::Abstract).

The gain here is that, like in Django’s ORM, you can attach your methods to the classes representing the rows, so the data definitions live with the code operating on them.
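For comparison with the Django model shown earlier, here is a hedged sketch of what a DBIx::Class result class might look like (the table, column and relationship names are assumptions for illustration):

package MyApp::Schema::Result::User;
use strict;
use warnings;
use base 'DBIx::Class::Core';

__PACKAGE__->table('users');
__PACKAGE__->add_columns(
    id            => { data_type => 'integer', is_auto_increment => 1 },
    email         => { data_type => 'text' },
    site_id       => { data_type => 'integer' },
    expiration    => { data_type => 'timestamp', is_nullable => 1 },
    created       => { data_type => 'timestamp' },
    last_modified => { data_type => 'timestamp' },
);
__PACKAGE__->set_primary_key('id');

# Many-to-one: each user belongs to one site
__PACKAGE__->belongs_to( site => 'MyApp::Schema::Result::Site', 'site_id' );

# Many-to-many: users are affiliated to libraries via a link table
__PACKAGE__->has_many( user_libraries => 'MyApp::Schema::Result::UserLibrary', 'user_id' );
__PACKAGE__->many_to_many( libraries => 'user_libraries', 'library' );

1;

Any methods you add to this class live right next to the data definition, which is exactly the gain described above.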

However, the fact that it builds an object for each result means you’re paying a performance penalty which sometimes can be very high. I think this is a problem common to all ORMs, regardless of the language and framework you’re using.

The difference with Django is that once you have chosen it as your framework, you are basically already committed to its ORM. With Mojolicious and other Perl frameworks (Catalyst, Dancer), you can still make the decision and, at least in theory, change it down the road.

My recommendation would be to keep the model, both code and business logic, decoupled from the web-specific code. This is not really doable with Django, but is fully doable with the Perl frameworks. Just put the DB configuration in a dedicated file and the business code in appropriate classes. Then you should be able to, for example, run a script without loading the web and the whole framework configuration. In this ideal scenario, the web framework just provides the glue between the user and your model.
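As a minimal sketch of that decoupling (MyApp::Schema and the config file location are hypothetical), a cron job could then use the model directly, with no web framework loaded at all:

#!/usr/bin/env perl
use strict;
use warnings;
use YAML::XS 'LoadFile';
use MyApp::Schema;    # hypothetical schema class holding the business logic

# The same dedicated config file the web application reads
my $config = LoadFile('/etc/myapp/db.yml');
my $schema = MyApp::Schema->connect( $config->{dbi_connection_string} );

for my $text ( $schema->resultset('Text')->search({ sid => 42 })->all ) {
    printf "%s\n", $text->id;
}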

Controllers

Routes are defined similarly between Django and Mojolicious. Usually you put the code in a class and then point to it, attaching a name to it so you can reference it elsewhere. The language is different, the style is different, but they essentially do the same thing.

Django:

from django.urls import path
from . import views
urlpatterns = [
    path("api/agents/<int:agent_id>", views.api_agent_view, name="api_agent_view"),
]

The function views.api_agent_view will receive the request with the agent_id as a parameter.

Mojolicious:

sub startup ($self) {
    # ....
    my $r = $self->routes;
    $r->get('/list/:sid')->to('API#list_texts')->name('api_list_texts');
}

The ->to method routes the request to Myapp::Controller::API's list_texts method, which will receive the request with the sid as a parameter.

This is pretty much the core business of every web framework: routing a request to a given function.

Mojolicious also has the ability to chain routes (pretty much taken from Catalyst). The typical use is authorization:

sub startup ($self) {
    ...
    my $r = $self->routes;
    my $api = $r->under('/api/v1', sub ($c) {
        if (($c->req->headers->header('X-API-Key') // '') eq 'testkey') {
            return 1;
        }
        $c->render(text => 'Authentication required!', status => 401);
        return undef;
    });
    $api->get('/check')->to('API#check')->name('api_check');
}

So a request to /api/v1/check will first go through the first block, and the chain will abort if the API key is not set in the header. Otherwise it will proceed to run the API module's check function.

Conclusion

I'm a Perl guy and so I'm a bit biased toward Mojolicious, but I also have a pragmatic approach to programming. Python is widely used — they teach it in schools — while Perl is seen as old-school, if not dead (like all the mature technologies). So, Python could potentially attract more developers to your project, and this is important to consider.

Learning a new language like Python is not a big leap; it and Perl are quite similar despite the different syntax. I’d throw Ruby in the same basket.

Of course both languages provide high quality modules you can use, and these two frameworks are an excellent example.

Building a Simple Web Scraper with Perl

Perl on Medium

Published by Mayur Koshti on Tuesday 04 February 2025 17:33

Extracting Specific Data from HTML Elements

What's new on CPAN - December 2024

perl.com

Published on Monday 03 February 2025 09:00

Welcome to “What’s new on CPAN”, a curated look at last month’s new CPAN uploads for your reading and programming pleasure. Enjoy!

APIs & Apps

  • Automatically generate changelogs based on Git commit history with App::Changelog (OLOOEEZ)
  • Webservice::Sendy::API (OODLER) provides an interface to the Sendy e-mail marketing service, with the purpose of superseding a comparable module that is no longer maintained
  • Manage standup preparation and presentation notes with App::Standup::Diary (SMONFF)
  • Bluesky (SANKO) provides a high-level interface to the Bluesky social network
  • App::datasection (PLICEASE) lets you manage the DATA section of source files from the command line
  • Repeat a command an arbitrary number of times using App::repeat (PERLANCAR)

Config & Devops

Data

Development & Version Control

  • Programmatically update the DATA section of source files with Data::Section::Writer (PLICEASE). Compatible with formats of many DATA section-reading modules
  • Use Test::SpellCheck (PLICEASE) to spellcheck POD within your tests
  • Tie::Hash::DataSection (PLICEASE) lets you access the DATA section of source files via tied hash

Language & International

Science & Mathematics

Web

Proposed Perl Changes (part 2)

Perl Hacks

Published by Dave Cross on Sunday 02 February 2025 17:18

At the end of my last post, we had a structure in place that used GitHub Actions to run a workflow every time a change was committed to the PPC repository. That workflow would rebuild the website and publish it on GitHub Pages.

All that was left for us to do was to write the middle bit – the part that actually takes the contents of the repo and creates the website. This involves writing some Perl.

There are three types of pages that we want to create:

  • The PPCs themselves, which are in Markdown and need to be converted to HTML pages
  • There are a few other pages that describe the PPC process, also in Markdown, which should be converted to HTML
  • An index page which should contain links to the other pages. This page should include a table listing various useful details about the PPCs so visitors can quickly find the ones they want more information on

I’ll be using the Template Toolkit to build the site, with a sprinkling of Bootstrap to make it look half-decent. Because there is a lot of Markdown-to-HTML conversion, I’ll use my Template::Provider::Pandoc module which uses Pandoc to convert templates into different formats.

Parsing PPCs and extracting data

The first thing I did was parse the PPCs themselves, extracting the relevant information. Luckily, each PPC has a "preamble" section containing most of the data we need. I created a basic class to model PPCs, which includes a really hacky parser to extract this information and create an object of the class.
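That class isn't reproduced in the post, but a hedged sketch of the idea (the preamble format and the accessor names are assumptions based on the description) could look like this:

package PPC;
use v5.38;

sub new_from_file ($class, $path) {
    open my $fh, '<', $path or die "Cannot open $path: $!";
    my %data = ( in_path => $path );

    # Really hacky: pull "Key: Value" lines out of the preamble
    while (my $line = <$fh>) {
        last if $line =~ /^## /;    # stop at the first real section
        $data{ lc $1 } = $2 if $line =~ /^\s*(\w+)\s*:\s*(.+?)\s*$/;
    }

    ( $data{out_path} = $path ) =~ s/\.md$/.html/;
    return bless \%data, $class;
}

sub in_path  ($self) { $self->{in_path} }
sub out_path ($self) { $self->{out_path} }
sub as_data  ($self) { return { %$self } }

1;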

Building the site

This class abstracts away a lot of the complexity which means the program that actually builds the site is less than eighty lines of code. Let’s look at it in a bit more detail:

#!/usr/bin/perl

use v5.38;
use JSON;
use File::Copy;
use Template;
use Template::Provider::Pandoc;

use PPC;

There’s nothing unusual in the first few lines. We’re just loading the modules we’re using. Note that use v5.38 automatically enables strict and warnings, so we don’t need to load them explicitly.

my @ppcs;

my $outpath = './web';
my $template_path = [ './ppcs', './docs', './in', './ttlib' ];

Here, we’re just setting up some useful variables. @ppcs will contain the PPC objects that we create. One potential clean-up here is to reduce the size of that list of input directories.

my $base = shift || $outpath;
$base =~ s/^\.//;
$base = "/$base" if $base !~ m|^/|;
$base = "$base/" if $base !~ m|/$|;

This is a slightly messy hack that is used to set a <base> tag in the HTML.

my $provider = Template::Provider::Pandoc->new({
  INCLUDE_PATH => $template_path,
});

my $tt = Template->new({
  LOAD_TEMPLATES => [ $provider ],
  INCLUDE_PATH => $template_path,
  OUTPUT_PATH => $outpath,
  RELATIVE => 1,
  WRAPPER => 'page.tt',
  VARIABLES => {
    base => $base,
  }
});

Here, we're setting up our Template Toolkit processor. Some of you may not be familiar with using a Template provider module. These modules change how TT retrieves templates: if the template has an .md extension, then the text is passed through Pandoc to convert it from Markdown to HTML before it's handed to the template processor. It's slightly annoying that we need to pass the template include path to both the provider and the main template engine.

for (<ppcs/*.md>) {
  my $ppc = PPC->new_from_file($_);
  push @ppcs, $ppc;

  $tt->process($ppc->in_path, {}, $ppc->out_path)
    or warn $tt->error;
}

This is where we process the actual PPCs. For each PPC we find in the /ppcs directory, we create a PPC object, store that in the @ppcs variable and process the PPC document as a template – converting it from Markdown to HTML and writing it to the /web directory.

my $vars = {
  ppcs => \@ppcs,
};

$tt->process('index.tt', $vars, 'index.html')
  or die $tt->error;

Here’s where we process the index.tt file to generate the index.html for our site. Most of the template is made up of a loop over the @ppcs variable to create a table of the PPCs.

for (<docs/*.md>) {
  s|^docs/||;
  my $out = s|\.md|/index.html|r;

  $tt->process($_, {}, $out)
    or die $tt->error;
}

There are a few other documents in the /docs directory describing the PPC process. So in this step, we iterate across the Markdown files in that directory and convert each of them into HTML. Unfortunately, one of them is the template.md which is intended to be used as the template for new PPCs – so it would be handy if that one wasn’t converted to HTML. That’s something to think about in the future.

mkdir 'web/images';
for (<images/*>) {
  copy $_, "web/$_";
}

if (-f 'in/style.css') {
  copy 'in/style.css', 'web/style.css';
}

if (-f 'CNAME') {
  copy 'CNAME', "web/CNAME";
}

We’re on the home straight now. And this section is a bit scrappy. You might recall from the last post that we’re building the website in the /web directory. And there are a few other files that need to be copied into that directory in order that they are then deployed to the web server. So we just copy files. You might not know what a CNAME file is – it’s the file that GitHub Pages uses to tell their web server that you’re serving your website from a custom domain name.

my $json = JSON->new->pretty->canonical->encode([
  map { $_->as_data } @ppcs
]);

open my $json_fh, '>', 'web/ppcs.json' or die $!;

print $json_fh $json;

And, finally, we generate a JSON version of our PPCs and write that file to the /web directory. No-one asked for this, but I thought someone might find this data useful. If you use this for something interesting, I’d love to hear about it.

Other bits and pieces

A few other bits and pieces to be aware of.

  • I use a page wrapper to ensure that every generated page has a consistent look and feel
  • The navigation in the page wrapper is hard-coded to contain links to the pages in /docs. It would make sense to change that so it’s generated from the contents of that directory
  • I used a Javascript project called Simple Datatables to turn the main table into a data table. That means it’s easy to sort, page and filter the data that’s displayed
  • There's a basic hack that hides the email addresses when they appear in the main table. But it's currently not applied to the PPC pages themselves. I've idly contemplated writing a TT filter that would be called something like Template::Filter::RemoveEmailAddresses (a rough sketch follows this list)
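That module name is pure speculation, but a hedged sketch of the filter itself, registered as a plain TT filter, might look like this:

use v5.38;
use Template;

my $tt = Template->new({
    FILTERS => {
        # Replace anything that looks like an email address with a placeholder
        remove_emails => sub ($text) {
            $text =~ s/[\w.+-]+\@[\w-]+(?:\.[\w-]+)+/[email hidden]/gr;
        },
    },
});

A template would then use it as [% ppc.author | remove_emails %].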

In conclusion

But there you are. That’s the system that I knocked together in a few hours a couple of weeks ago. As I mentioned in the last post, the idea was to make the PPC process more transparent to the Perl community outside of the Perl 5 Porters and the Perl Steering Council. I hope it achieves that and, further, I hope it does so in a way that keeps out of people’s way. As soon as someone updates one of the documents in the repository, the workflow will kick in and publish a new version of the website. There are a few grungy corners of the code and there are certainly some improvements that can be made. I’m hoping that once the pull request is merged, people will start proposing new pull requests to add new features.

The post Proposed Perl Changes (part 2) first appeared on Perl Hacks.

(dxxxiii) 6 great CPAN modules released last week

Niceperl

Published by Unknown on Saturday 01 February 2025 21:49

Updates for great CPAN modules released last week. A module is considered great if its favorites count is greater or equal than 12.

  1. App::DBBrowser - Browse SQLite/MySQL/PostgreSQL databases and their tables interactively.
    • Version: 2.423 on 2025-01-27, with 14 votes
    • Previous CPAN version: 2.422 was 4 days before
    • Author: KUERBIS
  2. App::Netdisco - An open source web-based network management tool.
    • Version: 2.082001 on 2025-01-29, with 17 votes
    • Previous CPAN version: 2.081004 was 10 days before
    • Author: OLIVER
  3. Crypt::JWT - JSON Web Token
    • Version: 0.036 on 2025-01-26, with 26 votes
    • Previous CPAN version: 0.035 was 1 year, 3 months, 23 days before
    • Author: MIK
  4. IO::Interactive - Utilities for interactive I/O
    • Version: 1.026 on 2025-01-26, with 16 votes
    • Previous CPAN version: v0.0.3 was 18 years, 11 months, 9 days before
    • Author: BRIANDFOY
  5. Text::CSV_XS - Comma-Separated Values manipulation routines
    • Version: 1.60 on 2025-01-31, with 102 votes
    • Previous CPAN version: 1.59 was 26 days before
    • Author: HMBRAND
  6. Unicode::Tussle - Tom's Unicode Scripts So Life is Easier
    • Version: 1.121 on 2025-01-29, with 13 votes
    • Previous CPAN version: 1.119 was 26 days before
    • Author: BRIANDFOY

Enhancing your MIDI devices with Perl

perl.com

Published on Wednesday 29 January 2025 00:00

This article was originally published at fuzzix.org.

Introduction

These days, even modestly priced MIDI hardware comes stuffed with features. These features may include a clock, sequencer, arpeggiator, chord voicing, Digital Audio Workstation (DAW) integration, and transport control.

Fitting all this into a small device’s form factor may result in some amount of compromise — perhaps modes aren’t easily combined, or some amount of menu diving is required to switch between modes. Your device may even lack the precise functionality you require.

This post will walk through the implementation of a pair of features to augment those found in a MIDI keyboard — an M-Audio Oxygen Pro 61 in this case, though the principle should apply to any device.

Feature 1 : Pedal Tone

A pedal tone (or pedal note, or pedal point) is a sustained single note, over which other potentially dissonant parts are played. A recent video by Polarity Music opened with some exploration of using a pedal tone in Bitwig Studio to compose progressions. In this case, the pedal tone was gated by the keyboard, and the fifth interval of the played note was added resulting in a three note chord for a single played note. This simple setup resulted in some dramatic progressions.

There are, of course, ways to achieve this effect in other DAW software. I was able to use FL Studio’s Patcher to achieve a similar result with two instances of VFX Key Mapper:

FL Studio Patcher with MIDI input routed to FLEX and two instances of VFX Key Mapper

One instance of VFX Key Mapper transposes the incoming note by 7 semitones. The other will replace any incoming note. Alongside the original note, these mappers are routed to FLEX with a Rhodes sample set loaded. It sounds like this (I’m playing just one or two keys at a time here):

A similar method can be used to patch this in other modular environments. In VCV Rack, a pair of quantizers provide the fifth-note offset and pedal tone signals. The original note, the fifth, and the pedal tone are merged and sent to the Voltage Controlled Oscillator (VCO). The gate signal from the keyboard triggers an envelope to open the Voltage Controlled Amplifier (VCA) and Voltage Controlled Filter (VCF).

VCV Rack with the patch described above

This patch is a little less flexible than the FL Studio version — further work is required to support playing multiple notes on the keyboard, for example.

The FL Studio version also has a downside. The played sequence only shows the played notes in the piano roll, not the additional fifth and pedal tone. Tweaking timing and velocity, or adding additional melody is not trivial - any additional notes in the piano roll will play three notes in the Patcher instrument.

If we could coax our MIDI device into producing these additional notes, there would be no need for tricky patching plus we might end up with a more flexible result.

Perl Tone

The approach described here will set up a new software-defined MIDI device which will proxy events from our hardware, while applying any number of filters to events before they are forwarded. These examples will make use of Perl bindings to RtMidi.

We’re going to need a little bit of framework code to get started. While the simplest RtMidi callback examples just sleep to let the RtMidi event loop take over, we may wish to schedule our own events later. I went into some detail previously on Perl, IO::Async, and the RtMidi event loop.

The framework will need to set up an event loop, manage two or more MIDI devices, and store some state to influence decision-making within filter callback functions. Let’s start with those:

use v5.40;
use experimental qw/ class /;

class MidiFilter {
    field $loop       = IO::Async::Loop->new;
    field $midi_ch    = IO::Async::Channel->new;
    field $midi_out   = RtMidiOut->new;
    field $input_name = $ARGV[0];
    field $filters    = {};
    field $stash      = {};

Aside from our event $loop and $midi_out device, there are fields for getting $input_name from the command line, a $stash for communication between callbacks and a store for callback $filters. The callback store will hold callbacks keyed on MIDI event names, e.g. “note_on”. The channel $midi_ch will be used to receive events from the MIDI input controller.

Methods for creating new filters and accessing the stash are as follows:

    method add_filter( $event_type, $action ) {
        push $filters->{ $event_type }->@*, $action;
    }

    method stash( $key, $value = undef ) {
        $stash->{ $key } = $value if defined $value;
        $stash->{ $key };
    }

Adding a filter requires an event type, plus a callback. Callbacks are pushed into $filters for each event type in the order they are declared. If a $value is supplied while accessing the stash, it will be stored for the given $key. The value for the given $key is returned in any case.

Let’s add some methods for sending MIDI events:

    method send( $event ) {
        $midi_out->send_event( $event->@* );
    }

    method delay_send( $delay_time, $event ) {
        $loop->add(
            IO::Async::Timer::Countdown->new(
                delay => $delay_time,
                on_expire => sub { $self->send( $event ) }
            )->start
        )
    }

The send method simply passes the supplied $event to the configured $midi_out device. The delay_send method does the same thing, except it waits for some specified amount of time before sending.

Methods for filtering incoming MIDI events are as follows:

    method _filter_and_forward( $event ) {
        my $event_filters = $filters->{ $event->[0] } // [];

        for my $filter ( $event_filters->@* ) {
            return if $filter->( $self, $event );
        }

        $self->send( $event );
    }

    async method _process_midi_events {
        while ( my $event = await $midi_ch->recv ) {
            $self->_filter_and_forward( $event );
        }
    }

These methods are denoted as “private” via the ancient mechanism of “Add an underscore to the start of the name to indicate that this method shouldn’t be used”. The documentation for Object::Pad (which acts as an experimental playground for perl core class features) details the lexical method feature, which allows for block scoped methods unavailable outside the class. The underscore technique will serve us for now.

The _process_midi_events method awaits receiving a message, passing each message received to _filter_and_forward. The _filter_and_forward method retrieves callbacks for the current event type (The first element of the $event array) and delegates the event to the available callbacks. If no callbacks are available, or if none of the callbacks return true, the event is forwarded to the MIDI output device untouched.

The final pieces are the setup of MIDI devices and the communications channel:

    method _init_out {
        return $midi_out->open_port_by_name( qr/loopmidi/i )
            if ( grep { $^O eq $_ } qw/ MSWin32 cygwin / );

        $midi_out->open_virtual_port( 'Mister Fancy Pants' );
    }

    method go {
        my $midi_rtn = IO::Async::Routine->new(
            channels_out => [ $midi_ch ],
            code => sub {
                my $midi_in = RtMidiIn->new;
                $midi_in->open_port_by_name( qr/$input_name/i ) ||
                    die "Unable to open input device";

                $midi_in->set_callback_decoded(
                    sub( $ts, $msg, $event, $data ) {
                        $midi_ch->send( $event );
                    }
                );

                sleep;
            }
        );
        $loop->add( $midi_rtn );
        $loop->await( $self->_process_midi_events );
    }

    ADJUST {
        $self->_init_out;
    }

The _init_out method takes care of some shortcomings in Windows MIDI, which does not support the creation of virtual ports. On this platform messages will be routed via loopMIDI. On other platforms the virtual MIDI port “RtMidi Output Client:Mister Fancy Pants” is created. The ADJUST block assures this is done during construction of the MidiFilter instance.

The go method creates a routine which instantiates a RtMidi instance, and connects to the hardware MIDI device specified on the command line. A callback is created to send incoming events over the communications channel, then we simply sleep and allow RtMidi’s event loop to take over the routine.

The final step is to await _process_midi_events, which should process events from the hardware until the program is terminated.

Writing Callbacks

Callbacks are responsible for managing the stash, and sending filtered messages to the output device. A callback receives the MidiFilter instance and the incoming event.

In order to implement the pedal tone feature described earlier, we need to take incoming “note on” events and transform them into three “note on” events, then send these to the output MIDI device. A similar filter is needed for “note off” — all three notes must be stopped after being played:

use constant PEDAL => 55; # G below middle C

sub pedal_notes( $note ) {
    ( PEDAL, $note, $note + 7 );
}

sub pedal_tone( $mf, $event ) {
    my ( $ev, $channel, $note, $vel ) = $event->@*;
    $mf->send( [ $ev, $channel, $_, $vel ] ) for pedal_notes( $note );
    true;
}

my $mf = MidiFilter->new;

$mf->add_filter( note_on  => \&pedal_tone );
$mf->add_filter( note_off => \&pedal_tone );

$mf->go;

We start by setting a constant containing a MIDI note value for the pedal tone. The sub pedal_notes returns this pedal tone, the played note, and its fifth. The callback function pedal_tone sends a MIDI message to output for each of the notes returned by pedal_notes. Note the callback yields true in order to prevent falling through to the default action. The callback function is applied to both the “note on” and “note off” events. We finish by calling the go method of our MidiFilter instance in order to await and process incoming messages from the keyboard.

The last step is to run the script:

$ ./midi-filter.pl ^oxy

Rather than specify a fully qualified device name, we can pass in a regex which should match any device whose name starts with “oxy” - there is only one match on my system, the Oxygen Pro.

The device “RtMidi Output Client:Mister Fancy Pants” or “loopMIDI”, depending on your platform, can now be opened in the DAW to receive played notes routed through the pedal tone filter. This filter is functionally equivalent to the FL Studio Patcher patch from earlier, with the added benefit of being DAW-agnostic. If recording a sequence from this setup, all notes will be shown in the piano roll.

Feature 2 : Controller Banks

The Oxygen Pro has four “banks” or sets of controls. Each bank can have different assignments or behaviour for the knobs, keys, sliders, and pads.

A problem with this feature is that there is limited feedback when switching banks - it's not always visible on screen, depending on the last feature used. Switching banks does not affect the keyboard. Also, perhaps 4 banks isn't enough.

A simpler version of this feature might be to use the pads to select the bank, where the bank just sets the MIDI channel for all future events. There are 16 pads on the device, one for each of the 16 channels. It should be more obvious which bank (or channel) was last selected, and if not, just select it again.

This can also be applied to the keyboard by defining callbacks for “note on” and “note off” (or rather, modifying the existing ones). For this device, we also need callbacks for “pitch wheel change” and “channel aftertouch”. The callback for “control change” should handle the mod wheel without additional special treatment.

The pads on this device are set up to send notes on channel 10, usually reserved for drums. Watching for specific notes incoming on channel 10, and stashing the corresponding channel should be enough to allow other callbacks to route events appropriately:

sub set_channel( $mf, $event ) {
    my ( $ev, $channel, $note, $vel ) = $event->@*;
    return false unless $channel == 9;

    my $new_channel = $note - 36;
    $mf->stash( channel => $new_channel );
    true;
}

$mf->add_filter( note_on  => \&set_channel );
$mf->add_filter( note_on  => \&pedal_tone );
$mf->add_filter( note_off => \&set_channel );
$mf->add_filter( note_off => \&pedal_tone );

If the event channel sent to set_channel is not 10 (or rather 9, as we are working with zero-indexed values) we return false, allowing the filter to fall through to the next callback. Otherwise, the channel is stashed and we stop processing further callbacks. As the pad notes are numbered 36 to 51, the channel can be derived by subtracting 36 from the incoming note.

This callback needs to be applied to both “note on” and “note off” events — remember, there is an existing “note off” callback which will erroneously generate three “note off” events unless intercepted. The order of callbacks is also important. If pedal_tone were first, it would prevent set_channel from happening at all.

We can now retrieve the stashed channel in pedal_tone:

sub pedal_tone( $mf, $event ) {
    my ( $ev, $channel, $note, $vel ) = $event->@*;
    $channel = $mf->stash( 'channel' ) // $channel;
    $mf->send( [ $ev, $channel, $_, $vel ] ) for pedal_notes( $note );
    true;
}

The final piece of this feature is to route some additional event types to the selected channel:

sub route_to_channel( $mf, $event ) {
    my ( $ev, $channel, @params ) = $event->@*;
    $channel = $mf->stash( 'channel' ) // $channel;
    $mf->send( [ $ev, $channel, @params ] );
    true;
}

$mf->add_filter( pitch_wheel_change  => \&route_to_channel );
$mf->add_filter( control_change      => \&route_to_channel );
$mf->add_filter( channel_after_touch => \&route_to_channel );

We can now have different patches respond to different channels, and control each patch with the entire MIDI controller (except the pads, of course).

Pickup

You may have spotted a problem with the bank feature. Imagine we are on bank 1 and we set knob 1 to a low value. We then switch to bank 2, and turn knob 1 to a high value. When we switch back to bank 1 and turn the knob, the control will jump to the new high value.

A feature called “pickup” (or “pick up”) allows for bank switching by only engaging the control for knob 1, bank 1 when the knob passes its previous value. That is, the control only starts changing again when the knob goes beyond its previous low value.

Pickup could be implemented in our filters by stashing the last value for each control/channel combination. This would not account for knob/channel combinations which were never touched - large jumps in control changes would still be possible, with no way to prevent them. One would need to set initial values by tweaking all controls on all channels before beginning a performance.
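If you wanted to experiment anyway, here is a hedged sketch of such a filter in the same MidiFilter style as the other callbacks (the engagement threshold is simplified, and set_channel would also need to reset the engaged flags on a bank switch):

sub pickup( $mf, $event ) {
    my ( $ev, $channel, $controller, $value ) = $event->@*;
    $channel = $mf->stash( 'channel' ) // $channel;
    my $key  = "cc-$channel-$controller";
    my $last = $mf->stash( $key );

    # Swallow events until the knob comes close to its stashed value,
    # then stay engaged and track it as normal
    if ( defined $last && !$mf->stash( "engaged-$key" ) ) {
        return true if abs( $value - $last ) > 2;
        $mf->stash( "engaged-$key" => true );
    }

    $mf->stash( $key => $value );
    $mf->send( [ $ev, $channel, $controller, $value ] );
    true;
}

# Would need to be registered ahead of route_to_channel
$mf->add_filter( control_change => \&pickup );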

Many DAWs and synths support pickup, and it is better handled there rather than implementing a half-baked and inconsistent solution here.

Feature 1a: Strum

So far we have not taken complete advantage of our event loop. You might remember we implemented a delay_send method which accepts a delay time alongside the event to be sent.

We can exploit this to add some expressiveness (of a somewhat robotic variety) to the pedal tone callback:

use constant STRUM_DELAY => 0.05; # seconds

sub pedal_tone( $mf, $event ) {
    my ( $ev, $channel, $note, $vel ) = $event->@*;
    $channel = $mf->stash( 'channel' ) // $channel;
    my @notes = pedal_notes( $note );

    $mf->send( [ $ev, $channel, shift @notes, $vel ] );

    my $delay_time = 0;
    for my $note ( @notes ) {
        $delay_time += STRUM_DELAY;
        $mf->delay_send( $delay_time, [ $ev, $channel, $note, $vel ] );
    }
    true;
}

We now store the notes and send the first immediately. The remaining notes are sent with an increasing delay. The delay_send method will schedule the notes and return immediately, allowing further events to be processed.

Scheduling the “note off” events is also a good idea. Imagine a very quick keypress on the keyboard. If the keyboard note off happens before we finish sending the scheduled notes, sending all “note off” events instantaneously would leave some scheduled notes ringing out. Scheduling “note off” events with the same cadence as the “note on” events should prevent this. That is, the same callback can continue to service both event types.

With that change, playing a single key at a time sounds like this:

Demo Patch

This VCV Rack patch should demonstrate the complete set of features built in this post. On the right is an additive voice which responds to MIDI channel 2. The mod wheel is patched to control feedback, which should influence the brightness of the sound.

The left side is a typical subtractive patch controlled by channel 3, with an envelope controlling a VCA and VCF to shape incoming sawtooths. The mod wheel is patched to allow a Low-Frequency Oscillator (LFO) to frequency modulate the VCO for a vibrato effect.

VCV Rack patch with FM OP controlled by channel 2 and a subtractive patch controlled by channel 3

This is what it sounds like - we first hear the additive patch on channel 2, then the subtractive one on channel 3. Switching channels is as simple as pushing the respective pad on the controller:

Not very exciting, I know — it’s just to demonstrate the principle.

Keen eyes may have spotted an issue with the bank switching callback. When switching to channel 10, playing keyboard keys which overlap with those assigned to the pads may dump you unexpectedly onto a different channel! I will leave resolving this as an exercise for the reader — perhaps one of the pads could be put to another use.

Latency

While I haven’t measured latency of this project specifically, previous experiments with async processing of MIDI events in Perl showed a latency of a fraction of a millisecond. I expect the system described in this post to have a similar profile.

Source Code

There is a gist with the complete source of the MidiFilter project.

It’s also included below:

#!/usr/bin/env perl

# There is currently an issue with native callbacks and threaded perls, which leads to a crash.
# As of Jan 2025, all the available pre-built perls I am aware of for Windows are threaded.
# I was able to work around this by building an unthreaded perl with cygwin / perlbrew... but
# you might want to just try this on Linux or Mac instead :)

use v5.40;
use experimental qw/ class /;

class MidiFilter {
    use IO::Async::Loop;
    use IO::Async::Channel;
    use IO::Async::Routine;
    use IO::Async::Timer::Countdown;
    use Future::AsyncAwait;
    use MIDI::RtMidi::FFI::Device;

    field $loop       = IO::Async::Loop->new;
    field $midi_ch    = IO::Async::Channel->new;
    field $midi_out   = RtMidiOut->new;
    field $input_name = $ARGV[0];
    field $filters    = {};
    field $stash      = {};

    method _init_out {
        return $midi_out->open_port_by_name( qr/loopmidi/i )
            if ( grep { $^O eq $_ } qw/ MSWin32 cygwin / );

        $midi_out->open_virtual_port( 'Mister Fancy Pants' );
    }

    method add_filter( $event_type, $action ) {
        push $filters->{ $event_type }->@*, $action;
    }

    method stash( $key, $value = undef ) {
        $stash->{ $key } = $value if defined $value;
        $stash->{ $key };
    }

    method send( $event ) {
        $midi_out->send_event( $event->@* );
    }

    method delay_send( $dt, $event ) {
        $loop->add(
            IO::Async::Timer::Countdown->new(
                delay => $dt,
                on_expire => sub { $self->send( $event ) }
            )->start
        )
    }

    method _filter_and_forward( $event ) {
        my $event_filters = $filters->{ $event->[0] } // [];

        for my $filter ( $event_filters->@* ) {
            return if $filter->( $self, $event );
        }

        $self->send( $event );
    }

    async method _process_midi_events {
        while ( my $event = await $midi_ch->recv ) {
            $self->_filter_and_forward( $event );
        }
    }

    method go {
        my $midi_rtn = IO::Async::Routine->new(
            channels_out => [ $midi_ch ],
            code => sub {
                my $midi_in = RtMidiIn->new;
                $midi_in->open_port_by_name( qr/$input_name/i ) ||
                    die "Unable to open input device";

                $midi_in->set_callback_decoded(
                    sub( $ts, $msg, $event, $data ) {
                        $midi_ch->send( $event );
                    }
                );

                sleep;
            }
        );
        $loop->add( $midi_rtn );
        $loop->await( $self->_process_midi_events );
    }

    ADJUST {
        $self->_init_out;
    }
}

use constant PEDAL => 55; # G below middle C
use constant STRUM_DELAY => 0.05; # seconds

sub pedal_notes( $note ) {
    ( PEDAL, $note, $note + 7 );
}

sub pedal_tone( $mf, $event ) {
    my ( $ev, $channel, $note, $vel ) = $event->@*;
    $channel = $mf->stash( 'channel' ) // $channel;
    my @notes = pedal_notes( $note );

    $mf->send( [ $ev, $channel, shift @notes, $vel ] );

    my $dt = 0;
    for my $note ( @notes ) {
        $dt += STRUM_DELAY;
        $mf->delay_send( $dt, [ $ev, $channel, $note, $vel ] );
    }
    true;
}

sub set_channel( $mf, $event ) {
    my ( $ev, $channel, $note, $vel ) = $event->@*;
    return false unless $channel == 9;

    my $new_channel = $note - 36;
    $mf->stash( channel => $new_channel );
    true;
}

sub route_to_channel( $mf, $event ) {
    my ( $ev, $channel, @params ) = $event->@*;
    $channel = $mf->stash( 'channel' ) // $channel;
    $mf->send( [ $ev, $channel, @params ] );
    true;
}

my $mf = MidiFilter->new;

$mf->add_filter( note_on  => \&set_channel );
$mf->add_filter( note_on  => \&pedal_tone );
$mf->add_filter( note_off => \&set_channel );
$mf->add_filter( note_off => \&pedal_tone );

$mf->add_filter( pitch_wheel_change  => \&route_to_channel );
$mf->add_filter( control_change      => \&route_to_channel );
$mf->add_filter( channel_after_touch => \&route_to_channel );

$mf->go;

BEGIN {
    $ENV{PERL_FUTURE_DEBUG} = true;
}

Conclusion

After describing some of the shortcomings of a given MIDI controller, and an approach for adding to a performance within a DAW, we walked through the implementation of a framework to proxy a MIDI controller’s facilities through software-defined filters.

The filters themselves are implemented as simple callbacks which may decide to store data for later use, change the parameters of the incoming message, forward new messages to the virtual hardware proxy device, and/or cede control to further callbacks in a chain.

Callbacks are attached to MIDI event types and a single callback function may be suitable to attach to multiple event types.

We took a look at some simple functionality to build upon the device — a filter which turns a single key played into a strummed chord with a pedal tone, and a bank-switcher which sets the channel of all further events from the hardware device.

These simple examples served to demonstrate the principle, but the practical limit to this approach is your own imagination. My imagination is limited, but some next steps might be to add “humanising” random fluctuations to sequences, or perhaps extending the system to combine the inputs of multiple hardware devices into one software-defined device with advanced and complex facilities. If your device has a DAW mode, you may be able to implement visual feedback for the actions and state of the virtual device. You could also coerce non-MIDI devices, e.g. Gamepads, into sending MIDI messages.

Proposed Perl Changes

Perl Hacks

Published by Dave Cross on Sunday 26 January 2025 16:36

“Many thanks to Dave Cross for providing an initial implementation of a PPC index page.” — Perl Steering Council meeting #177

Maybe I should explain that in a little more detail. There’s a lot of detail, so it will take a couple of blog posts.

About two weeks ago, I got a message on Slack from Philippe Bruhat, a member of the Perl Steering Council. He asked if I would have time to look into building a simple static site based on the GitHub repo that stores the PPCs that are driving a lot of Perl’s development. The PSC thought that reading these important documents on a GitHub page wasn’t a great user experience and that turning it into a website might lead to more people reading the proposals and, hence, getting involved in discussions about them.

I guess they had thought of me as I’ve written a bit about GitHub Pages and GitHub Actions over the last few years and these were exactly the technologies that would be useful in this project. In fact, I have already created a website that fulfills a similar role for the PSC meeting minutes – and I know they know about that site because they’ve been maintaining it themselves for several months.

I was about to start working with a new client, but I had a spare day, so I said I’d be happy to help. And the following day, I set to work.

Reviewing the situation

I started by looking at what was in the repo.

All of these documents were in Markdown format. The PPCs seemed to have a pretty standardised format.

Setting a target

Next, I listed what would be essential parts of the new site.

  • An index page containing a list of the PPCs – which links to a page for each of the PPCs
  • The PPCs, converted to HTML
  • The other documents, also converted to HTML
  • The site should be automatically rebuilt whenever a change is made to any of the input files

This is exactly the kind of use case that a combination of GitHub Pages and GitHub Actions is perfect for. Perhaps it’s worth briefly describing what those two GitHub features are.

Introducing GitHub Pages

GitHub Pages is a way to run a website from a GitHub repo. The feature was initially introduced to make it easy to run a project website alongside your GitHub repo – with the files that make up the website being stored in the same repo as the rest of your code. But, as often happens with useful features, people have been using the feature for all sorts of websites. The only real restriction is that it only supports static sites – you cannot use GitHub’s servers to run any kind of back-end processing.

The simplest way to run a GitHub Pages website is to construct it manually, put the HTML, CSS and other files into a directory inside your repo called /docs, commit those files and go to the “Settings -> Pages” settings for your repo to turn on Pages for the repo. Within minutes your site will appear at the address USERNAME.github.io/REPONAME. Almost no-one uses that approach.

The most common approach is to use a static site builder to build your website. The most popular is Jekyll – which is baked into the GitHub Pages build/deploy cycle. You edit Markdown files and some config files. Then each time you commit a change to the repo, GitHub will automatically run Jekyll over your input files, generate your website and deploy that to its web servers. We’re not going to do that.

We’ll use the approach I’ve used for many GitHub Pages sites. We’ll use GitHub Actions to do the equivalent of the “running Jekyll over your input files to generate your website” step. This gives us more flexibility and, in particular, allows us to generate the website using Perl.

Introducing GitHub Actions

GitHub Actions is another feature that was introduced with one use case in mind but which has expanded to be used for an incredible range of ideas. It was originally intended for CI/CD – a replacement for systems like Jenkins or Travis CI – but that only accounts for about half of the things I use it for.

A GitHub Actions run starts in response to various triggers. You can then run pretty much any code you want on a virtual machine, generating useful reports, updating databases, releasing code or (as in this case) generating a website.

GitHub Actions is a huge subject (luckily, there’s a book!). We’re only going to touch on one particular way of using it. Our workflow will be:

  • Wait for a commit to the repo
  • Then regenerate the website
  • And publish it to the GitHub Pages web servers

Making a start

Let’s make a start on creating a GitHub Actions workflow to deal with this. Workflows are defined in YAML files that live in the .github/workflows directory in our repo. So I created the relevant directories and a file called buildsite.yml.

There will be various sections in this file. We’ll start simply by defining a name for this workflow:

name: Generate website

The next section tells GitHub when to trigger this workflow. We want to run it when a commit is pushed to the “main” branch. We’ll also add the “workflow_dispatch” trigger, which allows us to manually trigger the workflow – it adds a button to the workflow’s page inside the repo:

on:
  push:
    branches: 'main'
  workflow_dispatch:

The main part of the workflow definition is the next section – the one that defines the jobs and the individual steps within them. The start of that section looks like this:

jobs:
  build:
    runs-on: ubuntu-latest
    container: perl:latest

    steps:
    - name: Perl version
      run: perl -v

    - name: Checkout
      uses: actions/checkout@v4

The “build” there is the name of the first job. You can name jobs anything you like – well anything that can be the name of a valid YAML key. We then define the working environment for this job – we’re using a Ubuntu virtual machine and on that, we’re going to download and run the latest Perl container from the Docker Hub.

The first step isn’t strictly necessary, but I like to have a simple but useful step to ensure that everything is working. This one just prints the Perl version to the workflow log. The second step is one you’ll see in just about every GitHub Actions workflow. It uses a standard, prepackaged library (called an “action”) to clone the repo to the container.

The rest of this job will make much more sense once I’ve described the actual build process in my next post. But here it is for completeness:

    - name: Install pandoc and cpanm
      run: apt-get update && apt-get install -y pandoc cpanminus

    - name: Install modules
      run: |
        cpanm --installdeps --notest .

    - name: Get repo name into environment
      run: |
        echo "REPO_NAME=${GITHUB_REPOSITORY#$GITHUB_REPOSITORY_OWNER/}" >> $GITHUB_ENV

    - name: Create pages
      env:
        PERL5LIB: lib
        GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
      run: |
        mkdir -p web
        perl bin/build $REPO_NAME

    - name: Update pages artifact
      uses: actions/upload-pages-artifact@v3
      with:
        path: web/

Most of the magic (and all of the Perl – for those of you who were wondering) happens in the “Create pages” step. If you can’t wait until the next post, you can find the build program and the class it uses in the repo.

But for now, let’s skim over that and look at the final step in this job. That uses another pre-packaged action to build an artifact (which is just a tarball) which the next job will deploy to the GitHub Pages web server. You can pass it the name of a directory and it will build the artifact from that directory. So you can see that we’ll be building the web pages in the web/ directory.

The second (and final) job is the one that actually carries out the deployment. It looks like this:

  deploy:
    needs: build
    permissions:
      pages: write
      id-token: write
    environment:
      name: github-pages
      url: ${{ steps.deployment.outputs.page_url }}
    runs-on: ubuntu-latest
    steps:
      - name: Deploy to GitHub Pages
        id: deployment
        uses: actions/deploy-pages@v4

It uses another standard, pre-packaged action and most of the code here is configuration. One interesting line is the “needs” key. That tells the workflow engine that the “build” job needs to have completed successfully before this job can be run.

But once it has run, the contents of our web/ directory will be on the GitHub Pages web server and available for our adoring public to read.

All that is left is for us to write the steps that will generate the website. And that is what we’ll be covering in my next post.

Oh, and if you want to preview the site itself, it’s at https://davorg.dev/PPCs/ and there’s an active pull request to merge it into the main repo.

The post Proposed Perl Changes first appeared on Perl Hacks.

(dxxxii) 15 great CPAN modules released last week

Niceperl

Published by Unknown on Saturday 25 January 2025 23:32

Updates for great CPAN modules released last week. A module is considered great if its favorites count is greater than or equal to 12.

  1. App::DBBrowser - Browse SQLite/MySQL/PostgreSQL databases and their tables interactively.
    • Version: 2.422 on 2025-01-23, with 14 votes
    • Previous CPAN version: 2.421 was 14 days before
    • Author: KUERBIS
  2. App::Netdisco - An open source web-based network management tool.
    • Version: 2.081004 on 2025-01-19, with 17 votes
    • Previous CPAN version: 2.081003 was 19 days before
    • Author: OLIVER
  3. DBI - Database independent interface for Perl
    • Version: 1.647 on 2025-01-20, with 275 votes
    • Previous CPAN version: 1.646 was 9 days before
    • Author: HMBRAND
  4. Function::Parameters - define functions and methods with parameter lists ("subroutine signatures")
    • Version: 2.002005 on 2025-01-19, with 60 votes
    • Previous CPAN version: 2.002004 was 1 year, 6 months, 4 days before
    • Author: MAUKE
  5. Math::BigInt - Pure Perl module to test Math::BigInt with scalars
    • Version: 2.003004 on 2025-01-23, with 13 votes
    • Previous CPAN version: 2.003003 was 7 months, 27 days before
    • Author: PJACKLAM
  6. Module::CoreList - what modules shipped with versions of perl
    • Version: 5.20250120 on 2025-01-20, with 43 votes
    • Previous CPAN version: 5.20241220 was 1 month before
    • Author: BINGOS
  7. Net::Curl - Perl interface for libcurl
    • Version: 0.57 on 2025-01-22, with 18 votes
    • Previous CPAN version: 0.56 was 9 months, 21 days before
    • Author: SYP
  8. PDL - Perl Data Language
    • Version: 2.099 on 2025-01-23, with 57 votes
    • Previous CPAN version: 2.098 was 20 days before
    • Author: ETJ
  9. perl - The Perl 5 language interpreter
    • Version: 5.040001 on 2025-01-18, with 427 votes
    • Previous CPAN version: 5.40.1 was 2 days before
    • Author: SHAY
  10. Spreadsheet::ParseXLSX - parse XLSX files
    • Version: 0.36 on 2025-01-24, with 19 votes
    • Previous CPAN version: 0.35 was 10 months, 5 days before
    • Author: NUDDLEGG
  11. SPVM - The SPVM Language
    • Version: 0.990042 on 2025-01-22, with 34 votes
    • Previous CPAN version: 0.990039 was 5 days before
    • Author: KIMOTO
  12. Syntax::Construct - Explicitly state which non-feature constructs are used in the code.
    • Version: 1.040 on 2025-01-20, with 14 votes
    • Previous CPAN version: 1.038 was 3 months, 19 days before
    • Author: CHOROBA
  13. Test::Simple - Basic utilities for writing tests.
    • Version: 1.302209 on 2025-01-22, with 190 votes
    • Previous CPAN version: 1.302207 was 25 days before
    • Author: EXODIST
  14. Test2::Harness - A new and improved test harness with better Test2 integration.
    • Version: 1.000156 on 2025-01-22, with 17 votes
    • Previous CPAN version: 1.000155 was 1 year, 3 months, 19 days before
    • Author: EXODIST
  15. YAML::PP - YAML 1.2 Processor
    • Version: v0.38.1 on 2025-01-24, with 17 votes
    • Previous CPAN version: v0.38.0 was 11 months, 26 days before
    • Author: TINITA

(dxcix) metacpan weekly report

Niceperl

Published by Unknown on Saturday 25 January 2025 23:29

This is the weekly favourites list of CPAN distributions. Votes count: 43

This week there isn't any remarkable distribution

Build date: 2025/01/25 22:28:31 GMT



Profiling Peak DRAM Use in R With Perl - Part 2

Killing-It-with-PERL

Published on Sunday 19 January 2025 00:00

In this second part we implement the solution that was outlined at the end of Part 1:

  1. utilize a Perl application that probes the operating system in real time for the RSS (Resident Set Size), i.e. the DRAM footprint of an application
  2. fire the application from within R as a separate process, provide it with the PID (Process ID) of the R session and put it in the background
  3. run the long, memory-hungry calculation
  4. upon the end of the calculation, kill the Perl application and obtain the DRAM usage for use within R

The Perl application that implements the first step is straightforward:

  1. Obtain the command line arguments: the PID of the R session, a monitoring interval and a temporary file for the Perl application to write out its own PID
  2. Obtain the initial memory usage by calling the ps command line utility, using the excellent (and safe, as it bypasses the shell!) IPC::System::Simple MetaCPAN module
  3. Register an event handler that will write out the peak change (“$max_delta”) of the RSS from the initial value, as well as the initial value
  4. Go into an infinite loop, in which
    • ps is probed for the current value of the RSS,
    • the initial value is subtracted from it,
    • $max_delta is updated if the current delta over the baseline is larger than the maximum seen so far,
    • the loop goes to sleep for a user-defined period of time,
    • and reawakens at the top of the while loop
#!/usr/bin/perl

# Name: monitor_memory.pl
# Purpose: Registers peak DRAM usage of a process
# Usage: perl monitor_memory.pl <PID> <interval_in_seconds> <pidfile>
# Example: perl monitor_memory.pl 12345 1 /tmp/pidfile
# Date: January 19th 2025
# Author: Christos Argyropoulos
# License: MIT https://mit-license.org/

use strict;
use warnings;
use Time::HiRes qw(nanosleep); # high res waiting
use IPC::System::Simple qw(capturex); # safe capture that bypasses the shell

# Process the command line
my ( $pid, $interval_sec, $pidfile ) = @ARGV;
die "Usage: $0 <PID> <interval_in_seconds> <pidfile>\n"
  unless defined $pid && defined $interval_sec && defined $pidfile;

# Write our own PID that R will use to kill the Perl application
open( my $fh, '>', $pidfile ) or die "Can't write to $pidfile: $!";
print $fh "$$\n";
close $fh;

# Obtain initial memory usage
my $interval    = $interval_sec * 1_000_000_000;
my $initial_mem = capturex('ps', 'o', 'rss=', 'p', $pid);
chomp($initial_mem);
my $max_delta = 0;

# Register the INT and TERM signal handlers to print peak and initial DRAM usage
$SIG{INT} = $SIG{TERM} = sub {
    print "$max_delta\t$initial_mem\n";
    exit 0;
};


# Obtain the RSS, store the maximum delta up to this point, sleep and re-awaken
while (1) {
    my $current = capturex('ps', 'o', 'rss=', 'p', $pid);
    chomp($current);
    my $delta = $current - $initial_mem;
    $max_delta = $delta if $delta > $max_delta;
    nanosleep($interval);
}

Having delegated the peak DRAM monitoring to Perl, R is now free to obtain execution timings and memory allocation using Henrik Bengtsson’s excellent profmem package. The package’s vignette may be found here and explains what it does and how this works (emphasis is mine):

The profmem() function uses the utils::Rprofmem() function for logging memory allocation events to a temporary file. The logged events are parsed and returned as an in-memory R object in a format that is convenient to work with. All memory allocations that are done via the native allocVector3() part of R’s native API are logged, which means that nearly all memory allocations are logged. Any objects allocated this way are automatically deallocated by R’s garbage collector at some point. Garbage collection events are not logged by profmem(). Allocations not logged are those done by non-R native libraries or R packages that use native code Calloc() / Free() for internal objects. Such objects are not handled by the R garbage collector.

Based on this description, monitor_memory.pl and profmem are complementary: the former will log peak running memory use, while the latter will log all allocations, and the combination will provide the best of both worlds.

The tricky part in the implementation is how to ensure that one does not end up with an orphan Perl process when the long calculation throws an error, and thus avoid having to kill the orphan somehow (Unix terminology is so violent). However, the Perl side should not have to worry about the prospect of becoming an orphan, or detecting it has become one. Killing the Perl process (orphan or not) is a task for R, but Perl has to help out by providing R with the PID of the Perl application. Completion of the communication loop is a necessary, but not sufficient, condition for culling the orphan, and R has to do its part when implementing the 2nd and 4th steps of the top-level logic. Here is where R’s tryCatch-finally facilities get to shine:

# Benchmarking allocations, peak DRAM use and execution timing of an expression in R
bench_time_mem<-function(x) {
	gc(reset=FALSE,verbose=FALSE)  ## force a gc here
	pidfile <- tempfile() # temp file to store Perl's PID
	outfile <- tempfile() # stdout redirection for Perl
	pid <- Sys.getpid()   # get R's PID
	retval<-NULL
	system2("./monitor_memory.pl", 
		c(pid,step_of_monitor,pidfile),wait=FALSE,stdout=outfile)
	Sys.sleep(0.2)  # Wait for PID file to be written
	monitor_pid <- readLines(pidfile)[1] # Get Perl's PID
	tryCatch (
		expr = {
			mem<-profmem(time<-system.time(x,gcFirst = FALSE))
			rettime<-c(time)
			names(rettime)<-names(time)
			retval<-c(time,"R_gc_alloc" = sum(mem$bytes,na.rm=T))
		}, # execute R expression, get timing and allocations
		finally = {
			system2("kill",c("-TERM", monitor_pid)) # kill the ? orphan
			Sys.sleep(0.2) # Wait for Perl to finish logging
			memstats<-read.csv(outfile,sep="\t",
				header=FALSE) # get memory statistics
			unlink(c(pidfile,outfile)) # cleanup files
			retval<-c(retval ,
				  "delta"= memstats[1,1]*1024,
				"initial"= memstats[1,2]*1024
				)
		}
	)

	return(retval)
}

The logic of the R partner is fairly straightforward:

  1. Obtain the PID of the R process and a temporary file location for monitor_memory.pl to store its PID
  2. Launch monitor_memory.pl and put it in the background
  3. Proceed to execute the user-provided expression in the tryCatch block and obtain allocations and execution timings with mem<-profmem(time<-system.time(x,gcFirst = FALSE))
  4. Sum the total memory allocated during the execution of the user expression
  5. Kill the Perl process in the finally block. This code will be executed even if errors are encountered in the user’s expression, and hence no orphan will be allowed to live
  6. Package the DRAM usage into the return value vector and send it to the user, along with allocations and timings.

Let’s look at a complete example, written as an R script with command line parsing, to see what we have achieved. For this example, we will use the two sequential allocations and the one large allocation we considered in Part 1:

# Script Name: profile_time_memory.R
# Purpose: illustrate how to profile time and memory usage in R
# Usage: Rscript profile_time_memory.R --sleep_time 1 --size 1E6 --monitor_step 0.001
# Date: January 19th 2025
# Author: Christos Argyropoulos
# License: MIT https://mit-license.org/

library(profmem)
library(getopt)

# Define command line options
spec <- matrix(c(
  'sleep_time', 's', 1, "numeric", "Sleep time in seconds [default 1]",
  'size', 'n', 1, "numeric", "Size of the problem N [default 1E6]",
  'monitor_step', 'm', 1, "numeric", "Monitoring step in seconds [default 0.001]",
  'help', 'h', 0, "logical", "Show this help message and exit"
), byrow = TRUE, ncol = 5)

# Parse command line options
opt <- getopt(spec)


# Assign command line arguments to variables
if(is.null(opt$sleep_time)) opt$sleep_time<-1
if(is.null(opt$size)) opt$size<-1E6
if(is.null(opt$monitor_step)) opt$monitor_step<-0.001

sleep_time <- opt$sleep_time
N <- opt$size
step_of_monitor <- opt$monitor_step

busy_wait <- function(seconds) {
  start_time <- Sys.time()
  stop_time <- Sys.time()
  while (difftime(stop_time, start_time, units = "secs") < seconds) {
    stop_time <- Sys.time()
  }
}

bench_time_mem<-function(x) {
	gc(reset=FALSE,verbose=FALSE)  ## force a gc here
	pidfile <- tempfile() # temp file to store Perl's PID
	outfile <- tempfile() # stdout redirection for Perl
	pid <- Sys.getpid()   # get R's PID
	retval<-NULL
	system2("./monitor_memory.pl", 
		c(pid,step_of_monitor,pidfile),wait=FALSE,stdout=outfile)
	Sys.sleep(0.2)  # Wait for PID file to be written
	monitor_pid <- readLines(pidfile)[1] # Get Perl's PID
	tryCatch (
		expr = {
			mem<-profmem(time<-system.time(x,gcFirst = FALSE))
			rettime<-c(time)
			names(rettime)<-names(time)
			retval<-c(time,"R_gc_alloc" = sum(mem$bytes,na.rm=T))
		}, # execute R expression, get timing and allocations
		finally = {
			system2("kill",c("-TERM", monitor_pid)) # kill the ? orphan
			Sys.sleep(0.2) # Wait for Perl to finish logging
			memstats<-read.csv(outfile,sep="\t",
				header=FALSE) # get memory statistics
			unlink(c(pidfile,outfile)) #cleanup files
			retval<-c(retval ,
				  "delta"= memstats[1,1]*1024,
				"initial"= memstats[1,2]*1024
				)
		}
	)

	return(retval)
}



val<-bench_time_mem(
	{  	
		busy_wait(sleep_time);
		cat("\nAllocating\n")
		q<-rnorm(N);
		busy_wait(sleep_time);
		rm(q);gc(reset=F)
		q2<-rnorm(N);
		busy_wait(sleep_time);
		rm(q2);gc(reset=F)
	}
)

valcp<-bench_time_mem(
	{  
		busy_wait(sleep_time);
		cat("\nAllocating\n")
		q<-rnorm(N*2);
		busy_wait(sleep_time*2);
	}
)




cat("\n Allocating a double vector of length N = ",N," in R")
cat("\n            with busy waiting period  T = ",sleep_time," seconds")
cat("\n            and monitoring memory every : ",step_of_monitor," seconds")
cat("\n Will allocate N->gc->N and then 2N at once\n")
cat(paste(rep("=",65),collapse=""))
cat("\n Measured vs alloc'd in R  (2N at once): ", valcp["delta"]/valcp["R_gc_alloc"])
cat("\n Measured vs alloc'd in R  (N-> gc ->N): ", val["delta"]/val["R_gc_alloc"])
cat("\n")
cat("\n Performance of the two allocations : ")
cat("\n N-> gc ->N : \n")
print(val)
cat("\n 2N at once : \n")
print(valcp)

Running this from the command line we obtain the following output:

Rscript --vanilla  profile_time_memory.R -s 1 -n 1000000 -m 0.00001

Allocating

Allocating

 Allocating a double vector of length N =  1e+06  in R
            with busy waiting period  T =  1  seconds
            and monitoring memory every :  1e-05  seconds
 Will allocate N->gc->N and then 2N at once
=================================================================
 Measured vs alloc'd in R  (2N at once):  0.995325
 Measured vs alloc'd in R  (N-> gc ->N):  0.4750328

 Performance of the two allocations : 
 N-> gc ->N : 
   user.self     sys.self      elapsed   user.child    sys.child   R_gc_alloc 
       3.135        0.022        3.163        0.000        0.000 16210416.000 
       delta      initial 
 7700480.000 76742656.000 

 2N at once : 
   user.self     sys.self      elapsed   user.child    sys.child   R_gc_alloc 
       3.088        0.017        3.110        0.000        0.000 16000048.000 
       delta      initial 
15925248.000 84443136.000 

As you can see, the Perl monitor identified that the peak memory use of the sequential allocation-deallocation-allocation was half that of a single-step allocation of the entire workspace. The Perl application’s ability to monitor the process at very fine resolution (theoretically down to a nanosecond, but realistically one microsecond is where I’d draw the limit) comes in handy when one has to monitor spikes in peak memory usage. Consider the example in which some recently allocated memory is used for a fast task and then discarded (this is simulated by providing a small value to the s argument of the script). At low resolution, our estimates of the peak memory use are biased, e.g.:

Rscript --vanilla  profile_time_memory.R -s .001 -n 1000000 -m 0.1

Allocating

Allocating

 Allocating a double vector of length N =  1e+06  in R
            with busy waiting period  T =  0.001  seconds
            and monitoring memory every :  0.1  seconds
 Will allocate N->gc->N and then 2N at once
=================================================================
 Measured vs alloc'd in R  (2N at once):  0.07372778
 Measured vs alloc'd in R  (N-> gc ->N):  0.3777522

 Performance of the two allocations : 
 N-> gc ->N : 
   user.self     sys.self      elapsed   user.child    sys.child   R_gc_alloc 
       0.154        0.012        0.165        0.000        0.000 16210416.000 
       delta      initial 
 6123520.000 76742656.000 

 2N at once : 
   user.self     sys.self      elapsed   user.child    sys.child   R_gc_alloc 
       0.108        0.011        0.119        0.000        0.000 16000048.000 
       delta      initial 
 1179648.000 84635648.000 

However, changing the resolution of monitoring re-establishes accuracy:

Rscript --vanilla  profile_time_memory.R -s .001 -n 1000000 -m 0.0001 
Allocating

Allocating

 Allocating a double vector of length N =  1e+06  in R
            with busy waiting period  T =  0.001  seconds
            and monitoring memory every :  1e-04  seconds
 Will allocate N->gc->N and then 2N at once
=================================================================
 Measured vs alloc'd in R  (2N at once):  0.995325
 Measured vs alloc'd in R  (N-> gc ->N):  0.4750328

 Performance of the two allocations : 
 N-> gc ->N : 
   user.self     sys.self      elapsed   user.child    sys.child   R_gc_alloc 
       0.180        0.007        0.188        0.000        0.000 16210416.000 
       delta      initial 
 7700480.000 76746752.000 

 2N at once : 
   user.self     sys.self      elapsed   user.child    sys.child   R_gc_alloc 
       0.119        0.009        0.128        0.000        0.000 16000048.000 
       delta      initial 
15925248.000 84447232.000 

Having described the solution, let me note some limitations, a context of use that acknowledges these limitations, and some possible extensions:

  • Small allocations (e.g. 100k doubles or below) will be invisible to the Perl monitor. This appears to be related to how the OS manages memory and how the kernel updates the page that is raided by ps for data
  • This code is thus best used to monitor large allocations in long calculations
  • One can extend the Perl monitor to take action with respect to R if memory usage grows at an unsustainable rate, alert the user, etc. (important in my mind for large tasks executing remotely, e.g. over the weekend). This is an interesting extension for future work
  • One can easily extend the Perl monitor to work under macOS (trivial, as it has a ps command line utility) and Windows, e.g. run R under WSL2 or use tasklist instead of ps (another possible extension; see the sketch after this list)
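To make that last bullet concrete, here is a hedged sketch (not from the original article) of a Windows-side RSS probe built on tasklist instead of ps; the CSV column layout matches documented tasklist output, but treat the parsing as an assumption to verify:

use strict;
use warnings;
use IPC::System::Simple qw(capturex); # the same safe, shell-bypassing capture

# Return the "Mem Usage" column (RSS, in KB) that tasklist reports for a PID.
sub rss_kb_windows {
    my ($pid) = @_;
    # CSV row looks like: "perl.exe","1234","Console","1","12,345 K"
    my $line = capturex( 'tasklist', '/FI', "PID eq $pid", '/FO', 'CSV', '/NH' );
    my ($mem) = $line =~ /"([\d.,]+)\s*K"/
        or die "Could not parse tasklist output: $line";
    $mem =~ tr/.,//d; # strip thousands separators
    return $mem;
}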

I hope you enjoyed this journey with R and Perl so far! Have fun until the next time.

Profiling Peak DRAM Use in R With Perl - Part 1

Killing-It-with-PERL

Published on Saturday 18 January 2025 00:00

Another year, another opportunity for Perl to excel as a system’s language. Today I decided to take Perl for a (?)wild ride and use it to monitor peak physical DRAM use in an R script. Due to the multi-language nature of the post, there will be a lot of R code in the first part of the series; however, the code is self-explanatory, and should not be difficult to understand (the same applies to the Perl code for those coming from an R background). For those of you who would like to skip the R part (why?), jump straight to the end of Part 1 to see the solution and then go to Part 2 for the details.

First, a little bit of background about R’s memory management and the tools that one can use within R to monitor how the process is managing memory. R, similar to Perl, (?)frees the programmer from having to manage memory manually by providing dynamically allocated containers. R features a garbage collector which, similar to Perl’s, uses a reference counting mechanism to return memory back to the operating system. Managing memory in R is as critical as managing memory in Perl, and there are tools available that are built into the language (the “Names and Values” chapter in the 2nd edition of the book “Advanced R” is a valuable introduction to memory management, while the chapter “Memory” in the first edition of that book is also a useful read). The basic tool used to profile R code is the builtin function Rprof that samples the call stack and writes it out to a log-file that can be subsequently parsed. In the documentation of Rprof, one finds the following disclaimer:

     Note that the (timing) interval cannot be too small.  With
     ‘"cpu"’, the time spent in each profiling step is currently added
     to the interval.  With all profiling events, the computation in
     each profiling step causes perturbation to the observed system and
     biases the results.  What is feasible is machine-dependent.  On
     Linux, R requires the interval to be at least 10ms, on all other
     platforms at least 1ms.  Shorter intervals will be rounded up with
     a warning.

The relatively slow sampling frequency implies that one must somehow slow the code down to capture peak memory usage. One solution for those of us on Linux systems is to take the hint and release the valgrind Kraken as detailed in Profiling R code for memory use; this will really slow the code down and then we can capture stuff (note that taking the hint also applies to Perl, i.e. see Test::Valgrind on MetaCPAN). But ultimately, we would like not to slow the code that much, especially if we are also trying to obtain performance information at the same time as we profile the memory.

Assuming that one does not want to use an interactive tool to profile the code (the excellent profvis comes to mind), then one is stuck with the low-level option provided by Rprof logging and summaryRprof parsing. This leads us to the next question: what is the overhead of these tools? Overhead is paid in both hard disk space and execution time. Let’s talk about space first: for high resolution logging (e.g. sampling every 10 msec), the log file will grow by ~10 KB per second of calculation. This may not seem much, but profiling an involved calculation quickly adds up: profiling an expression that takes 10 hr to execute will consume 10 x 3600 x 10 = 360,000 KB ~ 360 MB (incidentally, ~10 hr is the longest calculation I have ever done in R). To give a sense of measure, the total size of the files in my /var/log is ~1.1 GB. And yes, while hard disks are getting bigger, this is no excuse to fill them with log files! To give an idea of the time overhead, we will need to provide an alternative to hard-disk logging (hint: this will be a Perl program!), but first let’s provide a straightforward R implementation of a logging function.

bench_time_mem<-function(x) {  ## x is the R expression to profile
  gc(reset=FALSE,verbose=FALSE)  ## force a gc here
  profing_fname <- tempfile() ## create a temporary file
  Rprof(filename=profing_fname, memory.profiling=TRUE,interval=0.01)
  time<-system.time(x,gcFirst = FALSE)
  Rprof(NULL) ## stop profiling and logging
  memprof<-summaryRprof(profing_fname, memory="tseries",lines="hide",diff=TRUE)
  mem_alloc<-sum(memprof[-1,"vsize.large"]) ## get memory allocated via malloc
  time_prof<-summaryRprof(profing_fname, memory="none",lines="hide",diff=TRUE)
  times<-time_prof$by.total[2,-2]
  retval<-unlist(c(times[1,1],mem_alloc,file.info(profing_fname)$size))
  unlink(profing_fname) ## delete the logfiles
  names(retval)<-c("total_time","R_gc_alloc","logfile_size")
  retval
}

The code should be self-explanatory even for non-R users: it first triggers the garbage collector, and then in sequence creates a temporary file, starts logging to that file, executes the R expression (R code within braces, similar to how one would provide an anonymous code reference in Perl), then parses the log file to get memory usage and timing, and returns the total time, the size of the memory allocated by R in the heap while running the expression, and the size of the log file.

Here is a minimally working example:

## function for busy waiting for a fixed number of seconds
busy_wait <- function(seconds) {
  start_time <- Sys.time()
  stop_time <- Sys.time()
  while (difftime(stop_time, start_time, units = "secs") < seconds) {
    stop_time <- Sys.time()
  }
}

N<-100000
work_time <- 1 # second(s)

val<-bench_time_mem(
	{  	
		busy_wait(work_time);  ## work without allocation
		cat("\nAllocating\n")
		q<-rnorm(N);            ## Gaussian random variables
		busy_wait(work_time);   ## work with allocation
		rm(q);gc(reset=F)       ## free memory/trigger the gc
		q2<-rnorm(N);           ## another allocation
		busy_wait(work_time);   ## busy working!
		rm(q2);gc(reset=F)      ## free the memory yet again

	}
)

In this example, we first provide an implementation of a function busy_wait that simulates a work load without further allocation. The actual code to be profiled, alternates periods of busy waiting with allocation and de-allocation. With the values of N and work_time in the code snippet, I obtain the following code profile summary:

> val
  total_time   R_gc_alloc logfile_size 
         3.4    1600000.0      32446.0 

Let’s talk about these figures: first note that parsing the Rprof log does not allow one direct access to the peak memory used during the calculation (only the total amount of memory allocated in the heap). While this information is crucial for performance, i.e. one pays a price for every byte moved around, it is not very informative about whether the program will even run on the machine as the datasets scale: R processes data in memory, and when the physical memory is exhausted, the operating system will put the R process out of its misery. In the example above, the peak memory usage while the expression executes is N*8 bytes, since rnorm returns double-precision floating-point numbers in C parlance (with N = 100,000 that is ~800 KB), but since two allocations were done, the profiler can only report their sum. In fact, the output is virtually indistinguishable from the following code, which allocates an array of 2 x N doubles in a single go.

valcp<-bench_time_mem(
	{  
		busy_wait(work_time);
		cat("\nAllocating\n")
		q<-rnorm(N*2);
		rm(q);gc(reset=F)
		busy_wait(work_time*2);
	}
)

and

valcp
  total_time   R_gc_alloc logfile_size 
         3.2    1600000.0      30773.0 

While both codes allocated the same total amount of memory, the first code would work if one allocated the entirety of the free memory in a given machine, while the second would croak when roughly half the free memory was requested.

The difference of ~0.2 sec execution time is the price one has to pay for triggering R’s garbage collector. The following shows that while allocating an array of 10^5 doubles and filling it with random numbers is very fast (it takes 6 msec on my machine), triggering the garbage collector is 32x slower.

> system.time(rnorm(N))
   user  system elapsed 
  0.006   0.000   0.006


> system.time({q<-rnorm(N);rm(q);gc(reset=FALSE)})
   user  system elapsed 
  0.193   0.000   0.194 

So how can one get the peak DRAM usage without logging anything to the hard disk? The answer, which will be provided in Part 2, blends together R and Perl and is conceptually very simple, design-wise:

  • write a Perl script that probes the ps command line utility for the resident set size (RSS), i.e. the footprint of the R process in DRAM.
  • put the probing of the RSS in a monitoring loop, so that it awakens periodically to sample the size of the RSS in real time and compares it to that at the beginning of the monitoring phase
  • start the Perl monitoring script as a sub-process of the R process, provide it with the Process ID (PID) of the R process and put it in the background
  • when the R expression concludes, kill the Perl process and report the maximum difference of the RSS from the baseline; this gives the peak (over the baseline) footprint of the program in DRAM (a minimal sketch of the ps probe follows this list).
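As a minimal sketch of that first bullet (the full program appears in Part 2), the heart of the probe is a single safe ps call:

use strict;
use warnings;
use IPC::System::Simple qw(capturex); # bypasses the shell

my ($pid) = @ARGV or die "Usage: $0 <PID>\n";
my $rss_kb = capturex( 'ps', 'o', 'rss=', 'p', $pid ); # RSS in KB
chomp $rss_kb;
print "RSS of $pid: $rss_kb KB\n";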

(to be continued…)

Isn’t Perl Dead

…let’s just move on shall we ;-)

Essential Knowledge for Perl Consultants

So you want to be the guy, the one that swoops in to the shop that has been saddled with the legacy Perl application because you’ve been doing Perl since the last century? You know that shop: they have a Perl application and a bunch of developers that only do Python and have suddenly become allergic to learning something new (to them). From my own experience, here are some of the technologies you’ll encounter and should be familiar with to be the guy.

  • [x] mod_perl
  • [x] FastCGI
  • [x] Moose
  • [x] HTML::Template
  • [x] Mason
  • [ ] Template::Toolkit

I checked off the things I’ve encountered in my last three jobs.

Of course, the newer Perl based frameworks are good to know as well:

  • Mojolicious
  • Catalyst

Some “Nice to Knows”

  • Apache
  • docker
  • cpanm
  • carton
  • make
  • bash

…and of these, I think the most common thing you’ll encounter on sites that run Perl applications is mod_perl.

Thar’s gold in them thar hills!

Well, maybe not gold, but certainly higher rates and salaries for experienced Perl developers. You’re a unicorn! Strut your stuff. Don’t back down and go cheap. Every day someone leaves the ranks of Perl development only to become one of the herd leaving you to graze alone.

Over the last three years I’ve earned over a half-million dollars in salary and consulting fees. Some of you are probably earning more. Some less. But here’s the bottom line: your skills are becoming scarcer and scarcer. And here’s the kicker…these apps aren’t going away. Companies are loath to touch some of their cash cows or invest in any kind of “rewrite”. And here’s why…

  • They don’t know what the application even does!
  • They don’t have any bandwidth for rewriting applications that “work”.
  • They love technical debt or never even heard of it.

And here’s what they want you to do for a big pile of their cash:

  • fix a small bug that may take you a day to find, but only a minute to fix
  • upgrade their version of perl
  • upgrade the platform the app runs on because of security vulnerabilities
  • containerize their application to run in the cloud
  • add a feature that will take you a week to figure out how to implement and a day to actually implement

The Going Rate?

According to the “interweb” the average salary for an experienced Perl developer is around $50/hour or about $100K or so. I’m suspicious of those numbers to be honest. Your mileage may vary but here’s what I’ve been able to get in my last few jobs:

  • $180K/year + bonus
  • $160K/year + a hearty handshake
  • $100/hour

…and I’m not a great negotiator. I do have over 20 years of experience with Perl and over 40 years of experience in IT. I’m not shy about promoting the value of that experience either. I did turn down a job for $155K/year that would have required some technical leadership, a position I think should have been more like $185k/year to lead a team of Perl developers across multiple time zones.

Your best prospects are…your current customers!

Even if you decide to leave a job or are done with an assignment, don’t burn bridges. Be willing to help them with a transition. Be polite, ask for a recommendation if appropriate. If they’re not planning on rehiring, they may be willing to contract with you for spot assignments.

Some Miss Manners Advice

  • Be nice…always
  • Suggest improvements but don’t be upset if they like the crappy app just the way it is
  • Write good documentation! Help someone else pick up your work (it could be you a year from now).
  • Be a mentor but not a know-it-all, you don’t know-it-all, and nobody likes a know-it-all even if you do
  • Don’t be stubborn and fight with the resident guru unless his bad decision is about to take the company off the cliff (and even then don’t fight with him, take it to the boss)
  • Ask questions and take a keen interest in their domain, you never know when a similar job might present itself

The Underbar, episode 0: The New Perl Logo

The Underbase

Published on Friday 17 January 2025 20:00

The New Perl Logo

Yet Another Perl-Powered Company: Geolytica

perl.com

Published on Thursday 16 January 2025 09:00

Imagine you want to parse free-form address input and match it against a database representing the road network.

Example 1

  • Input: "751 FAIR OKS AVENUE PASADNA CA"
  • Output: "751 N Fair Oaks AVE, Pasadena, CA 91103-3069 / 34.158874,-118.151053" View example

Example 2

  • Input: "5 Adne Edle Street, London"
  • Output: "5 THREADNEEDLE STREET, LONDON, United Kingdom EC3V 3NG" View example

The database contains road names, shapes, numbers, zip/postal codes, city names, regions, neighborhood/district names, and more—billions of named location entities worldwide. Add another 100 million points of interest extracted from billions of webpages, and the problem becomes quite difficult.

Two decades of Perl coding, starting in 2005, and this problem is (mostly) solved at Geolytica.


Geolytica logo

Perl at Geolytica

At Geolytica, we harness Perl to manage and enhance vast geo-location datasets and build the application logic of the geocoding engines powering geocoder.ca and geocode.xyz.

Data Cleanup and Enhancement

We continuously update and enhance our location entities database, because ground truth changes - at the speed of life. One standout example is our work with OpenStreetMap’s POI data. A year ago, we utilized an in-house AI tool (let’s call it “PerlGPT”) to refine this dataset, correcting inconsistencies and enhancing data quality. The results were significant enough to share with the community at Free POI Data.

Perl’s Versatility and Stability

The beauty of Perl lies in its backward compatibility. Despite our codebase spanning over two decades, upgrading Perl across versions has never broken our code. In contrast, with other languages, we’ve observed issues like API changes causing extensive refactoring or even rewrites. Perl’s design allows for seamless integration of old and new code, which is vital for our specific needs.

Practical Implementation

At Geolytica, tasks like parsing location entities from text involve complex string manipulations. As shown in the examples above, these challenges would be difficult in any programming language, but Perl makes them easier than other options.


Final Thoughts

The best programming language for any job is the one that makes hard problems easy and impossible ones possible. For Geolytica, that language is Perl.


About the Author

Ervin Ruci has been immersed in Perl since 1998, initially building student information systems at Mount Allison University, then registry systems for CIRA from 2000 to 2005. In 2005, he became a location-independent entrepreneur, founding Geolytica to tackle the location intelligence problem.

Creating MIDI Music with Perl

perl.com

Published on Wednesday 15 January 2025 09:00

Music is a vast subject

It is older than agriculture and civilization itself. We shall only cover the essential parts needed to make music on the computer. So let’s get right to the point!

How do you make music with code? And what is music in the first place?

Well, for our purposes, music is a combination of rhythm, melody, and harmony.

Okay, what are these musical elements from the perspective of a programming language? And how do you create these elements with code? Enter: Perl.

Set Up, Play, Write

Here is a basic algorithm that builds an ascending musical phrase two times. It uses named notes with the octave (e.g. C4):

use MIDI::Util qw(setup_score);

my $score = setup_score();

for (1 .. 2) {
  for my $note (qw(C4 D4 E4 F4)) {
    $score->n('qn', $note); # Adds a quarter note
    $score->r('qn');        # Adds a quarter note rest
  }
}

$score->write_score("$0.mid");

Rendering Audio

In order to actually hear some sound, you can either play the MIDI directly, with a command-line player like timidity and an sf2 “soundfont” file, as in this list, or use the MIDI file to create an audio-format file (e.g. WAV, MP3) that can be played. Here is the command for the latter, which I use on my Mac:

timidity -c ~/timidity.cfg some.mid -Ow -o - | ffmpeg -i - -acodec libmp3lame -ab 64k some.mp3

These command line switches do the following things:

  • -Ow = Generate RIFF WAVE format output.
  • -o - = Place output on stdout.
  • -i - = Specify the infile as stdin.
  • -acodec libmp3lame = Specify that we are converting to mp3 format.
  • -ab 64k = Set the bitrate (in bits/s).

But wait! You can also generate and play MIDI in real-time with the MIDI::RtMidi::FFI::Device and MIDI::RtMidi::ScorePlayer modules. :D

Back to Creating Music!

So far, we have encountered the “Set up, Play, and Write” algorithm. Next we shall replace the “Play” bit with “Synch” and play the bass and treble parts simultaneously (and again, we use a named note plus octave number in each of the subroutines):

use MIDI::Util qw(setup_score);

my $score = setup_score();

$score->synch(
  sub { bass($score) },
  sub { treble($score) },
);

$score->write_score("$0.mid");

sub bass {
  my ($score) = @_;
  for my $note (qw(C3 F3 G3 C4)) {
    $score->n('qn', $note);
  }
}

sub treble {
  my ($score) = @_;
  for my $note (qw(C4 D4 E4 F4)) {
    $score->n('en', $note);
    $score->r('en');
  }
}

This code is not especially clever, but illustrates the basics.

If we want to repeat the phrase, just add a for loop to the synch:

$score->synch(
  sub { bass($score) },
  sub { treble($score) },
) for 1 .. 4;

Setting Channels, Patches, Volume, and Tempo

Use the MIDI::Util::set_chan_patch() function to set the channel and the patch. To set the tempo in “beats per minute”, give a bpm to the setup_score() function. To set an individual note volume, add "v$num" as an argument to the $score->n() method, where $num is an integer from 0 to 127. Here is code that sets these parameters:

use MIDI::Util qw(setup_score set_chan_patch);

my $bpm = shift || 120;

my $score = setup_score(bpm => $bpm);

$score->synch(
  sub { bass($score) },
  sub { treble($score) },
);

$score->write_score("$0.mid");

sub bass {
  my ($score) = @_;
  set_chan_patch($score, 0, 35);
  for my $note (qw(C3 F3 G3 C4)) {
    $score->n('qn', $note, 'v127');
  }
}

sub treble {
  my ($score) = @_;
  set_chan_patch($score, 1, 0);
  for my $note (qw(C4 D4 E4 F4)) {
    $score->n('en', $note, 'v110');
    $score->r('en');
  }
}

If the synched subroutines do not amount to the same number of beats, the synch will probably not work as expected: it will stop playing the shorter part and keep playing the longer part until it is finished.

Selecting Pitches

What if we want the program to choose notes at random, to add to the score? Here is a simple example:

sub treble {
  my ($score) = @_;
  set_chan_patch($score, 1, 0);

  my @pitches = (60, 62, 64, 65, 67, 69, 71, 72);

  for my $n (1 .. 4) {
    my $pitch = $pitches[int rand @pitches];
    $score->n('en', $pitch);
    $score->r('en');
  }
}

For MIDI-Perl, the named note with octave C4 and the MIDI number 60 are identical, as shown in the tables on this page.
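If you are curious about the mapping, it is plain arithmetic (a sketch, not a MIDI-Perl API): a MIDI number is 12 * (octave + 1) plus the semitone offset within the octave:

# Convert a note name plus octave (e.g. "C4") to its MIDI number (C4 == 60).
my %semitone = (
    C => 0, 'C#' => 1, D => 2, 'D#' => 3, E => 4, F => 5,
    'F#' => 6, G => 7, 'G#' => 8, A => 9, 'A#' => 10, B => 11,
);

sub note_to_midi {
    my ($name) = @_;
    my ($pitch, $octave) = $name =~ /^([A-G]#?)(-?\d+)$/
        or die "Unrecognized note: $name";
    return 12 * ($octave + 1) + $semitone{$pitch};
}

print note_to_midi('C4'), "\n"; # 60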

Another, more “music-theory way” is to select notes from a named scale (and this time, over two octaves):

use Music::Scales qw(get_scale_MIDI);
# ...

sub treble {
  my ($score) = @_;
  set_chan_patch($score, 1, 0);

  my $octave = 4;

  my @pitches = (
    get_scale_MIDI('C', $octave, 'major'),
    get_scale_MIDI('C', $octave + 1, 'major'),
  );

  for my $n (1 .. 4) {
    my $pitch = $pitches[int rand @pitches];
    $score->n('en', $pitch);
    $score->r('en');
  }
}

Single Notes, Basslines, and “Melody”

We saw above, how to select pitches at random. But this is the least musical or interesting way. Pitches may be selected by interval choice, as with the excellent Music::VoiceGen module. You could also choose by mathematical computation with Music::Voss (code example), or a probability density, or an evolutionary fitness function, etc.

Basslines are single note lines in a lower register. They have their own characteristics, which I will not attempt to summarize. One thing you can do is to make sure your notes are in the octaves 1 to 3. Fortunately, there is a module for this very thing called Music::Bassline::Generator. Woo!
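As a minimal sketch of that “keep it in octaves 1 to 3” advice (plain arithmetic, not the Music::Bassline::Generator API), any pitch can be folded into the bass register by octave shifts:

# Fold a MIDI pitch into the bass register: octaves 1..3 are MIDI 24..59.
sub to_bass_register {
    my ($pitch) = @_;
    $pitch -= 12 while $pitch > 59; # shift down an octave at a time
    $pitch += 12 while $pitch < 24; # or up, if it started too low
    return $pitch;
}

print to_bass_register(72), "\n"; # C5 (72) becomes C3 (48)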

So what is a melody? Good question. I’ll leave out the long-winded music theory discussion, and just say, “Go forth and experiment!”

Chords and Harmony

We can construct chords at random - oof:

use Data::Dumper::Compact qw(ddc);
use MIDI::Util qw(setup_score);
use Music::Scales qw(get_scale_MIDI);

my @pitches = (
    get_scale_MIDI('C', 4, 'minor'),
    get_scale_MIDI('C', 5, 'minor'),
);

my $score = setup_score();

for my $i (1 .. 8) {
  my @chord = map { $pitches[int rand @pitches] } 1 .. 3;
  print ddc(\@chord);
  $score->n('hn', @chord);
  $score->r('hn');
}

$score->write_score("$0.mid");

We can construct chord progressions by name:

use Data::Dumper::Compact qw(ddc);
use MIDI::Util qw(setup_score midi_format);
use Music::Chord::Note;

my $score = setup_score();

my $mcn = Music::Chord::Note->new;

for my $c (qw(Cm7 F7 BbM7 EbM7 Adim7 D7 Gm)) {
  my @chord = $mcn->chord_with_octave($c, 4);

  @chord = midi_format(@chord); # convert to MIDI-Perl notation
  print ddc(\@chord);

  $score->n('wn', @chord);
}

$score->write_score("$0.mid");

Chord progressions may be constructed algorithmically. Here is an example of a randomized state machine that selects chords from the major scale using the default settings of the Music::Chord::Progression module:

use Data::Dumper::Compact qw(ddc);
use MIDI::Util qw(setup_score);
use Music::Chord::Progression;

my $score = setup_score();

my $prog = Music::Chord::Progression->new;
my $chords = $prog->generate;
print ddc($chords);

$score->n('wn', @$_) for @$chords;

$score->write_score("$0.mid");

Advanced Neo-Riemannian operations can be used with the Music::Chord::Progression::Transform module.

To get chord inversions, use the Music::Chord::Positions module. For instance, say we have a chord like C major (C4-E4-G4), and we want the first or second inversion. Sure, we could just rewrite it to be E4-G4-C5 or G4-C5-E5 - but that’s not programming! The inversion of a chosen chord can be altered programmatically whenever needed; a minimal arithmetic sketch follows.
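Here is that sketch (plain arithmetic rather than the Music::Chord::Positions API): inverting a chord just moves its lowest pitch up an octave.

# Invert an ascending chord (MIDI numbers) by raising its lowest pitch an octave.
sub invert_up {
    my (@chord) = @_;
    my $lowest = shift @chord;
    return ( @chord, $lowest + 12 );
}

my @c_major = ( 60, 64, 67 );      # C4 E4 G4
my @first   = invert_up(@c_major); # 64 67 72 -> E4 G4 C5
my @second  = invert_up(@first);   # 67 72 76 -> G4 C5 E5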

Phrasing

This bit requires creativity! But fortunately, there is also the Music::Duration::Partition module. With it, rhythms can be generated and then applied to single-note, chord, or drum parts. This time, let’s choose the pitches more musically with Music::VoiceGen:

use MIDI::Util qw(setup_score);
use Music::Duration::Partition ();
use Music::Scales qw(get_scale_MIDI);
use Music::VoiceGen ();

my $score = setup_score();

# get rhythmic phrases
my $mdp = Music::Duration::Partition->new(
    size => 4,                  # 1 measure in 4/4
    pool => [qw(hn dqn qn en)], # half, dotted-quarter, quarter, eighth notes
);
my @motifs = $mdp->motifs(4);

# assign voices to the rhythmic motifs
my @pitches = (
  get_scale_MIDI('C', 4, 'minor'),
  get_scale_MIDI('C', 5, 'minor'),
);
my $voice = Music::VoiceGen->new(
    pitches   => \@pitches,
    intervals => [qw(-3 -2 -1 1 2 3)], # allowed interval jumps
);
my @voices;
for my $motif (@motifs) {
    my @notes = map { $voice->rand } @$motif;
    push @voices, \@notes;
}

for (1 .. 4) { # repeat the group of phrases 4 times
    for my $n (0 .. $#motifs) {
        # add each motif with corresponding voices, to the score
        $mdp->add_to_score($score, $motifs[$n], $voices[$n]);
    }
}

$score->write_score("$0.mid");

Sidebar: Modulo arithmetic

If we want to stay within a range, say the chromatic scale of all notes, use the % operator:

use Data::Dumper::Compact qw(ddc);
use Music::Scales qw(get_scale_notes);

my @notes = get_scale_notes('C', 'chromatic');

my %tritones = map { $notes[$_] => $notes[ ($_ + 6) % @notes ] } 0 .. $#notes;
print ddc(\%tritones);

(The “tritone” is a musical interval, once considered to be the “Devil’s interval.” 🎶 Purple haze all in my brain 🎶)

Sidebar: Alternation math

If we want to change every other iteration, we can also use the % operator:

if ($i % 2 == 0) {
  $score->n('qn', $pitch);
  $score->r('qn');
}
else {
  $score->n('hn', $pitch);
}
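For context, here is a runnable framing of that fragment, assuming the @pitches and $score variables from the scale examples above; each iteration fills a half note’s worth of time either way:

for my $i (0 .. 7) {
    my $pitch = $pitches[int rand @pitches];
    if ($i % 2 == 0) {
        $score->n('qn', $pitch); # even pass: a quarter note...
        $score->r('qn');         # ...plus a quarter rest
    }
    else {
        $score->n('hn', $pitch); # odd pass: a half note
    }
}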

Beats!

A steady pulse:

use MIDI::Drummer::Tiny;

my $d = MIDI::Drummer::Tiny->new;

$d->count_in(4);  # Closed hi-hat for 4 measures

$d->write;

A simple “backbeat” rhythm:

use MIDI::Drummer::Tiny;

my $d = MIDI::Drummer::Tiny->new(file => "$0.mid");

$d->note(
    $d->quarter,
    $d->closed_hh,
    $_ % 2 ? $d->kick : $d->snare
) for 1 .. $d->beats * $d->bars;

$d->write;

With this module, you can craft unique grooves like John Bonham’s “Fool in the Rain” (code example).

With combinatorial sequences from Music::CreatingRhythms, you can make algorithmic drums (code example).

And how about random grooves (code example)?

Please see the tutorials in the MIDI-Drummer-Tiny distribution for details on how to implement beats in your program.

Differentiation of Parts

This is an involved subject. Ideally, the different parts of a composition are distinct. If the piece starts slow (i.e. with a low note density per measure - not a tempo change), then the next section should be more dense, then less again. Start quiet and soft? Follow with loud. If a part is staccato and edgy, it may be followed by a smooth legato section. Low register first, then higher register next, etc, etc. If a piece never changes, it is monotonous!

However that being said, consider “Thursday Afternoon” by Brian Eno. Brilliant.

Conclusion

You too can make music with Perl! This can comprise single-note lines (melody and bass, for instance), chord progressions, and of course drums. * Creativity not included. Haha!

References

The example code for this article

MIDI::Util

Music::Scales

Music::VoiceGen

Music::Voss

Music::Bassline::Generator

Data::Dumper::Compact

Music::Chord::Note

Music::Chord::Progression

Music::Chord::Progression::Transform

MIDI::Drummer::Tiny

Music::CreatingRhythms

MIDI::RtMidi::FFI::Device

MIDI::RtMidi::ScorePlayer

My personal music is available to stream on all platforms.