Cellgraph 0.7 is out - and in the next paragraphs I will tell you about the great new features, how they help you play with logical structures and deepen your understanding of them. But first, please let me mention the why!
Almost everybody here has side projects that might be a waste of time - but you build them anyway because you feel a connection to the idea. Cellgraph is such a project for me. After looking at all my stalled and half-baked projects, I wanted to have something completed: well designed, beautiful code, a nice interface, a joy to use - as perfect as I'm capable of. Something that can rekindle your love for programming. In one way it has already made me a better programmer, with better habits. Every time I spot a flaw in the code, the code structure or even the architecture, I fix it. Yes, it's tedious. But I got a sense of what good code looks like and what's needed to get there, so I can get there faster in other projects.
Plus there is a lot going on under the hood that you cannot see, but that is needed for smooth functionality. Why can you use a slider or arrow buttons to change a value and still get smooth undo and redo that work as you expect? There is a data history class that replaces the current value when new values come in faster than they should be recorded. It will probably be released as its own module some day.
The other why is: I like beautiful personalized graphics. It's the opposite of AI: you can tweak the details in the sense of directly touching them, and understand what you did. Many of the new features help you with that. Maybe the strongest new feature is the variable subrule mapping. Big words. Well, the state of a cell and its neighbors determines the next state of that cell. If you have a large neighborhood and many states, you get a combinatorial explosion. So there are now 4 levels of bundling similar rules, which keeps it manageable. The groupings also introduce certain graphical behavior. Also good for learning what is going on: many parameters have randomize functions. Push one a few times to see what actually changes and so learn the influence of that parameter. The undo and redo buttons make it easier to go back to the best result you found, or to experiment in general. The action rules also got completely reworked. They are much more powerful now, but in the end they produce certain areas where the usual pattern just breaks down. There are also a lot of functions that produce related rules - helpful for understanding too, but not a new feature. Lastly, something that added a whole new layer of possibilities: the result application. Cellular automata are just simple functions from the state of the neighborhood cells to the new cell state. But what if you do not just replace the old state with the new one, but add it (modulo the maximum) or apply other operations? This is especially useful when a rule produces almost no useful output image and you cannot figure out which subrule is the bottleneck that prevents the patterns from changing. Just drop in another result application and you get a totally different pattern, which you can then tweak to your liking.
A lot of effort also went into cleaning up the UI and fixing the documentation. Rendering and startup got faster, there is better UI guidance, and more bugs were squashed than I want to admit were there. Please give it a try. Thank you.
Published by dami on Friday 21 February 2025 09:28
Lists of Perl distributions on metacpan.org use blue color for regular distributions and red for development releases (containing an underscore). But some are colored in grey, for reasons that I fail to understand, and I found no explanation on the site.
Published by richardleach on Thursday 20 February 2025 22:35
release_schedule.pod - RL to release 5.41.10
Published by khwilliamson on Thursday 20 February 2025 19:54
mem_collxfrm: Handle above-Unicode code points As stated in the comments added by this commit, it is undefined behavior to call strxfrm() on above-Unicode code points, and especially calling it with Perl's invented extended UTF-8. This commit changes all such input into a legal value, replacing all above-Unicode with the highest permanently unassigned code point, U+10FFFF.
Published by khwilliamson on Thursday 20 February 2025 19:54
run/locale.t: Hoist code out of a block The next commit will want to use the results later.
Published by khwilliamson on Thursday 20 February 2025 19:54
run/locale.t: Add detail to test names
Published by khwilliamson on Thursday 20 February 2025 19:54
utf8.h: Split a macro into components This creates an internal macro that skips some error checking for use when we don't care if it is completely well-formed or not.
Published by Chandan Kumar on Thursday 20 February 2025 01:54
Published on Wednesday 19 February 2025 19:17
In Part IIa, we detailed the challenges we faced when automating the deployment of a secure static website using S3, CloudFront, and WAF. Service interdependencies, eventual consistency, error handling, and AWS API complexity all presented hurdles. This post details the actual implementation journey.
We didn’t start with a fully fleshed-out solution that just worked. We had to “lather, rinse and repeat”. In the end, we built a resilient automation script robust enough to deploy secure, private websites across any organization.
The first takeaway: the importance of logging and visibility. While logging wasn’t the first thing we actually tackled, it was what eventually turned a mediocre automation script into something worth publishing.
run_command()
While automating the process of creating this infrastructure, we need to feed the output of one or more commands into the pipeline. The output of one command feeds another. But each step, of course, can fail. We need to both capture the output for input to later steps and capture errors to help debug the process. Automation without visibility is like trying to discern the elephant by looking at the shadows on the cave wall. Without a robust solution for capturing output and errors, whenever an AWS CLI call failed we found ourselves staring at the terminal trying to reconstruct what went wrong. Debugging was guesswork.
The solution was our first major building block: run_command().
echo "Running: $*" >&2
echo "Running: $*" >>"$LOG_FILE"
# Create a temp file to capture stdout
local stdout_tmp
stdout_tmp=$(mktemp)
# Detect if we're capturing output (not running directly in a terminal)
if [[ -t 1 ]]; then
# Not capturing → Show stdout live
"$@" > >(tee "$stdout_tmp" | tee -a "$LOG_FILE") 2> >(tee -a "$LOG_FILE" >&2)
else
# Capturing → Don't show stdout live; just log it and capture it
"$@" >"$stdout_tmp" 2> >(tee -a "$LOG_FILE" >&2)
fi
local exit_code=${PIPESTATUS[0]}
# Append stdout to log file
cat "$stdout_tmp" >>"$LOG_FILE"
# Capture stdout content into a variable
local output
output=$(<"$stdout_tmp")
rm -f "$stdout_tmp"
if [ $exit_code -ne 0 ]; then
echo "ERROR: Command failed: $*" >&2
echo "ERROR: Command failed: $*" >>"$LOG_FILE"
echo "Check logs for details: $LOG_FILE" >&2
echo "Check logs for details: $LOG_FILE" >>"$LOG_FILE"
echo "TIP: Since this script is idempotent, you can re-run it safely to retry." >&2
echo "TIP: Since this script is idempotent, you can re-run it safely to retry." >>"$LOG_FILE"
exit 1
fi
# Output stdout to the caller without adding a newline
if [[ ! -t 1 ]]; then
printf "%s" "$output"
fi
}
This not-so-simple wrapper gave us logging of stdout and stderr for every command. run_command() became the workhorse for capturing our needed inputs to other processes and our eyes into failures. We didn’t arrive at run_command() fully formed. We learned it the hard way: getting the capture and display of stdout right took fine-tuning.

The point of this whole exercise is to host content, and for that, we need an S3 bucket. This seemed like a simple first task - until we realized it wasn’t. This is where we first collided with a concept that would shape the entire script: idempotency.
S3 bucket names are globally unique. If you try to create one that exists, you fail. Worse, AWS error messages can be cryptic:
Our naive first attempt just created the bucket. Our second attempt checked for it first:
create_s3_bucket() {
if run_command $AWS s3api head-bucket --bucket "$BUCKET_NAME" --profile $AWS_PROFILE 2>/dev/null; then
echo "Bucket $BUCKET_NAME already exists."
return
fi
run_command $AWS s3api create-bucket \
--bucket "$BUCKET_NAME" \
--create-bucket-configuration LocationConstraint=$AWS_REGION \
--profile $AWS_PROFILE
}
Making the script “re-runnable” was essential unless of course we could guarantee we did everything right and things worked the first time. When has that ever happened? Of course, we then wrapped the creation of the bucket in run_command() because every AWS call still had the potential to fail spectacularly.
And so, we learned: If you can’t guarantee perfection, you need idempotency.
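One step that isn’t shown here but that we rely on later (when we grant CloudFront access via a bucket policy) is turning on S3 Block Public Access for the bucket. Here is a sketch of that call - the block_public_access function name is ours, the s3api parameters are the standard ones:

# Illustrative sketch: turn on S3 Block Public Access for the new bucket
block_public_access() {
  run_command $AWS s3api put-public-access-block \
    --bucket "$BUCKET_NAME" \
    --public-access-block-configuration \
      BlockPublicAcls=true,IgnorePublicAcls=true,BlockPublicPolicy=true,RestrictPublicBuckets=true \
    --profile $AWS_PROFILE
}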
Configuring a CloudFront distribution using the AWS Console offers a
streamlined setup with sensible defaults. But we needed precise
control over CloudFront behaviors, cache policies, and security
settings - details the console abstracts away. Automation via the AWS
CLI gave us that control - but there’s no free lunch. Prepare yourself
to handcraft deeply nested JSON payloads, get jiggy with jq
, and
manage the dependencies between S3, CloudFront, ACM, and WAF. This is
the path we would need to take to build a resilient, idempotent
deployment script - and crucially, to securely serve private S3
content using Origin Access Control (OAC).
Why do we need OAC?
Since our S3 bucket is private, we need CloudFront to securely retrieve content on behalf of users without exposing the bucket to the world.
Why not OAI?
AWS has deprecated Origin Access Identity in favor of Origin Access Control (OAC), offering tighter security and more flexible permissions.
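For completeness: the $OAC_ID used in the distribution template below has to come from somewhere. Creating the OAC is a single CLI call; here is a sketch (the create_oac function name and the Name value are illustrative, not from the original script):

# Illustrative sketch: create the Origin Access Control used by the distribution
create_oac() {
  run_command $AWS cloudfront create-origin-access-control \
    --origin-access-control-config \
      "Name=oac-${BUCKET_NAME},SigningProtocol=sigv4,SigningBehavior=always,OriginAccessControlOriginType=s3" \
    --profile $AWS_PROFILE
}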
Why do we need jq
?
In later steps we create a WAF Web ACL to firewall
our CloudFront distribution. In order to associate the WAF Web ACL with
our distribution we need to invoke the update-distribution
API which
requires a fully fleshed out JSON payload updated with the Web ACL id.
GOTCHA: Attaching a WAF WebACL to an existing CloudFront distribution requires that you use the update-distribution API, not associate-web-acl as one might expect.
Here’s the template for our distribution configuration (some of the Bash variables used will be evident when you examine the completed script):
{
"CallerReference": "$CALLER_REFERENCE",
$ALIASES
"Origins": {
"Quantity": 1,
"Items": [
{
"Id": "S3-$BUCKET_NAME",
"DomainName": "$BUCKET_NAME.s3.amazonaws.com",
"OriginAccessControlId": "$OAC_ID",
"S3OriginConfig": {
"OriginAccessIdentity": ""
}
}
]
},
"DefaultRootObject": "$ROOT_OBJECT",
"DefaultCacheBehavior": {
"TargetOriginId": "S3-$BUCKET_NAME",
"ViewerProtocolPolicy": "redirect-to-https",
"AllowedMethods": {
"Quantity": 2,
"Items": ["GET", "HEAD"]
},
"ForwardedValues": {
"QueryString": false,
"Cookies": {
"Forward": "none"
}
},
"MinTTL": 0,
"DefaultTTL": $DEFAULT_TTL,
"MaxTTL": $MAX_TTL
},
"PriceClass": "PriceClass_100",
"Comment": "CloudFront Distribution for $ALT_DOMAIN",
"Enabled": true,
"HttpVersion": "http2",
"IsIPV6Enabled": true,
"Logging": {
"Enabled": false,
"IncludeCookies": false,
"Bucket": "",
"Prefix": ""
},
$VIEWER_CERTIFICATE
}
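How this template becomes the $CONFIG_JSON file used in the next function is left to the full script; one straightforward way, shown here purely as an illustration (the template file name is assumed), is to export the variables and let envsubst fill them in:

# Render the distribution config template into the JSON payload
# (distribution-config.tmpl is an assumed name for the template above)
export CALLER_REFERENCE ALIASES BUCKET_NAME OAC_ID ROOT_OBJECT \
       DEFAULT_TTL MAX_TTL ALT_DOMAIN VIEWER_CERTIFICATE
envsubst < distribution-config.tmpl > "$CONFIG_JSON"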
The create_cloudfront_distribution()
function is then used to create
the distribution.
create_cloudfront_distribution() {
# Snippet for brevity; see full script
run_command $AWS cloudfront create-distribution --distribution-config file://$CONFIG_JSON
}
Key lessons:
- Use update-distribution, not associate-web-acl, for CloudFront distributions
- Use jq to modify the existing configuration to add the WAF Web ACL id

Cool. We have a CloudFront distribution! But it’s wide open to the world. We needed to restrict access to our internal VPC traffic - without exposing the site publicly. AWS WAF provides this firewall capability using Web ACLs. Here’s what we need to do:
1. Find the public IP of the VPC’s NAT gateway.
2. Build an allow list (IPSet) containing that IP.
3. Create a WAF Web ACL whose rule allows only that IPSet.
4. Attach the Web ACL to the CloudFront distribution.
Keep in mind that CloudFront is designed to serve content to the public internet. When clients in our VPC access the distribution, their traffic needs to exit through a NAT gateway with a public IP. We’ll use the AWS CLI to query the NAT gateway’s public IP and use that when we create our allow list of IPs (step 1).
find_nat_ip() {
run_command $AWS ec2 describe-nat-gateways --filter "Name=tag:Environment,Values=$TAG_VALUE" --query "NatGateways[0].NatGatewayAddresses[0].PublicIp" --output text --profile $AWS_PROFILE
}
We take this IP and build our first WAF component: an IPSet. This becomes the foundation for the Web ACL we’ll attach to CloudFront.
The firewall we create will be composed of an allow list of IP addresses (step 2)…
create_ipset() {
run_command $AWS wafv2 create-ip-set \
--name "$IPSET_NAME" \
--scope CLOUDFRONT \
--region us-east-1 \
--addresses "$NAT_IP/32" \
--ip-address-version IPV4 \
--description "Allow NAT Gateway IP"
}
…that form the rules for our WAF Web ACL (step 3).
create_web_acl() {
run_command $AWS wafv2 create-web-acl \
--name "$WEB_ACL_NAME" \
--scope CLOUDFRONT \
--region us-east-1 \
--default-action Block={} \
--rules '[{"Name":"AllowNAT","Priority":0,"Action":{"Allow":{}},"Statement":{"IPSetReferenceStatement":{"ARN":"'$IPSET_ARN'"}},"VisibilityConfig":{"SampledRequestsEnabled":true,"CloudWatchMetricsEnabled":true,"MetricName":"AllowNAT"}}]' \
--visibility-config SampledRequestsEnabled=true,CloudWatchMetricsEnabled=true,MetricName="$WEB_ACL_NAME"
}
This is where our earlier jq
surgery becomes critical - attaching
the Web ACL requires updating the entire CloudFront distribution
configuration. And that’s how we finally attach that Web ACL to our
CloudFront distribution (step 4).
DISTRIBUTION_CONFIG=$(run_command $AWS cloudfront get-distribution-config --id $DISTRIBUTION_ID)
# get-distribution-config returns both the config and the ETag that update-distribution requires
ETAG=$(echo "$DISTRIBUTION_CONFIG" | jq -r '.ETag')
# Use jq to inject WebACLId into config JSON
UPDATED_CONFIG=$(echo "$DISTRIBUTION_CONFIG" | jq --arg ACL_ARN "$WEB_ACL_ARN" '.DistributionConfig | .WebACLId=$ACL_ARN')
# Pass updated config back into update-distribution
echo "$UPDATED_CONFIG" > updated-config.json
run_command $AWS cloudfront update-distribution --id $DISTRIBUTION_ID --if-match "$ETAG" --distribution-config file://updated-config.json
At this point, our CloudFront distribution is no longer wide open. It is protected by our WAF Web ACL, restricting access to only traffic coming from our internal VPC NAT gateway.
For many internal-only sites, this simple NAT IP allow list is enough. WAF can handle more complex needs like geo-blocking, rate limiting, or request inspection - but those weren’t necessary for us. Good design isn’t about adding everything; it’s about removing everything that isn’t needed. A simple allow list was also the most secure.
When we set up our bucket, we blocked public access - an S3-wide security setting that prevents any public access to the bucket’s contents. However, this also prevents CloudFront (even with OAC) from accessing S3 objects unless we explicitly allow it. Without this policy update, requests from CloudFront would fail with Access Denied errors.
At this point, we need to allow CloudFront to access our S3
bucket. The update_bucket_policy()
function will apply the policy
shown below.
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": {
"Service": "cloudfront.amazonaws.com"
},
"Action": "s3:GetObject",
"Resource": "arn:aws:s3:::$BUCKET_NAME/*",
"Condition": {
"StringEquals": {
"AWS:SourceArn": "arn:aws:cloudfront::$AWS_ACCOUNT:distribution/$DISTRIBUTION_ID"
}
}
}
]
}
Modern OAC best practice is to use the AWS:SourceArn condition to ensure only requests from your specific CloudFront distribution are allowed.
It’s more secure because it ties bucket access directly to a single distribution ARN, preventing other CloudFront distributions (or bad actors) from accessing your bucket.
"Condition": {
"StringEquals": { "AWS:SourceArn": "arn:aws:cloudfront::$AWS_ACCOUNT:distribution/$DISTRIBUTION_ID" }
}
With this policy in place, we’ve completed the final link in the security chain. Our S3 bucket remains private but can now securely serve content through CloudFront - protected by OAC and WAF.
We are now ready to wrap a bow around these steps in an idempotent Bash script: create the bucket, create the CloudFront distribution (with its jq patch), restrict access with WAF attached via jq and update-distribution, and update the bucket policy.
Each segment of our script is safe to rerun. Each is wrapped in run_command(),
capturing results for later steps and ensuring errors are logged. We
now have a script we can commit and re-use with confidence whenever we
need a secure static site. Together, these steps form a robust,
idempotent deployment pipeline for a secure S3 + CloudFront website -
every time.
You can find the full script here.
A hallmark of a production-ready script is an ‘-h’ option. Oh wait - your script has no help or usage? I’m supposed to RTFC? It ain’t done skippy until it’s done.
Scripts should include the ability to pass options that make it a flexible utility. We may have started out writing a “one-off” but recognizing opportunities to generalize the solution turned this into another reliable tool in our toolbox.
Be careful though - not every one-off needs to be a Swiss Army knife. Just because aspirin is good for a headache doesn’t mean you should take the whole bottle.
Our script now supports the necessary options to create a secure, static website with a custom domain and certificate. We even added the ability to include additional IP addresses for your allow list in addition to the VPC’s public IP.
Now, deploying a private S3-backed CloudFront site is as easy as:
Example:
./s3-static-site.sh -b my-site -t dev -d example.com -c arn:aws:acm:us-east-1:cert-id
Inputs: the bucket name (-b), the Environment tag value used to locate the NAT gateway (-t), the custom domain (-d), and the ACM certificate ARN (-c).
This single command now deploys an entire private website - reliably and repeatably. It only takes a little longer to do it right!
The process of working with ChatGPT to construct a production ready script that creates static websites took many hours. In the end, several lessons were reinforced and some gotchas discovered. Writing this blog itself was a collaborative effort that dissected both the technology and the process used to implement it. Overall, it was a productive, fun and rewarding experience. For those not familiar with ChatGPT or who are afraid to give it a try, I encourage you to explore this amazing tool.
Here are some of the things I took away from this adventure with ChatGPT.
With regard to the technology, some lessons were reinforced, some new knowledge was gained:
Use the update-distribution API call, not associate-web-acl, when adding WAF ACLs to your distribution!
Thanks to ChatGPT for being an ever-present back seat driver on this journey. Real AWS battle scars + AI assistance = better results.
In Part III we wrap it all up as we learn more about how CloudFront and WAF actually protect your website.
This post was drafted with the assistance of ChatGPT, but born from real AWS battle scars.
If you like this content, please leave a comment or consider following me. Thanks.
Welcome to “What’s new on CPAN”, a curated look at last month’s new CPAN uploads for your reading and programming pleasure. Enjoy!
- A tool that works like the which Unix command, but takes a regex as argument

Published on Wednesday 19 February 2025 18:36
After designing a secure static website on AWS using S3, CloudFront, and WAF as discussed in Part I of this series, we turned our focus to automating the deployment process. While AWS offers powerful APIs and tools, we quickly encountered several challenges that required careful consideration and problem-solving. This post explores the primary difficulties we faced and the lessons we learned while automating the provisioning of this infrastructure.
A key challenge when automating AWS resources is managing service dependencies. Our goal was to deploy a secure S3 website fronted by CloudFront, secured with HTTPS (via ACM), and restricted using WAF. Each of these services relies on others, and the deployment sequence is critical:
Missteps in the sequence can result in failed or partial deployments, which can leave your cloud environment in an incomplete state, requiring tedious manual cleanup.
AWS infrastructure often exhibits eventual consistency, meaning that newly created resources might not be immediately available. We specifically encountered this when working with ACM and CloudFront:
Handling these delays requires building polling mechanisms into your automation or using backoff strategies to avoid hitting API limits.
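To illustrate what such a polling mechanism can look like (this is a sketch, not code from our deployment script; the function name is ours), a bounded wait loop on the distribution status is usually enough:

# Poll until a CloudFront distribution reports "Deployed" (sketch only)
wait_for_distribution() {
  local id=$1 status attempt=0
  while [ $attempt -lt 30 ]; do
    status=$(aws cloudfront get-distribution --id "$id" \
      --query 'Distribution.Status' --output text)
    [ "$status" = "Deployed" ] && return 0
    attempt=$((attempt + 1))
    sleep 20
  done
  echo "Timed out waiting for distribution $id" >&2
  return 1
}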
Reliable automation is not simply about executing commands; it requires designing for resilience and repeatability:
Additionally, logging the execution of deployment commands proved to be an unexpected challenge. We developed a run_command
function that captured both stdout and stderr while logging the output to a file. However, getting this function to behave correctly without duplicating output or interfering with the capture of return values required several iterations and refinements. Reliable logging during automation is critical for debugging failures and ensuring transparency when running infrastructure-as-code scripts.
While the AWS CLI and SDKs are robust, they are often verbose and require a deep understanding of each service:
Throughout this process, we found that successful AWS automation hinges on the following principles:
Automating AWS deployments unlocks efficiency and scalability, but it demands precision and robust error handling. Our experience deploying a secure S3 + CloudFront website highlighted common challenges that any AWS practitioner is likely to face. By anticipating these issues and applying resilient practices, teams can build reliable automation pipelines that simplify cloud infrastructure management.
Next up, Part IIb where we build our script for creating our static site.
This post was drafted with the assistance of ChatGPT, but born from real AWS battle scars.
If you like this content, please leave a comment or consider following me. Thanks.
Published on Wednesday 19 February 2025 18:36
While much attention is given to dynamic websites, there are still many uses for the good ol’ static website. Whether for hosting documentation, internal portals, or lightweight applications, static sites remain relevant. In my case, I wanted to host an internal CPAN repository for storing and serving Perl modules. AWS provides all of the necessary components for this task, but choosing the right approach and configuring it securely and automatically can be a challenge.
Whenever you make an architectural decision various approaches are possible. It’s a best practice to document that decision in an Architectural Decision Record (ADR). This type of documentation justifies your design choice, spelling out precisely how each approach either meets or fails to meet functional or non-functional requirements. In the first part of this blog series we’ll discuss the alternatives and why we ended up choosing our CloudFront based approach. This is our ADR.
| | Description | Notes |
|---|---|---|
| 1. | HTTPS website for hosting a CPAN repository | Will be used internally but we would like secure transport |
| 2. | Controlled Access | Can only be accessed from within a private subnet in our VPC |
| 3. | Scalable | Should be able to handle increasing storage without reprovisioning |
| 4. | Low-cost | Ideally less than $10/month |
| 5. | Low-maintenance | No patching or maintenance of application or configurations |
| 6. | Highly available | Should be available 24x7, content should be backed up |
Now that we’ve defined our functional and non-functional requirements let’s look at some approaches we might take in order to create a secure, scalable, low-cost, low-maintenance static website for hosting our CPAN repository.
This solution at first glance seems like the quickest shot on goal. While S3 does offer a static website hosting feature, it doesn’t support HTTPS by default, which is a major security concern and does not match our requirements. Additionally, website-enabled S3 buckets do not support private access controls - they are inherently public if enabled. Had we been able to accept an insecure HTTP site and public access this approach would have been the easiest to implement. If we wanted to accept public access but required secure transport we could have used CloudFront with the website enabled bucket either using CloudFront’s certificate or creating our own custom domain with its own certificate.
Since our goal is to create a private static site, we can however use CloudFront as a secure, caching layer in front of S3. This allows us to enforce HTTPS, control access using Origin Access Control (OAC), and integrate WAF to restrict access to our VPC. More on this approach later…
Pros:
Cons:
Analysis:
While using an S3 website-enabled bucket is the easiest way to host static content, it fails to meet security and privacy requirements due to public access and lack of HTTPS support.
Perhaps the obvious approach to hosting a private static site is to deploy a dedicated Apache or Nginx web server on an EC2 instance. This method involves setting up a lightweight Linux instance, configuring the web server, and implementing a secure upload mechanism to deploy new content.
Pros:
Cons:
Analysis:
Using a dedicated web server is a viable alternative when additional flexibility is needed, but it comes with added maintenance and cost considerations. Given our requirements for a low-maintenance, cost-effective, and scalable solution, this may not be the best approach.
A common approach I have used to securely serve static content from an S3 bucket is to use an internal proxy server (such as Nginx or Apache) running on an EC2 instance within a private VPC. In fact, this is the approach I have used to create my own private yum repository, so I know it would work effectively for my CPAN repository. The proxy server retrieves content from an S3 bucket via a VPC endpoint, ensuring that traffic never leaves AWS’s internal network. This approach requires managing an EC2 instance, handling security updates, and scaling considerations. Let’s look at the cost of an EC2 based solution.
The following cost estimates are based on AWS pricing for us-east-1:
| Item | Pricing |
|---|---|
| Instance type: t4g.nano (cheapest ARM-based instance) | Hourly cost: \$0.0052/hour |
| Monthly usage: 730 hours (assuming 24/7 uptime) | (0.0052 x 730 = \$3.80/month) |
Pros:
Cons:
Analysis:
If predictable costs and full server control are priorities, EC2 may be preferable. However, this solution requires maintenance and may not scale with heavy traffic. Moreover, to create an HA solution would require additional AWS resources.
As alluded to before, CloudFront + S3 might fit the bill. To create a secure, scalable, and cost-effective private static website, we chose to use Amazon S3 with CloudFront (sprinkling in a little AWS WAF for good measure). This architecture allows us to store our static assets in an S3 bucket while CloudFront acts as a caching and security layer in front of it. Unlike enabling public S3 static website hosting, this approach provides HTTPS support, better scalability, and fine-grained access control.
CloudFront integrates with Origin Access Control (OAC), ensuring that the S3 bucket only allows access from CloudFront and not directly from the internet. This eliminates the risk of unintended public exposure while still allowing authorized users to access content. Additionally, AWS WAF (Web Application Firewall) allows us to restrict access to only specific IP ranges or VPCs, adding another layer of security.
Let’s look at costs:
| Item | Cost | Capacity | Total |
|---|---|---|---|
| Data Transfer Out | First 10TB is \$0.085 per GB | 25GB/month of traffic | Cost for 25GB: (25 x 0.085 = \$2.13) |
| HTTP Requests | \$0.0000002 per request | 250,000 requests/month | Cost for requests: (250,000 x 0.0000002 = \$0.05) |

Total CloudFront Cost: \$2.13 (Data Transfer) + \$0.05 (Requests) = \$2.18/month
Pros:
Cons:
Analysis:
And the winner is…CloudFront + S3!
Using just a website-enabled S3 bucket fails to meet the basic requirements, so let’s eliminate that solution right off the bat. If predictable costs and full server control are priorities, using an EC2 instance either as a proxy or a full-blown webserver may be preferable. However, for a low-maintenance, auto-scaling solution, CloudFront + S3 is the superior choice. EC2 is slightly more expensive but avoids CloudFront’s external traffic costs. Overall, our winning approach is ideal because it scales automatically, reduces operational overhead, and provides strong security mechanisms without requiring a dedicated EC2 instance to serve content.
Now that we have our agreed upon approach (the “what”) and documented our “architectural decision”, it’s time to discuss the “how”. How should we go about constructing our project? Many engineers would default to Terraform for this type of automation, but we had specific reasons for thinking this through and looking at a different approach. We’d like:
While Terraform is a popular tool for infrastructure automation, it introduces several challenges for this specific project. Here’s why we opted for a Bash script over Terraform:
State Management Complexity
Terraform relies on state files to track infrastructure resources, which introduces complexity when running and re-running deployments. State corruption or mismanagement can cause inconsistencies, making it harder to ensure a seamless idempotent deployment.
Slower Iteration and Debugging
Making changes in Terraform requires updating state, planning, and applying configurations. In contrast, Bash scripts execute AWS CLI commands immediately, allowing for rapid testing and debugging without the need for state synchronization.
Limited Control Over Execution Order
Terraform follows a declarative approach, meaning it determines execution order based on dependencies. This can be problematic when AWS services have eventual consistency issues, requiring retries or specific sequencing that Terraform does not handle well natively.
Overhead for a Simple, Self-Contained Deployment
For a relatively straightforward deployment like a private static website, Terraform introduces unnecessary complexity. A lightweight Bash script using AWS CLI is more portable, requires fewer dependencies, and avoids managing an external Terraform state backend.
Handling AWS API Throttling
AWS imposes API rate limits, and handling these properly requires implementing retry logic. While Terraform has some built-in retries, it is not as flexible as a custom retry mechanism in a Bash script, which can incorporate exponential backoff or manual intervention if needed.
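To make that comparison concrete, this is the kind of retry helper we mean - a sketch only, with an illustrative function name, not part of the deployment script itself:

# Retry a command with exponential backoff (sketch only)
retry_with_backoff() {
  local max_attempts=$1; shift
  local delay=2 attempt=1
  until "$@"; do
    if [ $attempt -ge $max_attempts ]; then
      echo "Giving up after $attempt attempts: $*" >&2
      return 1
    fi
    echo "Attempt $attempt failed; retrying in ${delay}s..." >&2
    sleep $delay
    delay=$((delay * 2))
    attempt=$((attempt + 1))
  done
}
# Example: retry_with_backoff 5 aws s3api head-bucket --bucket "$BUCKET_NAME"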
Less Direct Logging and Error Handling
Terraform’s logs require additional parsing and interpretation, whereas a Bash script can log every AWS CLI command execution in a simple and structured format. This makes troubleshooting easier, especially when dealing with intermittent AWS errors.
Although Bash was the right choice for this project, Terraform is still useful for more complex infrastructure where:
For our case, where the goal was quick, idempotent, and self-contained automation, Bash scripting provided a simpler and more effective approach. This approach gave us the best of both worlds - automation without complexity, while still ensuring idempotency and security.
This post was drafted with the assistance of ChatGPT, but born from real AWS battle scars.
If you like this content, please leave a comment or consider following me. Thanks.
Published on Wednesday 19 February 2025 18:35
This is the last in our three part series where we discuss the creation of a private, secure, static website using Amazon S3 and CloudFront.
Amazon S3 and CloudFront are powerful tools for hosting static websites, but configuring them securely can be surprisingly confusing - even for experienced AWS users. After implementing this setup for my own use, I discovered a few nuances that others often stumble over, particularly around CloudFront access and traffic routing from VPC environments. This post aims to clarify these points and highlight a potential gap in AWS’s offering.
The typical secure setup for hosting a static website using S3 and CloudFront looks like this: a private S3 bucket with public access blocked, a CloudFront distribution that reads from the bucket through Origin Access Control, HTTPS via an ACM certificate, and a WAF Web ACL restricting who can reach the distribution.
This setup ensures that even if someone discovers your S3 bucket URL, they won’t be able to retrieve content directly. All access is routed securely through CloudFront.
For many AWS users, especially those running workloads inside a VPC, the first head-scratcher comes when internal clients access the CloudFront-hosted website. You might notice that this traffic requires a NAT gateway, and you’re left wondering:
Here’s the key realization:
CloudFront is a public-facing service. Even when your CloudFront distribution is serving content from a private S3 bucket, your VPC clients are accessing CloudFront through its public endpoints.
This distinction is not immediately obvious, and it can be surprising to see internal traffic going through a NAT gateway and showing up with a public IP.
For my use case, I wasn’t interested in CloudFront’s global caching or latency improvements; I simply wanted a secure, private website hosted on S3, with a custom domain and HTTPS. AWS currently lacks a streamlined solution for this. A product offering like “S3 Secure Website Hosting” could fill this gap by combining:
To restrict access to your CloudFront-hosted site, you can use AWS WAF with an IPSet containing your NAT gateway’s public IP address. This allows only internal VPC clients (routing through the NAT) to access the website while blocking everyone else.
The S3 + CloudFront setup is robust and secure - once you understand the routing and public/private distinction. However, AWS could better serve users needing simple, secure internal websites by acknowledging this use case and providing a more streamlined solution.
Until then, understanding these nuances allows you to confidently deploy secure S3-backed websites without surprises.
This post was drafted with the assistance of ChatGPT, but born from real AWS battle scars.
If you like this content, please leave a comment or consider following me. Thanks.
Published on Wednesday 19 February 2025 11:59
Ever locked yourself out of your own S3 bucket? That’s like asking a golfer if he’s ever landed in a bunker. We’ve all been there.
Scenario:
A sudden power outage knocks out your internet. When service resumes, your ISP has assigned you a new IP address. Suddenly, the S3 bucket you so carefully protected with that fancy bucket policy that restricts access by IP… is protecting itself from you. Nice work.
And here’s the kicker, you can’t change the policy because…you can’t access the bucket! Time to panic? Read on…
This post will cover:
S3 bucket policies are powerful and absolute. A common security pattern is to restrict access to a trusted IP range, often your home or office IP. That’s fine, but what happens when those IPs change without prior notice?
That’s the power outage scenario in a nutshell.
Suddenly (and without warning), I couldn’t access my own bucket. Worse, there was no easy way back in because the bucket policy itself was blocking my attempts to update it. Whether you go to the console or drop to a command line, you’re still hitting that same brick wall—your IP isn’t in the allow list.
At that point, you have two options, neither of which you want to rely on in a pinch:
The root account is a last resort (as it should be), and AWS support can take time you don’t have.
Once you regain access to the bucket, it’s time to build a policy that includes an emergency backdoor from a trusted environment. We’ll call that the “safe room”. Your safe room is your AWS VPC.
While your home IP might change with the weather, your VPC is rock solid. If you allow access from within your VPC, you always have a way to manage your bucket policy.
Even if you rarely touch an EC2 instance, having that backdoor in your pocket can be the difference between a quick fix and a day-long support ticket.
A script to implement our safe room approach must at least: take your current public IP and the bucket name, figure out (or accept) the VPC to trust, build a bucket policy that allows access from both, and apply it - ideally with a dry-run option so you can inspect the policy first.
This script helps you recover from lockouts and prevents future ones by ensuring your VPC is always a reliable access point.
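To make the idea concrete, here is a sketch of the kind of policy statement such a script aims for (illustrative only; the variable names and the exact JSON our script generates may differ). Because conditions inside a single statement are ANDed, one Deny that exempts both your home IP and your VPC does the job:

# Illustrative safe-room policy: deny requests that come from neither
# the home IP nor the trusted VPC (VPC requests arrive via an S3 VPC endpoint)
cat > safe-room-policy.json <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "DenyUnlessHomeIpOrVpc",
      "Effect": "Deny",
      "Principal": "*",
      "Action": "s3:*",
      "Resource": [
        "arn:aws:s3:::${BUCKET_NAME}",
        "arn:aws:s3:::${BUCKET_NAME}/*"
      ],
      "Condition": {
        "NotIpAddress": { "aws:SourceIp": "${HOME_IP}/32" },
        "StringNotEquals": { "aws:SourceVpc": "${VPC_ID}" }
      }
    }
  ]
}
EOF
aws s3api put-bucket-policy --bucket "$BUCKET_NAME" --policy file://safe-room-policy.json

Note that aws:SourceVpc matches requests that reach S3 through a VPC endpoint; if your instances reach S3 over a NAT gateway instead, add the NAT’s public IP to the allow list the same way as the home IP.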
Our script is light on dependencies, but you will need to have curl and the aws CLI installed on your EC2.
A typical use of the command requires only your new IP address and the
bucket name. The aws CLI will try credentials from the environment,
your ~/.aws config
, or an instance profile - so you only need -p
if you
want to specify a different profile. Here’s the minimum you’d need to run the
command if you are executing the script in your VPC:
./s3-bucket-unlock.sh -i <your-home-ip> -b <bucket-name>
Options:
- -i  Your current public IP address (e.g., your home IP).
- -b  The S3 bucket name.
- -v  (Optional) VPC ID; auto-detected if not provided.
- -p  (Optional) AWS CLI profile (defaults to $AWS_PROFILE or default).
- -n  Dry run (show policy, do not apply).
./s3-bucket-unlock.sh -i 203.0.113.25 -b my-bucket -n
The dry run option lets you preview the generated policy before making any changes—a good habit when working with S3 policies.
Someone once said that we learn more from our failures than from our successes. At this rate I should be on the AWS support team soon…lol. Well, I probably need a lot more mistakes under my belt before they hand me a badge. In any event, ahem, we learned something from our power outage. Stuff happens - best be prepared. Here’s what this experience reinforced:
Sometimes it’s not a mistake - it’s a failure to realize how fragile access is. My home IP was fine…until it wasn’t.
Our script will help us apply a quick fix. The process of writing it was a reminder that security balances restrictions with practical escape hatches.
Next time you set an IP-based bucket policy, ask yourself: what happens when that IP changes, and will I still have a way in?
Thanks to ChatGPT for being an invaluable backseat driver on this journey. Real AWS battle scars + AI assistance = better results.
Published by Robin on Wednesday 19 February 2025 14:47
Discover Expert-Recommended Perl Books and Practical Tips to Begin Your Programming Journey
Published by /u/briandfoy on Wednesday 19 February 2025 12:31
Published by /u/prouleau001 on Tuesday 18 February 2025 14:27
Hi everyone,
As part of my learning of Perl I would like to use tools to analyze Perl code and render documentation for it, in a way that Doxygen analyzes C and C++ source code.
I found Doxygen::Filter::Perl and will try to experiment with it to render documentation for Perl code written a long time ago that I have to maintain.
Is this what people use? Are there other tools? What do you use?
Published by /u/briandfoy on Tuesday 18 February 2025 12:31
Published by Grinnz on Tuesday 18 February 2025 08:14
Originally published at dev.to
In a previous blog post, I explored the modern way to write CGI scripts using frameworks like Mojolicious. But as pointed out in comments, despite the many benefits, there is one critical problem: when you actually need to deploy to a regular CGI server, where the scripts will be loaded each time and not persisted, frameworks designed for persistent applications add lots of overhead to each request.
CGI scripts have historically been written using the CGI module (or even more ancient libraries). But this module is bulky, crufty, and has serious design issues that led to it being removed from Perl core.
Enter CGI::Tiny. It is built for one thing only: serving the CGI protocol. In most cases, frameworks are still the right answer, but in the case of scripts that are forced to run under the actual CGI protocol (such as shared web hosting), or when you want to just drop in CGI scripts with no need to scale, CGI::Tiny provides a modern alternative to CGI.pm. You can explore the interface differences from CGI.pm or suggested ways to extend CGI::Tiny scripts.
So without further ado, here is the equivalent CGI::Tiny script to my previous blog post's examples:
#!/usr/bin/env perl
use strict;
use warnings;
use CGI::Tiny;
cgi {
my $cgi = $_;
my $input = $cgi->param('input');
$cgi->render(json => {output => uc $input});
};
Published by Gathering Insight on Tuesday 18 February 2025 07:22
Hello, this might be a strange thing to do, but is there a way to download the data behind fastapi.metacpan.org? I want to download everything so I don't clobber the api. In particular, I want to analyze dependencies, activity (of various kinds), number of open issues, etc. I realize a lot of that would be on github these days but not all.
(Why do I need such data? I have a side project idea to measure the "health" of a programming language by analyzing the liveliness of its package/module ecosystem (cpan for perl, npm for node, etc)).
Published by Gabor Szabo on Monday 17 February 2025 07:31
Originally published at Perl Weekly 708
Hi there,
There are many interpretations of what it means to grow. I am using the term for new features. We get lots of improvements and new features with every release of Perl. In v5.38, the experimental class feature was rolled out in core. The next maintenance release, Perl v5.40, added the new field attribute :reader along with many other improvements. The next thing we all waited for was the field attribute :writer. Luckily it is already part of the development release v5.41.7. I made this gist demonstrating the core changes.
If you are new to the Perl release policy, there are two types of release, i.e. maintenance and development. Even numbers are reserved for maintenance releases, e.g. v5.38, v5.40, whereas odd numbers are for development releases, e.g. v5.39, v5.41. The maintenance releases are the ones considered production ready.
If you are interested in the release history then please check out the version history page. I found an interesting proposal with regard to the version number.
Recently, I got to try the different facets of parallel and concurrent programming. Please find below the list covered so far.
Enjoy rest of the newsletter.
--
Your editor: Mohammad Sajid Anwar.
nicsell is now supporting German Perl Workshop. nicsell is a domain backorder service, also known as a dropcatcher, which allows you to bid on a large number of domains that are currently being deleted.
This is a continuation of a series of articles about how to write XS libraries that are more convenient and foolproof for the Perl users, while not blocking them from using the actual C API.
The Weekly Challenge by Mohammad Sajid Anwar will help you step out of your comfort-zone. You can even win prize money of $50 by participating in the weekly challenge. We pick one champion at the end of the month from among all of the contributors during the month, thanks to the sponsor Lance Wicks.
Welcome to a new week with a couple of fun tasks "Mind Gap" and "Min Diff". If you are new to the weekly challenge then why not join us and have fun every week. For more information, please read the FAQ.
Enjoy a quick recap of last week's contributions by Team PWC dealing with the "Count Common" and "Decode XOR" tasks in Perl and Raku. You will find plenty of solutions to keep you busy.
Apart from Perl magic, the CPAN gem Data::Show is used as well. Cool, keep up the great work.
Nice bunch of one-liners in Raku. Raku Rocks!!!
It is one post where we get Perl and Raku magic together. On top of that, we have a detailed discussion. Incredible.
Compact solutions in Perl and PDL. New to PDL? You must check it out.
Welcome back with yet another quality contribution in Raku. Great work.
The post reminded me of the good old Truth Table, very handy to cover the test cases. Thanks for sharing.
Lots of mathematical magic shared in this week's contribution. Bitwise operations are always tricky. Well done.
The great, detailed treatment of the XOR operation is very interesting, and definitely not to be missed. Thanks for the contributions.
A simple and straightforward approach makes it so easy to decode. Nice work, thanks for sharing.
Clever use of sets in Raku and Python, ending up as a one-liner. Keep up the great work.
My personal favourite, the PostScript one-liner, is the USP of the post. Highly recommended.
Python makes me fall in love again and again. Incredibly powerful and easy to follow. Well done and keep it up.
Great CPAN modules released last week.
Virtual event
Paris, France
Virtual event
Munich, Germany
Greenville, South Carolina, USA
You joined the Perl Weekly to get weekly e-mails about the Perl programming language and related topics.
Want to see more? See the archives of all the issues.
Not yet subscribed to the newsletter? Join us free of charge!
(C) Copyright Gabor Szabo
The articles are copyright the respective authors.
Published on Sunday 16 February 2025 09:35
Published by Simon Green on Sunday 16 February 2025 04:23
Each week Mohammad S. Anwar sends out The Weekly Challenge, a chance for all of us to come up with solutions to two weekly tasks. My solutions are written in Python first, and then converted to Perl. It's a great way for us all to practice some coding.
You are given two arrays of strings, @str1 and @str2.
Write a script to return the count of common strings in both arrays.
The tasks and examples don't mention what to do if a string appears more than once in both arrays. I've made the assumption that we only need to return it once.
For the command line input, I take two strings that are space separated as shown in the example.
In Python this is a one-liner. I turn the lists into sets (which only have unique values) and take the length of the intersection of these two sets.
def count_common(str1: list, str2: list) -> int:
return len(set(str1) & set(str2))
Perl does not have sets or intersections built in. For the Perl solution, I turn both strings into a hash with the key being the strings. I then iterate through the keys of the first hash to see if they appear in the second hash. If they do, I increment the count
variable.
sub main (@inputs) {
my %str1 = map { $_, 1 } split( /\s+/, $inputs[0] );
my %str2 = map { $_, 1 } split( /\s+/, $inputs[1] );
my $count = 0;
foreach my $str ( keys %str1 ) {
$count++ if exists $str2{$str};
}
say $count;
}
$ ./ch-1.py "perl weekly challenge" "raku weekly challenge"
2
$ ./ch-1.py "perl raku java" "python java"
1
$ ./ch-1.py "guest contribution" "fun weekly challenge"
0
You are given an encoded array and an initial integer.
Write a script to find the original array that produced the given encoded array. It was encoded such that encoded[i] = orig[i] XOR orig[i + 1]
.
This is relatively straightforward. For the command line input, I take the last value as the initial
integer, and the rest as the encoded
integers.
For this task, I create the orig
list (array in Perl) with the initial
value. I then iterate over each item in the encoded
list and take the exclusive-or of it and the last value in the orig
list.
def decode_xor(encoded: list, initial: int) -> list:
orig = [initial]
for i in encoded:
orig.append(i ^ orig[-1])
return orig
$ ./ch-2.py 1 2 3 1
[1, 0, 2, 1]
$ ./ch-2.py 6 2 7 3 4
[4, 2, 0, 7, 4]
Published by breadwild on Sunday 16 February 2025 01:02
I have read all the "Similar questions" posts related to my question, but none have been helpful other than to suggest a subroutine might not have returned a value.
This is part of a proprietary CMS: create pages, add content, etc. that a number of websites on the same server are using. Only one of the sites returns this error, but only for one of the features.
First, the error:
Bareword "Common::dbconnect" not allowed while "strict subs" in use
Now the code:
A database object is created in Super.pm:
use Common;
my ( $dbh ) = Common::dbconnect;
Common.pm has the dbconnect
subroutine:
sub dbconnect {
my ($host, $database, $username, $password) = getvariables();
use DBI qw(:sql_types);
$dbh_cache->{member} = DBI->connect("DBI:mysql:".$database.":".
$host,
$username,
$password,
{ RaiseError => 1 },) or die "Connect failed: $DBI::errstr\n";
return ($dbh_cache->{member});
}
Debugging this has been frustrating:
What am I missing? What should I be trying?
UPDATE:
I should have added that this is at the top of Common.pm (which contains dbconnect
:
use vars qw( @EXPORT );
@EXPORT = qw( checksession dbconnect);
And like I said, all other sites using the same exact code and calling the same exact methods work without error. The better question might be "how do I debug this?"
Published by Unknown on Saturday 15 February 2025 22:54
Published by Welcho on Friday 14 February 2025 21:23
I need to upload a file to a server using SFTP.
When I run the script from the command line, it works perfectly. However, when executed through Apache, it hangs indefinitely and never returns a response.
I'm using:
For some reason it is imperative to use module "Net::SFTP::Foreign".
It’s clearly an Apache-related issue, but after days of troubleshooting, I haven't made any progress. What can I try next?
Here’s my code:
#!/usr/bin/perl
use strict;
use warnings;
use Net::SFTP::Foreign;
my $host = 'host';
my $user = 'user';
my $remote_path = '/var/www/html/img/';
my $local_file = 'file.txt';
print "Content-type: text/html\n\n";
my $sftp = Net::SFTP::Foreign->new(
host=> $host,
user=> $user,
ssh_cmd => '"C:\\Program Files\\PuTTY\\plink.exe"',
more => ['-i', 'D:\\OneDriveIQUE\\trabajo\\localhost\\administrator\\temp\\sftp_key_2.ppk'],
stderr_discard => 1,
);
$sftp->error and die "Error de conexión: " . $sftp->error;
print "Conectado a $host\n";
$sftp->put($local_file, "$remote_path$local_file") or die "Error al subir el archivo: " . $sftp->error;
print "Archivo $local_file subido correctamente a $remote_path\n";
my @files = $sftp->ls($remote_path, names_only => 1);
print "Archivos en $remote_path:\n", join("\n", @files), "\n";
exit;
Published by Alexander Gelbukh on Thursday 13 February 2025 20:03
I can't install a Perl module. I have it installed and working OK on another laptop. Can I copy the whole installation from that laptop? What I tried:
I copied C:\Perl64 from the old laptop and added it to the Path variable. Now perl --version successfully presents itself as ActiveState.
However, it still looks for modules under c:\Strawberry, not under c:\Perl64\lib, where Image::Magick is really present (copied from the old laptop). Is there an environment variable to control where perl will look for modules?
Is there anything else that I should copy from my old laptop?
I would prefer simple solutions, without compiling things locally.
And without changing my Perl code itself
Both systems (new - target and old - source) are Windows-10
Very strangely, perl -V says:
@INC:
C:/Perl64/site/lib
C:/Perl64/lib
And still it does not find Image::Magick, which really is present in:
C:\Perl64\site\lib\Image\Magick.pm
Strangely, the error message says:
Can't locate Image/Magick.pm in @INC (you may need to install the Image::Magick module) (@INC entries checked: C:/Strawberry/perl/site/lib C:/Strawberry/perl/vendor/lib C:/Strawberry/perl/lib)
Note that these @INC entries do not correspond to the value of @INC reported by perl -V
(see above)
Interestingly, on the old (source) laptop, perl -V reports the same as on the new (target) one:
@INC:
C:/Perl64/site/lib
C:/Perl64/lib
And there Image::Magick works OK, with the same tree of c:\Perl64.

Published by Zilore Mumba on Thursday 13 February 2025 19:35
I have made a simple version of a table using Text::Table::More
as given here. I have been able to include a title by having the first row span all the columns.
I tried to remove the top border with "top_border => 0" but it does not work. Can this be done?
Also, the guide in this module (which is unclear to me) suggests that one can color rows. Is this doable? My code is below.
#!perl
use 5.010001;
use strict;
use warnings;
use Text::Table::More qw/generate_table/;
my $rows = [
# header row
[{text=>"Upcoming Program Achievements in Entertainment", align => "left", colspan=>5}],
# first data row
["Year",
"Comedy",
"Drama",
"Variety",
"Lead Comedy Actor"],
# second data row
[1962,
"The Bob Newhart Show (NBC)",
"The Defenders (CBS)",
"The Garry Moore Show (CBS)",
"E. G. Marshall (CBS)"],
# third data row
[1963,
"The Dick Van Dyke Show (CBS)",
"The Dick Van Dyke Show (CBS)",
"The Andy Williams Show (NBC)",
"The Andy Williams Show (NBC)"],
# fourth data row
[1964,
"The Danny Kaye Show (CBS)",
"Dick Van Dyke (CBS)",
"Mary Tyler Moore (CBS)",
"The Andy Williams Show (NBC)"],
];
binmode STDOUT, "utf8";
print generate_table(
rows => $rows, # required
to_border => 0 #top border was put here, this is what I said doesn't work
header_row => 1, # optional, default 0
separate_rows => 1, # optional, default 0
border_style => "UTF8::SingleLineBoldHeader",
row_attrs => [
[0, {align=>'middle', bottom_border=>1}],
],
col_attrs => [[2, {valign=>'middle'}],
],
);
I just needed a few rows of UUIDs in a column of a spreadsheet, more for esthetics than anything else. uuidgen
to the rescue.
At the time I didn't realize that uuidgen natively supports outputting multiple ids, like so: uuidgen -C 8
The truly lazy path would have been to read the fine uuidgen manual.
Alas, supposing I needed to make multiple calls to uuidgen
, I went with a Perl
one-liner with a loop, as I couldn't recall the Bash
loop syntax.
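(For the record, the Bash loop I couldn't remember is a one-liner too - something like the following, though it's not what I ended up running:)

for i in {1..5}; do uuidgen; done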
Here comes the laziness... I didn't want to write something like this:
perl -e 'print `uuidgen` for @{[1..5]}';
I'm not so fond of perl's de-reference syntax these days; also, that array reference/range was giving me "the ick", as my kids would say. I needed something lazier, cleaner. I wondered if there were any default/exported arrays available to me that don't have too many elements to them.... Ah, I know!
$ perl -e 'print `uuidgen` for @INC'
d2c9c4b9-2126-4eda-ba52-ca30fdc55db0
eac4f86a-04eb-4c1a-aba1-fb1fa5c7dcda
2a2c416c-00bc-46d8-b7ce-c639f73cef26
4cc052cc-6423-4420-bbf5-595a7ad28c51
0bb78a2e-f4e9-44cd-80ae-e463197398f5
37728b6c-69dc-4669-99e7-2814b0d5e2a6
5acf78b2-6938-465b-ad8a-3bf29037e749
87d6d4ef-e85c-40bb-b3c2-acf9dc88f3e1
This is more a case of (ab)using a variable for an unintended purpose, but today it got the job done, even if it wasn't the most lazy approach. Hubris? Maybe.
Published by Max Maischein on Tuesday 11 February 2025 13:41
You bid, we catch!
nicsell is a domain backorder service, also known as a dropcatcher, which allows you to bid on a large number of expiring domains that are currently in the deletion phase.
Starting from a low opening bid of just 10 €, you can take part in our auctions and have the chance to get your desired domain.
By the way: To strengthen our team in Osnabrück, we are looking for dedicated Perl developers (m/f/d). If you are interested, we look forward to receiving your application!
This is a continuation of a series of articles about how to write XS libraries that are more convenient and foolproof for the Perl users, while not blocking them from using the actual C API.
If you spot anything wrong, or want to contribute suggestions, open an issue at the GitHub repo
One frequent and difficult problem you will encounter when writing XS wrappers around a C library is what to do when the C library exposes a struct which the user needs to see, but the lifespan of that struct is controlled by something other than the reference the user is holding onto.
For example, consider the Display and Screen structs of libX11. When you connect to an X server, the library gives you a Display pointer. Within that Display struct are Screen structs. Some of the X11 API uses those Screen pointers as parameters, and you need to expose them in the Perl interface. But, if you call XCloseDisplay on the Display pointer those Screen structs get freed, and now accessing them will crash the program. The Perl user might still be holding onto a X11::Xlib::Screen Perl object, so how do you stop them from crashing the program when they check an attribute of that object?
For the case of X11 Screens there was an easy workaround: The Screen structs are
numbered, and a pair of (Display, ScreenNumber)
can refer to the Screen struct without
needing the pointer to it. Because the Perl Screen object references the
Perl Display object, the methods of Screen can check whether the display is closed
before resolving the pointer to a Screen struct, and die with a useful message instead
of a crash.
From another perspective, you can think of them like symlinks. You reference one Perl object which has control over its own struct’s lifecycle and then a relative path from that struct to whatever internal data structure you’re wrapping with the current object.
While this sounds like a quick solution, there’s one other detail to worry about: cyclical references. If the sub-object is referring to the parent object, and the parent refers to a collection of sub-objects, Perl will never free these objects. For the case of X11 Screens, the list of screen structs is known at connection-time and is almost always just one Screen, and doesn’t change at runtime. [1] An easy solution for a case like this is to have a strong reference from Display to Screen, and weak references (Scalar::Util::weaken) from Screen to Display, and create all the Screen objects as soon as the Display is connected.
1) this API is from an era before people thought about connecting new monitors while the computer was powered up, and these days can more accurately be thought of as a list of graphics cards rather than “screens”
If the list of Screens were dynamic, or if I just didn’t want to allocate them all upfront for some reason, another approach is to wrap the C structs on demand. You could literally create a new wrapper object each time they access the struct, but you’d probably want to return the same Perl object if they access two references to the same struct. One way to accomplish this is with a cache of weak references.
In Perl it would look like:
package MainObject {
use v5.36;
use Moo;
use Scalar::Util 'weaken';
has is_closed => ( is => 'rwp' );
# MainObject reaches out to invalidate all the SubObjects
sub close($self) {
...
$self->_set_is_closed(1);
}
has _subobject_cache => ( is => 'rw', default => sub {+{}} );
sub _new_cached_subobject($self, $ptr) {
my $obj= $self->_subobject_cache->{$ptr};
unless (defined $obj) {
$obj= SubObject->new(main_ref => $self, data_ptr => $ptr);
weaken($self->_subobject_cache->{$ptr}= $obj);
}
return $obj;
}
sub find_subobject($self, $search_key) {
my $data_ptr= _xs_find_subobject($self, $search_key);
return $self->_new_cached_subobject($data_ptr);
}
}
package SubObject {
use v5.36;
use Moo;
use Carp 'croak';
has main_ref => ( is => 'ro' );
has data_ptr => ( is => 'ro' );
sub method1($self) {
# If main is closed, stop all method calls
croak "Object is expired"
if $self->main_ref->is_closed;
... # operate on data_ptr
}
sub method2($self) {
# If main is closed, stop all method calls
croak "Object is expired"
if $self->main_ref->is_closed;
... # operate on data_ptr
}
}
Now, the caller of find_subobject gets a SubObject, and it has a strong reference to MainObject, and MainObject’s cache holds a weak reference to the SubObject. If we call that same method again with the same search key while the first SubObject still exists, we get the same Perl object back. As long as the user holds onto the SubObject, the MainObject won’t expire, but the SubObjects can get garbage collected as soon as they aren’t needed.
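A quick usage sketch of what that buys you (assuming the hypothetical _xs_find_subobject above resolves the same search key to the same C pointer each time):

use v5.36;
use Scalar::Util 'refaddr';

my $main = MainObject->new;
my $a    = $main->find_subobject('foo');
my $b    = $main->find_subobject('foo');

# Same C pointer, same Perl wrapper: the weak cache hands back the live object.
say refaddr($a) == refaddr($b) ? 'same wrapper' : 'different wrappers';

undef $a;
undef $b;    # last strong refs gone, so the cache's weak entry becomes undef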
One downside of this exact design is that every method of SubObject which uses data_ptr will need to first check that main_ref isn’t closed (as shown in method1). If you have frequent method calls and you’d like them to be a little more efficient, here’s an alternate version of the same idea:
package MainObject {
...
# MainObject reaches out to invalidate all the SubObjects
sub close($self) {
...
$_->data_ptr(undef)
for grep defined, values $self->_subobject_cache->%*;
}
...
}
package SubObject {
...
sub method1($self) {
my $data_ptr= $self->data_ptr
// croak "SubObject belongs to a closed MainObject";
... # operate on data_ptr
}
sub method2($self) {
my $data_ptr= $self->data_ptr
// croak "SubObject belongs to a closed MainObject";
... # operate on data_ptr
}
...
}
In this pattern, the sub-object doesn’t need to consult anything other than its own pointer before getting to work (note that data_ptr now needs to be a writable attribute, since close clears it), which comes in really handy with the XS typemap. The sub-object also doesn’t need a reference to the main object (unless you want one to prevent the main object from getting freed while a user holds SubObjects), so this design is a little more flexible. The only downside is that closing the main object takes a little extra time as it invalidates all of the SubObject instances, but in XS that time won’t be noticeable.
So, what does the code above look like in XS? Here we go…
/* First, the API for your internal structs */
struct MainObject_info {
SomeLib_MainObject *obj;
HV *wrapper;
HV *subobj_cache;
bool is_closed;
};
struct SubObject_info {
SomeLib_SubObject *obj;
SomeLib_MainObject *parent;
HV *wrapper;
};
struct MainObject_info*
MainObject_info_create(HV *wrapper) {
struct MainObject_info *info= NULL;
Newxz(info, 1, struct MainObject_info);
info->wrapper= wrapper;
return info;
}
void MainObject_info_close(struct MainObject_info* info) {
if (info->is_closed) return;
/* All SubObject instances are about to be invalid */
if (info->subobj_cache) {
HE *pos;
hv_iterinit(info->subobj_cache);
while (pos= hv_iternext(info->subobj_cache)) {
/* each value of the hash is a weak reference,
which might have become undef at some point */
SV *subobj_ref= hv_iterval(info->subobj_cache, pos);
if (subobj_ref && SvROK(subobj_ref)) {
struct SubObject_info *s_info =
SubObject_from_magic(SvRV(subobj_ref), 0);
if (s_info) {
/* it's an internal piece of the parent, so
no need to call a destructor here */
s_info->obj= NULL;
s_info->parent= NULL;
}
}
}
}
SomeLib_MainObject_close(info->obj);
info->obj= NULL;
info->is_closed= true;
}
void MainObject_info_free(struct MainObject_info* info) {
if (info->obj)
MainObject_info_close(info);
if (info->subobj_cache)
SvREFCNT_dec((SV*) info->subobj_cache);
/* The lifespan of 'wrapper' is handled by perl,
* probably in the process of getting freed right now.
* All we need to do is delete our struct.
*/
Safefree(info);
}
The gist here is that MainObject has a set of all SubObject wrappers which are still held by the Perl script, and during “close” (which, in this hypothetical library, invalidates all SubObject pointers) it can iterate that set and mark each wrapper as being invalid.
The Magic setup for MainObject goes just like in the previous article:
static int MainObject_magic_free(pTHX_ SV* sv, MAGIC* mg) {
MainObject_info_free((struct MainObject_info*) mg->mg_ptr);
  return 0;
}
static MGVTBL MainObject_magic_vtbl = {
...
};
struct MainObject_info *
MainObject_from_magic(SV *objref, int flags) {
...
}
The destructor for the magic will call the destructor for the info struct. The “from_magic” function instantiates the magic according to ‘flags’, and so on.
Now, the Magic handling for SubObject works a little differently. We don’t get to decide when to create or destroy SubObject, we just encounter these pointers in the return values of the C library functions, and need to wrap them in order to show them to the perl script.
/* Return a new ref to an existing wrapper, or
* create a new wrapper and cache it.
*/
SV * SubObject_wrap(SomeLib_SubObject *sub_obj) {
/* If your library doesn't have a way to get the main object
* from the sub object, this gets more complicated.
*/
SomeLib_MainObject *main_obj= SomeLib_SubObject_get_main(sub_obj);
SV **subobj_entry= NULL;
struct SubObject_info *s_info= NULL;
HV *wrapper= NULL;
SV *objref= NULL;
MAGIC *magic;
  /* lazy-allocate the cache */
  if (!main_obj->subobj_cache)
    main_obj->subobj_cache= newHV();
  /* See if the SubObject has already been wrapped.
   * Use the pointer as the key
   */
  subobj_entry= hv_fetch(
    main_obj->subobj_cache,
    (const char*) &sub_obj, sizeof(void*), 1
  );
if (!subobj_entry)
croak("lvalue hv_fetch failed"); /* should never happen */
/* weak references may have become undef */
if (*subobj_entry && SvROK(*subobj_entry))
/* we can re-use the existing wrapper */
return newRV_inc( SvRV(*subobj_entry) );
/* Not cached. Create the struct and wrapper. */
Newxz(s_info, 1, struct SubObject_info);
s_info->obj= sub_obj;
s_info->wrapper= newHV();
s_info->parent= main_obj;
objref= newRV_noinc((SV*) s_info->wrapper);
sv_bless(objref, gv_stashpv("YourProject::SubObject", GV_ADD));
/* Then attach the struct pointer to its wrapper via magic */
magic= sv_magicext((SV*) s_info->wrapper, NULL, PERL_MAGIC_ext,
&SubObject_magic_vtbl, (const char*) s_info, 0);
#ifdef USE_ITHREADS
magic->mg_flags |= MGf_DUP;
#else
(void)magic; // suppress warning
#endif
/* Then add it to the cache as a weak reference */
*subobj_entry= sv_rvweaken( newRV_inc((SV*) s_info->wrapper) );
/* Then return a strong reference to it */
return objref;
}
Again, this is roughly equivalent to the Perl implementation of _new_cached_subobject above.
Now, when methods are called on the SubObject wrapper, we want to throw an exception if the SubObject is no longer valid. We can do that in the function that the Typemap uses:
struct SubObject_info *
SubObject_from_magic(SV *objref, int flags) {
struct SubObject_info *ret= NULL;
... /* inspect magic */
if (flags & OR_DIE) {
if (!ret)
croak("Not an instance of SubObject");
if (!ret->obj)
croak("SubObject belongs to a closed MainObject");
}
return ret;
}
Now, the Typemap:
TYPEMAP
struct MainObject_info * O_SomeLib_MainObject_info
SomeLib_MainObject* O_SomeLib_MainObject
struct SubObject_info * O_SomeLib_SubObject_info
SomeLib_SubObject* O_SomeLib_SubObject
INPUT
O_SomeLib_MainObject_info
$var= MainObject_from_magic($arg, OR_DIE);
INPUT
O_SomeLib_MainObject
$var= MainObject_from_magic($arg, OR_DIE)->obj;
INPUT
O_SomeLib_SubObject_info
$var= SubObject_from_magic($arg, OR_DIE);
INPUT
O_SomeLib_SubObject
$var= SubObject_from_magic($arg, OR_DIE)->obj;
OUTPUT
O_SomeLib_SubObject
sv_setsv($arg, sv_2mortal(SubObject_wrap($var)));
This time I added an “OUTPUT” entry for SubObject, because we can safely wrap any SubObject pointer that we see in any of the SomeLib API calls, and get the desired result.
There’s nothing stopping you from automatically wrapping MainObject pointers with an OUTPUT typemap, but that’s prone to errors because sometimes an API returns a pointer to the already-existing MainObject, and you don’t want perl to put a second wrapper on the same MainObject. This problem doesn’t apply to SubObject, because we re-use any existing wrapper by checking the cache. (of course, you could apply the same trick to MainObject and have a global cache of all the known MainObject instances, and actually I do this in X11::Xlib)
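For what it’s worth, the Perl-level shape of that “global cache” trick is the same weak-reference idea again, just keyed on the C pointer (this is only a sketch; _wrap_new_main is a hypothetical constructor for a fresh wrapper):

package MainObject {
    use v5.36;
    use Scalar::Util 'weaken';

    my %known_by_ptr;   # C pointer address => weak ref to existing wrapper

    sub _wrap_ptr ( $class, $ptr ) {
        my $obj = $known_by_ptr{$ptr};
        return $obj if defined $obj;             # re-use the live wrapper
        $obj = $class->_wrap_new_main($ptr);     # hypothetical: bless a new wrapper
        weaken( $known_by_ptr{$ptr} = $obj );
        return $obj;
    }
}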
But in general, for objects like MainObject I prefer to special-case my constructor (or whatever method initializes the instance of SomeLib_MainObject) with a call to _from_magic(..., AUTOCREATE) on the INPUT typemap rather than returning the pointer and letting Perl’s typemap wrap it on OUTPUT.
After all that, it pays off when you add a bunch of methods in the rest of the XS file.
Looking back to the find_subobject method of the original Perl example, all you need in the XS is basically the prototype for that function of SomeLib:
SomeLib_SubObject *
find_subobject(main, search_key)
    SomeLib_MainObject *main
    char *search_key
and the XS translation handles the rest!
I should mention that you don’t need a new typemap INPUT/OUTPUT macro for every single data type. The macros for a typemap provide you with a $type variable (and others, see perldoc xstypemap) which you can use to construct function names, as long as you name your functions consistently. If you have lots of different types of sub-objects, you could extend the previous typemap like this:
TYPEMAP
struct MainObject_info * O_INFOSTRUCT_MAGIC
SomeLib_MainObject* O_LIBSTRUCT_MAGIC
struct SubObject1_info * O_INFOSTRUCT_MAGIC
SomeLib_SubObject1* O_LIBSTRUCT_MAGIC_INOUT
struct SubObject2_info * O_INFOSTRUCT_MAGIC
SomeLib_SubObject2* O_LIBSTRUCT_MAGIC_INOUT
struct SubObject3_info * O_INFOSTRUCT_MAGIC
SomeLib_SubObject3* O_LIBSTRUCT_MAGIC_INOUT
INPUT
O_INFOSTRUCT_MAGIC
$var= @{[ $type =~ / (\w+)/ ]}_from_magic($arg, OR_DIE);
INPUT
O_LIBSTRUCT_MAGIC
$var= @{[ $type =~ /_(\w*)/ ]}_from_magic($arg, OR_DIE)->obj;
INPUT
O_LIBSTRUCT_MAGIC_INOUT
$var= @{[ $type =~ /_(\w*)/ ]}_from_magic($arg, OR_DIE)->obj;
OUTPUT
O_LIBSTRUCT_MAGIC_INOUT
sv_setsv($arg, sv_2mortal(@{[ $type =~ /_(\w*)/ ]}_wrap($var)));
Of course, you can choose your function names and type names to fit more conveniently into these patterns.
Now, you maybe noticed that I made the convenient assumption that the C library has a function that looks up the MainObject of a SubObject:
SomeLib_MainObject *main= SomeLib_SubObject_get_main(sub_obj);
That isn’t always the case. Sometimes the library authors assume you have both pointers handy and don’t bother to give you a function to look one up from the other.
The easiest workaround is if you can assume that any function which returns a SubObject also took a parameter of the MainObject as an input. Then, just standardize the variable name given to the MainObject and use that variable name in the typemap macro.
OUTPUT
O_SomeLib_SubObject
sv_setsv($arg, sv_2mortal(SubObject_wrap(main, $var)));
This macro blindly assumes that “main” will be in scope where the macro gets expanded, which is true for my example:
SomeLib_SubObject *
find_subobject(main, search_key)
    SomeLib_MainObject *main
    char *search_key
But, what if it isn’t? What if the C API is basically walking a linked list, and you want to expose it to Perl in a way that the user can write:
for (my $subobj= $main->first; $subobj; $subobj= $subobj->next) {
...
}
The problem is that the “next” method is acting on one SubObject and returning another SubObject, with no reference to “main” available.
Well, if a subobject wrapper exists, then it knows the main object, so you just need to look at that SubObject info’s pointer to parent (the MainObject) and make that available for the SubObject’s OUTPUT typemap:
SomeLib_SubObject *
next(prev_obj_info)
struct SubObject_info *prev_obj_info;
INIT:
SomeLib_MainObject *main= prev_obj_info->parent;
CODE:
RETVAL= SomeLib_SubObject_next(prev_obj_info->obj);
OUTPUT:
RETVAL
So, now there is a variable ‘main’ in scope when it’s time for the typemap to construct a wrapper for the SomeLib_SubObject.
In Perl, the lifespan of objects is nicely defined: the destructor runs when the last reference is lost, and you use a pattern of strong and weak references to control the order the destructors run. In C, the lifespan of objects is dictated by the underlying library, and you might need to go to some awkward lengths to track which ones the Perl user is holding onto, and then flag those objects when they become invalid. While somewhat awkward, it’s very possible thanks to weak references and hashtables keyed on the C pointer address, and the users of your XS library will probably be thankful when they get a useful error message about violating the lifecycle of objects, instead of a mysterious segfault.
Published by Juan Julián Merelo Guervós on Monday 10 February 2025 09:35
Published by Juan Julián Merelo Guervós on Monday 10 February 2025 09:34
Most people will tell you git is a source control tool; some people will tell you that git is a content-addressable filesystem. It's all that, but the interesting thing is that it's a single-tool interface to frameworks that allow you to create products as a team.
Enter the absolutely simple extension mechanism that git has: write an executable called git-xxx and git will dutifully call it when you run git xxx. Which is why, to make an easier onramp for students in my 7th-semester class in Computer Science, I created an extension called git iv (IV is the acronym for the class). The extension allows them to create branches with specific names, as well as upload those branches, without needing to remember specific git commands.
You might argue that remembering git commands is what students should do, but in fact they don't, and since this is not part of the core of the class, I prefer to eliminate sources of trouble for them (which eventually become sources of trouble for me) using this.
There are many good things that can be said about Perl, for this or for anything else. But in this case there's a thing that makes it ideal for writing extensions: git includes a Perl module called Git, which is a Perl interface to all the Git commands. This is distributed with git, so if you've got git, you've got this library.
The whole extension is now hosted in this GitHub repo, which will contain the most up-to-date version as well as documentation and other stuff.
So here's the preamble to the extension:
use strict;
use warnings;
use lib qw( /Library/Developer/CommandLineTools/usr/share/git-core/perl
/usr/share/perl5 );
use Git;
use v5.14;
my $HELP_FLAG = "-h";
my $USAGE_STRING = <<EOC;
Uso:
git iv objetivo <número> -- crea una rama para ese objetivo
git iv sube-objetivo -- sube al repo remoto la rama
git iv $HELP_FLAG -- imprime este mensaje
EOC
The main caveat about the extension is that some flags will be handled by git itself. There are probably quite a few of those, but one of them is --help: git xxx --help will try to look up a manual page for git xxx. This is why a different help flag is defined above, along with a usage string, which is helpful when you don't remember the exact shape of the subcommands. In this case, I use git iv as the extension name and as the interface to everything that needs to be done, but there are subcommands that will do different things. These are implemented later:
my @subcommands = qw(objetivo sube-objetivo);
push( @subcommands, quotemeta $HELP_FLAG);
die( usage_string() ) unless @ARGV;
my $subcommand = shift;
die "No se reconoce el subcomando $subcommand" unless grep( /\Q$subcommand/, @subcommands );
my @args = @ARGV;
I prefer not to include any dependencies; there are powerful command-line flag libraries out there, but in this case, a single script is best. So you handle whatever comes after iv uniformly, be it a subcommand or a flag. But the issue with the flag is that it includes a dash (-), so we wrap it in quotemeta so that it can be used safely in regexes. Like the one, for instance, four lines below: in case the subcommand at the front of the command line is not part of the list, it will bail out showing the usage string.
Anything after the subcommand will be gobbled into @args.
if ( $subcommand eq $HELP_FLAG ) {
say $USAGE_STRING;
} else {
my $repo;
eval {
$repo = Git->repository;
} or die "Aparentemente, no estás en un repositorio";
if ( $subcommand eq "objetivo" ) {
die $USAGE_STRING unless @args;
$repo->command( "checkout", "-b", "Objetivo-" . $args[0]);
}
if ( $subcommand eq "sube-objetivo" ) {
my $branch = $repo->command( "rev-parse", "--abbrev-ref", "HEAD" );
chomp($branch);
$repo->command ( "push", "-u", "origin", $branch );
}
}
Now it's a matter of processing the subcommand. If it's the flag -h, print the usage string; if it's any of the other subcommands, we need to work with the git repository. $repo = Git->repository; creates an object out of the Git library we mentioned before, which we will use to issue the different plumbing or high-level commands. One of the subcommands will do a checkout: $repo->command( "checkout", "-b", "Objetivo-" . $args[0]); will convert itself to the equivalent command. You can even work with plumbing commands such as rev-parse to check the branch you're in and create that branch remotely, as the other subcommand does.
Perl saves you a whole lot of trouble when writing this kind of thing. Besides, the fact that it will most probably be already installed on any system you use to develop (Mac, Linux or WSL) will save you the trouble of asking for prerequisites for this script.
Published by Unknown on Saturday 08 February 2025 23:42
Published by Unknown on Saturday 08 February 2025 23:39
This is the weekly favourites list of CPAN distributions. Votes count: 48
Week's winners (+3): Perlmazing
Build date: 2025/02/08 22:37:34 GMT
Clicked for first time:
Increasing its reputation:
Published by Marco Pessotto on Thursday 06 February 2025 00:00
Recently I’ve been working on a project with a Vue front-end and two back-ends, one in Python using the Django framework and one in Perl using the Mojolicious framework. So, it’s a good time to spend some words to share the experience and do a quick comparison.
Previously I wrote a post about Perl web frameworks, and now I’m expanding the subject into another language.
Django was chosen for this project because it’s been around for almost 20 years now and provides the needed maturity and stability to be long-running and low-budget. In this regard, it has proved a good choice so far. Recently it saw a major version upgrade without any problems to speak of. It could be argued that I should have used the Django REST Framework instead of plain Django. However, at the time the decision was made, adding a framework on top of another seemed a bit excessive. I don’t have many regrets about this, though.
Mojolicious is an old acquaintance. It used to have fast-paced development but seems very mature now, and it’s even been ported to JavaScript.
Both frameworks have just a few dependencies (which is fairly normal in the Python world, but not in the Perl one) and excellent documentation. They both follow the model-view-controller pattern. Let’s examine the components.
Both frameworks come with a built-in template system (which can be swapped out with something else), but in this project we can skip the topic altogether as both frameworks are used only as back-end for transmitting JSON, without any HTML rendering involved.
However, let’s see how the rendering looks for the API we’re writing.
use Mojo::Base 'Mojolicious::Controller', -signatures;
sub check ($self) {
$self->render(json => { status => 'OK' });
}
from django.http import JsonResponse
def status(request):
return JsonResponse({ "status": "OK" })
Nothing complicated here, just provide the right call.
Usually a model in the context of web development means a database, and here we are going to keep this assumption.
Django comes with a comprehensive object-relational mapping (ORM) system and it feels like the natural thing to use. I don’t think it makes much sense to use another ORM, or even to use raw SQL queries (though it is possible).
You usually start a Django project by defining the model. The Django ORM gives you the tools to manage the migrations, providing abstraction from the SQL. You need to define the field types and the relationships (joins and foreign keys) using the appropriate class methods.
For example:
from django.db import models
from django.contrib.auth.models import AbstractUser
class User(AbstractUser):
email = models.EmailField(null=False, blank=False)
site = models.ForeignKey(Site, on_delete=models.CASCADE, related_name="site_users")
libraries = models.ManyToManyField(Library, related_name="affiliated_users")
expiration = models.DateTimeField(null=True, blank=True)
created = models.DateTimeField(auto_now_add=True)
last_modified = models.DateTimeField(auto_now=True)
These calls provide not only the SQL type to use, but also the validation. For example, the blank parameter is a validation option specifying whether Django will accept an empty value. It is different from the null option, which directly correlates to SQL. You can see we’re quite far from working with SQL, at least two layers of abstraction away.
In the example above, we’re also defining a foreign key between a site and a user (many-to-one), so each user belongs to one site. We also define a many-to-many relationship with the libraries record. I like how these relationships are defined, it’s very concise.
Thanks to these definitions, you get a whole admin console almost for free, which your admin users are sure to like. However, I’m not sure this is a silver bullet for solving all problems. With large tables and relationships the admin pages load slowly and they could become unusable very quickly. Of course, you can tune that by filtering out what you need and what you don’t, but that means things are not as simple as “an admin dashboard for free” — at the very least, there’s some configuring to do.
As for the query syntax, you usually need to call Class.objects.filter(). As you would expect from an ORM, you can chain the calls and finally get objects out of that, representing a database row, which, in turn, you can update or delete. The syntax for the filter() call is based on the double underscore separator, so you can query over the relationships like this:
for agent in (Agent.objects.filter(canonical_agent_id__isnull=False)
.prefetch_related('canonical_agent')
.order_by('canonical_agent__name', 'name')
.all()):
agent.name = "Dummy"
agent.save()
In this case, provided that we defined the foreign keys and the attributes in the model, we can search/order across the relationship. The __isnull suffix, as you can imagine, results in a WHERE canonical_agent_id IS NOT NULL query, while in the order_by call we sort over the joined table using the name column. Looks nice and readable, with a touch of magic.
Of course things are never so simple, so you can build complex queries with the Q class combined with bitwise operators (&, |).
Here’s an example of a simple case-insensitive search for a name containing multiple words:
import re
from django.db.models import Q

def api_list(request):
    term = request.GET.get('search')
    if term:
words = [ w for w in re.split(r'\W+', term) if w ]
if words:
query = Q(name__icontains=words.pop())
while words:
query = query & Q(name__icontains=words.pop())
# logger.debug(query)
agents = Agent.objects.filter(query).all()
To sum up, the ORM is providing everything you need to stay away from the SQL. In fact, it seems like Django doesn’t like you doing raw SQL queries.
In the Perl world things are a bit different.
The Mojolicious tutorial doesn’t even mention the database. You can use any ORM or no ORM at all, if you prefer so. However, Mojolicious makes the DB handle available everywhere in the application.
You could use DBIx::Connector, DBIx::Class, Mojo::Pg (which was developed with Mojolicious), or whatever you prefer.
For example, to use Mojo::Pg in the main application class:
package MyApp;
use Mojo::Base 'Mojolicious', -signatures;
use Mojo::Pg;
use Data::Dumper::Concise;
sub startup ($self) {
my $config = $self->plugin('NotYAMLConfig');
$self->log->info("Starting up with " . Dumper($config));
$self->helper(pg => sub {
state $pg = Mojo::Pg->new($config->{dbi_connection_string});
});
In the routes you can call $self->pg to get the database object.
The three approaches I’ve mentioned here are different. DBIx::Connector is basically a way to get you a safe DBI handle across forks and DB connection failures. Mojo::Pg gives you the ability to do abstract queries but also gives some convenient methods to get the results. I wouldn’t call it an ORM; from a query you usually get hashes, not objects, you don’t need to define the database layout, and it won’t produce migrations for you, though there is some migration support.
Here’s an example of standard and abstract queries:
sub list_texts ($self) {
    my @all;
    if (my $sid = $self->param('sid')) {
        my $sql = 'SELECT * FROM texts WHERE sid = ? ORDER BY sorting_index';
        @all = $self->pg->db->query($sql, $sid)->hashes->each;
    }
    $self->render(json => { texts => \@all });
}
The query above can be rewritten with an abstract query, using the same module.
@all = $self->pg->db->select(texts => undef,
{ sid => $sid },
{ order_by => 'sorting_index' })->hashes->each;
If it’s a simple, static query, it’s basically a matter of taste; do you prefer to see the SQL or not? The second version is usually nicer if you want to build a different query depending on the parameters: you add or remove keys from the hash which maps to the query, and finally execute it.
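For example, building up the abstract version incrementally might look like this (the author column is made up, just to show a second optional parameter):

my %where;
$where{sid}    = $sid    if defined $sid;
$where{author} = $author if defined $author;   # hypothetical extra column

my @all = $self->pg->db->select(
    texts => undef,
    \%where,
    { order_by => 'sorting_index' },
)->hashes->each;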
Now, speaking of taste, for complex queries with a lot of joins I honestly prefer to see the SQL query instead of wondering if the abstract one is producing the correct SQL. This is true regardless of the framework. I have the impression that it is faster, safer, and cleaner to have the explicit SQL in the code rather than leaving future developers (including future me) to wonder if the magic is happening or not.
Finally, nothing stops you from using DBIx::Class, which is the best ORM for Perl, even if it’s not exactly light on dependencies. It’s very versatile, it can build queries of arbitrary complexity, and you usually get objects out of the queries you make. It doesn’t come with an admin dashboard, it doesn’t enforce the data types, and it doesn’t ship any validation by default (of course, you can implement that manually). The query syntax is very close to the Mojo::Pg one (which is basically SQL::Abstract).
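For comparison, a rough DBIx::Class version of the earlier query might look like this (assuming you have a $schema object with a Text result source whose columns match the texts table):

my @all = $schema->resultset('Text')->search(
    { sid      => $sid },
    { order_by => 'sorting_index' },
)->all;   # each element is a row object, not a plain hash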
The gain here is that, like in Django’s ORM, you can attach your methods to the classes representing the rows, so the data definitions live with the code operating on them.
However, the fact that it builds an object for each result means you’re paying a performance penalty which sometimes can be very high. I think this is a problem common to all ORMs, regardless of the language and framework you’re using.
The difference with Django is that once you have chosen it as your framework, you are basically already sold to the ORM. With Mojolicious and other Perl frameworks (Catalyst, Dancer), you can still make the decision and, at least in theory, change it down the road.
My recommendation would be to keep the model, both code and business logic, decoupled from the web-specific code. This is not really doable with Django, but is fully doable with the Perl frameworks. Just put the DB configuration in a dedicated file and the business code in appropriate classes. Then you should be able to, for example, run a script without loading the web and the whole framework configuration. In this ideal scenario, the web framework just provides the glue between the user and your model.
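A minimal sketch of that kind of decoupling, assuming a Mojo::Pg-backed model and a made-up MyApp::Model::Texts class name: the class only needs a connection string, so a cron job can use it without loading the web application at all.

package MyApp::Model::Texts;
use Mojo::Base -base, -signatures;
use Mojo::Pg;

has 'dsn';                                        # e.g. read from a dedicated config file
has pg => sub ($self) { Mojo::Pg->new( $self->dsn ) };

sub list ($self, $sid) {
    $self->pg->db->select(
        texts => undef, { sid => $sid }, { order_by => 'sorting_index' }
    )->hashes->to_array;
}

1;

The web controller then becomes a thin layer that calls $model->list($sid) and renders the result as JSON.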
Routes are defined similarly between Django and Mojolicious. Usually you put the code in a class and then point to it, attaching a name to it so you can reference it elsewhere. The language is different, the style is different, but they essentially do the same thing.
Django:
from django.urls import path
from . import views
urlpatterns = [
path("api/agents/<int:agent_id>", views.api_agent_view, name="api_agent_view"),
]
The function views.api_agent_view will receive the request with the agent_id as a parameter.
Mojolicious:
sub startup ($self) {
# ....
my $r = $self->routes;
$r->get('/list/:sid')->to('API#list_texts')->name('api_list_texts');
}
The ->to method routes the request to Myapp::Controller::API::list_texts, which will receive the request with the sid as a parameter.
This is pretty much the core business of every web framework: routing a request to a given function.
Mojolicious also has the ability to chain routes (pretty much taken from Catalyst). The typical use is authorization:
sub startup ($self) {
...
my $r = $self->routes;
my $api = $r->under('/api/v1', sub ($c) {
if ($c->req->headers->header('X-API-Key') eq 'testkey') {
return 1;
}
$c->render(text => 'Authentication required!', status => 401);
return undef;
});
$api->get('/check')->to('API#check')->name('api_check');
}
So a request to /api/v1/check will first go through that first block, and the chain will abort if the correct API key is not set in the header. Otherwise it will proceed to run the API module’s check function.
I’m a Perl guy and so I’m a bit biased toward Mojolicious, but I also have a pragmatic approach to programming. Python is widely used — they teach it in schools — while Perl is seen as old-school, if not dead (like all the mature technologies). So, Python could potentially attract more developers to your project, and this is important to consider.
Learning a new language like Python is not a big leap; it and Perl are quite similar despite the different syntax. I’d throw Ruby in the same basket.
Of course both languages provide high quality modules you can use, and these two frameworks are an excellent example.
Published by Mayur Koshti on Tuesday 04 February 2025 17:33
At the end of my last post, we had a structure in place that used GitHub Actions to run a workflow every time a change was committed to the PPC repository. That workflow would rebuild the website and publish it on GitHub Pages.
All that was left for us to do was to write the middle bit – the part that actually takes the contents of the repo and creates the website. This involves writing some Perl.
There are three types of pages that we want to create: a page for each of the PPC documents themselves, an index page listing all of the PPCs, and pages for the other documents that describe the PPC process.
I’ll be using the Template Toolkit to build the site, with a sprinkling of Bootstrap to make it look half-decent. Because there is a lot of Markdown-to-HTML conversion, I’ll use my Template::Provider::Pandoc module which uses Pandoc to convert templates into different formats.
The first thing I did was parse the PPCs themselves, extracting the relevant information. Luckily, each PPC has a “preamble” section containing most of the data we need. I created a basic class to model PPCs which included a really hacky parser to extract this information and create a object of the class.
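The real parser lives in PPC.pm in the repository, but the idea is roughly this (a sketch only; the preamble field names here are illustrative, not necessarily the actual ones):

package PPC;
use v5.38;
use Moo;

has [qw( title author status file )] => ( is => 'ro' );

sub new_from_file ( $class, $file ) {
    open my $fh, '<', $file or die "Cannot open $file: $!";

    my %field;
    while ( my $line = <$fh> ) {
        last if $line =~ /^##/ and %field;                    # preamble is over
        $field{ lc $1 } = $2 if $line =~ /^\s*(\w+)\s*:\s*(\S.*?)\s*$/;
    }

    return $class->new(
        ( map { $_ => $field{$_} } grep { defined $field{$_} } qw( title author status ) ),
        file => $file,
    );
}

1;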
This class abstracts away a lot of the complexity which means the program that actually builds the site is less than eighty lines of code. Let’s look at it in a bit more detail:
#!/usr/bin/perl

use v5.38;

use JSON;
use File::Copy;
use Template;
use Template::Provider::Pandoc;

use PPC;
There’s nothing unusual in the first few lines. We’re just loading the modules we’re using. Note that use v5.38 automatically enables strict and warnings, so we don’t need to load them explicitly.
my @ppcs;
my $outpath = './web';
my $template_path = [ './ppcs', './docs', './in', './ttlib' ];
Here, we’re just setting up some useful variables. @ppcs will contain the PPC objects that we create. One potential clean-up here is to reduce the size of that list of input directories.
my $base = shift || $outpath;
$base =~ s/^\.//;
$base = "/$base" if $base !~ m|^/|;
$base = "$base/" if $base !~ m|/$|;
This is a slightly messy hack that is used to set a <base> tag in the HTML.
my $provider = Template::Provider::Pandoc->new({
  INCLUDE_PATH => $template_path,
});

my $tt = Template->new({
  LOAD_TEMPLATES => [ $provider ],
  INCLUDE_PATH   => $template_path,
  OUTPUT_PATH    => $outpath,
  RELATIVE       => 1,
  WRAPPER        => 'page.tt',
  VARIABLES      => {
    base => $base,
  }
});
Here, we’re setting up our Template Toolkit processor. Some of you may not be familiar with using a Template provider module. These modules change how TT retrieves templates: if the template has an .md extension, then the text is passed through Pandoc to convert it from Markdown to HTML before it’s handed to the template processor. It’s slightly annoying that we need to pass the template include path to both the provider and the main template engine.
for (<ppcs/*.md>) {
  my $ppc = PPC->new_from_file($_);
  push @ppcs, $ppc;

  $tt->process($ppc->in_path, {}, $ppc->out_path)
    or warn $tt->error;
}
This is where we process the actual PPCs. For each PPC we find in the /ppcs directory, we create a PPC object, store that in the @ppcs variable and process the PPC document as a template – converting it from Markdown to HTML and writing it to the /web directory.
my $vars = {
  ppcs => \@ppcs,
};

$tt->process('index.tt', $vars, 'index.html')
  or die $tt->error;
Here’s where we process the index.tt file to generate the index.html for our site. Most of the template is made up of a loop over the @ppcs variable to create a table of the PPCs.
for (<docs/*.md>) {
  s|^docs/||;
  my $out = s|\.md|/index.html|r;

  $tt->process($_, {}, $out)
    or die $tt->error;
}
There are a few other documents in the /docs directory describing the PPC process. So in this step, we iterate across the Markdown files in that directory and convert each of them into HTML. Unfortunately, one of them is the template.md which is intended to be used as the template for new PPCs – so it would be handy if that one wasn’t converted to HTML. That’s something to think about in the future.
mkdir 'web/images';
for (<images/*>) {
  copy $_, "web/$_";
}

if (-f 'in/style.css') {
  copy 'in/style.css', 'web/style.css';
}

if (-f 'CNAME') {
  copy 'CNAME', "web/CNAME";
}
We’re on the home straight now. And this section is a bit scrappy. You might recall from the last post that we’re building the website in the /web directory. And there are a few other files that need to be copied into that directory in order that they are then deployed to the web server. So we just copy files. You might not know what a CNAME file is – it’s the file that GitHub Pages uses to tell their web server that you’re serving your website from a custom domain name.
my $json = JSON->new->pretty->canonical->encode([
  map { $_->as_data } @ppcs
]);

open my $json_fh, '>', 'web/ppcs.json' or die $!;
print $json_fh $json;
And, finally, we generate a JSON version of our PPCs and write that file to the /web directory. No-one asked for this, but I thought someone might find this data useful. If you use this for something interesting, I’d love to hear about it.
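If you did want to play with that JSON, something as simple as this would read it back (the exact keys depend on what PPC->as_data emits, so treat title as a placeholder):

use strict;
use warnings;
use JSON;

my $ppcs = do {
    open my $fh, '<', 'web/ppcs.json' or die $!;
    local $/;                      # slurp the whole file
    decode_json(<$fh>);
};

printf "%s\n", $_->{title} // 'untitled' for @$ppcs;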
A few other bits and pieces to be aware of.
One of them: parts of the site rely on hard-coded knowledge of the documents in /docs. It would make sense to change that so it’s generated from the contents of that directory.
But there you are. That’s the system that I knocked together in a few hours a couple of weeks ago. As I mentioned in the last post, the idea was to make the PPC process more transparent to the Perl community outside of the Perl 5 Porters and the Perl Steering Council. I hope it achieves that and, further, I hope it does so in a way that keeps out of people’s way. As soon as someone updates one of the documents in the repository, the workflow will kick in and publish a new version of the website. There are a few grungy corners of the code and there are certainly some improvements that can be made. I’m hoping that once the pull request is merged, people will start proposing new pull requests to add new features.
The post Proposed Perl Changes (part 2) first appeared on Perl Hacks.
Welcome to “What’s new on CPAN”, a curated look at last month’s new CPAN uploads for your reading and programming pleasure. Enjoy!
Published by Unknown on Saturday 01 February 2025 21:49
This article was originally published at fuzzix.org.
These days, even modestly priced MIDI hardware comes stuffed with features. These features may include a clock, sequencer, arpeggiator, chord voicing, Digital Audio Workstation (DAW) integration, and transport control.
Fitting all this into a small device’s form factor may result in some amount of compromise — perhaps modes aren’t easily combined, or some amount of menu diving is required to switch between modes. Your device may even lack the precise functionality you require.
This post will walk through the implementation of a pair of features to augment those found in a MIDI keyboard — a M-Audio Oxygen Pro 61 in this case, though the principle should apply to any device.
A pedal tone (or pedal note, or pedal point) is a sustained single note, over which other potentially dissonant parts are played. A recent video by Polarity Music opened with some exploration of using a pedal tone in Bitwig Studio to compose progressions. In this case, the pedal tone was gated by the keyboard, and the fifth interval of the played note was added, resulting in a three-note chord for a single played note. This simple setup resulted in some dramatic progressions.
There are, of course, ways to achieve this effect in other DAW software. I was able to use FL Studio’s Patcher to achieve a similar result with two instances of VFX Key Mapper:
One instance of VFX Key Mapper transposes the incoming note by 7 semitones. The other will replace any incoming note. Alongside the original note, these mappers are routed to FLEX with a Rhodes sample set loaded. It sounds like this (I’m playing just one or two keys at a time here):
A similar method can be used to patch this in other modular environments. In VCV Rack, a pair of quantizers provide the fifth-note offset and pedal tone signals. The original note, the fifth, and the pedal tone are merged and sent to the Voltage Controlled Oscillator (VCO). The gate signal from the keyboard triggers an envelope to open the Voltage Controlled Amplifier (VCA) and Voltage Controlled Filter (VCF).
This patch is a little less flexible than the FL Studio version — further work is required to support playing multiple notes on the keyboard, for example.
The FL Studio version also has a downside. The played sequence only shows the played notes in the piano roll, not the additional fifth and pedal tone. Tweaking timing and velocity, or adding additional melody is not trivial - any additional notes in the piano roll will play three notes in the Patcher instrument.
If we could coax our MIDI device into producing these additional notes, there would be no need for tricky patching plus we might end up with a more flexible result.
The approach described here will set up a new software-defined MIDI device which will proxy events from our hardware, while applying any number of filters to events before they are forwarded. These examples will make use of Perl bindings to RtMidi.
We’re going to need a little bit of framework code to get started. While the simplest RtMidi callback examples just sleep to let the RtMidi event loop take over, we may wish to schedule our own events later. I went into some detail previously on Perl, IO::Async, and the RtMidi event loop.
The framework will need to set up an event loop, manage two or more MIDI devices, and store some state to influence decision-making within filter callback functions. Let’s start with those:
use v5.40;
use experimental qw/ class /;
class MidiFilter {
field $loop = IO::Async::Loop->new;
field $midi_ch = IO::Async::Channel->new;
field $midi_out = RtMidiOut->new;
field $input_name = $ARGV[0];
field $filters = {};
field $stash = {};
Aside from our event $loop and $midi_out device, there are fields for getting $input_name from the command line, a $stash for communication between callbacks, and a store for callback $filters. The callback store will hold callbacks keyed on MIDI event names, e.g. “note_on”. The channel $midi_ch will be used to receive events from the MIDI input controller.
Methods for creating new filters and accessing the stash are as follows:
method add_filter( $event_type, $action ) {
push $filters->{ $event_type }->@*, $action;
}
method stash( $key, $value = undef ) {
$stash->{ $key } = $value if defined $value;
$stash->{ $key };
}
Adding a filter requires an event type, plus a callback. Callbacks are pushed into $filters for each event type in the order they are declared.
If a $value is supplied while accessing the stash, it will be stored for the given $key. The value for the given $key is returned in any case.
Let’s add some methods for sending MIDI events:
method send( $event ) {
$midi_out->send_event( $event->@* );
}
method delay_send( $delay_time, $event ) {
$loop->add(
IO::Async::Timer::Countdown->new(
delay => $delay_time,
on_expire => sub { $self->send( $event ) }
)->start
)
}
The send method simply passes the supplied $event to the configured $midi_out device. The delay_send method does the same thing, except it waits for some specified amount of time before sending.
Methods for filtering incoming MIDI events are as follows:
method _filter_and_forward( $event ) {
my $event_filters = $filters->{ $event->[0] } // [];
for my $filter ( $event_filters->@* ) {
return if $filter->( $self, $event );
}
$self->send( $event );
}
async method _process_midi_events {
while ( my $event = await $midi_ch->recv ) {
$self->_filter_and_forward( $event );
}
}
These methods are denoted as “private” via the ancient mechanism of “Add an underscore to the start of the name to indicate that this method shouldn’t be used”. The documentation for Object::Pad (which acts as an experimental playground for perl core class features) details the lexical method feature, which allows for block scoped methods unavailable outside the class. The underscore technique will serve us for now.
The _process_midi_events method awaits receiving a message, passing each message received to _filter_and_forward. The _filter_and_forward method retrieves callbacks for the current event type (the first element of the $event array) and delegates the event to the available callbacks. If no callbacks are available, or if none of the callbacks return true, the event is forwarded to the MIDI output device untouched.
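As a throwaway illustration of that contract (not from the original post, and assuming the script’s use v5.40, which provides true/false): a filter that just logs note_on events and deliberately returns false, so later filters and the default forwarding still run.

sub log_notes( $mf, $event ) {
    my ( $ev, $channel, $note, $vel ) = $event->@*;
    warn "note_on: channel $channel, note $note, velocity $vel\n";
    false;    # keep falling through the filter chain
}

$mf->add_filter( note_on => \&log_notes );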
The final pieces are the setup of MIDI devices and the communications channel:
method _init_out {
return $midi_out->open_port_by_name( qr/loopmidi/i )
if ( grep { $^O eq $_ } qw/ MSWin32 cygwin / );
$midi_out->open_virtual_port( 'Mister Fancy Pants' );
}
method go {
my $midi_rtn = IO::Async::Routine->new(
channels_out => [ $midi_ch ],
code => sub {
my $midi_in = RtMidiIn->new;
$midi_in->open_port_by_name( qr/$input_name/i ) ||
die "Unable to open input device";
$midi_in->set_callback_decoded(
sub( $ts, $msg, $event, $data ) {
$midi_ch->send( $event );
}
);
sleep;
}
);
$loop->add( $midi_rtn );
$loop->await( $self->_process_midi_events );
}
ADJUST {
$self->_init_out;
}
The _init_out method takes care of some shortcomings in Windows MIDI, which does not support the creation of virtual ports. On this platform messages will be routed via loopMIDI. On other platforms the virtual MIDI port “RtMidi Output Client:Mister Fancy Pants” is created. The ADJUST block assures this is done during construction of the MidiFilter instance.
The go method creates a routine which instantiates an RtMidi instance and connects to the hardware MIDI device specified on the command line. A callback is created to send incoming events over the communications channel, then we simply sleep and allow RtMidi’s event loop to take over the routine.
The final step is to await _process_midi_events, which should process events from the hardware until the program is terminated.
Callbacks are responsible for managing the stash, and sending filtered messages to the output device. A callback receives the MidiFilter instance and the incoming event.
In order to implement the pedal tone feature described earlier, we need to take incoming “note on” events and transform them into three “note on” events, then send these to the output MIDI device. A similar filter is needed for “note off” — all three notes must be stopped after being played:
use constant PEDAL => 55; # G below middle C
sub pedal_notes( $note ) {
( PEDAL, $note, $note + 7 );
}
sub pedal_tone( $mf, $event ) {
my ( $ev, $channel, $note, $vel ) = $event->@*;
$mf->send( [ $ev, $channel, $_, $vel ] ) for pedal_notes( $note );
true;
}
my $mf = MidiFilter->new;
$mf->add_filter( note_on => \&pedal_tone );
$mf->add_filter( note_off => \&pedal_tone );
$mf->go;
We start by setting a constant containing a MIDI note value for the pedal tone. The sub pedal_notes returns this pedal tone, the played note, and its fifth. The callback function pedal_tone sends a MIDI message to output for each of the notes returned by pedal_notes. Note the callback yields true in order to prevent falling through to the default action.
The callback function is applied to both the “note on” and “note off” events. We finish by calling the go method of our MidiFilter instance in order to await and process incoming messages from the keyboard.
The last step is to run the script:
$ ./midi-filter.pl ^oxy
Rather than specify a fully qualified device name, we can pass in a regex which should match any device whose name starts with “oxy” - there is only one match on my system, the Oxygen Pro.
The device “RtMidi Output Client:Mister Fancy Pants” or “loopMIDI”, depending on your platform, can now be opened in the DAW to receive played notes routed through the pedal tone filter. This filter is functionally equivalent to the FL Studio Patcher patch from earlier, with the added benefit of being DAW-agnostic. If recording a sequence from this setup, all notes will be shown in the piano roll.
The Oxygen Pro has four “banks” or sets of controls. Each bank can have different assignments or behaviour for the knobs, keys, sliders, and pads.
A problem with this feature is that there is limited feedback when switching banks - it’s not always visible on screen, depending on the last feature used. Switching banks does not affect the keyboard. Also, perhaps 4 banks isn’t enough.
A simpler version of this feature might be to use pads to select the bank, and the bank just sets the MIDI channel for all future events. There are 16 pads on the device, for each of 16 channels. It should be more obvious which bank (or channel) was the last selected, and if not, just select it again.
This can also be applied to the keyboard by defining callbacks for “note on” and “note off” (or rather, modifying the existing ones). For this device, we also need callbacks for “pitch wheel change” and “channel aftertouch”. The callback for “control change” should handle the mod wheel without additional special treatment.
The pads on this device are set up to send notes on channel 10, usually reserved for drums. Watching for specific notes incoming on channel 10, and stashing the corresponding channel should be enough to allow other callbacks to route events appropriately:
sub set_channel( $mf, $event ) {
my ( $ev, $channel, $note, $vel ) = $event->@*;
return false unless $channel == 9;
my $new_channel = $note - 36;
$mf->stash( channel => $new_channel );
true;
}
$mf->add_filter( note_on => \&set_channel );
$mf->add_filter( note_on => \&pedal_tone );
$mf->add_filter( note_off => \&set_channel );
$mf->add_filter( note_off => \&pedal_tone );
If the event channel sent to set_channel is not 10 (or rather 9, as we are working with zero-indexed values) we return false, allowing the filter to fall through to the next callback. Otherwise, the channel is stashed and we stop processing further callbacks. As the pad notes are numbered 36 to 51, the channel can be derived by subtracting 36 from the incoming note.
This callback needs to be applied to both “note on” and “note off” events — remember, there is an existing “note off” callback which will erroneously generate three “note off” events unless intercepted. The order of callbacks is also important. If pedal_tone were first, it would prevent set_channel from happening at all.
We can now retrieve the stashed channel in pedal_tone:
sub pedal_tone( $mf, $event ) {
my ( $ev, $channel, $note, $vel ) = $event->@*;
$channel = $mf->stash( 'channel' ) // $channel;
$mf->send( [ $ev, $channel, $_, $vel ] ) for pedal_notes( $note );
true;
}
The final piece of this feature is to route some additional event types to the selected channel:
sub route_to_channel( $mf, $event ) {
my ( $ev, $channel, @params ) = $event->@*;
$channel = $mf->stash( 'channel' ) // $channel;
$mf->send( [ $ev, $channel, @params ] );
true;
}
$mf->add_filter( pitch_wheel_change => \&route_to_channel );
$mf->add_filter( control_change => \&route_to_channel );
$mf->add_filter( channel_after_touch => \&route_to_channel );
We can now have different patches respond to different channels, and control each patch with the entire MIDI controller (except the pads, of course).
You may have spotted a problem with the bank feature. Imagine we are on bank 1 and we set knob 1 to a low value. We then switch to bank 2, and turn knob 1 to a high value. When we switch back to bank 1 and turn the knob, the control will jump to the new high value.
A feature called “pickup” (or “pick up”) allows for bank switching by only engaging the control for knob 1, bank 1 when the knob passes its previous value. That is, the control only starts changing again when the knob goes beyond its previous low value.
Pickup could be implemented in our filters by stashing the last value for each control/channel combination. This would not account for knob/channel combinations which were never touched - large jumps in control changes would still be possible, with no way to prevent them. One would need to set initial values by tweaking all controls on all channels before beginning a performance.
Many DAWs and synths support pickup, and it is better handled there rather than implementing a half-baked and inconsistent solution here.
So far we have not taken complete advantage of our event loop. You might remember we implemented a delay_send method which accepts a delay time alongside the event to be sent.
We can exploit this to add some expressiveness (of a somewhat robotic variety) to the pedal tone callback:
use constant STRUM_DELAY => 0.05; # seconds
sub pedal_tone( $mf, $event ) {
my ( $ev, $channel, $note, $vel ) = $event->@*;
$channel = $mf->stash( 'channel' ) // $channel;
my @notes = pedal_notes( $note );
$mf->send( [ $ev, $channel, shift @notes, $vel ] );
my $delay_time = 0;
for my $note ( @notes ) {
$delay_time += STRUM_DELAY;
$mf->delay_send( $delay_time, [ $ev, $channel, $note, $vel ] );
}
true;
}
We now store the notes and send the first immediately. Remaining notes are sent with an increasing delay. The delay_send method will schedule the notes and return immediately, allowing further events to be processed.
Scheduling the “note off” events is also a good idea. Imagine a very quick keypress on the keyboard. If the keyboard note off happens before we finish sending the scheduled notes, sending all “note off” events instantaneously would leave some scheduled notes ringing out. Scheduling “note off” events with the same cadence as the “note on” events should prevent this. That is, the same callback can continue to service both event types.
With that change, playing a single key at a time sounds like this:
This VCV Rack patch should demonstrate the complete set of features built in this post. On the right is an additive voice which responds to MIDI channel 2. The mod wheel is patched to control feedback, which should influence the brightness of the sound.
The left side is a typical subtractive patch controlled by channel 3, with an envelope controlling a VCA and VCF to shape incoming sawtooths. The mod wheel is patched to allow a Low-Frequency Oscillator (LFO) to frequency modulate the VCO for a vibrato effect.
This is what it sounds like - we first hear the additive patch on channel 2, then the subtractive one on channel 3. Switching channels is as simple as pushing the respective pad on the controller:
Not very exciting, I know — it’s just to demonstrate the principle.
Keen eyes may have spotted an issue with the bank switching callback. When you switch to channel 10, playing keyboard keys which overlap with those assigned to the pads may dump you unexpectedly onto a different channel! I will leave resolving this as an exercise for the reader — perhaps one of the pads could be put to another use.
While I haven’t measured latency of this project specifically, previous experiments with async processing of MIDI events in Perl showed a latency of a fraction of a millisecond. I expect the system described in this post to have a similar profile.
There is a gist with the complete source of the MidiFilter project.
It’s also included below:
#!/usr/bin/env perl

# There is currently an issue with native callbacks and threaded perls, which leads to a crash.
# As of Jan 2025, all the available pre-built perls I am aware of for Windows are threaded.
# I was able to work around this by building an unthreaded perl with cygwin / perlbrew... but
# you might want to just try this on Linux or Mac instead :)

use v5.40;
use experimental qw/ class /;

class MidiFilter {
    use IO::Async::Loop;
    use IO::Async::Channel;
    use IO::Async::Routine;
    use IO::Async::Timer::Countdown;
    use Future::AsyncAwait;
    use MIDI::RtMidi::FFI::Device;

    field $loop       = IO::Async::Loop->new;
    field $midi_ch    = IO::Async::Channel->new;
    field $midi_out   = RtMidiOut->new;
    field $input_name = $ARGV[0];
    field $filters    = {};
    field $stash      = {};

    method _init_out {
        return $midi_out->open_port_by_name( qr/loopmidi/i )
            if ( grep { $^O eq $_ } qw/ MSWin32 cygwin / );

        $midi_out->open_virtual_port( 'Mister Fancy Pants' );
    }

    method add_filter( $event_type, $action ) {
        push $filters->{ $event_type }->@*, $action;
    }

    method stash( $key, $value = undef ) {
        $stash->{ $key } = $value if defined $value;
        $stash->{ $key };
    }

    method send( $event ) {
        $midi_out->send_event( $event->@* );
    }

    method delay_send( $dt, $event ) {
        $loop->add(
            IO::Async::Timer::Countdown->new(
                delay     => $dt,
                on_expire => sub { $self->send( $event ) }
            )->start
        )
    }

    method _filter_and_forward( $event ) {
        my $event_filters = $filters->{ $event->[0] } // [];

        for my $filter ( $event_filters->@* ) {
            return if $filter->( $self, $event );
        }

        $self->send( $event );
    }

    async method _process_midi_events {
        while ( my $event = await $midi_ch->recv ) {
            $self->_filter_and_forward( $event );
        }
    }

    method go {
        my $midi_rtn = IO::Async::Routine->new(
            channels_out => [ $midi_ch ],
            code => sub {
                my $midi_in = RtMidiIn->new;
                $midi_in->open_port_by_name( qr/$input_name/i ) ||
                    die "Unable to open input device";

                $midi_in->set_callback_decoded(
                    sub( $ts, $msg, $event, $data ) {
                        $midi_ch->send( $event );
                    }
                );

                sleep;
            }
        );
        $loop->add( $midi_rtn );
        $loop->await( $self->_process_midi_events );
    }

    ADJUST {
        $self->_init_out;
    }
}

use constant PEDAL => 55;         # G below middle C
use constant STRUM_DELAY => 0.05; # seconds

sub pedal_notes( $note ) {
    ( PEDAL, $note, $note + 7 );
}

sub pedal_tone( $mf, $event ) {
    my ( $ev, $channel, $note, $vel ) = $event->@*;
    $channel = $mf->stash( 'channel' ) // $channel;
    my @notes = pedal_notes( $note );
    $mf->send( [ $ev, $channel, shift @notes, $vel ] );

    my $dt = 0;
    for my $note ( @notes ) {
        $dt += STRUM_DELAY;
        $mf->delay_send( $dt, [ $ev, $channel, $note, $vel ] );
    }

    true;
}

sub set_channel( $mf, $event ) {
    my ( $ev, $channel, $note, $vel ) = $event->@*;
    return false unless $channel == 9;

    my $new_channel = $note - 36;
    $mf->stash( channel => $new_channel );
    true;
}

sub route_to_channel( $mf, $event ) {
    my ( $ev, $channel, @params ) = $event->@*;
    $channel = $mf->stash( 'channel' ) // $channel;
    $mf->send( [ $ev, $channel, @params ] );
    true;
}

my $mf = MidiFilter->new;

$mf->add_filter( note_on  => \&set_channel );
$mf->add_filter( note_on  => \&pedal_tone );
$mf->add_filter( note_off => \&set_channel );
$mf->add_filter( note_off => \&pedal_tone );

$mf->add_filter( pitch_wheel_change  => \&route_to_channel );
$mf->add_filter( control_change      => \&route_to_channel );
$mf->add_filter( channel_after_touch => \&route_to_channel );

$mf->go;

BEGIN {
    $ENV{PERL_FUTURE_DEBUG} = true;
}
After describing some of the shortcomings of a given MIDI controller, and an approach for adding to a performance within a DAW, we walked through the implementation of a framework to proxy a MIDI controller’s facilities through software-defined filters.
The filters themselves are implemented as simple callbacks which may decide to store data for later use, change the parameters of the incoming message, forward new messages to the virtual hardware proxy device, and/or cede control to further callbacks in a chain.
Callbacks are attached to MIDI event types, and a single callback function may be attached to multiple event types.
We took a look at some simple functionality to build upon the device — a filter which turns a single key played into a strummed chord with a pedal tone, and a bank-switcher which sets the channel of all further events from the hardware device.
These simple examples served to demonstrate the principle, but the practical limit to this approach is your own imagination. My imagination is limited, but some next steps might be to add “humanising” random fluctuations to sequences, or perhaps to extend the system to combine the inputs of multiple hardware devices into one software-defined device with advanced and complex facilities. If your device has a DAW mode, you may be able to implement visual feedback for the actions and state of the virtual device. You could also coerce non-MIDI devices, e.g. gamepads, into sending MIDI messages.
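As a small taste of the “humanising” idea, here is a minimal sketch (my own, not part of the project) of a filter that nudges each note’s velocity by a random amount and then cedes control, so pedal_tone and the default forwarding still run:

sub humanise( $mf, $event ) {
    my ( $ev, $channel, $note, $vel ) = $event->@*;
    $vel += int( rand 11 ) - 5;    # jitter velocity by up to +/- 5
    $vel = 1   if $vel < 1;
    $vel = 127 if $vel > 127;
    $event->[3] = $vel;            # tweak the incoming event in place
    false;                         # cede control to the next filter in the chain
}

Registering it with $mf->add_filter( note_on => \&humanise ) ahead of the other note_on filters means every strummed chord inherits the jittered velocity.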
Many thanks to Dave Cross for providing an initial implementation of a PPC index page.
Maybe I should explain that in a little more detail. There’s a lot of detail, so it will take a couple of blog posts.
About two weeks ago, I got a message on Slack from Philippe Bruhat, a member of the Perl Steering Council. He asked if I would have time to look into building a simple static site based on the GitHub repo that stores the PPCs that are driving a lot of Perl’s development. The PSC thought that reading these important documents on a GitHub page wasn’t a great user experience and that turning it into a website might lead to more people reading the proposals and, hence, getting involved in discussions about them.
I guess they had thought of me as I’ve written a bit about GitHub Pages and GitHub Actions over the last few years and these were exactly the technologies that would be useful in this project. In fact, I have already created a website that fulfills a similar role for the PSC meeting minutes – and I know they know about that site because they’ve been maintaining it themselves for several months.
I was about to start working with a new client, but I had a spare day, so I said I’d be happy to help. And the following day, I set to work.
Reviewing the situation
I started by looking at what was in the repo.
All of the documents in the repo were in Markdown format, and the PPCs themselves seemed to follow a pretty standardised format.
Setting a target
Next, I listed what would be essential parts of the new site.
This is exactly the kind of use case that a combination of GitHub Pages and GitHub Actions is perfect for. Perhaps it’s worth briefly describing what those two GitHub features are.
Introducing GitHub Pages
GitHub Pages is a way to run a website from a GitHub repo. The feature was initially introduced to make it easy to run a project website alongside your GitHub repo – with the files that make up the website being stored in the same repo as the rest of your code. But, as often happens with useful features, people have been using it for all sorts of websites. The only real restriction is that it only supports static sites – you cannot use GitHub’s servers to run any kind of back-end processing.
The simplest way to run a GitHub Pages website is to construct it manually, put the HTML, CSS and other files into a directory inside your repo called /docs, commit those files and go to the “Settings -> Pages” settings for your repo to turn on Pages for the repo. Within minutes your site will appear at the address USERNAME.github.io/REPONAME. Almost no-one uses that approach.
The most common approach is to use a static site builder to build your website. The most popular is Jekyll – which is baked into the GitHub Pages build/deploy cycle. You edit Markdown files and some config files. Then each time you commit a change to the repo, GitHub will automatically run Jekyll over your input files, generate your website and deploy that to its web servers. We’re not going to do that.
We’ll use the approach I’ve used for many GitHub Pages sites. We’ll use GitHub Actions to do the equivalent of the “running Jekyll over your input files to generate your website” step. This gives us more flexibility and, in particular, allows us to generate the website using Perl.
Introducing GitHub Actions
GitHub Actions is another feature that was introduced with one use case in mind but which has expanded to be used for an incredible range of ideas. It was originally intended for CI/CD – a replacement for systems like Jenkins or Travis CI – but that only accounts for about half of the things I use it for.
A GitHub Actions run starts in response to various triggers. You can then run pretty much any code you want on a virtual machine, generating useful reports, updating databases, releasing code or (as in this case) generating a website.
GitHub Actions is a huge subject (luckily, there’s a book!), so we’re only going to touch on one particular way of using it. Our workflow will be: a push to the repo triggers an Actions run, a build job generates the website using Perl, and a deploy job publishes the generated files to GitHub Pages.
Making a start
Let’s make a start on creating a GitHub Actions workflow to deal with this. Workflows are defined in YAML files that live in the .github/workflows directory in our repo. So I created the relevant directories and a file called buildsite.yml.
There will be various sections in this file. We’ll start simply by defining a name for this workflow:
name: Generate website
The next section tells GitHub when to trigger this workflow. We want to run it when a commit is pushed to the “main” branch. We’ll also add the “workflow_dispatch” trigger, which allows us to manually trigger the workflow – it adds a button to the workflow’s page inside the repo:
on:
  push:
    branches: 'main'
  workflow_dispatch:
The main part of the workflow definition is the next section – the one that defines the jobs and the individual steps within them. The start of that section looks like this:
jobs:
  build:
    runs-on: ubuntu-latest
    container: perl:latest
    steps:
      - name: Perl version
        run: perl -v
      - name: Checkout
        uses: actions/checkout@v4
The “build” there is the name of the first job. You can name jobs anything you like – well, anything that can be the name of a valid YAML key. We then define the working environment for this job – we’re using an Ubuntu virtual machine and, on that, we’re going to download and run the latest Perl container from Docker Hub.
The first step isn’t strictly necessary, but I like to have a simple but useful step to ensure that everything is working. This one just prints the Perl version to the workflow log. The second step is one you’ll see in just about every GitHub Actions workflow. It uses a standard, prepackaged library (called an “action”) to clone the repo to the container.
The rest of this job will make much more sense once I’ve described the actual build process in my next post. But here it is for completeness:
      - name: Install pandoc and cpanm
        run: apt-get update && apt-get install -y pandoc cpanminus
      - name: Install modules
        run: |
          cpanm --installdeps --notest .
      - name: Get repo name into environment
        run: |
          echo "REPO_NAME=${GITHUB_REPOSITORY#$GITHUB_REPOSITORY_OWNER/}" >> $GITHUB_ENV
      - name: Create pages
        env:
          PERL5LIB: lib
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
        run: |
          mkdir -p web
          perl bin/build $REPO_NAME
      - name: Update pages artifact
        uses: actions/upload-pages-artifact@v3
        with:
          path: web/
Most of the magic (and all of the Perl – for those of you who were wondering) happens in the “Create pages” step. If you can’t wait until the next post, you can find the build program and the class it uses in the repo.
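If it helps to picture the shape of that step before the next post arrives, here is a rough sketch of the kind of thing such a build program could do. It is not the real bin/build; the ppcs/ source directory, the use of Path::Tiny and the bare-bones index page are my own assumptions for illustration.

#!/usr/bin/env perl
# Sketch only: convert each PPC Markdown document to HTML with pandoc
# and write the results, plus a simple index, into web/.
use v5.40;
use Path::Tiny;

my $repo = shift // 'PPCs';
my @pages;

for my $md ( sort glob 'ppcs/*.md' ) {
    my $html = path( $md )->basename( '.md' ) . '.html';
    system( 'pandoc', '-s', '-f', 'markdown', '-t', 'html',
            '-o', "web/$html", $md ) == 0
        or die "pandoc failed for $md";
    push @pages, $html;
}

# A very plain index page linking to every converted document
path( 'web/index.html' )->spew_utf8(
    "<h1>$repo</h1>\n<ul>\n"
    . join( '', map { qq{<li><a href="$_">$_</a></li>\n} } @pages )
    . "</ul>\n"
);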
But for now, let’s skim over that and look at the final step in this job. That uses another pre-packaged action to build an artifact (which is just a tarball) which the next job will deploy to the GitHub Pages web server. You can pass it the name of a directory and it will build the artifact from that directory. So you can see that we’ll be building the web pages in the web/ directory.
The second (and final) job is the one that actually carries out the deployment. It looks like this:
  deploy:
    needs: build
    permissions:
      pages: write
      id-token: write
    environment:
      name: github-pages
      url: ${{ steps.deployment.outputs.page_url }}
    runs-on: ubuntu-latest
    steps:
      - name: Deploy to GitHub Pages
        id: deployment
        uses: actions/deploy-pages@v4
It uses another standard, pre-packaged action and most of the code here is configuration. One interesting line is the “needs” key. That tells the workflow engine that the “build” job needs to have completed successfully before this job can be run.
But once it has run, the contents of our web/ directory will be on the GitHub Pages web server and available for our adoring public to read.
All that is left is for us to write the steps that will generate the website. And that is what we’ll be covering in my next post.
Oh, and if you want to preview the site itself, it’s at https://davorg.dev/PPCs/ and there’s an active pull request to merge it into the main repo.
The post Proposed Perl Changes first appeared on Perl Hacks.
Published by Unknown on Saturday 25 January 2025 23:32