How phasers work in Perl 6

This is the sixth in a series of articles about migrating code from Perl 5 to Perl 6. This article looks at the special blocks in Perl 5, such as BEGIN and END, and the possibly subtle change in semantics with so-called phasers in Perl 6.

Perl 6 has generalized some Perl 5 features as phasers that weren’t covered by special blocks in Perl 5. And it has added other phasers that are not covered by any (standard) Perl 5 functionality at all.

One important feature to remember about phasers is that they are not part of the normal flow of a program’s execution. The runtime executor decides when a phaser is run, depending on the type of phaser and its context. Therefore, all phasers in Perl 6 are spelled in uppercase characters to make them stand out.

An overview

Let’s start with an overview of Perl 5’s special blocks and their Perl 6 counterparts in the order they are executed in Perl 5:

Perl 5      Perl 6   Notes
BEGIN       BEGIN    Not run when loading pre-compiled code
UNITCHECK   CHECK
CHECK       -        No equivalent in Perl 6
INIT        INIT
END         END

These phasers in Perl 6 are usually called program execution phasers because they are related to the execution of a complete program, regardless of where they are located in a program.

BEGIN

The semantics of the BEGIN special block in Perl 5 and the BEGIN phaser in Perl 6 are the same. It specifies a piece of code to be executed immediately, as soon as it has been parsed (so before the program, aka the compilation unit, as a whole has been parsed).

There is, however, a caveat with the use of BEGIN in Perl 6: Modules in Perl 6 are pre-compiled by default, as opposed to Perl 5 which does not have any pre-compilation of modules or scripts.

As a user or developer of Perl 6 modules, you do not have to think about whether a module should be pre-compiled (again) or not. This is all done automatically under the hood when installing a module and after each Rakudo Perl 6 update. It is also done automatically whenever a developer makes a change to a module. The only thing you might notice is a small delay when loading a module.

This means that the BEGIN block is executed only when the pre-compilation occurs, not every time the module is loaded. This is different from Perl 5, where modules ordinarily exist only as source code that is compiled whenever a module is loaded (even though that module can load already compiled native library components).

This may cause some unpleasant surprises when porting code from Perl 5 to Perl 6, because pre-compilation may have happened a long time ago or even on a different machine (if it was installed from an OS-distributed package). Consider the case of using the value of an environment variable to enable debugging. In Perl 5 you could write this as:

# Perl 5
my $DEBUG;
BEGIN { $DEBUG = $ENV{DEBUG} // 0 }

This would work fine in Perl 5, as the module is compiled every time it is loaded, so the BEGIN block runs every time the module is loaded. And the value of $DEBUG will be correct, depending on the setting of the environment variable. But not so in Perl 6. Because the BEGIN phaser is executed only once, when pre-compiling, the $DEBUG variable will have the value determined at module pre-compilation time, not at module-loading time!

An easy workaround would be to inhibit pre-compilation of a module in Perl 6:

# Perl 6
no precompilation;  # this code should not be pre-compiled

However, pre-compilation has several advantages that you don’t want to dismiss easily:

  • Data structure setup has to be done just once. If you have data structures that must be set up each time a module is loaded, you can do it once when a module is pre-compiled. This may be a huge saver of time and CPU if the module is loaded often.

  • It can load modules much faster. Because it doesn’t need to parse any source code, a pre-compiled module loads much faster than one that’s compiled over and over again. A prime example is the core setting of Perl 6—the part that is written in Perl 6. This consists of a 64 KLOC/2MB source file (generated from many separate source files for maintainability). It takes about a minute to compile this source file during Perl 6 installation. It takes about 125 milliseconds to load this pre-compiled code at Perl 6 startup. This is almost a 500x speed boost!

Some other features of Perl 5 and Perl 6 that implicitly use BEGIN functionality have the same caveat. Take this example where we want a constant DEBUG to have either the value of the environment variable DEBUG or, if that is not available, the value 0:

# Perl 5
use constant DEBUG => $ENV{DEBUG} // 0;

# Perl 6
my constant DEBUG = %*ENV<DEBUG> // 0;

The best equivalent in Perl 6 is probably an INIT phaser:

# Perl 6
INIT my \DEBUG = %*ENV<DEBUG> // 0;  # sigilless variable bound to value

As in Perl 5, the INIT phaser is run just before execution starts. You can also use Perl 6’s module pre-compilation behavior as a feature:

# Perl 6
say "This module was compiled at { BEGIN DateTime.now }";
# This module was compiled at 2018-10-04T22:18:39.598087+02:00

But more about that syntax later.

UNITCHECK

The UNITCHECK special block’s functionality in Perl 5 is performed by the CHECK phaser in Perl 6. Otherwise, it is the same; it specifies a piece of code to be executed when compilation of the current compilation unit is done.

CHECK

There is no equivalent in Perl 6 of the Perl 5 CHECK special block. The main reason is you probably shouldn’t be using the CHECK special block in Perl 5 anymore; use UNITCHECK instead because its semantics are much saner. (It’s been available since version 5.10.)

INIT

The functionality of the INIT phaser in Perl 6 is the same as the INIT special block in Perl 5. It specifies a piece of code to be executed just before the code in the compilation unit is executed.

In pre-compiled modules in Perl 6, the INIT phaser can serve as an alternative to the BEGIN phaser.

END

The END phaser’s functionality in Perl 6 is the same as the END special block’s in Perl 5. It specifies a piece of code to be executed after all the code in the compilation unit has been executed or when the code decides to exit (either intended or unintended because an exception is thrown).

An example

Here’s an example using all four program execution phasers and their Perl 5 special block counterparts:

# Perl 5
use feature 'say';   # say is not enabled by default in Perl 5
say "running in Perl 5";
END       { say "END"   }
INIT      { say "INIT"  }
UNITCHECK { say "CHECK" }
BEGIN     { say "BEGIN" }
# BEGIN
# CHECK
# INIT
# running in Perl 5
# END

# Perl 6
say "running in Perl 6";
END   { say "END"   }
INIT  { say "INIT"  }
CHECK { say "CHECK" }
BEGIN { say "BEGIN" }
# BEGIN
# CHECK
# INIT
# running in Perl 6
# END

More than special blocks

Phasers in Perl 6 have additional features that make them much more than just special blocks.

No need for a Block

Most phasers in Perl 6 do not have to be a Block (i.e., followed by code between curly braces). They can also consist of a single statement without any curly braces. This means that if you’ve written this in Perl 5:

# Perl 5
# need to define lexical outside of BEGIN scope
my $foo;
# otherwise it won’t be known in the rest of the code
BEGIN { $foo = $ENV{FOO} // 42 };

you can write it in Perl 6 as:

# Perl 6
# share scope with surrounding code
BEGIN my $foo = %*ENV<FOO> // 42;

May return a value

All program execution phasers return the last value of their code so that you can use them in an expression. The above example using BEGIN can also be written as:

# Perl 6
my $foo = BEGIN %*ENV<FOO> // 42;

When used like that with a BEGIN phaser, you are creating a nameless constant and assigning it at runtime.

Because of module pre-compilation, if you want this type of initialization in a module, you would probably be better off using the INIT phaser:

# Perl 6
my $foo = INIT %*ENV<FOO> // 42;

This ensures that the value will be determined when the module is loaded rather than when it is pre-compiled (which typically happens once during the module’s installation).

Other phasers in Perl 6

If you are only interested in learning how Perl 5 special blocks work in Perl 6, you can skip the rest of the article. But you will be missing out on quite a few nice and useful features people have implemented.

Block and Loop phasers

Block and Loop phasers are always associated with the surrounding Block, regardless of where they are located in the Block. You are also not limited to having just one of each, although you could argue that having more than one doesn’t improve maintainability.

Note that any sub or method is also considered a Block with regards to these phasers.

Name    Description
ENTER   Run every time when entering a Block
LEAVE   Run every time when leaving a Block
PRE     Check condition before running a Block
POST    Check return value after having run a Block
KEEP    Run every time a Block is left successfully
UNDO    Run every time a Block is left unsuccessfully

ENTER & LEAVE

The ENTER and LEAVE phasers are pretty self-explanatory: the ENTER phaser is called whenever a Block is entered. The LEAVE phaser is called whenever a Block is left (either gracefully or through an exception). A simple example:

# Perl 6
say "outside";
{
    LEAVE say "left";
    ENTER say "entered";
    say "inside";
}
say "outside again";
# outside
# entered
# inside
# left
# outside again
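
Because any sub or method also counts as a Block, the same two phasers can be attached to a routine. The following is only a minimal sketch; the names greet and @log are made up for illustration and do not come from any library:

```raku
# Perl 6
# Hypothetical example: greet() and @log are illustration names only.
my @log;

sub greet($name) {
    LEAVE @log.push("leaving greet");    # runs whenever the sub is left
    ENTER @log.push("entering greet");   # runs whenever the sub is entered
    @log.push("hello $name");
    return "greeted $name";              # LEAVE still runs after an explicit return
}

my $result = greet("world");
.say for @log;
say $result;
# entering greet
# hello world
# leaving greet
# greeted world
```

The LEAVE phaser runs after the return value has been determined and does not change what the sub returns.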

The last value of an ENTER phaser is returned so that it can be used in an expression. Here’s a bit of a contrived example:

# Perl 6
{
    LEAVE say "stayed " ~ (now - ENTER now) ~ " seconds";
    sleep 2;
}
# stayed 2.001867 seconds

The LEAVE phaser corresponds to the defer functionality found in other modern programming languages, such as Go and Swift.
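
As with a deferred call, a LEAVE phaser also runs when the Block is left through an exception. A small sketch of that behavior, in which risky and @log are hypothetical names chosen for this example:

```raku
# Perl 6
# Hypothetical example: risky() and @log are illustration names only.
my @log;

sub risky() {
    LEAVE @log.push("cleanup");      # runs even though the sub dies
    @log.push("working");
    die "something went wrong";
}

try risky();                         # try catches the exception and sets $!
@log.push("caught: " ~ $!.message);
.say for @log;
# working
# cleanup
# caught: something went wrong
```

The cleanup entry is logged before the exception reaches the try, which is what makes LEAVE suitable for releasing resources.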

KEEP & UNDO

The KEEP and UNDO phasers are special cases of the LEAVE phaser. They are called depending on the return value of the surrounding Block. If the result of calling the defined method on the return value is True, then any KEEP phasers will be called. If the result of calling defined is not True, then any UNDO phaser will be called. The actual value of the Block will be available in the topic (i.e., $_).

A contrived example may clarify:

# Perl 6
for 42, Nil {
    KEEP { say "Keeping because of $_" }
    UNDO { say "Undoing because of $_.perl()" }
    $_;
}
# Keeping because of 42
# Undoing because of Nil

As might a more realistic example:

# Perl 6
{
    KEEP $dbh.commit;
    UNDO $dbh.rollback;
    # set up a big transaction in a database
    True;  # indicate success
}

So, if anything goes wrong with setting up the big transaction in the database, the UNDO phaser makes sure the transaction is rolled back. Conversely, if the Block is left successfully, the transaction will be automatically committed by the KEEP phaser.
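
The same pattern can be tried without a database. The sketch below replaces the database handle with a plain array, so transfer and @log are stand-ins invented for this example rather than part of any real API:

```raku
# Perl 6
# Hypothetical example: transfer() and @log stand in for a real database handle.
my @log;

sub transfer($ok) {
    KEEP @log.push("commit");        # Block left with a defined value
    UNDO @log.push("rollback");      # Block left unsuccessfully
    die "insufficient funds" unless $ok;
    True;                            # indicate success
}

transfer(True);
try transfer(False);                 # the exception triggers UNDO
.say for @log;
# commit
# rollback
```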

The KEEP and UNDO phasers give you the building blocks for a poor man’s software transactional memory.

PRE & POST

The PRE phaser is a special version of the ENTER phaser. The POST phaser is a special case of the LEAVE phaser.

The PRE phaser is expected to return a true value if it is OK to enter the Block. If it does not, then an exception will be thrown. The POST phaser receives the return value of the Block and is expected to return a true value if it is OK to leave the Block without throwing an exception.

Some examples:

# Perl 6
{
    PRE { say "called PRE"; False }    # throws exception

}
say "we made it!";                     # never makes it here
# called PRE
# Precondition '{ say "called PRE"; False }' failed

# Perl 6
{
    PRE  { say "called PRE"; True   }  # does NOT throw exception
    POST { say "called POST"; False }  # throws exception
    say "inside the block";            # also returns True
}
say "we made it!";                     # never makes it here
# called PRE
# inside the block
# called POST
# Postcondition '{ say "called POST"; False }' failed

If you just want to check if a Block returns a specific value or type, you are probably better off specifying a return signature for the Block. Note that:

# Perl 6
{
    POST { $_ ~~ Int }   # check if the return value is an Int
                     # calculate result
    $result;
}

is just a very roundabout way of saying:

# Perl 6
-> --> Int {            # return value should be an Int
                     # calculate result
    $result;
}

In general, you would use a POST phaser only if the necessary checks would be very involved and not reducible to a simple type check.

Loop phasers

Loop phasers are a special type of Block phaser specific to loop constructs. One is run before the first iteration (FIRST), one is run after each iteration (NEXT), and one is run after the last iteration (LAST).

Name    Description
FIRST   Run before the first iteration
NEXT    Run after each completed iteration or with next
LAST    Run after the last iteration or with last

The names speak for themselves. A bit of a contrived example:

# Perl 6
my $total = 0;
for 1..5 {
    $total += $_;
    LAST  say "------ +\n$total.fmt('%6d')";
    FIRST say "values\n======";
    NEXT  say .fmt('%6d');
}
# values
# ======
#      1
#      2
#      3
#      4
#      5
# ------ +
#     15

Loop constructs include loop; while, until; repeat/while and repeat/until; for; and map, deepmap, flatmap.
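
The table above also mentions next and last. The following sketch, in which @log is just an illustrative name, shows how those two statements interact with the NEXT and LAST phasers:

```raku
# Perl 6
# Hypothetical example: @log is an illustration name only.
my @log;

for 1..5 {
    NEXT @log.push("next after $_");   # after each completed iteration, or on next
    LAST @log.push("loop ended");      # after the last iteration, or on last
    next if $_ == 2;                   # skips the body, NEXT still runs
    last if $_ == 4;                   # ends the loop, LAST runs but NEXT does not
    @log.push("body $_");
}
.say for @log;
# body 1
# next after 1
# next after 2
# body 3
# next after 3
# loop ended
```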

You can use Loop phasers with other Block phasers if you want, but this is usually unnecessary.

Summary

In addition to the Perl 5 special blocks that have counterparts in Perl 6 (called phasers), Perl 6 has a number of special-purpose phasers related to blocks of code and looping constructs. Perl 6 also has phasers related to exception handling and warnings, event-driven programming, and document (pod) parsing; these will be covered in future articles in this series.

Posted by News Monkey