Perl to Groovy Mappings (An Introduction)

Recently I took a job with a startup that uses Groovy as its main language. I find the language to be similar to several other languages, and the Groovy site has some comparisons; however, one comparison it is missing is to Perl. Since I'm learning Groovy anyway, and the two are sufficiently similar, I thought I'd give it a shot.

Before I start with the articles there are a few important things to note about Groovy:

  1. Groovy autoboxes, so it is completely legitimate to write `"I like cheese".someMethodOnStrings()`.
  2. Groovy automatically provides a variable named `it` as the only parameter of a lambda function if no parameter list is given, meaning that `def foo = { it * 2}` is similar to `def foo = { it -> it * 2}`.
  3. Groovy's syntax provides an alternate way to list parameters for a method call when the last parameter is a lambda function, meaning that `someData.someMethod(1,2, { it + 2})` is the same as `someData.someMethod(1,2) { it + 2}`. This is useful for methods that loop over lists, like `each`.

 

Perl to Groovy Mapping (Iterating)

This entry will primarily go over various approaches towards iterating over structures of data. This will include basic looping as well as equivalents to Perl’s map & grep.

This is not intended to be an exhaustive guide or a best practice for either language. The attempt, as with all articles in this series, is to provide a basic set of mappings between the languages.

Iterating over collections

To loop over all the items in an array in Perl you would probably use the for loop construct:

for my $item (@array) {…}

You could do something similar in Groovy using the each method on collections:

list.each {}

To loop over all the keys in a hash (or Map) you can combine the above with the keys builtin:

for my $key (keys(%hash)) {…}

In Groovy you simply use a lambda that accepts two parameters.

myMap.each { key, _value ->  }

Looping over all the values is similar to the examples above in both languages, so I will skip that. Looping over each entry in the hash with both the key and value in Groovy is the same as the example above; in Perl you would have to use each, which I don’t like.
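As a rough sketch of the Perl side, here are both styles of walking a hash's key/value pairs. The %price hash and its contents are made up for illustration. One well-known reason to be wary of each is that it relies on a single iterator stored in the hash itself, so leaving the loop early can make a later each on the same hash resume part-way through.

```perl
use strict;
use warnings;

# Hypothetical data, only for illustration.
my %price = (apple => 3, banana => 1, cherry => 7);

# Style 1: loop over the (sorted) keys and look each value up.
my @via_keys;
for my $key (sort keys %price) {
    push @via_keys, "$key=$price{$key}";
}

# Style 2: each hands back one (key, value) pair per call, in no particular order.
my @via_each;
while (my ($key, $value) = each %price) {
    push @via_each, "$key=$value";
}
@via_each = sort @via_each;

print "@via_keys\n";    # apple=3 banana=1 cherry=7
print "@via_each\n";    # apple=3 banana=1 cherry=7
```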

It is important to note that with Groovy’s each different things are passed in based on the number of parameters. The example above gave each a 2 parameter lambda and so the lambda got inputs of [key, value]. Had a 1 parameter lambda been passed in then that lambda would have received an Entry object that had accessors for the key and value.

Iterating over a Set of Integers

Both Perl and Groovy provide many ways to iterate over a range. One popular approach in Perl is to use the dot-dot range operator (..) with the for looping construct. By using this it is simple to print all the numbers between one and ten:

for my $num (1..10) { print $num }

Groovy has a similar approach:

(1..10).each { print it }

Groovy has a couple other ways to do the same thing, for example numbers have an upto method attached to them that can be used to iterate. As such, the same thing above could be written as:

1.upto(10) { print it }

Of course, Perl can also loop with the while construct, and Groovy has a for-in loop in addition to what I’ve listed here.

If you wanted to iterate over a set of integers that didn’t increment by one in Perl you might use map and for like this:

for my $multiple_of_five ( map { $_ * 5 } 1..5) {…}

Or you might use grep and for like this:

for my $multiple_of_five ( grep { not $_ % 5 } 1..25) {…}

Or, of course, you could pick up a useful module, like Math::Sequence.

In Groovy you would simply use step. Note that the end value is exclusive, so it has to be past 25 for 25 to be included:

5.step(26, 5) {}

Mapping a List

Perl helpfully provides the map builtin to easily map a list of values from one to another. For example, were a developer wanting to square all the values in a list they would:

my @squared = map { $_**2 } @original_values

The same thing can be performed in Groovy using the collect() method on collections:

List squared = originalValues.collect { it.power(2) }

Grepping a List

Perl provides the grep builtin to easily filter values out of a list and provide a new list with only the remaining values. For example, if a developer was looking for all the items in a list that were odd they could:

my @odd = grep { $_ % 2 } @numbers

To do this in Groovy they would use the findAll method that exists on collections:

List odd = numbers.findAll { it % 2 }

Grepping for the First Item in a List

Perl has a number of modules which are considered ‘core’. It is reasonable to assume any core module is installed and readily available. To get the first item in a list that matches some criteria a Perl developer could use the example above for grepping a list and shift off the top, but more likely they would use List::Util’s first like this:

use List::Util qw(first);
my $first_odd = first { $_ % 2 } @numbers;

A similar approach in Groovy could be done via the find method on collections:

def firstOdd = numbers.find { it % 2 }

If You Like X, You Might Like Perl

Objective-C - You'd probably like Moose's concept of roles if you like categories.  Roles also provide the ability to require that methods exist on classes so you could use them for interfaces too, but when you consume a role you can't guarantee it won't add functionality.

Clojure - If you were willing to play with Perl 6 you'd probably like many of its lazy aspects.  The common example is binding an infinitely long fib sequence to a list (my @fib := 0,1,*+*...* ).  Perl 6 also supports multi-method dispatching and pattern matching.  I don't know about Clojure's pattern matching abilities, but I do know that Perl 6's is weaker than Erlang's, so be aware of that.

If you don't want to try Perl 6 you should look at the Higher Order Perl book.  It focuses on functional Perl and steals heavily from Lisp.  In fact, the author states very early on that of the 7 things that make Lisp different, Perl has 6.  It is available free online or from a book store.  The author is currently boycotting Amazon though, so while they sell it you may want to buy from someone else. 

Ruby - I haven't gotten around to learning Ruby.  For what it's worth, there are a lot of people I know that like both Ruby and Perl so I have to assume there are some good similarities.  If you are looking for method_missing it is called AUTOLOAD.  Besides that, Moose::Exporter is great for making keywords for DSLs.  You can use Perl 5's prototypes, but that gets a little more complicated.  You can also use Devel::Declare if you just want to take the parser over altogether which, of course, gets even more complicated.

Feel free to add more in the comments or tell me where I'm wrong, I'll update the body of the post as the conversation goes along.

Week 4: Currying

This week we'll be covering the process of partially applying parameters to a subroutine.  This is based on closures, which we covered last week.  It allows us to create new subroutines that we can pass around that already have some of their parameters defined.  One of the benefits of this is that we don't have to pass around a parameter list and wait until all the parameters are ready; instead we can apply each parameter as we get it and then execute the subroutine at some later time when we are ready.  Doing this in Perl is actually very simple.
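A minimal sketch of a currying helper of this kind might look as follows. The name curry and the $add example are assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a new sub that calls $code with the
# already-bound parameters first, followed by whatever it is given later.
sub curry {
    my ($code, @bound) = @_;
    return sub { return $code->(@bound, @_) };
}

my $add      = sub { my ($x, $y) = @_; return $x + $y };
my $add_five = curry($add, 5);    # $x is fixed to 5

print $add_five->(10), "\n";      # 15
```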

The function above provides us the ability to curry other functions.  Analyzing it shows that all we are doing is creating a new anonymous function that is going to call the function we passed in with the parameters we passed in and the parameters it receives.  Let's see an example of how this would work.
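A sketch of such an example, assuming a curry helper like the one described (repeated here so the snippet runs on its own) and made-up fruit names. This version returns the joined string and prints it at the call site, and its line numbers will not match the ones mentioned in the walkthrough.

```perl
use strict;
use warnings;

# Hypothetical helper, as described in the text.
sub curry {
    my ($code, @bound) = @_;
    return sub { return $code->(@bound, @_) };
}

# Joins whatever list of parameters it receives.
my $join_fruit = sub { return join ', ', @_ };

# Bind three fruit, then bind two more on top of the curried sub.
my $three_fruit = curry($join_fruit, qw(apple banana cherry));
my $five_fruit  = curry($three_fruit, qw(date elderberry));

print $five_fruit->(), "\n";    # apple, banana, cherry, date, elderberry
```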

So, let's look at how it all worked.  First, we made a new anonymous function that takes a list of parameters, joins them, and prints them.  Next we curried that function with 3 fruit.  To do this we created a new anonymous subroutine that passes the three fruit into the first anonymous subroutine.  After that we repeated the process: by currying the subroutine returned by the first curry call, we now have a subroutine that calls the subroutine created on line 3 with 2 new fruit, and that subroutine in turn calls the subroutine created on line 1.  Once we execute this we will output a list of all the fruit we've given to the function.

We don't have to only curry anonymous functions.  Because we can get the reference of named functions and builtins we can also curry those.
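As a hedged illustration, the sketch below curries a named sub through its reference and curries the join builtin by wrapping it in a small anonymous sub, which is a portable way to get a reference to builtin behaviour. The names greet, $hello, and $dash_join are made up.

```perl
use strict;
use warnings;

# Hypothetical helper, as described in the text.
sub curry {
    my ($code, @bound) = @_;
    return sub { return $code->(@bound, @_) };
}

# A named function is curried through its reference, \&greet.
sub greet {
    my ($greeting, $name) = @_;
    return "$greeting, $name!";
}
my $hello = curry(\&greet, 'Hello');

# A builtin is wrapped in an anonymous sub first, then curried.
my $dash_join = curry(
    sub { my ($separator, @items) = @_; return join $separator, @items },
    '-',
);

print $hello->('World'), "\n";        # Hello, World!
print $dash_join->(1, 2, 3), "\n";    # 1-2-3
```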

This concludes this week's lesson.  Next week we will move past the building blocks of functional programming and start getting into systems built on top of this functional foundation, beginning with dynamic dispatch tables.

As an aside, the name curry didn't come from the tasty South Asian type of dish, unlike the naming of Mix-Ins.  Instead, currying is named after the American Mathematician and Logician Haskell Curry.

Week 2: Closures

This week we will be covering closures.  Closures are another basic concept of functional programming.  They are based right on top of anonymous functions so if you haven't read last week's article it would behoove you to do so.

Closures are subroutines that "close over" variables thereby storing them in the scope of the subroutine to be accessed later.  Values stored in variables that exist within the scope of a function are guaranteed to exist as long as the function does.  This allows a developer to create subroutines that do many useful things, including provide state and encapsulate values.

In Perl a subroutine can access variables defined anywhere lexically above it, even once the scope that it is in is left.  This allows us to access otherwise lost data.
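A minimal sketch of this behaviour, using a made-up sub name get_value and an arbitrary value:

```perl
use strict;
use warnings;

{
    my $value = 42;                     # lexical, visible only inside this block
    sub get_value { return $value }     # the named sub closes over $value
}

# The block has been left, so $value itself is out of scope here,
# but the data is still reachable through the sub.
print get_value(), "\n";    # 42
```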

Whereas if we change this example to simply:
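A sketch of such a reduced variant, assuming the example kept a lexical $value inside a block. The broken code is compiled from a string with eval so that the surrounding script still runs and can report the error:

```perl
use strict;
use warnings;

# The variant without a closure: $value is used outside the block that declares it.
my $broken = q{
    use strict;
    {
        my $value = 42;
    }
    print $value;
    1;
};

my $compiled = eval $broken;    # compilation fails, so eval returns undef
my $error    = $@;

print "compile failed: $error" unless $compiled;
```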

This code will fail at compilation time (under use strict) because $value isn't declared in the outer scope.

This is not unique to named functions; anonymous functions provide the same functionality:
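The same idea sketched with an anonymous subroutine (the variable names are made up):

```perl
use strict;
use warnings;

my $get_value;
{
    my $value = 42;
    $get_value = sub { return $value };    # anonymous sub closes over $value
}

# $value is out of scope here, yet the code reference still sees it.
print $get_value->(), "\n";    # 42
```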

We can use closures to create new subroutines that have some state stored in them.  For example, here is a subroutine that returns a subroutine that takes numbers to the power of the variable that was closed over.
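One way such a generator could be written is sketched below. The names power_of_generator and $power come from the text; $square, $cube, and $number are assumptions for illustration.

```perl
use strict;
use warnings;

# Returns a new sub that raises its argument to the closed-over $power.
sub power_of_generator {
    my ($power) = @_;
    return sub {
        my ($number) = @_;
        return $number ** $power;
    };
}

my $square = power_of_generator(2);
my $cube   = power_of_generator(3);

print $square->(4), "\n";    # 16
print $cube->(2),   "\n";    # 8
```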

It is important to note in the example above that each time power_of_generator is called the new anonymous subroutine returned has its own value of $power and cannot interfere with any other subroutine's value of $power.

Another useful ability that closures have is that they can safely encapsulate data, more safely than many of the OOP approaches in Perl.
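A hedged sketch of this kind of encapsulation follows. set_name and $name are named in the text; get_name and the exact definition of "title cased" (every word is one uppercase letter followed by lowercase letters) are assumptions.

```perl
use strict;
use warnings;

{
    my $name;    # reachable only through the two subs below

    sub set_name {
        my ($new_name) = @_;
        # Assumed rule: each word is an uppercase letter followed by lowercase letters.
        die "Name must be title cased\n"
            unless defined $new_name
                && $new_name =~ /\A[A-Z][a-z]*(?: [A-Z][a-z]*)*\z/;
        $name = $new_name;
        return $name;
    }

    sub get_name { return $name }
}

set_name('Haskell Curry');

# A value that is not title cased is refused and the stored name is untouched.
my $rejected = !eval { set_name('haskell curry'); 1 };

print get_name(), "\n";    # Haskell Curry
```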

It is important to note in the example above that $name cannot be accessed outside of the scope of the outer curly brackets.  As such, there is no way to set $name to a value that is not title cased.  This is because the only way to set name is through set_name() which will not accept a value that isn't title cased.

The example above only provides the ability to store one name in a running program.  If we wanted to safely store multiple names we could do so by wrapping the $name and accompanying subs into a larger sub.
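One possible shape for that larger sub is sketched here. The name make_name_store and the returned pair of setter and getter are assumptions; each call creates a fresh $name that only its own two closures can reach.

```perl
use strict;
use warnings;

# Hypothetical factory: every call gets its own private $name.
sub make_name_store {
    my $name;
    my $set = sub {
        my ($new_name) = @_;
        die "Name must be title cased\n"
            unless defined $new_name
                && $new_name =~ /\A[A-Z][a-z]*(?: [A-Z][a-z]*)*\z/;
        return $name = $new_name;
    };
    my $get = sub { return $name };
    return ($set, $get);
}

my ($set_author, $get_author) = make_name_store();
my ($set_editor, $get_editor) = make_name_store();

$set_author->('Ada Lovelace');
$set_editor->('Alan Turing');

print $get_author->(), " / ", $get_editor->(), "\n";    # Ada Lovelace / Alan Turing
```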

This concludes our discussion on closures.  Next week we will move on to currying and a discussion on 'higher order' subroutines.

 

Basic Moose Week 1: Creating a Class

*** Update:  Thanks to Kent Fredric (@kentnl) for catching my mistake on his MooseX::Has::Sugar***

Perl5's object system is impressive in its simplicity.  From that simplicity it gains an incredible flexibility which both matches Perl and allows for amazing things.  However, the spartan nature of the system means the same boilerplate patterns get written over and over.  This last point has caused many people to design new object interfaces on top of Perl's object system.  The most robust and popular of these is Moose.  This first tutorial will show you a simple way to create classes using Moose.

Moose allows the user to easily declare the attributes that are provided on a class along with metadata about those attributes.  This metadata defines many important aspects of the attribute, such as what type of attribute it is, what the permissions are on the attribute, the default value of the attribute, and many other things.  Here is an example of a simple class called Employee:
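A sketch of what such a class might look like, assuming Moose is installed from CPAN. The attribute names pay and manager and their types follow the description in the text; the short usage at the bottom and its figures are made up.

```perl
package Employee;
use Moose;    # exports has and friends, and enables strict and warnings

# Read/write attribute whose accessor only accepts numbers.
has pay => (is => 'rw', isa => 'Num');

# Read/write attribute whose accessor only accepts Employee objects.
has manager => (is => 'rw', isa => 'Employee');

package main;
use strict;
use warnings;

# Hypothetical usage: new and both accessors are generated by Moose.
my $boss   = Employee->new(pay => 90_000);
my $worker = Employee->new(pay => 50_000, manager => $boss);
$worker->pay(55_000);

print $worker->pay, " ", $worker->manager->pay, "\n";    # 55000 90000
```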

This class definition does several things, so let's break it down.  First we use Moose, which exports some spiffy 'keywords' like has.  Additionally, it very sneakily turns on the strict and warnings pragmas in our package, which is very convenient.  Next we use the has keyword to define an attribute called pay.  We define this attribute as being both readable and writable through its accessor and we say that the accessor should only accept numbers.  Next, we create another attribute called manager, which has an accessor that allows both reading and writing and only accepts Employee objects.  How does it know that Employee is a class?  Well, Moose has a list of things that it considers types.  If the type is already defined then it uses that type, otherwise it assumes it is a class.

That seems like it could be dangerous though.  For example, what if you typo'd 'Num' to be 'num' or 'NUm' or if you spelled out 'number'?  In these cases Moose would assume that 'num', 'NUm', or 'number' were names of classes and only accept values that were of that class.  Don't worry though, someone has already seen and solved that problem.  The solution we will use is a combination of MooseX::Has::Sugar and MooseX::Types.  MooseX::Has::Sugar provides us with several predefined 'keywords' that we can use, and if we typo one we'll get compile time errors.  Additionally, it defines rw and ro (the value for is that indicates read-only) in such a way that we no longer have to provide the 'is =>' portion.  MooseX::Types allows us to predeclare types and organize them into libraries.  Now our class definition looks like this:

This looks much better.  By the way, there are many options to MooseX::Has::Sugar and MooseX::Types so you would really benefit from reading over their documentation.

Let's try and use this class in a script:

Now, if we look at that script we see some very interesting things.  We never explicitly created a new, manager, or pay method but they are all there.  This is because Moose created each of these for us to use.  In fact, it is imperative that when you are using Moose you do not create your own new method; that is very dangerous.

Of course, Moose provides many other meta attributes to define the attributes of your class, some of which are:

  • required - whether calls to new() must include this attribute, can be 1 or 0
  • default - the default value of this attribute or a reference to a sub that will return the default value, it is better to use builder though (see below)
  • lazy - whether this attribute will be built when the object is built or whether Moose will wait until it is first used
  • …and more we'll cover in following weeks

We'll rewrite our class by adding some attributes and adding these meta attributes to all the attributes, but this time we won't use MooseX::Has::Sugar or MooseX::Types:

Builder vs. Default

I mentioned above that you can set the default value of an attribute with default; however, it is better to use builder.  The builder option takes the name of a subroutine that will return a value for the attribute instead of a subref or actual value.  The benefit of this is that if you inherit from a base class that uses a builder you can easily override the default value by overriding the sub that builds the value; with default it is not so simple.
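The inheritance benefit can be sketched like this, assuming Moose is installed. The Manager subclass, the builder name _build_pay, and the figures are hypothetical.

```perl
package Employee;
use Moose;

# builder takes the NAME of a method that returns the default value.
has pay => (is => 'rw', isa => 'Num', builder => '_build_pay');

sub _build_pay { return 30_000 }

package Manager;
use Moose;
extends 'Employee';

# Overriding the builder method is all it takes to change the default.
sub _build_pay { return 60_000 }

package main;
use strict;
use warnings;

my $employee_pay = Employee->new->pay;
my $manager_pay  = Manager->new->pay;

print "$employee_pay $manager_pay\n";    # 30000 60000
```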

Caveat

There is one important caveat to mention.  Neither Moose nor any of its extensions are magical and, in fact, none of them actually create new keywords.  Instead, all of these modules import ordinary functions into your package just like any other module.  This means that they run in the same order as you would expect function calls to run.  It also means you can write them any way you want; however, this is a place where following the standard benefits everyone.

Dirty Little Secret

As much as I like the idea of MooseX::Has::Sugar and MooseX::Types I should probably mention that I actually rarely use them.  However, they seem like a good tool to start off with.


Week 1: The Lambda Function

***UPDATE:  I modified the final example with Naveed's (@ironcamel) suggestion, it looks much nicer now.***

***UPDATE: Christopher Bottom caught some vestigial lines of code in the final example, which I have removed.***

The Lambda Function is the building block of all functional programming.  It is also referred to as an anonymous function or sometimes in Perl as an anonymous subroutine.  Perl treats subroutines as first class data types.  Because of this it is incredibly simple to create, store, and invoke anonymous subroutines.  To create an anonymous subroutine you simply use the 'sub' keyword without putting a name between the keyword and the body.  For example:
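A minimal sketch of such an expression; the body is made up, and ref is used only to show what kind of value the expression produces:

```perl
use strict;
use warnings;

# sub with no name between the keyword and the body yields a reference
# to a new anonymous subroutine.
my $kind = ref( sub { return "Hello, world!" } );

print "$kind\n";    # CODE
```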

Is an example of an anonymous subroutine.  Obviously it isn't very useful though.  To make this useful we need to store the reference of that anonymous subroutine in a variable:
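Sketched with the $hello variable mentioned in the text (the body of the sub is an assumption):

```perl
use strict;
use warnings;

# The scalar holds a reference to the anonymous subroutine, not its result.
my $hello = sub { return "Hello, world!" };

print ref($hello), "\n";    # CODE
```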

Now that we have an anonymous subroutine referenced by the $hello variable we can access it by dereferencing and calling it like so:
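A short sketch of the common call syntaxes, assuming a $hello like the one described (redefined here so the snippet runs on its own):

```perl
use strict;
use warnings;

my $hello = sub { return "Hello, world!" };

my $with_arrow  = $hello->();      # arrow syntax, the most common form
my $with_sigil  = &$hello();       # & sigil dereference
my $with_braces = &{$hello}();     # same, with explicit braces

print "$with_arrow\n";    # Hello, world!
```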

Anonymous subroutines are just like normal subroutines.  Because of this you can pass parameters into them.  Passed in parameters are provided in @_ just like in a normal subroutine.  If an anonymous subroutine exists within a larger subroutine then both will have their own @_ within their scope and you won't have to worry about them intermingling.
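A small sketch of both points, with made-up names: parameters arrive in @_, and a nested anonymous subroutine sees only its own @_.

```perl
use strict;
use warnings;

my $outer = sub {
    my @outer_args = @_;                        # parameters of the outer sub
    my $inner = sub { return scalar @_ };       # counts only its own parameters
    return (scalar(@outer_args), $inner->('x'));
};

my ($outer_count, $inner_count) = $outer->(1, 2, 3);

print "$outer_count $inner_count\n";    # 3 1
```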



Without dereferencing and calling it you are simply referring to the variable itself, which you can pass around.  In fact, because these subroutine references are first class they can be treated the same as any other value.  This means that you can store subroutine references in arrays and as hash values, and that you can pass them to other subroutines.  (A reference used as a hash key is turned into a string and can no longer be called.)
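A sketch of these uses, with made-up names: code references stored in an array, stored as hash values, and passed to another subroutine.

```perl
use strict;
use warnings;

# Code references in an array, applied one after another.
my @pipeline = (
    sub { return $_[0] + 1 },
    sub { return $_[0] * 2 },
);

# Code references as hash values, looked up by name.
my %operation = (
    add      => sub { return $_[0] + $_[1] },
    multiply => sub { return $_[0] * $_[1] },
);

# A code reference passed into another subroutine.
sub apply {
    my ($code, @args) = @_;
    return $code->(@args);
}

my $value = 5;
$value = $_->($value) for @pipeline;         # (5 + 1) * 2

my $sum = apply($operation{add}, 2, 3);

print "$value $sum\n";    # 12 5
```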


This concludes this week's installment in Functional Perl.  Next week will cover the next building block, closures.