For years, Erlang has been an important poster child of the
functional programming community by providing an effective
retort to the all too common taunt: "FP is an ivory tower
wankfest." It was certainly, um, a "relief" to be able to say,
"There's an industrial strength functional programming
language in use at Ericsson to write huge, fault-tolerant
programs for things like phone switches. Hah!"
While somewhat effective, that retort was, for a long time,
more or less countered by the fact that most folks couldn't
actually play with Erlang---being, as it was, the proprietary
product of Ericsson.
Fortunately for sensitive academics everywhere, around two
years ago Ericsson released an Open Source version of Erlang
which is both freely downloadable and free for any use you'd
care to put it to (I believe that even Academic Wankering
is permitted, perhaps even encouraged).
Last year, the Open Source and the commercial code bases got
in sync, and peace, harmony, and transparently concurrent
programs reign.
So, what's cool about Erlang? Well, it's a dynamically typed,
concurrent, functional programming language. It's built from
the ground up to handle and make easy---nay, trivial!---the
use of lots and lots of (lightweight) threads.
Essentially, Erlang programming revolves around three key
structures:
- Functions: the basic lexical unit of computation in
  Erlang. A function looks a bit like a Prolog rule: you can
  have multiple clauses, which are selected by pattern
  matching on the arguments. And if you understood that, you
  didn't need to read it.
- Modules: the basic unit of... er... modularization. I was
  going to say "abstraction", but that's not quite right.
  It's not quite wrong either. Dang. Anyway, the module
  system is quite simple and straightforward and not at all a
  pain to use.
- Processes: the basic unit... oh, forget it. Erlang uses
  a message passing model to communicate between and
  coordinate processes. The cool thing is that you, as the
  programmer, don't have to care where the receiving process
  is (in principle, of course, it could be on a different
  processor or a different machine!). Each process can
  specify all sorts of neat message selection policies (e.g.,
  a certain process can "take" a message from process A only
  after receiving a "go ahead" message from process B).
Together (and with other such goodies as list comprehensions,
function guards, process linking, error trapping and passing),
these make a pretty powerful package wrapped up in a rather
clean syntax. Think of it as a non-graphical BeOS of a
programming language.
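To make the "go ahead" example a little more concrete, here is a rough sketch of selective message receipt. It's in Python rather than Erlang, and everything in it (the Mailbox class, the worker function, the message shapes) is made up for illustration. Python threads are also nowhere near as lightweight as Erlang processes, so take it as a picture of the idea and not of how Erlang does it.

```python
import queue
import threading


class Mailbox:
    """Hypothetical mailbox that lets a receiver pick which message it wants."""

    def __init__(self):
        self._incoming = queue.Queue()
        self._saved = []  # messages already seen but not yet wanted

    def send(self, message):
        self._incoming.put(message)

    def receive(self, wanted):
        """Return the oldest message for which wanted(message) is true."""
        # First look through messages that were set aside earlier.
        for index, message in enumerate(self._saved):
            if wanted(message):
                return self._saved.pop(index)
        # Otherwise block on new arrivals, setting aside the unwanted ones.
        while True:
            message = self._incoming.get()
            if wanted(message):
                return message
            self._saved.append(message)


def worker(mailbox, log):
    # Wait for B's go-ahead first, even if A's message arrived earlier.
    mailbox.receive(lambda m: m == ("b", "go ahead"))
    log.append("got go-ahead from b")
    _, payload = mailbox.receive(lambda m: m[0] == "a")
    log.append("took %r from a" % payload)


mailbox = Mailbox()
log = []
mailbox.send(("a", "hello"))     # arrives first, but is set aside
mailbox.send(("b", "go ahead"))
thread = threading.Thread(target=worker, args=(mailbox, log))
thread.start()
thread.join()
print(log)
```

The sender never needs to know what the worker is currently waiting for; the ordering policy lives entirely on the receiving side.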
What prompted me to finally write about Erlang is the call for participation
for the sixth Erlang conference.
There is lots of very interesting stuff on the agenda,
which, if Kendall ever manages to sell that second kidney,
I'll be reporting on, live, from Stockholm. Of particular
interest (to me):
- high performance implementations of Erlang (HiPE and ETOS),
- an Erlang chip(!),
- EXML... finally, XML in Erlang,
- and, especially, "Erlang Bit Syntax - The Released
  Version".
I'm dying to see the last item. There was a wonderful teaser
about it at last year's conference (ahem, not that I
went, mind you, but you can download the article in
PostScript from the proceedings).
Essentially, the original proposal extended the pattern
matching paradigm to binary objects.
"Huh?", you may well ask. (Warning, long technical
simplification ahead:) Pattern matching (or "destructuring",
if you're a Lisper; or "watered down unification" if you're a
Prologer) can be thought of as a kind of assignment. In
standard assignment expressions (or statements), the entire
right hand side is assigned (or copied, or bound) to the left
hand side. So:
aVariable := 1
Of course, you can do this with more complicated data
structures than numbers:
aVariable := #(1, 2, 3)
(Think of the #(1, 2, 3) as an array or a list of three items.)
Note that 'aVariable' gets bound to the entire list. If
we want to get at the contents, we must use our indexing or
element referencing syntax on aVariable, or do further
assignments (along with such use of syntax).
theHeader := aVariable[1]
theMiddler := aVariable[2]
Ick. With pattern matching, we can do multiple assignments at
once:
#(theHeader, theMiddler, _) := #(1, 2, 3)
(The underscore is the standard "don't care", or "anonymous"
variable... it allows you to fill in pieces of the
structure without having to come up with names for all
of them.) Now, we match only if our structures match (so it's
really conditional assignment). Hmm. Left hand side is
a three element array; right hand side is a three element
array; bang! we match.
Given that we match, the assignment "lines up" the variables
and does the corresponding assignments. So, theHeader and
theMiddler get bound to 1 and 2, respectively.
(Ok, ok, so you didn't follow it. It's still ruthlessly cool.
Your structures can be arbitrarily complex and you don't have
to do icky dereferencing navigation to pull out the elements
you want.)
Probably the most familiar sort of pattern matching is the
standard use of regular expressions in scripting languages.
They allow you to (rather crudely) pull strings (a.k.a. byte
arrays) apart without having to resort to nasty looping and
nested conditionals. Pattern matching, in general, extends
this to arbitrary data structures, and Erlang's bit syntax
brings it back down to binary objects (a.k.a. byte arrays,
a.k.a. strings).
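For comparison, here is roughly what the regex flavour of destructuring looks like, sketched in Python. The HTTP request line is just a convenient input I picked, and the pattern is deliberately simplistic; it is not meant as a real HTTP parser.

```python
import re

# Named groups play the role of the variables in a pattern.
REQUEST_LINE = re.compile(
    r"(?P<method>[A-Z]+) (?P<path>\S+) HTTP/(?P<version>\d\.\d)"
)


def parse_request_line(line):
    """Return (method, path, version), or None if the line has another shape."""
    found = REQUEST_LINE.fullmatch(line)
    if found is None:
        return None
    return found.group("method", "path", "version")


print(parse_request_line("GET /index.html HTTP/1.0"))
print(parse_request_line("not a request"))
```

No loops, no nested conditionals, but also no structure beyond "a flat string with some bits captured".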
What? Why should we care if it's just bringing ye olde regexes
to Erlang?!? Well, frankly, regexes suck. Erlang's bit syntax
presents an alternative way to do string destructuring, which
should be interesting just for the fact of being an
alternative. It's also more cleanly general. You can use
regexes to munge generic binary data, but it starts to get
painful and confusing. Erlang's bit syntax seems to work
better in those cases. Plus, in Erlang, you dispatch to the
different clauses of a function using pattern matching. Being
able to trigger a certain function variant (or case clause, or
message receive clause) based on the structure of the "string"
passed to it is very convenient (especially since you
can automatically extract the bits you want for that
clause, bind them to variables, and ignore the rest... all in
the function header).
Ahem, some of this is extrapolation on my part (though it is
very interesting to compare the preliminary Erlang code using
the bit syntax for handling HTTP and SMTP with other
approaches), as I don't know what the final version looks like
yet (sniff...have to wait for the conference proceedings). But
I'll just add that Erlang is full of little gems like that
already. And the apps it comes bundled with already
make for an extensive playground (e.g., Inets, an
Apache-configuration-compatible HTTP server; Mnesia, a distributed,
fault-tolerant object DBMS; a CORBA ORB; etc.).
I should mention that the "Erlang Book" (there's a PDF
of the first half available for downloading) is an
excellent read. A bit dated now, and a tad skimpy in
places, it is nevertheless exceedingly clear.