I’ve been learning Elixir over the past few weeks, and I decided that it was time to write a slightly less-than-trivial program. While Elixir is based on Erlang, this program doesn’t play to Erlang’s strengths (massive scaling, message passing, etc.). Instead, it was more of a voyage of discovery for me, and this article is my way of taking you along on the tour.
The Problem at Hand
I work with an amateur sports association that holds several tournaments throughout the year. At each of these tournaments, competitors can get points towards a “grand champion” award presented at the end of the year.
Here’s how the scoring works: for local tournaments, first place earns three points, second place is worth two points, and third place gets one point. For a state-level tournament, first place gets five points, second place gets four points, third place gets three points, fourth place gets two points, and fifth through eighth place get one point.
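The rules above can be written down directly as function clauses. This is only a sketch of the rules as stated, not the code used later in this article; the module name ScoringRules and both function names are hypothetical.

```elixir
defmodule ScoringRules do
  # Hypothetical helper, not part of the article's program:
  # maps a placing (1 = first place) to points for each tournament level.

  # Local tournaments: first, second, and third earn 3, 2, and 1 points.
  def local_points(place) when place in 1..3, do: 4 - place
  def local_points(_place), do: 0

  # State tournaments: first through fourth earn 5, 4, 3, and 2 points;
  # fifth through eighth earn 1 point each.
  def state_points(place) when place in 1..4, do: 6 - place
  def state_points(place) when place in 5..8, do: 1
  def state_points(_place), do: 0
end

IO.inspect(Enum.map(1..4, &ScoringRules.local_points/1))
# prints [3, 2, 1, 0]
IO.inspect(Enum.map(1..9, &ScoringRules.state_points/1))
# prints [5, 4, 3, 2, 1, 1, 1, 1, 0]
```

Writing one clause per rule keeps the code close to the wording of the rules, at the cost of being a little longer than a single arithmetic expression.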
The placing data is stored in a spreadsheet, a sample of which looks like this table. A negative number marks a placing at a state-level tournament. (I have excluded the first line of the file, which gives the age group for the data.)
First | Last | Team | Hollister | Santa Cruz | Oak Grove | Open State |
---|---|---|---|---|---|---|
Mark | Arnhelm | Knights | 1 | | | |
William | Alvarez | Woodside | 1 | 3 | -1 | |
Ross | Carter | Alliance | 1 | 2 | 2 | |
Shohei | Takamura | Athlete Nation | 2 | 1 | 2 | -2 |
I already have a Perl program that will read directly from the spreadsheet file and construct an HTML table of the standings. The output has point values rather than placing, and it is sorted in descending order of number of points:
First | Last | Team | Total | Hollister | Santa Cruz | Oak Grove | Open State |
---|---|---|---|---|---|---|---|
Shohei | Takamura | Athlete Nation | 11 | 2 | 3 | 2 | 4 |
William | Alvarez | Woodside | 9 | 3 | 1 | 5 | |
Ross | Carter | Alliance | 7 | 3 | 2 | 2 | |
Mark | Arnhelm | Knights | 3 | 3 | | | |
As part of learning Elixir, I decided to re-implement this program. Instead of trying to parse an OpenDocument file, I exported the spreadsheet as a CSV file with \t (TAB) as the column separator.
Data Design
I store the data for each competitor in an Elixir structure, defined as follows. The points list contains the number of points the competitor gained at each tournament.
defmodule Competitor do
  defstruct surname: "", given_name: "", team: "", total: 0, points: []
end
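As a quick illustration of how such a structure behaves, here is a minimal sketch that builds one competitor and then fills in the scores. The StructDemo module is hypothetical and exists only so the struct can be used from inside a function; the values come from the sample tables above.

```elixir
# Same struct definition as in the article, repeated so the example runs on its own.
defmodule Competitor do
  defstruct surname: "", given_name: "", team: "", total: 0, points: []
end

# StructDemo is a hypothetical module used only for this illustration.
defmodule StructDemo do
  def build do
    # Fields that are not given keep their defaults (total: 0, points: []).
    ross = %Competitor{surname: "Carter", given_name: "Ross", team: "Alliance"}

    # Data is immutable; the update syntax returns a new struct
    # and raises an error if a key is not already a field.
    %{ross | points: [3, 2, 2, 0], total: 7}
  end
end

scored = StructDemo.build()
IO.puts("#{scored.given_name} #{scored.surname} (#{scored.team}): #{scored.total}")
# prints Ross Carter (Alliance): 7
```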
Here, in broad terms, is the pseudocode for the program.
- Open the file and read the header line
- Process the file one line at a time
  - Separate the name and team from the placings
  - Calculate the total and convert the placings to points
  - Create a new Competitor record and add it to a list
- Use the headings and the competitor list in a function that produces HTML output
Reading the File
The following code opens the CSV file and processes it.
def read_csv(filename) do
  input_file = File.open!(filename, [:read, :utf8])
  IO.read(input_file, :line) # ignore age group
  headings = String.split(chomp(IO.read(input_file, :line)), "\t")
  {headings, process_file(input_file, [])}
end
If the file doesn’t exist, File.open! raises an exception giving a meaningful explanation of the error.
When Elixir reads a line, the line includes the ending newline character (\r\n for Windows, \n for Linux, and \r for classic Mac OS). The chomp/1 function deletes the trailing newline (String.strip/1 would also remove it, along with any other trailing whitespace such as tabs), and String.split/2 separates the line into a list of strings.
The last line of the function is its return value: a tuple with the headings and the result of processing the input file.
Here is the chomp/1 function, which uses a regular expression to eliminate any newline character(s) appearing at the end of the line. The end of the string is matched with \z rather than the $ used in many other regular expression engines. String.strip/1 would also remove the newline (together with any trailing tabs), but this gives me a chance to show regular expressions.
def chomp(str) do
  Regex.replace(~r/\r?\n\z|\r\z/, str, "", [{:global, false}])
end
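To see what the regular expression does with each kind of line ending, here is a small sketch. ChompDemo is a hypothetical wrapper module; the function body uses the same pattern as chomp/1 above. The last two lines compare it with String.trim_trailing/1, the current replacement for stripping trailing whitespace, on a row that ends in empty cells.

```elixir
# ChompDemo is a hypothetical wrapper module so the function can be called directly.
defmodule ChompDemo do
  # Same regular expression as the article's chomp/1.
  def chomp(str) do
    Regex.replace(~r/\r?\n\z|\r\z/, str, "", global: false)
  end
end

# Each of these prints "Mark\tArnhelm": only the line ending is removed.
for line <- ["Mark\tArnhelm\n", "Mark\tArnhelm\r\n", "Mark\tArnhelm\r", "Mark\tArnhelm"] do
  IO.inspect(ChompDemo.chomp(line))
end

# A row whose last two cells are empty ends in tabs.
IO.inspect(ChompDemo.chomp("1\t\t\n"))
# prints "1\t\t"
IO.inspect(String.trim_trailing("1\t\t\n"))
# prints "1"
```

The difference matters for this data: trimming all trailing whitespace would drop the empty cells at the end of a row, so the row would split into fewer columns than the header.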
Processing the Input File
As long as there is data to read, process the row and prepend it to namelist; otherwise, return the result of sorting the name list by point totals.
def process_file(input_file, namelist) do
  row = IO.read(input_file, :line)
  if row != :eof do
    process_file(input_file, [process_row(row) | namelist])
  else
    Enum.sort(namelist, &by_points/2)
  end
end
The most interesting part here is Enum.sort/2, which takes a list as its first argument and a function (the “sorting function”) as its second argument. In Elixir, functions are on an equal footing with strings, integers, and other types of data. You can assign a function to a variable, you can pass it as an argument (as in this code), and you can even have a function that returns another function as its value. Treating functions as “first class citizens” is a very powerful feature, and once you understand how to take advantage of it, it can make your code clearer and more flexible.
Every time Enum.sort/2 needs to compare two items, it will pass those items to the sorting function. The sorting function returns true if the first item belongs before the second item, false otherwise.
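Here is a minimal sketch of those three ideas on a plain list of words: a function stored in a variable, a function passed to Enum.sort/2, and a function that returns another function. The variable names and the word list are made up for the illustration.

```elixir
words = ["state", "county", "open", "cup"]

# An anonymous function assigned to a variable. It returns true when
# the first word belongs before the second (shorter words first).
shorter_first = fn a, b -> String.length(a) <= String.length(b) end

# The function is passed as the second argument of Enum.sort/2.
IO.inspect(Enum.sort(words, shorter_first))
# prints ["cup", "open", "state", "county"]

# A function that returns another function: it reverses any sorting function
# by swapping the two items before comparing them.
reversed = fn sorter -> fn a, b -> sorter.(b, a) end end

IO.inspect(Enum.sort(words, reversed.(shorter_first)))
# prints ["county", "state", "open", "cup"]
```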
Here is the by_points/2 function. It first compares the total points; if they are equal, then it orders by surname. If those are the same, it orders by given name, and if those are the same, it uses the team name to break the tie.
def by_points(a, b) do
  if a.total == b.total do
    if a.surname == b.surname do
      if a.given_name == b.given_name do
        a.team < b.team
      else
        a.given_name < b.given_name
      end
    else
      a.surname < b.surname
    end
  else
    a.total > b.total
  end
end
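The same ordering can also be expressed with a single sort key, because Elixir compares tuples of equal size element by element. The sketch below is an alternative, not the approach used in this program; SortSketch and its functions are hypothetical names, and plain maps stand in for Competitor structs since field access works the same way on both.

```elixir
defmodule SortSketch do
  # Hypothetical alternative to by_points/2: one sort key per competitor.
  # Negating the total puts higher totals first, while the names and
  # team still sort in ascending order as tie-breakers.
  def sort_key(c), do: {-c.total, c.surname, c.given_name, c.team}

  def sort(competitors), do: Enum.sort_by(competitors, &sort_key/1)
end

# Totals taken from the sample standings table.
competitors = [
  %{given_name: "Mark", surname: "Arnhelm", team: "Knights", total: 3},
  %{given_name: "William", surname: "Alvarez", team: "Woodside", total: 9},
  %{given_name: "Ross", surname: "Carter", team: "Alliance", total: 7},
  %{given_name: "Shohei", surname: "Takamura", team: "Athlete Nation", total: 11}
]

competitors |> SortSketch.sort() |> Enum.map(& &1.surname) |> IO.inspect()
# prints ["Takamura", "Alvarez", "Carter", "Arnhelm"]
```

The nested if version spells out each tie-breaking step; the tuple version is shorter but relies on knowing how tuples are compared.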
Processing a Row
A row is processed by splitting it on \t (TAB). The person’s name and team are separated from the placings. A call to Enum.map_reduce/3 can convert the placings (first, second, third) to the appropriate number of points and get the total points all in one shot.
def process_row(row) do
  [first, last, team | placing] = chomp(row) |> String.split("\t")
  {points, sum} = Enum.map_reduce(placing, 0, &place_points/2)
  %Competitor{surname: last, given_name: first, team: team,
    total: sum, points: points}
end
The |> operator takes the result of the expression on its left and uses it as the first argument of the function call on its right. The code is the equivalent of String.split(chomp(row), "\t").
Enum.map_reduce/3 takes a list as its first argument, an “accumulator” as its second argument, and a function for the third argument. Enum.map_reduce/3 passes each item and the accumulator in turn to the function, which returns a tuple giving the converted item and the new value of the accumulator.
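Both ideas are easier to see on small values. The following sketch uses made-up inputs: the first part pipes a string through two functions, and the second part doubles each number in a list while keeping a running sum of the originals.

```elixir
# The pipe passes the value on its left as the first argument on its right,
# so this is the same as String.split(String.trim_trailing("a\tb\n", "\n"), "\t").
"a\tb\n" |> String.trim_trailing("\n") |> String.split("\t") |> IO.inspect()
# prints ["a", "b"]

# Enum.map_reduce/3 calls the function with each item and the accumulator.
# The function returns {converted_item, new_accumulator}.
{doubled, sum} =
  Enum.map_reduce([1, 2, 3], 0, fn item, acc ->
    {item * 2, acc + item}
  end)

IO.inspect(doubled)
# prints [2, 4, 6]
IO.inspect(sum)
# prints 6
```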
In my place_points/2 function, I used two function clauses, the first with a guard, to handle the cases of an empty entry or an integer in the CSV file.
def place_points(item, accumulator) when item == "" do
  {0, accumulator}
end
def place_points(item, accumulator) do
  value = String.to_integer(item)
  if value < 0 do
    n = max(1, 6 + value) # state tournament
    {n, accumulator + n}
  else
    n = max(1, 4 - value) # local tournament
    {n, accumulator + n}
  end
end
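As a check, this sketch runs two rows from the sample spreadsheet through the same logic and compares the results with the standings table. PointsDemo and score/1 are hypothetical names; place_points/2 follows the function above, with the two branches of the if condensed into one expression.

```elixir
defmodule PointsDemo do
  # Same logic as the article's place_points/2, wrapped in a hypothetical module.
  def place_points(item, accumulator) when item == "" do
    {0, accumulator}
  end

  def place_points(item, accumulator) do
    value = String.to_integer(item)
    # Negative placings are state tournaments; positive ones are local.
    n = if value < 0, do: max(1, 6 + value), else: max(1, 4 - value)
    {n, accumulator + n}
  end

  # Hypothetical helper: convert a list of placings and total the points.
  def score(placings), do: Enum.map_reduce(placings, 0, &place_points/2)
end

# Shohei Takamura: second, first, second, then second at the state tournament.
IO.inspect(PointsDemo.score(["2", "1", "2", "-2"]))
# prints {[2, 3, 2, 4], 11}

# Mark Arnhelm: first at one tournament, no other placings.
IO.inspect(PointsDemo.score(["1", "", "", ""]))
# prints {[3, 0, 0, 0], 3}
```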
Creating the HTML
The hard work is done; read_csv/1 gives its caller a list of Competitor records that are in the proper order. The following html_output/1 function takes a CSV file name as its argument, passes that file name to read_csv/1, and uses the return value to create the HTML file.
def html_output(input_filename) do
  ends_with_csv = ~r/\.csv\z/
  # Bind the result of the if expression; variables assigned inside
  # its branches are not visible after the end.
  output_filename =
    if input_filename =~ ends_with_csv do
      Regex.replace(ends_with_csv, input_filename, ".html", [])
    else
      input_filename <> ".html"
    end
  output_file = File.open!(output_filename, [:write, :utf8])
  {headings, data} = read_csv(input_filename)
  IO.puts output_file, """
  <html>
  <head>
  <title>#{output_filename}</title>
  </head>
  <body>
  <table border="1">
  <thead>
  #{make_header_row(headings)}
  </thead>
  <tbody>
  """
  emit_html_rows(output_file, data)
  IO.puts output_file, """
  </tbody>
  </table>
  </body>
  </html>
  """
  File.close(output_file)
end
If the input file name ends with .csv, replace it with .html; otherwise add .html at the end. The expression input_filename =~ ends_with_csv is shorthand for Regex.match?(ends_with_csv, input_filename).
This part of the code relies heavily on heredocs to conveniently output multi-line strings. The ending """ must be on a line of its own.
The construction #{variable} interpolates the value of the variable into a string.
But wait…there’s more! You can interpolate a function call, and its return value is inserted into the string.
I could have interpolated the entire table body as a huge string. However, because there are often nearly a hundred competitors from fifteen tournaments in a CSV file, I felt it was better to use IO.puts/2 to write it to the output file one line at a time.
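Here is a small sketch of a heredoc with both kinds of interpolation. HeredocDemo, shout/1, and page/1 are hypothetical names used only for this illustration.

```elixir
defmodule HeredocDemo do
  # Hypothetical helper, used only to show a function call inside #{...}.
  def shout(text), do: String.upcase(text)

  def page(title) do
    # Leading whitespace up to the column of the closing """ is removed
    # from every line of the heredoc.
    """
    <head>
      <title>#{title}</title>
      <!-- #{shout(title)} -->
    </head>
    """
  end
end

IO.write(HeredocDemo.page("standings.html"))
```

The string that page/1 returns ends with a newline, which is why the example uses IO.write/1 rather than IO.puts/1.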
The Table Header
To create the table header, make_header_row/1 gets the list of headings. It uses pattern matching to isolate the first three items (given name, surname, and team) from the competitor’s points. The function then reconstructs the list of headings, adding "Total" as it goes, and uses Enum.join to add closing and opening table data tags between the items. This string is sandwiched between the tags that open and close the table row, and the <> operator concatenates them all together. Notice the use of defp to make this a private function; there is no reason for any other module to call this function.
defp make_header_row([first, last, team | points]) do
  "<tr><td>" <>
    Enum.join([first, last, team, "Total" | points], "</td><td>") <>
    "</td></tr>"
end
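To try the function on its own, the sketch below wraps the same body in a hypothetical HeaderDemo module and makes it public with def, since a defp function cannot be called from outside its module. The heading line is a shortened version of the sample data.

```elixir
defmodule HeaderDemo do
  # Same body as the article's make_header_row/1, public here
  # so it can be called from outside the (hypothetical) module.
  def make_header_row([first, last, team | points]) do
    "<tr><td>" <>
      Enum.join([first, last, team, "Total" | points], "</td><td>") <>
      "</td></tr>"
  end
end

headings = String.split("First\tLast\tTeam\tHollister\tSanta Cruz", "\t")
IO.puts(HeaderDemo.make_header_row(headings))
# prints <tr><td>First</td><td>Last</td><td>Team</td><td>Total</td><td>Hollister</td><td>Santa Cruz</td></tr>
```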
Creating the Table Body Rows
The emit_html_rows/2 function takes the output file as its first parameter and the list of competitors as its second parameter.
defp emit_html_rows(_output_file, []) do
  :ok
end
defp emit_html_rows(output_file, [person | remainder]) do
  %Competitor{given_name: first, surname: last, team: team,
    total: total, points: points} = person
  str = "<tr><td>" <>
    Enum.join([first, last, team], "</td><td>") <>
    "</td><td>" <>
    Enum.join(Enum.map([total | points], &html_cell(&1)), "</td><td>") <>
    "</td></tr>"
  IO.puts(output_file, str)
  emit_html_rows(output_file, remainder)
end
If there are no more competitors, “my job here is done.”
Otherwise, separate the first person in the list from the remainder, and use bracket notation to assign the record’s components to individual variables. By using the bracket notation, I can refer to …
Deep breath here. Enum.map/2 takes a list with the total points first and the tournament points after it as its first argument. It passes each item in turn to html_cell/1, which converts the value to a string. Then, Enum.join/2 puts closing and opening table cell tags between those strings.
The finished table row goes to the output file, and emit_html_rows/2 is called again to process the remaining competitors.
Here’s html_cell/1. If someone has zero points (which means they didn’t place in the tournament), the cell contains a <br /> element. This is necessary to ensure that the cell’s borders are visible in older browsers. Otherwise, the number is converted to a string.
defp html_cell(item) do
  if item == 0 do
    "<br />"
  else
    to_string(item)
  end
end
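The same decision can also be written as two function clauses, in the style of place_points/2. The sketch below is one possible variant, not the version used in the program; CellDemo and row/2 are hypothetical names, and the functions are public so they can be called directly.

```elixir
defmodule CellDemo do
  # Hypothetical variant of html_cell/1 written as two function clauses:
  # the first matches the literal 0, the second matches everything else.
  def html_cell(0), do: "<br />"
  def html_cell(item), do: to_string(item)

  # Hypothetical helper: build the numeric cells of one table row.
  def row(total, points) do
    cells = Enum.map([total | points], &html_cell/1)
    "<td>" <> Enum.join(cells, "</td><td>") <> "</td>"
  end
end

# William Alvarez's totals from the sample standings table.
IO.puts(CellDemo.row(9, [3, 1, 5, 0]))
# prints <td>9</td><td>3</td><td>1</td><td>5</td><td><br /></td>
```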
Conclusion
There you have it. I was able to use Elixir to perform a relatively mundane task: open a file, read it, do some calculations with the data, and write the data out in a new format. I learned quite a few interesting features of Elixir along the way, and I hope you did too.
If you’d like to play around with the code (and improve upon it), here’s the entire Elixir file, and here’s some sample data.