I’ve been learning Elixir over the past few weeks, and I decided that it was time to write a slightly less-than-trivial program. While Elixir is based on Erlang, this program doesn’t play to Erlang’s strengths (massive scaling, message passing, etc.). Instead, it was more of a voyage of discovery for me, and this article is my way of taking you along on the tour.

The Problem at Hand

I work with an amateur sports association that holds several tournaments throughout the year. At each of these tournaments, competitors can get points towards a “grand champion” award presented at the end of the year.

Here’s how the scoring works: for local tournaments, first place earns three points, second place is worth two points, and third place gets one point. For a state-level tournament, first place gets five points, second place gets four points, third place gets three points, fourth place gets two points, and fifth through eighth place get one point.
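As a quick sketch of these rules in Elixir: the module name Scoring and the :local and :state atoms are made up for this illustration and do not appear in the program described later, which encodes the tournament level differently.

```elixir
defmodule Scoring do
  # Hypothetical helper: points earned for a placing at each tournament level.
  # Local tournaments: 1st = 3, 2nd = 2, 3rd = 1.
  def points(:local, place) when place in 1..3, do: 4 - place

  # State tournaments: 1st = 5 down to 4th = 2, then 5th through 8th = 1.
  def points(:state, place) when place in 1..4, do: 6 - place
  def points(:state, place) when place in 5..8, do: 1

  # Any other placing earns nothing.
  def points(_level, _place), do: 0
end

IO.inspect(Enum.map(1..3, &Scoring.points(:local, &1)))  # [3, 2, 1]
IO.inspect(Enum.map(1..8, &Scoring.points(:state, &1)))  # [5, 4, 3, 2, 1, 1, 1, 1]
```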

The placing data is stored in a spreadsheet, a sample of which looks like this table. (I have excluded the first line of the file, which gives the age group for the data.)

First | Last | Team | Hollister | Santa Cruz | Oak Grove Open | State
Mark | Arnhelm | Knights | 1
William | Alvarez | Woodside | 1 | 3 | -1
Ross | Carter | Alliance | 1 | 2 | 2
Shohei | Takamura | Athlete Nation | 2 | 1 | 2 | -2

I already have a Perl program that will read directly from the spreadsheet file and construct an HTML table of the standings. The output has point values rather than placing, and it is sorted in descending order of number of points:

First | Last | Team | Total | Hollister | Santa Cruz | Oak Grove Open | State
Shohei | Takamura | Athlete Nation | 11 | 2 | 3 | 2 | 4
William | Alvarez | Woodside | 9 | 3 | 1 | 5
Ross | Carter | Alliance | 7 | 3 | 2 | 2
Mark | Arnhelm | Knights | 3 | 3

As part of learning Elixir, I decided to re-implement this program. Instead of trying to parse an OpenDocument file, I exported the spreadsheet as a CSV file with \t (TAB) as the column separator.

Data Design

I store the data for each competitor in an Elixir struct, defined as follows. The points list holds the number of points the competitor earned at each tournament, in the same order as the tournament columns.

defmodule Competitor do
  defstruct surname: "", given_name: "", team: "", total: 0, points: []
end
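If you haven’t used structs before, here is a minimal sketch of how this one behaves. The variable names are mine, and the field values come from the sample table.

```elixir
defmodule Competitor do
  defstruct surname: "", given_name: "", team: "", total: 0, points: []
end

# A struct starts out with the defaults given to defstruct.
blank = %Competitor{}

# Fields can be supplied when the struct is created.
ross = %Competitor{given_name: "Ross", surname: "Carter", team: "Alliance"}

# Structs are immutable; the update syntax returns a new struct.
scored = %{ross | points: [3, 2, 2], total: 7}

IO.inspect(blank.total)   # 0
IO.inspect(scored.total)  # 7
IO.inspect(ross.total)    # 0, the original is unchanged
```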

Here, in broad terms, is the pseudocode for the program.

  1. Open the file and read the header line

  2. Process the file one line at a time

    1. Separate the name and team from the placings

    2. Calculate the total and convert the placings to points

    3. Create a new Competitor record and add it to a list

  3. Use the headings and the competitor list in a function that produces HTML output

Reading the File

The following code opens the CSV file and processes it.

def read_csv(filename) do
  input_file = File.open!(filename, [:read, :utf8])  # (1)

  IO.read(input_file, :line) # ignore age group
  headings = String.split(chomp(IO.read(input_file, :line)), "\t")  # (2)

  {headings, process_file(input_file, [])}  # (3)
end
(1) If the file doesn’t exist, File.open! raises an exception giving a meaningful explanation of the error.
(2) When Elixir reads a line, the line includes the ending newline character (\r\n for Windows, \n for Linux, and \r for classic Mac OS). The chomp/1 function deletes the trailing newline, and String.split/2 separates the line into a list of strings.
Note: When you refer to a function in Elixir, you give its name and its number of arguments, also called its arity. So chomp/1 refers to a function named chomp that takes one argument, and String.split/2 refers to the split function that resides in the String module and takes two arguments.
(3) This is the return value: a tuple with the headings and the result of processing the input file.

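Here is the chomp/1 function, which uses a regular expression to remove any newline character(s) at the end of the line. Note that the end of the string is matched with \z, not $ as in many other regular expression engines. You could use String.strip/1 (renamed String.trim/1 in later Elixir releases) instead, as long as no row ends in an empty column, because it removes all trailing whitespace, tabs included. Writing chomp/1 also gives me a chance to show regular expressions.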

def chomp(str) do
  Regex.replace(~r/\r?\n\z|\r\z/, str, "", [{:global, false}])
end
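To see what the regular expression does with each style of line ending, here is a small sketch. The module name LineEndings and the sample strings are made up for the demonstration, and String.trim/1 is shown only for comparison.

```elixir
defmodule LineEndings do
  # Same regular expression as chomp/1 above.
  def chomp(str) do
    Regex.replace(~r/\r?\n\z|\r\z/, str, "", global: false)
  end
end

# Each style of line ending is removed...
IO.inspect(LineEndings.chomp("State\r\n"))  # "State"
IO.inspect(LineEndings.chomp("State\n"))    # "State"
IO.inspect(LineEndings.chomp("State\r"))    # "State"

# ...but trailing TABs (empty last columns) are kept,
# whereas String.trim/1 removes all trailing whitespace, tabs included.
IO.inspect(LineEndings.chomp("Knights\t1\t\t\n"))  # "Knights\t1\t\t"
IO.inspect(String.trim("Knights\t1\t\t\n"))        # "Knights\t1"
```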

Processing the Input File

As long as there is data to read, process_file/2 processes the row and prepends the result to the name list. When the input is exhausted, it returns the name list sorted by point totals.

def process_file(input_file, namelist) do
  row = IO.read(input_file, :line)
  if (row != :eof) do
    process_file(input_file, [process_row(row) | namelist])
  else
    Enum.sort(namelist, &by_points/2)
  end
end
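If you want to try the read-until-:eof pattern without a file on disk, StringIO can stand in for the opened file. This is only a sketch: the module name EofDemo and the sample lines are made up, and it collects plain strings rather than building Competitor structs.

```elixir
defmodule EofDemo do
  # Recursively read lines, prepending each one to the accumulator.
  def read_lines(device, acc) do
    case IO.read(device, :line) do
      # Prepending builds the list backwards, so restore the original order.
      :eof -> Enum.reverse(acc)
      line -> read_lines(device, [String.trim_trailing(line, "\n") | acc])
    end
  end
end

{:ok, device} = StringIO.open("first\nsecond\nthird\n")
lines = EofDemo.read_lines(device, [])
IO.inspect(lines)  # ["first", "second", "third"]
```

Prepending to a list is cheap in Elixir, which is why the accumulator is built head-first and put in order only once at the end.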

The most interesting part here is Enum.sort/2, which takes a list as its first argument and a function (the “sorting function”) as its second argument. In Elixir, functions are on an equal footing with strings, integers, and other types of data. You can assign a function to a variable, you can pass it as an argument (as in this code), and you can even have a function that returns another function as its value. Treating functions as “first-class citizens” is a very powerful feature, and once you understand how to take advantage of it, it can make your code clearer and more flexible.

Every time Enum.sort/2 needs to compare two items, it will pass those items to the sorting function. The sorting function returns true if the first item belongs before the second item, false otherwise.

Here is the by_points/2 function. It first compares the total points; if they are equal, then it orders by surname. If those are the same, it orders by given name, and if those are the same, it uses the team name to break the tie.

def by_points(a, b) do
  if a.total == b.total do
    if a.surname == b.surname do
      if a.given_name == b.given_name do
        a.team < b.team
      else
        a.given_name < b.given_name
      end
    else
      a.surname < b.surname
    end
  else
    a.total > b.total
  end
end
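The same ordering could probably be expressed with Enum.sort_by/2 and a tuple key, since Elixir compares tuples of the same size element by element. This is an alternative sketch, not the code the program uses; the module name Standings is made up, and the totals come from the sample output table.

```elixir
defmodule Competitor do
  defstruct surname: "", given_name: "", team: "", total: 0, points: []
end

defmodule Standings do
  # Sort key: negated total (so higher totals come first), then the
  # surname, given name, and team as tie-breakers in ascending order.
  def sort(competitors) do
    Enum.sort_by(competitors, fn c ->
      {-c.total, c.surname, c.given_name, c.team}
    end)
  end
end

sample = [
  %Competitor{given_name: "Ross", surname: "Carter", team: "Alliance", total: 7},
  %Competitor{given_name: "Mark", surname: "Arnhelm", team: "Knights", total: 3},
  %Competitor{given_name: "Shohei", surname: "Takamura", team: "Athlete Nation", total: 11},
  %Competitor{given_name: "William", surname: "Alvarez", team: "Woodside", total: 9}
]

sorted = Standings.sort(sample)
IO.inspect(Enum.map(sorted, & &1.surname))
# ["Takamura", "Alvarez", "Carter", "Arnhelm"]
```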

Processing a Row

A row is processed by splitting it on \t (TAB). The person’s name and team are separated from the placings. A call to Enum.map_reduce/3 can convert the placings (first, second, third) to the appropriate number of points and get the total points all in one shot.

def process_row(row) do
  [first, last, team | placing] = chomp(row) |> String.split("\t")  # (1)

  {points, sum} = Enum.map_reduce(placing, 0, &place_points/2)  # (2)

  %Competitor{surname: last, given_name: first, team: team,
    total: sum, points: points}
end
(1) The |> operator takes the output of the first function and uses it as the first argument of the second function. The code is the equivalent of String.split(chomp(row), "\t").
(2) Enum.map_reduce/3 takes a list as its first argument, an “accumulator” as its second argument, and a function for the third argument. Enum.map_reduce/3 passes each item and the accumulator in turn to the function, which returns a tuple giving the converted item and the new value of the accumulator. The final result is a tuple containing the list of converted items and the final accumulator.
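Here is a tiny, self-contained illustration of both ideas, using made-up numbers and strings rather than the tournament data.

```elixir
# Double each number while keeping a running sum of the originals.
{doubled, sum} =
  Enum.map_reduce([1, 2, 3], 0, fn item, acc ->
    {item * 2, acc + item}
  end)

IO.inspect(doubled)  # [2, 4, 6]
IO.inspect(sum)      # 6

# The pipe operator: these two expressions are equivalent.
nested = String.split(String.upcase("a\tb"), "\t")
piped = "a\tb" |> String.upcase() |> String.split("\t")
IO.inspect(nested == piped)  # true
```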

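In my place_points/2 function, I used two function clauses to handle the two kinds of entry in the CSV file: the first clause has a guard that matches an empty entry, and the second handles an integer. State tournament placings are entered in the spreadsheet as negative numbers, which is how the function knows which point scale to apply.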

# An empty entry means the competitor did not place: no points.
def place_points(item, accumulator) when item == "" do
  {0, accumulator}
end

def place_points(item, accumulator) do
  value = String.to_integer(item)
  if value < 0 do
    n = max(1, 6 + value) # state tournament: -1 earns 5, -2 earns 4, ...
    {n, accumulator + n}
  else
    n = max(1, 4 - value) # local tournament: 1 earns 3, 2 earns 2, 3 earns 1
    {n, accumulator + n}
  end
end
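As a worked example, here is a sketch that runs the conversion on Shohei Takamura’s placings from the sample spreadsheet. The module name PointsDemo is made up, and this variant matches the empty string directly in the function head instead of using a guard; the arithmetic is the same as above.

```elixir
defmodule PointsDemo do
  # An empty cell matches the literal "" in the function head.
  def place_points("", accumulator), do: {0, accumulator}

  def place_points(item, accumulator) do
    value = String.to_integer(item)
    # Negative placings mark the state tournament (6 + value),
    # positive ones a local tournament (4 - value); the minimum is 1.
    n = if value < 0, do: max(1, 6 + value), else: max(1, 4 - value)
    {n, accumulator + n}
  end
end

{points, total} =
  Enum.map_reduce(["2", "1", "2", "-2"], 0, &PointsDemo.place_points/2)

IO.inspect(points)  # [2, 3, 2, 4]
IO.inspect(total)   # 11
```

The result agrees with the first row of the sample output table: 2 + 3 + 2 + 4 = 11.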

Creating the HTML

The hard work is done; read_csv/1 gives its caller a tuple containing the headings and a list of Competitor structs that are in the proper order. The following html_output/1 function takes a CSV file name as its argument, passes that file name to read_csv/1, and uses the return value to create the HTML file.

def html_output(input_filename) do
  ends_with_csv = ~r/\.csv\z/
  # (1) the value of the if expression is bound to output_filename
  output_filename =
    if input_filename =~ ends_with_csv do
      Regex.replace(ends_with_csv, input_filename, ".html", [])
    else
      input_filename <> ".html"
    end
  output_file = File.open!(output_filename, [:write, :utf8])

  {headings, data} = read_csv(input_filename)
  # (2) heredoc; (3) and (4) are the two interpolations inside it
  IO.puts output_file, """
  <html>
    <head>
      <title>#{output_filename}</title>
    </head>
    <body>
      <table border="1">
      <thead>
        #{make_header_row(headings)}
      </thead>
      <tbody>
  """

  emit_html_rows(output_file, data)  # (5)

  IO.puts output_file, """
      </tbody>
      </table>
    </body>
  </html>
  """
  File.close(output_file)
end
(1) If the input file name ends with .csv, replace it with .html; otherwise add .html at the end. The expression input_filename =~ ends_with_csv is shorthand for Regex.match?(ends_with_csv, input_filename). The result of the whole if expression is bound to output_filename, because in current Elixir a variable assigned inside an if block is not visible after the block ends.
(2) This part of the code relies heavily on heredocs to conveniently output multi-line strings. The ending """ must be on a line of its own.
(3) The construction #{variable} interpolates the value of the variable into a string, as in the <title> line.
(4) But wait…there’s more! You can interpolate a function call, and its return value is inserted into the string, as in the <thead> section.
(5) I could have interpolated the entire table body as a huge string. However, because there are often nearly a hundred competitors from fifteen tournaments in a CSV file, I felt it was better to use IO.puts/2 to write it to the output file one line at a time.
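The file name handling and the interpolation can be tried out in isolation. This is a sketch under a few assumptions: the module name FilenameDemo, the function names, and the file name juniors.csv are all made up for the demonstration.

```elixir
defmodule FilenameDemo do
  # Replace a trailing ".csv" with ".html", or append ".html" otherwise.
  # The value of the if expression is the function's return value.
  def html_name(input_filename) do
    ends_with_csv = ~r/\.csv\z/

    if input_filename =~ ends_with_csv do
      Regex.replace(ends_with_csv, input_filename, ".html")
    else
      input_filename <> ".html"
    end
  end

  # A heredoc that interpolates the result of a function call.
  def page_head(input_filename) do
    """
    <head>
      <title>#{html_name(input_filename)}</title>
    </head>
    """
  end
end

IO.inspect(FilenameDemo.html_name("juniors.csv"))  # "juniors.html"
IO.inspect(FilenameDemo.html_name("juniors.txt"))  # "juniors.txt.html"
IO.puts(FilenameDemo.page_head("juniors.csv"))
```

The indentation of the closing """ determines how much leading whitespace is stripped from each line of the heredoc.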

The Table Header

To create the table header, make_header_row/1 gets the list of headings. It uses pattern matching to isolate the first three items (given name, surname, and team) from the tournament names. The function then reconstructs the list of headings, adding "Total" as it goes, and uses Enum.join/2 to add closing and opening table data tags between the items. This string is sandwiched between the tags that open and close the table row, and the <> operator concatenates them all together. Notice the use of defp to make this a private function; there is no reason for any other module to call this function.

defp make_header_row([first, last, team | points]) do
  "<tr><td>" <>
  Enum.join([first, last, team, "Total" | points], "</td><td>") <>
  "</td></tr>"
end
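To see the string this produces, here is a sketch that wraps a public copy of the function in a throwaway module (HeaderDemo is a made-up name, needed only because a defp function can’t be called from outside its module). The headings list is shortened to two tournaments.

```elixir
defmodule HeaderDemo do
  # Public copy of make_header_row/1 so it can be called from outside.
  def make_header_row([first, last, team | points]) do
    "<tr><td>" <>
      Enum.join([first, last, team, "Total" | points], "</td><td>") <>
      "</td></tr>"
  end
end

headings = ["First", "Last", "Team", "Hollister", "State"]
row = HeaderDemo.make_header_row(headings)
IO.puts(row)
# <tr><td>First</td><td>Last</td><td>Team</td><td>Total</td><td>Hollister</td><td>State</td></tr>
```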

Creating the Table Body Rows

The emit_html_rows/2 function takes the output file as its first parameter and the list of competitors as its second parameter.

defp emit_html_rows(_output_file, []) do  # (1)
  :ok
end

defp emit_html_rows(output_file, [person | remainder]) do  # (2)
  %Competitor{given_name: first, surname: last, team: team,
    total: total, points: points} = person
  str = "<tr><td>" <>
    Enum.join([first, last, team], "</td><td>") <>
    "</td><td>" <>
    Enum.join(Enum.map([total | points], &html_cell/1), "</td><td>") <>  # (3)
    "</td></tr>"
  IO.puts(output_file, str)  # (4)
  emit_html_rows(output_file, remainder)
end
(1) If there are no more competitors, “my job here is done.”
(2) Otherwise, separate the first person in the list from the remainder, and pattern match on the struct to bind its fields to individual variables.

By matching on the struct this way, I can refer to first instead of person.given_name, last instead of person.surname, and so forth.

(3) Deep breath here. Enum.map/2 takes as its first argument a list with the total points first and the tournament points after it. It passes each item in turn to html_cell/1, which converts the value to a string. Then, Enum.join/2 joins those strings with closing and opening table cell tags between them.
(4) The finished table row goes to the output file, and emit_html_rows/2 is called again to process the remaining competitors.

Here’s html_cell/1. If someone has zero points (which means they didn’t place in the tournament), the cell becomes a <br /> element. This is necessary to ensure that the cell’s borders are visible in older browsers. Otherwise, the number is converted to a string.

defp html_cell(item) do
  if item == 0 do
    "<br />"
  else
    to_string(item)
  end
end
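To check what a finished row looks like without writing a file, the row can be built as a string first. This is a sketch, not the program’s code: RowDemo and html_row/1 are made-up names, html_cell/1 is written here with two function clauses instead of an if, and the two-item points list for the sample competitor is shortened and invented for the demonstration.

```elixir
defmodule Competitor do
  defstruct surname: "", given_name: "", team: "", total: 0, points: []
end

defmodule RowDemo do
  # Zero points become a <br /> so the empty cell still draws its border.
  def html_cell(0), do: "<br />"
  def html_cell(item), do: to_string(item)

  # Build one table row as a string instead of writing it directly.
  def html_row(%Competitor{given_name: first, surname: last, team: team,
                           total: total, points: points}) do
    cells = [first, last, team | Enum.map([total | points], &html_cell/1)]
    "<tr><td>" <> Enum.join(cells, "</td><td>") <> "</td></tr>"
  end
end

mark = %Competitor{given_name: "Mark", surname: "Arnhelm", team: "Knights",
                   total: 3, points: [3, 0]}
IO.puts(RowDemo.html_row(mark))
# <tr><td>Mark</td><td>Arnhelm</td><td>Knights</td><td>3</td><td>3</td><td><br /></td></tr>
```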

Conclusion

There you have it. I was able to use Elixir to perform a relatively mundane task: open a file, read it, do some calculations with the data, and write the data out in a new format. I learned quite a few interesting features of Elixir along the way, and I hope you did too.

If you’d like to play around with the code (and improve upon it), here’s the entire Elixir file, and here’s some sample data.