Grand Champion Standings: A Short Elixir Program

I’ve been learning Elixir over the past few weeks, and I decided that it was time to write a slightly less-than-trivial program. While Elixir is based on Erlang, this program doesn’t play to Erlang’s strengths (massive scaling, message passing, etc.). Instead, it was more of a voyage of discovery for me, and this article is my way of taking you along on the tour.

The Problem at Hand

I work with an amateur sports association that holds several tournaments throughout the year. At each of these tournaments, competitors can get points towards a “grand champion” award presented at the end of the year.

Here’s how the scoring works: for local tournaments, first place earns three points, second place is worth two points, and third place gets one point. For a state-level tournament, first place gets five points, second place gets four points, third place gets three points, fourth place gets two points, and fifth through eighth place get one point.

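The placing data is stored in a spreadsheet, a sample of which looks like this table. (I have excluded the first line of the file, which gives the age group for the data.) Placings at a state-level tournament are entered as negative numbers, so -1 in the Open State column means first place at state.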

First	Last	Team	Hollister	Santa Cruz	Oak Grove	Open State
Mark	Arnhelm	Knights	1
William	Alvarez	Woodside	1		3	-1
Ross	Carter	Alliance	1	2	2
Shohei	Takamura	Athlete Nation	2	1	2	-2

I already have a Perl program that will read directly from the spreadsheet file and construct an HTML table of the standings. The output has point values rather than placing, and it is sorted in descending order of number of points:

First	Last	Team	Total	Hollister	Santa Cruz	Oak Grove	Open State
Shohei	Takamura	Athlete Nation	11	2	3	2	4
William	Alvarez	Woodside	9	3		1	5
Ross	Carter	Alliance	7	3	2	2
Mark	Arnhelm	Knights	3	3

As part of learning Elixir, I decided to re-implement this program. Instead of trying to parse an OpenDocument file, I exported the spreadsheet as a CSV file with \t (TAB) as the column separator.

Data Design

I store the data for each competitor in an Elixir structure, defined as follows. The points list contains the number of points the competitor gained at each tournament.

defmodule Competitor do
  defstruct surname: "", given_name: "", team: "", total: 0, points: []
end
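As a quick illustration (a sketch, not part of the original program), here is how such a struct can be created and updated. The `StructSketch` module and its function names are made up for this example; the update uses Elixir's map update syntax, which returns a modified copy rather than changing the struct in place.

```elixir
defmodule Competitor do
  defstruct surname: "", given_name: "", team: "", total: 0, points: []
end

# Hypothetical helper module, used only for this illustration.
defmodule StructSketch do
  # Build a competitor, leaving total and points at their defaults (0 and []).
  def new_competitor do
    %Competitor{surname: "Takamura", given_name: "Shohei", team: "Athlete Nation"}
  end

  # Structs are immutable; the update syntax returns a modified copy.
  def with_points(competitor, points) do
    %{competitor | points: points, total: Enum.sum(points)}
  end
end

shohei = StructSketch.with_points(StructSketch.new_competitor(), [2, 3, 2, 4])
IO.puts("#{shohei.given_name} #{shohei.surname}: #{shohei.total}")
# prints "Shohei Takamura: 11"
```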

Here, in broad terms, is the pseudocode for the program.

Open the file and read the header line
Process the file one line at a time
1. Separate the name and team from the placings
2. Calculate the total and convert the placings to points
3. Create a new Competitor record and add it to a list
Use the headings and the competitor list in a function that produces HTML output

Reading the File

The following code opens the CSV file and processes it.

def read_csv(filename) do
  input_file = File.open!(filename, [:read, :utf8]) 

  IO.read(input_file, :line) # ignore age group
  headings = String.split(chomp(IO.read(input_file, :line)), "\t") 

  {headings, process_file(input_file, [])} 
end

If the file doesn’t exist, File.open! raises an exception that gives a meaningful explanation of the error.

When Elixir reads a line, the line includes the ending newline character (\r\n for Windows, \n for Linux and macOS, and \r for classic Mac OS). The chomp/1 function deletes the trailing newline (you could also use String.strip/1, which newer Elixir releases call String.trim/1, to do this), and String.split/2 separates the line into a list of strings.

When you refer to a function in Elixir, you give its name and its number of arguments, also called arity, so chomp/1 refers to a function named chomp that takes one argument, and String.split/2 refers to the split function that resides in the String module and takes two arguments.

The last expression in read_csv/1 is its return value: a tuple with the headings and the result of processing the rest of the input file.
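The name/arity notation described above is also how you capture a function as a value. The following lines are only a small sketch to make the notation concrete; the variable names are arbitrary.

```elixir
# Capture two functions by name and arity.
split = &String.split/2
upcase = &String.upcase/1

IO.inspect(is_function(split, 2))   # true: the capture expects two arguments
IO.inspect(is_function(split, 1))   # false: this capture does not take one argument
IO.inspect(split.("a\tb", "\t"))    # ["a", "b"]
IO.inspect(upcase.("total"))        # "TOTAL"
```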

Here is the chomp/1 function, which uses a regular expression to eliminate any newline character(s) appearing at the end of the line. Note that the end of the string is matched with \z, not $ as in many other regular expression engines. You could just as well use the String.strip/1 function to do this, but this gives me a chance to show regular expressions.

def chomp(str) do
  Regex.replace(~r/\r?\n\z|\r\z/, str, "", [{:global, false}])
end
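Here is a small usage sketch for chomp/1, wrapped in a hypothetical `ChompSketch` module so that it runs on its own. It also hints at one difference from whitespace trimming: if a row ends in empty tab-separated columns (an assumption about the exported data, not something the sample shows), a trim would remove those tabs as well, while chomp/1 only removes the line ending.

```elixir
# Hypothetical wrapper module so the function can be run in isolation.
defmodule ChompSketch do
  def chomp(str) do
    # Remove one trailing \r\n, \n, or \r; leave everything else alone.
    Regex.replace(~r/\r?\n\z|\r\z/, str, "", global: false)
  end
end

IO.inspect(ChompSketch.chomp("Mark\tArnhelm\r\n"))     # "Mark\tArnhelm"
IO.inspect(ChompSketch.chomp("Knights\t1\t\t\n"))      # "Knights\t1\t\t"
IO.inspect(String.trim_trailing("Knights\t1\t\t\n"))   # "Knights\t1"
```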

Processing the Input File

As long as there is data to read, process_file/2 processes the row and prepends the result to the name list. When the end of the file is reached, it returns the name list sorted by point totals.

def process_file(input_file, namelist) do
  row = IO.read(input_file, :line)
  if (row != :eof) do
    process_file(input_file, [process_row(row) | namelist])
  else
    Enum.sort(namelist, &by_points/2)
  end
end
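The recursion above is how the program reads the file. As an alternative sketch (not what the program does), the same lines could be read with File.stream!/1, which yields one line at a time. The `StreamSketch` module, the `read_rows/1` name, and the temporary file are all invented for this illustration, and the code assumes Unix line endings.

```elixir
# Hypothetical module showing a stream-based way to read the rows.
defmodule StreamSketch do
  # Returns {headings, rows}; each row is a list of tab-separated fields.
  def read_rows(path) do
    [headings | rows] =
      path
      |> File.stream!()
      |> Stream.drop(1)                                # skip the age-group line
      |> Stream.map(&String.trim_trailing(&1, "\n"))   # each line keeps its newline
      |> Enum.map(&String.split(&1, "\t"))

    {headings, rows}
  end
end

# Write a tiny sample file so the sketch can be run as-is.
path = Path.join(System.tmp_dir!(), "standings_sketch.tsv")
File.write!(path, "Age 10-12\nFirst\tLast\tTeam\tHollister\nMark\tArnhelm\tKnights\t1\n")
IO.inspect(StreamSketch.read_rows(path))
```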

The most interesting part here is Enum.sort/2, which takes a list as its first argument and a function (the “sorting function”) as its second argument. In Elixir, functions are on an equal footing with strings, integers, and other types of data. You can assign a function to a variable, you can pass it as an argument (as in this code), and you can even have a function that returns another function as its value. Treating functions as “first class citizens” is a very powerful feature, and once you understand how to take advantage of it, it can make your code clearer and more flexible.

Every time Enum.sort/2 needs to compare two items, it will pass those items to the sorting function. The sorting function returns true if the first item belongs before the second item, false otherwise.
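To make the idea of passing functions around concrete, here is a short sketch with made-up data. The first part assigns a sorting function to a variable; the second part shows a function that returns another function.

```elixir
# A sorting function stored in a variable: true when a belongs before b.
descending = fn a, b -> a >= b end
IO.inspect(Enum.sort([7, 11, 3, 9], descending))
# [11, 9, 7, 3]

# A function that returns another function: a comparison on a chosen map key.
compare_by = fn key ->
  fn a, b -> Map.fetch!(a, key) <= Map.fetch!(b, key) end
end

by_team = compare_by.(:team)
teams = [%{team: "Woodside"}, %{team: "Alliance"}, %{team: "Knights"}]
IO.inspect(Enum.sort(teams, by_team))
# [%{team: "Alliance"}, %{team: "Knights"}, %{team: "Woodside"}]
```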

Here is the by_points/2 function. It first compares the total points; if they are equal, then it orders by surname. If those are the same, it orders by given name, and if those are the same, it uses the team name to break the tie.

def by_points(a, b) do
  if a.total == b.total do
    if a.surname == b.surname do
      if a.given_name == b.given_name do
        a.team < b.team
      else
        a.given_name < b.given_name
      end
    else
      a.surname < b.surname
    end
  else
    a.total > b.total
  end
end
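The nested if expressions spell out each tie-breaker. One possible alternative, offered only as a sketch, is Enum.sort_by/2 with a tuple key: tuples of the same size compare element by element, and negating the total puts higher totals first. The `SortSketch` module is hypothetical, and plain maps stand in for Competitor structs to keep the example self-contained.

```elixir
# Hypothetical module: the same ordering expressed as a sort key.
defmodule SortSketch do
  def by_standing(competitors) do
    # Higher totals first, then surname, given name, and team in ascending order.
    Enum.sort_by(competitors, fn c -> {-c.total, c.surname, c.given_name, c.team} end)
  end
end

people = [
  %{given_name: "Mark", surname: "Arnhelm", team: "Knights", total: 3},
  %{given_name: "Shohei", surname: "Takamura", team: "Athlete Nation", total: 11},
  %{given_name: "William", surname: "Alvarez", team: "Woodside", total: 9}
]

IO.inspect(Enum.map(SortSketch.by_standing(people), & &1.surname))
# ["Takamura", "Alvarez", "Arnhelm"]
```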

Processing a Row

A row is processed by splitting it on \t (TAB). The person’s name and team are separated from the placings. A call to Enum.map_reduce/3 can convert the placings (first, second, third) to the appropriate number of points and get the total points all in one shot.

def process_row(row) do
  [first, last, team | placing] = chomp(row) |> String.split("\t") 

  {points, sum} = Enum.map_reduce(placing, 0, &place_points/2) 

  %Competitor{surname: last, given_name: first, team: team,
    total: sum, points: points}
end

	The `|>` operator takes the output of the first function and uses it as the first argument of the second function. The code is the equivalent of `String.split(chomp(row), "\t")`.
	`Enum.map_reduce/3` takes a list as its first argument, an “accumulator” as its second argument, and a function as its third argument. `Enum.map_reduce/3` passes each item and the accumulator in turn to the function, which returns a tuple giving the converted item and the new value of the accumulator.
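Here is a tiny standalone sketch of both ideas, using made-up numbers rather than the program's scoring: the pipeline feeds each result into the next call, and the anonymous function returns a tuple of the converted item and the updated accumulator.

```elixir
# Split a tab-separated string, convert to integers, then map and sum in one pass.
fields = "2\t1\t2" |> String.split("\t")

{doubled, total} =
  fields
  |> Enum.map(&String.to_integer/1)
  |> Enum.map_reduce(0, fn n, acc -> {n * 2, acc + n} end)

IO.inspect(doubled)  # [4, 2, 4]
IO.inspect(total)    # 5
```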

In my place_points/2 function, I used two function clauses: the first has a guard that handles an empty entry in the CSV file, and the second handles an integer.

def place_points(item, accumulator) when item == "" do
   {0, accumulator}
end

def place_points(item, accumulator) do
  value = String.to_integer(item)
  if value < 0 do
    n = max(1, 6 + value) # state tournament
    {n, accumulator + n}
  else
    n = max(1, 4 - value) # local tournament
    {n, accumulator + n}
  end
end
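To check the arithmetic against the sample data, here is a condensed restatement of the same rules in a hypothetical `PointsSketch` module. It matches the empty string directly in the function head instead of using a guard, which is a stylistic variation and not the code from the program.

```elixir
# Hypothetical module restating the scoring rules for a worked example.
defmodule PointsSketch do
  # An empty entry earns no points and leaves the running total unchanged.
  def place_points("", accumulator), do: {0, accumulator}

  def place_points(item, accumulator) do
    value = String.to_integer(item)
    # Negative placings are state tournaments; positive ones are local.
    n = if value < 0, do: max(1, 6 + value), else: max(1, 4 - value)
    {n, accumulator + n}
  end

  def score(placings), do: Enum.map_reduce(placings, 0, &place_points/2)
end

IO.inspect(PointsSketch.score(["2", "1", "2", "-2"]))  # {[2, 3, 2, 4], 11}
IO.inspect(PointsSketch.score(["1", "", "3", "-1"]))   # {[3, 0, 1, 5], 9}
```

The two results match the rows for Shohei Takamura and William Alvarez in the output table shown earlier.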

Creating the HTML

The hard work is done; read_csv/1 gives its caller the headings and a list of Competitor structs that are in the proper order. The following html_output/1 function takes a CSV file name as its argument, passes that file name to read_csv/1, and uses the return value to create the HTML file.

def html_output(input_filename) do

  ends_with_csv = ~r/\.csv\z/
  # Bind the result of the if expression; variables assigned inside
  # its branches are not visible after the end.
  output_filename =
    if input_filename =~ ends_with_csv do
      Regex.replace(ends_with_csv, input_filename, ".html")
    else
      input_filename <> ".html"
    end
  output_file = File.open!(output_filename, [:write, :utf8])

  {headings, data} = read_csv(input_filename)
  IO.puts output_file, """ 
  <html>
    <head>
      <title>#{output_filename}</title> 
    </head>
    <body>
      <table border="1">
      <thead>
        #{make_header_row(headings)} 
      </thead>
      <tbody>
  """

  emit_html_rows(output_file, data) 

  IO.puts output_file, """
      </tbody>
      </table>
    </body>
  </html>
  """
  File.close(output_file)
end

	If the input file name ends with `.csv`, replace it with `.html`; otherwise add `.html` at the end. The expression `input_filename =~ ends_with_csv` is shorthand for `Regex.match?(ends_with_csv, input_filename)`.
	This part of the code relies heavily on `heredocs` to conveniently output multi-line strings. The ending `"""` must be on a line of its own.
	The construction `#{variable}` interpolates the value of the variable into a string.
	But wait…there’s more! You can interpolate a function call, and its return value is inserted into the string.
	I could have interpolated the entire table body as a huge string. However, because there are often nearly a hundred competitors from fifteen tournaments in a CSV file, I felt it was better to use `IO.puts/2` to write it to the output file one line at a time.
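The notes above can be tried out in isolation with the following sketch. The `HeredocSketch` module and its function names are invented for illustration; the first function derives an output file name with `=~`, and the second interpolates a function call inside a heredoc.

```elixir
# Hypothetical module for trying out =~, heredocs, and interpolation.
defmodule HeredocSketch do
  defp ends_with_csv, do: ~r/\.csv\z/

  # Swap a trailing .csv for .html, or append .html when there is none.
  def output_name(input_filename) do
    if input_filename =~ ends_with_csv() do
      Regex.replace(ends_with_csv(), input_filename, ".html")
    else
      input_filename <> ".html"
    end
  end

  # The indentation of the closing """ is stripped from every line.
  def page_head(title) do
    """
    <head>
      <title>#{String.upcase(title)}</title>
    </head>
    """
  end
end

IO.puts(HeredocSketch.output_name("juniors.csv"))  # juniors.html
IO.puts(HeredocSketch.output_name("juniors.txt"))  # juniors.txt.html
IO.puts(HeredocSketch.page_head("standings"))
```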

The Table Header

To create the table header, make_header_row/1 gets the list of headings. It uses pattern matching to isolate the first three items (given name, surname, and team) from the tournament names. The function then reconstructs the list of headings, adding "Total" as it goes, and uses Enum.join/2 to add closing and opening table data tags between the items. This string is sandwiched between the tags that open and close the table row, and the <> operator concatenates them all together. Notice the use of defp to make this a private function; there is no reason for any other module to call this function.

defp make_header_row([first, last, team | points]) do
  "<tr><td>" <>
  Enum.join([first, last, team, "Total" | points], "</td><td>") <>
  "</td></tr>"
end
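To see what the function produces, here is the same logic in a hypothetical `HeaderSketch` module, made public so that it can be called directly, with a shortened list of headings.

```elixir
# Hypothetical public copy of the header function for experimentation.
defmodule HeaderSketch do
  def make_header_row([first, last, team | tournaments]) do
    "<tr><td>" <>
      Enum.join([first, last, team, "Total" | tournaments], "</td><td>") <>
      "</td></tr>"
  end
end

IO.puts(HeaderSketch.make_header_row(["First", "Last", "Team", "Hollister", "Open State"]))
# <tr><td>First</td><td>Last</td><td>Team</td><td>Total</td><td>Hollister</td><td>Open State</td></tr>
```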

Creating the Table Body Rows

The emit_html_rows/2 function takes the output file as its first parameter and the list of competitors as its second parameter.

defp emit_html_rows(_output_file, []) do 
  :ok
end

defp emit_html_rows(output_file, [person | remainder]) do 
  %Competitor{given_name: first, surname: last, team: team,
    total: total, points: points} = person
  str = "<tr><td>" <>
    Enum.join([first, last, team], "</td><td>") <>
    "</td><td>" <>
    Enum.join(Enum.map([total | points], &html_cell(&1)), "</td><td>")  <>
    "</td></tr>"
  IO.puts(output_file, str) 
  emit_html_rows(output_file, remainder)
end

	If there are no more competitors, “my job here is done.”
	Otherwise, separate the first person in the list from the remainder, and use pattern matching on the struct to assign its components to individual variables. By matching this way, I can refer to `first` instead of `person.given_name`, `last` instead of `person.surname`, and so forth.
	Deep breath here. `Enum.map/2` takes as its first argument a list with the total points first and the tournament points after it. It passes each item in turn to `html_cell/1`, which converts the value to a string. Then, `Enum.join/2` joins those strings with closing and opening table cell tags, so that each one ends up in its own cell.
	The finished table row goes to the output file, and `emit_html_rows/2` is called again to process the remaining competitors.

Here’s html_cell/1. If someone has zero points (which means they didn’t place in the tournament), the cell becomes a <br /> element. This is necessary to ensure that the cell’s borders are visible in older browsers. Otherwise, the number is converted to a string.

defp html_cell(item) do
  if (item == 0) do
    "<br />"
  else
    to_string(item)
  end
end
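Putting the cell and row logic together, here is a sketch that builds the row for one competitor from the sample data. The `RowSketch` module is hypothetical, a plain map stands in for the Competitor struct, and html_cell/1 is written with two function clauses instead of an if, which is only a stylistic variation.

```elixir
# Hypothetical module that renders a single table row as a string.
defmodule RowSketch do
  # Zero points becomes a <br /> so that the empty cell still shows its borders.
  def html_cell(0), do: "<br />"
  def html_cell(item), do: to_string(item)

  def row_html(person) do
    cells =
      [person.given_name, person.surname, person.team] ++
        Enum.map([person.total | person.points], &html_cell/1)

    "<tr><td>" <> Enum.join(cells, "</td><td>") <> "</td></tr>"
  end
end

mark = %{given_name: "Mark", surname: "Arnhelm", team: "Knights", total: 3, points: [3, 0, 0, 0]}
IO.puts(RowSketch.row_html(mark))
# <tr><td>Mark</td><td>Arnhelm</td><td>Knights</td><td>3</td><td>3</td><td><br /></td><td><br /></td><td><br /></td></tr>
```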

Conclusion

There you have it. I was able to use Elixir to perform a relatively mundane task: open a file, read it, do some calculations with the data, and write the data out in a new format. I learned quite a few interesting features of Elixir along the way, and I hope you did too.

If you’d like to play around with the code (and improve upon it), here’s the entire Elixir file, and here’s some sample data.