Sunday 26 October 2014

Python tuple sort idiom in Ruby

Was looking at some Python code recently, which uses tuples to sort, and wondered what the Ruby equivalent code for the same would be.

1. The first snippet
distances = []
distances.append((distance, user))
distances.sort()
return distances
The four statements are quite straight forward. distances is a list. To it, tuples of (distance, user) elements are added. To refresh our understanding of tuples, Learning Python defines them as "Tuples construct simple groups of objects. They work exactly like lists, except that tuples can't be changed in place (they're immutable) and are usually written as a series of items in parantheses, not square brackets."

That means, distances list is a list of tuples. And then the list is sorted. Python sort() is an in-place sort, which rearranges the elements of the list in a sorted order. Here, calling sort on distances re-arranges the tuples in a sorted order. On a list of tuples, what Python's sort does is, it sorts by first element, but in the case of a tie, it sorts by second element, and so on.

So, what is the Ruby code to do the same? One way, and I think a good one at that, is to use Ruby Structs. So, I defined a variable called Tuple of the type Struct and defined the <=> operator. Of course, to arrive at this I browsed through a lot of stack overflow questions.

I put the code in a file called mh_util.rb, and you would have figured out with ease that mh is my name, but that is beside the point, the main thing is that the first line in the file has the keyword module, so I can re-use the file with kewyword require, wherever I want to have the equivalent of a tuple sort. The next main thing is the definition of the comparison operator (<=>) which states how to compare which is, if the first elements of two elements being compared are same, use the second element for compariosn. Finally, the module has a to_s method to help in easy printing of the Tuple.

Here's the Tuple structure My Ruby code that calls Tuple to achieve the above Python functionality is:
distances = []
distances << MH_util::Tuple.new(distance, user)
return distances.sort
To make the usage of my user-defined type Tuple more clearer, here's another Python example (from pythonlearn.com) that takes a list of words and sorts them from longest to shortest. It creates two-element tuples where word length is the first element and the word as the second element.
Here's the Ruby equivalent I came up with using the concept of Structs for tuples when sort is required. 2. The second Python snippet
recommendations = []
recommendations.append((artist, rating)
return sorted(recommendations, key=lambda artistTuple: artistTuple[1], reverse = True)
The first two statements are the same as in the first example. Create a list called recommendations, and add two-element tuples to it. The two elements are artist and rating. artist is string, and rating is numeric.

The sorted function returns a new sorted listed from the first argument. The complete documentation of sorted from docs.python.org:

sorted(iterable[, cmp[, key[, reverse]]])
    Return a new sorted list from the items in iterable.
    The optional arguments cmp, key, and reverse have the same meaning as those for the list.sort() method (described in section Mutable Sequence Types).
    cmp specifies a custom comparison function of two arguments (iterable elements) which should return a negative, zero or positive number depending on whether the first argument is considered smaller than, equal to, or larger than the second argument: cmp=lambda x,y: cmp(x.lower(), y.lower()). The default value is None.
    key specifies a function of one argument that is used to extract a comparison key from each list element: key=str.lower. The default value is None (compare the elements directly).
    reverse is a boolean value. If set to True, then the list elements are sorted as if each comparison were reversed.
    In general, the key and reverse conversion processes are much faster than specifying an equivalent cmp function. This is because cmp is called multiple times for each list element while key and reverse touch each element only once. Use functools.cmp_to_key() to convert an old-style cmp function to a key function.

Coming back to the second Python snippet, basically, the third statement is sorting the tuples on the second element i.e., rating but in the reverse order. So you will have the recommendations list from the highest to the lowest rating.

The equivalent Ruby code is very cool, see the third statement:
recommendations = []
recommendations << MH_util::Tuple.new(artist, neighborRatings[artist])
return recommendations.sort_by {|element| -element.second}
Using the minus sign, to reverse the sorting order is very neat.

1 comment: