A human-driven sort algorithm (MonkeySort)
June 19, 2012 , revised September 5, 2022 in ProjectsSummary: I made a utility that sorts a provided list of items, asking the user to pick the best item out of two.
Every once in a while you have to arrange certain items into a prioritized list: maybe choose projects you’re going to focus on, or movies to watch, or places to visit, et cetera. Could be you don’t need the sorted list itself but the top (three, five) items from the list; this is essentially the same task.
The problem is, while you can easily pick the best option out of two (or conclude that they’re equally good), selecting from numerous options induces decision paralysis.
So, I thought, practically all of computer-based sorting algorithms are based on two-option comparisons. Why not make a utility that uses a well-known sorting algorithm, driven by real human decisions?
Demo
There is a live demo available on https://leonid-shevtsov.github.io/monkeysort/.
Implementation
The algorithm could be divided into two parts: the actual sorting (which is a modified quicksort), and storing and updating the comparison matrix (which is the fun part).
I picked quicksort as the algorithm, for its (usually) low number of iterations, and thus comparisons. It had to be reworked for the case that the result of a comparison is not known yet (and has to be decided by the user).
Asynchronous, “interruptible” quick sort
At first, I decided to make an asynchronous version of quick sort, breaking it up where there’s a need for a comparison. I even got the algorithm coded (which deserves an article of its own), but it turned out to be an unnecessarily complex solution because I found an elementary brute force one.
Brute force, “restartable” quick sort
If you think about it, quick sort runs near-instantaneously for the number of items that you’d sort manually. So, I threw out the asynchronous version and made the simplest quick sort implementation, which bails out as soon as it hits an unknown comparison. After the user makes the decision, the sort restarts from the beginning, and so forth until the entire set is sorted. The actual sorting produces no noticeable delay.
Storing comparisons
It’s convenient to store comparison results in a matrix (it’s not going to be big because the item set is not too big). Element C[i,j]
of that matrix stores the comparison result between A[i]
and A[j]
, and can be “better”, “worse”, “equal” or “unknown”.
Initially, the matrix is initialized with “unknown” values, except for the main diagonal, which is initialized with “equal”.
Processing a comparison result
So, the user made a choice and said that A[i]
is “better” than A[j]
. How do you store that?
-
Obviously, set
Cij
to “better”. -
Also set
Cji
to the opposite (“worse”).This is crucial; not only because asking the user about
A[j]
vsA[i]
is excessive, but because it may lead to a logical contradiction. After each explicit comparison result is known, we must also store every implicitly known result. -
Since comparisons are transitive, we actually get more implied data from the result; for each item
A[k]
:-
(Cij ∈ { "<", ">"}) ⋀ (Cjk ∈ {Cij, "="}) ⇒ Cik = Cij
-
((Cij = "=") ⋀ (Cjk is known)) ⇒ Cik = Cjk
OK, in theory the comparison operator is not necessarily transitive. But we assume that the set of items is totally ordered because otherwise sorting does not make practical sense.
Note that all the implied results can lead to transitive implications of their own, so we recursively process them, too.
-
-
To avoid going into an infinite loop, we avoid processing elements that already have a value.
Post mortem
After coding the whole thing, I found out that there’s another way of approaching the problem. To sort the set, we need to fill in the comparison table. It may well be that using quicksort to select comparisons won’t produce the minimal number of explicit comparisons required. Maybe you can make a better algorithm based on filling in the matrix, and then run a regular sort.
Question: what is the minimal number of explicit comparisons required to fill in such a comparison table?
Ideas: dynamic programming? Graph algorithms?
Liked the post? Treat me to a coffee