FUN with go stdlib – Sorting a slice of structs by multiple field values and really grokking it.

If you watched The Big Bang Theory you know all about Sheldon’s Fun With Flags.

borrowed from here

In this type of posts we’ll be having fun with go’s standard library. Exciting, right?

Hidden in the help docs of go is this little gem:

package main

import (
	"cmp"
	"fmt"
	"slices"
)

func main() {
	type Order struct {
		Product  string
		Customer string
		Price    float64
	}
	orders := []Order{
		{"foo", "alice", 1.00},
		{"bar", "bob", 3.00},
		{"baz", "carol", 4.00},
		{"foo", "alice", 2.00},
		{"bar", "carol", 1.00},
		{"foo", "bob", 4.00},
	}
	// Sort by customer first, product second, and last by higher price
	slices.SortFunc(orders, func(a, b Order) int {
		return cmp.Or(
			cmp.Compare(a.Customer, b.Customer),
			cmp.Compare(a.Product, b.Product),
			cmp.Compare(b.Price, a.Price),
		)
	})
	for _, order := range orders {
		fmt.Printf("%s %s %.2f\n", order.Product, order.Customer, order.Price)
	}

}

Wait, what’s going on inside the SortFunc? Looks kinda confusing.

cmp basics

Let’s start with the building blocks.

cmp.Or() will return the first non zero value, or zero if all values are zero.

	a := 0
	b := 0
	c := 42
	fmt.Println(cmp.Or(a, b, c))
	//42

Ok, that was easy. Let’s move on to cmp.Compare()

It takes 2 arguments, x and y, and returns:

-1 if x is less than y, 0 if x equals y, +1 if x is greater than y.

Let’s look at a code example to quickly understand it:

	fmt.Println(cmp.Compare(1, 100))
	//-1

	fmt.Println(cmp.Compare(1, 1))
	//0

	fmt.Println(cmp.Compare(100, 1))
	//1

slices.SortFunc()

Now let’s start with a quick example on how to use slices.SortFunc():

	type Person struct {
		Age int
	}

	persons := []Person{
		{25},
		{55},
		{10},
	}

	slices.SortFunc(persons, func(a, b Person) int {
		return cmp.Compare(a.Age, b.Age)
	})
	fmt.Println(persons)
	//[{10} {25} {55}]

We start by declaring a Person struct, a slice of Persons. Then we use a custom sort function on the slice.

Looking at the definition of SortFunc we see in the description that:

// SortFunc sorts the slice x in ascending order as determined by the cmp
// function. This sort is not guaranteed to be stable.
// cmp(a, b) should return a negative number when a < b, a positive number when
// a > b and zero when a == b.
//
// SortFunc requires that cmp is a strict weak ordering.
// See https://en.wikipedia.org/wiki/Weak_ordering#Strict_weak_orderings.
func SortFunc[S ~[]E, E any](x S, cmp func(a, b E) int) {
	n := len(x)
	pdqsortCmpFunc(x, 0, n, bits.Len(uint(n)), cmp)
}

Forget about the complex declaration (it uses generics btw), focus on the first 4 lines:

// SortFunc sorts the slice x in ascending order as determined by the cmp
// function. This sort is not guaranteed to be stable.
// cmp(a, b) should return a negative number when a < b, a positive number when
// a > b and zero when a == b.

Well, well, what luck! cmp.Compare() does exactly that! That’s why I used it like so in the code above:

	...
slices.SortFunc(persons, func(a, b Person) int {
		return cmp.Compare(a.Age, b.Age)
	})
...

Recap: sorting function needs an int depending on which element is less than the other, cmp.Compare() does that for us, as arguments to cmp.Compare() we use a person’s age – thus we will end up with a slice sorted in ascending order based on a person’s age.

	//[{10} {25} {55}]

Sort slice of based on 2 fields (for now)

Let’s add another important field to our Person struct: salary. Money makes the world go round…

	type Person struct {
		Age    int
		Salary int
	}

Declare a slice of Persons with various ages and salaries:

	persons := []Person{
		{55, 5000},
		{25, 3000},
		{25, 0},
		{10, 0},
	}

Now we’d like to sort this slice based on two criteria: first order by age, then order by salary (while maintaining the order of age).

We’ll use this little chunk of code to do the sorting:

	slices.SortFunc(persons, func(a, b Person) int {
		return cmp.Or(
			cmp.Compare(a.Age, b.Age),
			cmp.Compare(a.Salary, b.Salary),
		)
	})

And we come to the crux of the matter, the thing that resembles the code in the go docs. What is going on there?

from here:

So the sorting function needs an int returned, based on that it will decide which element is less than another element and do the ordering.

cmp.Or() will return the first non zero value.

It start looking at the first argument, which is in itself a function which returns a value:

cmp.Compare(a.Age, b.Age),

If it gets -1, first element is less than second element. If it gets 1, second element is less than first element. In these 2 cases It doesn’t even care about the second argument because it already got a non-zero value from this first argument.

So basically we’ll have something like:

cmp.Or(1/-1,DONT_CARE)

So this will effectively sort our slice by age, assuming we don’t have identical ages.

But what if we got 2 equal ages?

Then the cmp.Compare()function will return 0.

So we’ll have end up having something like

cmp.Or(DONT_CARE,cmp.Compare(a.Salary, b.Salary))

So forget about sorting the ages (they are equal) and go on on sorting based on the salary.

Full code listing here:

	type Person struct {
		Age    int
		Salary int
	}

	persons := []Person{
		{55, 5000},
		{25, 3000},
		{25, 0},
		{10, 0},
	}

	slices.SortFunc(persons, func(a, b Person) int {
		return cmp.Or(
			cmp.Compare(a.Age, b.Age),
			cmp.Compare(a.Salary, b.Salary),
		)
	})
	fmt.Println(persons)
	// [{10 0} {25 0} {25 3000} {55 5000}]

As you can see, we sort first by age. Then, when ages are equal, we sort based on salary – first the 25 year old which has 0 salary, then the 25 year old which has 3000 salary.

Sort slice based on multiple fields

Let’s dive straight in. We’ll add a weight field (hey, no judgement :P). Then we want to sort first by age, then by salary and finally by weight:

	type Person struct {
		Age    int
		Salary int
		Weight int
	}

	persons := []Person{
		{55, 5000, 80},
		{25, 3000, 40},
		{25, 0, 100},
		{10, 0, 70},
	}

	slices.SortFunc(persons, func(a, b Person) int {
		return cmp.Or(
			cmp.Compare(a.Age, b.Age),
			cmp.Compare(a.Salary, b.Salary),
			cmp.Compare(a.Weight, b.Weight),
		)
	})

	for _, p := range persons {
		fmt.Printf(" Age %d, Salary %d, Weight %d\n", p.Age, p.Salary, p.Weight)
	}
	// Age 10, Salary 0, Weight 70
	// Age 25, Salary 0, Weight 100
	// Age 25, Salary 3000, Weight 40
	// Age 55, Salary 5000, Weight 80

Think on it as:

First sort by age. If ages are equal then:

-> Sort by salary. If salaries are equal then:

–> Sort by weight

And you can do this with multiple fields, using your own fields in the order you see fit.

sort in reverse order

But what if you want to sort a slice of structs based on multiple fields in descending order?

Then you’ll have to just use slices.Reverse(persons) (assuming you sorted the persons slice in ascending order already).

[edit] After I posted this on reddit a redditor pointed out that to sort in reverse order we would simple add a minus sign in front of the return value of the sort function. Thank you “ponylicious”!

Here’s how it would look:

	slices.SortFunc(persons, func(a, b Person) int {
		return -cmp.Or(
			cmp.Compare(a.Age, b.Age),
			cmp.Compare(a.Salary, b.Salary),
			cmp.Compare(a.Weight, b.Weight),
		)
	})

Conclusion

Thank you for reading and please let me know your thoughts in the comments below.

Leave a Reply

Your email address will not be published. Required fields are marked *