Go compiler optimizations for structs

This article looks at how the Go compiler optimizes different uses of structs. It answers the following questions:

Does it avoid unnecessary structure copies?
Does it inline where necessary?

To answer these questions, I wrote a small example and decompiled the resulting binary to see how well the compiler does.
For all the examples, I used version 1.15 of the Go compiler and a main package whose main function calls all the other functions. I always tried both pointer and value operations.

Small data structure

First, I defined a small data structure:

type A struct{ contained int }

Then, I added some receivers:

func (a *A) mutate() { a.contained += 123 }
func (a A) doNotMutate() { a.contained += 123 }

With the pointer receiver, the decompilation shows a simple load, add and store.
With the value receiver, on the other hand, the copied structure is dropped on return, so the whole method can be a NOP, and that is indeed what the compiler generates.
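The reason the value receiver can legally compile to nothing is that its effect is never observable by the caller. The following minimal sketch illustrates that at the language level; the helper names afterMutate and afterDoNotMutate are hypothetical and only exist for this illustration.

```go
package main

import "fmt"

type A struct{ contained int }

// Pointer receiver: operates on the caller's A.
func (a *A) mutate() { a.contained += 123 }

// Value receiver: operates on a copy that is dropped on return.
func (a A) doNotMutate() { a.contained += 123 }

// afterMutate (hypothetical helper) returns the field after a pointer-receiver call.
func afterMutate(start int) int {
	a := A{contained: start}
	a.mutate()
	return a.contained
}

// afterDoNotMutate (hypothetical helper) returns the field after a value-receiver call.
func afterDoNotMutate(start int) int {
	a := A{contained: start}
	a.doNotMutate()
	return a.contained
}

func main() {
	fmt.Println(afterMutate(1))      // 124: the caller's struct was updated
	fmt.Println(afterDoNotMutate(1)) // 1: only the copy was updated
}
```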
Let’s see what happens when a function calls doNotMutate:

func doNotMutateAWithValue(a A) { a.doNotMutate() }
func doNotMutateAWithPtr(a *A) { a.doNotMutate() }

Here, the first one is also a NOP, as expected.
The second one only checks for a nil pointer, which is also expected. At call sites, the call is dropped entirely, perfect!
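The nil check cannot be removed from doNotMutateAWithPtr because calling a value-receiver method through a pointer is shorthand for (*a).doNotMutate(), and dereferencing a nil pointer must panic. A small sketch of that behaviour follows; the helper panics is hypothetical and only wraps the call in a recover.

```go
package main

import "fmt"

type A struct{ contained int }

func (a A) doNotMutate() { a.contained += 123 }

// a.doNotMutate() is shorthand for (*a).doNotMutate(), so a is dereferenced.
func doNotMutateAWithPtr(a *A) { a.doNotMutate() }

// panics (hypothetical helper) reports whether doNotMutateAWithPtr(p) panics.
func panics(p *A) (panicked bool) {
	defer func() {
		if r := recover(); r != nil {
			panicked = true
		}
	}()
	doNotMutateAWithPtr(p)
	return false
}

func main() {
	fmt.Println(panics(&A{})) // false
	fmt.Println(panics(nil))  // true: nil pointer dereference
}
```

So even though the computation itself is useless, the check is part of the observable behaviour and has to stay.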
Then I tried calling mutate from a function:

func mutateAWithValue(a A) { a.mutate() }
func mutateAWithPtr(a *A) { a.mutate() }

The first one mutates a value that is then discarded, yet the generated code still performs the unneeded computation, too bad. The call to mutate is inlined, though.
The second one is inlined and acts as expected.
Nicely enough, mutateAWithValue is not called at all at call sites, which is a relief.

So let’s recap: for a small struct, the code generated for the methods might not be as efficient as expected, but at call sites, it’s optimized away.
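For readers who want to reproduce the small-struct experiment, here is a sketch of what the harness could look like. The original main function is not shown, so the layout and the run helper are assumptions. One possible way to inspect the result, which may differ from what was used for this article, is go build -gcflags=-m to print inlining decisions and go tool objdump to disassemble the binary.

```go
package main

import "fmt"

type A struct{ contained int }

func (a *A) mutate()     { a.contained += 123 }
func (a A) doNotMutate() { a.contained += 123 }

func doNotMutateAWithValue(a A) { a.doNotMutate() }
func doNotMutateAWithPtr(a *A)  { a.doNotMutate() }
func mutateAWithValue(a A)      { a.mutate() }
func mutateAWithPtr(a *A)       { a.mutate() }

// run (hypothetical helper) calls every method and function once
// and returns the final value of the field.
func run() int {
	a := A{contained: 1}
	a.mutate()               // 1 + 123 = 124
	a.doNotMutate()          // works on a copy, a is unchanged
	doNotMutateAWithValue(a) // works on a copy, a is unchanged
	doNotMutateAWithPtr(&a)  // copies *a into the receiver, a is unchanged
	mutateAWithValue(a)      // mutates the callee's copy only
	mutateAWithPtr(&a)       // 124 + 123 = 247
	return a.contained
}

func main() {
	fmt.Println(run()) // 247
}
```

Printing the result keeps the struct live, so the two real mutations cannot be discarded along with the useless ones.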

Big data structure

Then I tried a bigger struct to see whether the generated code is as efficient.

type B struct{ a, b, c, d, e, f, g, h, i, j uint64 }

func (b *B) mutate() { b.i += 123 }
func (b B) doNotMutate() { b.i += 123 }

The first method compiles to simple and correct code, as for A.mutate.
But the second method performs the computation on a discarded value. Luckily, this is optimized away when calling B.doNotMutate from a function.

Calling the receivers from a function that takes the structure as an argument shows some missing optimizations:

func doNotMutateBWithValue(b B) { b.doNotMutate() }
func doNotMutateBWithPtr(b *B) { b.doNotMutate() }

And here it all falls apart: in both functions, the generated code copies the whole structure before executing the useless computation and discarding the value.
The call sites do not save the day either: the functions taking a value and a pointer argument both copy the structure before discarding it.
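To give an idea of how much data such a copy moves, the sizes of the two structs can be printed with unsafe.Sizeof. This is only an illustrative sketch; the helpers sizeOfA and sizeOfB are hypothetical names.

```go
package main

import (
	"fmt"
	"unsafe"
)

type A struct{ contained int }
type B struct{ a, b, c, d, e, f, g, h, i, j uint64 }

// sizeOfA and sizeOfB (hypothetical helpers) return the in-memory size in bytes.
func sizeOfA() uintptr { return unsafe.Sizeof(A{}) }
func sizeOfB() uintptr { return unsafe.Sizeof(B{}) }

func main() {
	fmt.Println(sizeOfA()) // one machine word, 8 bytes on 64-bit platforms
	fmt.Println(sizeOfB()) // 80: ten uint64 fields of 8 bytes each
}
```

A fits in a single register on a 64-bit machine, while every copy of B moves 80 bytes.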

func mutateBWithValue(b B) { b.mutate() }
func mutateBWithPtr(b *B) { b.mutate() }

As with the A version, the compiler emits the computation even when it is not needed, although the call is inlined. For the call to B.mutate in mutateBWithValue, it is much worse: the generated code copies the structure before dropping it.
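Whatever the compiler emits, the copy is invisible to the program itself: only the pointer variant can change the caller's struct. A minimal sketch, with hypothetical helpers viaValue and viaPtr:

```go
package main

import "fmt"

type B struct{ a, b, c, d, e, f, g, h, i, j uint64 }

func (b *B) mutate() { b.i += 123 }

func mutateBWithValue(b B) { b.mutate() }
func mutateBWithPtr(b *B)  { b.mutate() }

// viaValue (hypothetical helper) passes B by value; the callee mutates its own copy.
func viaValue() uint64 {
	var b B
	mutateBWithValue(b)
	return b.i
}

// viaPtr (hypothetical helper) passes a pointer; the callee mutates the caller's B.
func viaPtr() uint64 {
	var b B
	mutateBWithPtr(&b)
	return b.i
}

func main() {
	fmt.Println(viaValue()) // 0
	fmt.Println(viaPtr())   // 123
}
```

Since the work done in mutateBWithValue has no observable effect, both the copy and the addition are candidates for elimination, which is what makes the generated code disappointing.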

Summary

After all these experiments, I can say that the Go 1.15 compiler optimizes struct receivers differently depending on the size of the structure:

small structs are well handled, inlined and elided as efficiently as possible
bigger structs perform badly, as they are copied when passing through a function
inlining is done pretty intensely, as no call is created for any of the functions or methods
uses of value & pointer receivers are correctly optimized