Having looked at the basic semantics of Go in the previous article, I’m continuing my exploration by looking at Go’s facilities for object orientation. This looks at structures and embedding in more detail, as well as methods and interfaces.
This is the 2nd of the 6 articles that currently make up the “All Go” series.
In the previous article in this series I started to look at Go, beginning with the basic types, flow control structures and functions. This time I’m going to drill deeper into some of Go’s object orientation features.
Go doesn’t offer traditional class-based object orientation in the way of C++ or Python, but it’s more similar to Rust where regular structures can have methods added to them, and we’ll look at how that works first. Secondly, we’ll go into the embedded fields in structures that I mentioned briefly in the previous article — we’ll look at why the feature exists and how it can be used. Thirdly, we’ll look at Go’s notion of interfaces, which are somewhere between Rust’s traits and Java’s interfaces.
If you’re familiar with how methods work in Rust then Go’s approach isn’t all that different. The main difference is that methods aren’t wrapped in something like an impl block but are instead individually annotated with a receiver, which is the type on which the method is being defined. This is effectively an additional argument to the method, just like self in Python or Rust, but syntactically it’s specified a little differently as we’ll see below.
The receiver doesn’t have to be a struct: you can add methods to any type you define except interfaces and pointer types (the receiver itself may be T or *T, but never a pointer to a pointer). One reason why you might add a method to a scalar type is to give it a String() method so that it’s printable, for example. However, since the focus of this article is Go’s object orientation features I’m going to focus on structures for now.
Also, it’s worth qualifying that you can only add methods to types declared in the same package as the method — you can’t go tagging on methods to random types elsewhere in the code.
Let’s take a look at a code example, and then I’ll run through what’s going on afterwards.
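The listing below is a reconstruction based on the walkthrough in this section, so treat it as a sketch rather than the exact original. The type and method names come from the text; the field layout of point, the parameters of newRectangle() and the contents of main() are assumptions. It is laid out so that the declarations fall on the line numbers quoted in the discussion.

```go
package main

import "fmt"

// point is a simple 2D coordinate.
type point struct {
	x, y float64
}

// translate moves the point in place, which is why the receiver
// is a pointer rather than a value.
func (p *point) translate(x, y float64) {
	p.x += x
	p.y += y
}

type rectangle struct {
	bottomLeft, topRight point
}

func newRectangle(x, y, width, height float64) *rectangle {
	// Returning a pointer to a value created here is fine in
	// Go: the garbage collector keeps it alive while it is
	// still referenced.
	return &rectangle{
		bottomLeft: point{x, y},
		topRight:   point{x + width, y + height},
	}
}

// area takes its receiver by value, since it only reads
// the fields.
func (r rectangle) area() float64 {
	width := r.topRight.x - r.bottomLeft.x
	return width * (r.topRight.y - r.bottomLeft.y)
}

func (r *rectangle) translate(x, y float64) {
	r.bottomLeft.translate(x, y)
	r.topRight.translate(x, y)
}

func main() {
	r := newRectangle(0, 0, 4, 3)
	fmt.Printf("Area: %.2f\n", r.area())
	r.translate(10, 20)
	fmt.Printf("Bottom left: %v\n", r.bottomLeft)
	fmt.Printf("Top right: %v\n", r.topRight)
}
```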
First off we declare a point structure, and then on line 12 we add a translate() method to this structure. This method needs to modify the structure, so the receiver is specified as a pointer to point: if the receiver is passed by value, the method operates on a copy of it and hence cannot modify the original.
One aspect that’s important to note here is that the dot operator is used with both pointers and values — the language will generally perform dereferencing for you automatically as required.
Then we declare a second type rectangle and on line 21 a function newRectangle() to construct it. Go doesn’t have constructors or initialisers with fixed names like C++ and Python do, but it seems to be a general idiom to name the function newFoo() if it creates a new Foo object, or more likely NewFoo() if the function is to be exported outside the current package (more of that in a future article).
It’s worth taking a moment to look at what this function is doing and, especially if you’re coming from C/C++, wonder why it works. The surprising thing here is that we’re constructing something within the function and returning a pointer to it — this sort of thing is generally a quick route to crashes in other C-like languages. Because Go uses a garbage collector to free values, however, returning a pointer to local values from a function is perfectly acceptable.
As it happens we could have also used the builtin new() function here, which allocates a chunk of zeroed memory of sufficient size for a specified type. You can then initialise that structure if the zeroed form isn’t sufficient. However, the use of new() in this way seems less idiomatic in the community, so probably should be avoided. I believe there are some edge cases where it’s preferred, such as returning pointers to scalar values, but in general its inclusion in the language predates the x = &myStruct{...} approach and at this point cannot be removed without breaking backwards compatibility. Still, it’s probably important to know about it in case you come across other code which uses it.
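To make the comparison concrete, here’s a small sketch that builds the same value both ways, plus the scalar case where new() is still convenient. The point type here is just a stand-in for this illustration.

```go
package main

import "fmt"

// point is a stand-in structure for this sketch.
type point struct {
	x, y float64
}

func main() {
	// new() returns a pointer to a zeroed point, initialised afterwards.
	p1 := new(point)
	p1.x, p1.y = 1.5, 2.5

	// The composite literal form allocates and initialises in one step.
	p2 := &point{x: 1.5, y: 2.5}

	// new() is still handy for a pointer to a scalar, which has no
	// composite literal form.
	count := new(int)
	*count = 42

	fmt.Println(*p1 == *p2, *count)
}
```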
Moving on we can see that rectangle has two methods defined against it. The first of these, area() on line 33, is defined as taking its receiver by value. This works, because it doesn’t need to modify any of the fields of the object. The second method, translate() on line 38, takes the receiver by pointer since it does need to modify it.
It would be a mistake to assume that this is purely a choice of mutable vs. immutable methods, however, where pass by value is read-only and pass by pointer is read-write. There’s nothing to stop a pass-by-value version modifying its receiver, it’s just that the receiver is a copy of the original, so any changes you make won’t be reflected in the instance passed in.
To illustrate these semantics, imagine you wanted a new method translateCopy() which left the original unchanged, but returned a pointer to a new instance which had been translated by a specified amount. One way to do that might be to define this method.
func (r rectangle) translateCopy(x, y float64) *rectangle {
    r.bottomLeft.translate(x, y)
    r.topRight.translate(x, y)
    return &r
}
Now I’m not sure that this is necessarily the best way to implement this function, but it does work. There is a problem with this mixing of methods which take the receiver by pointer and value on the same type, and it’s generally advised that all methods on a given type take either value or pointer and they’re not mixed — we’ll look at the reason for that advice below in the section on Interfaces.
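To check that the original really is left untouched, here’s a self-contained sketch wrapping translateCopy() in the minimum needed to run it. The field layout of point and the values in main() are assumptions made for the illustration.

```go
package main

import "fmt"

type point struct{ x, y float64 }

func (p *point) translate(x, y float64) {
	p.x += x
	p.y += y
}

type rectangle struct{ bottomLeft, topRight point }

// translateCopy has a value receiver, so r is already a private copy
// of the rectangle it was called on.
func (r rectangle) translateCopy(x, y float64) *rectangle {
	r.bottomLeft.translate(x, y)
	r.topRight.translate(x, y)
	return &r
}

func main() {
	orig := rectangle{point{0, 0}, point{4, 3}}
	moved := orig.translateCopy(10, 20)
	fmt.Println(orig)   // the original corners are unchanged
	fmt.Println(*moved) // the copy has been shifted by (10, 20)
}
```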
You also need to consider all the usual efficiency factors when deciding between pass by value and reference. Even if you only need to define methods which don’t modify the original, it may be less efficient to pass by value if the structure is large, hence copying it incurs notable overhead. Passing by pointers incurs a cost as well, however — it’s quite possible that it will have diminished locality of reference, hence worsen performance in the presence of the aggressive caching that all modern processors perform.
As always, the thing to do is follow what appear to be the correct language semantics and not worry too much about performance until you’ve got something you can profile and base your optimisation decisions on real performance data instead of guesses about how complex processor architectures may or may not execute your code.
So we’ve seen how to add methods to a type. Now we’ll look at interfaces which are named collections of methods that types can implement.
If you’re coming from Java then Go’s concept of an interface will seem familiar, and if you’re coming from Rust then you can consider them a simplified form of traits. Coming from C++ or Python, they’re a little like a class which provides only abstract methods. I don’t think any of these comparisons are exact, but it gives you an idea what to expect.
To define an interface, you just define the signatures of the methods that comprise it. As with struct, the interface keyword declares an anonymous type, so you need to use the type keyword as well to attach an identifier to it.
type shape interface {
    area() float64
    perimeter() float64
    numCorners() int
    translate(x, y float64)
}
You’ll notice that this interface specification doesn’t mention anything about whether the receiver is passed by value or pointer — more on that in a moment.
But now we’ve got an interface, what can we do with it? Well, we can use it as a type, both for declaring variables and also function parameters.
func manipulate(s shape) {
    s.translate(10.0, 20.0)
    fmt.Printf("Area: %.2f\n", s.area())
    fmt.Printf("Perimeter: %.2f\n", s.perimeter())
}
You can pass any type which meets the shape interface here, or a pointer to any such type as well. Whether a type meets the interface is determined structurally by the compiler, i.e. does it have all the methods specified with the correct signatures. There is no need to explicitly declare that a particular type is intending to meet the interface [1].
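Here’s a runnable sketch of that structural matching. The square type is hypothetical and never mentions shape in its declaration, yet a pointer to it can be passed to manipulate(). The blank-identifier assignment is an optional idiom for asking the compiler to verify conformance up front.

```go
package main

import "fmt"

type shape interface {
	area() float64
	perimeter() float64
	numCorners() int
	translate(x, y float64)
}

// square is a hypothetical type; nothing in its declaration mentions shape.
type square struct {
	x, y, side float64
}

func (s *square) area() float64      { return s.side * s.side }
func (s *square) perimeter() float64 { return 4 * s.side }
func (s *square) numCorners() int    { return 4 }
func (s *square) translate(x, y float64) {
	s.x += x
	s.y += y
}

// Optional compile-time assertion that *square satisfies shape.
var _ shape = (*square)(nil)

func manipulate(s shape) {
	s.translate(10.0, 20.0)
	fmt.Printf("Area: %.2f\n", s.area())
	fmt.Printf("Perimeter: %.2f\n", s.perimeter())
}

func main() {
	sq := &square{side: 3}
	manipulate(sq)
	fmt.Printf("Moved to (%.1f, %.1f)\n", sq.x, sq.y)
}
```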
Whether you use a pointer or value is determined by whether the implementation of the methods takes the receiver by pointer or value - you can’t overload method names, so you have to choose one or the other for any given method.
If you pass a pointer to a type conforming to shape as the s parameter above, then you’re able to call any method, because it’s easy for the compiler to implicitly dereference the pointer to call those methods which require pass by value.
However, the converse is not true. If you pass an actual instance of a type by value, then only those methods which take the receiver by value count towards meeting the interface, and the compiler rejects the code if any method of shape is defined only on the pointer. If it allowed methods which accept the receiver by pointer, they would modify the copy held in the interface value rather than the original, and the changes would be silently lost.
This is why you should avoid mixing methods which take the receiver by pointer or value on the same type, because it makes this sort of thing quite confusing for people using your methods. At this point in time I’m inclined to say that the easiest thing is perhaps just to use pointers everywhere, but there may be further subtleties of which I’m unaware which make this strategy less useful than it first appears.
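The sketch below shows the asymmetry on a deliberately mixed type. The mover interface and the describe() method are names invented for this example; the commented-out line is the one the compiler would reject.

```go
package main

import "fmt"

// mover is a hypothetical interface needing one method of each kind.
type mover interface {
	describe() string
	translate(x, y float64)
}

type point struct{ x, y float64 }

// Value receiver: in the method set of both point and *point.
func (p point) describe() string {
	return fmt.Sprintf("(%.1f, %.1f)", p.x, p.y)
}

// Pointer receiver: in the method set of *point only.
func (p *point) translate(x, y float64) {
	p.x += x
	p.y += y
}

func main() {
	p := point{1, 2}

	var m mover = &p // fine: *point has both methods
	m.translate(10, 20)
	fmt.Println(m.describe())

	// var bad mover = p
	// The line above would not compile, because the value type point
	// does not have translate() in its method set.
}
```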
At this point it’s worth noting the type interface{} represents any type which implements at least the empty set of methods, which means all types. You can use this in any context where you don’t want to constrain the type of the value you receive, return or store.
In my experience this sort of thing isn’t a good idea in many cases, in the same way that I wouldn’t recommend ever using void* in C unless you can’t find any other option. However, it’s useful in some cases. One example is where you want to allow external code to associate some opaque context that you’re just going to return back to it in some future callback or similar. Whether this is likely to occur in Go much is something I haven’t had enough experience to decide, but it’s certainly something that happens in C so I wouldn’t want to rule it out.
What I would suggest is that whilst it’s potentially reasonable to accept an interface{} value as a parameter, it’s probably a bad idea to return one because you’re forcing the calling code to be able to handle any type of value that might ever occur. If that set expands in the future, you could break calling code.
Still, it’s useful to have the option, as long as you bear in mind the risks.
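As a small sketch of the risk described above, here’s a hypothetical describe() function that accepts interface{} and has to branch on the dynamic type. Any type it doesn’t anticipate falls through to the default case, which is exactly the burden placed on callers when a function returns interface{}.

```go
package main

import "fmt"

// describe accepts a value of any type and branches on its dynamic type.
func describe(v interface{}) string {
	switch x := v.(type) {
	case int:
		return fmt.Sprintf("int %d", x)
	case string:
		return fmt.Sprintf("string %q", x)
	default:
		return fmt.Sprintf("unhandled type %T", x)
	}
}

func main() {
	fmt.Println(describe(42))
	fmt.Println(describe("hello"))
	fmt.Println(describe(3.5))
}
```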
☑ This section is just for interest, I don’t think anyone needs to understand this to use the language! Therefore don’t let this level of detail concern you if you’re not interested in it.
To help solidify understanding of all this, it’s worth considering what’s actually stored in a value with the type of a particular interface, i.e. if we had a variable of type shape from the example above, what would it contain?
For the sake of this example, let’s suppose that rectangle implements interface shape, and that’s the value we’re holding in a variable of type shape. Any code using this value doesn’t need to care about the implementations of the required methods, it just needs to know how to call them. In C++ this is achieved with a vtable, a lookup table of pointers to methods which is updated in each derived class which overrides a method.
In Go there’s something similar, which I’ve seen called the itable or itab. This contains an entry with metainformation about the concrete type in question, in this case rectangle. This is important, because if you recall from the previous article we can use a switch statement to branch based on the type of the value: even if we have a variable of type shape, this can switch based on the actual underlying type used.
The rest of the itable is filled with pointers to the implementations of the interface methods for the concrete type in question. Note that this doesn’t include every method of the type — only those that are in this particular interface.
So an interface type value is essentially a pair, where the first item points to the itable and the second item is a pointer to the actual value [2]. This is sufficient to use the interface methods: the compiler just has to generate some code that passes the value item in as the receiver to the methods, and it doesn’t need to write code that relies on any of the internal structure of the specific type in question.
Here that is in diagram form.
Unlike in C++, these itables can’t be computed exhaustively. It’s quite possible that lots of types might match a particular interface, but never actually be used with it, especially if you consider everything matches interface{}.
Therefore what Go does is generate a table for each concrete type, containing a list of its methods, and a table for each interface, containing a list of the methods it requires. At runtime it derives the itable by comparing these two lists, and then caches the resultant itable so that any future uses immediately use the cached version.
This means that each pair of concrete type and interface used in the code incurs a small one-off runtime cost, but other than that the performance should be somewhat comparable to the C++ vtable approach. In particular this is better than in Python’s case, which is going to do the dynamic lookup every time (possibly with some caching, however).
In a class-based object-oriented language, inheritance is used to (among other things) compose data members and methods from multiple classes into one. Go may not have classes, but it can achieve a form of these compositions using embedding of structures and interfaces.
By embedding a structure into another one, the members of the “base” structure are made available in the “derived” one [3]. This is achieved by declaring a structure field with no explicit name, only a type. The type name can be used to refer to the field, but there’s also a feature called promotion where fields and methods of the base structure are made available directly on the derived structure.
This is illustrated in this simple example.
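The following is a reconstruction from the discussion in this section rather than the exact original listing. The names lineColour, circle, sameColour() and String() come from the text; the styled base structure, the radius field and the contents of main() are assumptions.

```go
package main

import "fmt"

// styled is a hypothetical "base" structure; only lineColour and
// sameColour() are named in the text.
type styled struct {
	lineColour string
}

func (s styled) sameColour(other styled) bool {
	return s.lineColour == other.lineColour
}

func (s styled) String() string {
	return "colour=" + s.lineColour
}

type circle struct {
	styled // embedded: a field with a type but no explicit name
	radius float64
}

// String lets the fmt functions print a circle directly.
func (c circle) String() string {
	return fmt.Sprintf("circle(radius=%.1f, %v)", c.radius, c.styled)
}

func main() {
	c1 := circle{styled{"red"}, 1.5}
	c2 := circle{styled{"red"}, 4.0}

	fmt.Println(c1.lineColour)            // promoted field
	fmt.Println(c1.styled.lineColour)     // same field via the type name
	fmt.Println(c1.sameColour(c2.styled)) // promoted method
	fmt.Println(c1)
}
```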
This shows how the lineColour field can be directly accessed in the derived circle structure and the sameColour() method can be called directly on it. It also illustrates defining String() methods on objects to make them printable by the functions in the fmt package.
Since multiple structures can be embedded, there is always the potential for conflicts and ambiguities if two of the attributes or methods clash in name. This is reported as a compile error at the point the identifiers are referenced; if they’re never used, no error is raised.
Embedded fields can be included by value or by pointer, and as we’ve already seen the methods can be called with a receiver that’s either by value or pointer. In general this doesn’t matter and all such methods are promoted, with one exception: where structure S includes an embedded member T by value, the method set of S only includes promoted methods with a receiver of T, not those with *T (the method set of *S includes both). This is for the same reason as splitting the method sets in the first place, but the additional indirections make things a little trickier to reason out.
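A short sketch of that exception, using the S and T names from the paragraph above. The namer and resetter interfaces and the method names are invented for the illustration, and the commented-out line is the one that would fail to compile.

```go
package main

import "fmt"

type namer interface{ name() string }
type resetter interface{ reset() }

type T struct{ label string }

func (t T) name() string { return t.label } // receiver T
func (t *T) reset()      { t.label = "" }   // receiver *T

// S embeds T by value.
type S struct{ T }

func main() {
	s := S{T{"widget"}}

	var n namer = s // fine: name() has receiver T, so S's method set has it
	fmt.Println(n.name())

	var r resetter = &s // reset() is only in the method set of *S
	r.reset()
	fmt.Printf("%q\n", s.label)

	// var bad resetter = s // would not compile: S lacks reset()
}
```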
One special case of embedded structures like this is where the “base” and “derived” structures both define methods or fields of the same name. Take the example below, where both structures have a defined method called, appropriately enough, method(). They also both conform to the iface interface, which requires such.
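Again this is a reconstruction, so the line numbers quoted in the discussion won’t correspond to it. The names base, derived, iface, method() and baseFunc() come from the text; derivedFunc(), ifaceFunc() and the returned strings are assumptions.

```go
package main

import "fmt"

type iface interface {
	method() string
}

type base struct{}

func (b *base) method() string { return "base.method()" }

type derived struct {
	base
}

// derived's method() shadows the one promoted from base.
func (d *derived) method() string { return "derived.method()" }

func baseFunc(b *base) string       { return "baseFunc: " + b.method() }
func derivedFunc(d *derived) string { return "derivedFunc: " + d.method() }
func ifaceFunc(i iface) string      { return "ifaceFunc: " + i.method() }

func main() {
	d := derived{}
	fmt.Println(derivedFunc(&d))
	fmt.Println(baseFunc(&d.base))  // must name the embedded field explicitly
	fmt.Println(ifaceFunc(&d))      // uses the most derived method()
	fmt.Println(ifaceFunc(&d.base)) // explicitly selects the base version
}
```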
This is all valid, and behaves more or less as you’d expect. When passing to functions that expect either base or derived then you have to select the correct one. For example, in line 48 the function baseFunc() expects a *base so that’s what has to be passed in. We do this by referring to the members through the embedded field type name instead of relying on promotion to pull them in.
The functions accepting any iface value, however, will use whatever is the most derived version of method() available. This means if we want to explicitly call the base version, such as in line 52, then we pass in &d.base instead of &d.
As well as embedding fields within a struct definition, you can also embed interfaces in other interfaces. This is equivalent to simply adding all the methods from the “base” interface into the “derived” interface. To conform to the derived interface a type must implement all methods from both, and any such type will, of course, also conform to the base interface as well.
To illustrate I’ll build up a complete simplified example. Since it’s a lot of code, I’ll present it in pieces. First up here’s a definition of the struct point which we’ve seen before.
Now we see the key part — the interface definitions.
The entity interface is the base here, and anything implementing boundingBox() conforms to it. Then we have two derived interfaces shape and image, both of which extend entity with their own additional methods.
In this example we just have one type rectangle which implements shape, which requires it to implement both area() and boundingBox().
We also have one type svgImage which implements image. This must also implement boundingBox(), but in contrast to rectangle it implements paletteSize() rather than area().
To finish off we define three functions, one taking a parameter of each of these interface types, and a main() function that creates these types and passes them in.
This illustrates that rectangle values conform to shape, svgImage values conform to image, and both of them also conform to entity. Hopefully that’s all fairly clear by this point.
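Pulling the pieces described above into one condensed, runnable sketch gives something like the following. The interface and type names are taken from the text, but the signature of boundingBox(), the fields of svgImage and the three show functions are assumptions made for the illustration.

```go
package main

import "fmt"

type point struct{ x, y float64 }

// entity is the "base" interface.
type entity interface {
	boundingBox() (point, point)
}

// shape and image each embed entity and add one method of their own.
type shape interface {
	entity
	area() float64
}

type image interface {
	entity
	paletteSize() int
}

type rectangle struct{ bottomLeft, topRight point }

func (r rectangle) boundingBox() (point, point) { return r.bottomLeft, r.topRight }
func (r rectangle) area() float64 {
	return (r.topRight.x - r.bottomLeft.x) * (r.topRight.y - r.bottomLeft.y)
}

// svgImage's fields are invented for this sketch.
type svgImage struct {
	width, height float64
	colours       int
}

func (s svgImage) boundingBox() (point, point) {
	return point{0, 0}, point{s.width, s.height}
}
func (s svgImage) paletteSize() int { return s.colours }

func showEntity(e entity) {
	bl, tr := e.boundingBox()
	fmt.Printf("Bounds: %v to %v\n", bl, tr)
}
func showShape(s shape) { fmt.Printf("Area: %.2f\n", s.area()) }
func showImage(i image) { fmt.Printf("Palette size: %d\n", i.paletteSize()) }

func main() {
	r := rectangle{point{1, 1}, point{5, 4}}
	img := svgImage{640, 480, 16}
	showShape(r)
	showImage(img)
	showEntity(r)   // a shape is also an entity
	showEntity(img) // and so is an image
}
```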
So that’s the extent of my learning so far about Go’s methods and interfaces, and the extent to which it supports its own forms of inheritance and polymorphism.
I’m trying to decide how I feel about it all, with comparisons to other languages, but in all honesty I think it’ll be hard to be sure of that until I’ve written quite a lot of code in it. In my experience, when you first learn a language the shiny new features immediately stand out — something you expected to be hard, tedious or verbose turns out to be easier or more elegant than you first thought. The deficiencies, on the other hand, are often harder to discern up front — you tend to notice them only when you try something that intuitively you expected to be easy and simple, but it turns out to be harder, more tedious or less elegant than you’d hoped. It takes time to come across these examples and moderate your initial enthusiasm.
In hindsight, however, all I’ve managed to do there is reinvent the Gartner Hype Cycle:
My area of concern around new forms of polymorphism is whether they limit refactoring and code sharing possibilities. In class-based languages which don’t allow multiple inheritance, for example, it can be hard to retrofit mixin classes which bring implementations with them. In Go, I think that could be achieved with struct embedding, as it happens.
In the case of Go, embedding has some of the properties of inheritance, and the fact that method names are allowed to clash with the most derived one winning has the properties of static polymorphism. Indeed, as we saw above the implementation of interface values has definite similarity to virtual methods in C++.
So, there’s reason to believe that many basic object orientation practices will be possible with Go — the question is whether they are convenient and idiomatic, and that remains to be seen. In any case, it’s perfectly possible to write complex codebases entirely in C where you have to build all this machinery yourself with explicit casting, so what the language makes convenient is always the question in practice.
What I can say so far is that the mechanisms I’ve covered in this article were all fairly intuitive to me, and based on underlying mechanisms that stick in your memory. I’d say probably the only aspect which caused me a few compile errors was the use of pointers vs. values. For example, without thinking I tried to pass interface values by pointer a few times and similar goofs. Compared with Rust the compiler errors could have been a little more helpful [4], but it’s tolerable.
So all in all, fairly straightforward stuff so far, which is more or less in keeping with my expectation of Go as a simpler language than Rust. Whether the limitations which make it simpler will prove painful, only time will tell.
I hope this has been interesting and/or useful. In the next article in this series I think I’ll probably look at a few other language features like closures and deferred actions, and possibly generics, saving the meaty topic of concurrency for after that. Have a great day!
[1] Note that this structural typing has some important differences to the duck typing used in languages such as Python. Duck typing is a runtime check whether a type has a particular subset of an interface required at a particular time. For example, if you want an object that behaves like a file but you only need to read from it, you just care whether it implements a read() call with the correct semantics. Go’s structural typing, however, is a compile-time check that the entirety of a specified interface is being met, not just the subset of it you happen to be using in a particular context.
[2] Although it’s interesting to note that there have been optimisations to reduce the memory impact in certain cases. In early Go releases, a value that fit into a machine word was stored directly in the second field of the interface value instead of behind a pointer; since Go 1.4 the second field always holds a pointer. Either way, the compiler ensures the methods do the right thing with regard to dereferencing, so this is not something with which the developer need concern themselves.
[3] I’m going to mostly be putting the terms base and derived in quotes, because these aren’t really classes we’re talking about and they’re not really inherited. Instead of base I should probably really be saying nested embedded field, but that’s a bit more of a mouthful and less immediately familiar to those from class-based object oriented languages, so I’ll stick to my more concise, if slightly erroneous, nomenclature.
[4] That said, it’s a high bar. I feel that the Rust compiler really goes above and beyond when it comes to giving very specific errors, and offering helpful suggestions to correct them. Probably the best compiler I’ve ever used in that regard.