Why don’t we use DTOs in FlatBuffers?
DTOs (Data Transfer Objects) are classes which people use to represent and store data before they serialise it and send it over the wire. So the primary purpose of a DTO is to be able to turn an object graph into a buffer (array of bytes).
[Application state] --> [DTOs] --> [Buffer]
And turn a buffer into an object graph.
[Buffer] --> [DTOs] --> [Application state]
Some people (myself included) sometimes don’t see the point of separating Application state from DTOs, specifically if the mapping between the Application state classes and the DTO classes is 1:1. The mapping seems redundant, and they are OK with integrating the DTOs into their application state and logic. It is up to you and your use case to decide if you want a clear boundary between DTOs and Application state. What I want to clarify in this post is:
Why does FlatBuffers not let me generate easy-to-use DTOs by default?
In FlatBuffers the buffer is built in a way that lets us access individual values directly, without the need to deserialise the whole graph and, in some languages, even without temporary heap allocations. This is impossible to achieve with text-based serialisation formats and with binary formats that are focused on bit packing (e.g. Protocol Buffers, Avro, CBOR etc.). For those other formats, which need to convert the whole Buffer into an intermediate object graph, DTOs totally make sense. For FlatBuffers, however, they are overkill. This is why, by default, FlatBuffers generates types which are just lightweight accessors. They are just pointers into the buffer, which let us navigate and extract values in a type-safe way. So in the case of FlatBuffers the buffer-to-object diagram is:
[Buffer] --> [Application state]
There is no DTO object graph; we can pick and choose values directly out of the Buffer for our Application state.
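To make the idea of a lightweight accessor more concrete, here is a minimal sketch in Python. It does not use the real FlatBuffers wire format or generated code; the layout, the class name `PersonAccessor` and its fields are assumptions made up for illustration. The point is only that the accessor keeps a reference to the buffer and decodes a value when we ask for it.

```python
import struct

# Toy layout, NOT the real FlatBuffers wire format (assumed for illustration):
#   offset 0: int32  lucky_number (little-endian)
#   offset 4: uint32 length of name in bytes
#   offset 8: UTF-8 bytes of name
buffer = struct.pack("<iI", 7, 3) + b"Max"


class PersonAccessor:
    """Hypothetical lightweight view: stores only a reference to the buffer."""

    def __init__(self, buf):
        self._buf = buf

    @property
    def lucky_number(self):
        # Decode 4 bytes at a known offset; nothing else is touched.
        return struct.unpack_from("<i", self._buf, 0)[0]

    @property
    def name(self):
        # Read the length prefix, then decode only the name bytes.
        (length,) = struct.unpack_from("<I", self._buf, 4)
        return bytes(self._buf[8:8 + length]).decode("utf-8")


person = PersonAccessor(buffer)
lucky = person.lucky_number  # only this field is decoded
```

Creating `person` does not parse anything; each property reads just the bytes it needs.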
What about serialisation of Application state though?
For turning values into a buffer, FlatBuffers generates a different set of lightweight types which we can use to serialise the data directly into a Buffer:
[Application state] --> [Buffer]
In this case, too, we avoid allocating temporary objects just to turn them into a buffer.
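As a rough sketch of the writing direction, the snippet below appends values straight into a byte buffer without building an intermediate object first. Again this is a toy layout and a hypothetical `write_person` helper, not the builder API that `flatc` generates.

```python
import struct


def write_person(out: bytearray, lucky_number: int, name: str) -> None:
    """Append one record to `out` using an assumed toy layout:
    int32 lucky_number, uint32 name length, UTF-8 name bytes."""
    encoded = name.encode("utf-8")
    out.extend(struct.pack("<iI", lucky_number, len(encoded)))
    out.extend(encoded)


# Application state lives in ordinary variables (hypothetical example values).
lucky_number = 7
name = "Max"

buffer = bytearray()
write_person(buffer, lucky_number, name)  # values go directly into the buffer
```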
Here is the thing though. What if we wanted to have our DTOs as part of the Application state?
If we have DTOs as part of our Application state we can do things like:
[Buffer] --> [DTO] --> [DTO'] --> [Buffer]
We transform a buffer into a DTO graph, mutate some values in the DTO graph and turn the new DTO' into a Buffer. Can we do something like this with FlatBuffers?
There are two ways we can achieve that:
- Change scalar values directly in the buffer
- Use the --gen-object-api option and generate DTO types
The first solution works only if you need to mutate a scalar value. Scalar values here are booleans, enums and numbers, and also structs, since they have a fixed size. Say we have a property lucky number in a table which is 7 and now we need to change it to 13. In such a simple case we can do something like this:
[Application state] --> [Buffer] --> [Buffer']
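A minimal sketch of such an in-place change, assuming a toy layout (not the real FlatBuffers format) where lucky number is an int32 at offset 0:

```python
import struct

# Assumed toy layout: int32 lucky_number, uint32 name length, UTF-8 name bytes.
buffer = bytearray(struct.pack("<iI", 7, 3) + b"Max")
size_before = len(buffer)

# Overwrite the 4 bytes of lucky_number in place: 7 becomes 13.
struct.pack_into("<i", buffer, 0, 13)

size_after = len(buffer)  # unchanged, because an int32 always takes 4 bytes
```

The new value occupies exactly the same bytes as the old one, so nothing else in the buffer has to move.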
So we can just mutate the Buffer directly, which is super efficient. However, we can’t do it where a value change means a structural change of the buffer. Say we have a property name and we want to change Max to Maxim. We can’t do that because the string value Max is encoded somewhere inside the buffer. The string value Maxim is longer than Max, so we can’t just replace some bytes with other bytes: the Buffer needs to become bigger and values need to be shifted, which is a structural change. So in this case you will have to rebuild the whole Buffer from scratch.
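The shifting can be seen in a small sketch. The layout below is an assumption for illustration only (name first, lucky number stored right after it) and the helper names are hypothetical; real FlatBuffers buffers are laid out differently, but the same constraint applies.

```python
import struct


def build(name: str, lucky_number: int) -> bytes:
    """Assumed toy layout: uint32 name length, UTF-8 name bytes, int32 lucky_number."""
    encoded = name.encode("utf-8")
    return struct.pack("<I", len(encoded)) + encoded + struct.pack("<i", lucky_number)


def lucky_number_offset(buf: bytes) -> int:
    """The value after the string starts where the string ends."""
    (name_len,) = struct.unpack_from("<I", buf, 0)
    return 4 + name_len


old = build("Max", 7)
new = build("Maxim", 7)  # a longer name means a completely new buffer

old_offset = lucky_number_offset(old)  # 4 + 3 = 7
new_offset = lucky_number_offset(new)  # 4 + 5 = 9
```

Two extra bytes in the name push everything behind it two bytes further, which is why the buffer is rebuilt rather than patched.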
Before you scream BS and grab your pitchfork, spend a few seconds and think about how this works with any other serialisation format like JSON or Protocol Buffers. Do we need to rebuild the whole buffer?
Yes we do, we are just not confronted with it. We change a value in a DTO and ask it to build a whole new Buffer for us.
So it works the same internally, but feels very lightweight for us “externally”.
To give us the same easy feeling about mutating just one value in a complex graph of objects, flatc provides a generation option, --gen-object-api. The generated types work exactly as you would expect typical DTOs to work. They transform the whole Buffer into a graph of DTOs, which you can use to mutate values and generate a new Buffer from.
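The round trip [Buffer] --> [DTO] --> [DTO'] --> [Buffer] then looks roughly like the sketch below. This is not the code that flatc generates; `PersonDTO`, `from_buffer` and `to_buffer` are hypothetical names over an assumed toy layout, meant only to show the shape of the workflow.

```python
import struct
from dataclasses import dataclass


@dataclass
class PersonDTO:
    """Hypothetical DTO over an assumed toy layout:
    int32 lucky_number, uint32 name length, UTF-8 name bytes."""
    lucky_number: int
    name: str

    @classmethod
    def from_buffer(cls, buf: bytes) -> "PersonDTO":
        # Deserialise the whole record into a plain object.
        lucky_number, name_len = struct.unpack_from("<iI", buf, 0)
        return cls(lucky_number, bytes(buf[8:8 + name_len]).decode("utf-8"))

    def to_buffer(self) -> bytes:
        # Build a brand new buffer from the current field values.
        encoded = self.name.encode("utf-8")
        return struct.pack("<iI", self.lucky_number, len(encoded)) + encoded


buffer = PersonDTO(7, "Max").to_buffer()

dto = PersonDTO.from_buffer(buffer)  # [Buffer] --> [DTO]
dto.name = "Maxim"                   # [DTO] --> [DTO']
new_buffer = dto.to_buffer()         # [DTO'] --> [Buffer]
```

Changing the name is a one-line assignment here, but `to_buffer` still rebuilds every byte, which is the cost hidden behind the convenience.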
So if you are concerned with efficiency, you should not use DTOs and should serialise and deserialise data ad hoc. But if your use case allows it, you might want to try --gen-object-api.