How to not loop over all entities in Unity ECS
I am a vivid fanboy of ECS pattern since 2012. It is important to mentioned that I come to ECS not for the performance reasons, but because I like the way ECS forces a clean and flexible application design upon me 😀.
Today I stumbled upon following tweet:
Which triggered this blog post, so buckle up, I am putting my philosophers hat on.
I see two main concepts which drive the application design, when developing with ECS:
- Separation of state and behaviour
- Bottom up design
I think Separation of state and behaviour does not need any further explanation, but what do I mean by Bottom up design?
With traditional OOP mindset we tend to go top down — from abstract to concrete. A definition of a class is abstract, e.g. Player, Enemy, House, Weapon. They are non specific metaphors. And don’t get me started on the application of inheritance in OOP. This concepts makes us go even further into the abyss of abstract thinking. Which is very welcoming for introduction of accidental complexity.
With ECS, data is KING!
There are no classes, there are just components and systems. Components should reflect the data 1:1. There are no player and enemy, there are components, which reflect data, and if an entity has what it takes to be a player, it will be handled by a system as a player. If it contains things which a system expects from an enemy than this entity will be handled as an enemy. And here is a thing, an entity can be a player and an enemy at the same time.
So how do we avoid looping over all entities?
We avoid it by breaking down components in small atomic elements, which let us execute explicit queries.
Enough with the philosophical mumbo jumbo, let’s get practical
In Unity ECS we can create components, which we can associate with an entity. Unity ECS evaluates all the given component associations and create so called entity archetypes. For example we have components A, B, C and we have entities which in current state have components (A, B), just (C), (A, B, C) and (B, C). We have 4 archetypes in our current state. Important to mention that by adding and removing components from an entity, we change the archetype of the component.
Now why do we need an archetype in the first place?
We need it because the component instances are stored in so called Chunks which are organised by entity archetype. A chunk has a fixed byte size, so do the data components. Given those fix sizes, the engine can compute how many entities will fit into one chunk. In the chunk the component instances are stored by component type and not by entity id. What I mean by that is:
Lets imagine we have 3 entities of (A, B, C) archetype, the layout of the chunk will be:
A1, A2, A3, B1, B2, B3, C1, C2, C3
and not
A1, B1, C1, A2, B2, C2, A3, B3, C3
The first layout is much more beneficial in regards to memory alignment and in case we have a system which is interested only in some of the component types, than we can skip a full memory sector.
Here is a nice diagram from Unity ECS docs, which visualised the concept:
The chunks are organised by archetypes and store components of same type linearly. If there are more entities of a certain archetype, which can fit in one chunk than there will be another chunk created.
Querying for entities in Unity ECS works as following:
Give me all entities which has this components and don’t have this components
The engine checks, which archetypes confirm to the query identifying the chunks we can linearly iterate over. Taking the nice diagram as an example. If we ask for entities which have yellow component, we will need to iterate over all the chunks. If we ask for yellow and blue, we iterate over 5 chunks. If we ask for yellow and not pink, we will iterate only over two chunks in the middle.
The first line of defence to avoid iterating over all entities is component design
That sad, it is often not enough though. The second tweet which provoked this blog post states:
… I use spatial queries for pretty much everything and keep active/passive sets …
I am not 100% sure what the author means by active/passive sets, my guess is: if the state of the entity / game object did not change, don’t touch it.
This concept needs a bit of work in Unity ECS and has multiple possible implementations.
The simplest solution is to define a passive flag component.
By adding this flag component to the entity we are moving the entity from one archetype to another and if the query explicitly states that an entity should not contain passive flag component, chunks with this archetype will not be considered for iteration. If you chose to have an active, passive, or dirty flag component is totally up to you and the queries you want to write.
This is a very straight forward technique, however there are dragons. Reorganising chunks based on addition and deletion of components can be expensive. There are possibilities in Unity ECS to do this kind of work in batches and only at certain sync points, but it still need profiling and consideration.
Another option is to run over all entities but perform early exit based on the certain property of the component.
This is where version numbers might get useful. In Unity ECS the World has a global version number. Every state change committed on the main thread increases the global version number. Every system stores a version number (LastSystemVersion), which stores the global version number the world had last time the system was executed. Chunks also have (ChangeVersion) stored by component type. This version will be set to global version each time chunk is asked to provide a component with a write access (sadly it can’t evaluate if the component was actually changed). Given the chunk (ChangeVersion) there is a mechanism in Unity ECS to skip chunks which have a (ChangeVersion) lower than (LastSystemVersion), meaning that if components in the chunk were not accessed with the write permission since the last system execution, this chunk will not be iterated upon.
That sad, a chunk might include many entities and if only one of those entities was accessed with write permission, the whole chunk and there for all entities / component instances in this chunk will be iterated upon.
If we want to avoid computation with components which did not change. We can add a version field to our component type itself. This way we can set the version field to be the global version, when we update the component and while iterating over the components, we can check if the component version is higher than (LastSystemVersion) skipping the component if it is not.
Now last but not least:
Spatial queries
Or I would rather call it — querying entities by component value.
The simple thing to do is to get all the chunks of component X and while iterating over all of them skip the once which has the “wrong value”.
This is an option but it is not the only option we have.
Another option is to define the component, which value is important for the query as SharedComponentData.
If an archetype have a shared component, than it will put entities with the same shared component value in the same chunk only. I think it is best to give an example here.
Lets say we have a grid based game, and we have a grid position component which is a shared component. Now if we have 20 entities of the same archetype, which includes the grid position component, we will have as much chunks as there are different values of shared component. Say if all 20 entities are on the same grid position than we have just one chunk. If all of the entities are on different grid position, than we have 20 chunks.
When we query for entities with a shared component, we can ask for entities, with explicit shared component value and there for we will iterate only over a certain chunk.
With this option we again stumbled upon the fact that changing the values of shared components might become expensive, due to all the re-organisation of data in chunks. This is something that is use case specific and needs profiling. And we are also not done with the options yet.
Spatial queries are often performed on graphs and trees. What if the nodes of those graphs and trees are entities by themselves?
It is possible to build up a quad tree out of entities. It is debatable, if we will gain some data locality benefits out of this approach, but maybe we will 🤷♂️
Last but not least, while ECS and pure Blittable data types have their benefits, sometimes there is room for the good old reference based data structures, like trees and graphs etc…
Systems in Unity ECS are classes, which can have references to objects. Objects like quad trees, or hash maps, or other reference based data structures. If you don’t have time to figure out the best way to map a given problem with pure data types and ECS, use things you know, with all the benefit / pitfall you would get, if implementing the game in a more traditional OO style.
Thanks for reading and I am always open for discussion.