Why I set out to prove that functional programming works for games, and what it might mean for modern game development.
A fellow on the www.fpchat.com Slack channel #gamedev asked me the following question about my functional F# game engine, Nu (source available here) –
“I was curious of how this project started. Do you have previous experience in game/engines development?”
I think it was my very painful experience working with commercial game code bases — most recently on the Sims 4 — that finally convinced me of the fundamental flaws of the object-oriented approach to game development. Those conclusions kept being reconfirmed with my successive experiences with modern engines like Unreal, Unity, and in-house game engines.
The bottom line was that, in these code bases, what should have taken 15–30 minutes would typically get estimated at – and actually take! – 3–5 hours. And let me assure you almost all that extra time spent was pure pain, most of it in the debugger trying to figure out just how the hell the horrifically complex system got into the relevant part of its current state to begin with.
As an engineer, I became consistently frustrated due to the complexity that seemed unavoidable with current tools.
However, while working on Sims 4, I was privileged enough to carry on an ongoing conversation with one of the principal engineers on the team. It was this months-long exchange that helped me shape some of my initial ideas for the Nu Game Engine, if only as a contrarian undertaking.
Let me note that my colleague was an awesome chap personally, and gave a great deal of time to these discussions that he did not have to, so even though he argued forcefully, he was one of the nicest and most open-minded people I’ve worked with. As our conversation proceeded, the arguments he gave as to why functional programming could not work for games kept returning to the following two points –
1) The GC would create too many pauses and affect the frame rate of games. This was from his experience of using C# in the Sims 3 engine, and his experience didn’t allow him to conclude otherwise, even though the modern .NET GC was very different from the one that shipped with Sims 3.
To invalidate his assertion, I did some research on modern GC technology, consuming several white papers and a couple of books along the way.
After spending several weeks doing my homework (it was not easy ramping up on such an unfamiliar topic!), I continued our conversation. I suggested that the design of at least some modern GCs, such as those considered ‘pauseless’ due to their incremental nature, would eliminate the issue in theory.
When I brought this to his attention, he didn’t seem to be able to give a concrete rebuttal, so I concluded the approach would, at least in theory, work. It would have been nicer had he been willing to concede the point outright, but fortunately I was able to make up for his lack of explicit concession with my own stubbornness.
Even better, as I prototyped the engine initially, it turned out that, in practice, the .NET 3.0 (and above) GC did not have the type of pauses he worried about, even without an incremental design! As far as bandwidth is concerned, the GC maxes out at 2% of CPU usage in Nu. Outside of a single GC2 hitch at the start of the program (which can be easily hidden with a manual call to GC.Collect() at a loading screen), there are no frame-delaying pauses.
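For readers who want to see what that loading-screen trick might look like, here is a minimal sketch. The helper name collectBehindLoadingScreen is hypothetical and not part of Nu's API; the GC calls themselves are standard .NET, and the three-argument GC.Collect overload assumes .NET Framework 4.5 or later (or .NET Core).

```fsharp
open System

/// Hypothetical helper, not part of Nu's API: call it while a loading screen is up.
let collectBehindLoadingScreen () =
    // Force a full, blocking collection of every generation (gen 0 through gen 2).
    GC.Collect (GC.MaxGeneration, GCCollectionMode.Forced, true)
    // Let any pending finalizers run, then reclaim what they released.
    GC.WaitForPendingFinalizers ()
    GC.Collect ()

collectBehindLoadingScreen ()
printfn "gen 2 collections so far: %d" (GC.CollectionCount 2)
```

The idea is simply to pay for the expensive gen 2 collection at a moment when nobody is watching the frame rate.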
Through judicious use of mutation encapsulated behind the engine’s functional interface (as we’ll talk about later in this article), I can easily fend off all noticeable GC stalls until long after the CPU is soaked with normal simulation processing. Because we reach CPU soak long before GC stalls kick in, especially considering how tightly tuned and optimized the engine itself is, I consider this a non-issue both in theory and in practice thanks to modern GC technology.
2) Functional programming would be too slow.
This is a common concern, and a bit more valid than the other. But is it as extreme as my colleague suggested? And aren’t there some caching optimizations and other workarounds that can assuage this concern?
Currently, the Nu Game Engine can process about 25,000 on-screen entities at 60 FPS before saturating the CPU. For perspective, consider that a modern CPU can only handle about 50,000 particles before they need to be implemented with an alternative programming approach known as data-oriented design, and particles are much cheaper than entities in any game engine.
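To put the 25,000 figure in terms of time, here is a back-of-the-envelope calculation. It assumes, generously, that the entire frame on one core is spent on entities, so treat the result as an upper bound on the average per-entity budget rather than a measurement from Nu. The function names are made up for this illustration.

```fsharp
/// Nanoseconds available per frame at a given frame rate.
let frameBudgetNs (fps : float) = 1e9 / fps

/// Average nanoseconds available per entity if the whole frame went to entities.
let perEntityNs (fps : float) (entities : int) = frameBudgetNs fps / float entities

// 25,000 entities at 60 FPS
printfn "%.0f ns per entity" (perEntityNs 60.0 25000)   // prints "667 ns per entity"
```

In other words, each entity gets well under a microsecond per frame on average, which is why the cost of a single state look-up matters so much.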
With this number of entities on the screen at once, the performance limitations depend entirely on the engine’s structure. Consider that in order for an entity to have its current state retrieved, it must be looked up from a map. And not just any map, a purely functional map! How can we process that many entities when we have to rely on this type of data structure?
There are three optimizations that make this fast in Nu. First, we use an innovative purely functional unidirectional map, UMap, rather than the normal F# Map. While the vanilla Map’s look-up time is O(log n), UMap’s look-up time is O(1) in Nu’s use case!
(It’s called the ‘unidirectional map’ because its performance is near that of the .NET Dictionary’s so long as most past instances of it are discarded — just as they fortunately are in a game simulation such as Nu!)
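To give a feel for how a map can be purely functional on the outside yet dictionary-fast when only the newest version is used, here is a minimal sketch. To be clear, this is not UMap's implementation, which isn't shown in this article. It is a different, well-known technique (rerooting, as used for persistent arrays) with a similar performance shape, and every name in it (VersionedMap, MapState, SketchMap) is invented for illustration. It is also not thread-safe.

```fsharp
open System.Collections.Generic

/// A handle is either the owner of the live dictionary (Root), or a record of
/// how one key differs from a neighboring version (Diff).
type VersionedMap<'k, 'v> = { mutable State : MapState<'k, 'v> }
and MapState<'k, 'v> =
    | Root of Dictionary<'k, 'v>
    | Diff of 'k * 'v option * VersionedMap<'k, 'v>

module SketchMap =

    let private tryGet (dict : Dictionary<'k, 'v>) (key : 'k) =
        match dict.TryGetValue (key) with
        | true, value -> Some value
        | _ -> None

    /// Make `m` the owner of the dictionary, undoing diffs along the way.
    /// O(1) when `m` is already the newest version; proportional to the number
    /// of versions in between when it is not.
    let rec private reroot (m : VersionedMap<'k, 'v>) : Dictionary<'k, 'v> =
        match m.State with
        | Root dict -> dict
        | Diff (key, valueHere, next) ->
            let dict = reroot next
            let valueThere = tryGet dict key
            match valueHere with
            | Some value -> dict.[key] <- value
            | None -> dict.Remove (key) |> ignore
            next.State <- Diff (key, valueThere, m)
            m.State <- Root dict
            dict

    let make () : VersionedMap<'k, 'v> = { State = Root (Dictionary<'k, 'v> ()) }

    /// Returns a new version; the old handle stays valid but becomes a Diff.
    let add (key : 'k) (value : 'v) (m : VersionedMap<'k, 'v>) =
        let dict = reroot m
        let previous = tryGet dict key
        dict.[key] <- value
        let newer = { State = Root dict }
        m.State <- Diff (key, previous, newer)
        newer

    let tryFind (key : 'k) (m : VersionedMap<'k, 'v>) = tryGet (reroot m) key

let v0 : VersionedMap<string, int> = SketchMap.make ()
let v1 = SketchMap.add "player" 1 v0
let v2 = SketchMap.add "player" 2 v1
printfn "%A" (SketchMap.tryFind "player" v2)   // newest version: plain dictionary hit
printfn "%A" (SketchMap.tryFind "player" v1)   // old version: still correct, but pays to reroot
```

As long as the simulation keeps moving forward and drops old versions, every operation in this sketch is a single dictionary operation. Reaching back to an old version still gives the right answer; it just costs more.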
Check out the timings –
Second, the most recently-found entity is cached by the engine so that subsequent state retrievals on the same entity require no entity look-up at all! So as far as subsequent reads go, we’re as fast as we’d like to be.
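A one-slot cache of this kind can be sketched in a few lines. This is an illustration of the general idea only, not Nu's code: the EntityState and LastLookupCache types are hypothetical, and I use a plain F# Map as a stand-in for the world. The cache hits only when the same key is read again from the very same immutable map instance, which is what keeps it safe.

```fsharp
type EntityState = { Name : string; Health : int }

/// Hypothetical one-slot cache for the most recently found entity.
type LastLookupCache () =
    let mutable cachedMap : obj = null
    let mutable cachedKey = ""
    let mutable cachedState : EntityState option = None

    /// Counts how often we actually had to walk the map.
    member val Misses = 0 with get, set

    member this.TryFind (key : string, states : Map<string, EntityState>) =
        if obj.ReferenceEquals (cachedMap, states) && cachedKey = key then
            cachedState // hit: no look-up at all
        else
            this.Misses <- this.Misses + 1
            let found = Map.tryFind key states
            cachedMap <- box states
            cachedKey <- key
            cachedState <- found
            found

let cache = LastLookupCache ()
let world = Map.ofList [ ("hero", { Name = "hero"; Health = 10 }) ]
cache.TryFind ("hero", world) |> ignore    // miss: walks the map
cache.TryFind ("hero", world) |> ignore    // hit: served from the cache
let world2 = Map.add "hero" { Name = "hero"; Health = 9 } world
cache.TryFind ("hero", world2) |> ignore   // miss: a new world instance invalidates the slot
printfn "misses: %d" cache.Misses
```

Because typical game code reads several properties of one entity in a row, even a single slot can absorb a large share of look-ups.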
Third, there is the ability to specialize types of entities as ‘imperative’. That is, operations on them mutate the state in-place rather than copying and updating. Because imperative entity operations are in-place, their data can be cached directly in the handle, requiring no look-ups even when dealing with different entities! The above 25,000 number is for when the engine is configured to update entity states imperatively on the back-end. If you want your entities to be purely functional and work with systems such as undo / redo in the editor, you can only have about 12,500 on-screen. Still, that’s a surprisingly small perf loss considering the theoretical performance costs of functional data structures.
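The trade-off between the two modes can be shown with a toy example. The types below are hypothetical stand-ins rather than Nu's entity types; the point is only that copying updates keep old snapshots intact (which is what undo / redo relies on), while in-place updates avoid the copy but overwrite history.

```fsharp
/// Functional flavor: an update builds a new record and leaves the old one alone.
type FunctionalEntity = { Position : float }

/// Imperative flavor: an update mutates the one and only record in place.
type ImperativeEntity = { mutable MutablePosition : float }

let moveFunctional (dx : float) (e : FunctionalEntity) =
    { e with Position = e.Position + dx }

let moveImperative (dx : float) (e : ImperativeEntity) =
    e.MutablePosition <- e.MutablePosition + dx
    e

let f0 = { Position = 0.0 }
let f1 = moveFunctional 1.0 f0     // f0 is untouched, so an undo stack can keep it

let i0 = { MutablePosition = 0.0 }
let i1 = moveImperative 1.0 i0     // i0 and i1 are the same object

printfn "functional snapshot preserved: %b" (f0.Position = 0.0 && f1.Position = 1.0)
printfn "imperative update in place: %b" (obj.ReferenceEquals (i0, i1) && i0.MutablePosition = 1.0)
```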
Additionally, I’ve included an ECS API that allows entities to scale into the millions. In a later article I describe how functional programming and ECS complement each other – https://vsyncronicity.com/2020/03/01/functional-programming-and-data-oriented-design/
So, as it stands, we can say the following today with certainty –
Functional game programming should work out-of-the-box for casual, non-AAA games. Concerns 1 and 2 have been demonstrated to be non-blocking both in theory and in practice. With concern 2, you do need an escape hatch to alternate approaches when in need of different scalability properties – and that’s just what Nu’s ECS provides!
The open question is: do these types of techniques work in the context of AAA games like Uncharted 4?
I cannot answer this with certainty… yet. Nu’s non-ECS entities perform nearly as well as Unity’s GameObjects, and Nu’s archetype-based ECS is extremely performant. There also seems to be a general tax on .NET code versus C++ – but the .NET jitter is getting better all the time, especially with the recent release of RyuJIT. We can also look forward to Profile-Guided Optimization in .NET – https://devblogs.microsoft.com/dotnet/conversation-about-pgo/
That all said, just as the initial prototype of Nu proved the workability of functional game development, I will assert this much:
There’s only one way to prove that Nu’s idealized combination of functional and data-oriented programming can work for modern AAA games — and that is to try it and see.