Are Performance Issues All Because of OGRE?
You hear it a lot: Ogre is an old engine, and Kenshi runs at a snail’s pace because of it.
But that’s only half the story.
Yes, Ogre is single-threaded, which holds performance back. However, performance comes down to more than the engine. Assets in any game development pipeline need to be optimised, and many of Kenshi’s assets are far heavier than they need to be.
The truth is, a lot of the performance problems can be reduced. LitA combines the performance hacks of Kenshi: Remastered and ReKenshi with a better asset pipeline. Existing assets are optimised by ~40%, and the whole material pipeline is designed with strict targets in mind. By making more deliberate decisions, all assets across the game come in at under 4GB in total, including the heightmap, terrain materials, clothing, weapons, and races.
That means that a player running a high-end graphics card could have everything loaded at once.
No, that doesn’t mean LoFi made bad decisions, only that they could have done things differently. After all, we’re looking at a completed project in post and figuring out where its weaknesses are – and planning around them.
How We’re Optimising for Performance
LitA uses a mix of scratch-made and vanilla assets. Fundamentally, we’ve optimised them in two ways: first, polycounts, and second, the materials used in rendering.
On top of those optimisations is a mixture of rendering improvements, thanks to Kenshi: Remastered, and in-engine improvements relating to level design, AI optimisation, and a handful of miscellaneous tweaks.
No, we can’t make Kenshi perform like it’s running on a new engine. But we can squeeze as much performance as possible out of it.
Polycounts refer to the number of polygons in an asset. While GPUs improve over time, making polycounts less important overall, there will always be a hard limit to what an engine can display (and don’t say “Nanite” or “UE5”).
Put simply, we’re working with stricter polygon targets than are used in vanilla.
For instance, a Holy Nation chest plate comes to around 6,500 tris in vanilla. The retopologised mesh used in LitA comes to around 2,500 and looks virtually identical. On average, the repurposed meshes use 40-50% fewer polygons than their vanilla counterparts, and new meshes aim for the same targets.
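To put the chest plate example in percentage terms, here is a minimal sketch of the arithmetic. The function name is just for illustration; the only inputs are the two triangle counts quoted above.

```python
def reduction_percent(before_tris, after_tris):
    """Percentage of triangles removed by a retopology pass."""
    return 100.0 * (before_tris - after_tris) / before_tris

# Holy Nation chest plate: ~6,500 tris in vanilla, ~2,500 after retopology.
print(f"{reduction_percent(6500, 2500):.1f}% fewer tris")
```

That particular mesh lands at a little over 60%, so it sits above the 40-50% average rather than being typical of it.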
Buildings and level geometry adhere to the same philosophy: Lower polygon counts as much as possible.
Better Material Design
Level geometry (buildings, props, walkways, and features) uses a lot of textures. In total, the textures across all buildings used in-game come to between ~1.5GB and ~2.3GB (depending on how you count).
LitA’s materials use a different design philosophy, focusing on more trimsheets and seamless materials over individually textured assets.
This is a benefit for two reasons. Firstly, we reduce the total VRAM load of materials to under 1GB; and secondly, we improve the visual fidelity of large assets by tiling the materials more.
What does this mean?
Here’s a texture for a level IV wall from the vanilla game. It’s a 4K asset, meaning the base texture plus normal map come to 42MB:
The first problem is that huge surface areas, like the wall itself, use relatively small parts of the texture map. The second problem is that relatively niche assets use a lot of VRAM: a level IV wall, its gate, and a tower use a total of six 4K material atlases, totalling 128MB alone.
There simply aren’t enough pixels to render the wall in high fidelity. The end result is a wall that looks like this:
Notice how blocky and pixelated the walls look up-close.
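As a rough sanity check on the 42MB and 128MB figures above, the sketch below estimates texture memory. It assumes each map is stored at 1 byte per pixel (as with DXT5/BC3 block compression) and carries a full mipmap chain. Neither detail is stated in this article, so treat both as assumptions; the function name is hypothetical too.

```python
def texture_bytes(width, height, bytes_per_pixel=1.0, mipmaps=True):
    """Estimate the memory footprint of a single texture map.

    bytes_per_pixel=1.0 is an ASSUMPTION (DXT5/BC3-style compression).
    Small mip levels are not padded to 4x4 blocks, so this is approximate.
    """
    total = 0.0
    w, h = width, height
    while True:
        total += w * h * bytes_per_pixel
        if not mipmaps or (w == 1 and h == 1):
            break
        # Each mip level halves both dimensions, down to 1x1.
        w, h = max(1, w // 2), max(1, h // 2)
    return total

MIB = 1024 * 1024
one_4k = texture_bytes(4096, 4096)
print(f"one 4K map:        {one_4k / MIB:.1f} MiB")
print(f"base + normal map: {2 * one_4k / MIB:.1f} MiB")
print(f"six 4K maps:       {6 * one_4k / MIB:.1f} MiB")
```

Under those assumptions a single 4K map comes to roughly 21 MiB, a base-plus-normal pair to roughly 43 MiB, and six maps to roughly 128 MiB, which lines up with the numbers quoted for the level IV wall set.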
By switching to a material-based workflow, we can tile the textures while upscaling many of the base materials – without losing any performance. This, combined with a handful of other tricks (more modern compression techniques and selective downscaling of non-essential materials, like metalness and normal maps), means that everything looks as good up-close as it does from a distance.
The environmental assets that are textured individually are either niche, location-based assets (like town signs) or small enough to warrant the textures (like faction banners). Additional details and “greebles” are handled via shared trimsheets, like these:
They’re rendered in high quality, and give a lot of details to work with without sacrificing performance.
Bfrizz, the mad genius behind ReKenshi, wrote a new lossless compression algorithm for Kenshi’s heightmap, reducing its size by ~300MB. The heightmap is stored in memory permanently and the compressed version doesn’t lose any details.
This won’t have an in-engine effect, but will significantly improve loading times for non-SSD users. SSD users may notice a minor benefit.
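To illustrate what “lossless” means here, the sketch below compresses a row of height samples and restores it bit-for-bit. This is not Bfrizz’s algorithm, which isn’t described in this article. It is a generic delta-encoding plus zlib example, and the 16-bit sample format and the terrain data are made up for the demonstration.

```python
import struct
import zlib

def compress_heights(heights):
    """Delta-encode 16-bit height samples, then deflate them with zlib."""
    # Neighbouring heights are usually close, so deltas are small and repetitive.
    deltas = [heights[0]] + [
        (b - a) & 0xFFFF for a, b in zip(heights, heights[1:])
    ]
    raw = struct.pack(f"<{len(deltas)}H", *deltas)
    return zlib.compress(raw, 9)

def decompress_heights(blob):
    """Invert compress_heights exactly."""
    raw = zlib.decompress(blob)
    deltas = struct.unpack(f"<{len(raw) // 2}H", raw)
    heights = [deltas[0]]
    for d in deltas[1:]:
        heights.append((heights[-1] + d) & 0xFFFF)
    return heights

# Synthetic, gently sloping terrain row (made-up data, not Kenshi's heightmap).
row = [1000 + (i * 3) % 50 + i // 4 for i in range(4096)]
blob = compress_heights(row)
assert decompress_heights(blob) == row  # nothing was lost
print(len(row) * 2, "bytes ->", len(blob), "bytes")
```

The round trip returns exactly the original samples, which is the property that matters: the file on disk shrinks while the terrain in memory is unchanged.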
Miscellaneous ReKenshi Hacks
Bfrizz is split between pulling more performance out of Ogre and integrating a scripting API – and we’re happy to wait while he experiments with performance hacks.
One such hack is crunch compression. It’s an additional compression method that can improve loading times. It’s currently in a working state, though it’s still being tested for stability and performance gains. Assuming all goes well, it’ll either be integrated into LitA base or LitA: Hot Potato Edition™ (below).
A lot of small tweaks add up to modest gains in performance. All clothing items use pre-rendered icons instead of relying on a virtual photobox; siege and mass-battle AI is optimised for faster decision-making; and most mass battles are designed around select locations that avoid GPU-heavy particle and weather effects.
The largest, “hero” type assets, while polygon-heavy, are hand-placed at locations where they won’t make an impact on performance. All miscellaneous terrain features and foliage assets are optimised by more than 50% when compared with vanilla, and they’re used more selectively to retain the look and feel while avoiding an over-reliance on them.
The world has a more uniform look, too. Yes, there are all sorts of biomes and environments, from frozen tundras to scorching deserts to mysterious shadowlands, but environmental materials are shared more frequently without sudden, jarring transitions between biomes. The result is lower overhead when transitioning between biomes.
Let’s face it: Some people run Kenshi on a potato.
What does that mean for us?
Well, in addition to all that we’ve talked about, we’re bundling LitA with the LitA: Hot Potato Edition™ addon. It’s an optional addon designed for low-end users that should offer substantial performance gains at the cost of some of the visuals.
This addon may come in different flavours, depending on popular demand. However, the basic edition will do the following:
- Render all materials at half resolution. This cuts their size on disk down by 75% (meaning less stress on VRAM). Plus, rendering these materials at low resolution instead of compressing them in a photo editor yields far better results.
- Crunch compression may be used on top if it yields enough performance benefits (and assuming we don’t bundle it with the base LitA edition).
- Optimise level geometry. Cutting down on superfluous foliage and level props will remove a lot of unnecessary detail. We’ll remove a select number of the props, like furniture and greebles, that are tossed around to liven a place up, while preserving the look and feel of locations and helping with performance.
- Optimise sieges and mass battles. To cut down on the number of NPCs who clog up your GPU during mass battles, Potato Edition™ will reduce the overall NPC count by around 30%. At the same time, the remaining inhabitants of the world will have their abilities scaled up to compensate for the loss.
- Optimise base meshes. A lot of the meshes used for armour, characters, and animals can be optimised by around 30-50% at some fidelity loss. This is a tradeoff that will reduce polycounts by a substantial amount at the cost of some minor clipping and detailing issues.
- Rendering hacks (pending). Kindrad is planning on implementing a select number of hacks to improve Kenshi’s rendering pipeline further. It’s an ongoing process – so the amount of performance gained in initial tests will determine the extent to which the hacks are implemented. The Remastered page will go into more detail when we know more.
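The 75% figure in the first bullet follows from simple arithmetic: halving both the width and the height of a texture leaves a quarter of the pixels. A minimal sketch, assuming size scales directly with pixel count (the function name is just for illustration):

```python
def size_reduction(scale):
    """Fraction of pixels saved when width and height are both
    multiplied by `scale` (0.5 means half resolution per axis)."""
    return 1.0 - scale * scale

print(f"half resolution saves {size_reduction(0.5):.0%}")
```

The same relationship is why small resolution cuts pay off quickly: the saving grows with the square of the scale factor, not linearly.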
How much we push the fidelity-optimisation tradeoff will depend heavily on feedback and playtesting.