When compiling with clang, the compiler warns about a missing return
statement in the non-void function fmt::format.
This fixes the warning by returning a default error string.
Tricks the compiler into skipping the expensive
uninitialized-variable analysis (`-Wmaybe-uninitialized`) by using
a struct with state rather than separate variables for `best` /
`bestDiff`.
This has no performance impact.
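A minimal sketch of the idea, not the actual code: the `BestMatch` struct and the `findBest` helper are hypothetical names made up for illustration. Keeping the candidate and its distance together in one struct with default member initializers means neither value is ever left uninitialized on any path.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <limits>
#include <vector>

// Hypothetical illustration; these names are not taken from the real code.
struct BestMatch {
	std::uint8_t best = 0;
	int bestDiff = std::numeric_limits<int>::max();

	// Records the candidate if it is closer than anything seen so far.
	void consider(std::uint8_t candidate, int diff)
	{
		if (diff < bestDiff) {
			best = candidate;
			bestDiff = diff;
		}
	}
};

// Returns the index of the value closest to `target` (0 if `values` is empty).
// Only the first 256 values are considered so the index fits in a uint8_t.
inline std::uint8_t findBest(const std::vector<int> &values, int target)
{
	BestMatch match;
	for (std::size_t i = 0; i < values.size() && i < 256; ++i)
		match.consider(static_cast<std::uint8_t>(i), std::abs(values[i] - target));
	return match.best;
}
```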
Also optimizes lookup a bit further and moves some code that
does not need to be inlined to the cpp file.
```
Benchmark Time CPU Time Old Time New CPU Old CPU New
-----------------------------------------------------------------------------------------------------------------------------------
BM_GenerateBlendedLookupTable_pvalue 0.0002 0.0002 U Test, Repetitions: 10 vs 10
BM_GenerateBlendedLookupTable_mean +0.0237 +0.0237 2092090 2141601 2091732 2141291
BM_GenerateBlendedLookupTable_median +0.0237 +0.0237 2092104 2141662 2091669 2141319
BM_GenerateBlendedLookupTable_stddev -0.6414 -0.5834 664 238 538 224
BM_GenerateBlendedLookupTable_cv -0.6497 -0.5930 0 0 0 0
BM_BuildTree_pvalue 0.0002 0.0002 U Test, Repetitions: 10 vs 10
BM_BuildTree_mean +0.0410 +0.0410 4495 4679 4494 4678
BM_BuildTree_median +0.0403 +0.0402 4494 4675 4493 4674
BM_BuildTree_stddev +0.9515 +0.9359 7 14 7 14
BM_BuildTree_cv +0.8746 +0.8596 0 0 0 0
BM_FindNearestNeighbor_pvalue 0.0002 0.0002 U Test, Repetitions: 10 vs 10
BM_FindNearestNeighbor_mean -0.0399 -0.0398 1964257108 1885966812 1963954917 1885694336
BM_FindNearestNeighbor_median -0.0397 -0.0396 1963969748 1886074435 1963650984 1885803182
BM_FindNearestNeighbor_stddev -0.3380 -0.3443 1217360 805946 1225442 803469
BM_FindNearestNeighbor_cv -0.3105 -0.3171 0 0 0 0
OVERALL_GEOMEAN +0.0077 +0.0077 0 0 0 0
```
Gets rid of `orig_palette`. We now always have only 2 palettes:
1. `logical_palette`
This palette has color cycling / swapping applied but no global
effects such as brightness / fade-in.
2. `system_palette`
This palette is the actual palette used for rendering.
It is usually `logical_palette` with the global brightness setting
and fade-in/out applied.
Additionally, we now keep the k-d tree around and use it to
update single colors.
The colors that are color-cycled / swapped are never included
in the k-d tree, so the tree does not need updating on color
cycles/swaps.
Our values are `uint8_t`, so we can get the median somewhat faster
than `nth_element`.
```
Comparing build-reld-master/palette_blending_benchmark to build-reld/palette_blending_benchmark
Benchmark Time CPU Time Old Time New CPU Old CPU New
-----------------------------------------------------------------------------------------------------------------------------------
BM_GenerateBlendedLookupTable_pvalue 0.0002 0.0002 U Test, Repetitions: 10 vs 10
BM_GenerateBlendedLookupTable_mean -0.0198 -0.0198 2272848 2227790 2270291 2225348
BM_GenerateBlendedLookupTable_median -0.0199 -0.0199 2272884 2227649 2270323 2225212
BM_GenerateBlendedLookupTable_stddev +0.4575 +0.6710 536 781 396 661
BM_GenerateBlendedLookupTable_cv +0.4870 +0.7047 0 0 0 0
OVERALL_GEOMEAN -0.0198 -0.0198 0 0 0 0
```
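One way to get a median of `uint8_t` values without `nth_element` is a 256-bucket counting pass. The sketch below assumes that approach; the function name `medianUint8` is hypothetical and the real code may do this differently.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns the value that would sit at index size / 2 after sorting, which is
// the element std::nth_element(begin, begin + size / 2, end) would place there.
// Returns 0 for empty input.
inline std::uint8_t medianUint8(const std::vector<std::uint8_t> &values)
{
	// One counter per possible byte value.
	std::array<std::size_t, 256> counts {};
	for (const std::uint8_t v : values)
		++counts[v];

	// Walk the buckets in ascending order until we pass the middle position.
	const std::size_t target = values.size() / 2;
	std::size_t seen = 0;
	for (std::size_t v = 0; v < counts.size(); ++v) {
		seen += counts[v];
		if (seen > target)
			return static_cast<std::uint8_t>(v);
	}
	return 0; // Only reached when `values` is empty.
}
```

This is a single linear pass over the input plus a fixed scan of 256 counters, and it does not reorder the input.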
1. Achieves near-perfect balancing on non-pathological data
by using a per-node pivot and calculating.
2. Increases the depth to 5 levels, which seems to be the
sweet spot.
New benchmark vs baseline:
```
Comparing build-reld-palette-cleanup/palette_blending_benchmark to build-reld/palette_blending_benchmark
Benchmark Time CPU Time Old Time New CPU Old CPU New
-----------------------------------------------------------------------------------------------------------------------------------
BM_GenerateBlendedLookupTable_pvalue 0.0002 0.0002 U Test, Repetitions: 10 vs 10
BM_GenerateBlendedLookupTable_mean -0.8768 -0.8767 18432956 2270752 18417846 2270141
BM_GenerateBlendedLookupTable_median -0.8767 -0.8767 18421978 2270802 18417838 2270167
BM_GenerateBlendedLookupTable_stddev -0.9860 -0.8051 33641 473 1222 238
BM_GenerateBlendedLookupTable_cv -0.8860 +0.5815 0 0 0 0
OVERALL_GEOMEAN -0.8768 -0.8767 0 0 0 0
```
Uses a k-d tree to quickly find the best match
for a color when generating the palette blending
lookup table.
https://en.wikipedia.org/wiki/K-d_tree
This is about 3.5x faster than the previous naive approach:
```
Benchmark Time CPU Time Old Time New CPU Old CPU New
-----------------------------------------------------------------------------------------------------------------------------------
BM_GenerateBlendedLookupTable_pvalue 0.0002 0.0002 U Test, Repetitions: 10 vs 10
BM_GenerateBlendedLookupTable_mean -0.7153 -0.7153 18402641 5239051 18399111 5238025
BM_GenerateBlendedLookupTable_median -0.7153 -0.7153 18403261 5239042 18398841 5237497
BM_GenerateBlendedLookupTable_stddev -0.2775 +0.3858 2257 1631 1347 1867
BM_GenerateBlendedLookupTable_cv +1.5379 +3.8677 0 0 0 0
OVERALL_GEOMEAN -0.7153 -0.7153 0 0 0 0
```
The distribution is somewhat poor with just 3 levels, so this can be improved further.
For example, here is the leaf size distribution in the cathedral:
```
r0.g0.b0: 88
r0.g0.b1: 10
r0.g1.b0: 2
r0.g1.b1: 32
r1.g0.b0: 27
r1.g0.b1: 4
r1.g1.b0: 12
r1.g1.b1: 81
```
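For readers unfamiliar with the technique, here is a rough sketch of nearest-color lookup with a k-d tree. It is a textbook pointer-based tree that stores one color per node, not the fixed-depth layout with leaf buckets that the numbers above refer to. All names (`PaletteEntry`, `KdNode`, `build`, `findNearestIndex`) are hypothetical, and it needs C++14 for `std::make_unique`.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <memory>
#include <vector>

using Color = std::array<std::uint8_t, 3>; // r, g, b

struct PaletteEntry {
	Color color;
	std::uint8_t index; // position in the palette
};

inline int squaredDistance(const Color &a, const Color &b)
{
	int sum = 0;
	for (std::size_t i = 0; i < 3; ++i) {
		const int d = static_cast<int>(a[i]) - static_cast<int>(b[i]);
		sum += d * d;
	}
	return sum;
}

struct KdNode {
	PaletteEntry entry;
	std::unique_ptr<KdNode> left, right;
};

using EntryIter = std::vector<PaletteEntry>::iterator;

// Builds a tree over [begin, end), splitting on r, g, b in turn at the median.
// Reorders the entries in the given range.
inline std::unique_ptr<KdNode> build(EntryIter begin, EntryIter end, std::size_t axis = 0)
{
	if (begin == end)
		return nullptr;
	const EntryIter mid = begin + (end - begin) / 2;
	std::nth_element(begin, mid, end, [axis](const PaletteEntry &a, const PaletteEntry &b) {
		return a.color[axis] < b.color[axis];
	});
	auto node = std::make_unique<KdNode>();
	node->entry = *mid;
	node->left = build(begin, mid, (axis + 1) % 3);
	node->right = build(mid + 1, end, (axis + 1) % 3);
	return node;
}

inline void nearest(const KdNode *node, const Color &target, std::size_t axis,
    const PaletteEntry *&best, int &bestDist)
{
	if (node == nullptr)
		return;
	const int dist = squaredDistance(node->entry.color, target);
	if (dist < bestDist) {
		bestDist = dist;
		best = &node->entry;
	}
	const int delta = static_cast<int>(target[axis]) - static_cast<int>(node->entry.color[axis]);
	const KdNode *nearSide = delta < 0 ? node->left.get() : node->right.get();
	const KdNode *farSide = delta < 0 ? node->right.get() : node->left.get();
	nearest(nearSide, target, (axis + 1) % 3, best, bestDist);
	// Cross the splitting plane only if a closer color could lie beyond it.
	if (delta * delta < bestDist)
		nearest(farSide, target, (axis + 1) % 3, best, bestDist);
}

// Returns the palette index of the closest color (0 if the tree is empty).
inline std::uint8_t findNearestIndex(const KdNode *root, const Color &target)
{
	const PaletteEntry *best = nullptr;
	int bestDist = std::numeric_limits<int>::max();
	nearest(root, target, 0, best, bestDist);
	return best != nullptr ? best->index : 0;
}
```

The speedup over a linear scan comes from the pruning test in `nearest`: a whole subtree is skipped whenever the splitting plane is already farther away than the best match found so far. A lopsided tree prunes less, which is why the leaf size distribution matters.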
The previous implementation didn't behave quite like A* is supposed to.
After trying to figure out what's causing it and giving up,
I've reimplemented it in a straightforward manner.
Now it seems to work a lot better.
Also increases maximum player path length to 100 steps.
We still only store the first 25 steps in the save file for vanilla
compatibility.
In C++, the initialization order of globals across translation units is
not defined. Accessing a global via a function ensures that it is initialized.
This will be needed for #7638, which will statically initialize change
handlers after the Options object has been initialized.
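A common way to get this guarantee is a function-local static, sketched below. The `Options` struct here is a simplified stand-in, and `GetOptions` and `ChangeHandlerRegistration` are hypothetical names used only for illustration, not necessarily what the code base uses.

```cpp
#include <string>

// Simplified stand-in for the real options type.
struct Options {
	std::string language = "en";
	int gamma = 100;
};

// A function-local static is initialized the first time control passes through
// its declaration (thread-safe since C++11). Any caller, including the
// initializer of a global in another translation unit, therefore always sees
// a fully constructed object.
inline Options &GetOptions()
{
	static Options options;
	return options;
}

// A global whose constructor depends on the options. If `Options` were a plain
// global defined in another translation unit, it might not be constructed yet
// when this constructor runs.
struct ChangeHandlerRegistration {
	int gammaAtRegistration;

	ChangeHandlerRegistration()
	    : gammaAtRegistration(GetOptions().gamma)
	{
	}
};

static const ChangeHandlerRegistration registration;
```

A single file cannot show the cross-translation-unit hazard itself. The point of the sketch is only that `registration` is safe to construct wherever it is defined, because `GetOptions()` constructs the object on first use.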
1. Moves more assets-related stuff from `init` to `engine/assets`.
2. Removes `SDL_audiolib` dependency from `soundsample.h`.
3. Cleans up some unused/missing includes.