Done with the following script:
```ruby
Dir["Source/**/*.{h,c,cc,cpp,hpp}"].each do |path|
v = File.read(path)
next if !v.include?("uint32_t") || v.include?("cstdint")
lines = v.lines
line_num = if lines[2].start_with?(" *")
lines.index { |l| l.start_with?(" */") } + 3
else
3
end
lines.insert(line_num, "#include <cstdint>\n")
File.write(path, lines.join(""))
end
```
then fixed-up manually
1. Unifies the underlying CLX and dun_render blitters.
2. Optimizes them by unrolling loops and using pointer comparison rather
than length comparison (saves a length decrement).
3. In `dun_render`, extracts `RenderLineTransparent/Opaque` branches into
functions via explicit template specialization.
Example RG-99 FPS (non-PGO'd): 17.4->18.4
Removes most `FMT_COMPILE` calls.
`FMT_COMPILE` results in better performance but larger code size.
Removes `FMT_COMPILE` calls for places that are called infrequently,
i.e. not on every frame.
RG-99 binary size reduced by ~4 KiB.
As we recently confirmed, Square and Left/RightTriangle primitives
never use masks other than Transparent and Solid.
Simplify the code to take advantage of that.
We notice that masks can be described by 2 parameters:
1. Whether they have 0 or 1 as their high bits.
2. Whether they shift to the left or to the right on the next line.
Describing masks this way allows us to lift them to template variables and simplify the code.
We also avoid handling the mask in the `RenderLine` loop entirely.
Also fixes a foliage rendering bug: Transparent foliage pixels were previously blended but they should have been simply skipped.
1. Fixes the return value (bytes rendered).
2. Fixes line wrapping / end-of-rendering based on the given rectangle:
1. Accounts for `BaseLineOffset`.
2. Fixes an off-by-one error for the y coordinate.
3. Wraps the cursor when needed.
3. Fix chat input box dimensions (height is 3 * line height).
4. Set the hint that indicates that we do not render the current
IME suggestion (SDL_TEXTEDITING). This indicates to IME
that it should render the suggestion instead.
When using `UNPACKED_MPQS`, avoid all the SDL machinery for reading
files.
This is beneficial not only due to reduced indirection but also because
we can test for the file's existence and get the file size without
opening it, which is much faster.