1. Unifies the underlying CLX and dun_render blitters.
2. Optimizes them by unrolling loops and using pointer comparison rather
than length comparison (saves a length decrement).
3. In `dun_render`, extracts `RenderLineTransparent/Opaque` branches into
functions via explicit template specialization.
Example RG-99 FPS (non-PGO'd): 17.4->18.4
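
A minimal sketch of the pointer-comparison idea from item 2, not the actual blitter code. `CopyByLength` and `CopyByPointer` are hypothetical names, and the 4x unroll factor is an assumption chosen for illustration:

```cpp
#include <cstddef>
#include <cstdint>

// Length-based loop: every iteration advances both pointers AND decrements `len`.
inline void CopyByLength(std::uint8_t *dst, const std::uint8_t *src, std::size_t len)
{
	while (len-- > 0) {
		*dst++ = *src++;
	}
}

// Pointer-based loop: the end pointers are computed once up front, so the
// loop body only advances `dst` and `src`. The main loop is unrolled 4x
// (the factor is an assumption for this sketch).
inline void CopyByPointer(std::uint8_t *dst, const std::uint8_t *src, std::size_t len)
{
	const std::uint8_t *const end = src + len;
	const std::uint8_t *const unrolledEnd = src + (len & ~std::size_t { 3 });
	while (src != unrolledEnd) {
		dst[0] = src[0];
		dst[1] = src[1];
		dst[2] = src[2];
		dst[3] = src[3];
		dst += 4;
		src += 4;
	}
	// Remaining 0..3 bytes.
	while (src != end) {
		*dst++ = *src++;
	}
}
```

Both functions copy the same bytes; the second one just drops the per-iteration counter update.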
Removes most `FMT_COMPILE` calls.
`FMT_COMPILE` results in better performance but larger code size.
This change removes the `FMT_COMPILE` calls from code that runs infrequently,
i.e. not on every frame.
RG-99 binary size reduced by ~4 KiB.
For `MultiFileLoader` and `LoadMultipleCl2Sheet`, we now open one file
at a time.
This can help on platforms that limit the number of concurrently open
files, such as the PS2.
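
A rough sketch of the one-file-at-a-time pattern using only the standard library. `LoadFilesOneAtATime` is a hypothetical helper, not the actual `MultiFileLoader` interface:

```cpp
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Reads each file fully and closes it before opening the next one, so at
// most one file handle is open at any time. Files that cannot be opened
// yield an empty string (an assumption made for this sketch).
std::vector<std::string> LoadFilesOneAtATime(const std::vector<std::string> &paths)
{
	std::vector<std::string> contents;
	contents.reserve(paths.size());
	for (const std::string &path : paths) {
		std::ifstream file(path, std::ios::binary);
		if (!file) {
			contents.emplace_back();
			continue;
		}
		contents.emplace_back(std::istreambuf_iterator<char>(file), std::istreambuf_iterator<char>());
		// `file` goes out of scope here, which closes the handle.
	}
	return contents;
}
```

The alternative of opening every file first and reading them afterwards needs as many simultaneous handles as there are files.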
`eof()` does not return true until a read attempt has run past the end of the stream.
> Note that the value returned by this function depends on the last operation performed on the stream (and not on the next).
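
A small illustration of this behaviour with `std::istringstream`. Both function names are made up for the example:

```cpp
#include <sstream>
#include <string>
#include <vector>

// Problematic pattern: eof() is checked BEFORE the read. After the last
// value has been read, eof() can still be false, so the loop runs once
// more, the read fails, and a bogus value is appended.
std::vector<int> ReadCheckingEofFirst(const std::string &text)
{
	std::istringstream in(text);
	std::vector<int> values;
	while (!in.eof()) {
		int value = 0;
		in >> value; // may fail; the result is not checked
		values.push_back(value);
	}
	return values;
}

// Safer pattern: use the result of the read itself as the loop condition.
std::vector<int> ReadCheckingReadResult(const std::string &text)
{
	std::istringstream in(text);
	std::vector<int> values;
	int value = 0;
	while (in >> value) {
		values.push_back(value);
	}
	return values;
}
```

For the input `"1 2\n"`, the first function returns three elements (the last one comes from the failed read), while the second returns only `1` and `2`.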
As we recently confirmed, Square and Left/RightTriangle primitives
never use masks other than Transparent and Solid.
Simplify the code to take advantage of that.
We notice that masks can be described by 2 parameters:
1. Whether they have 0 or 1 as their high bits.
2. Whether they shift to the left or to the right on the next line.
Describing masks this way allows us to lift them to template variables and simplify the code.
We also avoid handling the mask in the `RenderLine` loop entirely.
Also fixes a foliage rendering bug: Transparent foliage pixels were previously blended but they should have been simply skipped.
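
To make the two-parameter description concrete, here is a hedged sketch, not the actual `dun_render` code. It assumes 32-bit masks and a shift of 2 bits per line; `NextLineMask`, `kShiftPerLine`, `kLowFill`, and `kHighFill` are all hypothetical names:

```cpp
#include <cstdint>

// Assumption for this sketch: the mask moves by 2 bits per line.
constexpr unsigned kShiftPerLine = 2;
constexpr std::uint32_t kLowFill = (1u << kShiftPerLine) - 1;         // 0x00000003
constexpr std::uint32_t kHighFill = kLowFill << (32 - kShiftPerLine); // 0xC0000000

// HighBitsSet: whether the high bits of the mask are 1 (the low bits are
//              then 0, and vice versa).
// ShiftsLeft:  whether the mask shifts to the left on the next line.
//
// Shifting right fills the vacated high bits with the high-bit value;
// shifting left fills the vacated low bits with the opposite value.
template <bool HighBitsSet, bool ShiftsLeft>
constexpr std::uint32_t NextLineMask(std::uint32_t mask)
{
	return ShiftsLeft
	    ? (mask << kShiftPerLine) | (HighBitsSet ? 0u : kLowFill)
	    : (mask >> kShiftPerLine) | (HighBitsSet ? kHighFill : 0u);
}
```

Because both parameters are template arguments, each of the four combinations compiles to a branch-free shift-and-or, which is the kind of simplification that lifting masks to template variables allows.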
1. Fixes the return value (bytes rendered).
2. Fixes line wrapping / end-of-rendering based on the given rectangle:
   1. Accounts for `BaseLineOffset`.
   2. Fixes an off-by-one error for the y coordinate.
   3. Wraps the cursor when needed.
3. Fixes the chat input box dimensions (the height is 3 * line height).
4. Sets the hint that indicates that we do not render the current
   IME suggestion (`SDL_TEXTEDITING`) ourselves. This tells the IME
   that it should render the suggestion instead.