Commit Graph

2309 Commits

Author SHA1 Message Date
nmlgc c6e2955e06 [Maintenance] Declare missing entity-specific flags in ASM land
Part of P0236, funded by Yanga.
2023-03-29 12:36:55 +02:00
nmlgc b9402be979 [Maintenance] Simplify two-state entity flags
Changing the type to `bool` highlights that these have only two states.
No `switch` required.

Part of P0236, funded by Yanga.
2023-03-29 12:36:55 +02:00
nmlgc 679553fb4d [Decompilation] [th02] Stage tiles: Resetting an incomplete bunch of state
And that's all we can do for now. Almost done with stage tiles!

Part of P0236, funded by Yanga.
2023-03-29 12:36:55 +02:00
nmlgc caa4b6747b [Decompilation] [th02] Stage tiles: .MAP file loading
Part of P0236, funded by Yanga.
2023-03-29 12:36:55 +02:00
nmlgc 366cd1bf95 [Decompilation] [th02] Stage tiles: VRAM source area initialization
So silly.

Part of P0236, funded by Yanga.
2023-03-29 12:36:55 +02:00
nmlgc 1778006684 [Decompilation] [th02] Stage tiles: Low-level EGC-accelerated blitting
Batched this time around!

Part of P0236, funded by Yanga.
2023-03-29 12:36:55 +02:00
nmlgc fdee90a499 [Decompilation] [th02] Stage tiles: Regular scroll and render function
Part of P0236, funded by Yanga.
2023-03-29 12:36:55 +02:00
nmlgc 29cde9b5de [Decompilation] [th02] Stage tiles: Initial screen of a stage
Part of P0236, funded by Yanga.
2023-03-29 12:36:55 +02:00
nmlgc 6acb385a5a [Decompilation] [th02] Stage tiles: Rectangular invalidation
Part of P0236, funded by Yanga.
2023-03-29 12:36:55 +02:00
nmlgc 62c4b7f1c2 [Decompilation] [th02] Stage tiles: Regular rendering
Completes P0235, funded by Ember2528.
2023-03-29 12:36:55 +02:00
nmlgc 59564b3ecf [Decompilation] [th02] Stage tiles: Re-rendering the entire playfield
Part of P0235, funded by Ember2528.
2023-03-29 12:36:55 +02:00
nmlgc 9c95ddace3 [Decompilation] [th02] Stage tiles: Setting and blitting single tiles
Used e.g. for the track marks that the Stage 1 midboss leaves on the
bridge. Nice detail… if it weren't for the wrong rendering order 🎺

Part of P0235, funded by Ember2528.
2023-03-29 12:36:55 +02:00
nmlgc e1ccf2d67d [Decompilation] [th02] Stage tiles: EGC-accelerated single-tile blitting
From the tile source area to the right of the HUD, obviously. So glad
that vram_offset_shift_fast() is possible without inline ASM, we're
definitely going to see that one more often.

Part of P0235, funded by Ember2528.
2023-03-29 12:36:55 +02:00
nmlgc a51910c544 [Naming] [th02] Stage tiles: Public .MPN functions
Part of P0235, funded by Ember2528.
2023-03-29 12:36:55 +02:00
nmlgc ca0a076e92 [Naming] [th02] Stage tiles: Rendering helper functions
Part of P0235, funded by Ember2528.
2023-03-29 12:36:55 +02:00
nmlgc 8ed9343116 [Reverse-engineering] [th02] Stage tiles: Map sections and the .MAP format
8 rows per section, rather than the 5 that TH04 and TH05 would use.

Part of P0235, funded by Ember2528.
2023-03-29 12:36:54 +02:00
nmlgc 7cbaf4a123 [Reverse-engineering] [th02] Stage tiles: Public state
Part of P0235, funded by Ember2528.
2023-03-29 12:36:54 +02:00
nmlgc 54e72475c6 [Reverse-engineering] [th02] Stage tiles: Internal state
Part of P0235, funded by Ember2528.
2023-03-29 12:36:54 +02:00
nmlgc 0a71be2188 [Reverse-engineering] [th02/th04/th05] Stage tiles: VRAM source area offsets
Part of P0235, funded by Ember2528.
2023-03-29 12:36:54 +02:00
nmlgc a5a1967367 [Reverse-engineering] [th02] Scrolling: Speed and interval
Part of P0235, funded by Ember2528.
2023-03-29 12:36:54 +02:00
nmlgc 172a6d2b2b [Maintenance] Add a macro for alternate-instruction VRAM offset calculations
Part of P0235, funded by Ember2528.
2023-03-29 12:36:54 +02:00
nmlgc f062e4b84e [Decompilation] Add a macro for faster VRAM offset calculation with bit shifts
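
As a rough illustration of what such a bit-shift calculation can look like (not the macro from this commit): assuming the usual PC-98 layout of 640 dots per row at 8 dots per VRAM byte, a row is 80 bytes wide, and since 80 = 64 + 16, the multiplication can be replaced with two shifts and an addition. All names below are hypothetical.

```cpp
#include <cstdint>

// Assumed PC-98 graphics geometry: 640 dots per row, 8 dots per VRAM byte.
const unsigned int ROW_SIZE = (640 / 8); // 80 bytes per row

// Straightforward version: one multiplication per call.
inline uint16_t vram_offset_mul_sketch(unsigned int x, unsigned int y) {
	return static_cast<uint16_t>((y * ROW_SIZE) + (x >> 3));
}

// Shift-only version: 80 = 64 + 16, so (y * 80) == (y << 6) + (y << 4).
inline uint16_t vram_offset_shift_sketch(unsigned int x, unsigned int y) {
	return static_cast<uint16_t>((y << 6) + (y << 4) + (x >> 3));
}
```

Both functions return the same byte offset for every on-screen coordinate; which one compiles to better code depends on the compiler and the target CPU.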
Part of P0235, funded by Ember2528.
2023-03-25 21:28:15 +01:00
nmlgc 0e103c6f88 [Maintenance] Add a VRAM offset roll macro
Part of P0235, funded by Ember2528.
2023-03-25 21:24:57 +01:00
nmlgc 6a92122432 [Maintenance] [th02] Remove invalid Shift-JIS code point in `th02_main.asm`
Would otherwise drive me nuts until I get to decompiling Meira.

Part of P0235, funded by Ember2528.
2023-03-25 21:24:47 +01:00
nmlgc e7a9262f50 [Contributing] Remove trailing commas from `public` in a pre-commit hook
Starting more or less simple with a shell script calling `sed`, which
should work anywhere Git is used.
2023-03-20 01:55:06 +01:00
nmlgc 0685bd0885 [Maintenance] [th03] Remove trailing comma from `public` directive
These break the build on TASM32 version 5.0… which means that
th03_main.asm didn't assemble on that version ever since 3072208.
Thanks to mu021 for telling me about the resulting crash!
2023-03-20 00:29:02 +01:00
nmlgc 3afb73eada [Reverse-engineering] [th01] Pellets: Document missing resets for delay clouds
The exact reason why pellets can be carried over from Sariel's first
form to her second… and why you probably shouldn't carelessly use that
redundant count of alive pellets to skip loops in the Anniversary
Edition, because that count will be incorrect after a reset.
Thanks to mu021 for reporting this issue!
2023-03-14 00:18:23 +01:00
nmlgc 6370f96d9a [Reverse-engineering] [th01] Pellets: Correctly document interlacing
It *does* affect collision detection after all, making removal for the
Anniversary Edition slightly more tricky.

Part of P0234, funded by Ember2528.
2023-03-04 19:40:55 +01:00
nmlgc 6713b1bf9c [Maintenance] [th01] 16× TRAM letters: Fix macro copy-pasting accidents
Part of P0234, funded by Ember2528.
2023-03-04 19:40:55 +01:00
nmlgc dbc5b511ba [Research] BLITPERF: Allow the GRCG sprite color to be customized
spaztron64 suggested that the GRCG might not always write all 4 planes
if one of the tile registers is 0. By changing the color and comparing
the results of this benchmark, we can prove that real hardware has no
such optimization.

Completes P0233, funded by [Anonymous].
2023-03-04 19:40:55 +01:00
nmlgc 12b8bd550d [Research] BLITPERF: Remove seed customization
There's little point to it, and it frees up a letter for…

Part of P0233, funded by [Anonymous].
2023-03-04 19:40:55 +01:00
nmlgc 4837ec7ec9 [Research] Benchmark various sprite blitting approaches
Running this on various PC-98 models confirms that unchecked blitting
(i.e., what you would intuitively consider to be the best method) is in
fact faster than checking either byte of a 16-pixel-wide sprite
beforehand, and has been throughout the PC-98's lifespan. For optimal
performance on the 286 and 386, we might want to use MOVS instead of
MOV, but even that difference is way too small to truly matter.
Also, nice to see that our blitter outperforms a naive pure C
implementation by 2-4×, depending on the model. And master.lib is not
*that* much faster…

The gaiji in `Research/blitperf.bmp` were taken from the Unifont
version 15.0.01 glyphs for:

	• U+2022 BULLET •
	• U+23F1 STOPWATCH ⏱
	• U+1F40C SNAIL 🐌

Part of P0233, funded by [Anonymous].
2023-03-04 19:40:55 +01:00
nmlgc 6514a6b9a5 [Research] Set optimized code generation flags
Taken from the `debloated` branch.

Part of P0233, funded by [Anonymous].
2023-03-04 19:40:55 +01:00
nmlgc f8a774e1dc [Research] Get HOLDKEY.EXE compiling again
And finally compile this directory during the regular build process to
keep this from happening…

Part of P0233, funded by [Anonymous].
2023-03-04 19:40:55 +01:00
nmlgc aa0aad8141 [Platform] [PC-98] Generic byte-aligned sprite blitter
The fact that every sprite format comes with its own blitter is one of
the major sources of bloat in PC-98 Touhou, and of TH01 in particular.
So how about writing a single decently optimized blitter, and calling
into that from the entire game?

Especially because generating distinct blitting functions for every
width is a much better use of all that memory: It eliminates horizontal
loops, and ensures that we use the optimal MOV variant for each sprite
size. Removing any checks for empty bytes (which will turn out to never
have been a good idea for any PC-98 model ever) and unrolling the main
blitting loop using Duff's Device already gets us something that,
depending on the PC-98 model, is easily 2-4× faster than the typical
naive C implementation you'd find in TH01. With master.lib not being
*that* much faster…

Making more use of C++ templates would have been fancy, but horizontal
sprite clipping can change the blit width depending on runtime values.
So, we're back to X macro code generation after all.
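
The combination described above can be sketched roughly as follows. This is not the blitter from this commit, only a minimal illustration under assumptions: an X macro list generates one function per byte width, each function unrolls its row loop 4× with Duff's Device, and a lookup picks the function for a width that is only known at runtime. `ROW_SIZE`, the width list, and every function name are hypothetical, and a plain byte buffer stands in for VRAM.

```cpp
#include <cstdint>
#include <cstring>

const unsigned int ROW_SIZE = 80; // assumed bytes per destination row

// X macro: the list of supported sprite widths, in bytes.
#define FOREACH_WIDTH(X) X(1) X(2) X(4)

// Copies one row of W bytes. W is a compile-time constant, so the compiler
// can pick a fixed-size move instead of a horizontal loop.
#define BLIT_ROW(W) \
	std::memcpy(dst, src, W); \
	dst += ROW_SIZE; \
	src += W;

// Generates blit_<W>(): copies `h` rows, unrolled 4x via Duff's Device.
// The switch jumps into the middle of the loop to handle (h % 4) first.
#define DEFINE_BLITTER(W) \
	void blit_##W(uint8_t *dst, const uint8_t *src, unsigned int h) { \
		if(h == 0) { \
			return; \
		} \
		unsigned int n = ((h + 3) / 4); \
		switch(h % 4) { \
		case 0: do { BLIT_ROW(W) \
		case 3:      BLIT_ROW(W) \
		case 2:      BLIT_ROW(W) \
		case 1:      BLIT_ROW(W) \
		        } while(--n > 0); \
		} \
	}
FOREACH_WIDTH(DEFINE_BLITTER)

typedef void (*blit_func_t)(uint8_t *, const uint8_t *, unsigned int);

// Second expansion of the same list: a width-to-function table.
#define BLITTER_ENTRY(W) { W, blit_##W },
static const struct {
	unsigned int w;
	blit_func_t func;
} BLITTERS[] = { FOREACH_WIDTH(BLITTER_ENTRY) };

// Runtime dispatch, e.g. after clipping has reduced the visible width.
// Returns false if no blitter was generated for `w`.
bool blit_sketch(
	uint8_t *dst, const uint8_t *src, unsigned int w, unsigned int h
) {
	for(unsigned int i = 0; i < (sizeof(BLITTERS) / sizeof(BLITTERS[0])); i++) {
		if(BLITTERS[i].w == w) {
			BLITTERS[i].func(dst, src, h);
			return true;
		}
	}
	return false;
}
```

Because the width is a runtime value after clipping, a template parameter alone would not be enough; the table built from the same X macro list is one way to bridge that gap.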

Part of P0233, funded by [Anonymous].
2023-03-04 19:40:55 +01:00
nmlgc abeaf851a4 [Platform] [PC-98] EGC rectangle copies
Yup, unaligned! The prefilling case is quite broken on T98-Next, but
given that this emulator hasn't seen any development since 2010 and
every other emulator gets it right, we can reasonably assume that to be
a bug in that emulator.

Completes P0232, funded by [Anonymous].
2023-03-04 19:40:55 +01:00
nmlgc afa6253683 [Platform] [PC-98] GRCG tile and color wrappers
Choosing C++ RAII wrappers because there's at least one case where ZUN
misplaced a manual grcg_off(). This implementation combines safety with
the optimal instructions for both dynamic and static use cases.
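
To illustrate why RAII helps here, a minimal sketch (not the wrapper from this commit): the destructor turns the GRCG off on every exit path, so an early `return` can no longer skip it. The port numbers (0x7C for the mode register, 0x7E for the tile registers) and mode values are the commonly documented ones and should be treated as assumptions; `out_sketch()` merely logs the writes so that the example runs on any machine, and all names are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Stand-in for the x86 `OUT` instruction: records (port, value) pairs.
std::vector<std::pair<uint16_t, uint8_t> > port_log;

void out_sketch(uint16_t port, uint8_t value) {
	port_log.push_back(std::make_pair(port, value));
}

// Assumed GRCG ports and mode values.
const uint16_t GRCG_PORT_MODE = 0x7C;
const uint16_t GRCG_PORT_TILE = 0x7E;
const uint8_t GRCG_MODE_OFF = 0x00;
const uint8_t GRCG_MODE_RMW = 0xC0;

class GrcgSketch {
public:
	// Turns the GRCG on in the given mode.
	explicit GrcgSketch(uint8_t mode) {
		out_sketch(GRCG_PORT_MODE, mode);
	}

	// Turns the GRCG off again, no matter how the scope is left.
	~GrcgSketch() {
		out_sketch(GRCG_PORT_MODE, GRCG_MODE_OFF);
	}

	// Writes one tile register per bitplane: all bits set if the plane's
	// bit is set in the 4-bit color, all bits clear otherwise.
	void set_color(uint8_t col) {
		for(unsigned int plane = 0; plane < 4; plane++) {
			out_sketch(GRCG_PORT_TILE, ((col & (1 << plane)) ? 0xFF : 0x00));
		}
	}

	// Copying would cause a second, premature "off" write.
	GrcgSketch(const GrcgSketch &) = delete;
	GrcgSketch &operator=(const GrcgSketch &) = delete;
};

void draw_sketch(bool early_out) {
	GrcgSketch grcg(GRCG_MODE_RMW);
	grcg.set_color(7);
	if(early_out) {
		return; // the destructor still writes GRCG_MODE_OFF
	}
	// ... VRAM writes would go here ...
}
```

Calling `draw_sketch(true)` and `draw_sketch(false)` produces the same sequence of port writes, ending with the mode register being reset.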

Part of P0232, funded by [Anonymous].
2023-02-28 08:08:17 +01:00
nmlgc d15bb0c4fa [Platform] [PC-98] Graphics GDC initialization
I've copy-pasted this snippet so many times, it's time it gets a proper
home.

Part of P0232, funded by [Anonymous].
2023-02-28 08:08:17 +01:00
nmlgc 03519d2af8 [Platform] [PC-98] Font ROM glyph retrieval
Lol @ getting the glyph header field order wrong in 2021…

Part of P0232, funded by [Anonymous].
2023-02-28 08:08:17 +01:00
nmlgc e5d7c9489c [Platform] [PC-98] Gaiji upload
Will come in handy for various research programs… 👀

Part of P0232, funded by [Anonymous].
2023-02-28 08:08:17 +01:00
nmlgc d22c1e6db3 [Platform] [PC-98] Hardware palette setters
Optimally, these are called *at most* once per frame. No need to
micro-optimize here.

Part of P0232, funded by [Anonymous].
2023-02-28 08:08:17 +01:00
nmlgc f1108b5548 [Platform] [PC-98] VSync: Retrigger the VSync interrupt after INT 18h
Well, that didn't take long. Unlike *debugging* this issue after you
encounter it on real hardware…

Part of P0232, funded by [Anonymous].
2023-02-28 08:08:17 +01:00
nmlgc df0672762b [Platform] [PC-98] VSync interrupt handler
Starting with the simple refresh rate-oblivious code from TH01, until
we've figured out what the rest of the master.lib code is doing and
have valid reasons to include it. Also extending the second counter to
32-bit because we *might* be measuring some processes that could take
longer than 19:35 minutes…

Part of P0232, funded by [Anonymous].
2023-02-28 08:08:17 +01:00
nmlgc 82f27f3771 [Platform] [PC-98] Page flipping
Inline functions wouldn't generate optimal code in some cases.

Part of P0232, funded by [Anonymous].
2023-02-28 08:08:17 +01:00
nmlgc 4548b874d6 [Platform] [PC-98] Font ROM glyph types
Moving the code from TH01 to a new platform layer, and deciding against
the `pc98_` prefix, which is sort of implied by the directory of the
header file it came from. Namespaces would be ideal, but Turbo C++ 4.0J
sadly doesn't support them.

Part of P0232, funded by [Anonymous].
2023-02-28 08:08:17 +01:00
nmlgc 49a6834c4a [Maintenance] Move OUT for 8-bit port numbers to x86real.h
We'd like to use this optimization in the platform layer as well.
Turning it into an inline function via __emit__() also allows us to
turn a bunch of other macros into proper inline functions.

Part of P0232, funded by [Anonymous].
2023-02-28 08:08:17 +01:00
nmlgc e52052abf2 [Maintenance] Introduce EGC register bit count and mask constants
Part of P0232, funded by [Anonymous].
2023-02-28 08:08:17 +01:00
nmlgc bff375d99d [Maintenance] Introduce a VRAM byte bit constant
log₂(BYTE_DOTS), just like SUBPIXEL_FACTOR and SUBPIXEL_BITS.

Part of P0232, funded by [Anonymous].
2023-02-28 08:08:17 +01:00
nmlgc 055cc20f79 [Platform] [x86 Real Mode] CPU flag macros
Over on the `debloated` branch, we're going to use them in our own
platform-specific code, which obviously is not decompiled from
anything.

Part of P0232, funded by [Anonymous].
2023-02-28 08:08:16 +01:00
nmlgc f16144a131 [Naming] Rename peek() and poke() macros to look more intrinsic
And correctly move them to a separate part of x86real.h, as they are
not part of Turbo C++ 4.0J's DOS.H.

Part of P0232, funded by [Anonymous].
2023-02-28 08:08:16 +01:00