Commit Graph

887 Commits

Author SHA1 Message Date
nmlgc e0d90dbdc3 [C decompilation] [th01] Text mode functions
Yet another set of questionable C reimplementations of master.lib functions to
waste my time. And half of them, including z_text_(v)putsa, aren't even called
anywhere.
2015-03-11 23:29:58 +01:00
nmlgc 6d2fa9f077 [C decompilation] [th01/reiiden] Randomly shaped VRAM copy functions, #1
So apparently, TH01 isn't double-buffered in the usual sense, and instead uses
the second hardware framebuffer (page 1) exclusively to keep the background
image and any non-animated sprites, including the cards. Then, in order to
limit flickering when animating the bullet, character and boss sprites on top
of that (or just to the limit number of VRAM accesses, who knows), ZUN goes to
great lengths and tries to make sure to only copy back the pixels that were
modified on plane 0 in the last frame.

(Which doesn't work that well though. When you play the game, you still notice
tons of flickering whenever sprites overlap.)

And by "great lengths", I mean "having a separate counterpart function for
each shape and sprite animated which recalculates and copies back the same
pixels from plane 1 to plane 0", because that's what the new functions here
lead me to believe. Both of them are only called at one place: the wave
function on the second half of Elis' entrance animation, and the horizontal
masked line function for Reimu's X attack animations.
2015-03-10 17:39:00 +01:00
nmlgc 519e24c459 Rename the *_copy_region_* functions to *_copy_rect_*
TH01 copies a lot of different shapes from plane 1 to 0, so "region" feels
awfully unspecific.
2015-03-10 14:18:28 +01:00
nmlgc 44ad3eb4bc [C decompilation] [th01/fuuin] Slow 2x VRAM region scaling
This function raises one of those essential questions about the eventual ports
we'd like to do. I'll explain everything more thoroughly here, since people
who might complain about the ports not being faithful enough need to
understand this.

----

The original plan was aim for "100% frame-perfect" ports and advertise them as
such. However, the PC-98 is not a console with fixed specs. As the name
implies, it's a computer architecture, and a plethora of different, more and
more powerful PC-98 models were released during its lifespan. Even if we only
consider the subset of products that fulfills the minimum requirements to run
the PC-98 Touhou games, that's still a sizable number of systems.

Therefore, the only true definition of a *frame* can be "everything that is
drawn between two Vsync wait calls". Such a *frame* may contain certain
expensive function calls, and certain systems may run these functions slower
than the developer expected, thus effectively leading to more *frames* than
the developer explicitly specified.

This is one of those functions.

Here, we have a scaling function that appears to be written deliberately to
run very slow, which ends up creating the rolling effect you see in the route
selection and the high score and continue screens of TH01. However, that
doesn't change the fact that the function is still CPU-bound, and neither
waits for Vsync nor is iteratively called by something that does. The faster
your CPU, the faster the rolling effect gets… until ultimately, it's faster
than one frame and therefore vanishes altogether. Mind you, this is true on
both emulators and real hardware. The final PC-98 model, the Ra43, had a CPU
clocked at 433 Mhz, and it may have even been instant there.
If you use more optimized algorithm, it also runs faster on the same CPU (I
tried this, and it worked beautifully)… you get the idea.

Still, it may very well be that this algorithm was not a deliberate choice and
simply resulted from a lack of experience, especially since this was ZUN's
first game.

That leaves us with two approaches to porting functions like these:

1) Look at the recommended system requirements ZUN specified, configure the
   PC-98 emulator accordingly, measure how much of the work is done in each
   frame, then rewrite the function to be bound to that specific frame rate…
2) …or just continue using a CPU-bound algorithm, which will pretty much
   complete instantly on any modern system.

I'd argue that 2) is actually the more "faithful" approach. It will run faster
than the typical clock speeds people emulate the games at, and maybe draw a
bit of criticism because of that, but it seems a lot more rational than the
approximation provided by 1). Not to mention that it's undeniably easier to
implement, and hey, a faster game feels a lot better than a slower one, right?

… Oh well, maybe we'll still encounter some kind of CPU-bound animation that
is so essential to the experience that we do want to lock it to a certain
frame rate…
2015-03-09 17:58:30 +01:00
nmlgc 160d4eb69f [C decompilation] [th01/op] [th01/reiiden] Random resident structure stuff 2015-03-07 17:43:39 +01:00
nmlgc 3b175c8980 [Reverse-engineering] [th01] ReiidenConfig structure 2015-03-06 23:04:25 +01:00
nmlgc 0fd7f14784 [C decompilation] [th01/op] Archive functions
Fuck TH02 and above and their bizarre assembly code that indeed appears to be,
uh, playfully "optimized" in the most inadequate of places, far away from the
innermost loop. It's ALWAYS just these one or two instructions I just can't
fucking get out of the C compiler, which lead to the conclusion that these
functions must have either been first compiled to assembly, then "fine-tuned"
and then linked into the executable…

… or I'm really just missing some obscure compiler setting.

At least with TH01, you can tell that the source language must have undeniably
been C++, and the decompilation is a breeze.
2015-03-05 23:12:14 +01:00
nmlgc a3ae0095f0 [C decompilation] [th02] PI display 2015-03-04 04:28:16 +01:00
nmlgc ed0437f80e [C decompilation] [th02] First set of sound driver calls 2015-03-04 02:47:22 +01:00
nmlgc 404044f32b [C decompilation] [th02/op] [th03/op] [th04/op] Frame delay #1 2015-03-04 02:47:16 +01:00
nmlgc a8384c925f [C decompilation] [th02/maine] HUUMA.CFG loading 2015-03-03 07:40:29 +01:00
nmlgc 2ccad4f5a4 Centrally include master.h in ReC98.h 2015-03-03 06:47:23 +01:00
nmlgc 63299cdf42 [C decompilation] [th02/op] High score screen 2015-03-03 04:25:19 +01:00
nmlgc 87b1fb9e14 [C decompilation] [th02/maine] High score screen
MAIN.EXE shares most of the code in this segment, but I can't remove it from
there right now due to the weird ordering of the data segments in that
executable…

And yes, once again, those three seemingly random type casts in here are
*necessary* to build a bit-perfect binary.
2015-03-02 06:30:06 +01:00
nmlgc 37fc899c42 Add some useful increment and decrement macros
Which we'd really like to have for the highscore entering screen.
2015-03-01 22:52:25 +01:00
nmlgc d058666929 [C decompilation] [th02/maine] Rotating rectangle animation
Small detour into MAINE.EXE because it has all the juicy algorithms that will
explain the remaining unknown members of the highscore data structure, and
there's this one code segment here we need to get out of the way first.
2015-02-28 22:37:40 +01:00
nmlgc 2f1b287f3d [C decompilation] [th01] VRAM region copy via EGC
The same function appears unused in TH02's MAINE.EXE. Separate commit because
this was painful enough and we can link the C version into FUUIN.EXE right
now.
2015-02-27 23:11:47 +01:00
nmlgc 1f514b5a6c [C decompilation] [th02/op] Shot type selection
Oh, OK, so this is what the PC-98 GRCG is all about. You call grcg_setcolor(),
and that puts the PC-98 hardware in some sort of "monochromatic mode". Then,
you just write your pixels into any *single* one of the 4 VRAM bitplanes. This
causes the hardware to automatically write to *all* bitplanes in such a way
that the final palette index for each of the 8, 16, or 32 pixels you just wrote
a 1 value to will actually end up to match the color you set earlier.

Don't forget to call grcg_off() at the end though, or you can't draw any
non-monochromatic graphics, heh.
2015-02-25 23:05:20 +01:00
nmlgc cd33367b51 [C decompilation] [th02/op] Music Room
Yes, all of it. Including the bouncing polygons, of course. And since it's
placed at the end of ZUN's code inside the executable, the code's already
position-independent and fully hackable.
2015-02-24 22:38:44 +01:00
nmlgc a7235304ed Make the VRAM plane constants available to C 2015-02-24 22:16:31 +01:00
nmlgc ad9d6f97eb [Reverse-engineering] [th02] MIKOConfig structure 2015-02-23 23:48:03 +01:00
nmlgc 22332a71fa Make all sound functions and variables available to C 2015-02-23 18:28:38 +01:00
nmlgc 46eb3792cf Move frame_delay into the hardware/ subdirectory 2015-02-23 10:29:12 +01:00
nmlgc f0be7dadf4 [Reverse-engineering] [th02] Keyboard input
Don't really understand the other games yet because they start introducing
joystick support and TH03 has multiplayer and then there are these master.lib
modifications that don't really make any sense to me, especially when you add
that TH04 seemingly does not read js_stat *at all*, yet still works just fine
with a gamepad and... urgh.
2015-02-22 22:33:07 +01:00
nmlgc 436f1c5722 [C decompilation] [th01] MDRV2 calls
Still missing mdrv2_resident() though, which we currently can't slot in there
due to that string constant constructor syntax. :/
2015-02-21 20:48:58 +01:00
nmlgc ed8d0e28f5 [C decompilation] [th02/op] Title screen flashing animation 2015-02-21 14:16:27 +01:00
nmlgc 6d8ff6b72e Make previously reduced ZUN functions available to C 2015-02-21 14:12:22 +01:00
nmlgc 145ecaaa54 Rename all code segments to names that Turbo C++ would generate
Well, duh, of course, we *can* do this in order to allow decompilation to be
started at the end (not the beginning) of any segment. In fact, if we hadn't
done this, we would have had to start by moving _TEXT out to libraries....
2015-02-21 12:47:24 +01:00
nmlgc 7836363019 Use a Makefile for the 16-bit part of the build process
Well, that became unbearable pretty quickly. Not sure whether I'm doing all
this Makefile business right, but this looks pretty nice.

It doesn't really help much at this point though because the 32-bit part is
still entirely separate and forces everything to rebuild all the time, but at
least it aborts on C compiler errors.
2015-02-21 11:28:56 +01:00
nmlgc ffd8bb9013 Clean up the last remaining misanalyzed procedure boundaries
After spending a few hours on correctly decompiling ZUN's bulky custom text
renderer used in TH02 and TH03, it unfortunately turned out that TLINK doesn't
actually give us the fine-grained control over segment ordering we'd like to
have in a project like this, and that we can't slot code from one object file
in between segments from another object file. This means that yes, we really
have to decompile the functions in the order they appear in the executables,
starting on either end.

So, have a boring janitorial commit instead.
2015-02-20 22:44:09 +01:00
nmlgc c2a8c221f2 Let Turbo C++ link in the Borland C/C++ runtime for the main EXE files
This took long enough, so we're not covering the COM files right now. Like, I
can't even tell how you're supposed to work around the forced word alignment
for the _TEXT segment. Guess we'll just have to decompile all of these in one
go, just like we did with ZUNSOFT.COM.

Also, it really seems as if we're merely trading one ugly workaround for
another in our quest for identical binaries.
2015-02-19 10:22:00 +01:00
nmlgc 2d5d38426f Finally use standard segment names everywhere
And I guess we just have to ignore and disable that segment alignment warning
for TH01. It's not like this changes anything in the binary.
2015-02-18 14:04:43 +01:00
nmlgc f861b0a5c3 [C decompilation] ZUNSOFT.COM, all of it
And, of course, it recompiles into the exact binary ZUN shipped in 1997.
Success! This project is so going to happen now.
2015-02-17 13:18:14 +01:00
nmlgc cc219ff2b4 Add MASTERS.LIB and MASTER.H from the original distribution
Yup, we'll be linking against the original binary blob for the time being.
Don't worry though, we will (and in fact, have to) recompile the libraries
from source, separately for each game, as part of the build process in the
future, but we'll get to that once we've decompiled some of the non-TH01 code.
2015-02-16 23:10:47 +01:00
nmlgc ff94dce594 Add build batch files and some documentation about the build process
So yeah, that'll be our build environment - just plain batch files calling the
Borland command-line assembler, linker, and eventually C compiler. These are
the exact tools that ZUN used as well. There certainly are other assemblers,
compilers and linkers that could compile this code into 16-bit DOS
executables; Open Watcom is the only free one I know, and the master.lib
manual also mentions C compilers by Microsoft and Symantec. However, I favor
having one clear build path for a single toolchain that will, with the correct
command-line switches for each game, create builds that are bit-perfect to
ZUN's original ones over the possibility of cross-platform builds and the
maintenance nightmare they add.
So, Borland-only it is.

(Also, no Makefile, due to our messy build setup. I think I still prefer this
solution though, as we can have these really nice error messages that double
as build instructions without any dependencies on installed software.)
2015-02-15 23:32:32 +01:00
nmlgc 5268241a06 Merge the second halves of TH04's and TH05's MAIN.EXE back into the main files
I kinda wanted to wait with this until I've brought REIIDEN.EXE down to at
least 65,536 lines as well, but that's not going to happen anytime soon, and
this split has annoyed me enough by now...
2015-02-14 18:58:03 +01:00
nmlgc 2cac434455 [Reverse-engineering] Sound effect playback 2015-02-13 12:56:51 +01:00
nmlgc cc1a2987c4 Forgot the TH05 VRAM planes data file -.- 2015-02-12 21:33:22 +01:00
nmlgc 07519a7238 [Reverse-engineering] 32-bit VRAM plane pointers
I've looked at every openly available piece of PC-98 documentation, and there
don't seem to be any official names for the individual planes. The closest
thing I could find was the description at

	http://island.geocities.jp/cklouch/column/pc98bas/pc98disphw2.htm

explaining that they represent the blue, red, green, and brightness component
when using the default PC-98 palette. However, these planes correspond to
nothing else but the 4 individual bits of the final index into the color
palette, and you can assign any color to every single palette slot. Therefore,
it's merely a convention that your own palettes don't have to follow (and in
Touhou, they don't).

Nevertheless, there doesn't seem to be an alternative, and the Neko Project II
source code uses the same B/R/G/E convention, so I'll go with that as well.
2015-02-10 23:43:34 +01:00
nmlgc 60f6ecec84 [Reverse-engineering] [th01/zunsoft] Identify all global variables
Yup, the code for the first ZUN Soft logo is now completely position-
independent and ready to be decompiled.

(Also, TIL that the PC-98 GRCG has hardware support for double-buffering
through page flipping. Heh, at least one feature that makes it a viable system
for games...)
2015-01-13 18:10:24 +01:00
nmlgc 44146c4749 [Reduction] GRCG modes 2015-01-12 22:48:13 +01:00
nmlgc 0b89233e48 [Reverse-engineering] Music Room comment loading 2014-12-24 21:39:34 +01:00
nmlgc f0ab47fd18 [Reduction] Hardware text colors and effects
Turns out we're not quite done with reduction yet, as there still are a bunch
of macros in master.h that #define PC-98-specific hardware constants and I/O
ports.
2014-12-20 22:36:38 +01:00
nmlgc 04ab24d669 [th01] Undo the floating-point hacks 2014-12-19 06:11:42 +01:00
nmlgc a07e5fad42 [Reverse-engineering] Slot-based PI display
Also covering the two variations for blitting only every second row or
blitting only a 320x200 quarter, as seen in the endings.

So yeah, there's indeed nothing wrong with piread.cpp. TH03 just uses that
separate function that only blits every second row of an image, and indeed
always loads the entire image as it would appear in a PNG conversion. Here's
what happens if you display these images using the non-interlacing function:
https://www.dropbox.com/s/885krj09d9l0890/th03%20PI%20no%20interlace.png
2014-12-18 14:36:43 +01:00
nmlgc 721aa18de8 [Reduction] #709: graph_pack_put_8_noclip
Yeah, it's really just a copy of that function with 3 instructions deleted.
2014-12-17 13:04:21 +01:00
nmlgc bead27b781 Use TASM calling convention syntax for previously identified ZUN functions
With TH03 changing the calling convention for most of the code from __cdecl to
__pascal, I've been getting more and more confused about this myself. So,
let's settle on the following consistent syntax for function calls:

* C where the calling convention is actually __cdecl and where TASM's emitted
  __cdecl code matches the original binary
* PASCAL where the calling convention is actually __pascal
* STDCALL where the calling convention is actually __cdecl, but where
  the caller either defers stack cleanup (summing up the stack size of
  multiple functions, then cleaning it all in a single "add sp" instruction)
  or where the stack is cleared in a different way (e.g. "pop cx").

Unfortunately though, when using the ARG directive to automatically generate
an appropriate RET instruction for the given calling convention, TASM always
emits ENTER and LEAVE instructions even when no local variables are declared,
which greatly limits the number of functions where we can use that syntax. -.-
2014-12-16 05:53:56 +01:00
nmlgc 3b497286a5 [th03/zunsp] Initial state
This one contains 4 instructions (on lines 462, 528, 534 and 1309,
respectively) for which TASM chooses different opcodes than the original
compiler.
2014-12-13 23:58:08 +01:00
nmlgc fe90349a53 [th02/zuninit] Initial state
AKA "the source of the infamous STOP message".

This is pretty much irreducible assembly code, so it may very well be that we
don't even touch this file ever again, but at least it completes our build.
2014-12-11 19:48:34 +01:00
nmlgc 1be1cebc21 Update the Readme file to match the recent ZUN.COM developments
No point in having lame excuses for not dumping it when we've actually started
doing so.
2014-12-09 22:38:34 +01:00