The Touhou PC-98 Restoration Project
nmlgc 6d422052ca [Reduction] #570-573: realcvt 2014-11-02 08:27:17 +01:00
libs [Reduction] #570-573: realcvt 2014-11-02 08:27:17 +01:00
th02/strings Identify and reduce gaiji strings across all executables 2014-09-13 12:26:33 +02:00
th04/strings Identify and reduce gaiji strings across all executables 2014-09-13 12:26:33 +02:00
.gitattributes Start out with th05's OP.EXE 2014-06-26 22:47:15 +02:00
ReC98.inc [Reduction] #569: xcvt 2014-11-02 06:55:48 +01:00
readme.md th01/reiiden: Initial state 2014-08-09 03:44:10 +02:00
th01_fuuin.asm [Reduction] #570-573: realcvt 2014-11-02 08:27:17 +01:00
th01_op.asm [Reduction] #570-573: realcvt 2014-11-02 08:27:17 +01:00
th01_reiiden.asm [Reduction] #570-573: realcvt 2014-11-02 08:27:17 +01:00
th01_reiiden_2.inc [Reduction] #563-568: Floating-point emulator initialization 2014-11-01 17:09:13 +01:00
th01_zunsoft.asm Trick TASM into not creating 32-bit default segments 2014-10-31 08:17:54 +01:00
th02_main.asm Trick TASM into not creating 32-bit default segments 2014-10-31 08:17:54 +01:00
th02_maine.asm Trick TASM into not creating 32-bit default segments 2014-10-31 08:17:54 +01:00
th02_op.asm Ensure the correct amount of padding between the TEXTC and DSEG segments 2014-11-01 09:54:57 +01:00
th03_main.asm Ensure the correct amount of padding between the TEXTC and DSEG segments 2014-11-01 09:54:57 +01:00
th03_mainl.asm Ensure the correct amount of padding between the TEXTC and DSEG segments 2014-11-01 09:54:57 +01:00
th03_op.asm Trick TASM into not creating 32-bit default segments 2014-10-31 08:17:54 +01:00
th04_main.asm Trick TASM into not creating 32-bit default segments 2014-10-31 08:17:54 +01:00
th04_main_seg3+4.inc Identify all remaining nopcalls 2014-10-07 06:32:20 +02:00
th04_maine.asm Trick TASM into not creating 32-bit default segments 2014-10-31 08:17:54 +01:00
th04_op.asm Trick TASM into not creating 32-bit default segments 2014-10-31 08:17:54 +01:00
th05_main.asm Trick TASM into not creating 32-bit default segments 2014-10-31 08:17:54 +01:00
th05_main_seg3+4.inc Ensure the correct amount of padding between the TEXTC and DSEG segments 2014-11-01 09:54:57 +01:00
th05_maine.asm Trick TASM into not creating 32-bit default segments 2014-10-31 08:17:54 +01:00
th05_op.asm Trick TASM into not creating 32-bit default segments 2014-10-31 08:17:54 +01:00

The Touhou PC-98 Restoration Project ("ReC98")

Overview

This project aims to rebuild the first five games of the Touhou Project series by ZUN Soft (now Team Shanghai Alice), which were originally released exclusively for the NEC PC-9801 system, into a cross-platform, open and truly moddable form.

The original games in question are:

  • TH01: 東方靈異伝 ~ The Highly Responsive to Prayers (1997)
  • TH02: 東方封魔録 ~ the Story of Eastern Wonderland (1997)
  • TH03: 東方夢時空 ~ Phantasmagoria of Dim.Dream (1997)
  • TH04: 東方幻想郷 ~ Lotus Land Story (1998)
  • TH05: 東方怪綺談 ~ Mystic Square (1998)

With the games completely opened, we can then build an efficient environment for modifications on top. This project could easily achieve what thcrap has always wanted to do but struggled to realize, due to its nature of only being a patching framework targeted at 32-bit Windows binaries.

ReC98 is sort of inspired by PyTouhou, a similar project to provide a free/libre Python reimplementation of the engine of TH06, 東方紅魔郷 ~ the Embodiment of Scarlet Devil. However, ReC98 trades proper free software decorum and compatibility with the original data for more accuracy in the beginning, and more freedom in the final implementation. This seems to be a more appropriate approach for the games in question, for two reasons:

  • From a cursory inspection of their code, these games do not appear to have much of an "engine", much less a common one. The gameplay is mainly driven by stage- and boss-specific callback functions hardcoded into the executable, rather than by ECL scripts as in the Windows games, which would merely require an alternate VM to interpret them.
  • These games stopped being sold in 2002, ZUN has confirmed on multiple occasions that he has lost all the data of the "earlier games" [citation needed], and PC-98 hardware is long obsolete. In short, these games are as abandoned as they can possibly be, and are unlikely to ever turn a profit again.

Although this project might be classified as a remake of the games in question, it should be noted that ReC98 has much higher aspirations than, say, a remake in Danmakufu created from scratch. The main objective of this project is to provide exact, credible ports that fully replace the need for the proprietary, PC-98-exclusive original releases and their emulation, even for the most conservative fan. Ultimately, this project should merely serve as the foundation for future remastered versions of these games, developed by third parties. And after all, preserving the games' source code is undoubtedly more valuable than preserving a bunch of binaries.

To achieve this, ReC98 has been kept in an openly available Git repository from day one. Each commit represents an atomic step along the way. This makes it easy to prove the correctness of the reconstruction process, or to track down potential regressions.

Moreover, a clear line will be drawn between the original content and fanmade modifications, which will only be available as optional packages. This also means that we won't ship the reconstructed games with any existing English translation patch. For the visible modifications that will be necessary (read: the "Mods" screen in the main menu), great care will be taken to keep them in the spirit of the original PC-98 games.

Is this even viable?

It certainly seems to be. During the development of the static English patches for these games, we identified two main libraries used across all five games, and even found their source code. These are:

  • master.lib, a 16-bit x86 assembly library providing an abstraction layer for all components of a PC-98 DOS system
  • as well as the Borland C/C++ runtime library, version 4.0.

These two make up a sizable amount of the code in all the executables. In TH05, for example, they amount to 74% of all code in OP.EXE, and 40% of all code in MAIN.EXE. That's already quite a lot of code we do not have to deal with. Identifying the rest of the code shared across the games will further reduce the workload to a more acceptable amount.

With the Debug edition of Neko Project II, we also have an open-source PC-9821 emulator, capable of running the games. This will greatly help in understanding and porting all hardware-specific code.

Still, it will no doubt take a long time until this project makes any visible and useful progress. Any help will be appreciated!

Roadmap

Reconstruction phase

First, we have to accurately reconstruct understandable C/C++ source code for all the games from the original binaries, while still staying on the PC-98 platform. To ensure the correctness of the process, each commit during this phase should ideally result in a build that is byte-identical to the build of the previous commit.
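The byte-identical check described above could be automated along these lines. This is only a minimal sketch: the function names `first_difference` and `compare_builds` are made up for illustration and are not part of this repository.

```python
from pathlib import Path


def first_difference(a, b):
    """Return the offset of the first differing byte, or None if a == b."""
    for offset, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return offset
    # One input is a prefix of the other: they differ where the shorter ends.
    if len(a) != len(b):
        return min(len(a), len(b))
    return None


def compare_builds(old_path, new_path):
    """Compare two built binaries on disk (hypothetical helper)."""
    return first_difference(Path(old_path).read_bytes(),
                            Path(new_path).read_bytes())
```

A result of `None` means the two builds match exactly. Any other value is the file offset where a regression first shows up, which narrows down the function to look at.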

Step 1: Dumping (done)

Based on the auto-analysis done by IDA, create assembly dumps for all 16 necessary executables, and edit them until they successfully recompile back into working binaries equivalent to ZUN's builds.

Note that we don't dump the ZUN.COM executables of TH02 and later. This file is essentially a package of multiple smaller executables, doing the following:

  • Check for enough free conventional memory
  • Check the sound hardware installed and run the appropriate PMD driver
  • Set up some interrupt vectors and code shared across the three executables that make up each game
  • Set up gaiji characters - a set of 254 [sic] custom 16x16 1-bit bitmaps that can be displayed on the superimposed text RAM by using specific character codes

The first two functions are not necessary in a cross-platform scenario, while the last two need to be completely rewritten for the ports anyway. Therefore, the ZUN.COM files will be this project's "required proprietary element" as long as we stay on the PC-98 platform.
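To make the gaiji format mentioned above more concrete: a 16x16 bitmap at 1 bit per pixel takes 16 × 16 / 8 = 32 bytes. The sketch below prints such a glyph as text. The byte layout (row-major, two bytes per row, most significant bit leftmost) is an assumption for illustration and not necessarily the layout ZUN.COM uploads to the hardware.

```python
def render_glyph(data):
    """Turn 32 bytes of 1-bit pixel data into 16 strings of 16 characters.

    Assumed layout: row-major, 2 bytes per row, MSB is the leftmost pixel.
    """
    if len(data) != 32:
        raise ValueError("a 16x16 1-bit glyph needs exactly 32 bytes")
    rows = []
    for y in range(16):
        # Combine the two bytes of this row into one 16-bit word.
        word = (data[2 * y] << 8) | data[2 * y + 1]
        rows.append("".join(
            "#" if word & (0x8000 >> x) else "." for x in range(16)
        ))
    return rows
```

A port would replace the text output with whatever the target platform uses to draw the glyph.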

Step 2: Reduction (current)

Identify duplicated functions and either replace them with references to the original libraries or move them to separate include files. In the end, only ZUN's own code and data will remain in the dumps created in Step 1.
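One conceivable way to find candidates for reduction is to hash each procedure body and group identical ones across the dumps. The following is a rough sketch under several assumptions: procedures are delimited by `name proc` / `name endp` lines, comments start with `;`, and the helper names are invented here. It does not describe the tooling actually used for this project.

```python
import hashlib
import re
from collections import defaultdict

PROC_START = re.compile(r"^\s*(\w+)\s+proc\b", re.IGNORECASE)
PROC_END = re.compile(r"^\s*(\w+)\s+endp\b", re.IGNORECASE)


def split_procs(asm_text):
    """Yield (name, body_lines) for every proc/endp pair in a dump."""
    name, body = None, []
    for line in asm_text.splitlines():
        start = PROC_START.match(line)
        if start:
            name, body = start.group(1), []
            continue
        if name is not None and PROC_END.match(line):
            yield name, body
            name = None
            continue
        if name is not None:
            # Drop comments and whitespace so formatting does not matter.
            code = line.split(";", 1)[0].strip()
            if code:
                body.append(code)


def find_duplicates(dumps):
    """dumps maps filename -> assembly text.

    Returns a list of groups; each group lists the (filename, proc name)
    pairs whose normalized bodies are identical.
    """
    groups = defaultdict(list)
    for filename, text in dumps.items():
        for name, body in split_procs(text):
            digest = hashlib.sha1("\n".join(body).encode()).hexdigest()
            groups[digest].append((filename, name))
    return [members for members in groups.values() if len(members) > 1]
```

Note that auto-generated labels and addresses differ between executables, so a textual comparison like this only catches exact matches. Anything it misses still has to be identified by hand.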

Step 3: Reverse-engineering

Translate the remaining assembly back into platform-independent C code. Some PC-98-specific in-line assembly inside ZUN's code will remain, which will have to be kept and moved to a separate library for now.

Modernization phase

With the source code recreated, everything is possible! Let's turn them into the best games they can possibly be.

Step 4: Porting

  • Refactor Neko Project II's graphics and sound emulation into a library we can hook up to our code
  • Port all of master.lib and the shared hardware-specific library built in Step 3
  • Rewrite all DOS API calls (int 21h) into platform-independent alternatives

Step 5: Sanitization

Get rid of any bad coding practices and the remaining hardware-specific formats, while breaking compatibility with both PC-98 hardware and the original data in the process. This includes, but won't be limited to:

  • removing unnecessary encryption
  • using text and custom fonts instead of graphics where possible
  • standardizing graphics formats to PNG, replacing all the original ones (.grf, .cdg, .cd2, .bos, .mrs, .pi, .grp, as well as hardcoded bitmaps)
  • merging the three executables per game into one
  • changing gaiji characters into a semantically better representation (Unicode characters, alternate fonts, or simply normal graphics) on a case-by-case basis

Step 6: Moddability

Do whatever else is necessary to easily modify the game elements people like to modify. This may involve further changes to the formats used by the games, and the addition of one or more scripting languages (and subsequent porting of previously hardcoded functions). Mods will use the thcrap patch format, complete with support for patch stacking and dependencies, and will be easily selectable from a new menu added to every game.
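As a purely illustrative sketch of what "patch stacking and dependencies" could mean in practice: patches are loaded so that each one comes after its dependencies, and a file is taken from the last patch in the stack that provides it. The data structures and names below are hypothetical and do not reflect the actual thcrap patch format.

```python
from graphlib import TopologicalSorter  # Python 3.9+


def stack_order(dependencies):
    """dependencies maps a patch name to the set of patches it depends on.

    Returns a load order in which every patch follows its dependencies.
    """
    return list(TopologicalSorter(dependencies).static_order())


def resolve(filename, stack, patches):
    """Return the file contents from the highest-priority patch.

    stack is a list of patch names, lowest priority first; patches maps
    each patch name to a dict of filename -> contents.
    """
    for name in reversed(stack):
        if filename in patches[name]:
            return patches[name][filename]
    return None  # No patch provides this file: use the original data.
```

With such a scheme, a translation patch could override single files of a base patch without copying the rest.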

Since this will most likely result in graphics mods that exceed the specifications of PC-98 hardware, there will also be an optional filter to reduce the rendered output to the original resolution of 640x400 and a 16-color palette, for the sake of keeping the original spirit.
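Such a filter could, in its simplest form, scale the frame down with nearest-neighbor sampling and then map every pixel to the closest entry of a 16-color palette. The sketch below works on plain nested lists of RGB tuples and makes no claim about how the final implementation will look. The function names are invented for this example.

```python
def downscale(pixels, width, height):
    """Nearest-neighbor resize of a 2D list of pixels to width x height."""
    src_h, src_w = len(pixels), len(pixels[0])
    return [
        [pixels[y * src_h // height][x * src_w // width] for x in range(width)]
        for y in range(height)
    ]


def nearest_palette_index(pixel, palette):
    """Index of the palette color with the smallest squared RGB distance."""
    r, g, b = pixel
    return min(
        range(len(palette)),
        key=lambda i: (palette[i][0] - r) ** 2
        + (palette[i][1] - g) ** 2
        + (palette[i][2] - b) ** 2,
    )


def reduce_to_palette(pixels, palette):
    """Replace each RGB pixel with an index into a palette of up to 16 colors."""
    if len(palette) > 16:
        raise ValueError("the original hardware shows at most 16 colors")
    return [[nearest_palette_index(p, palette) for p in row] for row in pixels]
```

For the original output format, one would call `downscale(frame, 640, 400)` first and pass the result to `reduce_to_palette` together with the palette currently in use. A real filter would likely add dithering, which this sketch omits.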

Building

Currently, this code is only known to build with Borland's Turbo Assembler (TASM) and Turbo Linker (TLINK), Version 5.0 or later. Due to the large size of the initial assembly dumps, the 32-bit version of TASM (tasm32) is necessary to assemble them. However, TLINK32 does not support 16-bit DOS targets, so the 16-bit version of TLINK is needed for linking.

To sum up: Assemble the .asm files with tasm32 /kh32768 /m /zn (on 32-bit/Windows), and link the resulting .obj files with tlink (on 16-bit/DOS).
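For anyone scripting these two steps, the command lines could be generated as follows. This is a small sketch that only uses the tasm32 switches quoted above. The helper names are hypothetical, and since this readme lists no linker options, the tlink call is given nothing but the object files.

```python
def tasm_command(asm_file):
    """Command line for assembling one dump with 32-bit TASM."""
    return ["tasm32", "/kh32768", "/m", "/zn", asm_file]


def tlink_command(obj_files):
    """Command line for linking with 16-bit TLINK (no options assumed)."""
    return ["tlink", *obj_files]
```

Because tasm32 and tlink run in different environments (32-bit Windows and 16-bit DOS), the two commands cannot simply be chained in one script. The linking step has to be handed to a DOS environment or an emulator.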

Please let us know if there are any other build systems we can use instead!