Dungeon Master: Swoosh Construction Kit - A data extractor




"Oh, wow--that totally rocks!
Thanks a million. You've saved me an incredible amount of work."

Mon Ful Ir, Aug 29 2008

"So I checked out the XSLT converter and my goodness: this is an amazing piece of work!
It's fantastic that you can convert DUNGEON.DAT into XML because this info can be used by anyone now (well, anyone who knows how to use XML files).
Excellent work!"

Gambit, Feb 13 2008

"Greatstone's work looks interesting."
Paul Stevens, Feb 14, 2008

i'm gathering info enough to render complete dungeon viewport (DM snes style) in some ways: (i) using sck (swoosh construction kit) to extract viewport information. it includes decoder for "558 item", very excellent!
Kentaro, Mar 11 2007


Technical documentation - FTL file format

This section describes the FTL file format, which is supported by the sck tool. If the Encyclopedia site rewrites its "technical documentation" part so that more descriptions can easily be added, this section may be included in it.
1.0 (26 May 2006): initial release.

Summary

FTL files are frequently used in the DM, CSB and DM2 games (Amiga, X68000, SegaCD, MegaCD) to store resources such as:

  • logo animation (FTL swoosh),
  • language selection screen (english, french, german),
  • insert disk screen (blue disk),
  • error screens (damaged master disk, 1 megabyte required),
  • utility disk menu (introduction, champion editor, oracle hints),
  • main program.
The FTL format is a proprietary file format created by FTL. It is based on Amiga hunks, an essential part of Amiga executables.

Global structure:
  • common header
  • headers of all hunks
  • hunk 1
  • hunk 2
  • ...
At least 3 hunks are required:
  • HUNK_BSS (containing bss and jump data),
  • HUNK_DATA (containing the data as images, palettes, sounds, texts, animation, etc.),
  • HUNK_CODE (containing the code).
The uncompressed area 1 of the HUNK_DATA can be described by a mapfile and corresponding items can be extracted by the sck tool.

The following notation has been used:

0x1234   hexadecimal word 1234
1011b    1011 in binary, 11 in decimal
26:      offset 26 in decimal
nibble   4 bits
byte     8 bits = 2 nibbles
word     16 bits = 2 bytes
dword    32 bits = 2 words

Structure

Common header (20 bytes):
00: 2 bytes: magic identifier (0x6160)
02: 1 word: this header checksum, called "common header checksum". see note 1 to know how to compute it.
04: 1 word: unknown
06: 12 bytes: unknown
18: 1 word: number of hunks. 3 hunks at least are required.
For each hunk, there is a "Hunk header" (12 bytes):
00: 1 word: hunk type. 3 hunk types are known and required: 0x0010 which is the jumps table, 0x0011 which is the data and 0x0012 which is the code.
02: 1 word: unknown
04: 1 dword: offset of the corresponding hunk, counted from the beginning of the file. ex: the first hunk offset is 56 (common header + 3 hunk headers = 20 + 3*12 = 56).
08: 1 dword: size of the hunk in the file, starting at the offset above.
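A minimal sketch of reading the common header and the hunk headers described above. It assumes big-endian fields (the usual byte order on the 68000, but not stated in this description); the names `parse_headers` and `HunkHeader` are hypothetical and are not part of sck.

```python
import struct
from typing import NamedTuple

COMMON_HEADER_SIZE = 20
HUNK_HEADER_SIZE = 12


class HunkHeader(NamedTuple):
    hunk_type: int  # 0x0010, 0x0011 or 0x0012
    unknown: int
    offset: int     # offset of the hunk from the beginning of the file
    size: int       # size of the hunk in the file


def parse_headers(data: bytes):
    """Return (magic, checksum, hunk_headers) read from the start of an FTL file."""
    # ">" = big-endian (assumption), H = word, 12s = 12 raw bytes
    magic, checksum, _unk_word, _unk_bytes, hunk_count = struct.unpack_from(
        ">HHH12sH", data, 0
    )
    hunks = []
    for n in range(hunk_count):
        pos = COMMON_HEADER_SIZE + n * HUNK_HEADER_SIZE
        # I = dword
        hunks.append(HunkHeader(*struct.unpack_from(">HHII", data, pos)))
    return magic, checksum, hunks
```

Each hunk can then be sliced out of the file with `data[h.offset:h.offset + h.size]`.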
For each hunk, this is the "Hunk" itself (<hunk_size> bytes):
if hunk type is 0x0010 (HUNK_BSS)
00: 1 dword: bss and jump table size in memory. see note 6.
04: 1 dword: size of area 1 of hunk 0x0011 (HUNK_DATA) in memory. Basically, this area of the HUNK_DATA is the interesting part of a FTL file, where the resources to extract are found.
as this area can be compressed, the size in the file can be different from the size in memory, after decompression. see the HUNK_DATA description.
08: 1 dword: jump table size in memory.
12: 1 dword: bss size in memory.
16: 1 dword: unknown.
20: 1 dword: hunk 0x0012 (HUNK_CODE) size in memory, after decompression if necessary.
as this area can be compressed, the size in the file can be different from the size in the memory. see the hunk 0x0012 (HUNK_CODE) description.
24: 1 dword: HUNK_DATA size in file.
as this area can be compressed, the size here is the size before any decompression.
28: 1 word: unknown.
30: 1 word: unknown.
32: 1 word: unknown.
34: 1 word: HUNK_BSS checksum. see note 2 to know how to compute it.
36: 1 word: HUNK_CODE checksum. see note 5 to know how to compute it.
38: 1 word: HUNK_DATA checksum. see note 4 to know how to compute it.
for each jump (number of jump = jump table size in memory read in this hunk / 8)
40+4*jump_index: 1 dword: jump.
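The HUNK_BSS layout above can be read with a sketch like the following. It assumes big-endian fields, and the function name and dictionary keys are hypothetical; the jump count follows the rule given above (jump table size in memory / 8).

```python
import struct


def parse_hunk_bss(hunk: bytes) -> dict:
    """Decode the known fields of a HUNK_BSS (type 0x0010)."""
    # Seven dwords at offsets 00..24 (big-endian is an assumption).
    (bss_and_jump_size, data_area_1_size, jump_table_size, bss_size,
     _unknown, code_size_in_memory, data_size_in_file) = struct.unpack_from(
        ">7I", hunk, 0
    )
    # The three checksums are words at offsets 34, 36 and 38.
    bss_checksum, code_checksum, data_checksum = struct.unpack_from(">3H", hunk, 34)
    # One dword per jump, starting at offset 40.
    jump_count = jump_table_size // 8
    jumps = list(struct.unpack_from(f">{jump_count}I", hunk, 40))
    return {
        "bss_and_jump_size": bss_and_jump_size,
        "data_area_1_size": data_area_1_size,
        "jump_table_size": jump_table_size,
        "bss_size": bss_size,
        "code_size_in_memory": code_size_in_memory,
        "data_size_in_file": data_size_in_file,
        "bss_checksum": bss_checksum,
        "code_checksum": code_checksum,
        "data_checksum": data_checksum,
        "jumps": jumps,
    }
```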

if hunk type is 0x0011 (HUNK_DATA)
00: 1 word: unknown.
02: 1 dword: size of area 1 of this hunk.
06: 1 dword: size of area 2 of this hunk.
10: size_area_1 bytes: this is the interesting part of a FTL file to find some resources to extract.
this area is compressed and resources here cannot be extracted directly.
the size of the uncompressed area 1 can be found in the HUNK_BSS index 04.
the compression algorithm is very simple: it shrinks runs of consecutive 0x00.
if 2 consecutive 0x00 are found, they must be copied to the uncompressed data, and the next 2 bytes, read as a word, give the number of additional 0x00 to write to the uncompressed data.
see note 7 to know how to decompress it.
10+size_area_1: size_area_2 bytes: area 2 which is the reloc16 zone (relocation information on 16 bits).
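As a small illustration of the HUNK_DATA layout above, the two areas can be separated like this. The sketch assumes big-endian sizes, and `split_hunk_data` is a hypothetical name.

```python
import struct


def split_hunk_data(hunk: bytes):
    """Return (area_1, area_2) of a HUNK_DATA (type 0x0011), as stored in the file."""
    # word at 00 (unknown), dword at 02 (size of area 1), dword at 06 (size of area 2)
    _unknown, size_area_1, size_area_2 = struct.unpack_from(">HII", hunk, 0)
    start_1 = 10
    start_2 = start_1 + size_area_1
    area_1 = hunk[start_1:start_2]                # still compressed
    area_2 = hunk[start_2:start_2 + size_area_2]  # reloc16 zone
    return area_1, area_2
```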

if hunk type is 0x0012 (HUNK_CODE)
if byte 1 = 0x52 and byte 2 = 0x23 (all this hunk is compressed)
00: 2 bytes: magic identifier (0x5223).
02: 2 bytes: magic identifier (0x5223).
04: 1 dword: number of iterations needed to do the decompression.
08: 1920 words: table of the most frequent words used by the decompression.
1928: x bytes: compressed code. see note 8 to know how to decompress it.
else (no compression)
00: hunk_size bytes: code without any compression. so uncompressed_code = code.
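The compression test for HUNK_CODE can be sketched as follows. The function name is hypothetical and the iteration count is assumed to be big-endian; the table and the compressed code that follow are left to the caller.

```python
import struct


def read_code_iterations(hunk: bytes):
    """Return the iteration count of a compressed HUNK_CODE, or None if it is not compressed."""
    if hunk[0] == 0x52 and hunk[1] == 0x23:
        # dword at offset 04: number of iterations needed by the decompression
        (iterations,) = struct.unpack_from(">I", hunk, 4)
        return iterations
    return None
```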

Notes:
Note 1: How to compute the common header checksum
To compute the common header checksum, it is necessary to have the common header and all the hunk headers, so the 56 first bytes.
First, sum the last 16 bytes of the common header (so that the magic identifier and this checksum are not taken into account); each byte must be multiplied by its index in the header.
Note that the resulting checksum must be a word: consequently, the maximum value is 0xFFFF or 65535.
  int checksum = 0;
  for (int i = 4; i < 20; i++) {
    checksum += common_header[i] * i; // unsigned byte at index i
    checksum %= 0xFFFF;
  }
Then, sum all the bytes of the 3 hunk headers; each (unsigned) byte must be multiplied by its index in the "hunk headers" part + 1.
Note that here, the index multiplier doesn't take the common header into account but starts at the first hunk header.
Note also that 1 is added to this index.
ex:
  hunk index = 0
  checksum += (byte at index 0 of the hunk 1 header) * (0 + (12 * hunk index) + 1)
  checksum += (byte at index 1 of the hunk 1 header) * (1 + (12 * hunk index) + 1)
  ...
  checksum += (byte at index 11 of the hunk 1 header) * (11 + (12 * hunk index) + 1)
  hunk index = 1
  checksum += (byte at index 0 of the hunk 2 header) * (0 + (12 * hunk index) + 1)
  checksum += (byte at index 1 of the hunk 2 header) * (1 + (12 * hunk index) + 1)
  ...
  checksum += (byte at index 11 of the hunk 2 header) * (11 + (12 * hunk index) + 1)
  hunk index = 2
  checksum += (byte at index 0 of the hunk 3 header) * (0 + (12 * hunk index) + 1)
  checksum += (byte at index 1 of the hunk 3 header) * (1 + (12 * hunk index) + 1)
  ...
  checksum += (byte at index 11 of the hunk 3 header) * (11 + (12 * hunk index) + 1)
  int number_of_hunks = 3;
  int hunk_header_size = 12;
  for (int i = 0; i < number_of_hunks; i++) {
    for (int j = 0; j < hunk_header_size; j++) {
      checksum += hunk_headers[i][j] * (j + hunk_header_size*i + 1);
      checksum %= 0xFFFF;
    }
  }
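The two loops of this note can be combined into one runnable sketch. The function name is hypothetical, and the `% 0xFFFF` reduction is taken as written in the pseudocode above rather than verified against real files.

```python
def common_header_checksum(common_header: bytes, hunk_headers: bytes) -> int:
    """Compute the common header checksum.

    common_header: the first 20 bytes of the file.
    hunk_headers:  the hunk headers that follow, concatenated (3 * 12 bytes).
    """
    checksum = 0
    # Bytes 4..19 of the common header, each weighted by its index.
    for i in range(4, 20):
        checksum = (checksum + common_header[i] * i) % 0xFFFF
    # Hunk header bytes, weighted by their position in the
    # "hunk headers" part + 1, i.e. j + 12 * hunk_index + 1.
    for k, value in enumerate(hunk_headers):
        checksum = (checksum + value * (k + 1)) % 0xFFFF
    return checksum
```

With the 56 first bytes of a file in `data`, this would be called as `common_header_checksum(data[:20], data[20:56])` and compared with the word stored at offset 02.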
Note 2: How to compute the HUNK_BSS checksum
To compute the HUNK_BSS checksum, take this hunk word by word, ignoring the word at index 34 that contains this checksum, and sum the words.
  int checksum = 0;
  for (int i = 0; i < HUNK_BSS_size; i += 2) {
    if (i == 34) continue; // skip the word holding this checksum
    checksum += word(HUNK_BSS[i], HUNK_BSS[i+1]);
    checksum %= 0xFFFF;
  }
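A runnable sketch of the same computation, assuming big-endian words and using a hypothetical function name; the `% 0xFFFF` reduction is kept as written in this note.

```python
def hunk_bss_checksum(hunk: bytes) -> int:
    """Sum the HUNK_BSS word by word, leaving out the checksum word at index 34."""
    checksum = 0
    for i in range(0, len(hunk) - 1, 2):
        if i == 34:
            continue  # this word holds the checksum itself
        word = (hunk[i] << 8) | hunk[i + 1]  # big-endian word (assumption)
        checksum = (checksum + word) % 0xFFFF
    return checksum
```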
Note 3: Checksum on checksums
The HUNK_BSS contains the sizes of the two other hunks and their checksums: modifying anything in those hunks will therefore also invalidate the HUNK_BSS checksum!
Note 4: How to compute the HUNK_DATA checksum
To compute the HUNK_DATA checksum, sum all the hunk bytes, as found in the file (without any decompression of area 1), unsigned byte by unsigned byte.
  int checksum = 0;
  for (int i = 0; i < HUNK_DATA_size; i++) {
    checksum += HUNK_DATA[i];
    checksum %= 0xFFFF;
  }
Note 5: How to compute the HUNK_CODE checksum
To compute the HUNK_CODE checksum, sum all the uncompressed code bytes (without compression, these are the code bytes; with compression, the first 1928 bytes are excluded, so only the really useful code is summed), unsigned byte by unsigned byte.
  int checksum = 0;
  for (int i = 0; i < uncompressed_code_size; i++) {
    checksum += uncompressed_code[i];
    checksum %= 0xFFFF;
  }
Note 6: BSS and jump
A BSS (Block Started by Symbol) section contains all reserved and uninitialized space in memory.
A jump section contains specific addresses.
Note 7: How to decompress HUNK_DATA
The compression algorithm is very simple: it shrinks runs of consecutive 0x00.
The area is read 2 bytes at a time. If 2 consecutive 0x00 are found, they must be copied to the uncompressed data, and the next 2 bytes, read as a word, give the number of additional 0x00 to write to the uncompressed data.
  byte[] uncompressed_area_1 = new byte[uncompressed_area_1_size];
  int uncompressed_area_index = 0;
  int i = 0;
  while (i < size_area_1) {
    if (area_1[i] == 0x00 && area_1[i+1] == 0x00) {
      // copy the two 0x00 themselves
      uncompressed_area_1[uncompressed_area_index] = area_1[i]; uncompressed_area_index++;
      uncompressed_area_1[uncompressed_area_index] = area_1[i+1]; uncompressed_area_index++;
      // then write the additional 0x00 counted by the next word
      int additional_0x00 = word(area_1[i+2], area_1[i+3]);
      for (int j = 0; j < additional_0x00; j++) {
        uncompressed_area_1[uncompressed_area_index] = 0x00; uncompressed_area_index++;
      }
      i += 4;
    } else {
      // any other pair of bytes is copied unchanged
      uncompressed_area_1[uncompressed_area_index] = area_1[i]; uncompressed_area_index++;
      uncompressed_area_1[uncompressed_area_index] = area_1[i+1]; uncompressed_area_index++;
      i += 2;
    }
  }
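The same decompression as a runnable sketch. The function name is hypothetical, the count word is assumed to be big-endian, and the input is assumed to be well formed (even length, every 0x0000 followed by its count).

```python
def decompress_area_1(area_1: bytes, uncompressed_size: int) -> bytes:
    """Expand the zero runs of area 1 of a HUNK_DATA."""
    out = bytearray()
    i = 0
    while i + 1 < len(area_1):
        # every pair of bytes is copied, including a 0x00 0x00 pair
        out += area_1[i:i + 2]
        if area_1[i] == 0x00 and area_1[i + 1] == 0x00:
            # the next word is the number of additional 0x00 to write
            additional = (area_1[i + 2] << 8) | area_1[i + 3]
            out += bytes(additional)
            i += 4
        else:
            i += 2
    if len(out) != uncompressed_size:
        raise ValueError("unexpected uncompressed size")
    return bytes(out)
```

The expected size is the value read at index 04 of the HUNK_BSS, which makes a convenient sanity check.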
Note 8: How to decompress HUNK_CODE
To decompress the code part, allocate a byte array whose size is the HUNK_CODE size in memory, found in HUNK_BSS at index 20.
Also use the number of iterations read at index 04 of the HUNK_CODE.
  word[] most_frequent_words = new word[1920];
  // ... this array is filled from the table stored just before the compressed code
  byte[] uncompressed_code = new byte[uncompressed_code_size];
  int uncompressed_code_index = 0;
  for (int i = 0; i < number_of_iterations; i++) {
    nibble = get_nibble(compressed_code);
    if (nibble == 0xF) { // 1111b: the next 4 nibbles are 2 literal bytes
      nibble_1 = get_nibble(compressed_code);
      nibble_2 = get_nibble(compressed_code);
      uncompressed_code[uncompressed_code_index] = byte(nibble_1, nibble_2); uncompressed_code_index++;
      nibble_1 = get_nibble(compressed_code);
      nibble_2 = get_nibble(compressed_code);
      uncompressed_code[uncompressed_code_index] = byte(nibble_1, nibble_2); uncompressed_code_index++;
    } else if (nibble >= 0x8) { // 1000b
      nibble_1 = get_nibble(compressed_code);
      nibble_2 = get_nibble(compressed_code);
      index = word(nibble, nibble_1, nibble_2);
      // maximum value for index is achieved with nibble = 1110b, nibble_1 = 1111b, nibble_2 = 1111b
      // as nibble must be less than 1111b in this else
      // maximum value for index is 3839
      // minimum value for index is achieved with nibble = 1000b, nibble_1 = 0000b, nibble_2 = 0000b
      // as nibble must be >= 1000b in this else
      // minimum value for index is 2048
      // consequently, to use it as an index in the most frequent words, it must satisfy 0 <= index < 1920
      // so...
      index = index - 1920;
      // maximum value for index is 3839 - 1920 = 1919
      // minimum value for index is 2048 - 1920 = 128
      // in this else, the indexes used to get data from the most frequent words table are: 128 <= index <= 1919
      uncompressed_code[uncompressed_code_index] = get_most_significant_byte(most_frequent_words[index]);
      uncompressed_code_index++;
      uncompressed_code[uncompressed_code_index] = get_less_significant_byte(most_frequent_words[index]);
      uncompressed_code_index++;
    } else {
      nibble_1 = get_nibble(compressed_code);
      index = byte(nibble, nibble_1);
      // maximum value for index is achieved with nibble = 0111b, nibble_1 = 1111b
      // as nibble must be < 1000b in this else (so <= 0111b)
      // maximum value for index is 127
      // minimum value for index is achieved with nibble = 0000b, nibble_1 = 0000b
      // minimum value for index is 0
      // in this else, the indexes used to get data from the most frequent words table are: 0 <= index <= 127
      uncompressed_code[uncompressed_code_index] = get_most_significant_byte(most_frequent_words[index]);
      uncompressed_code_index++;
      uncompressed_code[uncompressed_code_index] = get_less_significant_byte(most_frequent_words[index]);
      uncompressed_code_index++;
    }
  }
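A runnable sketch of this nibble decoder. Several points are assumptions, because the pseudocode leaves them open: nibbles are taken high nibble first, output words are written most significant byte first, and the table of most frequent words is passed in already decoded as a list of 1920 integers. The names `iter_nibbles` and `decompress_code` are hypothetical.

```python
from typing import Iterator, Sequence


def iter_nibbles(data: bytes) -> Iterator[int]:
    """Yield the nibbles of data, high nibble first (assumption)."""
    for value in data:
        yield value >> 4
        yield value & 0x0F


def decompress_code(compressed: bytes, table: Sequence[int], iterations: int) -> bytes:
    """Expand a compressed HUNK_CODE; each iteration produces one word (2 bytes)."""
    src = iter_nibbles(compressed)
    out = bytearray()
    for _ in range(iterations):
        first = next(src)
        if first == 0xF:
            # literal word: the next 4 nibbles are copied as 2 bytes
            word = 0
            for _n in range(4):
                word = (word << 4) | next(src)
        elif first >= 0x8:
            # 3-nibble value 2048..3839, shifted down to table index 128..1919
            index = ((first << 8) | (next(src) << 4) | next(src)) - 1920
            word = table[index]
        else:
            # 2-nibble value, table index 0..127
            index = (first << 4) | next(src)
            word = table[index]
        out += word.to_bytes(2, "big")
    return bytes(out)
```

The length of the result can be compared with the HUNK_CODE size in memory stored in the HUNK_BSS.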

Comment

With this format specification, we now know where the data part is located inside FTL files; it contains palettes, images, sounds, animation, etc.
With the corresponding mapfile describing this specific part (a lot has already been written), we know where the items are in this data part and what they are.
Consequently, we are able to modify them, then compress the data part, update the related sizes in the hunks and common header, and compute new checksums for the DATA hunk, the BSS hunk and the common header.
=> hacked and valid FTL files can now be created!
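Only the decompression of the data part is specified in note 7, so re-compressing a modified area 1 needs an inverse. The following is one possible sketch, not the original FTL packer: it assumes an even input length, a big-endian count word, and it keeps runs aligned on 2 bytes so that the decoder of note 7 reads them back correctly. `compress_area_1` is a hypothetical name.

```python
def compress_area_1(data: bytes) -> bytes:
    """Shrink runs of 0x00 so that the decoder of note 7 restores `data`."""
    if len(data) % 2:
        raise ValueError("expected an even number of bytes")
    out = bytearray()
    i = 0
    while i < len(data):
        if data[i] == 0x00 and data[i + 1] == 0x00:
            # count the 0x00 bytes that follow this pair (the count must fit in a word)
            run = 0
            while i + 2 + run < len(data) and run < 0xFFFE and data[i + 2 + run] == 0x00:
                run += 1
            run -= run % 2  # keep an even count so the next pair stays aligned
            out += b"\x00\x00" + run.to_bytes(2, "big")
            i += 2 + run
        else:
            out += data[i:i + 2]
            i += 2
    return bytes(out)
```

Note that an isolated 0x00 0x00 pair grows to 4 bytes (the pair plus a zero count), so the compressed area is not always smaller than the original.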

It is really interesting to see that we found:

  • 2 compression algorithms, not always used, to crunch some parts of a FTL file,
  • a checksum in the common header, depending on the data in all the hunks,
  • a checksum in the BSS hunk, depending on the data in the DATA and CODE hunks,
  • 3 different algorithms to compute all the checksums,
  • a complex algorithm to compute the common header checksum.
Finally, my personal opinion is that the FTL team really put a lot of effort into preventing anybody from hacking FTL files.
As the main program is a FTL file, we know why.
Combine this with the "flakey bits" of the original disk and you have a very well protected game!

Credits

meynaf, for his brilliant commented source file of the bje program found in his CSB Amiga 3.3 (en-fr-ge) version, hacked to run on the hard-disk.
Christophe Fontanel, for his help in the 68000 assembly nightmare.