Example: Dodonpachi

Get hi file size: <mame>/hi/ddonpach.hi  => size: 104 bytes



[XML]
<structure file=".hi">...
  <check>...
    <size>104</size>
  </check>
</structure>

Get hiscore.dat information
<mame>/hiscore.dat
...
ddonpach:
ddonpachj:
0:1016ea:64:00:05
0:101626:4:00:06

...

[XML]
<structure>...
  <check>...
    <definition>0:1016ea:64:00:05
                0:101626:4:00:06</definition>
  </check>
</structure>
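
As a cross-check, the two hiscore.dat entries can be compared with the file size found earlier. The sketch below assumes each entry reads as cpu:address:length:start value:end value, with hexadecimal fields (this field layout and the helper name parse_entry are assumptions, not taken from hi2txt). Under that reading, the two lengths add up to the 104 bytes of the .hi file.

[Python]
```python
# Sketch: sum the memory-range lengths declared in hiscore.dat.
# Assumption: each entry is "cpu:address:length:start:end", fields in hexadecimal.

def parse_entry(line):
    """Split one hiscore.dat entry into named integer fields (hypothetical helper)."""
    cpu, address, length, start, end = line.strip().split(":")
    return {
        "cpu": int(cpu, 16),
        "address": int(address, 16),
        "length": int(length, 16),
        "start": int(start, 16),
        "end": int(end, 16),
    }

entries = [parse_entry(line) for line in ("0:1016ea:64:00:05", "0:101626:4:00:06")]
total_size = sum(entry["length"] for entry in entries)
print(total_size)  # 100 + 4 = 104, the size of ddonpach.hi
```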

When only the 5th score is modified, the bytes at index 16 to 19 and at index 98 to 99 change.


bytes 16-19: 0030 2720
bytes 98-99: 0006

We need to compare these hexadecimal data with the in-game snapshot, where the 5th visible score is: 3027206


How to go from these bytes to this score?
The score uses 2 elements of the input structure. These 2 elements are combined into one value, using the formula:
(bytes 16->19)*10+(bytes 98->99), where each integer is stored in base 16 with its hexadecimal digits read as decimal digits (integer value = hexa value): 0030 2720 gives 302720, 0006 gives 6, and 302720*10+6 = 3027206.
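
The formula can be checked by hand with a few lines of Python. This is only a sketch of the arithmetic, not of how hi2txt is implemented; the helper name hex_digits_as_decimal is hypothetical.

[Python]
```python
# Sketch: rebuild the 5th score from the two byte groups quoted above.
score1_bytes = bytes.fromhex("00302720")  # bytes 16-19
score2_bytes = bytes.fromhex("0006")      # bytes 98-99

def hex_digits_as_decimal(data):
    """Read the hexadecimal digits of the bytes as a decimal number.

    Only valid when no digit is in A-F, which is the case for these scores.
    """
    return int(data.hex())

score = hex_digits_as_decimal(score1_bytes) * 10 + hex_digits_as_decimal(score2_bytes)
print(score)  # 3027206
```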

2 elements (SCORE1, SCORE2) need to be defined:
[XML]
<structure>
  <elt size="16" type="raw" id="RAW1"/> 
  <elt size="4" type="int"  id="SCORE1" base="16"/>
  <elt size="78" type="raw" id="RAW2"/> 
  <elt size="2" type="int"  id="SCORE2" base="16"/>
  <elt size="4" type="raw" id="RAW13"/>
</structure>

One output field can then sum the two elements using a dedicated format (sum), with another "*10" format applied to the 1st part of the score.
[XML]
<output>
<field id="SCORE" format="sum_scores"/>
</output>
<format id="multiply_10"><multiply>10</multiply></format>
<format id="sum_scores">
  <sum>
    <field id="SCORE1" format="multiply_10"/>
    <field id="SCORE2"/>
  </sum>
</format>

As a simple format can be automatically defined from its id, the following syntax is allowed for "*10":
[XML]
<output>
<field id="SCORE" format="sum_scores"/>
</output>
<format id="sum_scores">
  <sum>
    <field id="SCORE1" format="*10"/> <!-- implicit format definition //-->
    <field id="SCORE2"/>
  </sum>
</format>

Then, it becomes obvious that the 4 other scores are stored just before, allowing us to use a loop for the element definitions and a table for the output.
[XML]
<structure>
<loop count="5"><elt size="4" type="int"  id="SCORE1" base="16"/></loop>
                <elt size="70" type="raw" id="RAW2"/>
<loop count="5"><elt size="2" type="int"  id="SCORE2" base="16"/></loop>
                <elt size="4" type="raw" id="RAW13"/>
</structure>
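
A quick way to validate such a layout is to add up the element sizes and compare the total with the file size. The sketch below mirrors the structure above as a plain Python list (the tuple layout is only an illustration) and shows that the 5th SCORE1 and the 5th SCORE2 land on bytes 16 and 98, as observed earlier.

[Python]
```python
# Sketch: compute byte offsets for the looped layout and check the total size.
layout = [
    ("SCORE1", 5, 4),   # (id, loop count, element size in bytes)
    ("RAW2", 1, 70),
    ("SCORE2", 5, 2),
    ("RAW13", 1, 4),
]

offsets = {}
position = 0
for name, count, size in layout:
    offsets[name] = position      # offset of the first element of this group
    position += count * size

fifth_score1 = offsets["SCORE1"] + 4 * 4  # 5th element, 4 bytes each
fifth_score2 = offsets["SCORE2"] + 4 * 2  # 5th element, 2 bytes each
print(offsets)                      # {'SCORE1': 0, 'RAW2': 20, 'SCORE2': 90, 'RAW13': 100}
print(fifth_score1, fifth_score2)   # 16 98
print(position)                     # 104
```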

Using a loop, an internal array will store the elements:
index  column SCORE1  column SCORE2
    0       46839552              0
    1         590806              5
    2         563768              0
    3         554433              7
    4         302720              6

So, an output table can define these columns from the internal array.
The table index will be displayed using a virtual RANK column, taken from the array index (src="index") and increased by 1 (format="+1").
[XML]
<output>
  <table>
    <column id="RANK" src="index" format="+1"/> <!-- implicit format definition //-->
    <column id="SCORE" format="sum_scores"/>
  </table>
</output>
<format id="sum_scores">
  <sum>
    <column id="SCORE1" format="*10"/> <!-- implicit format definition //-->
    <column id="SCORE2"/>
  </sum>
</format>
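
To make the table definition concrete, the sketch below derives the RANK and SCORE columns from the two array columns listed earlier. It only illustrates the arithmetic ("+1" on the index, "*10" on SCORE1, then the sum); it does not reflect hi2txt internals.

[Python]
```python
# Sketch: derive RANK and SCORE from the internal array columns.
score1 = [46839552, 590806, 563768, 554433, 302720]
score2 = [0, 5, 0, 7, 6]

rows = [
    (index + 1, s1 * 10 + s2)  # RANK = index + 1, SCORE = SCORE1 * 10 + SCORE2
    for index, (s1, s2) in enumerate(zip(score1, score2))
]

for rank, score in rows:
    print(f"{rank}|{score}")
```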

Looking at the in-game snapshot, some data are still missing: NAME, AREA, SPACESHIP, POWERUP, MAXHIT.
Let's take NAME.
These data seem to be stored in bytes index 20 to 49, so 6 bytes per name (20->49 = 30 bytes, 30 / 5 = 6).


[XML]
<structure>
...
  <loop count="5"><elt size="6" type="text" id="NAME"/></loop>
...
</structure>

We need to understand how to map these bytes to values in an ascii table to display these names.
Looking at the hexa data, every odd byte of a name (1st, 3rd, 5th) is set to 01.
NAME1: 01C0 01A4 0194 => ? => PIE
If we skip them, 3 bytes remain, for 3 letters in the in-game snapshot: a good step.


[XML]
<elt size="6" type="text" id="NAME" byte-skip="odd"/>

We are still far from the real name :)
NAME1: C0 A4 94 => ? => PIE
Let's put the letters next to their hexa and decimal codes, as well as the targeted ascii decimal, to see if some transformations can be highlighted.

Letter  hexa  decimal  all decimal  transformation 1  decimal / 4  transformation 2  ascii decimal  target ascii decimal
E       94    148      148                            37           +32               69             69
F                      152          +4                38           +32               70             70
G                      156          +4                39           +32               71             71
H                      160          +4                40           +32               72             72
I       A4    164      164          +4                41           +32               73             73
J                      168          +4                42           +32               74             74
K                      172          +4                43           +32               75             75
L                      176          +4                44           +32               76             76
M                      180          +4                45           +32               77             77
N                      184          +4                46           +32               78             78
O                      188          +4                47           +32               79             79
P       C0    192      192          +4                48           +32               80             80

So, 2 transformations can be applied: ascii step = 4 (divide the byte value by 4) and ascii offset = 32 (then add 32).
[XML]
<elt size="6" type="text" id="NAME" byte-skip="odd" ascii-step="4" ascii-offset="32"/>
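
The two transformations can be verified on NAME1 with a short Python sketch. The function name decode_name and its parameters are hypothetical; the code only reproduces the arithmetic from the table (skip the 01 bytes, divide by 4, add 32).

[Python]
```python
# Sketch: decode NAME1 by hand, following the steps found above.
raw_name = bytes.fromhex("01C001A40194")

def decode_name(data, step=4, offset=32):
    """Skip the 1st, 3rd and 5th bytes, then map each byte to byte / step + offset."""
    kept = data[1::2]  # keeps C0, A4, 94 and drops the 01 bytes
    return "".join(chr(byte // step + offset) for byte in kept)

print(decode_name(raw_name))  # PIE
```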

Testing each possible letter, and more specifically the special characters, we find that 2 characters (" ", ".") don't match the standard ascii table with this algorithm and need a specific translation, using charset/char/src/dst.
[XML]
<structure>
...
<elt size="6" type="text" id="NAME" byte-skip="odd" ascii-step="4" ascii-offset="32" charset="ddonpach"/>
...
</structure>
...
<charset id="ddonpach">
  <char src="0x00" dst=" "/>
  <char src="0x38" dst="."/>
</charset>
...
<output>
    <table>
      ...
      <column id="NAME"/>
    </table>
</output>

Now, focusing on MAXHIT shows that we again have a sequence of 5 values, each on 2 bytes and encoded in base 16, just before SCORE2.


[XML]
<structure>
  ...
  <loop count="5"><elt size="2" type="int" id="MAXHIT" base="16"/></loop>
  ...
</structure>
<output>
  <table>
    ...
    <column id="MAXHIT"/>
  </table>
</output>

SPACESHIP and POWERUP are quite similar, but the 2 sequences of elements are interleaved in one loop and stored on only 1 byte each, before MAXHIT.

[XML]
<structure>
...
<loop count="5">
  <elt size="1" type="int"  id="SPACESHIP" base="16"/>
  <elt size="1" type="int"  id="POWERUP"   base="16"/>
</loop>
...
</structure>

<output>
    <table>
      <column id="SPACESHIP"/>
      <column id="POWERUP"/>
    </table>
</output>

If we want to display user-friendly information instead of integers for SPACESHIP and POWERUP, we can define other formats using a case/src/dst list.
[XML]
<output>
  <table>
    ...
    <column id="SPACESHIP"        format="spaceship"/>
    <column id="POWERUP"          format="powerup"/>
  </table>
</output>
<format id="powerup">
  <case src="0" dst="SHOT"/>
  <case src="1" dst="LASER"/>
</format>

<format id="spaceship">
  <case src="0" dst="RED"/>
  <case src="1" dst="BLUE"/>
  <case src="2" dst="GREEN"/>
</format>
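
Conceptually, a case list behaves like a lookup table. The sketch below shows the idea in Python; the fallback for a value without a matching case (returning the value unchanged) is an assumption made for the illustration, not documented hi2txt behaviour.

[Python]
```python
# Sketch: the two case lists expressed as lookup tables.
SPACESHIP = {0: "RED", 1: "BLUE", 2: "GREEN"}
POWERUP = {0: "SHOT", 1: "LASER"}

def apply_case(mapping, value):
    """Return the label for value, or the value itself as text if no case matches.

    The fallback is an assumption for this illustration.
    """
    return mapping.get(value, str(value))

print(apply_case(SPACESHIP, 2))  # GREEN
print(apply_case(POWERUP, 1))    # LASER
```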

Last data is AREA.
The area is composed of 2 elements: the LOOP and the STAGE, located after the NAMEs.


[XML]
<structure>
  ...
  <loop count="5"><elt size="2" type="int"  id="LOOP"/></loop>
  <loop count="5"><elt size="2" type="int"  id="STAGE"/></loop>
  ...
</structure>

The display algorithm is somewhat complicated.
The area concatenates the loop and the stage with "-", only for a loop > 1.
For a loop <= 1, the area is the stage alone.
Also, if all loops and stages have been completed, the game displays "ALL" instead of "<loop>-<stage>".

So, first, the column AREA will use 2 formats, applied sequentially:
one ("area") to concatenate loop and stage, and one ("area_all") to handle the specific case "ALL".
[XML]
<output>
  <table>
    ...
    <column id="AREA" format="area;area_all"/>
    ...
  </table>
</output>
<format id="area">
  <concat>
    <column id="LOOP"  format="default_loop;-"/>
    <column id="STAGE" format="default_stage"/>
  </concat>
</format>

<format id="area_all"><case src="254-7" dst="ALL"/></format>

The LOOP part will also use 2 formats: one to ignore loop values <= 1 ("default_loop") and one ("-") to append a suffix to the loop, if the loop value is not empty.
[XML]
<format id="default_loop">
  <case src="0" dst=""/>
  <case src="1" dst=""/>
</format>
<format id="-"><suffix>-</suffix></format>

The STAGE just uses a default value of 1 when its value is 0.
[XML]
<format id="default_stage"><case src="0" dst="1"/></format>
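
Put together, the chain of formats for AREA amounts to the following logic. This Python sketch follows the order described above (default_loop, "-" suffix, default_stage, concat, then area_all); the function name format_area and the sample inputs are hypothetical.

[Python]
```python
# Sketch: the AREA format chain written as one function.
def format_area(loop, stage):
    loop_text = "" if loop in (0, 1) else str(loop)   # default_loop
    if loop_text:                                      # "-" suffix, only if not empty
        loop_text += "-"
    stage_text = "1" if stage == 0 else str(stage)     # default_stage
    area = loop_text + stage_text                      # area (concat)
    return "ALL" if area == "254-7" else area          # area_all

print(format_area(254, 7))  # ALL
print(format_area(2, 3))    # 2-3
print(format_area(0, 5))    # 5
```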

Finally, 4 bytes remain at the end of the file; they are modified each time a new high score reaches the top.
So, we have here an element storing the TOP SCORE (= SCORE index 0 in the scores array).
This data can be displayed as an extra field, only visible when all data are requested (-ra), to keep the default output as simple as possible.

[XML]
<structure>
...
  <elt size="4" type="int" id="TOP SCORE" base="16"/>
</structure>
<output>
  ...
  <field id="TOP SCORE" format="*10" display="extra"/>
</output>

Seeing the top score, we should expect an additional array of 5*2 bytes defining the last part of the score.
So, we can assume that the hiscore.dat location is not accurate enough and fails to store the full top score...

And that's it! Here is the full XML description:
[XML]
<?xml version="1.0" encoding="utf-8" standalone="no"?>
<!DOCTYPE hi2txt SYSTEM "hi2txt.dtd">
<hi2txt>
  <structure file=".hi">
    <check>
      <!-- optional: if defined and hiscore.dat is provided, it allows selecting the right structure version for the provided file //-->
      <definition>0:1016ea:64:00:05
                  0:101626:4:00:06</definition>
      <!-- optional: if defined, it allows selecting the right structure version for the provided file
                     useful if multiple structures are defined, but hiscore.dat is not provided //-->
      <size>104</size>
    </check>
    <loop count="5"><elt size="4" type="int"  id="SCORE1"    base="16"/></loop>
    <loop count="5"><elt size="6" type="text" id="NAME"      byte-skip="odd" ascii-step="4" ascii-offset="32" charset="ddonpach"/></loop>
    <loop count="5"><elt size="2" type="int"  id="LOOP"/></loop>
    <loop count="5"><elt size="2" type="int"  id="STAGE"/></loop>
    <loop count="5"><elt size="1" type="int"  id="SPACESHIP" base="16"/>
                    <elt size="1" type="int"  id="POWERUP"   base="16"/></loop>   
    <loop count="5"><elt size="2" type="int"  id="MAXHIT"    base="16"/></loop>
    <loop count="5"><elt size="2" type="int"  id="SCORE2"    base="16"/></loop>
                    <elt size="4" type="int"  id="TOP SCORE" base="16"/>
  </structure>
   
  <output>
    <table>
      <column id="RANK" src="index" format="+1"/>
      <column id="SCORE"            format="score"/>
      <column id="NAME"/>
      <column id="AREA"             format="area;area_all"/>
      <column id="SPACESHIP"        format="spaceship"/>
      <column id="POWERUP"          format="powerup"/>
      <column id="MAXHIT"/>
    </table>
    <field id="TOP SCORE" format="*10" display="extra"/>
  </output>
   
  <format id="+1"><add>1</add></format> <!-- not strictly necessary //-->
  <format id="*10"><multiply>10</multiply></format> <!-- not strictly necessary //-->
  <format id="score">
    <sum>
      <column id="SCORE1" format="*10"/>
      <column id="SCORE2"/>
    </sum>
  </format>
  <format id="default_loop">
    <case src="0" dst=""/>
    <case src="1" dst=""/>
  </format>
  <format id="-"><suffix>-</suffix></format>
  <format id="default_stage"><case src="0" dst="1"/></format>
  <format id="area">
    <concat>
      <column id="LOOP"  format="default_loop;-"/>
      <column id="STAGE" format="default_stage"/>
    </concat>
  </format>
  <format id="area_all"><case src="254-7" dst="ALL"/></format>
  <format id="powerup">
    <case src="0" dst="SHOT"/>
    <case src="1" dst="LASER"/>
  </format>
  <format id="spaceship">
    <case src="0" dst="RED"/>
    <case src="1" dst="BLUE"/>
    <case src="2" dst="GREEN"/>
  </format>
 
  <charset id="ddonpach">
    <char src="0x00" dst=" "/>
    <char src="0x38" dst="."/>
  </charset>
</hi2txt>

When hi2txt decodes these high scores, the output will be something like this:
RANK|SCORE|NAME|AREA|SPACESHIP|POWERUP|MAXHIT
1|468395520|PIE|ALL|RED|SHOT|359
2|5908065|OSD|1|BLUE|LASER|96
3|5637680|PIE|2-3|GREEN|SHOT|139
4|5544337|H.S|1|BLUE|LASER|96
5|3027206|PIE|5|RED|SHOT|170

TOP SCORE
468395520

And a front-end can then easily display (with some fancy colors):
RANK     SCORE NAME AREA SPACESHIP POWERUP MAXHIT
   1 468395520  PIE  ALL       RED    SHOT    359
   2   5908065  OSD    1      BLUE   LASER     96
   3   5637680  PIE  2-3     GREEN    SHOT    139
   4   5544337  H.S    1      BLUE   LASER     96
   5   3027206  PIE    5       RED    SHOT    170
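
As an illustration of the front-end side, the pipe-separated table can be turned into the aligned display with a few lines of Python. This is a minimal sketch, assuming the output is available as text and that right-aligned columns are wanted; the function name align is hypothetical.

[Python]
```python
# Sketch: right-align the pipe-separated table produced above.
raw_output = """RANK|SCORE|NAME|AREA|SPACESHIP|POWERUP|MAXHIT
1|468395520|PIE|ALL|RED|SHOT|359
2|5908065|OSD|1|BLUE|LASER|96
3|5637680|PIE|2-3|GREEN|SHOT|139
4|5544337|H.S|1|BLUE|LASER|96
5|3027206|PIE|5|RED|SHOT|170"""

def align(text):
    """Split each line on "|" and pad every cell to its column width."""
    rows = [line.split("|") for line in text.splitlines()]
    widths = [max(len(cell) for cell in column) for column in zip(*rows)]
    return [
        " ".join(cell.rjust(width) for cell, width in zip(row, widths))
        for row in rows
    ]

lines = align(raw_output)
print("\n".join(lines))
```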