While I crack on with looking at the code above address $d3a5, I thought I might share a little about the reverse engineering process. Armed with the disassembly listing of the ROM generated by dis1600, this is largely a process of adding comments and inferring the purpose of the code by visualising its effects. For example, if we follow the mysterious jump to $db87 we identified yesterday, we find that it repeatedly calls this little code fragment at $dba7:
L_DBA7: PSHR    R5          ; DBA7 0275          [.]
        CMPI    #$a, R0     ; DBA8 0378 000A     [..]
        BC      L_DBB0      ; DBAA 0201 0004     [..]
        ADDI    #$10, R0    ; DBAC 02F8 0010     [..]
        B       L_DBB2      ; DBAE 0200 0002     [..]
L_DBB0: ADDI    #$17, R0    ; DBB0 02F8 0017     [..]
L_DBB2: SLL     R0, 2       ; DBB2 004C          [L]
        SLL     R0, 1       ; DBB3 0048          [H]
        XORR    R3, R0      ; DBB4 01D8          [.]
        MVO@    R0, R4      ; DBB5 0260          [.]
        PULR    R7          ; DBB6 02B7          [.]
It is possible to sanitise a few commands (replacing the PSHR and PULR with BEGIN and RETURN at the start and end of the subroutine) and add some comments, giving this:
L_DBA7: BEGIN               ;
        CMPI    #$a, R0     ;
        BC      L_DBB0      ; if R0 >= 10, goto L_DBB0
                            ;
        ADDI    #$10, R0    ; add 16 to R0
        B       L_DBB2      ;
                            ;
L_DBB0:                     ;
        ADDI    #$17, R0    ; add 23 to R0
                            ;
L_DBB2:                     ;
        SLL     R0, 2       ; multiply by 8
        SLL     R0, 1       ;
        XORR    R3, R0      ; merge in R3
        MVO@    R0, R4      ; write result to the address held in R4
        RETURN              ; done
With some knowledge of how Intellivision graphics work and the characters in the GROM, it is then possible to infer the purpose of a small block like this and enhance the description further giving:
; ------------------------------------------------------------------------------
; Write a 4 bit value to the screen as a single hex character
; Inputs:  R0 = hex value
;          R3 = text colour
;          R4 = screen address
; Outputs: None
L_DBA7: BEGIN               ;
        CMPI    #$a, R0     ;
        BC      L_DBB0      ; if 4-bit value >= 10, goto L_DBB0
                            ;
        ADDI    #$10, R0    ; use GROM 0-9 alphabet (add $10 to
        B       L_DBB2      ; the value)
                            ;
L_DBB0:                     ;
        ADDI    #$17, R0    ; shift to GROM A-F alphabet (add $17
                            ; to the value)
L_DBB2:                     ;
        SLL     R0, 2       ; convert GROM value to BACKTAB value
        SLL     R0, 1       ;
        XORR    R3, R0      ; merge in colour and any other attributes
        MVO@    R0, R4      ; write the resulting character to screen
        RETURN              ; done
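As a cross-check on the comments above, here is a minimal Python sketch of what the routine appears to compute. The function name is made up for illustration, and it assumes the GROM card numbers follow ASCII minus $20 (so '0' is card $10 and 'A' is card $21), which is what the two ADDI constants suggest.

```python
def hex_digit_to_backtab(value, attributes=0):
    """Hypothetical model of the routine at $dba7: 4-bit value -> BACKTAB word."""
    if not 0 <= value <= 0xF:
        raise ValueError("expected a 4-bit value")
    # CMPI / BC / ADDI: pick the GROM card for '0'-'9' (+$10) or 'A'-'F' (+$17)
    card = value + (0x17 if value >= 10 else 0x10)
    # SLL by 2 then by 1: move the card number up three bits
    word = (card << 3) & 0xFFFF
    # XORR R3, R0: merge in the colour and any other attribute bits
    return word ^ attributes


def backtab_to_ascii(word):
    """Recover the character, assuming GROM card = ASCII code - $20."""
    return chr(((word >> 3) & 0xFF) + 0x20)


if __name__ == "__main__":
    for v in (0, 9, 10, 15):
        w = hex_digit_to_backtab(v, attributes=7)
        print(f"{v:2d} -> ${w:04X} -> {backtab_to_ascii(w)}")
```

Running the model over all sixteen inputs and getting back the characters 0-9 and A-F is a quick way to gain confidence that the inferred purpose is right.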
At this point we can back-track to the mystery function located at $db87 which calls the subroutine at $dba7. This function now also makes sense and can be documented as follows:
; ------------------------------------------------------------------------------
; Write a 16-bit value to the screen as a 4 character hex number
; Inputs:  R1 = value
;          R3 = colour
;          R4 = screen address of first hex digit
; Outputs: None
L_DB87: BEGIN               ;
        MOVR    R1, R0      ; clone value in R1
        SWAP    R0, 1       ; isolate the most significant 4-bit
        SLR     R0, 2       ; nibble / hex digit
        SLR     R0, 2       ;
        ANDI    #$f, R0     ;
        JSR     R5, L_DBA7  ; call $dba7 to write it to the screen
                            ;
        MOVR    R1, R0      ; isolate the second nibble / hex digit
        SWAP    R0, 1       ;
        ANDI    #$f, R0     ;
        JSR     R5, L_DBA7  ; call $dba7 to write to the screen
                            ;
        MOVR    R1, R0      ; isolate the third nibble / hex digit
        SLR     R0, 2       ;
        SLR     R0, 2       ;
        ANDI    #$f, R0     ;
        JSR     R5, L_DBA7  ; call $dba7 to write to the screen
                            ;
        MOVR    R1, R0      ; isolate the least significant nibble /
        ANDI    #$f, R0     ; 4th hex digit
        JSR     R5, L_DBA7  ; call $dba7 to write to the screen
                            ;
        RETURN              ; done
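The nibble juggling is easier to trust once it has been modelled. The sketch below is a rough Python rendering of the function at $db87, not the original code: the function names are invented, the screen is stood in for by a dictionary, and the digit-to-BACKTAB step again assumes GROM cards follow ASCII minus $20.

```python
def swap_bytes(word):
    """SWAP R0, 1: exchange the high and low bytes of a 16-bit word."""
    return ((word << 8) | (word >> 8)) & 0xFFFF


def nibbles_as_in_rom(value):
    """Isolate the four hex digits using the same steps as the listing."""
    first = (swap_bytes(value) >> 4) & 0xF   # SWAP, SLR 2, SLR 2, ANDI #$f
    second = swap_bytes(value) & 0xF         # SWAP, ANDI #$f
    third = (value >> 4) & 0xF               # SLR 2, SLR 2, ANDI #$f
    fourth = value & 0xF                     # ANDI #$f
    return [first, second, third, fourth]


def write_hex_word(value, colour, screen, address):
    """Hypothetical model of $db87: write four hex digits from `address` on."""
    for nibble in nibbles_as_in_rom(value):
        card = nibble + (0x17 if nibble >= 10 else 0x10)
        screen[address] = ((card << 3) & 0xFFFF) ^ colour
        address += 1  # MVO@ via R4 post-increments the pointer
    return address


if __name__ == "__main__":
    screen = {}
    end = write_hex_word(0x12AB, 7, screen, 0x278)
    text = "".join(chr((screen[a] >> 3) + 0x20) for a in range(0x278, end))
    print(text)
```

Because R4 is one of the CP-1610's auto-incrementing registers, the four calls land in four consecutive screen locations without the caller ever touching the pointer, which the model mimics with `address += 1`.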
And that is it as far as understanding code is concerned. Wash, rinse, repeat. Only a couple of thousand instructions to do.
The other item of work when reverse engineering is inferring the use of RAM addresses. For example, in the following fragment it looks suspiciously as though the addresses $d28c and $d28d are being used to hold the value of an "address of interest" in the game code under test. Perhaps they could represent a cached value of the CPU's current program counter?
L_D9EE: ... skip a bit ...
                            ;
        SDBD                ; load address stored in $d28c, $d28d
        MVII    #$d28c, R4  ; into R5
        SDBD                ;
        MVI@    R4, R5      ;
        MVI@    R5, R1      ; load content of address in R5 to R1
        PSHR    R5          ; store address content loaded was from + 1
        MVII    #$7, R3     ; load white colour
        MVII    #$278, R4   ; load screen address
        JSR     R5, L_DB87  ; write 16-bit value to the screen
                            ;
        PULR    R5          ; restore R5 (address of interest + 1)
        MOVR    R1, R2      ; copy data to R0 and R2
        MOVR    R1, R0      ;
        MOVR    R5, R4      ; move address + 1 to R4
        JSR     R5, L_DBB7  ; interpret the opcode as a mnemonic
In this fragment, the 16-bit value currently stored in $d28c and $d28d is used as an address to load data from. The data read is then written to the start of the seventh line of the screen (row 6 counting from zero, since BACKTAB starts at $200 and each row is 20 characters wide), before the code heads off trying to interpret its value as a mnemonic. As already suggested, we can infer from this that the data retrieved is assumed to be a CPU instruction (otherwise why interpret it as a mnemonic?), and that the addresses it was read from hold a pointer to program code, probably the program counter.
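Two bits of arithmetic sit behind that description, and both are easy to sketch in Python. The first models how an SDBD-prefixed read assembles a 16-bit value from the low bytes of two consecutive locations (low byte first); the second converts a BACKTAB address into a row and column. The function names and the sample memory contents are invented for illustration only.

```python
BACKTAB_BASE = 0x200   # first screen location
COLUMNS = 20           # characters per row


def read_sdbd_pointer(memory, address):
    """Combine the low bytes of two consecutive locations, low byte first."""
    low = memory[address] & 0xFF
    high = memory[address + 1] & 0xFF
    return (high << 8) | low


def backtab_position(address):
    """Return (row, column), both counted from zero."""
    return divmod(address - BACKTAB_BASE, COLUMNS)


if __name__ == "__main__":
    # Made-up workspace contents: a pointer to $5034 split across two words
    memory = {0xD28C: 0x34, 0xD28D: 0x50}
    print(hex(read_sdbd_pointer(memory, 0xD28C)))
    print(backtab_position(0x278))
```

With these sample values the pointer comes out as $5034, and the screen address $278 works out to row index 6, column 0 when rows are counted from zero.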
Additionally, it suggests that the region of memory where $d28c and $d28d reside is workspace for the debugger, and is therefore RAM rather than ROM. Interestingly, $d28c and $d28d fall within the block of zeroed memory we identified yesterday from $d279 to $d3a5, which seems to mark the boundary between WCB code and the debugger. Perhaps this block is all workspace for the debugger? It makes sense that the debugger would have its own workspace, so as not to interfere with the state of the game. If so, this entire block must be RAM rather than ROM, which may explain why it is predominantly initialised to zero.
We can probably go further. Because we might want to change the WCB code or data under test for diagnostic purposes, the full address space should probably be implemented as RAM when using the debugger. Unfortunately, this leaves the door open to the twin evils of dynamic and self-modifying code within the debugger. Both have the potential to make reverse engineering more challenging, as the game code executed may be tweaked by the debugger and might therefore differ from that in the assembly listing.
Finally, because the 16-bit address is split across addresses $d28c and $d28d, we can infer that the RAM used is intended to be 10 bits wide. 16-bit memory would not require this split, and 8-bit memory is too narrow to store the 10-bit wide instructions of the CP-1610 CPU.
I hope this little example shows that this kind of reverse engineering is really 90% perspiration and only 10% inspiration. Nevertheless, I think it can be a fun puzzle, in a similar way to things like Sudoku.