Bug hunt!



I thought I might take a look at the bug Mike Minkoff was trying to squash inside WCB.  You never know, modern tools like JzIntv might be more persuasive than those at Mike's disposal.  I'm not sure how far this will get as time is running out, and this is secondary to trying to get the document describing Rick's debugger finished, but let's see.

Mike's concern seems to have been the CPU stack, hence the check added at $52ad, which generates a diagnostics screen if the stack pointer is greater than $316.


So, let's tweak JzIntv and get it to report how the "highwater mark" of the stack changes as WCB is played.  The following dirty hack to op_exec.c will do the necessary.

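    int fn_MVO_Nr   (const instr_t *instr, cp1600_t *cp1600)
    {
        /* Dirty hack: remember the highest stack pointer (R6) seen so far. */
        /* Static, so the high-water mark persists between calls.           */
        static int maxR6 = 0;

        if (instr->opcode.decoded.reg0 == 6
            && cp1600->r[instr->opcode.decoded.reg0] > maxR6) {

            maxR6 = cp1600->r[instr->opcode.decoded.reg0];
            /* The stack grows upwards, so report R6 + 1 (the value after */
            /* the push) along with the program counter, R7.              */
            printf(">>> Max stack increased to %d by code at %d\n",
                maxR6 + 1, cp1600->r[7]);

        }

    ...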

And sure enough we get something like this being reported as the game is played:

    >>> Max stack increased to 781 by code at 4436
    >>> Max stack increased to 782 by code at 4561
    >>> Max stack increased to 783 by code at 5183
    >>> Max stack increased to 784 by code at 5163
    >>> Max stack increased to 785 by code at 5183
    >>> Max stack increased to 786 by code at 5163
    >>> Max stack increased to 787 by code at 5183
    >>> Max stack increased to 788 by code at 5183
    >>> Max stack increased to 789 by code at 5183
    >>> Max stack increased to 790 by code at 5163
    >>> Max stack increased to 791 by code at 5183
    >>> Max stack increased to 792 by code at 5183
    >>> Max stack increased to 793 by code at 5163
    >>> Max stack increased to 794 by code at 5183

The final decimal stack pointer value of 794 is equivalent to $31a in hex.  Now this is interesting, because it is larger than the value of $316 that is checked for by Mike's code, but in this instance the diagnostic screen was not shown and the game did not crash.  Whilst the stack pointer has progressed beyond $316, it is never higher than this value at the moment the check is made, and all seems to go well.  So perhaps a stack pointer value above $316 is a necessary condition for the crash, but not a sufficient one?
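Since the hack prints everything in decimal, a small conversion helper makes the log easier to compare against the disassembly. This is only a sketch; `format_addr` is a name made up for illustration and is not part of JzIntv.

```c
#include <stdio.h>

/* Format a 16-bit value the way addresses are written in this post,
 * e.g. 794 -> "$31a".  format_addr is a hypothetical helper, not a
 * JzIntv function. */
static const char *format_addr(unsigned value, char *buf, size_t len)
{
    /* Mask to 16 bits: CP1610 registers and addresses are 16 bits wide. */
    snprintf(buf, len, "$%x", value & 0xFFFFu);
    return buf;
}
```

Fed the values from the log above, 794 comes out as $31a, and the two addresses that dominate the output, 5183 and 5163, come out as $143f and $142b.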

Following on from the previous investigation of WCB crashes we have an ability to quickly play specific computer vs computer games and assess what happens.  Of the 65535 games we played out last time, we can pick 100 that crashed and 100 that didn't and see what the stack is up to in both cases.

Looking first at the 100 games that don't crash: in all cases, the high tide mark of the stack is one of $318, $319 or $31a, and this value is set by the PSHR R5 instruction at $143f in the EXEC.  So in all cases WCB could trigger the diagnostics screen.  It doesn't, because the timing of the diagnostic check does not align with the highest values of the stack.
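The difference between a continuously tracked high tide mark and a check that only samples the stack pointer at certain moments can be sketched in a few lines. Everything here is an assumption for illustration: the trace structure, the function name and any trace values you feed it are invented, and only the $316 threshold comes from Mike's check.

```c
#include <stddef.h>

#define CHECK_THRESHOLD 0x316u  /* threshold used by the check in WCB */

/* One step of a made-up trace: the stack pointer, and whether the
 * threshold check happens to run at that moment. */
typedef struct {
    unsigned sp;
    int      check_runs;
} trace_step_t;

typedef struct {
    unsigned high_water;   /* highest sp seen anywhere in the trace    */
    int      check_fired;  /* nonzero if a check saw sp > threshold    */
} trace_result_t;

/* Walk the trace once, tracking both the true high tide mark and
 * whether any sampled check would have tripped. */
static trace_result_t analyse_trace(const trace_step_t *steps, size_t count)
{
    trace_result_t result = { 0u, 0 };
    size_t i;

    for (i = 0; i < count; i++) {
        if (steps[i].sp > result.high_water)
            result.high_water = steps[i].sp;
        if (steps[i].check_runs && steps[i].sp > CHECK_THRESHOLD)
            result.check_fired = 1;
    }
    return result;
}
```

With an invented trace where the stack peaks at $31a between two checks that both see lower values, `high_water` ends up as $31a while `check_fired` stays 0, which is the pattern seen in the games that don't crash.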

Next, the 100 games that generate a diagnostic screen.  Whenever the debug screen is drawn, the high tide mark is between $32b and $330, well above the threshold of $316.  Whatever the final high tide value, it is established by the PSHR R5 instruction at $1669.  It is also interesting to note where the code was when it first pushed the stack beyond the highest known safe value of $31a.  There are six discrete addresses: $1008 (40 instances), $100c (8 instances), $143f (1 instance), $52bd (2 instances), $db87 (43 instances) and $dba7 (6 instances).

Now the last two of these stand out: both $db87 and $dba7 are in the debugger code!  We know this is only called from the main game when writing out Mike's diagnostic screen.  This suggests that in some cases the stack test is aligning with stack values no higher than $31a, and the diagnostics screen is being triggered before the stack has ever gone beyond the highest known safe value of $31a.  In these circumstances, could writing the diagnostics screen be making things worse?  That is not clear.  Although $31a is the highest value currently known that might not result in a crash, $31b is not necessarily the lowest stack value that might generate a crash.  At present this value is not known.
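Recording "where the code was when it first went beyond $31a" needs only a little extra state alongside the high tide mark. The following is a minimal sketch of that bookkeeping, not the code actually used for these runs; the structure and function names are invented for illustration.

```c
#define SAFE_LIMIT 0x31au  /* highest stack value seen in non-crashing games */

/* Hypothetical bookkeeping for one game. */
typedef struct {
    int      crossed;        /* nonzero once sp has exceeded SAFE_LIMIT   */
    unsigned first_pc;       /* PC of the push that first exceeded it     */
    unsigned high_water;     /* highest sp seen so far                    */
    unsigned high_water_pc;  /* PC of the push that set the high water    */
} crossing_tracker_t;

/* Call once per push with the resulting stack pointer and the PC of
 * the pushing instruction. */
static void note_push(crossing_tracker_t *t, unsigned sp, unsigned pc)
{
    if (sp > t->high_water) {
        t->high_water    = sp;
        t->high_water_pc = pc;
    }
    if (!t->crossed && sp > SAFE_LIMIT) {
        t->crossed  = 1;
        t->first_pc = pc;
    }
}
```

Starting from a zeroed `crossing_tracker_t`, `first_pc` is latched by the first push that takes the stack above $31a and is never overwritten, while `high_water_pc` keeps following whichever instruction set the latest maximum.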

There is also another category of games: those that end with a crash, but with no diagnostic screen.  There aren't many of these; only 40 out of the 65535 games tested in 2017 end this way.  Their stack behaviour is very consistent: the high tide mark is $31c, established by the PSHR R5 at $143f, with the value first pushing through $31a as a consequence of the PSHR R1 at $142b.

I think the next step is to try to establish whether the diagnostic test and screen is actually making things worse...

Comments

  1. I have my theory about the stack test in the code: That code path is likely meant to be invoked from a specific context, such as an EXEC dispatch, and so the usual stack depth should be relatively fixed. It would only exceed the $316 threshold if the EXEC had been "reentered" inappropriately, e.g. by taking a new interrupt before returning from the previous one.

    Thus, the threshold can be set much lower than the actual top of stack at that point in the code.

    Replies
    1. ....and, I just looked, the code at $52B2 is an ISR vector, which actually gives a slightly different twist on my theory.

      Either it's re-entering the ISR before exiting it, or there's a call-stack in the code that doesn't leave enough stack for the ISR to operate cleanly. Based on previous experience, I still think the most likely situation is that the ISR re-entered itself.
