Monday, 13 November 2017

On Yer Bike! (Or at least, My Bike)...

Since it has been over 2 weeks since the last update, I thought I'd better post something.

Whilst I have been focusing on life off the computer lately (namely an 82km fun/charity bicycle ride) I have managed to make some progress. What I haven't made any progress on, however, is the elusive 'pass right through some objects' bug.

As a first step towards compiled sprites, I computed the pre-shifted bitmaps for the asteroids. Still largely unoptimised, they at least eliminate the need for a pair of table look-ups for every byte rendered on the screen, and the associated set-up calculations required for that. The extra data also required that I start to use the lower 16KB of the cartridge ROM area.
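
For readers unfamiliar with the technique, here is a minimal sketch of pre-shifting, written in Python rather than 6809 assembly. It assumes one bit per shift position and a sprite stored as rows of bytes; the function names (`preshift_row`, `preshift_sprite`) and the sample data are made up for illustration and are not taken from the game. Each shifted copy is one byte wider than the source row, which is where the extra ROM goes.

```python
def preshift_row(row, shift):
    """Shift one sprite row right by `shift` bits (0-7).

    The result is one byte wider than the input so that the bits
    shifted out of the last byte are not lost.
    """
    if not 0 <= shift <= 7:
        raise ValueError("shift must be 0..7")
    # Treat the row as one big-endian integer, append an empty byte
    # on the right, then move the whole row right by `shift` bits.
    value = int.from_bytes(bytes(row), "big") << (8 - shift)
    return list(value.to_bytes(len(row) + 1, "big"))


def preshift_sprite(rows, step=1):
    """Build a table {shift: shifted_rows} for every `step`-th shift.

    `step` is an assumption for illustration: a mode with more than
    one bit per pixel would only need shifts at pixel boundaries.
    """
    return {s: [preshift_row(r, s) for r in rows] for s in range(0, 8, step)}


if __name__ == "__main__":
    sprite = [[0x81, 0x42]]          # hypothetical one-row sprite
    table = preshift_sprite(sprite)
    for shift, rows in table.items():
        print(shift, [f"{b:02X}" for b in rows[0]])
```

At render time the low bits of the X coordinate select a ready-made copy from the table, so no per-byte shifting or look-up is needed.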

Unfortunately I can't find the speed/profile numbers that I thought I'd recorded for the game before changing the asteroids. I could quite easily restore a copy from version control, but I'm too lazy to do that at the moment. FTR though, the start of the attract mode is now running about 33% too slow, keeping in mind that I've only changed the asteroids, and they're still not compiled sprites.

One thing I did notice when counting cycles in my new code though was that there may be room for improvement in my Knight Lore sprite rendering routine. I had previously assumed the post-incrementing index instructions were the most efficient but depending on the situation, it may be faster to use a series of 5-bit constant offset instructions and then adjust the index register (LEA) afterwards.
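
The trade-off can be put into numbers with a small cycle-count sketch (Python, for illustration only). The cycle figures below are the 6809 indexed-mode timings as recalled from the Motorola data sheet and should be checked against it before relying on them; the example of reading N consecutive words through X with LDD is an assumption, not the actual Knight Lore routine.

```python
# Assumed 6809 timings (verify against the data sheet):
# base cycles for an indexed instruction, plus extra cycles per mode.
BASE = {"LDD": 5, "LEAX": 4}
EXTRA = {
    "no_offset": 0,   # ,X
    "offset5": 1,     # n,X with a 5-bit constant offset (-16..+15)
    "inc2": 3,        # ,X++
}


def cycles_post_increment(n_words):
    """N x 'LDD ,X++'."""
    return n_words * (BASE["LDD"] + EXTRA["inc2"])


def cycles_constant_offsets(n_words):
    """'LDD ,X' then 'LDD 2,X', 'LDD 4,X', ... followed by one 'LEAX n,X'.

    Only valid while every offset (and the final LEAX adjustment)
    still fits in the 5-bit range.
    """
    first = BASE["LDD"] + EXTRA["no_offset"]
    rest = (n_words - 1) * (BASE["LDD"] + EXTRA["offset5"])
    adjust = BASE["LEAX"] + EXTRA["offset5"]
    return first + rest + adjust


if __name__ == "__main__":
    for n in range(1, 6):
        print(n, cycles_post_increment(n), cycles_constant_offsets(n))
```

Under these assumed timings the two approaches break even at two words per run, and the constant-offset form pulls ahead from three words onward, which is consistent with "depending on the situation".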

Next step is to produce compiled sprites for the asteroids and see if that makes much difference. If my theories are correct though, it probably won't make as much difference as pre-shifting the data.

Friday, 27 October 2017

Flicker-be-gone! (mostly)

No progress on the collision-detection bug so I decided to implement the double-buffering (page flipping) since it is almost trivial and won't be affected by any other programming issue.

The arcade machine uses a pair of so-called "ping-pong" buffers to allow the DVG to render the frame whilst the CPU is building the next frame. This comes in very handy indeed on the raster ports (Apple IIGS, Coco3) when erasing the previous frame before rendering the current frame.

Of course double-buffering requires erasing the frame prior to the previous frame. The easiest way to implement this with the current architecture is to extend the 2x buffers to 4x and modify the "ping-pong" logic slightly. No more than a handful of instructions in a few strategic locations...
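
A minimal sketch of the idea, in Python for illustration. It assumes the four display lists are indexed by the low two bits of a frame counter, so that the list to erase (the one drawn on the same video page two frames ago) is found by XOR-ing the index with 2. The names (`list_indices`, `simulate`) and the use of sets to stand in for display lists and video pages are assumptions, not the actual 6809 code.

```python
NUM_LISTS = 4  # the original pair of ping-pong buffers, doubled


def list_indices(frame):
    """Return (build, erase, page) for a given frame number."""
    build = frame & (NUM_LISTS - 1)  # list filled in for this frame
    erase = build ^ 2                # list built two frames earlier
    page = frame & 1                 # video page being drawn into
    return build, erase, page


def simulate(frames_objects):
    """Yield (page, contents) after each frame is erased and rendered."""
    lists = [set() for _ in range(NUM_LISTS)]
    pages = [set(), set()]
    for frame, objects in enumerate(frames_objects):
        build, erase, page = list_indices(frame)
        lists[build] = set(objects)
        pages[page] -= lists[erase]   # erase what this page showed two frames ago
        pages[page] |= lists[build]   # render the current frame
        yield page, set(pages[page])


if __name__ == "__main__":
    for page, contents in simulate([{"a"}, {"b"}, {"c"}, {"d"}, {"e"}]):
        print(page, sorted(contents))
```

With only two lists, the list needed for the erase would already have been overwritten by the frame rendered on the other page.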

At this point the game is running quite slowly due to the sub-optimal (to put it mildly) erase/render code, so there's little point synchronising the page flipping to the VBLANK and therefore the video still exhibits some flicker. However, it is much improved and gives a taste of things to come...

UPDATE: Tonight I thought I'd add some profiling code before starting on any more of the optimisations. When you first start a game (with 4 asteroids on-screen) it's hovering around 55fps. When things get a lot busier, it's down around 20fps, and the lowest I've encountered is 13fps. And when there's all-but-nothing to render, it hits 89fps.

Will be interesting to see where it goes from here...

UPDATE 2: I've just optimised the copyright rendering. The copyright is unique in that it is rendered every frame, in a fixed location, and therefore never needs to be erased.

After some experimentation, and without resorting to stack blasting (which I can't see being optimal in this case due to the OR'ing operation), I came up with the following for each line of 4 words (Y is the video address):

LDD #0x1234
LDD #0x5678

That's the best I can come up with late on a Friday night (37 -> 22/24 cycles/line). Improvements welcome!

Tuesday, 24 October 2017

A progress report on lack of progress.

Today I had the chance to review the collision-detection code. The bad news is that I couldn't see anything amiss. I did find a few minor issues to do with ADC/ADD but they don't appear to be the cause of the bug. I also managed to effect a few minor optimisations.

I'm still hanging my hat on an issue with the core code, rather than the display mapping. I say that because a lot of the time the shots and objects are spot-on - even the small asteroids and small saucer - but then a shot will pass right through the middle of the large saucer. That's not just a few pixels off... more like a logic bug.

Aside from revisiting the collision-detection code again, I don't have any further theories on the matter. This might turn out to be a tough one.

I've been holding off on the optimisations up until now for a few reasons. One, it's simply nice to have the rest of the porting 100% complete. Two, it's easier to tweak things like display mapping with brain-dead code. And finally, I didn't want to find myself in the situation where I had to re-optimise certain routines because something fundamental wasn't quite right.

Having said all that, I'm wondering whether it is actually safe to press on with the optimisations now and revisit this issue down the track - assuming it doesn't have anything to do with display mapping. I don't want to get bogged down debugging this and lose momentum (again).

At least part of the optimisation - double buffering - won't be affected either way and should be relatively straightforward. From there it gets more involved with compiled sprites, but I could get a start on objects such as text and the copyright message.

Have to think about it...

Monday, 23 October 2017

Even more display tweaks, looks even better. Still not perfect...

More tweaks to the display mapping, and it's improved even more. There was an offset added in the core (arcade) code when adding the CUR command to the DVG display list that must be peculiar to the vector hardware; I needed to remove that offset to enable objects to use the entire 192 lines of the display. Norbert didn't have this issue because he all-but ignores the display list, except for rendering text, and his seemingly arbitrary offset (after scaling) accounts for it - which now makes sense, of course.

I've also added clipping to the screen for all objects except the exploding ship. My explosion rendering code differs quite a bit from Norbert's; he uses a generic pixel-plot routine that handles the clipping (it'll render pixels outside the visible display on line 191, which is odd). I will probably not bother with clipping the explosion until I optimise the graphics for the Coco3.

After a few more hours of coding and comparing the Atari and Coco3 versions, I'm convinced there's still an issue with collision-detection (apparent in the video below). I'm reasonably sure it's not the display mapping now as sometimes the shots go straight through the middle of the large saucer, and occasionally you can't seem to hit the smallest asteroids. I simply cannot reproduce either bug on Norbert's emulator.

So now I need to go back and review the collision code, which is not altogether surprising since it pretty much "worked" straight away. In fact I'm hoping that is the issue because otherwise everything else seems spot-on now, and I can definitely move on to Coco3 optimisations once this issue is sorted - something I've been itching to do for some time now.

A few more fixes and it's looking good... but not perfect.

Another brief update; I seem to only get to work on it in snippets atm...

I've fixed the ship explosion offset (same as ship offset). I also fixed a long-standing bug in the erase routine for the extra life indicators.

It's looking pretty spot-on now as far as object placement goes, although occasionally a shot appears to pass right through an object. I've played a bit of Norbert's Atari emulator, and it doesn't appear to have this issue. So either there's a subtle bug in the 6809 core code, or there are some more offsets that I haven't noticed yet. Tonight I went through the rendering routines in Norbert's code again, and I don't see any more offsets applied.

From here I'll likely render the 1st frame in attract mode and compare all the plot positions for each object on Atari/Coco3. If they match (they're all large asteroids of course) I'll let it run for a fixed number of frames until a few get split and try again. Fortunately the attract mode is completely deterministic from a cold start.

After closer scrutiny of Norbert's emulator, a few warts became more apparent.

Norbert has the same bug as I had; the first game from a cold start has 3 lives, and subsequent games start with 4. That's because the original code right-shifts a hardware I/O location mapped to a dipswitch, and checks for carry. Under emulation, that's simply a RAM location so although it's seeded with the correct dipswitch value (bit0=1) at initialisation, it's shifted out after the first game starts. I fixed this in the code (also on the Apple version), since unlike Norbert, I have the luxury of assembling the core.
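
The bug is easy to reproduce in a few lines of Python (an illustrative model, not the actual code). The `Machine` class and its `fix` flag are hypothetical; the "fix" shown here simply tests the bit without writing the shifted value back, which is only one possible way of doing it.

```python
class Machine:
    """Models the dipswitch location as plain RAM, as under emulation."""

    def __init__(self, dip_value=0x01, fix=False):
        self.dip = dip_value  # seeded once at initialisation (bit0=1)
        self.fix = fix

    def start_game(self):
        carry = self.dip & 1  # the bit a right-shift would move into carry
        if not self.fix:
            # A right-shift performed on a RAM location writes the result
            # back, so bit 0 is gone for every later game. On the real
            # hardware the location is a read-only port and the write
            # has no effect.
            self.dip >>= 1
        return 3 if carry else 4


if __name__ == "__main__":
    buggy = Machine()
    print([buggy.start_game() for _ in range(3)])   # first game differs
    fixed = Machine(fix=True)
    print([fixed.start_game() for _ in range(3)])   # same every game
```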

The Atari version is also only rendered every 3rd frame, because the CPU is simply too slow to render every frame and have the game run at full speed (which I must admit, I still don't have a value for). At least, Norbert's code isn't anywhere near as optimised as it could be (an observation, not a criticism). That's fine for a busy screen, but when you're down to only a few small asteroids, the game is obviously too fast. There's no periodic interrupt throttling the game speed.

With any luck, I'm not too far off being in a position to start the Coco3 optimisations...

Saturday, 21 October 2017


Super-brief update... I've fixed the ship/shot offset issue. Nothing sinister at all - simply forgot the '#' character when adding the fixed pixel offsets to the accumulator!
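
For anyone who hasn't been bitten by this one: in 6809 assembly, `#` marks an immediate operand, and without it the assembler treats the same number as a memory address. A small Python model of the difference, with made-up function names and memory contents:

```python
def add_immediate(acc, operand):
    """Models 'ADDA #n': add the constant itself to the accumulator."""
    return (acc + operand) & 0xFF


def add_from_memory(acc, operand, memory):
    """Models 'ADDA n' (no '#'): add whatever byte lives at address n."""
    return (acc + memory[operand]) & 0xFF


if __name__ == "__main__":
    memory = bytearray(256)
    memory[3] = 0x7E                 # hypothetical contents of address 3
    x = 10
    print(add_immediate(x, 3))               # the intended offset of 3
    print(add_from_memory(x, 3, memory))     # offset by the byte at address 3
```

The code assembles cleanly either way, and the resulting offset depends on whatever happens to be stored at that address.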

Need to re-check the other offsets now, and fix the display issue at the bottom of the screen, and possibly add Y clipping - and I can then move on to optimisation for the Coco3!

Friday, 20 October 2017

Code that isn't executed has no bugs!

Very brief update.

After five days away from the keyboard I decided there was only one possible reason for the erase ship routine not working... and I was right! Because I cut-and-pasted the render routine and used it for the erase by replacing the video writes with CLR, it simply wasn't possible for it to fail. And it wasn't actually failing at all - I simply wasn't calling it!

I have a jump table for the (tokenised) DVG commands, and for the previous iteration without any constant offsets, the ship was erased (for now) by a call to the generic erase_chr routine, since the ship is no larger than a character. So when I added new code to the render_ship routine that moved it...

Right now I'm where I was at with the IIGS version, with the exploding ship graphics as well. In short, all the rendering is done - it's just not done at exactly the right position on the screen. The shots are coming out near the nose of the ship, but it's still offset by a pixel or two in some orientations. Why is a complete mystery to me, as both the ship and shot appear to have a constant offset applied to them.

Most of the asteroid hits seem spot-on too, but occasionally I've seen a shot pass right through the middle of a saucer.

I was hoping things would "just work" with the scaling sorted but it appears there's something still amiss. I just hope it's not buried in the Atari display list code, because that's all Greek to me...

I'm tempted to forge ahead with the Coco3 optimisations at this point, but something is nagging at me to get it exactly right this time before moving on. At least once all the offsets are deduced and coded I can back-port to any Apple version(s)!