Amiga Demo, Part 7

May 03, 2019

The seventh part of this blog series about the development of our demo. This part is focused on line and polygon streaming encoder and decoder (player).

#Introduction Polygon streaming was popularized by Spaceballs with their demo ‘State of the Art’ and sequal ‘9 Fingers’.

The technique has later been used by various groups in various ways. Basically line/polygon-streaming is a frame by frame animation of coordinates. In this case the coordinates are normally already transformed and projected. Thus effectively removing quite a bit of expensive math operations. Left is rasterization.

#Encoding The encoder is responsible for storing the data. It takes some input (in this case a list of polygons or lines), reorder the data and applies some transformation before writing it to disk.

The encoding process is controlled by the demo system. This allows the demo system to encode the full demo with sync preserved. This was actually a side effect as I wanted to move as much redundant code as possible to the demo system.

It was critical for the encoder to store data in such a way that decoding can be done without memory allocations and in realtime on a frame-per-frame basis. Much like an encoded video stream.

##Polygon encoding The polygon encoding used in Dark Goat Rises is very crude and not particulary advanced. We simply store all visible triangles on a per frame basis. Each frame contains the number of polygons and then a list of polygons. Each polygon is written with a color index and then vertices sorted along the Y-axis (skips a few steps during polygon setup). We use 10:6 fixpoint for each component.

##Line encoding Encoding lines is in theory quite simple. However, we wanted to encode lines which could be partially obscured - hidden line through Z-buffer. This made things a tad bit more complicated. A special version of the line drawer was created. This line-drawer was recording aware. Instead of plotting pixels it reports visibility states to the encoder. A pixel can either be hidden or visible (depending on the Z buffer value).

In essence this is done:

begin line
draw pixels
is pixel visible -> yes / no
repeat 2 until finished
end line

The encoder will track the changes between yes/no and record the state changes. The actual line-drawer looks like this (omitting setup):

    rec_begin_line(lineid);
    for (int i=0 ; i<step ; i++)
    {
        plot_zcheck_reg(pixmap, x, y, z);
        x += dx;
        y += dy;
        z += dz;
    }	
    rec_end_line();

The ‘lineid’ is used to track a specific line through multiple frames.

The ‘put_pixel’ function (with state handling) for the line-drawer looks like this:

static void plot_zcheck_reg(GOA_PIXMAP8 *pixmap, float x, float y, float z) {
	if (0 <= x && x < pixmap->width && 0 <= y && y < pixmap->height) {
		int ofs = (pixmap->width * (int)floor(y)) + (int)floor(x);
		if (floor(z) < pixmap->zbuffer[ofs]) {
            // Signal visible pixel to recorder
			rec_line_pixel_visible(x,y,primarycolor);
			pixmap->image[ofs] = primarycolor;
			pixmap->zbuffer[ofs] = floor(z);
		} else {
            // Signal hidden pixel to recorder
			rec_line_pixel_hidden(x,y,primarycolor);
		}
	}
}

The ZBuffer value must be filled in. But the color could be skipped. It was however nice to have the rendered bitmap when debugging problems and verifying encoding/decoding.

The encoder sets the visibility state to no at ‘begin line’. When the visibility is changed the coordinate is recorded. If the state is changed from visible -> hidden a segment is emitted. It is therefore possible that a single line can have two partial lines - or two segments.

When the line is finished all segments for a particular line are pushed on the encoder queue. When a frame is finished the frame is pushed on the frame queue. When the frame is flushed to disk all frames are packed and stored. If a segment has multiple line segements (part of line obscured) they are still tracked as belonging to the same line.

The initial encoder was dumb and simple stored all segments on a per frame basis. However this was way too much data and it didn’t compress very well. Instead we decided to try temporal (time based) compression.

For temporal compression to work fine you need to know which line you operate on. Internally we gave each line a line-identifier. This identifier MUST be same for each frame. Generally you can just use a “counter” as long as each line gets the same value every frame.

As the lines can be ‘broken’ by the Z-buffer there are no clear start/end points. In fact there can be multiple sub-segments for a single line. Also, when you apply delta compression you will get error propagation over time and you need to reset this error. This is done by recording a full version of the segment. In the end, we ended up classifying the line segments like this:

I-Segment, full segment
D-Segment, delta segment (from previous)
B-Segment, broken segment

The I and B Segments (full segment) are encoded with 10:6 fixpoint. While the D-Segments are encoded with 4:4. The first frame is always encoded with full I-Segments. Subsequent frames takes previous frame as input and tries to compute D-Segments where possible. A D-Segment computation is rejected if: The D-Segment encoding is rejected if:

Previous segment was broken (i.e. contained multiple sub-segments)
Distance to previous is larger > 16 pixels
Segment not visible in previous frame
More than 16 frames passed since last Intra encoding.
Current segment is broken

Otherwise a delta is computed between the current position of the vertices and the previous. The delta is encoded as 4:4 and stored.

##Storing A frame is basically divided into 5 sections.

D-Segment Visibility bitmap
I-Segment Visibility bitmap
D-Segments
I-Segments
B-Segments

A frame is stored in the same order as well with an added frame header in front.

The visibility bitmaps are cruicial in order to index the correct line-segment when applying delta calculations during decoding. There is one bit for each line in the animation. The visibility bitmap also pack’s very well with bit-RLE applied.

Another option would be to store an index for each line-segment but that would blow up the file size.

#Decoding It is a requirement to do decoding in realtime otherwise the decoded data would require too much RAM. We also know how many unique lines the whole animation will use so we can avoid memory allocations during decoding. Our decoding process looks like:

Read a frame (and verify - check the frame marker)
Decode D-Segments and compute new coordinates
Decode I-Segments
Decode B-Segments

    // D-Segment decoding
    for (int i=0;i<anim->nUniqueSegments;i++) {
        if (anim->dsegvisibility[i]) {
            if (anim->lines[i].frameMarker != idxPreviousFrame) {
                printf("\n ERROR: Delta decoding to non-existing line: %d\n", i);
            } 	
            // delta stored as 4:4 - this will also apply the delta
            gnk_lpa_decode_fp4_4_segment(f, &anim->lines[i]);
            // Push to draw-queue
            anim->toDraw[anim->nToDraw++] = &anim->lines[i];
            anim->lines[i].frameMarker = anim->nCurrentFrame;
        }
    }

I-Segments are decoded in more or less the same way. Once the I/D segments have been decoded the trailing data contains all B-Segments which are decoded like I-Segments but not saved as references.

In order to avoid Disk I/O (and disable I/O Interrupts) the animation is loaded fully from disk to memory and then streamed from RAM frame by frame.