Polygon filling using tiles

April 02, 2020

Writing software polygon rasterization using tiles instead of the regular scanline approach.

Introduction

Note: CCW (Counter Clock Wise) edge winding throughout!! This is very focused on the actual implementation, for the theory there are several good reads out there.

Good introduction:

https://www.scratchapixel.com/lessons/3d-basic-rendering/rasterization-practical-implementation/rasterization-stage

Fabian ‘ryg’ Giesen:

Intel Software Occlusion: https://software.intel.com/en-us/articles/software-occlusion-culling

Background

There are basically two approaches to software rasterization of triangles.

Scanline
Tilebased

Scanline is very easy to grasp on a conceptual level and while fairly easy to get running it can also get quite hairy. Scanline rendering is also not suitable for parallel rendering.

Tile based rendering is a slightly more difficult concept but once understood the rendering is pretty neat. And it is very suitable for parallel rendering.

The good things:

No division required
Only integer math required
No states needed

Concept

Read the links in the Background!!!!

TL;DR;

Working with screen space (pixel) coordinates
Edge function properties (lambdas) can be used to interpolate over the triangle
Checking the sign determines inside/outside of the pixel
Edge deltas doesn’t need to be divided by length

Basic Implementation

The basic implementation goes something like this:

Determine the bounding box for the triangle
Compute the edge factors
Compute edge deltas
Iterate over the bounding box

I assume you have somekind of bitmap structure and a function (putpixel) which can draw a pixel of a certain color.

To determine the bounding box we simply find the min/max of the three vertices.

    // C/C++ don't have "integer max/min" - I just reuse these for the heck of it
    #define fmin3(a,b,c) (fmin(fmin(a,b),c))
    #define fmax3(a,b,c) (fmax(fmax(a,b),c))

    int minX = fmin3(v0->x, v1->x, v2->x);
    int minY = fmin3(v0->y, v1->y, v2->y);
    int maxX = fmax3(v0->x, v1->x, v2->x);
    int maxY = fmax3(v0->y, v1->y, v2->y);

Edge factor, you will have to compute this factor for all three edges. Remember the winding is important. DO NOT mix CW/CCW winding.

static int edgeFactor(const Point2D *a, const Point2D *b, const Point2D *c) {
    return ((b->x - a->x)*(c->y - a->y) - (b->y - a->y)*(c->x - a->x));    
}

// Inside your filler...
Point2D p = {minX, minY};   // Upper left of our bounding box

int w0_scan = edgeFactor(v1, v0, &p);
int w1_scan = edgeFactor(v0, v2, &p);
int w2_scan = edgeFactor(v2, v1, &p);

Take a very close look here - because this is the key to understand how we can split this into multiple tiles. The edge factor is computed for a specific POINT relative to the edges. Basically we could compute this for every pixel on the screen, checking the sign of the factors and draw if they are all positive. Not very effective but very compact.

It basically gives you the starting point for edge interpolation relative to the screen point.

Edge deltas, are simply the edge vectors made up from the vertices.

    // A is used per scanline, B for horizontal
    int A10 = v1->y - v0->y, B10 = v0->x - v1->x;
    int A02 = v0->y - v2->y, B02 = v2->x - v0->x;
    int A21 = v2->y - v1->y, B21 = v1->x - v2->x;

Interpolate and draw

    for (int y = minY; y <= maxY; y++) {
        int w0 = w0_scan;
        int w1 = w1_scan;
        int w2 = w2_scan;
        for (int x = minX; int x <= maxX; int x++) {
            // If p is on or inside all edges, render pixel.
            if ((w0 | w1 | w2) >= 0) {
                putpixel(target, x, y, color);
            }
            w0 += A10;
            w1 += A02;
            w2 += A21;
        }

        w0_scan += B10;
        w1_scan += B02;
        w2_scan += B21;

    }

Note on inside/outside check:

            if ((w0 | w1 | w2) >= 0) {
                putpixel(target, x, y, color);
            }

The sign-bit is the MSB bit. If ANY number is negative this bit will be set, otherwise it will be zero. As we don’t care about the actual value we can simply bit wise OR everything togehter and check if the sign bit got there or not.

Notes on the above

This basically just creates a single tile around one triangle and draws it. But the principles of the edgefactors relative to pixels are exactly what makes this algorithm possible to split in to independent tiles for rendering.

Many tiles

Splitting the above principle into rendering many small tiles requires the following:

Tag each tile with the triangles that intersects with it
Draw each tile independently

I have defined a tile structure to hold data for each tile. It’s not needed but makes debugging easier.

typedef struct {
    int ix, iy;     // index of tile in grid
    int x,y;        // tile position in pixels (screen coordinates)
    int w, h;       // width/height of tile in pixels
    int nTriangles;
    Triangle *triangles[MAX_TRI_PER_BIN];
} Tile;

The triangle looks like:

typedef struct {
    const Point2D *v0, *v1, *v2;
    uint32_t col;

    // Bounding box - constant, will not change after setup
    int minX, maxX;
    int minY, maxY;

    // Edge factors - constant, will not change after setup
    int A10, A02, A21;
    int B10, B02, B21;
} Triangle;

The pixel width/height of a tile is a factor of 2. When using a small screen resolution I set this to 32 but you can do 64 or 128 as well. Using a factor of 2 makes it easier when you tag the tiles with which the triangle intersects. To simplify this even further we will just map the bounding box of the triangle to the tiles. And this is just a scaling function.

#define TILE_BITS 5    // tiles are 32x32 pixels
static void gnk_tilefiller_pushtriangle(TileFiller *filler, Triangle *tri) {
    // Scale down bounding box values for triangle
    int yStart = tri->minY >> TILE_BITS;
    int yEnd = tri->maxY >> TILE_BITS;

    int xStart = tri->minX >> TILE_BITS;
    int xEnd = tri->maxX >> TILE_BITS;

    for (int y=yStart;y<=yEnd;y++) {
        for(int x=xStart;x<=xEnd;x++) {
            filler->tiles[x + y * filler->nWidth].triangles = tri;
            filler->tiles[x + y * filler->nWidth].nTriangles++;
        }
    }
}

When the triangles have been tagged to each intersecting tile we can just draw the tiles independently.

static void gnk_tilefiller_filltile(BITMAP *target, Tile *tile) {
    // No triangles in this tile, leave...
    if (tile->nTriangles == 0) {
        return;
    }
    
    int w0_scan;
    int w1_scan;
    int w2_scan;

    // Iterate over all triangles in the tile
    Point2D p;
    int bin_scan = 0;
    for(int i=0;i<tile->nTriangles;i++) {
        Triangle *tri = tile->triangles[i];

        p.x = tile->x;
        p.y = tile->y;

        w0_scan = edgeFactor(tri->v1, tri->v0, &p);
        w1_scan = edgeFactor(tri->v0, tri->v2, &p);
        w2_scan = edgeFactor(tri->v2, tri->v1, &p);

        for(p.y=tile->y;p.y<(tile->y + tile->h);p.y++, bin_scan++) {

            int w0 = w0_scan;
            int w1 = w1_scan;
            int w2 = w2_scan;

            for(p.x=tile->x;p.x<(tile->x + tile->w);p.x++) {

                if ((w0 | w1 | w2) >= 0) {
                    putpixel(target, p.x, p.y, tri->col);
                }

                w0 += tri->A10;
                w1 += tri->A02;
                w2 += tri->A21;
            }

            w0_scan += tri->B10;
            w1_scan += tri->B02;
            w2_scan += tri->B21;
        }
    }
}