For a long time I've wanted to spend some time writing down my recollections of what I did on the TA graphics engine. It was a weird time, just before hardware acceleration showed up. Early hardware acceleration had pretty insane driver overhead; for example, the first Glide API did triangle setup on the CPU because the hardware didn't have it yet, and accelerated transform was out of the question. In any case, none of this was really a factor because that stuff was just showing up while we were working on TA and we couldn't have sold any games on it.
Anyway I met Chris at GDC in 1996 and he fairly quickly offered me a job
working on the game. I had just wrapped up work on Radix a few months
before and was looking for something new since most of the Radix guys were
going back to school.
So I went back to Ottawa and while I waited for visa paperwork to move to
the states I ended up writing Thred which became a whole other story that I'll
talk about some other time. Once the visa paperwork came through I moved to
Seattle at the end of July 1996 just in time for Seafair.
Monday morning rolls around and I start meeting my new co-workers and
getting the vibe. I got a brand new smoking hot 166 MHz Pentium right out of a Dell box. Upgraded to 32 MB of RAM even! That was the first time I
ever saw a DIMM, incidentally. We all ooo'd and ahhh'd over this new
amazing DIMM technology. I was super excited to be there and actually
getting paid too!
I had already done a bit of work remotely so I had a little bit of an idea about
the code but I hadn't seen the whole picture. The engine was primarily
written in C using mostly fixed point math. At that point using floats
wasn't really done but it made sense to start using them. So we did. This
meant we ended up with an engine that was a blend of fixed point and floating
point, including a decent amount of floating point asm code. Ugh.
Jeff and I both tried to rip out the fixed point stuff but it was ingrained too
deep. Oh well.
So my primary challenge on the rendering side was to increase the
performance of the unit rendering, improve image quality and add new features
to support the game play.
The engine was also very limited graphically in a lot of ways because it was using an
8-bit palette. This meant I had to use a lot of lookup tables to do
things like simple alpha blending. Even simple Gouraud shading would
require a lookup table for the light intensity value. Nasty compared to what we do
today. The artists did come up with a versatile palette for the game but 256 colors is still 256 colors at the end of the day.
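To make the lookup-table idea concrete, here's a minimal sketch in C of how a 50/50 blend table can be precomputed for an 8-bit palette. The palette here is a made-up grayscale ramp just to keep the sketch self-contained; TA's real palette and the exact tables we built were of course different.

```c
#include <stdint.h>

/* Hypothetical palette: 256 RGB entries. A grayscale ramp is used here
   purely so the sketch is self-contained. */
typedef struct { uint8_t r, g, b; } Rgb;
static Rgb palette[256];

void init_palette(void) {
    for (int i = 0; i < 256; i++)
        palette[i].r = palette[i].g = palette[i].b = (uint8_t)i;
}

/* Find the palette index whose color is nearest (squared distance) to (r,g,b). */
uint8_t closest_index(int r, int g, int b) {
    int best = 0;
    long best_d = 1L << 30;
    for (int i = 0; i < 256; i++) {
        long dr = palette[i].r - r, dg = palette[i].g - g, db = palette[i].b - b;
        long d = dr * dr + dg * dg + db * db;
        if (d < best_d) { best_d = d; best = i; }
    }
    return (uint8_t)best;
}

/* 64K table: blend_table[a][b] = palette index closest to the 50/50 mix
   of colors a and b. Built once, then blending is a single lookup. */
static uint8_t blend_table[256][256];

void build_blend_table(void) {
    for (int a = 0; a < 256; a++)
        for (int b = 0; b < 256; b++)
            blend_table[a][b] = closest_index(
                (palette[a].r + palette[b].r) / 2,
                (palette[a].g + palette[b].g) / 2,
                (palette[a].b + palette[b].b) / 2);
}
```

The expensive nearest-color search happens offline; at render time a blend costs one table read per pixel.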
Getting all of the units to render as real 3d objects was slow.
Basically all of the units and buildings were 3d models. Everything else
was either a tiled terrain backdrop or what we called a feature which was just
an animated sprite (e.g. trees).
So there were a few obvious things to do
to make this faster. One of them was to cache the 3d units by turning them into sprites
which could be rendered a lot more quickly. For a normal unit like a tank we would cache off a bitmap that contained the
image of the tank rendered at the correct orientation (we call this an imposter
today; look up Talisman). There was a caching system with a pool that we could
ask to give us the bitmap. It could de-allocate to make room in the cache
using a simple round robin scheme. The more memory your machine had the
bigger the cache was up to some limit. We would store off the orientation
of that image and then simply blt it to the screen to draw the tank. If a
tank was driving across flat terrain at the same angle we could move the bitmap
around because we used an orthographic projection. Units sitting on the
ground doing nothing were effectively turned into bitmaps. Wreckage too.
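A sketch of what that caching pool might look like, with hypothetical names and a tiny fixed slot count; the real cache was sized from available memory, but the round-robin eviction idea is the same:

```c
#include <stdint.h>

/* Illustrative imposter cache: a fixed pool of bitmap slots,
   evicted round-robin when no slot matches. */
#define CACHE_SLOTS 8

typedef struct {
    int     unit_id;          /* which unit owns this imposter (-1 = free) */
    int     orientation;      /* snapped facing the bitmap was rendered at */
    uint8_t pixels[64 * 64];  /* 8-bit indexed imposter bitmap             */
} Slot;

static Slot cache[CACHE_SLOTS];
static int  next_victim = 0;  /* round-robin eviction cursor */

void cache_init(void) {
    for (int i = 0; i < CACHE_SLOTS; i++) cache[i].unit_id = -1;
    next_victim = 0;
}

/* Return the cached imposter if unit + orientation still match;
   otherwise evict the next slot in rotation and hand it back for
   re-rendering (caller fills in pixels). *hit reports which case. */
Slot *cache_lookup(int unit_id, int orientation, int *hit) {
    for (int i = 0; i < CACHE_SLOTS; i++) {
        if (cache[i].unit_id == unit_id && cache[i].orientation == orientation) {
            *hit = 1;
            return &cache[i];
        }
    }
    Slot *s = &cache[next_victim];
    next_victim = (next_victim + 1) % CACHE_SLOTS;
    s->unit_id = unit_id;
    s->orientation = orientation;
    *hit = 0;
    return s;
}
```

A unit that sits still keeps hitting the same slot for free; a unit that turns invalidates itself naturally because its snapped orientation changes.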
There was another wrinkle here; the actual units were made from
polygons that had to be sorted. But sometimes the animators would move
the polys through each other which caused weird popping so a static sorting was no
good. In addition it didn't handle intersection at all. So I
decided to double the size of the bitmap that I used and Z-buffer the unit (in
8-bits) only against itself. So it was still turned into a bitmap but at
least the unit itself could intersect, animate, etc. without having to worry about
it. I think at the time this was the correct decision and actually having
a full screen Z-buffer for the game probably also would have been the correct
decision (instead we rendered in layers).
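The per-unit depth test can be sketched like this, assuming an illustrative imposter size; the point is that the z-test only ever runs against the unit's own little buffer, never a full-screen one:

```c
#include <stdint.h>
#include <string.h>

/* Each cached imposter carries its own small 8-bit z-buffer, so a
   unit's polygons can intersect each other correctly. Size is made up. */
#define IMP_W 64
#define IMP_H 64

typedef struct {
    uint8_t color[IMP_W * IMP_H]; /* 8-bit palettized pixels */
    uint8_t depth[IMP_W * IMP_H]; /* 8-bit depth, 255 = far  */
} Imposter;

void imposter_clear(Imposter *imp, uint8_t clear_color) {
    memset(imp->color, clear_color, sizeof imp->color);
    memset(imp->depth, 255, sizeof imp->depth);
}

/* Plot one polygon pixel; smaller z means closer to the camera.
   The z-test is only against this unit itself. */
void imposter_plot(Imposter *imp, int x, int y, uint8_t z, uint8_t c) {
    int i = y * IMP_W + x;
    if (z < imp->depth[i]) {
        imp->depth[i] = z;
        imp->color[i] = c;
    }
}
```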
Now all of this sounds great but there were other issues. For example
a lot of units moving on the screen at the same time could still bring the
machine to its knees. I could limit this to some extent by limiting the
number of units that got updated in any given frame. For example rotation
could be snapped more which means not every unit has to get rendered every
frame. Of course units of the same type with the same transform could
just use the same sprite. Even with everything I could come up with at
the time you could still worst case it and kill performance. Sorry!
I was given a task that was pretty hard and I did my best.
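The rotation snapping above boils down to quantizing the heading so the cached sprite stays valid until the unit crosses into the next step. A sketch, with a made-up step count:

```c
/* Quantize a heading (degrees, >= 0) to one of ROT_STEPS snapped
   orientations. Step count is illustrative; coarser snapping means
   fewer imposter re-renders at the cost of chunkier rotation. */
#define ROT_STEPS 32   /* 360/32 = 11.25 degrees per step */

int snap_orientation(double heading_deg) {
    double step = 360.0 / ROT_STEPS;
    int s = (int)((heading_deg + step * 0.5) / step); /* round to nearest */
    return s % ROT_STEPS;
}
```

Two units of the same type whose headings snap to the same step can share one cached sprite outright.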
Once I had all the moving units going I realized I had a problem. The
animators wanted the buildings to animate with spinney things and other objects
that moved every frame! The buildings were some of the most expensive
units to render because of their size and complexity. By animating even
one part they were flushing the cache every frame and killing
performance. So I came up with another idea. I split the building
into animating and non-animating parts. I pre-rendered the non-animating
parts into a buffer and kept around the z-buffer. Then each frame I
rendered just the animating parts against that base texture using the z-buffer
and then used the result for the screen. In retrospect I could have sped
this up by doing this part on the screen itself but there were some logistical issues
due to other optimizations.
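A sketch of that split, with invented names and sizes: the expensive static geometry is rendered once into a base color + depth buffer, and each frame starts by copying that base before z-testing just the animating polygons against it:

```c
#include <stdint.h>
#include <string.h>

/* Illustrative buffer for a building: the non-animating parts are
   pre-rendered into `base` (color + depth) exactly once. */
#define BUF_W 128
#define BUF_H 128

typedef struct {
    uint8_t color[BUF_W * BUF_H];
    uint8_t depth[BUF_W * BUF_H]; /* 255 = far */
} Buffer;

/* Start a frame by reusing the expensive pre-rendered base wholesale. */
void begin_frame(Buffer *frame, const Buffer *base) {
    memcpy(frame, base, sizeof *frame);
}

/* Draw one pixel of an animating part, depth-tested against the base
   so spinney bits correctly go behind static geometry. */
void draw_animated_pixel(Buffer *frame, int x, int y, uint8_t z, uint8_t c) {
    int i = y * BUF_W + x;
    if (z < frame->depth[i]) {
        frame->depth[i] = z;
        frame->color[i] = c;
    }
}
```

Per frame the cost is one buffer copy plus only the animating polygons, instead of re-rendering the whole building.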
After I had the building split out, the animating stuff split out, the z-buffering
and the caching I still had a few more things I needed to do. I haven't
talked about shadows at all. Unit shadows and building shadows were
handled differently. Unit shadows simply took the cached texture
and rendered it offset from the unit with a special shader (shader haha it was a special blt routine really) that used a
darkening palette lookup. E.g. if there was anything at that texel we'd just
render the shadow there, like an alpha-test type deal. This gave me some extra
bang for the buck in the caching because I had another great use for that
texture and I think the shadows hold up well.
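The shadow blt can be sketched like this, assuming a hypothetical darkening table and treating palette index 0 as transparent (the actual transparent index and table were specific to TA's palette):

```c
#include <stdint.h>

/* The cached unit bitmap acts only as a mask (alpha-test style): for
   each non-transparent source texel, remap the framebuffer pixel
   underneath through a precomputed darkening palette table. */
#define TRANSPARENT 0  /* palette index treated as "nothing there" */

void shadow_blt(uint8_t *dst, int dst_stride,
                const uint8_t *src, int src_stride,
                int w, int h, const uint8_t darken[256]) {
    for (int y = 0; y < h; y++)
        for (int x = 0; x < w; x++)
            if (src[y * src_stride + x] != TRANSPARENT)
                dst[y * dst_stride + x] = darken[dst[y * dst_stride + x]];
}
```

Note the source color is never read for its value, only tested against transparency, which is why one cached bitmap can serve as both the unit and its shadow.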
Not all was well in shadow land when it came to buildings though. Due
to their tall spires and general complexity I decided to go ahead and properly
project the shadows. This ended up significantly increasing the footprint
of the buildings and the fill rate started to become sub-optimal because a
single building could really take up a lot of the screen. Render the
shadow (which overlaps the building and a lot more) then render the building
itself on top and you are just wasting bandwidth. So the next step was to
render the projected shadow, render the building (both into the cache) then cut
out the shape of the building from the shadow and then RLE encode the shadow
since it's all the same intensity. Now rendering consisted of
render the shadow (not overlapping and faster because it's a few RLE spans) and
then render the building. Ahhh... way faster.
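A sketch of the RLE span rendering, with an invented span representation; since every shadow texel has the same intensity, each scanline collapses to a few (x, length) runs and drawing them is just a darkening pass over those runs:

```c
#include <stdint.h>

/* One horizontal run of shadow on a scanline. */
typedef struct { int y, x, len; } Span;

/* Darken each pixel covered by a span, via the same kind of
   darkening palette table used for unit shadows. */
void render_shadow_spans(uint8_t *dst, int stride,
                         const Span *spans, int n,
                         const uint8_t darken[256]) {
    for (int i = 0; i < n; i++) {
        uint8_t *p = dst + spans[i].y * stride + spans[i].x;
        for (int j = 0; j < spans[i].len; j++)
            p[j] = darken[p[j]];
    }
}
```

With the building's own footprint already cut out of the span list, no pixel gets touched twice.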
Now the whole way that TA did texture mapping was just screwed.
Frankly we had no idea what we were doing. Jeff knew it was fucked but it
was already so baked in that it wasn't changeable in the time we had
anyway. I could do a 100x better job at this today, 16 years later.
So we had some pretty serious image quality issues mostly related to
aliasing, especially of textures (there were no UVs; each quad had a texture
with a specific rotation stretched to it). So the one thing I did that I
think worked well is anti-alias the buildings. Basically for the
non-animating part of the building I would allocate a buffer that was double
the size in each dimension. I rendered the building at this larger size
and then anti-aliased that into the final cache. So the AA only happened
once when it got cached which means I could spend some cycles. This only
applied to buildings.
Now doing AA in 8-bits is going to require some sort of lookup table.
Since I had 4 pixels that I wanted to shrink down to 1 pixel I came up with a
simple solution. It's very similar to what we use for bloom-type stuff today,
which is simply separating the vertical and horizontal elements. So the
lookup table took 2 8-bit values and returned a single 8-bit value that
represented the closest color in the palette to the average of the
colors. No I didn't take into account gamma correctly or much else to be
honest. Anyway I would simply do a lookup on the top two pixels and the
bottom two pixels. The results from those two ops were then looked up to
give me the final color, so 3 lookups. It drastically improved the look
of the buildings.
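The 3-lookup downsample might look like this, assuming avg_table is the table just described (two 8-bit colors in, closest-average palette index out):

```c
#include <stdint.h>

/* Shrink a 2x2 block of 8-bit pixels to one pixel in 3 table lookups:
   average each horizontal pair, then average the two results.
   avg_table[a][b] is assumed to hold the palette index closest to the
   average of colors a and b, built offline like the blend tables. */
uint8_t downsample_2x2(uint8_t avg_table[256][256],
                       uint8_t top_left, uint8_t top_right,
                       uint8_t bot_left, uint8_t bot_right) {
    uint8_t top = avg_table[top_left][top_right];  /* average top pair    */
    uint8_t bot = avg_table[bot_left][bot_right];  /* average bottom pair */
    return avg_table[top][bot];                    /* combine vertically  */
}
```

No per-pixel color math at all, which is what made spending the cycles at cache time affordable.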
Except I fucked up and left a bug in there. Ever notice that a lot of
the buildings have a weird purple halo? Basically the table broke when
dealing with the edge and transparency because I didn't have a correct way to
represent that. Then I ran out of time; I think I could have fixed it.
Anyway I wrote some particle system stuff, lighting effect stuff and some
other cool effects that didn't get used (psionics!). But the unit
rendering was by far the most complicated part of the whole renderer and it's
what I ended up spending the most time on.
BTW I still think TA was an amazing game and I'm still interested in pushing
that kind of game more in the future. It seems like every time I do an
RTS a few years later I'm ready to take another shot at it (SupCom was the last
one, I'll do some posts about its engine sometime too).