2015/08/24

Map Objects: Designing the Entity System

As you may have noticed, I've been laying off talking about dynamic objects in a level. Sure, we can animate our tiles and have a camera that you could theoretically move (well, I have a piece of test code that does move it, but I haven't shown that to you and it won't be in the final game anyway), but we don't have NPCs or anything similar.
This is because I wanted to get the rendering pipeline somewhat presentable, and didn't want to talk about entity systems until I absolutely had to. Tile maps don't need to be in the entity system; they represent the space the entities live in. The camera doesn't have to be either, because there will only ever be one, and it will be operated by different pieces of code depending on game state. The virtual screen is explicitly outside of the game world, while all entities are inside.

ECS Architecture

A decade or so ago, people started realizing that using class hierarchies for their game objects was more curse than blessing. Around the same time, the most common performance bottleneck was found to be the CPU's memory cache.
The Entity/Component/System pattern, designed to combat both problems simultaneously, was all the rage until about five years ago, and is commonly considered the definitive method these days. Unity has a somewhat improper implementation without the System part, and as far as I know Unreal uses object composition as well.

An entity is your basic game object, made up of multiple components, flat data structures that contain part of the state of the entity. Systems then operate on all components of a certain type, rather than all entities worrying about themselves - the RenderingSystem draws all entities that have a MeshComponent, the PhysicsSystem moves all entities with a ColliderComponent and performs collision detection.
If you perform the same instructions on all elements of an array, you significantly decrease the probability of encountering a cache miss, granting a boost in performance if that's your bottleneck.
This pattern also decouples the different tasks your engine has to accomplish; you can define an entity to have any combination of components, massively reducing the number of different types of game object you might need. This in turn means that you can come up with new types of entities on the fly (unless, of course, that would require a new type of component or a new system), without writing any code whatsoever.
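In code, the gist of the pattern looks something like this - a generic sketch along the lines of the MeshComponent example above, not this engine's actual types:

    // Components are flat data...
    struct MeshComponent
    {
        public int EntityId;
        public Vector3 Position;
        public Model Model;    // whatever the rendering system needs, as plain data
    }

    // ...and a system runs the same code over every component of one type.
    class RenderingSystem
    {
        // All MeshComponents live in one densely packed array; iterating it in
        // order is what makes the pattern cache friendly.
        public MeshComponent[] Meshes = new MeshComponent[0];

        public void Draw(Matrix view, Matrix projection)
        {
            for (int i = 0; i < Meshes.Length; i++)
                Meshes[i].Model.Draw(Matrix.CreateTranslation(Meshes[i].Position), view, projection);
        }
    }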

Now to rain on the parade, because I'm a cynic who enjoys rain, we really shouldn't expect an advantage in performance. With C# running on the CLR, and calls into MonoGame inside even our hottest loops, there's enough going on outside of our control that the cache will probably get trashed with every iteration. Cache-optimizing the code between these two layers just won't do all that much.
Not that we'd really need to. I expect no more than 30 or so NPCs on any given map, and even fewer other objects. I'll still implement something similar for its other strengths; the performance angle is just something to keep in mind.

Requirements

Now, before we rush in specifying implementation details of a "robust, powerful entity system" for the heck of it, maybe we should look at what we actually want it to do. Once we know what kinds of entities we'll need, we can worry about what components we might need to construct them.

The obvious thing is NPCs: Animated sprites walking about the map that the player can interact with by pressing A while facing them. The specifics of what an NPC does when interacted with should be defined in the event scripting system.
Then there's event spaces: Plain hit boxes that trigger an event script on collision. A special case of this is the warp space, which transports the player to a given position on a given map, potentially after playing a door animation. It sounds like a good idea to map warp spaces to each other so that, when you change one map's layout, you don't have to change warp spaces on another. That would be a nice option, but it wouldn't let you make warps that don't let the player go back the way he came.

Those are actually the three basic event types from the third generation Pokémon engine I'm so used to working with, so they've proven to be enough for the kind of game I'm planning. You could get away with less, though, like the RPG Maker, which combined NPCs and event spaces by providing multiple hooks for scripts (on-collision and on-interact, essentially), and required you to script warp spaces yourself.
And maybe we find that to be the way our script component should work, but at minimum we should provide a parametrized default warp script so you don't have to write six lines of boilerplate script code for every fricking door in the game. I like using the RPG Maker as reference for this sort of stuff, but that's one thing I think it botched badly.

Moving on, there's some more object types I'd like to have to enhance the feel of the game world.
First, there's sound emitters. Ever since I saw a live stream of a guy scripting dozens of events in RPG Maker to alter the volume of an ambient waterfall sound effect as your character moved, I've been in love with this idea. It's common practice in 3D games, where it's important for the suspension of disbelief, less so in 2D games.
I'm not quite so sure about the last one, but particle emitters might come in handy as well. Whether it's malfunctioning electronics spouting sparks, waterfalls spraying water or just chimneys puffing away, some things might be easier to animate this way. It wouldn't be very authentic, though (neither the SNES nor the GBA used many particle effects), and I'm not sure it would actually be worth the effort; I'm just putting the idea out there.

Recomposition

Now we can look at what components we need to make, by finding common vs. distinct properties of these entity types and identifying their purpose.

To start with the obvious: other than NPCs and particle emitters, none of these entity types are actually rendered, and those two have very distinct rendering logic, so we'll make a SpriteComponent and possibly a ParticleEmitterComponent. Similarly, the sound emitter is the only type that directly uses the audio engine (which doesn't actually exist yet, so we'll have to implement this later on), so we'll make that a distinct SoundEmitterComponent.
Next, you may have noticed that all our components make use of the entity's position in the game world. We might make that a LocationComponent that all other components rely on, but at that point, wouldn't it be better to just put that data on the actual entity? Or rather, is something that isn't located in the game world really an entity? Maybe it should just go into the map properties if it's level specific, or into the engine code, if it's an even more generic thing.

Identifying the other components is where it gets more difficult, as they all use the scripting system and rely on collision to some extent. We'll probably want to make a HitBoxComponent for collision testing, with a setting to determine whether the player should pass through (with collisions only reported when you query for them manually) or be pushed back by the physics system.

Another thing only the NPCs do, according to my descriptions, is move and animate. There are different stances you might take on how that should be handled: movement is just what happens when you play the walking animation, or conversely, when you move an NPC, it should automatically play the walking animation. If you've ever worked with FL Studio or other audio software with automation envelopes, you might argue that being able to animate any value on any component is the way to go.
While it's true that that's an incredibly powerful system, especially in music production, where all the knobs affect each other and changes to their values stack exponentially, rippling into the final mix, I don't think that's necessarily true for games, especially considering we're currently trying to keep values from affecting each other too much. Our animation system ought to be simple enough to be effectively used by event scripts, but not so simple that you need a lot of code to accomplish anything with it.
So, since our animations have to be accessed by script code, we might as well push the movement behaviour of an NPC into the script system, with scripts looping in the background, running animations. To complement that, we just need an AnimationComponent to (1) define available animations; and (2) store the currently running animation and its state.
That leaves us with the three script hooks we've identified so far: OnCollision and OnInteraction are triggered during collision testing, while the ParallelProcess just runs in a loop forever. Not every NPC needs a movement behaviour attached, neither do other objects you might want to script, so let's make a separate ParallelScriptComponent.
As for the other two, they are so strongly tied to the collision code that I'll probably just put them on the HitBoxComponent after all.
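Summed up in (entirely provisional) code, the component set we've arrived at might look like this - all names and fields are sketches, not final types:

    class Entity
    {
        public Vector2 Position;                       // lives directly on the entity
        public SpriteComponent Sprite;                 // any of these may be null
        public ParticleEmitterComponent Particles;
        public SoundEmitterComponent SoundEmitter;
        public HitBoxComponent HitBox;
        public AnimationComponent Animation;
        public ParallelScriptComponent ParallelScript;
    }

    class SpriteComponent          { public Texture2D Texture; public Rectangle SourceRect; }
    class ParticleEmitterComponent { /* sparks, spray, smoke... */ }
    class SoundEmitterComponent    { public string Sound; public float Range; }
    class ParallelScriptComponent  { public string Script; }   // looping movement behaviour etc.

    class HitBoxComponent
    {
        public Rectangle Bounds;
        public bool Solid;                  // push the player back vs. just report the overlap
        public string OnCollisionScript;    // the two collision-bound script hooks
        public string OnInteractionScript;
    }

    class AnimationComponent
    {
        public Dictionary<string, AnimationDefinition> Animations; // (1) available animations
        public string Current;                                      // (2) the running one...
        public float Time;                                          // ...and its state
    }

    class AnimationDefinition { public Rectangle[] Frames; public float FrameDuration; }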

Additions and Optimizations

Okay, so I just spent a couple of hours away from my computer, thinking about other stuff, and came back to sanity-check the concept so far. Here are three things I didn't talk or think about the first time round:

Conditionally De/Spawning Entities

If a game allows you to revisit a location multiple times over the course of the narrative, you'd probably expect your progress in the story to have some effect on the game world, especially the characters in it. To do that, RPGs commonly show or hide characters depending on a choice you made or an event you cleared.
We have a general purpose progression system consisting of boolean values and integers that can be set by scripts (in theory, anyway - the scripting thing in general isn't in place yet). Tying an entity to such a value is an easy way to get there; we just need to find the place to check for it. You might get away with checking every frame, since both flags and variables are stored in your everyday array, and it guarantees immediate response, which caching the value doesn't.
Since you may want to use this for any entity, and every system has to perform the check, we don't even have to make this its own component; we can store the relevant data directly on the entity.
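As a sketch, assuming the progression system exposes its flags as a plain bool array (names hypothetical):

    // spawnFlag < 0 means the entity is not tied to any flag and is always present.
    static bool IsSpawned(int spawnFlag, bool[] storyFlags)
    {
        return spawnFlag < 0 || storyFlags[spawnFlag];
    }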

Sharing Animation Definitions

As I specified above, the AnimationComponent should hold definitions for all available animations. Thing is, most NPCs will use the exact same sprite sheet layout with the exact same animation timings and all. We don't have to make shared components a thing; we just need to make AnimationDefinitions into a type of content that can be shared by AnimationComponents.
It seems pretty obvious, so I'm not sure whether me-from-a-couple-hours-ago thought so too and didn't mention it for that reason, or just forgot about it.

Dynamic Object Creation

While pretty much all of this post focused on objects you'd place on a map during level design, the entity system also needs to handle objects that are placed by scripts or even the engine itself, as they come and go.
Prime examples of script placed objects are NPCs that aren't actually on the map and are only spawned for one event, and sound effects that should have an audible location in the world (an explosion, a scream, church bells, whatever).
As for objects placed by the engine, there's the player character itself - placing it in engine code is much better than demanding that every map have an object with a PlayerCharacterComponent or something, I think. There's also a swath of visual effects that I'd implement this way; Pokémon style tall grass, submerging characters in water up to their necks, and reflections on ice floors or wall mirrors, just to name the ones that I think would be cool.

With these effects in particular, we need to consider our technical limitations and our layer model, as well as rendering order in general. But that can wait until the next post, where I'll implement the basis for all this and hopefully the rendering bits, so I can show screen shots again.

2015/08/20

Matrix Transformations in Color Space

When I started programming in C# five years ago, I quickly latched onto XNA for game programming. I didn't have constant internet access back then, so I bought books to learn from instead. The particular book I learned XNA from had a chapter on shader programming. Back then it seemed like magic to me - my school didn't teach me linear algebra until a year or two ago, I forget - and even today, I grasp the general concept, but have little experience, especially with vertex shaders.
However, one sample shader caught my attention recently: The black-and-white shader. It used a dot product operation to convert a color vector to its monochrome luminance. Other than that, and maybe sepia filters, I never really heard of people using transformation matrices on their color vectors. A quick google search reveals that it's not an entirely new concept - that page predates my birth by over three years. There is quite a bit of cool stuff you can do with this, so let's get crackin!

Prerequisites

Before we can work on the color transformations though, we need to linearize our colors. There's a nice chapter of GPU Gems explaining the concept and general importance of that, and since we use additive and subtractive blending for lighting effects, we should be doing that anyway.
I haven't produced any textures for this project yet, so we can, as advised, assume that all loaded textures are already linearized. However, we still need to perform a gamma correction in the post processing effect, and that has to happen in a pixel shader. Additionally, we should apply the calculated transformation matrix as we render, so we'll actually need two different shaders.

I've dabbled in shader programming in HLSL for XNA, but I know that MonoGame uses its own shader system to compile for different platforms. I was pleasantly surprised to find documentation and example code on that at least.
Note that, unless you are using SpriteSortMode.Immediate, applying the Effect manually won't actually work. You need to pass it to SpriteBatch.Begin, but then you'll face a compatibility issue between the sprite batch and your effect: the camera transformation matrix is no longer applied, and neither is the orthographic off-center projection matrix the sprite batch normally sets up internally. You'll have to apply these manually with a vertex shader. Here is a SpriteBatch compatible pass-through effect that does that:
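Something along these lines should do - a sketch rather than battle-tested code, with parameter and technique names of my own choosing:

    #if OPENGL
        #define VS_MODEL vs_3_0
        #define PS_MODEL ps_3_0
    #else
        #define VS_MODEL vs_4_0_level_9_1
        #define PS_MODEL ps_4_0_level_9_1
    #endif

    // Set from C# to: camera * Matrix.CreateOrthographicOffCenter(0, viewportWidth, viewportHeight, 0, 0, 1)
    float4x4 MatrixTransform;

    sampler TextureSampler : register(s0);

    struct VSOutput
    {
        float4 Position : SV_POSITION;
        float4 Color    : COLOR0;
        float2 TexCoord : TEXCOORD0;
    };

    VSOutput PassThroughVS(float4 position : POSITION0,
                           float4 color    : COLOR0,
                           float2 texCoord : TEXCOORD0)
    {
        VSOutput output;
        output.Position = mul(position, MatrixTransform);
        output.Color    = color;
        output.TexCoord = texCoord;
        return output;
    }

    float4 PassThroughPS(float4 color : COLOR0, float2 texCoord : TEXCOORD0) : COLOR0
    {
        // Behave exactly like the default sprite shader: sample and tint.
        return tex2D(TextureSampler, texCoord) * color;
    }

    technique SpriteBatchPassThrough
    {
        pass P0
        {
            VertexShader = compile VS_MODEL PassThroughVS();
            PixelShader  = compile PS_MODEL PassThroughPS();
        }
    }

On the C# side, setting effect.Parameters["MatrixTransform"] to the combined matrix before calling SpriteBatch.Begin should then restore the usual behaviour.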

Color Transformations

Okay, this section is basically about porting Paul Haeberli's color transformations to C#, it's not really my code per se. The first transformation, Changing Brightness/Color Balance is basically provided by XNA, as it only scales the color by the specified parameter; we can use Matrix.CreateScale for that. We can also Apply Offsets to Color Components by using Matrix.CreateTranslation.
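For example (the values here are arbitrary):

    Matrix brightness   = Matrix.CreateScale(1.2f);                     // 20% brighter overall
    Matrix colorBalance = Matrix.CreateScale(1.1f, 1.0f, 0.9f);         // warmer tint
    Matrix offset       = Matrix.CreateTranslation(0.1f, 0.05f, 0.0f);  // constant boost to R and G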

Modifying Saturation

This one is pretty simple: the first variant, Converting to Luminance, is a specific case of the saturation problem, and is solved by using a weighted average of all channels for all channels. To arbitrarily Scale Saturation, you need to interpolate between Matrix.Identity and the luminance matrix. I implemented both in one function, since you can just pass 0 to get the monochrome picture:
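The sketch below captures the idea - the luminance weights are the values Haeberli uses, and it assumes the shader multiplies row vectors (color times matrix):

    public static Matrix CreateSaturation(float saturation)
    {
        const float lr = 0.3086f, lg = 0.6094f, lb = 0.0820f;

        // Every output channel becomes the same weighted sum of the input channels...
        Matrix luminance = new Matrix(
            lr, lr, lr, 0f,
            lg, lg, lg, 0f,
            lb, lb, lb, 0f,
            0f, 0f, 0f, 1f);

        // ...and we blend between "fully grey" (0) and "unchanged" (1).
        // Values outside that range extrapolate instead of interpolating.
        return Matrix.Lerp(luminance, Matrix.Identity, saturation);
    }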
This is pretty cool, because you can actually pass values outside of (0; 1) to get inverted colors at -1 and boosted saturation for values greater than 1.

Shifting Hue

I'll admit that I don't really understand the math behind this either. I mean, I know how the matrices do their job, but I know too little about color spaces and how they relate to each other to be sure about why you need to do these exact operations. Sorry.
To be honest, it's not even that useful an operation. If we're tinting the map and objects to simulate different lighting situations, hue doesn't usually shift by a fixed offset. Rather, it shifts towards a certain value, generally yellow for increased light and blue for decreased light.

Just combining the other three operations gets us quite far, though; these are the exact same scenes, only rendered with a different color transformation matrix:

The ugliest island map in the world, day and night.
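For illustration, the night version of such a matrix might be put together like this, using the saturation helper from above - the numbers are made up on the spot:

    Matrix night = CreateSaturation(0.6f)                       // drain some color
                 * Matrix.CreateScale(0.45f, 0.5f, 0.7f)        // darken, biased towards blue
                 * Matrix.CreateTranslation(0.0f, 0.0f, 0.03f); // and a slight blue offset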

2015/08/18

ABDUCT: About functional programming, patent laws and a dead project

About a week ago, while I was working on the entity system for my game, I grew tired of my code base; I'm not really happy with all the decisions I made, both early on and with my tile map rendering, and it all accumulated into some anxiety over sprite rendering.
So I decided to take a break and come back later instead of burning out and leaving the project unfinished. I didn't want to stop contributing to the game, however, so I considered my other options. I was also kind of obsessed with audio stuff at the time, since I spent so much time researching it just prior, and really wanted to work on authentic BRR compression, with better tool support than the shady tools I found on romhacking.net.
Finally, I wanted to try functional programming, see what all the hype was about. Since audio compression seemed like a great fit, and F# like a good bridge into the paradigm, everything seemed to line up perfectly.

Since tool naming is important, I thought long and hard (heh) about a naming scheme for my own tools, and came up with the Authentic Bit rate reDUCtion Tool, the Barely Usable Rigid Game Level Editor and other silly backronyms for criminal verbs.
Then I got to work, setting up an F# project Abduct.Core, which should do all the heavy lifting, and the C# project Abduct.UI, which should handle all the boring stuff, such as communicating with NAudio, the library I decided to use for handling audio files (and resampling, which, as it turns out, you shouldn't try to do yourself, at least not if you want quality output).

And, as if in the second act of a tragedy, things started to go downhill from here.
The plan was relatively simple: use NAudio to load WAV files and convert them into 16 bit mono. Resample to a user defined sample rate, perform length correction - BRR compressed sounds come in chunks of 16 samples, so you either have to pad in silence for sound effects, or truncate looped samples. Then you apply the actual compression, and decompress again - we're not actually running on a SNES, so the compressed data is of no use to us.
The user should be able to play back a sound at any sample rate (without resampling!) to hear what the sample would sound like after pitch modulation, with or without looping, to examine its behaviour, and finally export the file as an uncompressed wave file.
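The length correction step, in isolation, is simple enough; a sketch in C# terms, assuming 16 bit mono samples in a short[]:

    static short[] CorrectLength(short[] samples, bool looped)
    {
        const int ChunkSize = 16;   // BRR works on chunks of 16 samples
        int remainder = samples.Length % ChunkSize;
        if (remainder == 0) return samples;

        // Truncate looped samples to the previous chunk boundary,
        // pad one-shot sound effects with silence up to the next one.
        int newLength = looped
            ? samples.Length - remainder
            : samples.Length + (ChunkSize - remainder);

        short[] corrected = new short[newLength];
        Array.Copy(samples, corrected, Math.Min(samples.Length, newLength));
        return corrected;
    }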

As mentioned above, I use NAudio to resample, so that's not really an accomplishment. I did get length correction to work, I think. It's currently intertwined with the non-functioning compression code, and even before, it was a bit hard to test.
Converting different wave formats with NAudio involves mucking around with .NET streams. A lot. So much so, that I couldn't find a way to easily do it for any wave file I might load, even with a clearly defined target format. So, for prototyping, I just allowed wave files in the proper format.
Playing back audio with NAudio is also an atrocious process. Well, it is if you have to change the output wave format dynamically because you're trying to modulate pitch.
The real killer of this project, though, was the actual compression, for two reasons:
  1. I didn't bother to really learn the concepts of functional programming. I just jumped in with a new project in Visual Studio and this reference. Inevitably, I encountered a compiler error that I still haven't wrapped my mind around.
  2. As the Wikipedia article on BRR compression mentions - which I didn't notice during my initial research - there are loads and loads of patents involved. I have neither the patience nor the legal literacy to read all of these, and I can't afford (and frankly, don't want to pay) a patent attorney to do it for me. I'm not clear on American patent laws, and the fact that I'm located in Germany doesn't make the whole situation any simpler.
To cut to the chase, even if I did get around all these other problems - bite the bullet and write all that code to properly use NAudio, learn all about the functional paradigm - I still couldn't complete and use the tool with confidence, much less release the code in good conscience.
I will, however, strip away the bits of compression code I wrote so far, get it to a somewhat presentable state, and release that code on GitHub later this week. If anyone from the homebrew and ROM hacking communities is more confident in their legalese (or Nintendo's unwillingness to sue over this stuff), they are of course free to do with it whatever they want.

As for myself, the inability to get accurate SNES sound is a bit of a disappointment. I'm left with just a few options:
  • Skip the compression and only use length correction and resampling. This would leave our samples at a noticeably higher quality than the real hardware could produce - maybe using 8 bit samples could counteract that somewhat.
    • I could, of course, try other compression algorithms, or even create a shitty one myself.
  • Use a chip tunes soundtrack. Most samples qualified for that would barely suffer from the compression anyway. It would also get us closer to the GBA sound, which wouldn't be too bad.
  • Giving up on authenticity and using a CD quality sound track is tempting, but not really my style.
I'll have to think about the specifics of this later. It's 4 am over here, and I need some sleep.
Anyway, that's what I've been up to this week. I'll muck around with functional programming some more, but I'll return to doing engine work soon, promise!

NOTE: I usually write blog posts a week or three in advance, so the posts will keep coming as scheduled; this one is just out of its timeline.

UPDATE: The cleaned up code for this project can now be found on github: https://github.com/teryror/ABDUCT-Audio-Processor.

Handling Multiple Resolutions with a Virtual Screen

My current retro 16 bit style guidelines dictate that my game use a virtual screen of 320x180 pixels. At first, that seems easy enough: when the game is running in windowed mode, we can just dictate the size to be a multiple of that and zoom in accordingly. In full screen at 720p or 1080p, we can just scale by a factor of 4 or 6, respectively. Common laptop resolutions, like 1366x768, are not so easy, but we can give the option to either scale unevenly, by a factor of roughly 4.27 in this case, or scale evenly by a factor of 4, and have a black border around the virtual screen. On displays with different aspect ratios, we generally face the same problem, except even the unevenly scaled version would require letterboxing.

To handle all of these different cases, as well as the very concept of a virtual display, I created the class VirtualScreen. It uses a RenderTarget2D that is a multiple of 320x180 in size, and provides a CaptureFrame method that sets the render target on the GPU, and a DrawFrame method that renders the target on the screen.
The hairy bit is the Refresh method, which sets the size of the render target and calculates a destination rectangle for DrawFrame. There are also the Reset and Save methods: Reset loads the properties of the class from the configuration and calls Refresh, while Save writes the current state of the screen back into the configuration and saves it.

This is the bad boy in question:
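Stripped down to its shape, it looks something like this - member names and the exact option set here are assumptions, not the original code:

    public class VirtualScreen
    {
        public const int Width = 320, Height = 180;

        public int Scale { get; private set; }

        RenderTarget2D renderTarget;   // drawn into by CaptureFrame
        Rectangle destination;         // used by DrawFrame

        public void Refresh(GraphicsDevice device, bool fullScreen, int windowWidth,
                            int windowHeight, bool integerScaling, bool superSampling)
        {
            int screenW = fullScreen ? device.DisplayMode.Width  : windowWidth;
            int screenH = fullScreen ? device.DisplayMode.Height : windowHeight;

            // Largest integer factor by which the virtual screen still fits the display.
            int factor = Math.Max(1, Math.Min(screenW / Width, screenH / Height));

            // Super sampling doubles the render target (and Scale, which the camera
            // picks up) while the destination rectangle stays the same.
            Scale = superSampling ? factor * 2 : factor;

            if (renderTarget != null) renderTarget.Dispose();
            renderTarget = new RenderTarget2D(device, Width * Scale, Height * Scale);

            // Either stick to the integer factor (black borders all around), or
            // stretch as far as the aspect ratio allows (uneven scaling).
            float stretch = integerScaling
                ? factor
                : Math.Min(screenW / (float)Width, screenH / (float)Height);

            int w = (int)(Width * stretch), h = (int)(Height * stretch);
            destination = new Rectangle((screenW - w) / 2, (screenH - h) / 2, w, h);
        }
    }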
It looks pretty hairy at first glance, and it is a bit heavy on the arithmetic side, but the real problem is that it just explodes in complexity as we add rendering options, and there is no design pattern or algorithm to save us here. We could factor out a RefreshFullScreen and a RefreshWindowed function, but that would only take away a couple lines of code and a level of indentation; it wouldn't help with the why-does-this-math-work question.
Case in point: I added a SuperSampling property that doubles the size of the render target while keeping the destination rectangle the same, slightly blurring the pixel art but also smoothing out sprite movement (once we get to that, anyway).

But setting up the back buffer and render target isn't the only thing I wrote this class for.
The Scale property the Refresh method sets before calculating a destination is not configurable. Instead, it's supposed to be used to create transformation matrices for rendering on the virtual display. I now use it in my Camera class in place of its own scale member, so the camera automatically adjusts to changes to the rendering options.
Since we have a texture with the rendered world and UI by the time DrawFrame gets called, we can also do post-processing there, which I'll at least touch on in the next post.
I will also render a Super Game Boy inspired background image behind the render target, to make potential letterboxing less awful to look at. The original The Binding of Isaac did that to great effect, I find.

2015/08/16

Matrix Transformations in 2D Space - The Camera

While I was factoring out the useful bits of the map rendering proto-code, I tested them a lot more thoroughly and fixed a couple of bugs, mostly just missing calls into classes I wanted to plug into the engine. One bug in particular, however, was related to the way we currently rotate tiles:
When we set the origin in our draw call to (8 8), we not only rotate tiles around their center, as I'd assumed, but also shift them to the top left by the same amount. To fix that, I removed the origin parameter and used some evil bit magic again:
This is not today's topic though, so explaining how that works is left as an exercise for the reader.

Doing Math on Camera

With the refactoring done, we could go straight to implementing map objects, but I decided to do some polishing first. We can kill quite a few birds with one stone here, and that stone is a 2D camera.
We usually talk about cameras in 3D games, because their control is usually fundamental to the gameplay. But it's also common to have a camera in a 2D game, because it's very convenient for the programmer: If you can just move the camera and the environment magically scrolls on the screen, there are so many fewer headaches involved in just figuring out where to render a sprite.
The easiest way to implement such magic is with a transformation matrix that you pass to the GPU, which will use that to automatically figure out where sprites should be rendered, and at what size.
XNA/MonoGame also provide us with a Matrix structure, so we don't even have to worry about the math behind the magic too much. We just call a couple static functions, multiply the results together, and pass the product to SpriteBatch.Begin.

However, in case you actually are unfamiliar with the math, I highly recommend you read up on it. There's tons of more focused resources on linear algebra than this blog, and you need it constantly when you're doing any sort of graphics or physics work. In any case, here's a brief rundown of the essentials:
A matrix is an NxM grid of real numbers. The (x y) vectors we've been working with are essentially 1x2 matrices. Apart from vectors, the most common kind of matrix in graphics programming is the 4x4 matrix, which is also what a Matrix in XNA is.
You can use arithmetic operators on matrices; the two most important operations for us are multiplying 1x4 matrices - Vector4 - with 4x4 matrices to get another Vector4, and multiplying one 4x4 Matrix with another to get a third 4x4 Matrix.
The latter lets us combine multiple transformation matrices into one, so we can perform multiple transformations of a vector with just one matrix multiplication.
The former works by calculating four linear combinations of the elements in the original vector, with the columns of the matrix serving as the coefficients for each element in the new vector. Since we mostly work with 2- or 3-component vectors, the remaining elements are padded out, with the last one set to 1.0, which is what gives us the power to do translations, i.e. shifting the vectors around by a constant offset. We can also scale a vector by setting the factors on the main diagonal.

Our camera doesn't even need to do any more than that:
First, we want to zoom in by a factor of 16. This lets us render tiles into destination rectangles of size 1, which should remove a lot of potential *16 and /16 from our code, such as in the bug fix above.
Then, as the camera's position goes to the bottom right, we want to move the scene to the top left. This lets us render the map with the origin at (0 0), no matter which part of the level should currently be focused. We could also easily add a screenshake offset, without anything having to change in our rendering code.
Finally, we want to zoom in some more so our virtual 320x180 screen doesn't look so small on a 1080p display. Having this easily configurable will help us once we actually code the virtual screen to deal with multiple resolutions.

I use a full class TopDownCamera for this, with properties and dirty flags, but at its heart is just this one line of code:
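Roughly like this, assuming Position is the camera's position in tile coordinates and Zoom is the extra factor from the last step (both names are placeholders):

    // Scale by 16 first (tile units -> pixels), then scroll opposite to the
    // camera, then apply the configurable zoom for the display.
    Transform = Matrix.CreateScale(16f)
              * Matrix.CreateTranslation(-Position.X * 16f, -Position.Y * 16f, 0f)
              * Matrix.CreateScale(Zoom);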
Passing that to SpriteBatch.Begin does all the rest. However, there's one other thing we can do, now that we have the camera working:

Polishing Tiles

At this point, I added a Vector4 Viewport property to my camera class that would give me the left, top, right and bottom limits of the visible area of the world. Using that, we can calculate the upper and lower limits of the x and y loops in our DrawLayer method. This has two major advantages:
We only consider tiles that are actually on screen - we can remove map size from the performance equation.
We also get to consider values that are not on the map. This may sound stupid at first, but whenever the camera gets close to the edge of the map, you see Cornflower Blue, or whatever other color you clear your screen with. If the map data contained a set of border blocks, we could render those there instead, and we can easily detect when to do so, depending on whether the x and y indices are within the bounds of the blocks array.

Depending on how much attention you want to give this, determining which tile to render can actually be more difficult. If you just define a single border block, you can just go
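A sketch with hypothetical names - blocks being the map's tile array (indexed [layer, y, x] here) and BORDER_BLOCK the single border block ID:

    bool onMap = x >= 0 && x < blocks.GetLength(2)
              && y >= 0 && y < blocks.GetLength(1);
    ushort id = onMap ? blocks[layer, y, x] : BORDER_BLOCK;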
I, however, wanted the border blocks to be almost full featured maps in their own right. They share the same layer settings and obviously won't get any collision data, but I defined a ushort[,,] blocks, which should be repeatedly rendered around the map.
Finding the proper index into this array is a bit more complicated. You could just take the coordinates modulo the length of the array in the respective dimension, but then you'd still get negative values. Negating these won't work, though, because that would mirror the block pattern above and to the left of the map. However, you can add the length of the array and take the modulo again, which gives the correct pattern, like this:
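    // Wrapping out-of-range (including negative) coordinates into the border
    // block array (called borderBlocks here; the [layer, y, x] ordering is an assumption).
    int bw = borderBlocks.GetLength(2), bh = borderBlocks.GetLength(1);
    int bx = ((x % bw) + bw) % bw;
    int by = ((y % bh) + bh) % bh;
    ushort id = borderBlocks[layer, by, bx];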

Finally, here is the code in action:
The ugliest island map in the world!
The water is all border blocks, by the way.

2015/08/15

Tile Maps: I'll Show You The World

After fixing the hot mess from installing the full release of Visual Studio 2015 without first uninstalling the Release Candidate, which left me unable to start up or uninstall either (if you're having the same problem, try running the installer from the command line with the /u /force flags), I can finally do the rendering code for tile maps.

However, to see if it works, we'll need a tileset and a map. We can hardcode the tile map, but for the tileset, I decided to use KenneyNL's Roguelike tile sets, just for prototyping.

And now my watch begins

To get an idea of where to even start, let's think of when a tile map would be visible.
  • First and foremost, during exploration, when you're controlling your character,
  • during scripted events, when you're not in control, and
  • when you're in an overlaid menu, where you control the cursor, rather than your character.
These are three inherently different game states that share some logic - they are substates to the encompassing TileMapView game state.
Just to get something on screen quickly, I took a quick stab at implementing it. It's chaotic, it's not feature complete, and it's not even documented. It's a refactoring waiting to happen. However, it also renders this:
Not all that pretty. Yet.
I think the code is mostly self-explanatory, but before we go on to add the missing features, let's discuss the evil bit level hacks and other non-obvious choices.

Starting at the top, the MakeTileID method would probably have been a macro, if those existed in C#. It's purely a convenience function for now, but it does document the bit level layout of a tile ID: there are integers X and Y, each ranging from 0 to 63, taking up bits 0 through 5 and 6 through 11, respectively. There's a SpriteEffects in there, which is a field of two flags, and finally a two bit integer called r. That's an index into the ROTATIONS array above, allowing blocks to be rotated individually.
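Reconstructed from that description, the method boils down to something like this (the exact placement of the upper four bits is a guess):

    // Bits 0-5: tile X, bits 6-11: tile Y, then the two SpriteEffects flip flags
    // and the two bit rotation index r in the upper four bits.
    static ushort MakeTileID(int x, int y, SpriteEffects effects, int r)
    {
        return (ushort)((x & 0x3F)
                      | (y & 0x3F) << 6
                      | ((int)effects & 0x3) << 12
                      | (r & 0x3) << 14);
    }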

The constructor is pretty simple initialization. All these loops programmatically fill the map. The last line is there to test the rotation and mirroring features. I tested them individually as well, but here the combination of both flips and a rotation by 180° gives us the original orientation of the tile.

Finally, the DrawLayer method does the heavy lifting: we Begin a SpriteBatch for each layer because, as the additive/subtractive lighting layers go in between, we'll have to begin different batches anyway. The layers in between are also the reason this is its own method in the first place, and not just the body of a loop in Draw.
Then, we declare and initialize our local variables; the source and destination Rectangles for the call to SpriteBatch.Draw, as well as rot, an index into ROTATIONS, and a SpriteEffects variable. I do this outside the loop because it's probably going to be the hottest loop in our code base for quite a while and it's a quite intuitive optimization.
Within the loop itself, we split all the info in the tile ID and update the local variables as necessary, before finally drawing the tile. The Vector2 origin we pass here refers to the center of rotation, relative to the top-left corner of the sprite. Since all tiles are 16x16 pixels, passing (8 8) rotates tiles around their mid-point.
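Put together, the method looks roughly like this - a sketch with assumed field names, not the code verbatim:

    void DrawLayer(SpriteBatch batch, int layer)
    {
        batch.Begin();

        // Declared outside the loop, as described above.
        Rectangle source = new Rectangle(0, 0, 16, 16);
        Rectangle destination = new Rectangle(0, 0, 16, 16);
        SpriteEffects effects;
        int rot;

        for (int y = 0; y < blocks.GetLength(1); y++)
        {
            for (int x = 0; x < blocks.GetLength(2); x++)
            {
                ushort id = blocks[layer, y, x];

                // Split the tile ID back into its parts.
                source.X = (id & 0x3F) * 16;
                source.Y = ((id >> 6) & 0x3F) * 16;
                effects = (SpriteEffects)((id >> 12) & 0x3);
                rot = (id >> 14) & 0x3;

                destination.X = x * 16;
                destination.Y = y * 16;

                // Origin (8, 8) rotates the 16x16 tile around its mid-point.
                batch.Draw(tileSet, destination, source, Color.White,
                           ROTATIONS[rot], new Vector2(8, 8), effects, 0f);
            }
        }

        batch.End();
    }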

The light that brings the dawn

I already mentioned the light maps that go between layers, and those will be our next priority. To make this make sense, let's place a light source first. The fire pit will do, so that's MakeTileID(14,26) for me.
As for the light itself, I used gradients and the Posterize adjustment in Paint.NET to make a pixely, fading blob of orange on black and added that to the tile set. At this point, my proto-tileset looked like this. I have my doubts stuffing the lighting data into the tile set like this is viable in the long run, but for now, this will certainly do.

To blend layers, we need to pass a BlendState when calling SpriteBatch.Begin, so we'll just add that as a parameter to our DrawLayer function. After adjusting our code in Draw and adding a couple tiles to the additive layer, we can now render this:
...still not impressed? Me neither.
While BlendState.Additive is provided in XNA/MonoGame, we'll have to make our own subtractive blend mode. We can't really set the opacity either, but for that, we can use the color parameter to SpriteBatch.Draw.
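For the subtractive mode, a BlendState along these lines should do (assuming what we want is destination minus source):

    static readonly BlendState Subtractive = new BlendState
    {
        ColorSourceBlend = Blend.One,
        ColorDestinationBlend = Blend.One,
        ColorBlendFunction = BlendFunction.ReverseSubtract,
        AlphaSourceBlend = Blend.One,
        AlphaDestinationBlend = Blend.One,
        AlphaBlendFunction = BlendFunction.ReverseSubtract,
    };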
The color parameter is used to tint sprites by multiplying each pixel's color by the provided value. Since we only use the opacity for additive and subtractive blending, we can provide a shade of grey that gets us the same effect as setting opacity.
The rules say we get exactly 16 different opacities, so I decided to make another array of constants to reference with an opacityIndex parameter to the DrawLayer method. Choosing an opacity of fifty percent (that's #808080) makes the lighting in this prototype a lot prettier:
Meh.


The fire that burns against the cold

Our little camp fire just doesn't look right with just the light map, wouldn't you say? The thing it lacks is movement. I've never seen a still fire, so we need to animate it.
I already mentioned using a Dictionary for that in my last post, and here's how that works:
You need the X and Y parts of a tile ID, both as values and as keys. When rendering, get the lower 12 bits by ANDing the tile ID with 0xFFF, put them in a local variable, and if (dictionary.ContainsKey(id)), set id = dictionary[id].
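In code, with tileID being the full 16 bit ID from the map and animatedTiles the dictionary (both names hypothetical):

    // Lower 12 bits identify the tile; the upper flip/rotation bits are left alone.
    ushort id = (ushort)(tileID & 0xFFF);
    if (animatedTiles.ContainsKey(id))
        id = animatedTiles[id];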
This lets you alias tile IDs, which is a great way to troll level designers, but not all that useful by itself. To animate a tile, you need to dynamically change the contents of the dictionary. There are seemingly endless ways to time your animations, but no matter which you choose, in the end, the result will look somewhat like this:
Converting this into a gif seems to have fucked the colors, but at least it's moving.

With that, we have the basic feature set down. The new main focus, obviously, is to refactor this mess into something useful. You'll hear from me again once that's done and something interesting is in the making again. Until then!

2015/08/14

Tile Maps: How to Represent a World

After the last post, which covered an awful lot of code, this will mostly be theory again, chiefly because I'd like to go in with a plan, and I do not yet have one.

With a good tile set, you hardly notice the grid.
I want you to think of classical console RPGs. Final Fantasy VI (pictured on the left), Chrono Trigger, Secret of Mana and the like. While their art styles differ somewhat, they have one thing in common: an angled top-down view of their game worlds, projected orthogonally, i.e. without perspective, with levels made up of 16 by 16 pixel square tiles.
Pokémon, Golden Sun and most RPG Maker games use the very same world view. Yet, with all of these games, there are some technical differences in how they do their rendering; they all use different data structures to represent their levels, so they all have slightly different 'max-specs' in terms of level design.

Layers

Seiken Densetsu 3, showing the ocean
and the sky beyond the cliffs.
Some SNES games opted to use one of their precious two background layers for a parallax background.

Additive blending for lighting effects is especially popular with the RPG Maker crowd, but was also used in Chrono's room in Chrono Trigger.

You could use subtractive blending for shadow effects in murkier environments, or alpha blending for translucent overlays, e.g. for weather effects or transparent objects in the game world.

By sacrificing layers for visual effects like these, however, you limit how many layers you can effectively use for maps. For instance, Pokémon Ruby and Sapphire had bridges you could go both under and over, solely because the GBA had more background layers, making a multi-layered environment possible.
Considering you also need to render UI elements somehow, you run out of layers fast. In my post on technical limitations, I said I'd use around eight layers. Let's see where that gets us, assuming we want all of the above:
  1. Parallax Background
  2. Map (Underneath objects)
  3. Map (Above objects)
  4. Light Map (Additive Blending)
  5. Shadow Map (Subtractive Blending)
  6. Parallax Overlays (Arbitrary Blending)
  7. UI Background (Blending)
  8. UI Text
That's not very far. We could do objects crossing bridges over and under, but there could be no elements in the map that cover these bridges, so that's already pretty limiting. With both the light and shadow map above the map layers, you couldn't have light sources or shadows that only cover the lower level of the map.

We'll have to bend our minds somewhat to justify more layers.
The obvious target would be UI: assuming the background doesn't blend, you could render it into a single layer. You could also use sprites for text and other non-transparent objects. Either way, that's one layer we get back.
You might argue that few maps would use both light and shadow maps, and that having both go in front of all map layers is almost certainly a waste of either. Reordering the layers so that you get one transparent layer with arbitrary blending in front of both map layers would be more useful.
Finally, you might say that having both an overlay and a parallax background would look weird, since some layers in the middle would be the only scrolling ones. If you allow either, but not both at the same time, you get the same fundamental feature set, and an additional layer back. However, I've never seen both effects used at the same time and thus would like to withhold judgement in this case.

Still, our layer reservations now look like this:
  1. Parallax Background
  2. Map (Underneath objects)
  3. Light Map (Arbitrary Blending)
  4. Map (Above objects)
  5. Light Map (Arbitrary Blending)
  6. Map (Above everything)
  7. Parallax Overlays (Arbitrary Blending)
  8. UI
I might want to reshuffle layers 2 through 6 later on, or even provide facilities to determine their order on a per-map basis, but that's basically what I'll be going for. And I'm quite happy with it. We have three Map Layers, which should be enough. We might even get a fourth, depending on my judgement on the background+overlay issue.
All that said, it should be clear that there can really only be two layers of map objects, which will be a major factor in how you can lay out a dungeon, for example.

Tile Sets

These are the deciding factor in what can actually go inside a map. I barely touched on these in my post on technical constraints, so it's time to talk about them now.
A tile set is essentially a texture you use to construct maps. A map is a multidimensional array of indices into an array of tiles. Tiles are regularly sized pieces of the tile set.
As mentioned above, most RPGs from the era in question had a 16x16 pixel grid, but both the SNES and GBA had hardware support for maps with an 8x8 pixel grid. Generally, you would find tile set meta data that provides the 8x8 tile set to the hardware, as well as block data, which would combine four small tiles into one 16x16 block, possibly with some gameplay data attached. This gives you a block set, from which the actual maps are constructed.

Since there's no hardware or even library support for tile maps, I don't have to add this complication and can use a tile set with a 16x16 grid from the outset.
However, that raises the question of how large our tile sets should be. Depending on the graphics mode, the SNES supported up to 1024 8x8 tiles in a set. Each can be displayed in one of eight palettes, giving us 8192 possible tiles, four of which are combined into one block, giving us upwards of 4.5 quadrillion possible blocks, more than we could index with a 32 bit integer.
Usually, only a handful of tiles would be used with different palettes, and even then, rarely more than two palettes. Assuming all tiles could be reused that way, this gets us down to about 17.6 trillion. If none are to be reused, that puts us to just below 1.1 trillion, still more than fits into an integer.

Maybe looking at actual technical limitations is in order. XNA runs on DirectX 9. That's so 2010. Anyway, DX9 supported textures up to 4096x4096 pixels, though not all DX9 GPUs actually implemented that. The most commonly recommended maximum texture size I remember from those days is 1024x1024 pixels.
Now, I'm not sure how MonoGame fares compared to that, but a texture of that size gives us 64x64 (=4096) tiles. In comparison, that's nothing, but it's a much more reasonable size for a block set. You can index that with a 16 bit integer and have 4 bits left over, e.g. for flipping blocks along either axis (one bit per axis) or rotating by a multiple of 90 degrees (four rotations fit into two bits).

As for colors, because of the way blocks are constructed from four tiles, each of which might use one of eight palettes of 15 colors, one block might contain up to 60 different colors, but only 15 colors per quarter block. That's not only confusing, but also somewhat limiting.
You can't really enforce it from a software perspective either, so I just decided to go with up to 128 colors per tileset, and try to avoid scenarios with many palettes on a block. If breaking this rule makes for a prettier tileset, though, I won't think twice about it.

Animation

Whether it's a waterfall crashing into a riverbed, the mill wheel rotating downstream or just the grass beside it swaying in the wind, if it's in your game world, you'll want to see it moving. Since the tiles themselves can't really move, and that would make for a horrible 'animation' to boot, we have to figure out another method.
Generally, you animate pixel art with old-school, hand drawn frames that get swapped according to some carefully chosen timing.

Ideally, the tile ID we read from the map should be translated directly into coordinates in the tile set texture, so this leaves us in a bit of a predicament.
On the SNES, you'd have a set of animation definitions, process them, and hot-swap the animated tiles in the tile set for other graphics. We could use SetData on the tile set texture, though I'm not sure how feasible it is to do that mid-frame.
If we want to stick to the ideal tile ID to texture coordinate translation, the only real option is to modify the map data instead of the tile set data. That's naive and counterintuitive at the same time; the map refers to an animated block, so why does the animation change the map? It would also take varying amounts of time, depending on map size.
The most viable solution I can see, would be an additional step in translating the tile ID to a texture coordinate. Have a dictionary of tile IDs, which is updated by the animation code, and switch tiles as they're rendered.

Meta Data

With that, we could go in and write a renderer for this specific tile map system, but there are some questions left to make it into a game. These questions primarily concern meta data:
Which tiles can you actually walk on? Do these tiles have any special behaviour attached? How should NPCs and other event data be stored?

Starting with collision data, I want you to think of two games in particular: The Legend of Zelda - A Link to the Past, and Tales of Phantasia. In both games, you move your character with no respect for the grid, unlike Pokémon, where your character is always aligned with it.
In Tales of, a block's collision data was essentially four boolean values, creating a smaller 8x8 grid of colliders. In A Link to the Past, in addition to full-block colliders, there were diagonal walls that, when walked against, had you slide along them.
The point is, there's many ways to relate collision data to the tile map, and you have to decide on one concept.

Logical layers also tie into this. Consider bridges again. While you're on top, you collide with their edges, but can walk over just fine. When you're below, the inverse is true.
The Pokémon games handled this strangely. They have one byte of movement data attached to each block. Two bits of that are dedicated to the basic collision behaviour (walkable, impassable, walkable*, surfable), with some of the rest representing various layers.
You can cross from walkable blocks to walkable blocks of the same layer, but not of a different layer. The walkable* behaviour allows you to cross layers.
Certain values of the movement byte are reserved for horizontal and vertical bridges, as well as some others with unknown behaviour. If you are on a walkable block above a certain layer, you can enter horizontal bridge tiles from the sides, though you can walk freely between them. At the same time, you could only enter vertical bridge tiles from the top and bottom. If you are below that layer, the inverse is true.

I really like this approach, because you only need one collision map. The Pokémon developers only ever used, like, two of the dozens of available layers, confirming again that that should be enough, while also freeing up some space in the collision data structure.
Let's say we use three bits to determine the basic collider shape. There's passable (0), impassable (1), four variations of diagonal walls (4-7), and two undefined values. When used on the lower layer, these may just be impassable, but on the upper layer, they might represent bridges. Add one bit to indicate the layer, and we have four bits per collider.
We could easily use four of these per block, which would make for a total of 16 bits of collision data.
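Packed into a ushort, that could look something like this (which 8x8 quarter goes into which nibble is an arbitrary choice here):

    // Each collider: bits 0-2 = shape (passable, impassable, diagonals...), bit 3 = layer.
    static ushort PackCollision(byte topLeft, byte topRight, byte bottomLeft, byte bottomRight)
    {
        return (ushort)((topLeft & 0xF)
                      | (topRight & 0xF) << 4
                      | (bottomLeft & 0xF) << 8
                      | (bottomRight & 0xF) << 12);
    }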

Block behaviour is a whole other issue. These should be hard-coded event handlers for entering, exiting or clicking on a block with the given behaviour. Seeing that they're hard coded, you could use a simple list of behaviours and index that with a byte, but I think you could parametrize behaviours opcode style.

NPCs and Objects should probably get their own post, especially with the recent buzz around Entity-Component-System designs. For the time being, I think the most important bits are already covered.

Anyway, we won't need most of that meta data for quite a while. We will first focus on getting tile maps to render in the first place, then we can worry about map objects, collision detection, the character controller, and all the other good stuff.
Personally, I'm excited this is finally picking up steam and I look forward to continuing work on this! Until then!

2015/08/12

Self-Imposed Restrictions: The Sound of a Generation

As I hinted in my last post, I've been researching sound technology so I could finally complete the rule set simulating technical restrictions of the SNES and GBA. The fundamental problem I should have noticed the first time around is that, while the two systems had really similar graphical capabilities, their sound systems are radically different.

Which Generation, though?

The GBA had to be backwards compatible with the Game Boy and Game Boy Color, so it came with all the hardware of the GBC inside it, including its sound system: four analog channels, one for white noise, two for square waves with varying duty cycles, and one "programmable channel", which was essentially a four bit digital channel. It would have been a waste not to use these for GBA games as well, so all that was added for those were two digital 8 bit channels.
Compare that to the S-SMP, the audio processing unit in the SNES, which sported eight digital 16 bit channels, at 32 kHz, which is almost CD quality (which was never utilized, because of limited memory and lossy compression). A cheesy echo effect was famously added as well, which saw quite a bit of overuse.

The two consoles encouraged completely different approaches to sound design. One had all the utensils for high quality chip tunes, the other was designed for sampled music. Lo-fi samples, but samples nonetheless.
This is the reason people barely complained about the music in Donkey Kong Country 3, which had its sound track remade entirely for the GBA port, while almost everyone complained about the "shitty audio" in, say, Final Fantasy VI.

When designing the rules for graphical limitations, I cherry picked the better properties from both systems. My virtual controller has an X and Y button; my virtual screen is modeled after the SNES resolution; and I won't emulate the GBA's display, at least not by default.
The choice seems obvious: I'll design my rules around the S-SMP, leaving out GBA tech. That holds even if I decide to use a chip tunes sound track instead - you can make some kick-ass tunes with eight sampled channels, even more so than with four analog and two sampled ones.

The Rules

The rules themselves are quite simple, actually:
Since there are only eight channels, you can't play more than eight different samples at a time. Since you need a channel or two for sound effects, that leaves you with two options: reserve a channel or two, or drop one out dynamically whenever an effect plays. I won't bother programming the latter, so we'll stick to the former: no more than six notes may be played at any one time in a song, though there will be no limit to the number of sound effects playing at a time. It does, however, include the tails of notes: samples were generally chosen such that even when played in rapid succession, they wouldn't overlap with themselves.
The S-SMP's memory was also limited to a mere 64 Kilobytes, and after compression, 9 bytes of data in there correspond to 32 bytes of sample data. Leaving some space for instructions, this means that no more than 110 Kilobytes of sample data are permitted per song.
With so little memory, multisampling, i.e. using multiple samples of the same instrument playing different notes, wasn't really done. So only one sample per instrument. On the other hand, sounds based purely on white noise, such as a synthesized snare drum, do not count towards the data total, as the S-SMP came with a noise synthesizer.
As to the quality of the samples, the SNES could play back any sample rate up to 32 kHz, but because of the strict memory limitations, none of the games I looked at used sample rates above 8 kHz. Also noteworthy is the compression algorithm used: Bit Rate Reduction produces very unique, audible artifacts. Luckily, I found a tool that BRR compresses sample data.
We'll also make good use of the famous echo effect. It could be enabled independently for all channels, but they had to share the same settings. The delay could be set to any multiple of 16ms from 16 to 240ms. Even echo feedback is supported.
Finally, the eight channels could be "mastered": there are registers for pitch modulation and ADSR envelopes, as well as panning and volume.

With all that in mind, I'll use a tracker to compose the music and render it into uncompressed *.wav files, and possibly master them with an equalizer. This would be wholly unauthentic, but it's what Shovel Knight did, and that game had a great sound.
Either way, this is entirely in asset production, so my sound engine code can be quite simple, and I can focus on other things, which is always good.

Problem is that I don't know jack about music theory, so I'll use inauthentic place holder sounds until I (a) find someone willing to work under these conditions; or (b) learn music theory.

Emulating the GBA's Display with Gamma Correction

When I was researching the sound technologies of the GBA and SNES for my current project, I started by just searching for YouTube videos comparing SNES games to their GBA ports. In the comments, there were two common complaints about the remakes: The awful audio quality, which I came to witness myself, and the "washed out" palettes.
One or two people responded to the latter complaints: it looks fine on the actual hardware, the GBA didn't have a backlit screen, the palettes had to be remade with that in mind. That's true and all, but it still bothered me; why do none of the popular emulators accommodate this?

I don't maintain an emulator project myself, and I don't want to make yet another fork of Visual Boy Advance. Even if I did find my way around the code base to make the patch, no one would want to maintain any C code I could write. But at least I could do the leg work and figure out how to even emulate the display properly (welp, looks like it ain't happening in VBA-M), and that's what I did:

Show Your Work!

I busted out my trusty white Game Boy Advance and the copy of Pokémon Sapphire I got for Christmas so many years ago, booted it up and found myself in a Poké Center. "That'll do", I thought, and took this picture with my phone:

I already do programming, game design and pixel art.
You can't expect me to be a photographer as well!
I took quite a few more, but this is the one I ended up using. I fetched this picture from Bulbapedia for comparison and got to work. Now, when you're working with a picture of a screen, there are a few things you need to watch out for.
One is light pollution - I took this picture in my bedroom, at midday, with the blinds down and no artificial light. This makes for a mostly neutral lighting environment.
Next is reflections. Obviously, I couldn't use most of this photo because the reflection of my skin and hair tinted the image. Similarly, the reflection of the white ceiling in my room desaturated or completely obscured some sections. The lower right quarter is fine, though - that's the black casing of my phone, making no real reflection.
Then there are the limits of technology: pixels are not magic, and you can see obvious artifacts from their boundaries on the photographed screen. As with any photo, there's image noise, and because I can't tell my phone not to compress the image, there may be JPEG artifacts.

With all that in mind, I started taking samples of pixels that should be white: The highlights on the table and chairs, the tiles in the middle of the room. From fifteen samples, I determined the median values of the R, G and B channels independently, hopefully minimizing the effects of noise and human error, arriving at (41 50 47) for "white".
I repeated that for two more colors - the first attempt gave me a garbage value for R - and arrived at (15 31 29) for the reference color (144 176 112). If you think that looks wrong, don't worry; so did I, until I saw the results.

I decided to try gamma correction before any other more complicated methods, and just needed to calculate the gamma values for the three channels, which seem to have different curves.
For those of you who don't know what gamma correction is, this video does a fine job explaining the concept in a mere four minutes, though it only refers to the term gamma in a footnote on the screen.
Anyway, to calculate the gamma values, we should first make the measured values a little less unwieldy by linearizing them.
The samples from the JPEG could range from 0 to 255, so we first divide by 255. To linearize the result, we then raise it to the power of 2.2, the gamma value most photo cameras encode with.
The samples from the Bulbapedia picture range from 0 to 248, a result of the way the image was converted from the GBA's 15-bit color space (each 5-bit channel multiplied by 8 maxes out at 248). The reason they look washed out is probably that they are already linear, so we just divide these by 248.
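For example, the green channel of the reference color gives 176 / 248 ≈ 0.71, and the corresponding measured green value of 31 linearizes to (31 / 255)^2.2 ≈ 0.010.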

This is the resulting set of equations:
r(0)=0    r(0.58)=0.002   r(1)=0.018
g(0)=0    g(0.71)=0.010   g(1)=0.028
b(0)=0    b(0.45)=0.008   b(1)=0.024

However, a gamma curve is of the form f(x)=x^gamma. This means that f(1)=1, so we'll scale the right side of all equations accordingly - dividing each by its f(1) value - and later multiply the resulting color by (48 48 48). Note that I determined that value empirically, because the "white" (41 50 47) didn't seem to work that well for this.

The equations now:
r(x)=x^R    r(0.58)=0.1111
g(x)=x^G    g(0.71)=0.3571
b(x)=x^B    b(0.45)=0.3333

Solving this with logarithms is trivial (gamma = log(f(x)) / log(x)) and gives us roughly R=4.0, G=3.0, B=1.4 for our gamma values.
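For illustration, here is a small C# sketch of how those values could be applied when converting a 15-bit GBA palette color for display. The function name and structure are my own, and the multiplier of 48 is the empirical value from above:

    // Hypothetical helper: converts one 5-bit GBA color channel (0-31) to an
    // 8-bit display channel using a per-channel gamma and a brightness multiplier.
    static byte EmulateChannel(int fiveBitValue, double gamma, double multiplier)
    {
        double linear = fiveBitValue / 31.0;          // normalize 0..31 to 0..1
        double corrected = Math.Pow(linear, gamma);   // apply the channel's gamma curve
        return (byte)Math.Round(corrected * multiplier);
    }

    // Usage (the reference color (144 176 112) is (18 22 14) in 15-bit space):
    //   EmulateChannel(18, 4.0, 48)  ≈ 5
    //   EmulateChannel(22, 3.0, 48)  ≈ 17
    //   EmulateChannel(14, 1.4, 48)  ≈ 16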

The Results

With all that said, let's look at some screenshots!
The original image.
It seems to work perfectly for GBA games!
GBC games were also playable on GBA.



It does NOT work for SNES games, which were intended for well lit CRT screens.

These are real fucking dark. Reminds me of shifting in my seat just so I could see what I was even doing. If we want a compromise between usability and authenticity, maybe customization is in order:
Same gamma values, different multipliers. From left to right:
(255 255 255), (096 096 096), (048 048 048)

Recap

Most emulators do not emulate the display that came with the emulated device. To convert a color c from 15 bit RGB to C in 24 bit RGB, most use C=c*8, leading to inaccurate color representation.
To emulate the non-backlit display of the GBA, gamma correction and color multiplication should be used, with different gamma values per channel. To make the effect more practical, customization should be offered.

2015/08/09

Keeping Time with a Game Clock

So, I'm currently in the process of implementing tile maps, which is coming along nicely, but when I turned to take care of animations, I realized I hadn't yet come back to talk about timing, and why it's been lacking from my code so far (excluding the Profiler, of course).

Keeping Time: Now

The way MonoGame and XNA provide timing information is with the gameTime parameter of the Game class's Draw and Update methods. Its type is the aptly named class GameTime, which has two TimeSpan properties for the total time since the initial call to Game.Run and the time since the last call to Update, and an IsRunningSlowly flag to help you optimize in certain scenarios.

The way I learned it, you're expected to pass that around to all code that requires timing information - which can be anywhere from half to almost all of your code base.
That alone is a bit annoying, and reason enough to look into alternatives. Just looking at the source code of the Game class tells us that the _gameTime field is never reassigned after initialization, and its initialization is the only construction of a new GameTime instance.
Therefore it stands to reason you could just keep a reference to that very gameTime around wherever you need it, instead of passing it in every frame. Problem is, there is no GameTime instance to grab until the first Update - and, at least for our major engine subsystems, initialization should be over by the time the first update rolls around.

Keeping Time: From Here On

To fix that I propose a GameClock class. If we manage instantiating that ourselves, we can be certain that keeping references to it is safe, rather than guessing as with GameTime. We can also initialize it first, so our subsystems can safely complete initialization without having to wait for the first frame.

The other thing I envision it doing is better medium awareness: There's more to keeping time than frame deltas and the total running time of the game. The latter in particular has literally never been useful to me. I can see passing it to the Steam API at the end of the session, so Steam could track the total amount of playtime with the game, including idle time. Still, it's a little early to be thinking about Steam, wouldn't you say?
What we should be thinking about is relativity. The game may have been displaying the title screen for a while, but when the player finally hits New Game, we should start counting from 0:00. If they load an 8 hour save file, however, we should start from 8:00. We'll call this play time, as opposed to the 15 minute session time.
Finally, the game world may run at an entirely different time scale. If the player picked up a slow down power up, it might run at half speed, or zero speed, when the game is paused (The latter could also be true for play time, when in debug mode).

If you're confident in your engine's determinism, you could also enable frame skipping, and have a mocked-out GameClock and InputMapper play back recorded data. This way, you could have hours of playtime run through in minutes, which makes for awesome regression tests and might help your QA efforts a lot if you can't afford your own team for that.
I don't have plans to do anything like that at the moment, but it's a nice idea to keep in mind.

Anyway, here's my implementation of such a clock:
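A minimal sketch of the idea - the member names are placeholders, and the frame delta is assumed to come straight from MonoGame's GameTime.ElapsedGameTime:

    using System;
    using Microsoft.Xna.Framework;

    public class GameClock
    {
        public TimeSpan SessionTime { get; private set; }  // real time since launch
        public TimeSpan PlayTime { get; private set; }     // the time shown on the save file
        public TimeSpan FrameDelta { get; private set; }    // unscaled delta of this frame
        public TimeSpan WorldDelta { get; private set; }    // delta after applying the time scale
        public float WorldTimeScale { get; set; }           // 1 = normal, 0.5 = slow down, 0 = paused

        public GameClock()
        {
            WorldTimeScale = 1f;
        }

        // Called once per frame, before the logical part of the game loop.
        public void Advance(GameTime gameTime)
        {
            FrameDelta = gameTime.ElapsedGameTime;
            WorldDelta = TimeSpan.FromTicks((long)(FrameDelta.Ticks * WorldTimeScale));
            SessionTime += FrameDelta;
            PlayTime += FrameDelta;  // could be frozen while paused, or in debug mode
        }

        // Called on New Game (pass TimeSpan.Zero) or when loading a save file.
        public void ResetPlayTime(TimeSpan loadedPlayTime)
        {
            PlayTime = loadedPlayTime;
        }
    }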

2015/08/05

Mapping Input to a Virtual Controller

The input system I built for my engine is fairly simple, but it involves a fair amount of code as well.

As I mentioned last time, it revolves around what's essentially a virtual SNES controller, with just twelve digital buttons. There are two concrete input mappers: one that maps keyboard input to that virtual controller, another one for game pad input.
There are also two additional configuration classes, to allow rebinding either input method. The rest is all the details, primarily initialization in this case.

The Polling Interface

Let's begin with the abstract class InputMapper. Its essential function will be to provide a button state for a given button. It's pretty much the only thing it'll ever really do, so I decided to make that an indexer. The return type is a custom [Flags] enum ButtonState, which has one flag each for the current and the previous state, so it also encompasses Released and Pressed events:
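A sketch of what that enum might look like - the member names are my assumption, but the bit layout follows the description below:

    [Flags]
    public enum ButtonState
    {
        Up       = 0,  // up last frame, up this frame
        Released = 1,  // lowest bit: was down last frame, up now
        Pressed  = 2,  // next bit: up last frame, down this frame
        Down     = 3   // both bits set: held down
    }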

The enum Buttons doesn't have to be a flag field, but for convenience we should be able to cast it to an integer, for use as an index into an array. You'll see why in a moment, but essentially, we should be explicit about the values, so there's no confusion about how to use any such arrays.
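For example - the button set is the virtual SNES layout described earlier, the ordering is my own:

    // Explicit values make it obvious that Buttons doubles as an array index.
    public enum Buttons
    {
        Up = 0, Down = 1, Left = 2, Right = 3,
        A = 4, B = 5, X = 6, Y = 7,
        L = 8, R = 9,
        Start = 10, Select = 11,

        Count = 12  // convenient for sizing state arrays
    }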

This leaves us with this bit of code for the InputMapper:
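Something along these lines - a sketch under the assumptions above, not the verbatim class:

    public abstract class InputMapper
    {
        // One entry per virtual button, indexed by (int)Buttons.
        protected readonly ButtonState[] buttonStates =
            new ButtonState[(int)Buttons.Count];

        // The essential function: query the state of a single virtual button.
        public ButtonState this[Buttons button]
        {
            get { return buttonStates[(int)button]; }
        }

        // Implementers fetch their raw input data here, once per frame.
        protected abstract void PollState();

        // Implementers only have to answer: is this button down right now?
        protected abstract bool IsButtonDown(Buttons button);
    }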

That should clear up why Buttons had to be array-compatible, but how do we fill the state array?
For one, we should only fill it once, before we actually need any input, so we will call an Update method on the InputMapper before we get to the logical parts of the game loop. Instead of making everyone who implements the InputMapper deal with ButtonState, however, we'll fill the array in an abstract way, and only have the implementer tell us whether a button is down. Before we can do that, the implementer will want its input data, though.
That code looks like the sketch below. The loop might be a little confusing, so let me explain: Essentially, for each value in Buttons, we shift the stored state right by one bit and OR in the new current state. This works because the ButtonState enum is a field of flags, with the lowest bit representing the previous button state and the next bit representing the current state. Right-shifting by one bit discards the old previous state and moves the current state into the previous state. The four possible combinations of previous and current state make up the enumeration.
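    // Inside InputMapper:
    public void Update()
    {
        PollState();  // let the concrete mapper grab its KeyboardState/GamePadState

        for (int i = 0; i < buttonStates.Length; i++)
        {
            // Shift last frame's "current" bit down into the "previous" bit...
            ButtonState state = (ButtonState)((int)buttonStates[i] >> 1);

            // ...then OR in this frame's state as the new "current" bit.
            if (IsButtonDown((Buttons)i))
                state |= ButtonState.Pressed;

            buttonStates[i] = state;
        }
    }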

Mapping Game Pad Input

To map game pad input to our virtual controller, we need to implement a class derived from InputMapper. The GamePadMapper only needs to implement PollState, which should retrieve and store a GamePadState instance, and IsButtonDown(), which should defer to GamePadState.IsButtonDown() with an appropriate parameter.
Figuring out that parameter is the real mapping. I used another array of length 12 for this, with the virtual Button, cast to an int, as the index. GamePadState.IsButtonDown() takes a Buttons value as well - however, that's Microsoft.Xna.Framework.Input.Buttons. Hooray for namespaces!
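A sketch of the whole class, with a using alias to keep the two Buttons enums apart:

    using Microsoft.Xna.Framework;
    using Microsoft.Xna.Framework.Input;
    using XnaButtons = Microsoft.Xna.Framework.Input.Buttons;

    public class GamePadMapper : InputMapper
    {
        // Maps each virtual button (by index) to the XNA button it's bound to.
        // The constructor (not shown here) fills this from the configuration.
        private readonly XnaButtons[] buttonMap = new XnaButtons[(int)Buttons.Count];

        private GamePadState currentState;

        protected override void PollState()
        {
            currentState = GamePad.GetState(PlayerIndex.One);
        }

        protected override bool IsButtonDown(Buttons button)
        {
            return currentState.IsButtonDown(buttonMap[(int)button]);
        }
    }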

However, this just defers the problem: We still have to fill the array of Xna Buttons to actually map any input. And that's when we go back to our static class Configuration.
First of all, we need to add a GamePadConfigurationSection, which defines lots of properties like this:
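One of those properties might look roughly like this, assuming System.Configuration's attribute-based style and reusing the XnaButtons alias from the previous sketch:

    // One of the twelve properties; "A" is the virtual controller's A button.
    [ConfigurationProperty("A", DefaultValue = XnaButtons.A)]
    public XnaButtons A
    {
        get { return (XnaButtons)this["A"]; }
        set { this["A"] = value; }
    }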
Yeah, I copy/pasted twelve of these.

The reason I didn't use an array in the configuration is twofold: I just didn't want to bother looking into collections in App.config, and it's safer if you explicitly force twelve properties, all of which are named. If one is invalid, at least you get the other input settings you specified.

Mapping keyboard input is really similar. In fact, I copy/pasted most of this code and find/replaced GamePad with Keyboard. That's just XNA being consistent. Isn't it great?

Integrating the InputMapper

Anyway, now that we have two concrete InputMapper implementations, time to figure out how to fit them into the existing code.

I started in the MainGame class, adding a field InputMapper input, and rewriting the Update method like so:
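Roughly in this direction - the pause toggle is just a hypothetical stand-in for whatever game logic consumes the input:

    protected override void Update(GameTime gameTime)
    {
        input.Update();  // poll the hardware and refresh the virtual controller once per frame

        // Hypothetical consumer: react to Start being pressed this frame.
        if (input[Buttons.Start] == ButtonState.Pressed)
            TogglePause();

        base.Update(gameTime);
    }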

Next was initializing the input field, so I went back to the InputMapper class and added a static method CreateDefault to create a default instance according to the configuration.
To do that effectively, I had to add a boolean property PreferGamePad on the GameConfigurationSection class. If a game pad is preferred, we should try to return a GamePadMapper. If we can't responsibly do that, i.e. when no game pad is connected, we return a KeyboardMapper instead.
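A sketch of that factory method; the Configuration.Game accessor is a placeholder for however the static Configuration class exposes the GameConfigurationSection:

    public static InputMapper CreateDefault()
    {
        // Placeholder accessor for the GameConfigurationSection mentioned above.
        GameConfigurationSection config = Configuration.Game;

        // Prefer the game pad only if one is actually connected right now.
        if (config.PreferGamePad && GamePad.GetState(PlayerIndex.One).IsConnected)
            return new GamePadMapper();

        return new KeyboardMapper();
    }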

Finally, I added constructors to both concrete mappers, both of which use their respective configuration sections to fill the buttonMap[].

And that's pretty much it. Fairly simple code, the bulk of it is mostly boilerplate.
With that out of the way, however, we can turn back to talk of visual design and tile maps, and finally get some graphics on the screen. Hopefully.

2015/08/02

Self-Imposed Restrictions: Examining the GBA and SNES

Since I finished the last blog post, two weeks have passed, mostly because there was no clear direction to go next, and there were other matters taking up my free time. Also procrastination.

Anyway, in my post on save game serialization I said that the game I want to build would be to 16 bit RPGs as Shovel Knight was to 8 bit platformers. The Shovel Knight devs went to great lengths adhering to technical limitations imposed by the NES, breaking the rules only where both justifiable and necessary. Programmer David D'Angelo outlined this in a fascinating Gamasutra post. A style guide for Steel Assault imposes similar restrictions.

So, since all the topics I wanted to tackle, but couldn't decide on, should adhere to similar restrictions, the goal of this post is to find a suitable rule set based on the SNES, home to classical RPGs, and the technically similar GBA, home to most games of my childhood.

Input

One thing I wanted to write about from the very beginning is an input system; reconfigurable controls, game pad/keyboard support, the whole shebang. Funnily, input isn't mentioned in either of the rule sets I linked, but the Shovel Knight one touches on the implications some of the graphical limitations had for level design. I think something similar should apply here:
The SNES had six face buttons (A, B, X, Y, Start, Select), an 8-way D-pad, and two shoulder buttons (L and R). Compare that to an XInput controller, which adds six analogue input axes (two thumb sticks and two shoulder triggers), or a modern keyboard's ~100 digital keys. Obviously not all of these are used, but think of games like League of Legends, Starcraft or even Dragon Age. These require a mouse in one hand and a keyboard with dozens of shortcuts in the other. There is basically no way to figure the controls out as you go in these games.

Granted, these are complex games, and you don't need all those shortcuts, but it's more intimidating and complex than it needs to be, for this game at least.
We will ban mouse input, and limit the keyboard to 12 freely remappable keys, with sensible defaults.
Controllers will also not be utilized fully: Analogue inputs are forbidden, except for the left thumb stick, which will be truncated to function as a D-pad.

Graphics

This is the main factor in simulating the look-and-feel of a SNES or GBA game. Both systems have comparable graphical capabilities, which is why so many games for the former were remade for the latter. For starters, we will cherry-pick the best of both, and see if we need to add more loopholes during development.

For one, the two share the same color space. Both use 5-bit-per-channel RGB colors in palettes of 16 colors. I've read claims that the sixteenth bit in a color determines its transparency, but I've never actually seen it used. The first color in a palette is transparent for backgrounds and sprites alike (except for the back-most layer), giving us 15 colors per sprite or background tile.
The SNES had a total of sixteen palettes available, the GBA had 32. Both assigned one half to backgrounds and the other half to sprites. We will enforce no hard limit on the total number of palettes. Certainly not for sprites. But considering that UI and other overlays also require some palettes, there won't be a limit for backgrounds either. There will be a limit for the actual tile sets, however. Instead of limiting ourselves to exactly one set of fifteen colors per 8x8 cell, we will limit the total number of colors in a tile set to 128, including transparency.

Speaking of background layers, the SNES has two to four depending on the graphics mode, while the GBA has up to four. We will ramp that up to eight layers. These are larger than the screen and can be scrolled freely, allowing smooth-scrolling maps and parallax effects.
The available blending options differed far more wildly: The GBA had alpha blending with arbitrary five-bit coefficients (with one bit effectively unused), giving us lots of options, most importantly semi-transparent and additive blending with 16 different opacities. The SNES did not have modifiable coefficients, offering four blend modes instead: adding, adding and halving (i.e. averaging, effectively 50% transparency), subtracting, and subtracting and halving, which went mostly unused because the results are extremely dark.
If we pretend the fifth bit of the coefficient was a sign, we can have subtractive blending with 16 different opacities as well.

Then there's sprites. Both consoles only have one continuous section of Object Attribute Memory, where the order determines how sprites should overlap, and blending generally considers all sprites as one layer. However, on the GBA at least, each object can set between which layers it appears.
This behaviour can lead to some visual glitches and is actually pretty hard to simulate with modern rendering techniques, so we will have eight separate sprite layers, one in front of each background layer. There will be no hard limit on the total number of sprites.
Sprites could be scaled and rotated freely, with nearest neighbour sampling.

As for the screen size itself, we will use a virtual 320x180 display, for various reasons: Shovel Knight carried over the height of the NES's resolution and expanded horizontally to accommodate modern displays. They decided on that because of the way gravity works. In a top down RPG, there is no gravity as such, so the screen height in tiles becomes less important to gameplay.
Instead, the total number of visible tiles becomes more important to replicate the same sense of scale as RPGs on the SNES. I arrived at 320x180 by calculating the 16:9 resolution that comes closest to the total number of pixels of the SNES resolution (256x224 = 57,344 pixels versus 320x180 = 57,600).
It has some other nice properties, such as scaling perfectly to 360p, 720p, 1080p and 4K displays.

Finally, perhaps the most iconic SNES feature is graphics mode 7. Contemporary RPGs commonly used it for the world map, showing off features in the distance, impressing you with the vastness of the world. Chrono Trigger used it in the racing minigame, to much of the same effect: you get an idea how much distance you cover. It also looked pretty cool. Personally, I'd use it to juice up combat a bit.
The GBA had it as well, although it was used more rarely and perceived as less exciting. The point is to transform the background in 3D space. I'm not sure where to use it, if at all, but if I do, it'll be as follows: The ground will be a single 256 color texture, and sprites can be used freely.

There's some other visual effects that had native support, such as mosaic filtering on background layers, as well as fade to black or white. Since there's central palette memory, you could feasibly manipulate the coloring of the scenery, as some talented ROM hackers did to add a day/night cycle to the Pokémon games on the GBA.
I will definitely discuss color manipulation in a later post, but every other effect will be decided on a case-by-case basis.

Sound

I won't say too much at this moment, because I know little about audio programming and less about music theory. I know these consoles have similar audio capabilities, with a certain limit on how many sound effects may be playing at the same time, how many sounds a song may simultaneously use, and a maximum sample rate on all sounds.

We will not enforce any limit on the number of sounds playing at the same time, and probably allow any and all music that can be rendered with a MIDI file and a sound font. Sample rate limitations will be considered at a later time.

Where to go from here

Okay, with these rules in mind, how should we proceed from here? I already mentioned that I wanted to write about the Input system for a while, and with what I wrote here, I think I can wrap that up in a single post, so that's next.
The other topics I wanted to tackle are the tile map rendering system, which obviously depended somewhat on the graphical limitations I put in place here, and prototyping a combat system.
We will do tile maps first, because the combat system should be seen in the context of a narrative-driven RPG, so we should at least have a map to explore and enter multiple fights from.

If you are interested in the technical specifications of the Game Boy Advance, you might want to take a look at GBATEK, which also covers the Nintendo DS. I haven't found an equivalent document describing the SNES, but I admittedly never looked particularly hard.