2015/08/18

ABDUCT: About functional programming, patent laws and a dead project

About a week ago, while I was working on the entity system for my game, I grew tired of my code base. I'm not really happy with all the decisions I made, both early on and with my tile map rendering, and it all built up into some anxiety over sprite rendering.
So I decided to take a break and come back later instead of burning out and leaving the project unfinished. I didn't want to stop contributing to the game, however, so I considered my other options. I was also kind of obsessed with audio stuff at the time, since I spent so much time researching it just prior, and really wanted to work on authentic BRR compression, with better tool support than the shady tools I found on romhacking.net.
Finally, I wanted to try functional programming, see what all the hype was about. Since audio compression seemed like a great fit, and F# like a good bridge into the paradigm, everything seemed to line up perfectly.

Since tool naming is important, I thought long and hard (heh) about a naming scheme for my own tools, and came up with the Authentic Bit rate reDUCtion Tool, the Barely Usable Rigid Game Level Editor and other silly backronyms for criminal verbs.
Then I got to work, setting up an F# project Abduct.Core, which should do all the heavy lifting, and the C# project Abduct.UI, which should handle all the boring stuff, such as communicating with NAudio, the library I decided to use for handling audio files (and resampling, which, as it turns out, you shouldn't try to do yourself, at least not if you want quality output).

And, as if in the second act of a tragedy, things started to go downhill from here.
The plan was relatively simple: use NAudio to load WAV files and convert them to 16-bit mono. Resample to a user-defined sample rate, then perform length correction - BRR compressed sounds come in chunks of 16 samples, so you either have to pad sound effects with silence, or truncate looped samples. Then you apply the actual compression, and decompress again - we're not actually running on a SNES, so the compressed data is of no use to us.
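To make the length correction step concrete, here is a minimal sketch in Python (not the F# the project actually used). The function name `correct_length` and the `looped` flag are made up for illustration and are not taken from the ABDUCT code; samples are assumed to be a plain list of 16-bit integers.

```python
BRR_BLOCK_SAMPLES = 16  # BRR compressed sounds come in chunks of 16 samples


def correct_length(samples, looped):
    """Return a copy of `samples` whose length is a multiple of 16.

    Sound effects (looped=False) are padded with silence at the end.
    Looped samples (looped=True) are truncated instead, since padding
    would insert a gap of silence into the loop.
    """
    remainder = len(samples) % BRR_BLOCK_SAMPLES
    if remainder == 0:
        return list(samples)
    if looped:
        # Drop the trailing partial block. Note that an input shorter
        # than one block ends up empty here.
        return list(samples[:len(samples) - remainder])
    # Pad with zeros (silence) up to the next block boundary.
    return list(samples) + [0] * (BRR_BLOCK_SAMPLES - remainder)
```

With a 20-sample input, the sound effect case comes out 32 samples long, while the looped case is cut down to 16.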
The user should be able to play back a sound at any sample rate (without resampling!) to hear what the sample would sound like after pitch modulation, with or without looping, to examine its behaviour, and finally export the file as an uncompressed wave file.
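The relationship between playback rate and pitch can be sketched as follows. This is an illustration in Python, not code from the project; the function names are hypothetical, and it assumes equal temperament, where one octave (12 semitones) corresponds to doubling the playback rate.

```python
import math


def playback_rate(original_rate_hz, semitones):
    """Rate at which to play the unmodified sample data so that it
    sounds `semitones` higher (or lower, if negative)."""
    return original_rate_hz * 2.0 ** (semitones / 12.0)


def semitone_shift(original_rate_hz, playback_rate_hz):
    """Pitch change, in semitones, heard when data recorded at
    `original_rate_hz` is played back at `playback_rate_hz`."""
    return 12.0 * math.log2(playback_rate_hz / original_rate_hz)


def duration_seconds(sample_count, playback_rate_hz):
    """Playing the same data faster also makes it shorter."""
    return sample_count / playback_rate_hz
```

Since the sample data itself is untouched, a higher playback rate raises the pitch and shortens the sound at the same time, which is exactly the effect the tool was supposed to let you preview.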

As mentioned above, I use NAudio to resample, so that's not really an accomplishment. I did get length correction to work, I think. It's currently intertwined with the non-functioning compression code, and even before, it was a bit hard to test.
Converting between different wave formats with NAudio involves mucking around with .NET streams. A lot. So much so, that I couldn't find a way to easily do it for any wave file I might load, even with a clearly defined target format. So, for prototyping, I only allowed wave files that were already in the proper format.
Playing back audio with NAudio is also an atrocious process. Well, it is if you have to change the output wave format dynamically because you're trying to modulate pitch.
The real killer of this project, though, was the actual compression, for two reasons:
  1. I didn't bother to really learn the concepts of functional programming. I just jumped in with a new project in Visual Studio and this reference. Inevitably, I encountered a compiler error that I still haven't wrapped my mind around.
  2. As the Wikipedia article on BRR compression mentions - which I didn't notice during my initial research - there are loads and loads of patents involved. I don't have the patience and legalese to read all of these, and I can't afford (and frankly, don't want to pay) a patent attorney to do it for me. I'm not clear on American patent laws, and the fact that I'm located in Germany doesn't make the whole situation any simpler.
To cut to the chase, even if I did get around all these other problems - bite the bullet and write all that code to properly use NAudio, learn all about the functional paradigm - I still couldn't complete and use the tool with confidence, much less release the code in good conscience.
I will, however, strip away the bits of compression code I wrote so far, get the rest into a somewhat presentable state, and release that code on GitHub later this week. If anyone from the homebrew and ROM hacking communities is more confident in their legalese (or Nintendo's unwillingness to sue over this stuff), they are of course free to do with it whatever they want.

As for myself, the inability to get accurate SNES sound is a bit of a disappointment. I'm left with just a few options:
  • Skip the compression and only use length correction and resampling. This would make our samples significantly higher in quality than the real thing - using 8-bit samples might combat that somewhat.
    • I could, of course, try other compression algorithms, or even create a shitty one myself.
  • Use a chiptune soundtrack. Most samples suited for that would barely suffer from the compression anyway. It would also get us closer to the GBA sound, which wouldn't be too bad.
  • Giving up on authenticity and using a CD-quality soundtrack is tempting, but not really my style.
I'll have to think about the specifics of this later. It's 4 am over here, and I need some sleep.
Anyway, that's what I've been up to this week. I'll muck around with functional programming some more, but I'll return to doing engine work soon, promise!

NOTE: I usually write blog posts a week or three in advance, so the posts will keep coming as scheduled. This post is the one in the wrong timeline.

UPDATE: The cleaned-up code for this project can now be found on GitHub: https://github.com/teryror/ABDUCT-Audio-Processor.
