4/28/2013

Arithmetic with Regexps

Let me start by telling that I'm addicted to the programming challenge site SPOJ, and I was very happy when they added a new section dedicated to code golf. However, it didn't take too long for me to realize that I can participate in the contest using Python or Haskell, but I could only compete by using Perl.

I was not fluent in Perl, so I bought the Camel Book and started to learn it. There's a secret for Perl being so good at code golf, and this secret is clever use of regexp. The regexps from Posix are just finite automata, but the Perl engine is much more powerful than that.

Naturally, I became tempted to check what are the computational limits of Perl's regexp. For example, can we do arithmetic with regexps? After thinking I while, I concluded that yes, we can do arithmetic with regexps!



Before showing how I did it, let me do a quick review of Perl. The only command I used is s///g (substitution with regexp). The first argument is a regexp that will be matched against your input string. The second is a string, possibly interpolated, that will substitute the match. The option g allows for multiple substitutions, starting at the point where the last one finished. For example, here's a regexp to translate Brazilian laughs for international laughs:

input: "kkkkkk goat kkkk yelling"
regexp: s/k+/LOL/g
output: "LOL goat LOL yelling"

Before doing any calculation, we also need to define a numeric representation. At school we learn how to do math using decimals, which have ten symbols, from 0 to 9. Later, computer guys learns binary, which has just two symbols, 0 and 1. This time I'll do math using unary, which has just one symbol: I (the symbol I was chosen because it resembles roman numbers).

In the unary system, the number is represented by repetitions of the unary symbol. For instance:

1  = I
4  = IIII
10 = IIIIIIIIII

Also, just to avoid big regexps, I'll restrict myself to operations defined over the positive integers (meaning both the arguments and the results must be positive integers).

Addition

s/\+|=//g

That's the easiest one, just remove the plus and equal signs. The result is the concatenation of the operands. Let's check how it would work with 4+7:

"IIII+IIIIIII="
s/\+|=//g

"IIIIIIIIIII"

As expected, the result is 11.

Subtraction

s/(I+)-\1=//g

In order to implement subtraction, we'll use a backreference. If a sequence is found in the right side, then you are free to remove this sequence and an equal one on the left side. All that is left is the characters that were on the left side but not on the right, which is the same as the subtraction. Let's check 7-4:

"IIIIIII-IIII="
s/(I+)-\1=//g

"III"

Multiplication

s/(I)(?=I*\*(I+))(\*I+=)?/$2/g

Multiplication is more complex, we need the operator (?=), used to perform positive lookahead: it will match, but it won't consume the input. As a result, the operator /g can make multiple passes over the same string, and each "I" found on the left side is replaced by the entire right side. Let's check a step-by-step example of 3*6 (I'm doing step-by-step here, but Perl will do it with a single call to the command):

"III*IIIIIII="
s/(I)(?=I*\*(I+))(\*I+=)?/$2/g

"IIIIII""II*IIIIIII="
s/(I)(?=I*\*(I+))(\*I+=)?/$2/g

"IIIIIIIIIIII""I*IIIIIII="
s/(I)(?=I*\*(I+))(\*I+=)?/$2/g

"IIIIIIIIIIIIIIIIII"

Division

s/(I+)(?=I*\/\1\=)(\/I+=)?/I/g

Division is similar to multiplication, but we use multiple subtractions instead of multiple additions. Let's check 12/4:

"IIIIIIIIIIII/IIII="
s/(I+)(?=I*\/\1\=)(\/I+=)?/I/g

"I""IIIIIIII/IIII="
s/(I+)(?=I*\/\1\=)(\/I+=)?/I/g

"II""IIII/IIII="
s/(I+)(?=I*\/\1\=)(\/I+=)?/I/g

"III"

GCD

s/gcd\((I+)\1*,\1+\)=/$1/g

Why stop at the four operations? We can also implement the gcd (greatest common divisor). In this case we'll ask for repeated backreferences, this will only match when both the left and right sides are a multiple of the backreference. Since the first group is greedy, it will try to make the longest valid match, and as a result the backreference will be the gcd of the two arguments. Let's check 15 and 9:

"gcd(IIIIIIIIIIIIIII,IIIIIIIII)="
s/gcd\((I+)\1*,\1+\)=/$1/g

"III"

Playing with regexps like this is fun, but in the real world never use this without the help of a responsible adult. :)

7/03/2012

The Most Difficult Bug

It's a classic interview question: what was the most difficult bug you ever fixed? In my case, the bug gave me a lot of trouble, and for a curious reason: the bug was not in the code, instead it was between the chair and the keyboard!


The Story


This story takes place in 1998. At the time, I was writing BrMSX, the first Brazilian MSX emulator (if you're not familiar with MSX, it was an 8-bit computer popular in East Asia, Europe and South America). I was taking extra care in the implementation: the emulator should be fast enough to run at full speed in the machines available at the time, and had to be precise enough to be indistinguishable from a real MSX.

The first objective was reached by writing the emulator in Assembly, the second by using a solid testing methodology. I didn't know about unit testing yet, but for each new feature, I always wrote a test program, and compared the program output in the emulator and in a real machine.

Early versions of my emulator were mute, I was only emulating the CPU and the video chip. When those two were stable, I started writing emulation of the audio chip: the AY-3-8912 from General Instrument, also known as PSG. This chip has three sound channels and one noise channel, the noise was used to make sound effects (such as gunshots and explosions), and also to make percussion sounds (such as drums).

To check if my noise implementation was correct, I made the following MSX BASIC code:

10 SOUND 7,183
20 SOUND 8,15
30 FOR I=8 TO 1 STEP -1
40 SOUND 6,I
50 PRINT I
60 FOR J=0 TO 300: NEXT J,I
70 BEEP

The magic numbers in code are scary, but their meanings are simple: register 7 is the mixer, a value of 183 turn on the noise generator in channel A, and turn off all the rest. Register 8 is the volume of channel A, setting it to 15 is the maximum volume. Finally, there's a loop changing the value of register 6, which is the noise frequency.

We expect this program to play some noise with a constant volume, at first with a low pitch, and then gradually becoming a higher pitch. That's in theory. On practice, however, something else happened!

The Bug


I've redone the experiment and recorded it. On the video below, you can listen to the program running on a real MSX, and then on my emulator (I used dosbox to compile the 1998 version, before the bug was fixed).




How curious: even though I have set the volume to be constant, on a real MSX the volume gets lower at the end. Worse, on the emulator that doesn't happen!

That's a very subtle volume change, was I mistaken? To confirm, I've captured both signals and plotted them on Octave (in 1998 Octave didn't exist yet, so I used Matlab instead).



It looks like the turboR volume is really decreasing, but it's hard to compare. A better comparison can be made by noticing that the perception of loudness is physically caused by the power of the signal. And power is just the integral of the waveform squared. Drawing this on Octave we have:


There it is! I wasn't mistaken, in the last two steps the volume on the turboR really gets lower. But why does it happen? Maybe I could find out by analyzing the internals of the hardware. 

The Circuit


In order to understand how the PSG works, it's useless to get the schematics of the MSX, or the datasheet of the AY-3-8912, because none of them explains what happens inside the chip. In theory I could open the chip and try to read the die using an electronic microscope, but since I didn't have access to one, I had to resort to reverse engineering.

It was easy to come up with a model for the PSG sound channels. It has three square wave generators, each one of that working like the diagram below:


The input clock of the counter is the MSX clock divided by 16, so it has 223506Hz. On each pulse, the counter increments by one, and the result is compared with the value of the frequency register. When the values are equal, the counter resets, and a T-type flip-flop invert state, generating the desired square wave.

What about the noise generator? At the time no one knew how it worked, some guessed it was thermal noise on some resistor, others guessed it was shot noise on some semiconductor junction.

The truth was easier than that, the PSG used a pseudo-random generator! In fact, it has a square wave generator just like the one above, and, on each output transition, it made a transition on a Linear Feedback Shift Register.


We know that every LFSR generates a periodic signal, so by capturing an entire period of the generator, we could deduce the exact characteristic polynomial used. Turns out it is x17+x14+1, the maximal polynomial of size 17.

So I knew how the signal was generated, and I even knew the exact LFSR used by the PSG. However, this knowledge didn't help to explain why the volume was getting lower. Perhaps I had an error in my implementation?

The Implementation


One of my requirements was to run the emulator at full-speed, in the machines of the time. By 1998, this means simulating this entire circuit on a Pentium 1, and yet still have time for the emulation of the CPU and the VDP. Coding this in Assembly wasn't enough, I had to guarantee that all inner loops ran entirely in registers.

That's hard to achieve in the x86 architecture, which has only 7 general-purpose registers. My first attempt was a hack: during PSG emulation, I turned off all interrupts, and used the stack pointer as a register. This worked fine at home, but I used DOS at home. Most of my users had Windows 95, and this hack didn't work there.

Let's find another solution: there were lots of variables which changed when a PSG register is touched, but have constants values inside the inner loop. This presented a good opportunity to use self-modifiable code: I would "recompile" the inner loop every time a PSG register changed. This solution worked on Windows, and it was fast enough.

Fast, but also complex. You can see this code in github, it's neither legible, not easy to understand. Perhaps the volume wasn't changing due to bug in the implementation.

However, after thinking about it, I came to the conclusion that my implementation should be correct. If the implementation was wrong, then this error should only appear in my emulator. But all other MSX emulators had the same bug!

If many different implementation had the same bug, then the problem clearly wasn't in the code. It must be something more abstract. Maybe I could get some clue by analyzing mathematically the signal.

The Transform


Let's model the signal: suppose the output bits of the LFSR are a discrete sequence {di}, and for each bit 1 on the input we output a waveform s(t). If the pulse width is T, then the PSG noise is:


From the theory of stochastic processes, we know that the power spectral density of the signal is:


...where Gd(f) is the power spectral density of the sequency {di}, and S(f) is the Fourier transform of s(t). If we suppose the noise is white noise, then Gd(f) is a constant. The waveform s(t) is a rectangular pulse, whose transform is the sinc function. Therefore, the power spectral density of the signal is a squared sinc.

The bug appears when PSG register has the value 1, in this case T=8.98ยตs. The power spectral density of the output is:


Again, the perceptual loudness of the output is the power of the signal, and by Parseval's theorem, the power is the area below the graph.

After staring for some time at this graph, I had the insight I was looking for!


The solution


The problem was between the chair and the keyboard! More specifically, the problem happened because there was a human being between the chair and the keyboard! The human auditory system can only hear frequencies up to 20kHz, and this signal was spreading power over inaudible frequencies. A human being can only hear this part of the power:


The integral of the squared sinc cannot be solved analytically, but we can evaluate it numerically. This is a table with the audible power divided by the total power, for each of the frequency values used by the BASIC program:


Register 6Percentage of audible power
80.99
70.99
60.99
50.99
40.99
30.96
20.83
10.50


Comparing the power of the emulator and the turboR on the graph, it clear that this percentage is exactly what's missing to make the emulator sound the same as the real machine. By adding this attenuation to the emulator, it became true to the original as I wanted.

If you're paying attention, you may be asking yourself: "If the problem is only perceptual, then why the graph of the power measured by Octave shows the volume loss?". This is also an artifact from the human auditory system, but indirectly so.

When I plugged the turboR on my notebook's microphone, the sound board made the audio capture with CD quality, sampling it at 44100Hz. The designers of the sound card are smart, so they added a low-pass filter on the Nyquist's frequency before sampling, to avoid aliasing. Since this frequency was calculated to throw away everything humans can't hear, we're back to the same problem.

If you want to make the experiment yourself, I've placed on github the Octave scripts used to generate the images and tables of this post:

Scripts in Octave for noise analysis

This bug gave me lot of trouble to solve, but, in the end, I doubt anyone but me noticed the difference :)