Contributed by Matt Wilbur
This is a cautionary tale.
I was recently test-driving a UART module. Nothing earth-shattering, I know. For my current project, I have adopted the "DH" pattern for my driver work, where the Driver provides useful services and interfaces to higher level code and relies on a separate Hardware class (well, in C, a compilation unit), to hide niggly hardware details. I had a simple task for my UART: Receive a character. Pretty basic stuff, eh? That's what I thought.
As I test drove my driver code, I started to add more funky stuff, like allowing my imaginatively-named string receiving function UartDriver_ReceiveString to return before an entire string was read in the case that it received a UART_RX_NOT_READY value from the hardware. That might happen if some clever user asked my driver to always return immediately. Here's the code:
s32 UartDriver_ReceiveString(UartDriver driver, s8* stringToReceive, u32 lengthOfString) { int i; s8 theChar; if ( stringToReceive == 0) return 0; if ( driver == (UartDriver) 0 ) return 0; for (i=0; i<lengthOfString; i++) { theChar = getByteFromHardware(driver); stringToReceive[i] = theChar; if (theChar == UART_RX_NOT_READY) break; } return i; }
Pretty straightforward, right? The helper function getByteFromHardware called one other function, which I'll replicate here as it was when it was first written.
s8 getByteFromHardware( UartDriver driver ) { if( driver->timeout == UART_DRIVER_TIMEOUT_INFINITE ) return getByteFromHardwareWhenItsReady( driver->hardware ); else return getByteFromHardwareAndReturnImmediately( driver->hardware ); } s8 getByteFromHardwareWhenItsReady( UartHardware hardware ) { volatile s8 theCharacter; do { theCharacter = getByteFromHardwareAndReturnImmediately( hardware ); } while ( theCharacter == UART_RX_NOT_READY ); } s8 getByteFromHardwareAndReturnImmediately( UartHardware hardware ) { return UartHardware_ReceiveCharacter( hardware ); }
"Hey!", you say. "How dare you write code without a test!"
Don't worry. I didn't. Here was my simple test:
void test_UartDriver_ReceiveString_ReceiveGetsNegOneIfTimeoutIsNoWait(void) { s8 stringToReceive[5]; UartHardware_ReceiveCharacter_ExpectAndReturn(mockHardware, UART_RX_NOT_READY); UartDriver_SetReceiveTimeout(mockDriver, UART_DRIVER_TIMEOUT_NO_WAIT); TEST_ASSERT_EQUAL(0, UartDriver_ReceiveString(mockDriver, stringToReceive, 5)); TEST_ASSERT_EQUAL(-1, stringToReceive[0]); }
"Hey!", you say (stop shouting, please, I can hear you fine). Why so many friggn' calls? We're embedded software designers. We don't like the overhead of such things. True enough. But I will invoke a few items from my defensive bag of tricks.
- I'm striving for clarity in the code. As long as I'm not actually running out of ram or failing some sort of timing requirement, I'm quite comfortable with these calls. In this case, I don't have a timing requirement.
- One can inline things later. A la C/C++ (with inline), not a la Java with Eclipse.
- Don't try to out-optimize your compiler. You won't win.
If you still don't like it, I'm okay with that. TDD has made me confident enough in my code that I'll still sleep at night. I'll also refer you to Test Driven Development For Embedded C, 12.5, page. 267. Don't have the book? Go buy it.
As any sane person would, I use Ceedling to do my TDD. I felt very clever when all tests ran green.
Test 'TestUartDriver.c' ----------------------- Generating dependencies for UartDriver.c... Compiling UartDriver.c... Linking TestUartDriver.out... Running TestUartDriver.out... ------------------------- OVERALL UNIT TEST SUMMARY ------------------------- TESTED:13 PASSED:13 FAILED: 0 IGNORED:0
Awesome. So, I took this code, made a library and linked it with my application. Badness. Bad badness. A simple, simple CLI (linenoise), was not receiving characters. How dare it! I tested this code! It's good! I figured I misunderstood the hardware (mocking only buys you so much). I spent several hours scratching my head during DOH (debugging on hardware). Finally, I realized what happened. My helper getByteFromHardwareWhenItsReady was not helping. It was hindering. It kept the received character for itself and did not share it. Very rude.
So how did this slip through my safety net? Bad luck. In assembly, here is how UartDriver_ReceiveString breaks down around the call to getByteFromHardware.
0x804acd5 <UartDriver_ReceiveString+47> call 0x804ad03 <getByteFromHardware> 0x804acda <UartDriver_ReceiveString+52> mov%al,-0x9(%ebp)
Okay, so we know getByteFromHardware puts the received char in al. So, what does getByteFromHardware look like?
0x804ad1c <getByteFromHardware+25>call 0x804ad32 <getByteFromHardwareWhenItsReady> 0x804ad21 <getByteFromHardware+30>jmp0x804ad30 <getByteFromHardware+45> 0x804ad23 <getByteFromHardware+32>mov0x8(%ebp,%eax 0x804ad26 <getByteFromHardware+35>mov(%eax),%eax 0x804ad28 <getByteFromHardware+37>mov%eax,(%esp) 0x804ad2b <getByteFromHardware+40>call 0x804ad50 <getByteFromHardwareAndReturnImmediately> 0x804ad30 <getByteFromHardware+45>leave
What a clever compiler. It simply calls getByteFromHardwareWhenItsReady or getByteFromHardwareAndReturnImmediately and then leaves eax in a pristine state so that getByteFromHardware can use it. But this is where it ought to break, no? I am not actually returning the character from getByteFromHardwareWhenItsReady like I should. In C, I am not. But that clever compiler was too clever for me.
0x804ad3e <getByteFromHardwareWhenItsReady+12>call 0x804ad50 <getByteFromHardwareAndReturnImmediately> 0x804ad43 <getByteFromHardwareWhenItsReady+17>mov%al,-0x9(%ebp) 0x804ad46 <getByteFromHardwareWhenItsReady+20>movzbl -0x9(%ebp),%eax 0x804ad4a <getByteFromHardwareWhenItsReady+24>cmp$0xff,%al 0x804ad4c <getByteFromHardwareWhenItsReady+26>je 0x804ad38 <getByteFromHardwareWhenItsReady+6> 0x804ad4e <getByteFromHardwareWhenItsReady+28>leave 0x804ad4f <getByteFromHardwareWhenItsReady+29>ret
See what's going on? You don't? Come on! It's staring you in the face. Nothing happens to eax between the call to getByteFromHardwareAndReturnImmediately and the ret. So, the return value from getByteFromHardwareAndReturnImmediately (theCharacter) is being returned up to UartDriver_ReceiveString, as far as my x86 computer is concerned. Suppose I force another call to a function before returning so eax is modified.
s32 foo() { return -1; } s8 getByteFromHardware( UartDriver driver ) { if( driver->timeout == UART_DRIVER_TIMEOUT_INFINITE ) return getByteFromHardwareWhenItsReady( driver->hardware ); else return getByteFromHardwareAndReturnImmediately( driver->hardware ); s32 bar = foo(); }
Sure enough. Failures galore. If we return theCharacter after that call, all is well again.
As an epilogue, it was (rightly) pointed out to me that I should really have some dedicated tests fore these helper functions. That is, in addition to test_UartDriver_ReceiveString_ReceiveGetsNegativeOneIfTimeoutIsNoWait, I should have a test like this:
void test_UartDriver_Helper_getByteFromHardwareAndReturnImmediatelyReturnsCorrectByte(void) { UartHardware_ReceiveCharacter_ExpectAndReturn( mockHardware, UART_RX_NOT_READY ); TEST_ASSERT_EQUAL(-1, getByteFromHardwareAndReturnImmediately(mockDriver)); }
I did add that. Same problem, but I feel better having that test in place.
And that, my friends, is how your compiler can screw you and why sometimes tests can't always save your ass from dumb mistakes.
As postscript (which differs from an epilogue, I hope) I realized a fundamental flaw in my zeal to get a working UART driver. In fact, I'd be surprised if Kent Beck weren't getting out a ruler at some point during this note so he could smack me on the knuckles (the Kent-Beck-in-my-head that is. I'm not delusional enough to think Kent will read this). In case you're as dim as I was, I'll confess my sins in the hope that it will save my knuckles. Typing verbatim from my Test Driven Development by Example, by (the real person) Kent Beck, here is the TDD micro-cycle:
- Quickly add a test.
- Run all tests and see the new one fail.
- Make a little change.
- Run all the tests and see them all succeed.
- Refactor to remove duplication.
Okay, maybe there was more than one flaw (pobody's nefect), but the biggest one that could have avoided this whole ordeal is number 2. I didn't see the test fail. I was so happy with myself and my ace coding skills that the test passed the first time, I didn't test the test. Which, to me, is one of the functions of step 2. Bad me, but lesson learned. Hopefully, you've learned it too.