October 23, 2008

...Learn TDD with Codemanship

Writing Thread-safe Code - What Can Go Wrong With Multithreaded Logic?

I've been getting a good deal of encouragement to talk about multithreaded programming some more, as it seems to be an area of great technical interest to many of you out there in developerland.

It's also quite obviously an area of some considerable pain...

I want to talk about two specific kinds of things that can go wrong when two or more threads of execution start interacting with the same data and objects:

1. The pre-condition paradox

Jill and John have a joint bank account with a balance of $100. (They're not big earners, I should stress. Probably teachers. Or nurses.)

Jill sees a pair of shoes she really likes in a shop window during her lunch hour. She calls her bank's 24-hour telephone banking service to check the balance of the account to see if she has the $75 needed to pay for the shoes.

Meanwhile, John spies a boxed set of Battlestar Galactica Series 4 (though, as it turns out, it's only the first 10 episodes of series 4, the cheating buggers!) He stops at an ATM to check the account balance to see if there's enough in there to cover the $40 he needs to buy the DVDs.

Both Jill and John see that the account balance is $100. But in the time it takes John to get from the ATM back to the DVD store, Jill has already used her debit card to pay for the shoes. So when John hands his card over to the cashier, the account now only has $25 in it.

His card is rejected, in front of a huge queue of shoppers who are all in the kind of hurry that only lunchtime shoppers can be in. He gets egg on his face, and leaves with the shame of knowing that everyone in that store thinks he's a no-good bum. Which, of course, he is - I mean, what kind of person buys Battlestar Galactica box sets when he's only got $100 left in his account?

But anyway, I digress. The point is that because Jill and John both interacted with the account concurrently, it was possible for Jill's transaction to break the pre-condition of John's transaction after John had checked that the pre-condition for his transaction was satisfied.

One way for Jill and John to avoid this kind of scenario would be if one of them could obtain exclusive access to the account until their transaction - the whole transaction, starting with checking the account balance - is complete. This would mean, in effect, obtaining a temporary lock on the account until the transaction ended. Jill gets the lock first, and so John has to wait until her payment for the shoes has cleared before he can access the account, by which time the balance has changed and his pre-condition will fail when he checks it.

In reality, Jill and John probably wouldn't need to access the joint account at the same except on rare occasions. So locking the account and forcing the other account holder to wait would be a viable option from a performance perspective.

But if many people were trying to access the same account throughout the day - perhaps a large organisation with multiple branches all sharing a single bank account - then the time spent waiting for the account to become unlocked might be very noticeable.

2. Deadlocks

Jill, having blown most of their balance on a pair of shoes, calls her Mom and asks if she can wire her some money. A funds transfer between two bank accounts requires a lock on both of them.

Mom's bank locks her account and then asks Jill's bank to lock Jill's account. Safe as houses, yes?

In the meantime, though, John has been speaking to Dad (who shares a joint account with Mom), and cooked up a similar plan (he's got his eye on a remote-controlled Dalek now, you see). Dad's bank also needs a lock on both accounts. But this time his bank asks John's bank to lock John's account first, and then to lock Dad's.

It could happen that while Mom gets a lock on Mom and dad's joint account, Dad gets a lock on John and Jill's joint account. Now each has to wait for the other person to unlock the other account. And they'll be waiting indefinitely, because Mom can't proceed until John and Jill's account is unlocked, and Dad can't proceed until his and Mom's account is unlocked.

A few simple strategies could have prevented this deadlock:

a. Lock shared objects/data in the same order

If Mom and Dad's bank had both asked to lock their joint account first, the deadlock could not have happened because Dad would have had to wait until Mom was done with both accounts before proceeding.

b. Lock shared objects/data in a single atomic step

If you need both acconts to execute a transfer, then lock both accounts at the same time and make all other threads wait before they can try to get a lock on any of them.

c. Timeout

If a thread has been waiting to long to get a lock on an object or piece of data, have it release all of its locked objects/data in case that's what another deadlocked thread has been waiting for.

One thing I'm putting though into now is how one might use automated tests (or other lightweight QA techniques) to describe multithreaded correctness and detect mutli-threading defects.

Posted 9 years, 8 months ago on October 23, 2008