August 3, 2018
Keyhole APIs - Good for Microservices, But Not for Unit TestingI've been thinking a lot lately about what I call keyhole APIs.
A keyhole API is the simplest API possible, that presents the smallest "surface area" to clients for its complete use. This means there's a single function exposed, which has the smallest number of primitive input parameters - ideally one - and a single, simple output.
To illustrate, I had a crack at TDD-ing a solution to the Mars Rover kata, writing tests that only called a single method on a single public class to manipulate the rover and query the results.
You can read the code on my Github account.
This produces test code that's very loosely coupled to the rover implementation. I could have written test code that invokes multiple methods on multiple implementation classes. This would have made it easier to debug, for sure, because tests would pinpoint the source of errors more closely.
If we're writing microservices, keyhole APIs are - I believe - essential. We have to hide as much of the implementation as possible. Clients need to be as loosely coupled to the microservices they use as possible, including microservices that use other microservices.
I encourage developers to create these keyhole APIs for their components and services more and more these days. Even if they're not going to go down the microservice route, its helpful to partition our code into components that could be turned into microservices easily, shoud the need arise.
Having said all that, I don't recommend unit testing entirely through such an API. I draw a distinction there: unit tests are an internal thing, a sort of grey-box testing. Especially important is the ability to isolate units under test from their external dependencies - e.g., by using mocks or stubs - and this requires the test code to know a little about those dependencies. I deliberately avoided that in my Mars Rover tests, and so ended up with a design where dependencies weren't easily swappable in ths way.
So, in summary: keyhole APIs can be a good thing for our architectures, but keyhole developer tests... not so much.
December 19, 2015
The Multi-threaded Santa TDD ChallengeIf you've hunted on the Interwebs for TDD katas, you will no doubt have noticed that they tend to be based on relatively straightforward problems. In particular, there are no - as far as I'm aware - TDD katas that set more challenging high-volume, multi-threaded, multi-process problems to solve.
In the spirit of Christmas - if you celebrate such a thing (and this year, I won't be) - I set some students just such a problem, which I'm calling the Multi-threaded Santa TDD Challenge.
Now, read this very carefully. It goes something like this:
Multi-threaded Santa TDD Challenge
The goal is to make, wrap, load and deliver 1 million presents on Xmas Eve. Play this challenge in pairs, with multiple pairs competing to do it for the lowest cost.
Santa and his elves have to make, wrap, load and deliver 1 million presents. They work in 4 teams:
Each team must be implemented in its own process (e.g, a web service for each activity, or a daemon, etc etc).
Each team either sends to or takes from (or both) one of 4 queues, and each queue has a maximum size, beyond which no more presents can be added:
1. Made (max size = 1000)
2. Wrapped (max size = 2000)
3. Loaded (max size = 5000 - these are all the presents on Santa's sleigh for delivery)
4. Delivered (no size limit)
How does Santa deliver all those presents in just one night, though? Simple. Santa Time works differently to our time. An hour in Santa Time is just 1 second (1,000ms) of human time. So 3600 hours of Santa Time elapse in 1 hour of human time.
You must implement these time delays in a Present class, with 4 distinct lifecycle event methods:
* Present.make() - wait 50ms
* Present.wrap() - wait 10ms
* Present.load() - wait 5ms
* Present.deliver() - well, we'll get to that...
To perform work on a present, a worker thread must invoke the appropriate method, incurring a time cost for that work within that thread. Only after that work has been performed can the present be sent to the queue for the next process to take from. e.g. one thread in the Making process called present.make(), then sends that present to the Made queue, where the Wrapping process can pick it up.
In each process, there are one or more elves (worker threads). Elves are managed by a fifth process: the Elf Pool. To add an elf (worker thread) to a process (e.g., wrapping), you must request an Elf ID from the Elf Pool.
The Elf Pool records every elf assigned to every process for the duration of the work.
Because of the very strict rules of the Elf Union, elves - once assigned to the work - cannot be laid off. So the total number of elves employed can only go up, not down, until all presents have been delivered.
Elves can, however, be swapped between the 4 work processes: so an elf assigned to wrapping can be re-assigned to loading, for example. This, too, must be managed through the Elf Pool.
One worker process (e.g. wrapping) signals an intention to reassign one of its elves to the Elf Pool. The elf remains in that process until another worker process (e.g., Loading) requests another elf, at which point - instead of assigning a new elf from the pool - that elf is re-assigned instead. So the total number of elves employed remains the same. If the Elf Pool has no elves awaiting re-assignment, it assigns a new elf when one is requested.
There is only one Santa, and only one sleigh!
Okay, here's your bottleneck. There can only be one worker thread in the Delivering process, and the delivery round takes a fixed amount of time each time the sleigh goes out, regardless of how many presents or elves are assigned to Delivery. It takes Santa 500ms to make one round of deliveries. i.e. delivering all the presents on Santa's sleigh - no matter how many there are - incurs a time delay of 500ms. At the end of that time, all the presents in the sleigh (i.e., on the Loaded queue) are delivered. This could, for example, be achieved with a Sleigh class that implements the time delay in a deliver() method.
Be advised that no presents can be loaded for delivery while Santa's sleigh is out on a round. This is your bottleneck! For 500ms (at least), the elves doing the loading will be idle. So you may want to reassign them to making or wrapping. But... hmmm... how long will that take?
How to score?
The Elf Pool has another job in this challenge: elves don't work for nothing. They work for cookies. Specifically, one elf assigned for one hour of Santa Time (1s of human time) gets paid 1 cookie. So 100 elves assigned for 1000s of human time would cost 100,000 cookies.
The Elf Pool calculates the cookie payroll, multiplying the number of assigned elves by the time it takes to deliver all the presents in human seconds.
To win, your team needs to deliver all 1 million presents for less cookies than your competitors.
Never Give In. Never Surrender.
Now, here's the kicker: once the start whistle blows, there is NO STOPPING. If any of your worker processes falls over, the work becomes blocked, the clock keeps on ticking, and you keep burning cookies. Restarting any worker process must re-attach all of the elves that were assigned to it via the Elf Pool before it can begin working on presents again.
If your Elf Pool falls over: you are disqualified, and Santa gets to kick you in the backside for all eternity for ruining Xmas!!!
So, there you have it: 1 million presents, 4 worker processes, as many worker threads as you like (within the rules), 4 work queues, and one manager process keeping track of the elves and the cookies and the time.
Easy as cheese. Average time for students to complete was 4 hours. My advice: keep it as simple as you can. I've already piled on plenty-much complexity for you to deal with!
March 27, 2015
Component-based & Microservice Architecture: Swappability Happens On The Client SideLunch time teleconference looms; just enough time to spew out these thoughts about distributed component architectures (you young hop folk may know them as "microservices", which is the trendy cool Dubstep name for them).
The key to distributed components is swappability - actually, that's kind of the whole point of components generally, distributed or in-process, so take this as generally applicable advice. Or don't. see if I care.
What our bearded young hipster friends often forget to mention is that the swappability in component-based design really happens on the client side of component-to-component (or service-to-service) collaborations. Not, as you may have been led to believe, on the server side.
Sure, we could make pretty much any kind of component present the same, say, REST API. But we still run the risk of binding our client to that API, and to the details of how to consume RESTful services. (Which can be, ironically, anything but to work with.)
Nope. To make that service truly swappable, we have to hide all of the details from the client.
UML 2.0, that leviathan of the component-based era, introduced the notion of components and connectors. It's a pretty neat idea: basically, we cleanly separate the logical conversation held between two components from the dirty business of the medium through which that conversation takes place.
To connect the idea with the real world, let's say I ask the prime minister if he's ever eaten three Shredded Wheat and he replied "Yes". Now, I didn't say how I asked him. Maybe I went to Number 10 and asked him to his face. maybe I emailed him. Maybe I had the words branded onto a poor person and paraded said pauper through the streets so that the PM would see the question when they reported the news on his tellybox.
What matters is that I asked, and he replied.
In component architectures, we seek to separate the logic of component interactions from the protocols through which they physically connect.
Let's say we have a Video Library web application that wants to display 3rd-party reviews of movies.
We could look for reviews on IMDB, or on Rotten Tomatoes (or on both). We want to ask the logical question: what reviews have been written for this movie (identified by the movie's title and year)?
We can codify the logic of that interact with interfaces, and package those interfaces in a component the client knows about - in this case, the VideoReviews .NET assembly, which contains the interfaces IReviewService and IReview.
The VideoTitle class that consumes these services doesn't need to know where the reviews are coming form, or how they're being marshalled. It just wants to ask the logical question. So we present it with a logical interface through which to do that.
By injecting the review service into the VideoTitle, it becomes possible to dynamically bind implementations of those interfaces that know how to connect and interact with the remote server (e.g., the Rotten Tomatoes API), and unpacks the data that comes back, translating into a form that VideoTitle can use easily.
All of that is done behind the scenes: VideoTitle knows nothing about the details. And because it knows nothing about the details, and because we're injecting it into the VideoTitle from outside - e.g., in its constructor, or as a method parameter when VideoTitle is told to get reviews - it becomes possible to swap in different connectors to get reviews from other services.
All of this can be wired together from above (e.g., we could instantiate an implementation of IReviewService when the application starts up), or with a dependency injection framework, and so on. The possibilities are many and varied for runtime hi-jinks and dynamic larks.
The component dependencies are crucial: our app logic (in the VideoLibary component) only depends on the abstractions in the VideoReviews library. It depends in no way on the external services. It does not know there are even external services involved.
All component dependencies point towards the abstractions, satisfying the Stable Abstractions package design principle.
It now becomes possible to do clever things with swappability, like pooling connectors that point to different service instances to provide basic load-balancing or fail-over, or giving the end user the choice of which service to use at runtime.
Most importantly, though, it gives us the ability to vary the logic of our applications independently of the details of how it connects to external services independently of each other. If a new movie review site came along, we would simply have to write a new connector for it, and wouldn't have to rely on them implementing the same web API as our existing providers. Because, that my friends, is beyond our control.
So, succeeding with components is about swappability, and swappability is about programming the logic of our applications against clean interfaces that we control.
The REST is details.