Wednesday, January 27, 2010

Using Retlang for multi-threaded windows form code

I've been a fan of Retlang for a long time and have been meaning to write up some of the ways I've used it. This is the first of 3 blog entries and in this one I'll describe a way of using Retlang to avoid the dreaded InvalidOperationException "Control accessed from a thread other than the thread it was created on".


It avoids lots of nasty boilerplate checks of InvokeRequired. This isn't new and has been described before, but what I want to show is how it can also make testing easier and work with existing MVC patterns.


We are going to use Retlang to allow controller code and view code to communicate over message channels in a thread safe way.


Let's imagine a very simple implementation of MVC to illustrate the approach.


Here's the test. We are just adding numbers together and then displaying the answer in the view.



[TestFixture]
public class TestSimpleController
{
    private Model model;
    private SimpleView view;

    private MockRepository repository;

    [SetUp]
    public void SetUp()
    {
        repository = new MockRepository();
        model = repository.StrictMock<Model>();
        view = repository.StrictMock<SimpleView>();
    }

    [Test]
    public void testShouldDoSimpleAdd()
    {
        model.PushNumber(10);
        model.PushNumber(5);
        model.SumStack();
        LastCall.Return(15);
        view.DisplayCurrentTotal(15);

        repository.ReplayAll();

        var simpleController = new SimpleController(model,view);
        simpleController.SendNumber(10);
        simpleController.SendNumber(5);

        simpleController.Sum();
        repository.VerifyAll();
    }
}


I'm using Rhino Mocks for mocking the view and the model. The controller implementation looks like this:


public class SimpleController
{
    private readonly Model model;
    private readonly SimpleView view;

    public SimpleController(Model model, SimpleView view)
    {
        this.model = model;
        this.view = view;
    }

    public void SendNumber(int i)
    {
        model.PushNumber(i);
    }

    public void Sum()
    {
        var result = model.SumStack();
        view.DisplayCurrentTotal(result);
    }
}


Now suppose we have to extend the implementation to deal with a long running task. We don't want to block the view thread, as that is bad for user experience, so we create a new thread to call the model from. Here is the controller code for that:


public void DoPrediction()
{
    var thread = new Thread(InvokeModelWork);
    thread.Start();
}

private void InvokeModelWork()
{
    var result = model.LongRunningCalculation();
    view.DisplayCurrentTotal(result);
}


It's worth noting here that a test for this code written in the same way as for our SimpleAdd will quite likely start failing at this point, as the asserts will be made on the calling thread before the "worker" thread has called the model and the view. Here is that naive and incorrect version of the test:


[Test]
public void testShouldInvokeLongRunningCalc()
{
    model.PushNumber(33);
    model.PushNumber(44);
    model.PushNumber(11);
    model.LongRunningCalculation();
    LastCall.Return(88);
    view.DisplayCurrentTotal(88);

    repository.ReplayAll();

    var simpleController = new SimpleController(model, view);
    simpleController.SendNumber(33);
    simpleController.SendNumber(44);
    simpleController.SendNumber(11);

    simpleController.DoPrediction();
    repository.VerifyAll();
}


This sort of test is a common problem in many code bases where the threading is not added until after complaints from the users about performance problems - this can create a lot of problems with the testing and is a very common source of bugs in MVC code. It is easy to tie yourself in knots using the mock framework to signal threads that a method was called or, even worse, using Thread.Sleep() to pause inside of the test. My experience has been that both of these are a source of "mysterious" test/build failures and are very hard to maintain.


Anyway, even if we make our test pass, we'll see the following when we try to run the code for real in a Windows Forms implementation of our view interface:


{"Cross-thread operation not valid: Control 'textBoxResult' accessed from a thread other than the thread it was created on."}


For completeness here is the very simple user control that implements the view.


public partial class SimpleControl : UserControl, SimpleView
{
    private readonly SimpleController controller;

    public SimpleControl()
    {
        controller = new SimpleController(new DoesMath(),this);
        InitializeComponent();
    }

    public void DisplayCurrentTotal(int i)
    {
        textBoxResult.Text = i.ToString();
    }

    private void buttonSum_Click(object sender, EventArgs e)
    {
        controller.Sum();
    }

    private void buttonPredict_Click(object sender, EventArgs e)
    {
        controller.DoPrediction();
    }

    private void buttonSubmit_Click(object sender, EventArgs e)
    {
        var input = int.Parse(textBoxInput.Text);
        controller.SendNumber(input);
    }
}


So how can Retlang help?


Retlang lets us create channels which different threads can use to communicate. We can create one of these channels from the dispatch thread of a Windows Forms class.


Here is the new version of the constructor for the controller:


private Channel<int> viewChannel;

public SimpleController(Model model, SimpleView view)
{
    this.model = model;
    viewChannel = new Channel<int>();
    viewChannel.Subscribe(view.Fibre, view.DisplayCurrentTotal);
}


In this simple case we have one channel over which we will send ints, and we have just one subscriber, which is the view.DisplayCurrentTotal() method. We also ask the view to provide us with the fiber to use (through its Fibre property); this allows the view to provide a fiber that it knows will be safe to execute the subscriber method(s) on.


In more advanced cases you might want to have many channels, or multiplex different message types over the same channel and use the Dispatcher pattern to route those to the right subscribers on the view. I've done just that in a real example and used reflection to create mappings between each view method and the message type it should receive from the channel.
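
The reflection-based routing mentioned above could look roughly like the sketch below. This is not the code from the real project, only a minimal illustration under stated assumptions: ViewDispatcher, RecordingView, TotalChanged and StatusChanged are hypothetical names, and every routable view method is assumed to be a public void method taking exactly one message argument. It deliberately uses no Retlang types; the idea is that Dispatch would be the single subscriber on a shared channel.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

// Hypothetical message types; not part of the original example.
public sealed class TotalChanged
{
    public readonly int Total;
    public TotalChanged(int total) { Total = total; }
}

public sealed class StatusChanged
{
    public readonly string Text;
    public StatusChanged(string text) { Text = text; }
}

// Hypothetical view: each handler is a public void method taking one message.
public class RecordingView
{
    public readonly List<string> Calls = new List<string>();
    public void DisplayCurrentTotal(TotalChanged msg) { Calls.Add("total:" + msg.Total); }
    public void DisplayStatus(StatusChanged msg) { Calls.Add("status:" + msg.Text); }
}

// Builds a message-type to method table once, then routes each message through it.
public sealed class ViewDispatcher
{
    private readonly object view;
    private readonly Dictionary<Type, MethodInfo> handlers = new Dictionary<Type, MethodInfo>();

    public ViewDispatcher(object view)
    {
        this.view = view;
        // DeclaredOnly keeps inherited methods out of the table.
        const BindingFlags flags = BindingFlags.Public | BindingFlags.Instance | BindingFlags.DeclaredOnly;
        foreach (MethodInfo method in view.GetType().GetMethods(flags))
        {
            ParameterInfo[] parameters = method.GetParameters();
            if (method.ReturnType == typeof(void) && parameters.Length == 1)
            {
                handlers[parameters[0].ParameterType] = method;
            }
        }
    }

    // The one method that would be subscribed to the shared channel.
    public void Dispatch(object message)
    {
        MethodInfo handler;
        if (!handlers.TryGetValue(message.GetType(), out handler))
        {
            throw new InvalidOperationException("No view method accepts " + message.GetType().Name);
        }
        handler.Invoke(view, new[] { message });
    }
}
```

Because the table is built once in the constructor, the reflection cost is paid up front, and a message type with no matching view method fails loudly instead of being silently dropped.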


On with our simple example; so what do Sum() and DoPrediction() look like now?


public void Sum()
{
    var result = model.SumStack();
    viewChannel.Publish(result);
}

public void DoPrediction()
{
    var thread = new Thread(InvokeModelWork);
    thread.Start();
}

private void InvokeModelWork()
{
    var result = model.LongRunningCalculation();
    viewChannel.Publish(result);
}


So instead of calling methods on the view we publish messages, in this case just an int.


What about the test? We'll need a few changes to SetUp as well. Retlang gives us a nice stubbed implementation of a fiber, StubFiber, that we can use for testing.


// New field on the test fixture, alongside model, view and repository.
private StubFiber fiber;

[SetUp]
public void SetUp()
{
    fiber = new StubFiber();

    repository = new MockRepository();
    model = repository.StrictMock<Model>();
    view = repository.StrictMock<SimpleView>();
    SetupResult.For(view.Fibre).Return(fiber);

    fiber.Start();
}


[Test]
public void testShouldInvokeLongRunningCalc()
{
    var finished = new ManualResetEvent(false);

    model.PushNumber(33);
    model.PushNumber(44);
    model.PushNumber(11);
    model.LongRunningCalculation();
    LastCall.Return(88);
    view.DisplayCurrentTotal(88);
    // note the type of m below must match the parameter type of the above method
    LastCall.Callback((int m) => finished.Set());

    repository.ReplayAll();

    var simpleController = new SimpleController(model, view);
    simpleController.SendNumber(33);
    simpleController.SendNumber(44);
    simpleController.SendNumber(11);

    simpleController.DoPrediction();

    Assert.IsTrue(finished.WaitOne(1000),"Timed out");
    repository.VerifyAll();
}


We use a manual reset event to wait for a particular event to occur: the expected call to view.DisplayCurrentTotal() has a callback attached that sets the event. We then wait for the event (or a timeout) before calling VerifyAll(). We can't avoid the fact that things happen asynchronously, but by using a message channel we make that explicit and, importantly, we get consistency across all the controller/view interactions. If we treat our test code as just code, we can take advantage of this consistency to extract methods and so on, which helps keep the code readable.
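
As one illustration of extracting the repeated "wait for the callback or time out" steps, here is a minimal sketch of a small test helper. Signal and SignalExample are hypothetical names, not part of Retlang, Rhino Mocks or NUnit, and the worker thread below only stands in for the controller calling the view.

```csharp
using System;
using System.Threading;

// Hypothetical test helper: wraps the wait-or-time-out steps so that
// each asynchronous test does not have to repeat them.
public sealed class Signal : IDisposable
{
    private readonly ManualResetEvent finished = new ManualResetEvent(false);

    // Call from the mock callback; returns bool so it fits callbacks that expect one.
    public bool Set()
    {
        return finished.Set();
    }

    // Blocks the calling (test) thread until Set() is called or the timeout elapses.
    public void WaitOrFail(TimeSpan timeout, string what)
    {
        if (!finished.WaitOne(timeout))
        {
            throw new TimeoutException("Timed out waiting for " + what);
        }
    }

    public void Dispose()
    {
        finished.Close();
    }
}

public static class SignalExample
{
    // Stands in for a test around controller.DoPrediction():
    // the work completes on another thread and the test waits for it.
    public static int Run()
    {
        int displayed = 0;
        using (var signal = new Signal())
        {
            var worker = new Thread(() =>
            {
                displayed = 88;   // stands in for view.DisplayCurrentTotal(88)
                signal.Set();
            });
            worker.Start();
            signal.WaitOrFail(TimeSpan.FromSeconds(1), "DisplayCurrentTotal");
            worker.Join();
        }
        return displayed;
    }
}
```

In a fixture like the one above, the callback would call signal.Set() instead of finished.Set(), and the test would call WaitOrFail before VerifyAll().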


Here are the required changes for the view:


private readonly FormFiber formFiber;

public SimpleControl()
{
    formFiber = new FormFiber(this, new BatchAndSingleExecutor());
    controller = new SimpleController(new DoesMath(),this);
    InitializeComponent();
    formFiber.Start();
}

public IDisposingExecutor Fibre
{
    get { return formFiber; }
}


We no longer need to worry about surrounding things with InvokeRequired checks; our implementation of the view interface stays nice and clean. We also avoid having to create a view Proxy, which is another solution to the InvokeRequired problem but one that requires a lot of repetitive boilerplate code.


I've tried to keep things simple for this example. Some things to think about for a real world implementation are:

  • Methods on the view interface will be of the form void Method(Message msg).
  • Any methods on the view that do need to return something must be called from the same thread that created the form (but to be thread safe this must be done anyway).
  • Whether to multiplex multiple message types over the same channel and then use a Dispatcher to call the correct method on the view, or to have a channel per message type.
  • Whether to create different message channels for different styles of message flow and quality of service.

I wonder if this is more Model Channel Controller as opposed to Model View Controller? 


In the next entry I'll describe using a similar idea to allow high performance, low latency updates of a view direct from a domain model by sidestepping the controller but without seriously compromising encapsulation. In the final entry of the three I'll describe how to use Retlang channels to efficiently share work across multiple worker threads and hence CPU cores. The Retlang developers have done a great job and created a simple to use but highly useful library.

Tuesday, November 24, 2009

Ban The Debugger

How much logging do you really need in your application?

I was visiting a client recently to help diagnose some problems with one of their applications. I asked their support people to send me the log file, which seemed like a good place to start. I got a file of about 2.5K, which seemed kinda small to me. I opened it up and found just one exception and its stack trace, so I got back to the support people to check. Yep, that was it: logging cranked up to max and the only output was one stack trace. Not even the date and time of the problem. Wow!

A technique I find very useful prior to go-live is to Ban the Debugger. Developers get very used to just firing up the debugger when fixing issues or diagnosing problems. This is fine, but it means that no one looks at the log files from the point of view of someone who only has those available to find out what is going on. For our colleagues in support and operations, the log files are all they have to find and fix issues.

So during development prior to go-live I stop the developers from using the debugger. Instead I ask them to spend at least some time trying to fix the issue based solely on the log files and whatever else our friends in support will have available once we have gone live. This usually leads to a big upswing in the amount of logging, and it's logging we know helps to fix issues. Of course sometimes you do need the debugger, but hopefully only after we've used the logging to narrow down the problem area and to understand what the users were trying to do at the time.

So back to the client above: it turned out the technical lead had never worked in support and the support team had not really been represented during development. We owe our colleagues better than this; perhaps Banning the Debugger for a time during development might help.

Wednesday, November 04, 2009

Technology Lightning Talks in Chicago

Some of my colleagues on the ThoughtWorks Technology Advisory Board will be delivering Lightning Talks next week in Chicago. I'm not speaking myself but will be attending. If you are in Chicago and would like to come along, follow this link for registration information. I think it's a pretty great list of speakers and am looking forward to hearing them myself.

Monday, November 02, 2009

Databases and Separation of Concerns

A continuing source of pain on projects is Object Relational Mappings and databases. A contributor to this pain is the mixture of two concerns, which seems to occur on nearly every project that uses a database. I think spending a moment to think about these two different kinds of usage is worthwhile.

So what are these two concerns?

A. Persist State Needed for Recovery

This is just working state saved so that when the application restarts we can continue processing. For example, the current state of a customer order or work in progress on a very long running calculation.

B. Data Saved for Reporting and Querying

This is data saved so it can be queried later on, perhaps to allow end of month reports to be generated or user trends to be tracked. It is not needed to recover the working state of an application.

Many teams try to overload (A) to achieve (B). The sorts of problems this can cause are:

i) Data Volume: the volumes of data needed for (A) tend to be smaller, as it's current working data as opposed to historical data. This can show itself as performance issues for the application as queries become slow over time.

ii) The object design needed for (A) and (B) is not necessarily the same. This often shows itself as fields or object relationships being created with names like "history" or "recordOf", so an object design created for (A) becomes overloaded with things needed for (B). This again causes performance issues, as the number of objects and the amount of data pulled into memory by the ORM can increase. It also means a simple state update can touch a lot of tables as we try to achieve (B) at the same time, and chances are the indexes needed for historical queries will start to slow those updates down.

iii) Confusing Code: as with any other area where we fail to achieve separation of concerns, the code can become confusing. For example, state change operations become implicitly overloaded to create and persist data needed for historical reasons.

iv) Archiving of historical data becomes problematic: you can't cleanly identify what data in the DB can be safely moved out to a historical database or deleted without impacting the functionality of the application itself.

It won't always help, but separation of these two concerns can provide clarity and address some kinds of performance issues. I certainly think it's worth calling out these concerns as different kinds of requirement even if you end up using the same implementation for both.
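
To make the distinction concrete, here is a minimal sketch of keeping the two concerns behind separate interfaces. All of the names (IOrderStateStore, IOrderHistory, OrderService and the in-memory implementations) are hypothetical, and the in-memory classes only stand in for whatever storage a real project would use.

```csharp
using System;
using System.Collections.Generic;

// (A) Working state needed for recovery: one current value per order.
public interface IOrderStateStore
{
    void Save(string orderId, string status);
    string Load(string orderId);
}

// (B) Data kept for reporting: append-only, never read to recover the application.
public interface IOrderHistory
{
    void Record(string orderId, string status, DateTime at);
}

public sealed class InMemoryStateStore : IOrderStateStore
{
    private readonly Dictionary<string, string> current = new Dictionary<string, string>();
    public void Save(string orderId, string status) { current[orderId] = status; }
    public string Load(string orderId) { return current[orderId]; }
    public int Count { get { return current.Count; } }
}

public sealed class InMemoryHistory : IOrderHistory
{
    public readonly List<string> Entries = new List<string>();
    public void Record(string orderId, string status, DateTime at)
    {
        Entries.Add(orderId + ":" + status);
    }
}

public sealed class OrderService
{
    private readonly IOrderStateStore state;
    private readonly IOrderHistory history;

    public OrderService(IOrderStateStore state, IOrderHistory history)
    {
        this.state = state;
        this.history = history;
    }

    public void ChangeStatus(string orderId, string status)
    {
        state.Save(orderId, status);                      // concern (A): overwrite current state
        history.Record(orderId, status, DateTime.UtcNow); // concern (B): explicit, separate call
    }
}
```

With this split the state store stays small however long the application runs, and the history can be indexed, archived or moved to another database without touching the recovery path. Both interfaces could still be backed by the same database if that turns out to be the right implementation.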

Wednesday, September 30, 2009

See you at JAOO?

I'm very excited to be attending JAOO this year. I'll be hanging out at the ThoughtWorks stand, so why not head over and say hello?

Great OS X app for those who change location a lot

I'm finding this extremely useful: http://www.symonds.id.au/marcopolo/ It lets you define rules to work out where you are currently located based on all sorts of criteria, and then define actions for that location: enable screen saver password, default printer, mount a drive, etc.

Wednesday, September 02, 2009

Incremental Internet Product Launch - responding to feedback over predicting needs

I've spent a lot of time working with companies looking to launch offerings on the internet. I've been asked to get involved at various stages of the life cycle, including starting over after the previous launch failed.

There seem to be two main approaches to product development for the internet, and interestingly these approaches often correlate with whether the company is a "traditional" company or not. By traditional I mean a company that previously sold and marketed products via non-internet means; the internet is something they've had to react to. By implication the other kind of company is one that exists solely because of the internet, an "internet company" if you will. They are built around the web and have none of the more traditional retail sales channels.

Traditional Companies

The more traditional companies often seem to have a very fixed approach to product development, perhaps as a legacy of selling through offline means such as the high street? They go through a series of (often costly) exercises to try to fully describe a product offering, to anticipate needs and to fully design a web product. There are usually a series of gates to pass through to get approval for budget, not least because the whole product has been designed and needs to be built, so there is a lot of money involved. Product launches are costly, infrequent events for this kind of company, usually with a lot riding on them.

As the product offering has been designed upfront, traditional methodologies are often used in delivery: no need for a more Agile approach when we already know what the users want. As such, an ability to react quickly to change is rarely valued in this kind of product development, with the real users often not involved at all.

There is a high degree of risk around adoption with this sort of approach: will people actually want to use it? One technique often used to try to mitigate this risk is a big, expensive marketing campaign and lots of publicity around the launch, although this in turn increases risks around performance problems and scaling at launch time. It also tends to push companies towards big upfront expenditure on fixed infrastructure sized for the peak demand. I'm sure we can all think of examples of products launched in a blaze of publicity only to fail very publicly due to performance issues.

Finally, even if things go well and you do manage a good launch, you've got a much bigger problem: non-traditional companies seem able to react extremely quickly and update or create competing offerings, often in weeks and not the months or years it took the traditional company.

Internet Companies

Internet companies seem to do things quite differently, perhaps because they've never had to sell products that had to be 100% fully formed at launch via traditional retail channels. They do something few traditional companies seem willing to do: they create and launch a product with an absolute bare minimum of functionality, just enough to hit the core proposition and nothing more. They then put that out into the wild and see what happens, often with minimum publicity but always with good feedback mechanisms in place.

If no one picks up on the offering it can just as quietly be canned. While money has been lost, it was a small amount compared to a fully fledged product launch. In fact, as this is a relatively inexpensive approach, there is a chance to try out a whole series of ideas and do so quickly and often.

If people do start to use it then the feedback mechanisms come into play and you can respond to that feedback to start incrementally improving the product. There is a very direct relationship with the users and their feedback here. You'll probably see a gradual ramp up in users as things pick up, and you can start to scale up the number of machines and so on. If you've used "cloud" like technologies from day one this is probably pretty easy to do.

By necessity your delivery mechanisms have to be fast and quality has to be a fundamental part of the process: there is no time for a big month-long test period at the end, because users need a rapid response to their feedback for them to stay engaged with the product.

Conclusion

If you are a traditional company looking to go head to head with an internet company you may need to change how you approach internet product development. You'll be going up against companies that constantly and rapidly respond to user feedback. The key seems to be understanding the bare minimum you can launch with and then not being afraid of having a more direct relationship with end users. If someone is telling you that the bare minimum product definition "has to be" very large, you need to be very sure some other company can't see a way to launch something simpler (faster) and just incrementally respond to their end user feedback.

Wednesday, April 22, 2009

Test Code Is Just Code

Some Anti-Patterns to avoid and some techniques for making sure Test Code doesn't slow you down

Maintaining Test Code can become Costly

One of the key objections to Test Driven Development is that the tests will become a barrier to change; that the maintenance and reshaping of the tests needed as new business requirements come into play will start to become prohibitively expensive. In fact there is evidence that this does happen: over time many Agile teams spend more and more time fixing tests that have been broken by new requirements as opposed to working on those new requirements. I've seen too many teams in exactly this situation, and it is usually due to difficult to understand test code that is hard and costly to maintain.

We already know how to keep code in good shape

This problem has, however, been solved, and many high performing teams are able to introduce new functionality or make rapid changes to existing code without needing to start deleting tests or abandon the XP principle of TDD (Test Driven Development). The key is that they behave towards test code as they do any other code: Test Code is Just Code. All the things we have learnt about keeping production code in shape (high discipline, techniques like refactoring, and the use of patterns) apply to test code. And yes, sometimes this even means writing tests to make sure our test code behaves as expected.

Common Traps

I want to say more later on why people seem to treat test code differently, but first I want to describe some of the common traps that I've seen people fall into and some ways of avoiding them. Here are five I see again and again; they can cause a lot of pain and I've met developers who have been put off TDD by falling into them.

Cut & Paste

The first one is cut & paste. For some reason, when it comes to unit tests people suddenly start cutting and pasting all over the place.
Suddenly you find a file with 20 tests, each of which repeats exactly the same few lines of code. I don't think I need to describe why this is bad and I expect we've all seen the outcome: at some point later those tests all start breaking at the same time, and if we are unlucky a few tweaks have happened to the cut & pasted code in each, so we spend a lot of effort figuring out how to make each one pass again. There is no rule that says we can't have methods in test fixtures, so ExtractMethod still applies, and using SetUp sensibly often helps. The same rationale for avoiding cut & paste and the same solutions we know from production code apply to test code.

Poor Encapsulation

The classic example of a failure of encapsulation in test code is when you start to see database specific code creeping into every test; perhaps you have "Naked SQL" in your tests? I'm going to focus on the database here but the same ideas apply to other areas as well, for instance security, auditing and messaging. The pain this trap causes often shows itself at the worst time, when you are optimising the database or trying to sort out a referential integrity issue. Suddenly the tests become a millstone around your neck and start breaking on a very frequent basis, and because of Cut & Paste this can often be a very large number of tests. I've seen teams avoid addressing things like foreign keys in the database because it has become too difficult to get the test code to create and delete things in an appropriate order.

The solution is to introduce proper encapsulation around the test code that looks after the database. The builder and fluent interface patterns are very useful here; if we can hide the implementation and separate concerns we can also often open the door to more advanced testing techniques.
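
As a rough illustration of the builder and fluent interface idea, here is a minimal sketch. CustomerBuilder and FakeDatabase are hypothetical names, and the statements are placeholder strings and not real SQL; FakeDatabase only records what it is asked to execute so that the sketch runs without a database.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the database layer: it just records statements.
public sealed class FakeDatabase
{
    public readonly List<string> Statements = new List<string>();
    public void Execute(string statement) { Statements.Add(statement); }
}

// Fluent builder: tests say what they need; the builder alone knows table
// names and the insert/delete order that foreign keys require.
public sealed class CustomerBuilder
{
    private readonly FakeDatabase db;
    private string name = "Default Customer";
    private readonly List<string> orders = new List<string>();

    public CustomerBuilder(FakeDatabase db) { this.db = db; }

    public CustomerBuilder Named(string customerName)
    {
        name = customerName;
        return this;
    }

    public CustomerBuilder WithOrder(string product)
    {
        orders.Add(product);
        return this;
    }

    // Parent row first, then the child rows that reference it.
    public CustomerBuilder Create()
    {
        db.Execute("INSERT customer " + name);
        foreach (string product in orders)
        {
            db.Execute("INSERT order " + product + " FOR " + name);
        }
        return this;
    }

    // Reverse order on clean-up: children before the parent.
    public void Delete()
    {
        foreach (string product in orders)
        {
            db.Execute("DELETE order " + product + " FOR " + name);
        }
        db.Execute("DELETE customer " + name);
    }
}
```

A test would then read something like new CustomerBuilder(db).Named("Ada").WithOrder("book").Create(), and a change to the schema or to the foreign keys means changing the builder once, not every test.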
Of course you'll want to make sure that builder code is properly tested, that it really puts things into the database in the order you expect and deletes them properly as well, but that should be work you only need to do once.

Bloated SetUp

The next example is Bloated SetUp. Hopefully the name is self evident: you end up with a SetUp method in your test fixture that becomes huge. As with any bloated method it becomes code that is hard to understand and error prone. Perhaps changing the order of calls in SetUp fixes one test but causes 5 others to fail and the reasons are just not obvious? If you've seen this then you are probably suffering from bloated SetUp.

This pain point has two main causes: the first is poor coding practice such as described above, and the second is the structure of the production code itself. It is often dependencies within the production code that force us to do too much work in SetUp; we have to create all the things each class under test requires and in turn their dependencies as well. The pain grows as SetUp code then gets Cut & Pasted into other test fixtures.

Unlike the first two examples, the solution here lies not just in our approach to the test code; we also need to look at the way the production code is structured. The first step is to make sure we use a well understood pattern such as DI (Dependency Injection) to express and understand the dependencies in the production code. The second step is to make use of stubbing or mocking techniques in our test code so that we can isolate the class under test and exercise it without needing to instantiate the whole tree of dependencies. A lot has been written on mocking and DI already, some of the best articles are here & here. Mocking has its own traps, not least of which is only having the real wire-up of all the components done for the first time in production (more later), but used well it is an invaluable technique for keeping test code easy to maintain and understand.
Too hard to write the Test

The previous pain point described a situation where the solution lay in looking at both our test code and the production code; this one is about testing pain that comes entirely from the production code. Finding a test hard to write can tell us a lot about the production code, and a pain point people often refer to is "it's just too hard to write a test for that". My view on this is that code that is hard to test is poorly written code. To write a new unit test, and hence make a change to make it pass, the code we are testing needs to be easy to read and understand. If we can't work out which class to write the test against, that means we have poor separation of concerns or a lack of coherence in our production code. If at first you can't write the new test then refactor the production code until you can; it's the production code that is at fault and not the TDD technique.

Code Integration Test Pain

The last example I want to talk about is integration testing. As this is such an overloaded term I first need to describe what I mean by code level integration testing. In essence it is making sure that when we wire up all the components in our solution they behave as expected together. If we've used patterns like DI, a DI framework such as Spring and a test technique such as Mocking, then we need to make sure we are fully testing the wired up solution as well; unfortunately this can get forgotten. Often we end up with a special test wire-up class or a Bloated SetUp doing the wire-up job. Either way there is a bug waiting to happen if the production wire-up code never gets called until actual deployment. In essence, SetUp for our code integration tests ought to be calling the production wire-up code. I often hear that this is impossible because the production wire-up code is hard wired to a specific environment.
If that is the case the team is probably heading for trouble; too many teams don't spend enough time thinking about how to handle configuration for multiple environments such as Dev, QA and Production. If you can solve that problem first then exercising the full production wire-up code should not be an issue: just point it at the appropriate environmental configuration.

Why do people treat test code differently?

Test Code is only throw away code if the production code is throw away.

I think people often think of test code as throw away, that once in production it becomes superfluous and so there is little use devoting time to keeping it in good shape. When people say this to me I like to ask if they plan on ever changing the production code or doing another release; the answer is always yes. Test Code is an intrinsic part of the solution. Like the launch tower for a rocket, it provides an essential stepping stone in the solution and one that it's worth expending time and effort to keep in good shape.

There are very few engineering disciplines where regular monitoring and testing do not form part of ongoing maintenance activities; in fact I can't think of a single one. We are lucky with software because, if we look after our tests, they are a very low cost way of making sure things can be evolved with minimal risk. We ought to treat testing as an intrinsic part of software engineering instead of a one-off 'gate keeper' activity or a problem for the QA team to solve. Test Code is as much a part of a modern software solution as the production code itself.