Docker + RabbitMQ + Clustering + High Availability + Federation

Wow, what the hell am I doing…

I’ve been watching Docker from a distance for some time and have been completely fascinated by its potential. That said, I’d struggled to find an excuse to play with it, partly because I’m a Windows user but mostly because I’m lazy.

I’ve come up against a challenge with a network configuration where I needed to communicate between two networks in separate locations.  One of the environments uses RabbitMQ to manage some communication between applications.  It’s in its infancy (a single node on one machine) but is already the backbone of a number of systems.

We have an event that can occur in both environments, so ideally I’d like to emit the event in either and be able to subscribe to it in one of the environments. It turns out that RabbitMQ offers functionality to allow this: a Federation.

I’ll call the two environments “upstream” and “downstream” (all the event processing is done in the downstream environment).   I needed a mechanism to get events from the upstream into the downstream ready for processing.  It also needed to be resilient and provide high availability:

  • What happens when a rabbitmq node fails?
  • What happens if the network connection between the environments fails?

I wanted to set this all up quickly and without commissioning a load of kit for it.  I decided I could use docker and run multiple rabbitmq instances locally in order to prove the configuration.  This would allow me to simulate two environments with two clusters and what would happen when one or part of an environment ceased to exist.

Prerequisites

In order to do this I needed to set a few things up.  Boot2docker provides a nice way to use docker while on a Windows environment.  I installed it with all options:

[Screenshot: Boot2Docker installer]

I also had to make sure ssh.exe could be found by adding its location to the PATH environment variable (“c:\Program Files (x86)\Git\Bin” in my case):

[Screenshot: PATH environment variable]

With this done I was ready to use boot2docker.

Starting Boot2docker

With the installation complete all interaction with boot2docker and docker can be done via the command line.  In order to initialise docker you run the following commands:

  • boot2docker init

[Screenshot: boot2docker init output]

  • boot2docker start

The above steps prepare the Linux image and start the virtual machine. We then need to set several environment variables. With PowerShell this can be done as follows:

  • boot2docker shellinit | Invoke-Expression

Running Containers

With boot2docker set up we can now start using docker itself. In my example setup I want two clusters, each having two nodes:

  • downstream1 + downstream2
  • upstream1 + upstream2

I’ll need 4 containers with the pairs linked. I’ll also need an image to use for the containers. I used bijukunjummen/rabbitmq-server, which has rabbitmq on it plus the federation plugin. The following command will run the container using that image:

  • docker run -d -p 5671:5672 -p 15671:15672 -h downstream1 --name downstream1 bijukunjummen/rabbitmq-server

I’m creating the container as detached (-d) and I’m mapping two ports from the host to the container (5671 to 5672, 15671 to 15672). I’m then setting the host name as “downstream1” (-h) and the container name as “downstream1” (--name). The last segment specifies the image to use. If you’re using an image for the first time you should see output as follows:

[Screenshot: docker run output]

Docker downloads the image for you. Run the following command to list running containers:

  • docker ps

You can now access the running rabbit instance by the host IP address and the management interface port specified (15671). In my case this was 192.168.59.103:15671. If you want to be able to access this from another machine you can use port forwarding in the VirtualBox configuration settings.

[Screenshot: RabbitMQ management interface]

As we’re running a cluster we’ll need at least one more node. I used the following command:

  • docker run -d -p 5672:5672 -p 15672:15672 -h downstream2 --name downstream2 --link downstream1:downstream1 bijukunjummen/rabbitmq-server

There’s an extra parameter here, “--link downstream1:downstream1”. This creates an alias (in the form name-or-id:alias) on the container, allowing this container to see the other container “downstream1”. Without this the nodes won’t be able to communicate.
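The two docker run commands differ only in name, ports and the optional link, so they can be generated from one small helper. This is only a sketch: node_cmd is a hypothetical function name of my own, and it prints the command rather than running it so you can check it first.

```shell
#!/bin/sh
# Image used throughout this post.
IMAGE="bijukunjummen/rabbitmq-server"

# Hypothetical helper: prints the docker run command for one RabbitMQ node.
# Usage: node_cmd NAME AMQP_HOST_PORT MGMT_HOST_PORT [LINK_TO]
node_cmd() {
    name=$1; amqp_port=$2; mgmt_port=$3; link_to=${4:-}
    cmd="docker run -d -p ${amqp_port}:5672 -p ${mgmt_port}:15672 -h ${name} --name ${name}"
    # Link to an existing node under its own name so rabbit@<name> resolves.
    if [ -n "$link_to" ]; then
        cmd="${cmd} --link ${link_to}:${link_to}"
    fi
    printf '%s\n' "${cmd} ${IMAGE}"
}

node_cmd downstream1 5671 15671
node_cmd downstream2 5672 15672 downstream1
```

Piping the output into sh (for example node_cmd downstream1 5671 15671 | sh) would actually start the container.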

Run the following command to see the containers running:

  • docker ps

[Screenshot: docker ps output]

Clustering

There are a few things to note from the clustering guide. Clustered nodes must share the same Erlang cookie. Because I’m using the same image for all the containers, they already share this.

We’re going to cluster downstream2 to downstream1.  We have to stop the rabbit application in order to join it to a cluster:

  • docker exec downstream2 rabbitmqctl stop_app

This executes a command on the container; stop_app stops the rabbit application.

  • docker exec downstream2 rabbitmqctl join_cluster rabbit@downstream1

This creates a cluster between downstream2 and downstream1. We then need to start the rabbit application again on the downstream2 container:

  • docker exec downstream2 rabbitmqctl start_app

The nodes overview in the rabbit management interface should now look like:

[Screenshot: nodes overview for the downstream cluster]
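The stop, join and start steps are the same for any node joining any cluster, so here is a rough sketch that prints them for a given pair. join_cluster_cmds is a hypothetical name of my own. The final cluster_status line is an addition that was not part of my original steps; it gives a command line alternative to checking the management interface.

```shell
#!/bin/sh
# Hypothetical helper: prints the commands that join NODE to SEED's cluster.
# Usage: join_cluster_cmds NODE SEED
join_cluster_cmds() {
    node=$1; seed=$2
    # The rabbit application must be stopped before the node can join.
    printf '%s\n' "docker exec ${node} rabbitmqctl stop_app"
    printf '%s\n' "docker exec ${node} rabbitmqctl join_cluster rabbit@${seed}"
    printf '%s\n' "docker exec ${node} rabbitmqctl start_app"
    # Extra step (not in the original walkthrough): list the cluster members.
    printf '%s\n' "docker exec ${seed} rabbitmqctl cluster_status"
}

join_cluster_cmds downstream2 downstream1
```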

I’m looking to create a federation, as I intend to have two rabbit clusters with messages being transported from the upstream. I ran the following commands to create the upstream cluster:

  • docker run -d -p 5673:5672 -p 15673:15672 -h upstream1 --name upstream1 bijukunjummen/rabbitmq-server
  • docker run -d -p 5674:5672 -p 15674:15672 -h upstream2 --name upstream2 --link upstream1:upstream1 bijukunjummen/rabbitmq-server
  • docker exec upstream2 rabbitmqctl stop_app
  • docker exec upstream2 rabbitmqctl join_cluster rabbit@upstream1
  • docker exec upstream2 rabbitmqctl start_app

The above steps set up a cluster in the same way as we did with downstream1 and downstream2.

[Screenshot: nodes overview for the upstream cluster]

We should now have 4 containers running, with 2 pairs of nodes running as clusters 🙂

High Availability Queue

The queue we subscribe to will be on the downstream cluster.  The next step was to create the queue on the downstream.  I’ve done this using the web interface to rabbitmq:

[Screenshot: creating the queue]

I then created a fanout exchange (DownstreamExchange):

[Screenshot: creating the exchange]

I then bound this exchange to my queue:

[Screenshot: binding the exchange to the queue]
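I did all of this through the web interface, but the same queue, exchange and binding could probably be declared with the rabbitmqadmin command line tool instead. The sketch below only prints the commands. It assumes rabbitmqadmin is installed on the host, that the management port of downstream1 is reachable at 192.168.59.103:15671 as in my setup, and that the tool’s default credentials are accepted. admin_cmd and declare_topology_cmds are hypothetical helper names.

```shell
#!/bin/sh
# Assumed management endpoint of downstream1 (host IP and mapped port from this post).
MGMT_HOST="192.168.59.103"
MGMT_PORT="15671"

# Hypothetical helper: prefixes a rabbitmqadmin subcommand with the endpoint.
admin_cmd() {
    printf '%s\n' "rabbitmqadmin --host ${MGMT_HOST} --port ${MGMT_PORT} $*"
}

# Prints the commands for a queue, a fanout exchange and the binding between them.
declare_topology_cmds() {
    queue=$1; exchange=$2
    admin_cmd "declare queue name=${queue}"
    admin_cmd "declare exchange name=${exchange} type=fanout"
    admin_cmd "declare binding source=${exchange} destination=${queue}"
}

declare_topology_cmds DownstreamQueue DownstreamExchange
```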

In order to set up high availability in rabbitmq we use a policy:

[Screenshot: HA policy definition]

This creates a “DownstreamHA” policy that matches the queue named “DownstreamQueue” and applies to queues only. I’ve used the “ha-mode” key with the value “all”, which the RabbitMQ documentation describes as:

“Queue is mirrored across all nodes in the cluster. When a new node is added to the cluster, the queue will be mirrored to that node.“

Once added, the policy applies to the DownstreamQueue queue:

[Screenshot: queue with the policy applied]

The “+1” next to the node tells us that the queue is mirrored.
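For reference, the same policy can be expressed with rabbitmqctl set_policy rather than the web form. This is a sketch that prints the command. The exact match pattern I typed into the form isn’t recorded here, so the anchored pattern below is an assumption, as is running it on downstream1. ha_policy_cmd is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: prints a set_policy command mirroring one queue on all nodes.
# Usage: ha_policy_cmd NODE POLICY_NAME QUEUE_NAME
ha_policy_cmd() {
    node=$1; policy=$2; queue=$3
    # Assumption: the pattern matches exactly one queue name.
    pattern="^${queue}\$"
    definition='{"ha-mode":"all"}'
    printf '%s\n' "docker exec ${node} rabbitmqctl set_policy --apply-to queues ${policy} '${pattern}' '${definition}'"
}

ha_policy_cmd downstream1 DownstreamHA DownstreamQueue
```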

Federation

In order to create a Federation there are a few things which need setting up on the upstream cluster:

  • Create user for the downstream to access with
  • Create Virtual Host (I couldn’t get access to the default virtual host when I first tried this)

In this case I’ve created a user “downstream” with a password “password”:

[Screenshot: adding the user]

Then I created the virtual host “Downstream”:

[Screenshot: adding the virtual host]

For the downstream to get access, permissions then need granting via the user admin page:

[Screenshot: user permissions]

Clicking “Set Permission” grants that user permissions for that virtual host:

[Screenshot: virtual host permissions]
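The user, virtual host and permissions can also be created from the command line with rabbitmqctl. The following is a sketch that prints the commands. The permission patterns are an assumption (full configure, write and read access), since the ones I set in the web form aren’t recorded here, and upstream_access_cmds is a hypothetical helper name. Virtual host names are case sensitive, so whatever name is used here has to match the path used in the federation URI.

```shell
#!/bin/sh
# Hypothetical helper: prints the commands that give USER access to VHOST on NODE.
# Usage: upstream_access_cmds NODE USER PASSWORD VHOST
upstream_access_cmds() {
    node=$1; user=$2; pass=$3; vhost=$4
    printf '%s\n' "docker exec ${node} rabbitmqctl add_user ${user} ${pass}"
    printf '%s\n' "docker exec ${node} rabbitmqctl add_vhost ${vhost}"
    # Assumption: grant configure, write and read on every resource in the vhost.
    printf '%s\n' "docker exec ${node} rabbitmqctl set_permissions -p ${vhost} ${user} '.*' '.*' '.*'"
}

upstream_access_cmds upstream1 downstream password Downstream
```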

We created a federation by defining a Federation Upstream using the admin section on the downstream cluster:

[Screenshot: Federation Upstream definition]

As there are two nodes on the upstream I specified the URI for both: amqp://downstream:password@192.168.59.103:5673/downstream and amqp://downstream:password@192.168.59.103:5674/downstream.
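A Federation Upstream is stored as a runtime parameter, so it can also be defined with rabbitmqctl set_parameter, passing the URIs as a JSON list. Here is a sketch that prints the command. The upstream name “upstream-cluster” is made up for illustration (the name I used in the web form isn’t recorded here), and federation_upstream_cmd is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: prints a set_parameter command for a federation upstream.
# Usage: federation_upstream_cmd NODE UPSTREAM_NAME URI [URI...]
federation_upstream_cmd() {
    node=$1; name=$2; shift 2
    uris=""
    # Build a JSON array of quoted URIs, comma separated.
    for uri in "$@"; do
        uris="${uris:+${uris},}\"${uri}\""
    done
    printf '%s\n' "docker exec ${node} rabbitmqctl set_parameter federation-upstream ${name} '{\"uri\":[${uris}]}'"
}

federation_upstream_cmd downstream1 upstream-cluster \
    "amqp://downstream:password@192.168.59.103:5673/downstream" \
    "amqp://downstream:password@192.168.59.103:5674/downstream"
```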

The Federation Upstream won’t run unless we create a policy which applies it to the exchange.  I added a policy as follows:

[Screenshot: federation policy]

I applied this only to exchanges and passed the definition “federation-upstream-set” with the value “all”.
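As with the HA policy, this can be written as a rabbitmqctl set_policy command. The sketch below prints it. The policy name “DownstreamFederation” and the anchored pattern are assumptions for illustration, since neither is recorded here, and federation_policy_cmd is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: prints a set_policy command federating one exchange
# with every defined upstream.
# Usage: federation_policy_cmd NODE POLICY_NAME EXCHANGE_NAME
federation_policy_cmd() {
    node=$1; policy=$2; exchange=$3
    # Assumption: the pattern matches exactly one exchange name.
    pattern="^${exchange}\$"
    definition='{"federation-upstream-set":"all"}'
    printf '%s\n' "docker exec ${node} rabbitmqctl set_policy --apply-to exchanges ${policy} '${pattern}' '${definition}'"
}

federation_policy_cmd downstream1 DownstreamFederation DownstreamExchange
```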

Once added the upstream status should be visible under the admin section:

[Screenshot: federation status]

The exchange was not visible on the upstream cluster until I granted the user I was logged in with the relevant permissions:

[Screenshot: federated exchange]

And in the queues:

[Screenshot: federated queue]

I wanted the queue to have high availability on the upstream as well, so I created a policy for that:

[Screenshot: HA policy for the federated queue]

[Screenshot: upstream policies]

You can see the policy applied to the queue as follows:

[Screenshot: upstream queue with the HA policy applied]

Done!

I now have two clusters, high availability queues and a Federation. I then wanted to see it in action. The main thing I wanted to test was that the upstream would still function when the downstream cluster could not be accessed. I tested this by pausing the downstream containers. I was able to push messages into the upstream exchange, which would then wait in the queue until I unpaused one or all of the downstream containers. Pausing individual nodes in each cluster also gave me the resilience I wanted.
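The outage test boils down to docker pause and docker unpause on the downstream containers. Here is a small sketch that prints the two commands. outage_cmds is a hypothetical helper name, and publishing the test messages in between is left to whatever client you prefer.

```shell
#!/bin/sh
# Containers that make up the downstream cluster in this post.
DOWNSTREAM_NODES="downstream1 downstream2"

# Hypothetical helper: prints the command that starts or ends a simulated outage.
# Usage: outage_cmds start|end
outage_cmds() {
    case $1 in
        start) printf '%s\n' "docker pause ${DOWNSTREAM_NODES}" ;;
        end)   printf '%s\n' "docker unpause ${DOWNSTREAM_NODES}" ;;
        *)     echo "usage: outage_cmds start|end" >&2; return 1 ;;
    esac
}

outage_cmds start
# ...publish messages to the upstream exchange here...
outage_cmds end
```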

Useful Links

https://registry.hub.docker.com/u/bijukunjummen/rabbitmq-server/

http://boot2docker.io/

https://www.rabbitmq.com/clustering.html

https://www.rabbitmq.com/federation.html

https://www.rabbitmq.com/ha.html

https://www.rabbitmq.com/uri-spec.html

Posted in Uncategorized

Test driven development is hard

I recently attended the Manchester Codejo group (www.manchester-codejo.com) run by two awesome developers, Iain and Gemma. I thoroughly enjoyed the session and found it thought provoking and fun. It’s had me thinking a lot about my experiences with TDD and the challenges it often presents.

I’m slowly coming to realise (perhaps obviously to most) that TDD is hard because it’s a completely new way of programming. I often come across the preconceived notions that there’s an extra time overhead with TDD and that it’s a pointless overhead: tests are a waste of time because I could be writing “useful” code. We’re all trained to think solutions first; adding TDD to this then makes us think we wrap our tests around the solution. I see this time and time again, from a code unit level to an “enterprise” application level. We love coming up with solutions, often without knowing what the problem is (or at least understanding it fully).

This isn’t just something that can be applied to software development. I’ve started learning piano recently, and last night’s lesson involved sight reading. Before we started, my piano teacher told me a story told to her by an 11 year old boy she teaches. The boy explained that in his art class there was one girl whose talent was far beyond that of anyone else in the class. In one lesson his art teacher had set the children the task of drawing some flowers. All the students set about drawing, except this one girl. She sat, looking, observing, understanding. When this girl finally started and finished, she produced by far the best drawing. My teacher, Kathleen, then told me to spend a period of time reviewing the music in front of me before starting.

TDD forces us to articulate our understanding of the problem step by step, starting outside in. It challenges all our assumptions about what we understand and what we think the solution is. It discourages us from simply diving in. I think this is profound in a number of ways. It forces us to destroy our attachment to what we thought the implementation would look like, and I often find myself taking this personally. I’ve found my main challenge with TDD is my own personal attachment to how I currently write software.

This line of thought makes me wonder how the barrier to entry can be reduced with TDD. To me it seems to be about encouraging ourselves to let go of how we write software and seeing tests as a way of describing what we’re trying to validate, then implementing as little as possible to solve the problem.

Posted in TDD

Applying what I’ve learnt

I was lucky enough to be included in a session (which I posted about previously) for a second time. This time I decided to stop talking about what I’d learnt and try to apply it. The following is the result:

Tests – http://sllabres.apphb.com/GameOfLife/tests.html

Implementation – http://sllabres.apphb.com/GameOfLife/

I have to admit that in my eagerness to see something I threw together the presentation using a WebGL library called Three.js. In the link provided I’ve used the WebGLRenderer, which won’t work in Internet Explorer and may require enabling in other browsers (Chrome seems to work best).

Posted in TDD

TDD – A New Experience

I had to brain dump this after a really enjoyable day long event run by Kevin Rutherford (http://www.kevinrutherford.co.uk/).

The general aim of this event was to create Conway’s Game Of Life (http://en.wikipedia.org/wiki/Conway's_Game_of_Life) using any language (no graphical representation required) and driven by tests. We all chose C# to develop in for the day (our primary language).

The Game Of Life is a rule based simulation whereby players observe the interaction of “Cells” based on the defined rules. The cells exist in a grid and live, die or resurrect based on their neighbouring cells.

The day was broken up into 45 minute sessions. We paired up practising Red, Green, Refactor, fastest to green.

All of us started with what we thought was an object orientated approach, thinking of the game from the perspective of the object in question. Some of us started by creating tests for the cells and some by creating tests for the grid which would contain the cells. A typical test being along the lines of:

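using NUnit.Framework;

[TestFixture]
public class CellTests
{
    [Test]
    public void CellIsAlive()
    {
        // A newly created cell is expected to start out alive.
        var cell = new Cell();
        Assert.That(cell.Alive, Is.EqualTo(true));
    }
}

class Cell
{
    // Public state that the test queries directly.
    public bool Alive = true;
}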

We’d spent the first 45 minutes going back and forward creating tests for grids and cells. After the first session nobody had got as far as handling any real interactions or applying rules; in fact it looked like it couldn’t be achieved within the day. At the end of every session, all code was deleted and we were forced to start again!

We took a 5 minute break, then we switched pairs and started the same exercise, explicitly practising ping pong (one person would write the test, another would pass the test, then back to the first person, who would refactor). The results were fairly similar. We all tried different approaches around the same implementation, i.e. start by writing tests for the Grid first and build from there. After another 45 minutes nobody was much further.

Another 5 minute break and we switched pairs again. This time no talking was allowed and the person making the test pass had to be EVIL! This was a lot of fun but the result was more a case of people just bouncing back and forward around the same problems.

Lunch time. I must admit, I’ve not been developing a lot this last year. I felt pretty deflated that I hadn’t gotten as far as I’d like and perhaps my coding abilities were beyond repair.

Over lunch Kevin referred to a story whereby a team of developers were asked by their development lead if they were practising object orientated programming. All the developers in question nodded, “yup, of course we are”. The development lead went on to ask who was using getters and setters, they all were. He then set the team the challenge of removing all getters and setters from their code base over a 12 month period.

The point being that by querying the state of objects you break encapsulation, which makes for a code base that is brittle, tightly coupled and very hard to test.

Now, we’re all sitting here wondering how you could make something like The Game Of Life without querying the state of each cell. Surely each cell has to ask its neighbour what its state is? How do we know where each cell is on the grid without asking it? I couldn’t see how this was possible.

Kevin throughout pushes us to answer these questions, never answering questions for us or solving our problems.

He sets this afternoon’s challenge of creating The Game Of Life without using getters or setters. We’re not allowed to query the state of an object at all! We discuss this problem throughout lunch: if we can’t query the state of an object, not only does this make understanding what’s happening difficult but also validating our test results. The whole discussion genuinely turns into “There is no spoon”. Perhaps there’s not actually a grid at all, perhaps there aren’t even cells 🙂

As it happens, we’re not actually interested in the state of each cell, are we? It’s up to the cell to TELL the world what it’s doing and for us to observe behaviour and interactions. We’re taken down the path that cells don’t decide what to do but are told what to do. So again, how do we know if a cell is dead or alive?

We decide that there are not cells but two entities: dead cells and live cells. We also conclude that perhaps something should be telling the cell what to do based on the rules, i.e. the cell asks the rule what it should do: live or die? We can also use a factory to create cells. The cell uses information given to it by the rules to tell a factory to create new cells of the relevant type. This then allows us to mock out the factory and assert when the relevant method is called on the factory. Kevin recommended a pattern called “self shunt” to inject the test class instance into the class it’s testing. That way the class being tested in fact tells the test object what it’s doing, e.g.

using NUnit.Framework;

[TestFixture]
public class TestClass : ICellFactory
{
    // Set by CreateLiveCell when the cell under test calls back into this fixture.
    private bool _liveCellCreated;

    [SetUp]
    public void Setup()
    {
        _liveCellCreated = false;
    }

    [Test]
    public void ExpectCreationOfLiveCell()
    {
        // Self shunt: the fixture passes itself in as the factory.
        new Cell(this);
        Assert.That(_liveCellCreated);
    }

    public void CreateLiveCell()
    {
        _liveCellCreated = true;
    }
}

public class Cell
{
    public Cell(ICellFactory cellFactory)
    {
        cellFactory.CreateLiveCell();
    }
}

public interface ICellFactory
{
    void CreateLiveCell();
}

The day’s drawing to a close, but throughout each iteration we’re realising we can abstract further away from the details. It’s becoming clearer that we don’t need to start as low down as defining the rules, because we can assert how we expect cells to behave based on the rules as we know them (perhaps with more time we’d have gone further out). For myself it’s become an exercise in letting go and really starting to trust the tests I’m writing.

Ok, so we didn’t end up with a complete implementation of The Game Of Life, but we’re now on the road towards true object orientation and top down development. It’s become natural to see TDD as a design practice rather than retrospective testing. Applying these same practices in professional and personal development allows the creation of clean and decoupled code. Best of all, throughout we’re encouraged to constantly write code: working software as a measure of progress!

UPDATE: 

The four rules of simple code were discussed, aka “Extreme Normal Form”.

1. Passes tests
2. Communicates intent
3. No duplication
4. Nothing unnecessary

Also the team of developers referred to included Steve Freeman and Nat Pryce, current reading: http://www.growing-object-oriented-software.com/.

Posted in TDD