I’ve been looking at several MQTT brokers recently, and whilst I shan’t go into all the details of that, I shall post my hard-earned learnings on how to get VerneMQ built and running on macOS 10.12. This is all cobbled together from Googling various issues I ran into and bludgeoning my way to success, but hopefully this end to end description will be useful to somebody. To be honest, I have been massively put off VerneMQ because of this poor out-of-box developer experience and the extremely limited documentation.

Building

  • Ensure Xcode and its command line tools are installed, because we’ll need the command line compiler tools.
  • VerneMQ is built on Erlang, so we need to install that. Unfortunately the official “Erlang Installer” for Mac didn’t work, giving an error: “erl could not be removed”. Thankfully brew works nicely. Don’t know why I didn’t try that straight off.
    • If you haven’t already got brew, install it from https://brew.sh. It’s really very quick and simple and a must-have tool anyway. Then install Erlang:
      > brew install erlang
  • Get the VerneMQ code:
    > git clone https://github.com/erlio/vernemq.git vernemq_git
    > cd vernemq_git
  • On macOS 10.11+ the build fails with “vmq_passwd.c:32:10: fatal error: ‘openssl/evp.h’ file not found”, so we need to specify the openssl location in CFLAGS. Ensure openssl is installed first if necessary, with brew:
    > brew install openssl
  • Get past an Erlang rebar bug, caused by files being read-only, by making them writable (bit of a hacky workaround):
    > chmod -R u+w /usr/local/Cellar/erlang
  • Actually build it, now we have everything we need, adapting the path here as necessary to match the openssl version you actually have.
    > CFLAGS="-I /usr/local/Cellar/openssl/1.0.2k/include -L/usr/local/Cellar/openssl/1.0.2k/lib" make rel
  • If that succeeded (after a fair while) you should have the binaries in _build/default/rel/vernemq/bin/

Running

It’s trivial to start the binary with default config:

_build/default/rel/vernemq/bin/vernemq start

Note that this starts it up then quits, but leaves the server running. You can do the same but with ‘stop’ to shut it down. See the docs on further things you can do with the vernemq binary.

Configuration

If you’ve just built it and are running from the _build location, the config file it is using is at _build/default/rel/vernemq/etc/vernemq.conf

I turned on anonymous users and added a websockets listener by following the instructions in that file. I restarted the server (vernemq restart) and then was able to successfully publish and subscribe with the handy HiveMQ online WebSocket MQTT client.

I was recently frustrated by very slow tests and timeouts in my Java code, that would often show a similar stack trace (if they actually timed out):

    io.vertx.core.VertxException: Thread blocked
     at java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method)
     at java.net.InetAddress$2.lookupAllHostAddr(InetAddress.java:928)
     at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1323)
     at java.net.InetAddress.getLocalHost(InetAddress.java:1500)

The critical part is the lookupAllHostAddr call at the top of the trace, reached via InetAddress.getLocalHost. The Vert.x bit is just what implements the timeout in my case. If you’re using a Mac and you see this, you’re probably having the same problem. You might also see Inet4AddressImpl in the stack trace instead of Inet6AddressImpl.

After a lot of web trawling and some help from a colleague (thanks Tim) I got to the bottom of it. I’m writing it up here, for my own benefit when I run into this again in the future, and because a lot of the existing resources weren’t clear and direct enough to solve my problem easily.

The fix

The slowness is caused by a domain name lookup that’s taking a few seconds each time, because for some reason your computer is asking the network about its own address, and timing out. I don’t fully understand the mechanics frankly, but the fix was simple in my case.

First, figure out what your computer thinks its hostname is, by running hostname in the terminal. Then use the value returned from that to add lines like this to your /etc/hosts file:

    # This works around slow lookup that we sometimes see in
    # java.net.Inet6AddressImpl.lookupAllHostAddr
    127.0.0.1 Sams-MacBook-Pro.local
    ::1 Sams-MacBook-Pro.local

This provides a direct answer for both IPv4 and IPv6, avoiding the slowness. This had the nice effect of bringing my Gradle build time down from 2 minutes to just 44 seconds, including all the tests.
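To check whether the /etc/hosts change has actually helped, it can be handy to time the lookup on its own, outside of any build or test run. The following is a minimal sketch rather than anything from my real code: the class name LocalHostLookupTimer and its method are made up for illustration, and it simply times the same InetAddress.getLocalHost call that appears in the stack trace.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical helper (name invented for this example): times the local hostname lookup.
public class LocalHostLookupTimer {

    /** Times one InetAddress.getLocalHost() call and returns the elapsed milliseconds. */
    static long timeLookupMillis() {
        long start = System.nanoTime();
        try {
            InetAddress.getLocalHost();
        } catch (UnknownHostException e) {
            // The hostname did not resolve at all; the elapsed time is still informative.
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        // Repeat a few times: later calls may be answered from a cache.
        for (int attempt = 1; attempt <= 3; attempt++) {
            System.out.println("Lookup " + attempt + " took " + timeLookupMillis() + " ms");
        }
    }
}
```

If the first lookup takes seconds rather than milliseconds, the hosts entries are probably missing or don’t match what hostname reports.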

Test error

I was writing tests in Swift for a time parsing function, and on the way to getting it right, I saw some very confusing error output, as per the screenshot above. In that screenshot I’ve deliberately broken the test (the times don’t match) to invoke the red error text, but what’s interesting is that the error reports 04:02:15 and 04:03:15 instead of the times I actually used – 04:01 and 04:02 respectively.

When you’re testing time-parsing code, the last thing you want is the test failures giving confusing/misleading figures so I had to get to the bottom of it.

It turns out that GMT and “Europe/London” timezones diverge through history (ignoring daylight savings). In fact, in the year 0 AD they were one minute and fifteen seconds different, and it’s this discrepancy that was showing up in my tests. Note that it doesn’t affect the test results themselves – only the display of NSDate values when there’s a test failure.

I was constructing test times from NSDateComponents and only specifying day, hour and minute, as that was all that was relevant to the tests. However that left the year defaulting to zero. I was also specifying timezone as NSTimeZone(name: “Europe/London”). The error messages from Xcode only have an NSDate to work with however, which doesn’t have any notion of timezone, so Xcode used UTC/GMT to format for display. And being year zero dates, the time comes out differently. The simple fix for me was to set the year in the NSDateComponents to 2015 to get everything into line.

What would happen if I ran the tests in the summer, when daylight savings is in effect here in the UK? I’m not sure, but I might end up with times an hour out, for the same reason.

Here’s some code you can dump into a playground to demonstrate. The results are even weirder if you use “Europe/Paris” as the timezone, giving “0001-01-01 10:35:39 +0000” as the output, just nine minutes and twenty-one seconds earlier, rather than the whole hour earlier that one might expect (and that you get if you use 2015).

// Demonstrate GMT/UTC != "Europe/London" in year zero.
let ukTimeZone = NSTimeZone(name: "Europe/London")!
let ukCalendar = NSCalendar(calendarIdentifier: NSCalendarIdentifierGregorian)
ukCalendar?.timeZone = ukTimeZone
let dateComponents = NSDateComponents()
// Reinstate this to fix things.
//dateComponents.year = 2015
dateComponents.hour = 10
dateComponents.minute = 45
dateComponents.timeZone = ukTimeZone
let date = ukCalendar?.dateFromComponents(dateComponents)
date?.debugDescription // "0001-01-01 10:46:15 +0000"
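
The offsets described above are not peculiar to Apple’s frameworks; they appear to come from the historical local mean time entries in the time zone database. As a rough cross-check, here is a sketch in Java (not part of my Swift tests; the class and method names are made up) that asks java.time for the offset each zone applies at 10:45 on 1 January of a given year. I’m assuming a JDK with standard time zone data.

```java
import java.time.LocalDateTime;
import java.time.ZoneId;

// Hypothetical cross-check (names invented for this example).
public class HistoricalOffsets {

    /** Offset from UTC, in seconds, that the zone's rules give for 10:45 on 1 January of the given year. */
    static int offsetSeconds(String zoneName, int year) {
        LocalDateTime local = LocalDateTime.of(year, 1, 1, 10, 45);
        return ZoneId.of(zoneName).getRules().getOffset(local).getTotalSeconds();
    }

    public static void main(String[] args) {
        for (String zone : new String[] {"Europe/London", "Europe/Paris"}) {
            System.out.println(zone + " year 1: " + offsetSeconds(zone, 1)
                    + " s, year 2015: " + offsetSeconds(zone, 2015) + " s");
        }
    }
}
```

An offset of -75 seconds for London and +561 seconds for Paris in year 1 would line up with the 10:46:15 and 10:35:39 outputs seen in the playground, while 2015 should give the familiar 0 and 3600.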

TL;DR

It’s actually pretty simple, if you know what you’re doing:

  • convert project to a workspace if it isn’t already
  • add Carthage/Checkouts/SWXMLHash/SWXMLHash.xcodeproj to the workspace
  • move playgrounds to the workspace (not a project)
  • build the SWXMLHash project.
  • import SWXMLHash successfully in your playground, as long as the target you have selected includes the SWXMLHash framework project.

But if you don’t know what you’re doing…

I struggled to “import SWXMLHash” in my Xcode Playground. I kept getting an angry red error for the module import. SWXMLHash is a third party framework I wanted to experiment with, that makes XML parsing nice and simple (ish).

Having figured out the trick to getting third party frameworks working in playgrounds, I thought I’d document it and the pitfalls. Really it’s pretty much just a case of following Apple’s own documentation, but I still took a while to get it right, mostly due to still being on the learning curve for the ecosystem. So my instructions are more tailored to the innocent newbie.

In my case I’m using Carthage to fetch and build SWXMLHash, so I have a copy of the framework’s project source in my project’s Carthage/Checkouts/SWXMLHash directory. Which we can use later, but first…

You need to be using a Workspace, not a Project in Xcode. I had a project because that’s what Xcode gave me when I started my new world-beating app, and I didn’t know any better. Use File > Save as Workspace… to save a workspace file containing just your current project. It seems to be traditional to use the same name as the main project, so we end up with Foo.xcodeproj and Foo.xcworkspace files. From now on, always open the workspace not the project.

Now move your playgrounds out from your project and into the workspace – i.e. up a level. I did this by deleting the playground reference from within the original project, then adding it to the workspace with the + button in the very bottom-left of the Xcode UI. That button adds to whatever is selected, so ensure nothing is selected (cmd-click on the currently selected item to deselect it). There are probably other ways. Hell, maybe you can even drag them, but I didn’t try that.

Playgrounds can only deal with Frameworks whose project is within the same workspace. If you’ve only got a .framework file I believe you can put it in with the system frameworks in the right place on disk and it will be found, as a workaround, but I’ve got the project courtesy of Carthage so we’re OK here. Add that project (Carthage/Checkouts/SWXMLHash/SWXMLHash.xcodeproj in my case) to the workspace via that bottom-left + button. Build that freshly added framework project for Mac by selecting the relevant scheme from the dropdown in the toolbar and selecting Product > Build.

Now, in your playground, you should be able to import and use that framework, but there’s one final wrinkle: the currently selected target must include the SWXMLHash framework project. Targets that include the Carthage-built framework don’t count – it has to be the framework project that you added above. So for example, selecting the SWXMLHash target itself works. You want the OSX build, because that’s the variant of the framework that is used in the playground.

It’s also probably for the best that your playgrounds now exist in a workspace rather than in a project, cleanliness-wise. It should look something like the image below, with playground, framework project and my own project in the workspace, and the module import working correctly in the playground.

Workspace setup

I have blogged on my employer’s blog about the simple, but relatively featureful, Knockout.js page router that I whipped up recently. The core router.js file is only 61 lines but is packaged up into a nice demo app, src on GitHub, that you can just clone, then double click index.html to see it working (no server required).

Inspired by a random tweet about their added Scala support, I tried out the Codility sample test. I rather liked my solution and I think it’s a perfect example of some of the niceties of Scala, in a small way, so here it is:

def solution(a: Array[Int]): Int = {
  // Partial sums are Longs to avoid Int overflow.
  val sums = a.scanLeft(0L)(_ + _).tail
  def equilibrium(index: Int) = leftSum(index) == rightSum(index)
  def leftSum(index: Int) = if (index == 0) 0 else sums(index - 1)
  def rightSum(index: Int) = sums.last - sums(index)
  (0 until a.length) find (equilibrium(_)) getOrElse(-1)
}

The use of scanLeft to build up all the partial sums is particularly handy. Having done that it’s very easy to run through all the indexes until we find one that satisfies us. Note that the find method returns an Option[A] so we use getOrElse to return -1 if no solution was found (as per the requirements).
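
For comparison, here is roughly the same partial-sums approach written out in Java. It is only a sketch for readers who don’t speak Scala (the class name is made up and the sample array is just an illustrative input), but it shows what scanLeft and find are doing for us: one pass to build the running totals, then one pass looking for the first index where left and right sums match.

```java
// Hypothetical Java rendering of the Scala solution (class name invented for this example).
public class Equilibrium {

    /** Returns the first equilibrium index of a, or -1 if there is none. */
    static int solution(int[] a) {
        // Partial sums are longs to avoid int overflow, as in the Scala version.
        long[] sums = new long[a.length];
        long running = 0;
        for (int i = 0; i < a.length; i++) {
            running += a[i];
            sums[i] = running;
        }
        for (int i = 0; i < a.length; i++) {
            long left = (i == 0) ? 0 : sums[i - 1];
            long right = sums[a.length - 1] - sums[i];
            if (left == right) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        int[] sample = {-1, 3, -4, 5, 1, -6, 2, 1};
        System.out.println(solution(sample)); // prints 1
    }
}
```

The Scala version says the same thing in about a third of the lines, which is rather the point.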

CQRSLaptopTablet
Allow me to wax philosophical for a moment with an observation about where computers and their operating systems are heading.

In the world of software development CQRS = Command Query Responsibility Segregation, which in its simplest sense recognises that it's sometimes better to use a different mechanism for reading data than it is for writing it. See Martin Fowler's exposition of the concept if you want to know more, but this post isn't actually about software development at all!

I reckon that we're at a critical juncture in the evolution of personal computing devices and that the CQRS principle is necessarily coming to the fore to save the human race.

Tablet computers are taking the world by storm, in case you hadn't noticed. Apple could barely make enough iPad Minis for me to be able to get my wife one for Xmas, though I did manage it at the very last minute, and shortly thereafter bagged one for myself too. Frankly it's bloody brilliant, but I use it predominantly for consuming rather than creating and I'm far from alone. This is partly because the human populace is inexorably dumbing down towards being fat blobs with brains wired directly into the 'net, consuming inane banter, amusing pictures of cats and the latest celebrity news, 140 characters at a time. But that aside, it's just not very pleasant to write large quantities of text, manipulate images or perform other expansive creative works by prodding a tiny screen. Or even a big screen.

To write software, construct lengthy blog posts (ahem), edit movies, sequence the human genome or design great buildings requires a proper computer! On that basis I posit that there will always be a place for desktops and laptops, or indeed whatever replaces them but which necessarily has a non-trivial input mechanism. I genuinely worry that the market for serious computers will be increasingly neglected by the manufacturers, refocussing as they are on the mass consumer market, inevitably leading to the downfall of humankind. Perhaps I exaggerate – at least I hope so.

Now I've never used Windows 8, indeed I shudder at having to use Windows 7 on a daily basis at work, but I understand it represents something of a chimera. It is best known for its shiny, touchy, slidey 'Metro' UI, beckoning your greasy fingers to caress its tiles. However it also allows you to fall back into the more staid world of traditional Windows where presumably you can get some proper work done, as long as you have a keyboard and a pointing device other than your finger. I understand critics are conflicted about this hybrid approach, but it's CQRS writ large and may therefore be the way forwards. One way or another, at least some people will need to create great works. I do hope to be one of them, and to have the equipment to be able to do it.

I've been doing some trivial benchmarking of Play 2 with ab (Apache Bench) just to get an idea of its raw capabilities for serving simple requests – and because it's what I always do when picking up a new framework so I know what I'm dealing with. In doing so I ran into a bit of a puzzler that had me thinking Play 2 was bugged – but my spidey sense soon kicked in and told me it was more likely to be an OS or ab issue. I had done approximately the following, using Play 2.0.1 on OS X 10.7.3, and I'm pretty certain you'll see the same results if you do this on a Mac:

> play new hello  [select option 1 - basic Scala app]
> cd hello
> play start
> ab -c 50 -n 16000 http://localhost:9000/ [Runs fine - about 3700rps]
> ab -c 50 -n 16000 http://localhost:9000/ [Gives up with timeout]
> ab -c 50 -n 16000 http://localhost:9000/ [Runs fine - about 3700rps]
> ab -c 50 -n 16000 http://localhost:9000/ [Gives up with timeout]

It took me a bit of experimentation to establish that it's about 16000 requests that work fine, followed by timeouts, in a reliable pattern. That's a suspicious number, being near enough a power of 2 (2^14 is 16384), which is what clued me into it being an OS limit that I was running into. I ran the same ab test (with the same result) against the built-in Apache httpd serving a static file, confirming that Play 2 probably wasn't to blame.

Sure enough, a quick Google turns up the goods. My OS was running out of the approximately 16000 ephemeral ports available and having to wait for them to be released before it could reuse them. So not Play 2 or ab's fault at all. Actually in some senses it is Play 2's fault for being so fast that I've run into this limit.

I'm not going to go into the details of what ephemeral ports really are, as others have done that perfectly well, and there is a good StackOverflow answer with some key ways to work around the problem by modifying parameters of the OS' network stack – but be careful and make sure you understand what you're doing.

However, one very simple way to work around the issue is to pass the -k option to ab, to use HTTP keepalive (assuming the server you're testing supports it). Note that this changes the nature of your test though, as you're no longer really simulating large numbers of separate connections – but for basic sanity check testing it may help. For the record `ab -c 50 -n 100000 -k http://localhost:9000/` benchmarked Play 2 at about 7000 requests per second on my 2.4GHz Core Duo MacBook.
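
The "approximately 16000" figure above makes sense with a bit of back-of-envelope arithmetic. The sketch below is just that arithmetic in Java, under two assumptions that I did not measure on my machine: that the ephemeral range is the IANA dynamic range 49152 to 65535, and that a closed connection ties up its port for about 30 seconds. The class and method names are made up for illustration.

```java
// Back-of-envelope sketch (names invented for this example; inputs are assumptions).
public class EphemeralPortBudget {

    /** Number of ports in an inclusive range. */
    static int portCount(int first, int last) {
        return last - first + 1;
    }

    /** Rough ceiling on new connections per second if each closed connection holds its port for holdSeconds. */
    static double maxSustainedConnectionsPerSecond(int ports, double holdSeconds) {
        return ports / holdSeconds;
    }

    public static void main(String[] args) {
        // Assumed values, not measured: range 49152-65535, ports held for 30 s after close.
        int ports = portCount(49152, 65535);
        System.out.println(ports + " ephemeral ports");
        System.out.println(maxSustainedConnectionsPerSecond(ports, 30.0) + " new connections/s sustainable");
    }
}
```

Under those assumptions there are 16384 ports, and anything much beyond about 550 new connections per second will eventually exhaust them, so a server answering 3700 requests per second without keepalive gets there in a few seconds.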