MessagePack-RPC - a fast, simple protocol for microservices

August 3, 2016 Ruby MessagePack MessagePack-RPC

If you are using HTTP for your intra-service communication, you’re wasting precious milliseconds: a typical HTTP client opens a new connection for every request. Each time, DNS is resolved, the load balancer has to route the connection, and the SSL handshake is repeated.

Together, these can add up to 10 ms of overhead to each remote call. That’s already significant at web scale, but if you are making many calls, the overhead becomes unsustainable.

In our case, some legacy code made thousands of RPC calls per web request. I chose to optimise the RPC layer rather than thoroughly refactor the legacy code. That’s when I noticed that the performance of the HTTP API was much worse than the performance of the service procedures by themselves.
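To put the two numbers above together, here is a back-of-the-envelope sketch. The 10 ms figure is the upper bound quoted earlier; the call counts are illustrative assumptions, not measurements.

```ruby
# Rough estimate of the connection overhead added to one web request.
# OVERHEAD_PER_CALL_MS is the upper bound quoted in the text;
# the call counts below are illustrative assumptions.
OVERHEAD_PER_CALL_MS = 10.0

def total_overhead_seconds(calls, overhead_ms = OVERHEAD_PER_CALL_MS)
  calls * overhead_ms / 1000.0
end

[10, 100, 1_000, 5_000].each do |calls|
  puts format("%5d calls -> %6.2f s of overhead", calls, total_overhead_seconds(calls))
end
```

At a thousand calls per web request, the worst case is ten seconds spent on connection setup alone.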

Which got me thinking. You always get fast calls to your database and your cache server, so why wouldn’t you also expect fast requests to your own services?

Moving beyond JSON-over-HTTP

To go faster, I started looking at remote procedure call protocols. There are many of them, starting with the venerable SOAP.

Unfortunately, Ruby lacks an abundance of RPC libraries (now, why would that be the case?) It’s a pity because Ruby on Rails is often used as the web frontend for other services.

Another issue is that RPC systems are traditionally designed with code generation in mind - and how many successful code-generation Ruby projects do you know?

Anyway, I tried Thrift and gRPC. Both had issues: un-Rubyish APIs, weird domain-specific classes, a need to use code generators or generate classes on the fly from specifications, and connection instability.

NB: HTTP/2 is also an option - it’s binary and uses a persistent connection. As is HTTP 1.1 with a persistent connection.

MessagePack

MessagePack, or msgpack, is a data serialization format. It is designed to be as flexible as JSON. It supports streamed reads and writes. The format is binary and as compact as it gets without compression.

Most importantly, its native implementation for Ruby is one of the fastest serializers available.

Even more important is that msgpack, like JSON, works without a predefined type schema, and the Ruby representation is plain old hashes and arrays, without any wonky metaprogramming or code-generated classes. (I have a hunch that using hashes with symbol keys in Ruby is more efficient than building custom request and response objects for each RPC call.)

The MessagePack-RPC protocol

MessagePack-RPC is a simple remote procedure call protocol. You send a msgpack-encoded array containing the method name and its parameters to the server, and get back a msgpack-encoded array containing the result. All you need is a socket connection, and you can keep it alive for as long as possible, until network issues or reboots disconnect the client from the server.
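As a rough sketch of what travels over the socket: in the MessagePack-RPC specification, a request is the array [0, msgid, method, params] and a response is [1, msgid, error, result]. The helper names below are hypothetical. Serialization is left out so that the example runs with plain Ruby; in a real client each array would be passed through the msgpack gem before being written to, or after being read from, the socket.

```ruby
# Message type tags from the MessagePack-RPC specification.
REQUEST  = 0
RESPONSE = 1

# Build the array for a call. A real client would serialize it with
# MessagePack before writing it to the socket (omitted here).
def build_request(msgid, method, params)
  [REQUEST, msgid, method.to_s, params]
end

# Check a decoded response array and return its result,
# raising if the server reported an error.
def handle_response(message, expected_msgid)
  type, msgid, error, result = message
  raise ArgumentError, "not a response: #{message.inspect}" unless type == RESPONSE
  raise ArgumentError, "unexpected msgid #{msgid}" unless msgid == expected_msgid
  raise "remote error: #{error.inspect}" unless error.nil?
  result
end

request = build_request(1, :add, [2, 3])   # [0, 1, "add", [2, 3]]
reply   = [RESPONSE, 1, nil, 5]            # what a server might send back
puts handle_response(reply, 1)             # prints 5
```

The whole protocol fits in two small arrays, which is why a client needs little more than a socket and a serializer.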

You can use SSL as the transport layer. (You can use HTTP as well, but why?)

MessagePack-RPC also supports pipelining and parallel requests, but I haven’t used them yet. I don’t think they are important for Ruby clients.
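For the curious, pipelining works because every request carries a msgid that the response echoes back, so replies can be matched to calls even when they arrive out of order. A minimal sketch of that bookkeeping, using plain Ruby arrays in place of decoded messages; the function name, method names and payloads are made up for illustration:

```ruby
# Match responses to pending requests by msgid.
# requests:  arrays of the form [0, msgid, method, params]
# responses: arrays of the form [1, msgid, error, result], in any order
# Returns a hash of method name => result.
def match_responses(requests, responses)
  pending = {}
  requests.each { |_type, msgid, name, _params| pending[msgid] = name }

  results = {}
  responses.each do |_type, msgid, error, result|
    name = pending.delete(msgid)
    raise "unknown msgid #{msgid}" if name.nil?
    raise "#{name} failed: #{error.inspect}" unless error.nil?
    results[name] = result
  end
  results
end

requests  = [[0, 1, "slow_report", []], [0, 2, "ping", []]]
# Hypothetical replies: the fast call finishes first.
responses = [[1, 2, nil, "pong"], [1, 1, nil, { "rows" => 42 }]]

p match_responses(requests, responses)
```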

Language support

Besides Ruby, MessagePack is supported by a wide array of languages. We use Go. Rust, Java, Scala, and Clojure work too. Elixir - no problem. Even JavaScript in the browser.

The official msgpack-rpc library for Ruby is overcomplicated and based around async programming that almost no Ruby project ever uses. We had reliability issues when deploying it to production at BrightBytes.

Therefore, we’ve released a leaner, simpler implementation - msgpack_rpc_client - that only uses the official msgpack gem and stdlib sockets.

In Go, the best way to do MessagePack is the go-codec library; it’s efficient, it provides code generation (which is welcome in a compiled language), and the generated code can be reused to serialize the same structures to JSON.

go-codec also supplies handlers for the built-in net/rpc package to support MessagePack-RPC - see the server example for a complete implementation.

RPC Benchmarks

Using server.go and benchmark.rb from the examples folder, I tested a simple RPC call over msgpack-rpc against an equivalent HTTP API.

Between two Convox instances in the same Amazon region, using unencrypted communication:

#  msgpack-rpc:      579.9 i/s
#    http/json:      175.4 i/s - 3.31x slower

Between the same two instances, but with SSL encryption:

#  msgpack-rpc:      415.3 i/s
#    http/json:       99.9 i/s - 4.16x slower

As you can see, with SSL we are down from 10 ms per call to 2.4 ms - a saving of 7.6 ms on each RPC call.

And on a local machine without SSL, just for fun:

#  msgpack-rpc:     5161.2 i/s
#    http/json:      294.8 i/s - 17.50x slower

This demonstrates that you should not benchmark network performance without a real network: the loopback TCP stack is fast, so the cost of setting up a connection weighs much more heavily, relative to the call itself, than it would in a realistic scenario. And I almost went with these results instead of bothering to set up a server pair!

And now, let’s move construction of the MsgpackRpcClient into the loop, creating a new connection for each call (on Convox with SSL):

#  msgpack-rpc:       92.9 i/s
#    http/json:       84.9 i/s - same-ish: difference falls within error

So the advantage of MessagePack-RPC is in reusing a persistent connection.

Let’s try net-http-persistent to keep a persistent connection to our HTTP server (Convox instances, SSL):

#  msgpack-rpc:      415.7 i/s
#    http/json:      303.7 i/s - 1.37x slower
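For comparison, connection reuse is also possible with the standard library alone. The sketch below is a stand-in, not the net-http-persistent setup behind the numbers above: it uses the block form of Net::HTTP.start, which opens one connection and reuses it for every request made inside the block, as long as the server allows keep-alive. The function name and the example host are hypothetical.

```ruby
require "net/http"
require "uri"

# Fetch several paths from one host over a single connection.
# Net::HTTP.start with a block connects once, yields the open
# connection, and closes it when the block returns.
def fetch_all(base_url, paths)
  uri = URI(base_url)
  Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == "https") do |http|
    paths.map { |path| http.get(path).body }
  end
end

# Usage (hypothetical internal endpoint):
#   fetch_all("https://users.internal.example", ["/users/1", "/users/2"])
```

This only helps when the calls are batched in one place; a library such as net-http-persistent keeps the connection around between unrelated calls.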

Benchmark results

Method                                            Average time per call
MessagePack-RPC, SSL, non-persistent connection   10.8 ms
HTTP/JSON, SSL                                    10.0 ms
HTTP/JSON, unencrypted                             5.7 ms
HTTP/JSON, SSL, persistent connection              3.3 ms
MessagePack-RPC, SSL                               2.4 ms
MessagePack-RPC, unencrypted                       1.7 ms
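The per-call times follow directly from the iterations-per-second figures reported by the benchmark runs: one second divided by the rate. A small sketch of the conversion, with the rates copied from the output above (the helper name is made up):

```ruby
# Convert a benchmark rate (iterations per second) into the average
# time per call in milliseconds, rounded to one decimal place.
def ms_per_call(iterations_per_second)
  (1000.0 / iterations_per_second).round(1)
end

# Rates copied from the benchmark output earlier in the post.
RATES = {
  "MessagePack-RPC, SSL, non-persistent connection" => 92.9,
  "HTTP/JSON, SSL"                                  => 99.9,
  "HTTP/JSON, unencrypted"                          => 175.4,
  "HTTP/JSON, SSL, persistent connection"           => 303.7,
  "MessagePack-RPC, SSL"                            => 415.3,
  "MessagePack-RPC, unencrypted"                    => 579.9
}.freeze

RATES.each do |name, rate|
  puts format("%-48s %5.1f ms", name, ms_per_call(rate))
end
```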

Benchmarks of the coder and decoder

require "json"
require "msgpack"

user = {
  first_name: "Bill",
  last_name: "Gates",
  email: "billg@microsoft.com",
  last_login_at: 1469736356
}

# Encoding:

#      oj:   929704.6 i/s
# msgpack:   636616.8 i/s - 1.46x slower
#    json:   144591.7 i/s - 6.43x slower

# Encoded size: 

msgpack_user = user.to_msgpack # 78 bytes
json_user = user.to_json # 98 bytes

# Decoding:

#      oj:   385677.6 i/s
# msgpack:   307558.0 i/s - 1.25x slower
#    json:   162164.0 i/s - 2.38x slower

MessagePack is about as fast as the Ruby JSON implementations - but it doesn’t matter, because any of them encodes a simple hash in under 10 microseconds, which is negligible (even the slowest result above, 144,591.7 i/s, is about 7 microseconds per call). You may see a difference if you are sending larger objects.

Conclusion

If you care about your RPC performance at all, you should be using a protocol with a persistent connection - and MessagePack-RPC is a great choice for Ruby projects. (But the net-http-persistent library is interesting, too.)
