Use `base::serialize()` in the plumber API

Xianying Tan

2018/11/14

Recently, I needed to share an R model on the server with my colleagues who use R. Plumber came to mind immediately. Building a web API with plumber is really easy. I love the roxygen-like way of defining the API. It’s elegant and easy to maintain.

Usually, web APIs use JSON to represent data. Unfortunately, JSON encodes objects as a string, which may lose information. For example, attributes (other than names) cannot be preserved, so classes such as Date or factor do not survive the round trip. This causes trouble.
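
To make the information loss concrete, here is a minimal sketch using jsonlite (assuming it is installed; the value `d` is only an illustration). A Date is a number with a class attribute, and after a JSON round trip it comes back as a plain character string.

```r
# Hypothetical example value: a Date is a double with a "class" attribute.
d <- as.Date("2018-11-14")

# Round trip through JSON (only runs if jsonlite is installed).
if (requireNamespace("jsonlite", quietly = TRUE)) {
  json <- jsonlite::toJSON(d)
  back <- jsonlite::fromJSON(json)
  print(class(d))     # class of the original object
  print(class(back))  # class after the JSON round trip
}
```

The receiver would have to know in advance that the value was a Date and convert it back by hand, which quickly becomes tedious for nested results.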

Luckily, all my “clients” (my colleagues) are R users, so I don’t really need a general web API. JSON is only one of many ways to serialize objects, and I’m not bound to it. Since base::saveRDS() exists, I knew R itself must provide a serialization method; the only question was whether it is exported. Fortunately, it took little effort to find base::serialize() and base::unserialize(), which are exactly the cure I was looking for.
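
As a quick check of that claim, the sketch below round-trips two illustrative objects (the names `x` and `df` are made up for this example) through base::serialize() and base::unserialize() and compares the result with the original.

```r
# Hypothetical example objects: a numeric vector with a custom attribute
# and a data frame with a Date column.
x <- structure(c(1.5, 2.5), unit = "cm")
df <- data.frame(day = as.Date("2018-11-14") + 0:1, value = c(10, 20))

# With connection = NULL, serialize() returns a raw vector ...
bytes <- serialize(list(x = x, df = df), connection = NULL)
is.raw(bytes)

# ... and unserialize() rebuilds the object from those bytes.
back <- unserialize(bytes)
identical(back$x, x)    # the custom attribute is kept
identical(back$df, df)  # the Date column is still a Date
attr(back$x, "unit")
```

The raw vector is what ends up in the HTTP response body, so no extra encoding step is needed on either side.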

My solution is in the code below. Since the RDS format represents R objects almost losslessly (external pointers are the exception), using base::serialize() as the customized serializer of the plumber API minimizes the effort required to set up a stable plumber API for R users.

Enjoy!


The sample code

Add the customized serializer first

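# Register a serializer named "r_obj_serializer" that sends R's native binary format
plumber::addSerializer("r_obj_serializer", function() {
  function(val, req, res, errorHandler) {
    tryCatch({
      # Mark the body as binary so that clients do not try to parse it as text
      res$setHeader("Content-Type", "application/octet-stream")
      # connection = NULL makes serialize() return a raw vector
      res$body <- base::serialize(val, NULL, ascii = FALSE)
      return(res$toResponse())
    }, error = function(e) {
      # Delegate failures to plumber's error handler
      errorHandler(req, res, e)
    })
  }
})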

Use the customized serializer in the plumber file

#* @post /api
#* @serializer r_obj_serializer
function() {
  ...
}

Get the API results

out <- httr::POST(
  url,
  encode = "raw",
  body = body,
  httr::content_type("application/octet-stream"),
  ...
)
# You may want to check that httr::status_code(out) == 200L,
# or that is.raw(httr::content(out)) is TRUE, first
base::unserialize(httr::content(out))
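
The checks mentioned in the comments above could be wrapped in a small helper. The following is only a sketch: `decode_r_body()` is a hypothetical function (not part of httr or plumber), and the response is simulated with a serialized data frame so that it runs without a server.

```r
# Hypothetical helper (not part of httr or plumber): turn a status code and a
# raw response body into an R object, failing loudly on anything unexpected.
decode_r_body <- function(status, body) {
  if (!identical(as.integer(status), 200L)) {
    stop("Request failed with HTTP status ", status, call. = FALSE)
  }
  if (!is.raw(body)) {
    stop("Expected a raw response body, got: ", class(body)[1], call. = FALSE)
  }
  unserialize(body)
}

# With a real response `out` from httr::POST(), this would be called as:
#   decode_r_body(httr::status_code(out), httr::content(out))

# Simulated here without a server:
fake_body <- serialize(head(mtcars), connection = NULL)
result <- decode_r_body(200L, fake_body)
identical(result, head(mtcars))
```

Failing early on a non-200 status matters because an error response is typically not a serialized R object, and passing it to unserialize() would only produce a confusing error.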

  1. Serialization is also faster. For a large double vector created with v <- rnorm(1e8), system.time(invisible(jsonlite::toJSON(v))) takes 27 seconds, while system.time(invisible(serialize(v, NULL))) takes less than 4 seconds on my computer.