Handling Binary Data

The spec appears to be silent on how binary data should be handled.

I have a few situations here where I am handling image files, and converting them to/from base64 to contain them in the data property feels heavy-handed. I am also concerned about the weight of larger images on both the client and the server when dealing with these especially large JSON payloads.

Other issues are that it…

  1. Wouldn’t scale to even larger binary types
  2. Is hard to deal with from a browser, where a multipart file upload makes more sense
  3. Doesn’t allow streaming the binary content down in regular HTTP style

Is this a place where you simply roll your own implementation?

Generally speaking, this is a hard question to answer and (as usual) the answer is “it depends.” I imagine you want to link this data to the database in some way? If so, will you be storing it in the DB itself as a blob or will you be pointing to disk? If you don’t want to relate the binary to the database in any way, I assert that JSON API would likely be the wrong solution entirely and you should certainly roll your own as a separation of concerns.

For small data, I would not expect base64 encoding to be prohibitively expensive in any way for use with standard JSON API (and you could even store this in the DB as is and have the client side fix it up when it retrieves it). That said, generally, binary and text formats don’t play nicely together if you’re not encoding the binary data. Consequently, I would not suggest trying to dump raw binary data into your JSON payload. If base64 encoding becomes infeasible for your use case, my recommendation would be to do this out-of-band as you suggested (i.e. roll your own).
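To make the base64 cost concrete, here is a minimal sketch, assuming a "photos" resource type and attribute names ("media-type", "content") that are purely illustrative and not defined by the spec. It round-trips raw bytes through a JSON API-style document and computes the encoded size, which grows by roughly a third.

```python
import base64
import json


def encode_photo_resource(photo_id: str, raw: bytes, media_type: str = "image/jpeg") -> str:
    """Build a JSON API-style document carrying the image as a base64 attribute."""
    document = {
        "data": {
            "type": "photos",  # hypothetical resource type
            "id": photo_id,
            "attributes": {
                "media-type": media_type,  # hypothetical attribute name
                "content": base64.b64encode(raw).decode("ascii"),  # hypothetical attribute name
            },
        }
    }
    return json.dumps(document)


def decode_photo_resource(payload: str) -> bytes:
    """Recover the raw bytes from a document built by encode_photo_resource."""
    attributes = json.loads(payload)["data"]["attributes"]
    return base64.b64decode(attributes["content"])


def base64_length(n: int) -> int:
    """Length of padded base64 text for n raw bytes: 4 characters per 3-byte group."""
    return 4 * ((n + 2) // 3)
```

For a 3,000,000-byte image, base64_length gives 4,000,000 characters before any JSON framing, so whether this is acceptable depends mostly on how large the images get.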

There would be many ways to do this out-of-band, but all would likely be specific to your particular use case.

Good luck!

Correct, right now I am working with data stored in the database as a blob. It’s very possible that in the future this will move elsewhere, such as S3 or Azure blob storage.

Here’s an example scenario: Person has one Photo. Photo belongs to one Person.

/person/1 - Would return person details as JSON, in JSON API format. Just as you would expect.

/person/1/photo - This is where it gets tricky. Do we return the photo as a base64 payload in JSON API format? It feels a little silly to do that. It would seem to make more sense to send the image data with Content-Type = ‘image/jpeg’ or similar. That way I would be able to display the photo in a browser’s <img> tag.
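One way to picture the two options for /person/1/photo is a handler that picks the representation from the Accept header. This is only a sketch under assumptions: photo_response is a hypothetical helper, the "photos" type and "content" attribute are made up for illustration, and the Accept check is deliberately naive (it ignores q-values and wildcards).

```python
import base64
import json

JSON_API = "application/vnd.api+json"  # the JSON API media type


def photo_response(accept: str, person_id: str, raw: bytes, media_type: str = "image/jpeg"):
    """Return (status, headers, body) for a GET on the person's photo."""
    if JSON_API in accept:
        # Client asked for JSON API: wrap the image as a base64 attribute.
        document = {
            "data": {
                "type": "photos",  # hypothetical resource type
                "id": person_id,
                "attributes": {"content": base64.b64encode(raw).decode("ascii")},
            }
        }
        return 200, {"Content-Type": JSON_API}, json.dumps(document).encode("utf-8")
    # Otherwise send the raw bytes, which a browser can render directly.
    headers = {"Content-Type": media_type, "Content-Length": str(len(raw))}
    return 200, headers, raw
```

With the second branch, the URL can be used directly as the src of an image element, at the cost of that endpoint no longer returning a JSON API document.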

I suppose that “it depends” is the most appropriate answer and the spec leaves it out on purpose. I’ll start doing what makes the most sense for our application. Thanks!

I suppose it may come down to the tools you’re using. If you can base64 encode without it making things noticeably less performant, this would likely be the easiest option as your tools could treat this data just like everything else and the client-side can then decode it (knowing that the field is base64-encoded jpeg).

Otherwise, if you’re doing this OOB, maybe it’s best not to combine it with the JSON API interface (i.e. /person/1/photo) and instead give it a new namespace, so as to avoid confusion in the future between the spec and special use cases.
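A separate namespace could look something like the sketch below, where the person resource only carries a URL and the bytes live elsewhere. The "/files" prefix, the "person-photos" path segment, and the "photo-url" attribute are all assumptions chosen for illustration.

```python
import json


def person_document(person_id: str, name: str, files_base: str = "/files") -> str:
    """Build a JSON API-style person document that points at an out-of-band photo URL."""
    # The binary is served outside the JSON API routes, under a hypothetical namespace.
    photo_url = f"{files_base}/person-photos/{person_id}"
    document = {
        "data": {
            "type": "people",
            "id": person_id,
            "attributes": {
                "name": name,
                "photo-url": photo_url,  # hypothetical attribute name
            },
        }
    }
    return json.dumps(document)
```

Because the attribute is just a URL, moving the blob from the database to S3 or Azure blob storage later would only change the value, not the shape of the resource.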

That said, what seems more enticing to me is a JSON API extension. If there were a binary extension, you could return the result and notify the client in the header that some binary format will be coming back. This would have to be well-formed and specified, of course, but may be the most seamless in the end.

Anyway, these are all the interesting engineering questions in the first place :) I am a bit curious to see how you handle this in the end anyway (to date, I don’t think I’ve encountered this use case yet), so be sure to report back!

Similarly, this is the latest discussion I could find on the matter. In short, base64 is the recommendation to remain spec compliant.