Filtering/querying (deep) relationships - a proposal for a syntax

JSON:API currently lacks a standard for filtering/querying data included through relationships, as opposed to GraphQL which has this firmly specified.

Consider the following API:

  • GET /authors to get all resources of type author
  • Resource author has relationship articles which is to-many article
  • Resource article has attribute isDraft: boolean

Now consider the following request:

  • “Give me all authors who have draft articles, and include their draft articles”.

There are two filters here:

  • The filter on the author list (filtering on a related resource’s attribute)
  • The filter on the included articles

Here is a syntax that might be able to express the necessary distinction:

GET /authors
  ?filter[articles.isDraft]=true
  &filter[articles][isDraft]=true
  &include=articles

The meaning of the two filters would be:

  • filter[articles.isDraft]=true means "only return authors who have articles with isDraft=true"
  • filter[articles][isDraft]=true means "for any included articles from the articles relationship, only return those with isDraft=true"

In other words:

  • The first (of two) pair of brackets indicates what to filter (if not present, filter top-level data)
  • The second (or only) pair of brackets indicates how to filter, using dot notation for filtering on related resources

As far as I can see, this syntax is general enough to be nested arbitrarily on any level. For example, filter[articles.comments][author.name]=John&include=articles.comments means "for each article, include only those comments whose author has name=John".
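To make the bracket rule concrete, here is a minimal sketch of how a server might split these parameters into the two kinds of filter. The function name parse_filters and the returned shapes are hypothetical, and the sketch only handles the one-bracket and two-bracket forms described above, nothing else.

```python
import re
from urllib.parse import parse_qsl

# Matches "filter[a]" or "filter[a][b]" and captures the bracket contents.
_FILTER_KEY = re.compile(r"^filter\[([^\[\]]+)\](?:\[([^\[\]]+)\])?$")


def parse_filters(query_string):
    """Split filter parameters into top-level and per-relationship filters.

    Returns (top_level, included): top_level maps a dotted attribute path
    to a value, included maps a relationship path to such a mapping.
    """
    top_level = {}
    included = {}
    for key, value in parse_qsl(query_string):
        match = _FILTER_KEY.match(key)
        if match is None:
            continue  # not a filter parameter (e.g. include)
        first, second = match.groups()
        if second is None:
            # Only one bracket pair: filter the primary data.
            top_level[first] = value
        else:
            # Two bracket pairs: the first one names what to filter.
            included.setdefault(first, {})[second] = value
    return top_level, included


query = (
    "filter[articles.isDraft]=true"
    "&filter[articles][isDraft]=true"
    "&include=articles"
)
top_level, included = parse_filters(query)
print(top_level)  # {'articles.isDraft': 'true'}
print(included)   # {'articles': {'isDraft': 'true'}}
```

The nested example parses the same way: the relationship path articles.comments becomes the key, and author.name is the filter applied to the included comments.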

Does this make sense? Could this syntax be a clear, general, and consistent way to support filters on nested relationships?

Also, how will you get all draft articles that were created two months ago?

This proposal is not needed for that. I do it like this (assuming you meant at least two months ago):

GET /articles?filter[isDraft]=true&filter[createdAt][le]=2020-07-01

Where le indicates <=.

Granted, JSON:API does not have a standard syntax for operators like [le] above, but it’s a fairly trivial addition that does not break any part of the spec and which I’ve seen several others use.

Some operators I’ve used:

  • [lt] (<)
  • [gt] (>)
  • [le] (<=)
  • [ge] (>=)
  • [matches] (fuzzy string matching)
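As a rough sketch of how a server could evaluate the comparison operators, assuming resources are held in memory as plain dicts (OPERATORS and apply_filter are hypothetical names, and [matches] is left out because fuzzy matching has no single obvious definition):

```python
import operator
from datetime import date

# Operator names from the list above, mapped to Python comparisons.
OPERATORS = {
    "lt": operator.lt,
    "gt": operator.gt,
    "le": operator.le,
    "ge": operator.ge,
}


def apply_filter(resources, attribute, op_name, value):
    """Keep the resources whose attribute compares true against value."""
    compare = OPERATORS[op_name]
    return [r for r in resources if compare(r[attribute], value)]


articles = [
    {"id": "1", "isDraft": True, "createdAt": date(2020, 6, 15)},
    {"id": "2", "isDraft": True, "createdAt": date(2020, 8, 20)},
]
# Equivalent of filter[createdAt][le]=2020-07-01
old_drafts = apply_filter(articles, "createdAt", "le", date(2020, 7, 1))
print([a["id"] for a in old_drafts])  # ['1']
```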

Yeah! You’re correct, I meant at least two months ago.

I like that strategy, it's simple and easy to understand. I think I might use some other operators such as:

  • [since]
  • [until]
  • [between]

E.g.:

GET /articles?filter[isDraft]=true&filter[createdAt][between]=2020-07-01,2020-07-08
GET /articles?filter[isDraft]=true&filter[createdAt][since]=2020-07-01
GET /articles?filter[isDraft]=true&filter[createdAt][since]=2020-07-01&filter[createdAt][until]=2020-07-08

Thanks for your quick reply.

That works, too. Note that if you have since/until (or le/ge), then between is superfluous. (You may still want to include it for the sake of API ergonomics, but I would contend that APIs are generally made for machines, not humans, and that in this case the ergonomic difference is negligible and comes at the cost of clearly separated/orthogonal filters. YMMV.)
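To illustrate why between adds nothing new, here is a small sketch that rewrites it into the equivalent since/until pair. It assumes both bounds are inclusive and that the two values are comma-separated as in the example requests; expand_between is a hypothetical helper.

```python
def expand_between(attribute, raw_value):
    """Rewrite a [between] filter as an equivalent [since]/[until] pair."""
    # Assumption: the value has the form "start,end", both bounds inclusive.
    start, end = raw_value.split(",", 1)
    return {
        f"filter[{attribute}][since]": start,
        f"filter[{attribute}][until]": end,
    }


params = expand_between("createdAt", "2020-07-01,2020-07-08")
print(params)
```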

Thanks for the idea! While working on filtering I have been thinking about a slight adjustment on this.

For the second filter, the one on included resources, I prefer to index by the name of their type, not the path of their relationship. For example:

GET /authors
  ?filter[articles.isDraft]=true
  &filter[Article][isDraft]=true
  &include=articles

Or:

GET /authors
  ?filter[Comment][author.name]=John
  &include=articles.comments

For me this has the benefit that it matches the way sparse fieldsets work, which helps with understanding. It is also shorter to write, and you don't have to repeat the filter if there are multiple ways to get comments.

What do you think?

Doesn’t work, or even make sense. It has to be the relationship path, because it is specific relationships you are filtering. You can have different relationships to the same type, and filter the relationships differently.

Consider for example if each author has two relationships: ownArticles for articles they have written themselves, and guestArticles for articles they have helped co-write, or something like that. Consider now the following request:

  • “Give me all authors and include their published (not draft) co-authored articles”

The request becomes:

GET /authors
  ?include=guestArticles
  &filter[guestArticles][isDraft]=false

The reason it works for sparse fieldsets is the assumption that “whatever relationships the resources of type X come from in this response document, you only want these fields”. That’s an entirely different thing than filtering relationships.
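A minimal sketch of the difference, assuming authors are in-memory dicts with their related articles embedded (filter_relationship is a hypothetical helper, not part of any library): filtering by relationship path touches guestArticles only and leaves ownArticles alone, which a per-type filter could not express.

```python
def filter_relationship(authors, relationship, attribute, expected):
    """Return, per author id, the related resources that pass the filter.

    Only the named relationship is filtered; other relationships to the
    same resource type are not touched.
    """
    return {
        author["id"]: [
            res for res in author[relationship] if res[attribute] == expected
        ]
        for author in authors
    }


authors = [
    {
        "id": "a1",
        "ownArticles": [{"id": "1", "isDraft": True}],
        "guestArticles": [
            {"id": "2", "isDraft": False},
            {"id": "3", "isDraft": True},
        ],
    }
]
# Equivalent of include=guestArticles&filter[guestArticles][isDraft]=false
published_guest = filter_relationship(authors, "guestArticles", "isDraft", False)
print(published_guest)  # {'a1': [{'id': '2', 'isDraft': False}]}
```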

I guess it depends on your context. For me it does work and makes sense.

Just as with sparse fieldsets, where it would sometimes be useful to vary the fieldset depending on the relationship path, you might want that for filtering. But more often I think it isn’t needed. Also, the example you gave sounds to me like a page focusing on published articles; whether they are my own or co-authored I don’t care.

Again, this depends on your context. But I don’t agree it is as clear-cut as you put it; I find it quite comparable to sparse fieldsets. And reusing a pattern is also worth something.

Hm, yes, you’re right - it’s just not what this proposal tries to do. There is a parallel, and it depends on the context, but this is a general (“context-less”) proposal. I may be biased, of course, but I would argue that the increase in flexibility/usefulness from per-type filtering to per-relationship filtering is significantly greater than the increase from per-type sparse fieldsets to per-relationship sparse fieldsets.

Thanks for your input, though, it made me see this in another way! :)
