Collections of thin representations of fat resources

I have a collection of resources. Each individual resource contains a lot of data (or data that is expensive to obtain), so by default I would like the collection resource to return only references to the individual resources. However, I would also like to give clients a way to obtain the full details of the individual resources without having to make multiple API calls. My initial approach was for the collection resource to look like this:
"data": [
  {"type": "aResource", "id": "dfd", "links": {"self": "http://example.com/resources/dfd"}},
  {"type": "aResource", "id": "ghh", "links": {"self": "http://example.com/resources/ghh"}},
  {"type": "aResource", "id": "tyt", "links": {"self": "http://example.com/resources/tyt"}},
  {"type": "aResource", "id": "klh", "links": {"self": "http://example.com/resources/klh"}},
  {"type": "aResource", "id": "uty", "links": {"self": "http://example.com/resources/uty"}}
]
but that gives clients no way to ask for the full details, because the “include” parameter operates on related resources, not on links.

The individual resources all have the same relationship to the collection - they’re merely members of it - so I can’t make the collection look like this:
"data": [
  {
    "type": "aResourceCollection",
    "id": "someid",
    "relationships": {
      "member": {"links": {"self": "http://example.com/resources/member"}},
      "member": {"links": {"self": "http://example.com/resources/member"}},
      "member": {"links": {"self": "http://example.com/resources/member"}},
      "member": {"links": {"self": "http://example.com/resources/member"}},
      "member": {"links": {"self": "http://example.com/resources/member"}}
    }
  }
]
because there is no way to distinguish the individual resources.

I could make the collection look like this:
"data": [
  {"type": "aResource", "id": "dfd", "attributes": {<big_or_expensive_to_obtain>}, "links": {"self": "http://example.com/resources/dfd"}},
  {"type": "aResource", "id": "ghh", "attributes": {<big_or_expensive_to_obtain>}, "links": {"self": "http://example.com/resources/ghh"}},
  {"type": "aResource", "id": "tyt", "attributes": {<big_or_expensive_to_obtain>}, "links": {"self": "http://example.com/resources/tyt"}},
  {"type": "aResource", "id": "klh", "attributes": {<big_or_expensive_to_obtain>}, "links": {"self": "http://example.com/resources/klh"}},
  {"type": "aResource", "id": "uty", "attributes": {<big_or_expensive_to_obtain>}, "links": {"self": "http://example.com/resources/uty"}}
]
and expose it via two URLs: one for the full-fat version, and one that includes a fields[] parameter to force the API to drop the rest of the data. But that feels like an abuse of the fields parameter.
Or I could ask clients to use the HTTP Prefer header to request thin or fat representations, but that feels like working round JSON API rather than with it.

What’s the best approach to this problem?

Hmm, reading this it seems that maybe I should be making the collection look like this:
"data": [
  {
    "type": "aResourceCollection",
    "id": "someid",
    "relationships": {
      "members": {
        "links": {
          "self": "http://example.com/resources/relationships/members",
          "related": "http://example.com/resources/members"
        }
      }
    }
  }
]
Then clients that want a thin representation can follow the self link and those that want fat representations can follow the related link. But I think the thin representation returned by the self link has no way to include links that could be used to fetch the actual resources if the client then wanted to, so the client would have to make subsequent calls to the original related link in order to get further details. Is that correct?

Hi @jlangley

The jsonapi spec allows for links as a top level key in your response so why not have:

/thin that returns:

{
  "links": {
    "self": "http://example.com/thin",
    "fat": "http://example.com/fat"
  },
  "data" : [
  {
    "type": "aResource",
    "id": "someId"
  }
 ]
}

and /fat return:

{
  "links": {
    "self": "http://example.com/fat",
    "thin": "http://example.com/thin"
  },
  "data" : [
  {
    "type": "aResource",
    "id": "someId",
    "attributes": {
      "attribute1": "some value",
      ...
    }
  }
 ]
}

Thanks for the suggestion, but I have a couple of concerns about this approach:

  1. I’m pretty sure the only values allowed in a links object are self and related; I don’t think an implementation is free to add any others.
  2. Fundamentally I’m returning different representations of a single resource, I’m not convinced that representations should be distinguished by having different URLs - this feels like something that ideally should be resolved via content negotiation / HTTP headers (I agree that using include would also put a reliance on the URL to pass information, but at least that is a JSON API standard approach, rather than something I’ve come up with specifically for my API).

The section explaining the top level links structure does not specify any specific link names. It does, however, say that "link objects" may in the future have keys other than meta and href.

I agree that it’s two representations of the same resource. You could use the content type header with parameters as explained here

You could use the content type header with parameters as explained here

The spec seems to explicitly prohibit this.

The section explaining the top level links structure does not specify any specific link names

No, but the only examples here are self and related; the section on links in relationships explicitly limits them to self and related; and in this discussion one of the primary authors (@ethanresnick) seems to confirm that in v1.0 only self and related are permitted.
So my understanding is that currently the only permitted link values are self and related.

The section explaining the top level links structure does not specify any specific link names

On re-reading, I’ve discovered that it does (sort of). It explicitly allows: self, related, first, last, next and prev:

The top-level links object MAY contain the following members …

To me this strongly implies that all other values are not allowed.

Hi @jlangley

The keyword “MAY” specifies that it is completely optional, but you are right, it is unclear whether the spec intends only those links (self, related, first, last, next and prev) to be optional, or any links.

According to RFC 2119 for MAY,

One vendor may choose to include the item because a particular marketplace requires it…

and

… an implementation which does include a particular option MUST be prepared to interoperate with another implementation which does not …

I think your use of custom links will be perfectly aligned with that (barring clarification on the jsonapi spec’s intent)

The way I interpret that is that implementations must support or interoperate with systems that use any or all of those 6 links. But I don’t see anything that suggests that other links are allowed.

I think it would be very silly for only those to be allowed link keys.

Why allow for any type of URL, but have no way of linking to it? Put differently, if you could have only those links, why even specify them at all, when it could be assumed that those links follow the standard format for links of that type? (Maybe I’m wrong here.)

If we look at, for example, HTML (which is a good implementation of HATEOAS), a tags allow for a name between the opening and closing tag. The name is up to the web page designer, and it would and should differ from one system to another; the name is what gives the link meaning. This allows for a lot of strength in designing an API to define “state transitions” between resources.

And I think that without it HATEOAS is impossible.

This allows for a lot of strength in designing an API to define “state transitions” between resources.

The way I read the spec, JSON API wants the state transition labels to be the name / key of a relationship, not the name / key of a link.

I must admit it feels a bit strange using relationships to manage related resources and state transitions / actions for the current resource, but it seems to work okay with the ideas of REST without PUT

That’s an interesting idea!

Any ideas on the following:

Say you have an article and you want to mark it as “article of the month”. Would you have a relationship to article-of-the-month? I don’t think I understand how that would work, because only one article at a time would have that “relationship”; the rest would 404, I presume?

It seems a lot more intuitive to have this defined as a link, rather than a relationship?

Maybe the idea here is that actions like this are covered outside of the JSONAPI payload. You can always have documentation that describes these things, but JSONAPI describes resources rather than actions…

Alternatively, you can often rework things to be resource based which will then work. Maybe you could have a resource which is article-of-the-month, and the April 2016 entry has a relationship to an article…

Yes, possibly from /articles.
As @Sazzer suggests this would be a resource in its own right. It would resolve to a specific article, but the linkage could change over time to point to different articles at different times.
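To make the resource-based rework concrete, here is a minimal sketch of what such an article-of-the-month document might look like, built as a Python dict. The type names (`articles-of-the-month`, `articles`), the ids, and the `month` attribute are all hypothetical; nothing in this thread or in the spec prescribes them.

```python
def article_of_the_month(entry_id, month, article_id):
    """Build a JSON:API-style document for one article-of-the-month entry.

    All type names, ids, and the "month" attribute are illustrative assumptions.
    """
    return {
        "data": {
            "type": "articles-of-the-month",
            "id": entry_id,
            "attributes": {"month": month},
            "relationships": {
                # The linkage can be repointed over time; the entry itself
                # always exists, so no article has to answer with a 404.
                "article": {"data": {"type": "articles", "id": article_id}}
            },
        }
    }

doc = article_of_the_month("2016-04", "April 2016", "1")
```

Marking a different article would then mean updating the `article` relationship of the current entry rather than invoking an action on the article.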

Yeah, you’ve got it right here @jlangley: the intention is that other links aren’t allowed, for now. @A-Helberg is right that, without other links, HATEOAS is hard, and that’s why the plan is to allow other links in the future, but not yet.

@jlangley To your original question: the JSON:API answer is to use ?fields for clients to request a thin representation. So, you could have a URI like /collection for the fat response, and /collection?fields... for the thin response. If the issue is that /collection?fields[...]=... is ugly to type, you could have a simpler URI for clients to use that redirects to the ?fields one.
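As a rough client-side sketch of the two URIs, the snippet below builds the fat and thin collection URLs with the standard library. The base URI reuses the example host from earlier in the thread, and the field name `title` is purely hypothetical; `urlencode` percent-encodes the square brackets, which servers are expected to decode.

```python
from urllib.parse import urlencode

BASE = "http://example.com/resources"  # example collection URI from this thread

def collection_url(fields=None):
    """Return the collection URI, optionally with sparse fieldset parameters.

    fields maps a resource type to the list of field names the client wants.
    """
    if not fields:
        return BASE
    params = {f"fields[{rtype}]": ",".join(names) for rtype, names in fields.items()}
    return f"{BASE}?{urlencode(params)}"

fat_url = collection_url()
thin_url = collection_url({"aResource": ["title"]})  # "title" is a made-up field
```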

I also think you’re right that, from a semantic POV, sparse fieldset information would ideally have gone in a Prefer header rather than the URI. But doing that would have had usability tradeoffs (a user can’t just copy/paste the URI into a browser or click a link), and I don’t think having two URIs/resources is likely to cause many problems in practice. I guess it would mean that some PATCH requests don’t invalidate cached GETs (on the other URI) but, in the scheme of HTTP violations, that seems like a pretty tolerable one.

The issue is that a simplistic implementation of fields would be to have the API generate all the data for the resource, then have a small “shim” module intercept every response and strip out any fields the client doesn’t want.
This simplistic implementation works okay when clients are excluding fields due to network bandwidth limitations, but doesn’t help when fields are excluded because they are expensive to create. Of course a more sophisticated implementation could handle both scenarios, but is itself likely to be more effort to create…
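A minimal sketch of that simplistic shim, assuming a hypothetical `expensive_attributes` function that stands in for the costly lookup and made-up attribute names. The call counter shows that the expensive work still runs for every resource even though most of its output is thrown away.

```python
def expensive_attributes(resource_id):
    """Stand-in for the costly lookup (hypothetical attribute names)."""
    expensive_attributes.calls += 1
    return {"summary": f"summary of {resource_id}", "body": "..."}

expensive_attributes.calls = 0

def build_collection(ids):
    """Generate the full-fat document, paying the full cost per resource."""
    return {
        "data": [
            {"type": "aResource", "id": i, "attributes": expensive_attributes(i)}
            for i in ids
        ]
    }

def strip_fields(document, requested):
    """Naive shim: drop unrequested attributes after they were already built.

    requested maps a resource type to the set of attribute names to keep;
    types that are not listed are left untouched.
    """
    for resource in document["data"]:
        wanted = requested.get(resource["type"])
        if wanted is None or "attributes" not in resource:
            continue
        resource["attributes"] = {
            k: v for k, v in resource["attributes"].items() if k in wanted
        }
    return document

doc = strip_fields(build_collection(["dfd", "ghh"]), {"aResource": {"summary"}})
```

The response is smaller, so bandwidth is saved, but `expensive_attributes.calls` still equals the number of resources.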

Indeed. There might be a solution though that’s somewhere in between in terms of complexity where, in most cases, you use the simple implementation but you put a layer in front of it that says “If exactly this set of fields is requested, forward to a different method to supersede the standard data creation logic”. That way, you don’t have to change the data fetching in the general case. On the other hand, implementing the more sophisticated solution might be worth doing anyway…and hopefully it’s something off-the-shelf implementations will start providing soon, if it’s possible to handle in a generic way.
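One way to picture that in-between layer is sketched below. It assumes that the "thin" request is an empty fieldset for `aResource`, and `fetch_thin` / `fetch_full` are hypothetical stand-ins for the cheap path and the existing data creation logic; only that one exact fieldset is short-circuited, and everything else goes through the simple generate-then-strip path.

```python
THIN = {"aResource": frozenset()}  # assumption: "thin" means no attributes at all

def fetch_thin(ids):
    """Cheap path: identifiers and self links only, no expensive lookup."""
    return {
        "data": [
            {
                "type": "aResource",
                "id": i,
                "links": {"self": f"http://example.com/resources/{i}"},
            }
            for i in ids
        ]
    }

def fetch_full(ids):
    """Standard path: stands in for the existing, expensive data creation."""
    doc = fetch_thin(ids)
    for resource in doc["data"]:
        resource["attributes"] = {
            "summary": f"summary of {resource['id']}",
            "body": "...",
        }
    return doc

def handle_collection(ids, requested=None):
    """Dispatch on the requested fieldset before any data is generated."""
    requested = {t: frozenset(names) for t, names in (requested or {}).items()}
    if requested == THIN:
        return fetch_thin(ids)  # supersedes the standard logic entirely
    doc = fetch_full(ids)
    for resource in doc["data"]:
        wanted = requested.get(resource["type"])
        if wanted is not None:
            resource["attributes"] = {
                k: v for k, v in resource["attributes"].items() if k in wanted
            }
    return doc

thin = handle_collection(["dfd"], {"aResource": []})
full = handle_collection(["dfd"])
partial = handle_collection(["dfd"], {"aResource": ["summary"]})
```

The general data fetching stays unchanged; only the special-cased fieldset avoids the expensive work.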