Hey folks, must I return all attributes in a resource object when sparse fields are not specified? I’m curious because my entities contain quite a bit of information that’s not absolutely required with each request and I’m stuck in a spot where I’d be specifying many fields for each request.
A couple of thoughts I’ve had when returning a “minimal” set of attributes:
When more attributes are requested, prefix fields with a “+” (e.g. fields[article]=+title,+date would add title and date if they aren’t part of the minimal set of attributes)
When fewer attributes are requested, prefix fields with a “-” (e.g. fields[article]=-title would remove title from the response when title is part of the minimal set of attributes)
In order to provide the client with a list of available attributes, use meta to describe the entire set of available attributes
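To make the “+” / “-” idea above concrete, here is a minimal sketch of how a server might resolve such a value. This is not part of the JSON:API spec: the function name resolve_fields, the rule that un-prefixed names replace the default set, and the refusal to mix prefixed and plain names are all assumptions made for illustration.

```python
def resolve_fields(param, minimal, available):
    """Resolve a fields[type] value against a default (minimal) attribute set.

    Hypothetical convention: "+name" adds to the minimal set, "-name" removes
    from it, and a list without prefixes replaces the minimal set outright
    (the usual sparse fieldset behaviour).
    """
    if param is None:
        return set(minimal)  # no fields[type] given: use the minimal set

    names = [n.strip() for n in param.split(",") if n.strip()]
    prefixed = [n for n in names if n[0] in "+-"]

    if not prefixed:
        selected = set(names)  # plain list: exactly these fields
    elif len(prefixed) != len(names):
        raise ValueError("cannot mix prefixed and plain field names")
    else:
        selected = set(minimal)
        for n in names:
            if n[0] == "+":
                selected.add(n[1:])
            else:
                selected.discard(n[1:])

    unknown = selected - set(available)
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")
    return selected
```

One caveat with this convention: many query-string parsers decode a literal + as a space, so clients would probably have to send it as %2B.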
Along the same lines, would it be acceptable to overload the meaning of fields to represent views for example? fields[article]=view:simple, fields[article]=view:detailed, fields[article]=view:full or is it better to introduce a custom query param, e.g. view-id instead?
I don’t really see why this is a problem. But one alternative might be to represent different views via multiple resources. For example /articles/{id} for the full data, and /articleSummaries/{id} for the partial data.
@jlangley using different views like that might be an issue with relationships and clients that follow links.
I’ve used a reflection mechanism that uses the meta area to send a list of the request parameters back to the client. This allows sensible defaults to be used and reflected back: “You didn’t specify what you wanted for parameter foo, so I used parameter foo=bar.”
This works very well for paging - you don’t want to send 10,317 records just because the client didn’t bother to set a limit. So a client can specify a limit and get that many results, or not specify a limit and get the default 100 records, and can see what that limit is in the request param area of the response.
So for @jgornick’s question, I’d say you could just set a minimal default that’s used unless the client specifies otherwise.
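A rough sketch of that reflection idea, assuming a meta key called request_params and the parameter names page[limit] and fields[article] (none of which come from the spec or from the post above; they are placeholders):

```python
# Hypothetical server-side defaults, used when the client omits a parameter.
DEFAULTS = {"page[limit]": "100", "fields[article]": "title,date"}


def effective_params(query, defaults=DEFAULTS):
    """Merge client-supplied query parameters over the server defaults."""
    return {**defaults, **query}


def build_response(records, query):
    """Apply the effective parameters and echo them back in meta."""
    params = effective_params(query)
    limit = int(params["page[limit]"])
    wanted = set(params["fields[article]"].split(","))
    data = [
        {
            "type": "article",
            "id": r["id"],
            # Only the requested (or default) attributes are returned.
            "attributes": {k: v for k, v in r["attributes"].items() if k in wanted},
        }
        for r in records[:limit]
    ]
    # The client can read meta to see which defaults were filled in.
    return {"data": data, "meta": {"request_params": params}}
```

A client that sends no parameters gets the defaults and can read them back from meta; a client that sends page[limit]=10 sees its own value reflected instead.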
If /articles/{id} were to return the full article and /articleSummaries/{id} return only a subset of attributes…
Then which does, e.g., relationships.article on /review/{id} link to?
Does the client need to be explicitly programmed with the two routes?
Does it need to manually rewrite a link from one to the other?
From another point of view, all the routes and responses in an API are like a connected graph (graph nodes are API routes; graph connections are relationships and other links).
I think that having two routes as described will produce a graph that’s not fully connected, which leads to confusion.
Going back to the internet being full of resources that are interlinked, I think a summary is a different resource and must be treated as such.
Each domain needs a different level of linking. Very few domains require a fully connected graph. That might seem confusing, but if you’re designing the domain to be used in a certain way, you will have sparse graphs.
I would have /articleSummaries/{id}. Each summary links back to its parent. The fact that the {id} is the same is coincidental - the client is not expected to build a URI.
For reviews, you must ask yourself “how do I want the client to use the resource?”. Do you want to go to /reviews/{id} directly from a /articleSummaries/{id}? If so, then include a relationship. If you require the client to load the full article first then do not include the relationship link. Or include both relationships! For this particular domain example, I probably would include both.
That means that the client could do this:
GET /reviews/0?include=articles,articleSummaries
Which is silly but I would rather have the power there and hope the client author does something sensible.
In the domain I use at work, sparse linking is used to steer the user through “the right way to use the system”, which is part of the power of hypermedia as they can do that without documentation.
I’d imagined it would link to the article, to link to an article summary you would have an articleSummary relationship. A review could of course have links to (i.e. relationships for) both the article and its summary (I see @brainwipe has had the same thought).
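For illustration, a review resource carrying both relationships might look roughly like this. The attribute rating, the id values, and the helper related_link are made up for the example; only the relationship names follow the discussion above.

```python
# A hypothetical review resource object with relationships to both the
# full article and its summary.
review = {
    "type": "reviews",
    "id": "0",
    "attributes": {"rating": 4},
    "relationships": {
        "article": {
            "links": {"related": "/articles/7"},
            "data": {"type": "articles", "id": "7"},
        },
        "articleSummary": {
            "links": {"related": "/articleSummaries/7"},
            "data": {"type": "articleSummaries", "id": "7"},
        },
    },
}


def related_link(resource, relationship):
    """Return the 'related' link for a relationship, as a client would follow it."""
    return resource["relationships"][relationship]["links"]["related"]
```

A client that only wants the summary follows related_link(review, "articleSummary") and never has to build a URI itself.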
Yes, but that doesn’t worry me. The alternative is that client needs to be explicitly programmed to include relationships or filter fields. It might be that no client wants both representations, but rather different clients want different representations.
Because I’m a fan of hypermedia I wouldn’t do this, but you could.
I have seen other solutions based around using the Prefer header to negotiate the level of detail, but that has other drawbacks.
I don’t see that this follows. But probably (again similar to the suggestion from @brainwipe) I would have two-way links between an article and its summary.
I’m not convinced separate resources are the solution in all cases.
In our case, there have been two wrinkles related to this, both involving costly lookups that we’d like to avoid unless the user is really asking for the information:
reverse relationships: e.g. an ‘Invoice’ resource with a ‘lineitem_set’ relationship. In some cases you want to see the lineitems, in some cases you don’t. This is a “bound reverse relationship” example. But you could also have unbound cases, like a ‘Customer’ resource with an ‘invoice_set’ relationship. This relationship could easily keep growing with time. There’s an argument to be made for not offering reverse relationships like these at all, but in some cases they prove to be super convenient, especially when coupled with nested includes.
computed attributes: i.e. fields that are not actually db columns, but computed values based on other data, potentially very expensive to look up. Examples could include a ‘costing_summary’ json blob on a ‘Job’ resource, or a dynamically-generated label that we’ve opted not to cache in the db.
We considered splitting out to multiple resource types. However, as soon as you have more than one expensive field that the user may or may not want, you either have to concede that you’ll have a “fast” and a “slow” resource type, or the number of resource types for a given db table grows with each combination you want to support. This may sound naive; I’m sure many such cases can be solved with smarter data architecture (treating endpoints as purpose-driven resources rather than mapping them one-to-one with db tables). In the real world, however, it’s all too easy for this approach to end up in a big mess, with all of the aforementioned issues of poorly connected graphs.
The solution we rolled with was to introduce a concept of ‘ondemand_only’ fields. We specify these on the resource serialiser, so that when the endpoint is hit, these fields are excluded from the response unless the request specifically names them in its sparse fields declaration. It works nicely, but since it’s not in the spec, we had to build it on top of the jsonapi library we were using, and addressing all the recursive cases was a little involved. It would be cool if it were somehow official.
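A stripped-down sketch of what such an ‘ondemand_only’ declaration could look like. This is not the poster’s actual implementation and does not use any real JSON:API library; the Job and JobSerializer classes are hypothetical, and includes and nested resources are left out entirely.

```python
class Job:
    """Hypothetical model with one cheap and one expensive attribute."""

    def __init__(self, id, name):
        self.id = id
        self.name = name
        self.costing_calls = 0  # counts how often the expensive lookup runs

    @property
    def costing_summary(self):
        self.costing_calls += 1  # stand-in for an expensive computation
        return {"total": 0}


class JobSerializer:
    type_name = "job"
    fields = ("name", "costing_summary")
    ondemand_only = ("costing_summary",)  # excluded unless explicitly requested

    def field_names(self, sparse_fields=None):
        requested = (sparse_fields or {}).get(self.type_name)
        if requested is None:
            # No fields[job] in the request: everything except on-demand fields.
            return [f for f in self.fields if f not in self.ondemand_only]
        # fields[job] given: exactly the named fields, on-demand ones included.
        return [f for f in self.fields if f in requested]

    def serialize(self, obj, sparse_fields=None):
        names = self.field_names(sparse_fields)
        return {
            "type": self.type_name,
            "id": str(obj.id),
            # getattr is only called for selected names, so the expensive
            # property is never evaluated unless it was asked for.
            "attributes": {name: getattr(obj, name) for name in names},
        }
```

With no fields[job] the response contains only name and the costing lookup never runs; passing {"job": ["name", "costing_summary"]} pulls it in.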
If there’s nothing in the spec that relates to what should be returned when ?fields[blah]= is omitted from the query, what’s the best way to try to push for a feature request? Is it overly optimistic to expect the spec to be sufficiently pliable to adopt such an addition?
The specification has been clarified to indicate that the set of fields returned without a corresponding fields[type] in the request is not required to be the complete set of fields available for the type.
Making ‘ondemand_only’ part of your serializer sounds like a perfectly reasonable approach to this.
In the absence of a fields[type] in the request, the set of fields can even be dynamic (though that may irritate the developers of some clients), doing things like dropping fields if the value is null.
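As a small sketch of that dynamic behaviour, assuming a hypothetical helper default_attributes: null values are dropped only when the client did not send fields[type], while an explicit request returns exactly what was named, nulls included.

```python
def default_attributes(attributes, requested=None):
    """Pick the attributes to return for one resource.

    requested is the parsed fields[type] list, or None if the client
    did not send one.
    """
    if requested is not None:
        # Explicit sparse fieldset: honour it exactly, even for null values.
        return {k: v for k, v in attributes.items() if k in requested}
    # No fields[type]: the server chooses, here by omitting null values.
    return {k: v for k, v in attributes.items() if v is not None}
```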