Fetching related resources, when to respond 404 not found


#1

Hi there!
When we fetch relationships, the specification says that the server MUST respond with 404 Not Found when processing a request to fetch a relationship link URL that does not exist.

Note: This can happen when the parent resource of the relationship does not exist. For example, when /articles/1 does not exist, request to /articles/1/relationships/tags returns 404 Not Found.

I am wondering if the same also applies when fetching related resources. I assume so, but I would like confirmation. For instance, if we fetch an article’s author through GET /articles/1/author and the article with id=1 does not exist, do we need to return 404 Not Found?


#2

The spec calls out that scenario explicitly in the first paragraph, which refers back to the previous section: it should be a 200 OK with null data.


#3

Thanks for your answer.

What I understood was that I need to respond with 200 OK and null data when the direct resource is not available.
Just to be sure: assume article with id=1 does not exist:

GET /articles/1  -->  returns 200 OK and null data
GET /articles/1/author  -->  returns 200 OK with null data?
GET /articles/1/relationships/author  -->  returns 404 Not Found

Is this right?


#4

Looking back at your post, you asked what happens if the author doesn’t exist on an article. I read that as /articles/1 existing.

Under those conditions your chart would be correct. Under the new assumptions, I believe it would be:

GET /articles/1  -->  returns 200 OK and null data
GET /articles/1/author  -->  returns 404 Not Found
GET /articles/1/relationships/author  -->  returns 404 Not Found

#5

Thank you very much!


#6

I think there is still some confusion here. From my perspective it should be the following:

Case 1: Article with id=1 exists but related author does not.

GET /articles/1  --> returns 200 OK.
GET /articles/1/author  --> returns 200 OK with null primary data for the related author.
GET /articles/1/relationships/author  --> returns 200 OK with null primary data for the related relationship.

Case 2: Article with id=1 does not exist.

GET /articles/1  --> returns 404 NOT FOUND with an "errors document"; the client cannot get any "author" relationship because there is no articles resource to navigate.

I think the confusion the specification is trying to address arises in “Case 1” when the client application has saved, bookmarked, or cached the 200 OK response from a time when the article existed, and the resource was later deleted without the client’s knowledge. If the client then issues GET /articles/1/author or GET /articles/1/relationships/author, the server should return a 404, because the parent resource “/articles/1” no longer exists at the moment these GET requests are made…
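
A minimal sketch of the two cases for the related-resource URL, in Python. The in-memory `ARTICLES` store and the `fetch_related_author` helper are hypothetical names made up for illustration, not part of any JSON:API library, and the behavior follows the reading given in this post rather than settled spec text.

```python
# Hypothetical in-memory store: article id -> article record.
# Article "1" exists with no author assigned; article "2" does not exist.
ARTICLES = {
    "1": {"title": "First article", "author": None},
    "3": {"title": "Third article", "author": {"type": "people", "id": "9"}},
}


def fetch_related_author(article_id):
    """Return (status, document) for GET /articles/{article_id}/author."""
    article = ARTICLES.get(article_id)
    if article is None:
        # Case 2: the parent resource does not exist, so respond with
        # 404 and an errors document.
        return 404, {"errors": [{"status": "404", "title": "Not Found"}]}
    # Case 1: the parent exists. An unassigned to-one author becomes null
    # primary data; otherwise the author (simplified here to type and id)
    # is the primary data.
    return 200, {"data": article["author"]}
```

Here `fetch_related_author("1")` gives a 200 with null data, while `fetch_related_author("2")` gives a 404.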


#7

I totally agree on case 1, but I am not sure about case 2, which is the one that actually started my question.

Do you think the documentation is clear on this? English is not my first language and maybe I am missing something with all the should, may, must etc…


#8

@e21524e3 I do think the documentation could be better. To me, it is the line before the note you mentioned that is confusing:

A server MUST return 404 Not Found when processing a request to fetch a relationship link URL that does not exist.

It is the “Note” that follows that brings clarity to that line:

Note: This can happen when the parent resource of the relationship does not exist. For example, when /articles/1 does not exist, request to /articles/1/relationships/tags returns 404 Not Found.

So the note is saying it is 404 because /articles/1 does not exist, NOT because /articles/1/relationships/tags does not exist. By definition, if a related resource or relationship does not exist, the response should be 200 with the primary data being null or an empty array, depending on whether the relationship is to-one or to-many respectively. At least that is how I understand the specification…
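
As a small illustration of the null versus empty-array distinction, here is a Python sketch. The function name `empty_primary_data` and the `"to-one"` / `"to-many"` labels are assumptions chosen for this example only.

```python
import json


def empty_primary_data(relationship_kind):
    """Primary data for a relationship that exists but is currently empty."""
    if relationship_kind == "to-one":
        return None  # serialized as JSON null
    if relationship_kind == "to-many":
        return []  # serialized as an empty JSON array
    raise ValueError(f"unknown relationship kind: {relationship_kind!r}")


# Response bodies that would accompany a 200 OK in each case.
to_one_body = json.dumps({"data": empty_primary_data("to-one")})
to_many_body = json.dumps({"data": empty_primary_data("to-many")})
```

With this sketch, `to_one_body` is `{"data": null}` and `to_many_body` is `{"data": []}`.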


#9

Thank you very much. Would be interesting to hear back from the upper people!

In the meantime, you have convinced me it is like you say.


#10

@scottmcdonald I don’t think your chart is correct. GET /articles/1 is a valid link which could hold an article, as such:

A server MUST respond to a successful request to fetch an individual resource with a resource object or null provided as the response document’s primary data.

null is only an appropriate response when the requested URL is one that might correspond to a single resource, but doesn’t currently.

In this case it’s clear the spec requires this to be a 200 OK with null data. I believe the only confusion here stems from the sub-resource, which by analogy would be a null pointer, similar to the point you made. GET /articles/1/author is a resource which COULD exist, but returning a 200 for a sub-resource on a resource which doesn’t exist really doesn’t make sense.

The ambiguity doesn’t help, and this is certainly one of those situations where the goal of mostly non-normative language in the spec is working against us.


#11

@michaelhibay

I think we almost agree. We agree that when a related resource or resource collection does not exist, a 200 with either null or an empty array is returned.

Let’s examine /articles/1 again. To me, the moment a client does a GET for this identified resource it either exists or does not exist, so the return should be either a 200 OK or 404 NOT FOUND respectively. The reason I say this is because of the use of an explicit identifier and the language in the specification:

404 Not Found

A server MUST respond with 404 Not Found when processing a request to fetch a single resource that does not exist, except when the request warrants a 200 OK response with null as the primary data (as described above).

Now I hear what you are saying about the language in the specification:

A server MUST respond to a successful request to fetch an individual resource with a resource object or null provided as the response document’s primary data.

null is only an appropriate response when the requested URL is one that might correspond to a single resource, but doesn’t currently.

But in my opinion this is referring to a single resource that is NOT addressed through an identifier, as in the case of a GET for the related author of an article where the related resource might not be assigned. I suppose this could also apply to a singleton where the URL does not contain an identifier, something like /temperatures/current. In fact, I think the note that follows in the specification clarifies this language, because it explains the “could exist” language in your response in terms of the “related” resource in a to-one relationship:

Note: Consider, for example, a request to fetch a to-one related resource link. This request would respond with null when the relationship is empty (such that the link is corresponding to no resources) but with the single related resource’s resource object otherwise.


#12

You’re right, we are very close to agreeing but I think you are reading into it and “legislating from the bench” a little. If we were having this conversation on a v1.1 GH issue, I may be saying something else, but this is a thread about what the spec says, not what it should say, and what it does say is quite clear.

One of our earliest discussions here was centered around me making a similar mistake, because the less than clear examples did not have a strong delineation between Ember convention and {json:api} requirements. In both of those cases you are clearly referencing a URI which could be a resource, so if GET /articles/1 is 200 OK null, GET /articles/1/author has to be 404 because you are referencing a sub resource of null.

The language is ambiguous, but the precedent is absolutely clear, and since I’ve been delving into the world of spec writing I’ve developed a keen understanding of exactly why normative language exists. When you’re writing software specifications, yes, normative language is less approachable, but it gets rid of all this bike shedding and allows the stuff to just work.

The way I read the whole spec is that you basically should be returning a {json:api} message almost all of the time, unless it would be logically invalid for you to do so, i.e. a sub resource under a null json object. What that means is, even though there is this ambiguity, and despite us discovering a less than ideal fringe case, we do what the spec says because it’s backwards compatible :expressionless:.


#13

@michaelhibay

I am going to have to disagree with you on /articles/1 returning a 200 OK with the primary data being null. For example, if someone does a DELETE on /articles/1 and subsequently does a GET on /articles/1, should it return a 200 OK with null as the primary data? To me the specification is clear that trying to RETRIEVE a resource by its primary key, when it doesn’t exist in the system, should return a 404 NOT FOUND, so let’s agree to disagree I guess…


#14

@scottmcdonald The point about ‘legislating from the bench’ is pretty operative here.

You’re setting up scenarios where it doesn’t make sense. I’m not disagreeing at all, they do not make sense, but the spec says verbatim that GET /articles/1 returns 200 OK null. I would not design the spec that way, but you can’t have conforming implementations if you don’t stick to the script. No user interaction scenario can change the requirements of the spec.

I’d be more than happy to support your very good assertions that this is a mistake in a GitHub issue; however, let’s not send this person on their way with a flawed understanding of the requirements.


#15

@michaelhibay

I agree we should not send this person on their way with a flawed understanding. So to summarize for an article with a primary key of “1” that does not exist in the system:

Me (Scott): /articles/1 returns a 404 NOT FOUND
You (Michael): /articles/1 returns a 200 OK with null as the primary data.

Can any of the specification authors (@dgeb @steveklabnik @ethanresnick) please comment on this thread for edification purposes…


#16

I am happy this discussion started a debate on this. I actually wasn’t expecting that LOL.

While I wait and hope that some of the specification authors comment, I’d like to point out that the conversation drifted a little bit from my (probably confusing) question.

So far, to summarize, I have this understanding:

If article with id=1 exists, then

  1. GET /articles/1 -> 200 OK
  2. GET /articles/1/relationships/author -> 200 OK with data: null if author does not exist (or data:[] if it’s a to-many relationship)
  3. GET /articles/1/author -> same as 2)

If article with id=1 does not exist, then

  1. GET /articles/1 -> 200 OK or 404 Not Found (you guys (@michaelhibay @scottmcdonald) have different interpretations on this)
  2. GET /articles/1/relationships/author -> 404 Not Found
  3. GET /articles/1/author -> same as 2)

I would love to hear back from the spec authors and maybe update the documentation with more examples?


#17

@e21524e3 Nice summary. In researching (i.e. using my “Google Fu” powers) what to return for /articles/1 when the article with id=1 does not exist, I found there is no single correct answer: it seems half the comments argue for 200 OK with null and the other half argue for 404 NOT FOUND. Therefore it comes down to what the specification says, and the specification authors SHOULD (notice I didn’t say MUST, haha, normative language) add clarity on this topic…


#18

@dgeb @steveklabnik @ethanresnick sorry to bug you, it would be nice if you could give a definitive, clear answer here, because people seem to be a little confused about the spec. Thank you very much


#19

So, there are two questions here: 1) what is the intended behavior? 2) what does the spec actually say at the moment?

Intended Behavior
The intended behavior is that GET /articles/1 is a 404 if an article with that id does not exist.

Here’s the reasoning/backstory:

A relationship is, conceptually: 1) a collection of resource identifier objects with 2) an associated type (to-one or to-many). In a to-many relationship, the contents are always serialized as an array; in a to-one relationship, they’re serialized as null or a single identifier object.

The related url represents a projection of the relationship, where each resource identifier object is mapped to the resource object it identifies. This result retains the same rules for communicating the relationship type — a to-one relationship gets projected to null or a single resource, and a to-many gets projected to a (possibly empty) array of resource objects.
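
A rough Python sketch of that projection, under stated assumptions: the `RESOURCES` store and the `project` function are hypothetical names for illustration, and the linkage value is assumed to be `None` or a single identifier for to-one, and a list of identifiers for to-many.

```python
# Hypothetical resource store keyed by (type, id).
RESOURCES = {
    ("people", "9"): {"type": "people", "id": "9", "attributes": {"name": "Dan"}},
    ("tags", "2"): {"type": "tags", "id": "2", "attributes": {"label": "api"}},
}


def project(linkage):
    """Map relationship linkage to the primary data of the related URL."""
    if linkage is None:
        # Empty to-one relationship: projects to null.
        return None
    if isinstance(linkage, list):
        # To-many relationship: projects to a (possibly empty) array.
        return [RESOURCES[(ident["type"], ident["id"])] for ident in linkage]
    # Non-empty to-one relationship: projects to a single resource object.
    return RESOURCES[(linkage["type"], linkage["id"])]
```

The relationship type survives the projection: `project(None)` stays `None`, and `project([])` stays an empty list.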

So, if a resource object includes a relationship with a related url, then that related url represents a set of related resources that does conceptually exist, but might be empty. If it is empty, the server should return data: null in this case, rather than 404, precisely to tell the client that the related collection does at least exist. (As you guys have pointed out, this is a useful distinction.)

Communicating “this exists but is currently empty” is the only reason the spec introduces the data: null response. When something simply doesn’t exist at the time of the request, 404 is appropriate.

Spec text
When the spec talks about a URL that “might correspond to a single resource, but doesn’t currently”, it’s trying to get at this idea of “the thing exists but doesn’t currently have any contents”.

I agree with Michael that the spec doesn’t actually say that, but my strong suspicion is that many implementations have interpreted 404 as the correct response anyway. I know that’s how my implementation has always behaved, I assume it’s how Scott’s behaves, and I imagine it’s how @dgeb’s behaves as well.

Unless a strong majority of implementations use the 200 OK interpretation (which I doubt), I have no problems clarifying the spec text to reflect the 404 behavior. At the end of the day, our backwards compatibility promises are about maintaining interoperability, so, if implementations in practice are already divided on this, clarifying the text doesn’t make things less interoperable.

I’d also point out that, if we were instead to double down on Michael’s reading of the spec, it would lead to some pretty bizarre behavior. In particular, JSON:API gives a server the flexibility to assign urls to resources however it wants, so, in theory, almost any url “might correspond to a single resource”. Therefore, reading the text as Michael suggests would make it impossible for a server to return a 404 at a clearly non-existent resource. For example, every JSON:API server would have to have GET /xcvcxtfien return a 200 OK with data: null. I don’t think that’s plausible/something we want.
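
To make the intended behavior concrete, here is a small Python sketch. It is only an illustration: the route pattern, `EXISTING_ARTICLE_IDS`, and `status_for_get` are hypothetical names, not taken from any real implementation.

```python
import re

# Hypothetical route: the only individual-resource URLs this server knows.
ARTICLE_ROUTE = re.compile(r"^/articles/(?P<id>[^/]+)$")

# Hypothetical set of article ids that currently exist.
EXISTING_ARTICLE_IDS = {"2"}


def status_for_get(path):
    """Status code for a GET on an individual-resource URL."""
    match = ARTICLE_ROUTE.match(path)
    if match is None:
        # The URL matches no route at all, e.g. /xcvcxtfien.
        return 404
    if match.group("id") not in EXISTING_ARTICLE_IDS:
        # The URL is well-formed but no article has that id.
        return 404
    return 200
```

Under this sketch, both `status_for_get("/xcvcxtfien")` and `status_for_get("/articles/1")` are 404, and only the existing `/articles/2` is 200.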