Recommendation for user-chosen unique ids for resources: precheck resource existence

Hi,
I couldn’t find any recommendation for what I see as a very common use case, which has unfortunately led to a lot of bikeshedding in my team :frowning:
The use case is allowing users to create resources, enforcing unique ids.

First of all, a simple system could be to just send an HTTP creation request and have the server respond with an HTTP error code on conflict, but this has the downside that the id conflict check is delayed until the user has entered all the data. This leads to a worse UI in the 99.999% of cases where the unique id could be checked and feedback provided immediately. (Of course, the unique id must be checked again when actually creating the resource. That check succeeds almost all the time if the precheck has succeeded; otherwise an error is returned and the user must choose another id.)

So an existence precheck is justified in many cases I think. For this, I think the classic implementations are:

  • performing a HEAD request on the resource’s canonical URL. Since it’s HEAD, only the HTTP status code and response headers can be used to answer. If you use 200/404, you get scary colors in the browser console logs whenever the check says the resource doesn’t exist. It can also be argued that it’s easier to make mistakes when programming the client, because most libraries automatically translate HTTP error codes into exceptions, which can cause control flow to jump non-locally, and that’s more complicated than a return value and local control flow. Using 200/204 fixes this but feels like warping the HTTP codes. Are there other, better-suited HTTP codes?
  • performing a request on a separate dedicated URL (e.g. /resources/:name/exists or /check/:name) that responds with 200 and “true” or “false” in the body, or some variation. This is more like RPC than REST.
  • using a more general search system for resources (/resources?id=… or /search?id=… or something). This has the drawback that it invites building a full-blown search system when you only need an existence check.
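For the 200/404 variant, the exception problem at least can be contained in one small wrapper. Here is a minimal sketch in Python using only the standard library. The names head_status and id_is_taken are made up for illustration, and it assumes the server answers HEAD with exactly 200 or 404:

```python
import urllib.error
import urllib.request
from typing import Callable


def head_status(url: str) -> int:
    """Send a HEAD request and return the status code without raising on 4xx/5xx."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urllib raises on 4xx/5xx; turn the exception back into a plain value.
        return error.code


def id_is_taken(url: str, head: Callable[[str], int] = head_status) -> bool:
    """Return True if the resource exists (200) and False if it does not (404)."""
    status = head(url)
    if status == 200:
        return True
    if status == 404:
        return False
    # Anything else is a real problem, not an answer to the existence question.
    raise RuntimeError(f"unexpected status {status} for HEAD {url}")
```

The caller then gets a boolean and local control flow. This does nothing about the red entries in the browser console, which come from the status code itself.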

Are there other solutions? Does {json:api} have a stance on this?
Personally, I would prefer a system that doesn’t use 4XX codes to signal that the user has chosen an existing id, because the goal of the form is precisely to let the user pick a unique id, so it’s normal and expected that the user tries various already existing ids.

Thanks in advance,
Jon

The specification recommends using UUIDs for client-generated IDs to prevent collisions:

The client SHOULD use a properly generated and formatted UUID as described in RFC 4122 [RFC4122].

NOTE: In some use-cases, such as importing data from another source, it may be possible to use something other than a UUID that is still guaranteed to be globally unique. Do not use anything other than a UUID unless you are 100% confident that the strategy you are using indeed generates globally unique identifiers.

Thank you for your answer. Unfortunately, UUIDs are neither memorable, readable, nor short, so they cannot be used in my case :frowning: (I do use them everywhere I can …)

I think I’ve found inner peace though, using HEAD and 200/204
(and for the same url,
HEAD 200 => GET 200
HEAD 204 => GET 404 // inconsistent, but I’ve accepted this imperfection as the best solution
)
(and for resources where empty objects are possible, which is rare,
HEAD 200 => GET 200 or GET 204 // even more inconsistent, but rare…
HEAD 204 => GET 404
)
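Spelled out as code, the first table amounts to something like the sketch below. status_for is a hypothetical helper, not part of any framework, and the rarer empty-object case is left out:

```python
def status_for(method: str, exists: bool) -> int:
    """Pick the status code for a HEAD or GET on a resource URL."""
    if method == "HEAD":
        # Existence check: "no such resource" is an expected answer, so no 4XX.
        return 200 if exists else 204
    if method == "GET":
        # Regular fetch: a missing resource is still a 404.
        return 200 if exists else 404
    raise ValueError(f"unsupported method: {method}")
```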

This also reminds me of the choice made by the CouchDB project: if you want to check whether an object exists in the database (which is not an error and is perfectly valid), they recommend doing a HEAD request on its ID, which returns 404 when it is missing (and is displayed by browsers as an error, and is interpreted by default by many libraries as an exception with non-local control flow). I don’t think they made the best choice, but I also wish there was something better than the 200/204 solution I chose :frowning:

Your solution only works under the assumption that no resource with a conflicting ID will be created between the two requests. Depending on expected interactions with your API that may be a risky assumption or not.

If you need to connect with the server to generate (or validate) an ID anyway, why bother with client-generated IDs in the first place?

Hi Jelhan, thanks for taking the time to discuss!

Your solution only works under the assumption that no resource with a conflicting ID will be created between the two requests. Depending on expected interactions with your API that may be a risky assumption or not.

Yes, I addressed this concern in my original question: the server must revalidate the ID on the final submit, but I expect that in my case this final validation will almost always succeed. The precheck is just for improved user interaction (earlier feedback).
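To make the two-step idea concrete, here is a rough sketch with an in-memory store standing in for the server. Registry and IdTakenError are names invented for this example. The point is only that exists() is advisory while create() does the authoritative check:

```python
from typing import Dict


class IdTakenError(Exception):
    """Raised when a resource with the requested id already exists."""


class Registry:
    def __init__(self) -> None:
        self._items: Dict[str, dict] = {}

    def exists(self, resource_id: str) -> bool:
        # Precheck: cheap and advisory, only used for early feedback in the form.
        return resource_id in self._items

    def create(self, resource_id: str, data: dict) -> None:
        # Authoritative check: repeated here because another client may have
        # taken the id between the precheck and the final submit.
        if resource_id in self._items:
            raise IdTakenError(resource_id)
        self._items[resource_id] = data


registry = Registry()
precheck_ok = not registry.exists("my-project")          # user sees "id is free"
registry.create("my-project", {"owner": "someone-else"})  # another client wins the race
try:
    registry.create("my-project", {"owner": "jon"})       # final submit still validates
except IdTakenError:
    print("id was taken in the meantime, please choose another one")
```

In a real server the check inside create() would have to be atomic, for example through a unique constraint in the database, rather than a separate lookup followed by an insert.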

If you need to connect with the server to generate (or validate) an ID anyway, why bother with client-generated IDs in the first place?

Because I want the IDs to be pretty, memorable, short, and meaningful. Only the (human) clients know how to generate this kind of id.

So this question is about improving the user interaction by prechecking the id, but you are right that it is more general: it’s about prechecking any part of a form when submitting a resource, provided that part has the following properties:

  • most of the time, the precheck and the final check will give the same result
  • the precheck doesn’t depend on data which is not available yet (for example filled only at the end of the form)
  • the check depends on some kind of invisible state (otherwise the check can be fully performed in the client)

So maybe the right solution is to have a “dryRun” parameter on the creation/modification endpoint, which can be used to run prechecks without actually creating or modifying anything, just returning the list of fields with errors. I don’t think I’ve seen dryRun many times in REST APIs, but I’ve seen it in other contexts (mostly CLIs I guess, such as git or kubectl). What do you think?
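Roughly what I have in mind, as a sketch only. The function names, the “title” field and the dict used as a store are all assumptions for the example, not an existing API:

```python
from typing import Dict, List


def validate(store: Dict[str, dict], payload: dict) -> List[str]:
    """Return the names of the fields that fail validation."""
    errors = []
    resource_id = payload.get("id", "")
    if not resource_id or resource_id in store:
        errors.append("id")      # missing or already taken
    if not payload.get("title"):
        errors.append("title")   # hypothetical required field
    return errors


def create_resource(store: Dict[str, dict], payload: dict, dry_run: bool = False) -> List[str]:
    """Validate the payload and store it unless there are errors or dry_run is set."""
    errors = validate(store, payload)
    if errors or dry_run:
        # A dry run stops here: same validation, no side effects.
        return errors
    store[payload["id"]] = payload
    return []
```

One catch: a dry run on a half-filled form also reports the fields the user has not reached yet, so the client would have to look only at the entries for the field it is prechecking.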

I think that’s a valid use case, even though I have often seen a slug used in addition to an ID for it. That decouples the identifier meant for humans (slug) from the identifier used as an implementation detail in the application (ID). It has the benefit of supporting changes to the human identifier (slug).

Server-side validation independent of creating or updating a resource is not supported by the JSON:API base specification. But it is a common pattern. I have seen it often in registration forms, which check whether a user account with a given email address already exists, giving the user early feedback that they should log in instead of filling out the entire registration form. I think it could be implemented as an extension.

Writing such an extension may be more challenging than it first seems:

  • Often a client can validate all fields on its own without interacting with the server. Only in a few cases is a client unable, by design, to have all the information required to validate a field. Uniqueness is one of these cases.
  • A server may want to limit which fields can be validated in such a precheck, as a performance optimization.
  • A client may want to validate only the field the user most recently changed, as a performance optimization. But validation logic may depend on the values of other fields as well.

So I guess it would need to support per-field validation, but a client may need to provide the values of other fields as additional context.
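A very rough sketch of what per-field validation with context could look like on the server side. Everything here is hypothetical (the field names, the validators, the precheck_field function), and nothing like it is defined by JSON:API:

```python
from typing import Callable, Dict, Optional, Set

# A validator receives the field value, the other field values supplied as
# context, and the ids already in use; it returns an error message or None.
Validator = Callable[[object, dict, Set[str]], Optional[str]]


def check_id(value, context, existing_ids):
    return "id is already taken" if value in existing_ids else None


def check_end_date(value, context, existing_ids):
    # Depends on another field; dates are ISO strings, so plain string
    # comparison orders them correctly.
    start = context.get("start_date")
    if start is not None and value < start:
        return "end_date is before start_date"
    return None


# Only fields listed here can be prechecked, which lets the server limit the cost.
VALIDATORS: Dict[str, Validator] = {"id": check_id, "end_date": check_end_date}


def precheck_field(field: str, value, context: dict, existing_ids: Set[str]) -> Optional[str]:
    """Validate a single field, using other field values only as context."""
    validator = VALIDATORS.get(field)
    if validator is None:
        raise KeyError(f"precheck not supported for field {field!r}")
    return validator(value, context, existing_ids)
```

The client would send the one field it wants checked plus whatever other values it already has, and the server decides which of those it actually needs.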

It’s unclear to me whether batching validations of different fields in a form should be supported. It may reduce the number of parallel network requests, but it may delay results for some fields. And in general, parallel network requests shouldn’t be an issue anymore with HTTP/2. I would tend towards not supporting it, similar to how the base specification does not support creating, editing or deleting multiple resources in a single request.