Delayed response without polling


Hello, guys!

I’m thinking about one more interesting question related to {json:api}. I have an action whose result should be returned to the user ASAP, but it has to run in the background because of its complexity. It performs a lot of computations and comparisons, and it can’t always finish right after the request. Sometimes it takes 5-10 seconds, or even minutes in some cases, and it wouldn’t be great to keep the client connection alive that long.

The specification describes the possibility of returning a 202 Accepted status code along with a URI where the result can be found later.

If a request to create a resource has been accepted for processing, but the processing has not been completed by the time the server responds, the server MUST return a 202 Accepted status code.

To get the result, I would have to make many requests to that URI to check whether the data is ready. And because I want to return it as quickly as possible, I need to make them frequently. This is the problem: I’m making a lot of requests instead of just receiving the data from the server when it is complete.

WebSockets to the rescue! One solution I’ve thought of: make an HTTP request that launches a background process and responds with 202 Accepted, and when the process finishes, return the result over WebSockets. But in this case I’m breaking the {json:api} specification, because the 202 should include the URI of the future response location, which wouldn’t exist. What do you think, is it right to do so? Or, in addition to WebSockets, should I create this URI anyway, so that it returns the data once it is available if someone accesses it, but not use it unless it is needed?
Another downside is that this behavior isn’t really clean or familiar to developers. And since I plan to make this API public, it could become very tricky.

The WebSockets messages would have a data format a bit different from {json:api}.



Why not use the Retry-After header? If you have a good idea how long the processing should take, you simply tell the client how long to wait, and any ‘miss’ could then also send this header with a subsequent delay. You could also use cache-control headers in sync with the Retry-After header to prevent any overzealous client from overloading your application servers with polls.
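
On the client side, honoring the header mostly comes down to turning its value into a wait time. Here is a rough sketch in Python, assuming the server sends either a number of seconds or an HTTP date (Retry-After allows both). The function name retry_after_seconds is just made up for illustration.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def retry_after_seconds(value, now=None):
    """Turn a Retry-After header value into a non-negative wait in seconds.

    The value is either delay-seconds ("30") or an HTTP-date.
    (Hypothetical helper, not part of any library.)
    """
    now = now or datetime.now(timezone.utc)
    value = value.strip()
    if value.isdigit():
        # delay-seconds form
        return float(value)
    # HTTP-date form, e.g. "Wed, 21 Oct 2015 07:28:00 GMT"
    target = parsedate_to_datetime(value)
    if target.tzinfo is None:
        # Treat a date without zone information as UTC
        target = target.replace(tzinfo=timezone.utc)
    # A date in the past means "poll now"
    return max(0.0, (target - now).total_seconds())
```

The client would sleep for that many seconds before requesting the status URI again.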

If you can compute a semi-accurate dynamic value, you could even have the front end present the estimate to the user.


Thanks for your help, @michaelhibay! Your proposals are really helpful all the time!

I can’t be 100% sure when the data will be available. The request is something like a trigger that says the client is ready to receive search results. The server then starts searching for data relevant to the user’s profile and request. Some of the data can be returned pretty fast, while other data needs time to appear (not all resources may be available at that moment).


By your description it seemed like you would have some kind of idea how long the request might take. Simple heuristics, cache-control, expires and the retry-after header should give you granular enough control to take most of the load off your server. It won’t be exactly like a WebHook or HTTP/2 server push, but it will get you most of the way there.

For example, if you have a particular scenario which usually takes 30-40 seconds to execute, you could return Location:;Retry-After: 30, which suggests the client not call for 30 seconds. If they poll immediately, then you return Location:;Retry-After: (now timestamp+30s);Expires:(now timestamp+30s);Cache-Control:max-age=30, which should prevent the browser from making the request, or let an intermediary cache serve the previous response.

If the client calls at 30 seconds and the resource isn’t ready, you can alter the time delay however you want, let’s just say every 5 seconds: Location:;Retry-After: (now timestamp+5s);Expires:(now timestamp+5s);Cache-Control:max-age=5. Setting these intervals would be entirely up to whatever 80/20 heuristics you can come up with, and if you really couldn’t figure it out, a simple 5 second interval would be pretty easy on your application server.
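
To make the schedule above concrete, here is a rough sketch in Python of how the "not ready yet" headers could be built. The function name, the status URL and the rule of counting down the remaining expected time are my own assumptions for illustration. Retry-After is sent here in its seconds form instead of a timestamp, which the header also allows.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def polling_headers(status_url, elapsed, expected=30, fallback=5, now=None):
    """Headers for a 'not ready yet' response (hypothetical helper).

    Before the expected duration has passed, ask the client to wait out
    the remainder; after that, fall back to a short fixed interval.
    """
    now = now or datetime.now(timezone.utc)
    remaining = expected - elapsed
    delay = int(remaining) if remaining >= 1 else fallback
    return {
        "Location": status_url,  # assumed status URI for the job
        "Retry-After": str(delay),
        # Keep Expires and max-age in sync with Retry-After so caches
        # can answer overly eager polls
        "Expires": format_datetime(now + timedelta(seconds=delay), usegmt=True),
        "Cache-Control": f"max-age={delay}",
    }
```

With the defaults, a poll right after the job starts gets Retry-After: 30, and a poll at 30 seconds or later gets Retry-After: 5.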

Considering this is an asynchronous job, on a single instance a map could hold a value that tells you whether the job is finished or not. If you have a distributed architecture for this, you could set up a cache to hold the job response status, which would make the polling very efficient, despite the volume of calls the client may attempt. Most of the time, if you leverage the capabilities of HTTP correctly, the few seconds of granularity you miss in this polling vs push design are a minor issue, and the load on your servers is almost entirely cached and cacheable, which makes it fast AND cheap.
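
For the single-instance case, the map could be as simple as the following Python sketch. The class and method names are made up, and the lock is only there on the assumption that the background worker and the request handlers run in different threads. In a distributed setup the same three operations would go against a shared cache instead.

```python
import threading


class JobStatusMap:
    """In-memory job registry for a single server instance (hypothetical)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._jobs = {}

    def start(self, job_id):
        # Called when the request is accepted and the job is queued
        with self._lock:
            self._jobs[job_id] = {"done": False, "result": None}

    def finish(self, job_id, result):
        # Called by the background worker once the result is ready
        with self._lock:
            self._jobs[job_id] = {"done": True, "result": result}

    def poll(self, job_id):
        """Return (state, result) where state is unknown, pending or done."""
        with self._lock:
            job = self._jobs.get(job_id)
        if job is None:
            return "unknown", None
        if not job["done"]:
            return "pending", None
        return "done", job["result"]
```

A poll is then a single dictionary lookup, so even a client that polls too often costs very little.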


Thank you so much for such a detailed explanation. I need to think about it more and will come back with feedback later. It looks like a workable solution. And above all, it will be much more understandable and familiar to developers than my idea of combining HTTP + WS.