Proposal: Efficiently loading multiple related objects for multiple parent objects


#1

Consider this scenario:

A user browses a large data set of persons. The user is the smart kind of user and is aware that he should not include the many-related child objects for each person while operating on the large data set for performance / response time reasons (or the user isn’t smart but the UI won’t let him include the children).
So the user repeatedly applies filters until the number of remaining objects is ‘small enough’, say 50 persons.
Then the user would like to load the children of all persons in the remaining set at once, because he now has decided he wants to see the children.

Afaik, the only way to retrieve the children of the 50 persons would be to follow the links in the relationships section of the person objects and fire 50 requests against the api.

That’s efficient for neither the client nor the server, although the server probably gets less excited about it.

My proposal for the jsonapi spec is that the server should allow retrieving the related objects of multiple parent objects in one request, like so:

GET /api/persons/1,4,6,8,33,234,.../children
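To make the difference concrete, here is a minimal Python sketch that builds the URLs a link-following client would request, one per parent, next to the single batched URL proposed above. The `/api/persons` path and the IDs come from the example; the helper names are hypothetical, and the comma-separated ID segment is only the proposal, not something the current spec defines.

```python
def per_parent_urls(base, parent_ids):
    """One related-resource URL per parent, as a link-following client would request."""
    return [f"{base}/{pid}/children" for pid in parent_ids]


def batched_url(base, parent_ids):
    """Single URL in the proposed (non-standard) comma-separated form."""
    ids = ",".join(str(pid) for pid in parent_ids)
    return f"{base}/{ids}/children"


ids = [1, 4, 6, 8, 33, 234]
print(len(per_parent_urls("/api/persons", ids)))  # number of requests without batching
print(batched_url("/api/persons", ids))           # the one request with batching
```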

Each child would then need to back-reference the parent through which it was loaded.
That could go into the relationships section of each child, perhaps like this:

relationships: {
  person: {
    links: {
      parent: "/api/person/33"
    }
  }
}

That way, both client and server could handle all children of all persons in one request. If a child can belong to several of the requested parents, parent might need to become parents and hold an array of the parents’ resource URLs, or the parent link might need to appear multiple times.
Curious what you think about this proposal.
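As a rough illustration of how a client might consume such a response, the sketch below groups children by the proposed back-reference. It assumes the non-standard shape shown above (`relationships.person.links.parent`), accepts either a single URL or an array of URLs, and uses made-up child IDs and a made-up `children` type; none of this is defined by the spec.

```python
from collections import defaultdict


def group_children_by_parent(document):
    """Map each parent URL to the IDs of the children that reference it.

    Assumes the proposed (non-standard) shape in which every child carries
    relationships.person.links.parent, as a string or a list of strings.
    """
    grouped = defaultdict(list)
    for child in document["data"]:
        parent = child["relationships"]["person"]["links"]["parent"]
        parents = parent if isinstance(parent, list) else [parent]
        for url in parents:
            grouped[url].append(child["id"])
    return dict(grouped)


# Made-up sample response; the second child belongs to two requested parents.
response = {
    "data": [
        {"type": "children", "id": "101",
         "relationships": {"person": {"links": {"parent": "/api/person/33"}}}},
        {"type": "children", "id": "102",
         "relationships": {"person": {"links": {
             "parent": ["/api/person/33", "/api/person/4"]}}}},
    ]
}
print(group_children_by_parent(response))
```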


#2

I think this can already be achieved with the filter and include query params.

If the filtered data set is still too large to serve together with the include, the server returns an error; once the result set is small enough, it allows the request to complete.

The size limit could only be communicated through the documentation, however.
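A server-side guard along these lines could look roughly like the following sketch. The threshold of 50 is borrowed from the "say 50 persons" example, and the function name, the choice of status 400, and the error wording are all assumptions; only the `errors` array with `status`, `title`, and `detail` members follows the JSON:API error object format.

```python
MAX_PARENTS_WITH_INCLUDE = 50  # hypothetical limit, taken from the "50 persons" example


def check_include_allowed(matching_parents, include):
    """Return (http_status, error_document) for a request.

    Rejects requests that ask for an include while too many parents match;
    returns (200, None) when the request may proceed.
    """
    if include and matching_parents > MAX_PARENTS_WITH_INCLUDE:
        error = {
            "status": "400",
            "title": "Result set too large for include",
            "detail": (
                f"{matching_parents} resources match; narrow the filter to at most "
                f"{MAX_PARENTS_WITH_INCLUDE} before using include."
            ),
        }
        return 400, {"errors": [error]}
    return 200, None


print(check_include_allowed(8000, "children")[0])  # rejected
print(check_include_allowed(50, "children")[0])    # allowed
```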


#3

Right, but not in this sense:
‘I have loaded the parents already, now only load their children’.

Also a scenario where a user ticks elements for which the relations should be loaded comes to mind.


#4

How big is the payload representing only the parents? I have a hard time imagining it’s sufficiently large that repeating the request while adding an include for the children would be an issue for server or client.


#5

The whole idea is less about saving bytes in the payload. It’s about saving requests.
We are using ember (data) for our client applications. If I just tell ember to show the children for the selected 50 parents, we most probably would see 50 requests against the api (if I don’t force ember to do anything special). My proposal offers a way to get related data in just one request, such that client frameworks would also know how they can efficiently get all the related data at once.
I also feel that a URL /api/persons/1,4,6,8,33,234,.../children is a very logical extension to the schema that the jsonapi spec defines already.


#6

After some more thought I think @casey is right - can’t this all be done now with filter, fields, and include parameters?

For example, why wouldn’t this work?:

  1. Client issues GET to /api/persons, gets 10,000 people in response
  2. Client issues GET to /api/persons?filter[gender]=female, gets 8,000 people in response
  3. Client issues GET to /api/persons?filter[gender]=female&filter[professor]=true, gets 50 people in response
  4. Client issues GET to /api/persons?filter[gender]=female&filter[professor]=true&fields[persons]=&include=children, gets children of 50 people in a single response but no data about the parents

Or, if the size of the payload isn’t an issue (only the number of requests matters), the fourth request could be a GET to /api/persons?filter[gender]=female&filter[professor]=true&include=children, which would return the data of the 50 parents and of their children in a single response.
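The request sequence above can be sketched with a small URL builder. The helper name `persons_url` is hypothetical, and the `gender` and `professor` filters are just the ones from the example; how a server interprets `filter[...]` is up to the server, since the spec only reserves the parameter name. Brackets and commas are left unencoded here purely for readability.

```python
from urllib.parse import urlencode


def persons_url(filters, include=None, parent_fields=None):
    """Build a /api/persons URL from filter, fields, and include parameters.

    parent_fields=[] produces an empty fields[persons]= parameter, which asks
    the server to omit the parents' own fields.
    """
    params = {f"filter[{name}]": value for name, value in filters.items()}
    if parent_fields is not None:
        params["fields[persons]"] = ",".join(parent_fields)
    if include:
        params["include"] = include
    query = urlencode(params, safe="[],")  # keep brackets and commas readable
    return f"/api/persons?{query}" if query else "/api/persons"


filters = {"gender": "female", "professor": "true"}
print(persons_url({}))                                              # step 1
print(persons_url({"gender": "female"}))                            # step 2
print(persons_url(filters))                                         # step 3
print(persons_url(filters, include="children", parent_fields=[]))   # step 4
```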


#7

The proposed GET /api/persons/1,4,6,8,33,234,.../children would be equivalent to

GET /api/persons?filter[id]=1,4,6,8,33,234&include=children
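For what it is worth, the response to such a request is an ordinary compound document, so a client can pair parents with their children using the resource linkage. The sketch below assumes each parent exposes a `children` relationship with `data` linkage and that the children arrive in `included`; the sample IDs, the `persons` type for children, and the attribute values are made up.

```python
def children_by_parent(document):
    """Map each parent ID to its child resources from a compound document."""
    included = {(res["type"], res["id"]): res for res in document.get("included", [])}
    grouped = {}
    for parent in document["data"]:
        linkage = parent.get("relationships", {}).get("children", {}).get("data", [])
        grouped[parent["id"]] = [
            included[(ref["type"], ref["id"])]
            for ref in linkage
            if (ref["type"], ref["id"]) in included
        ]
    return grouped


# Made-up sample response for two parents, one of which has a child.
document = {
    "data": [
        {"type": "persons", "id": "33",
         "relationships": {"children": {"data": [{"type": "persons", "id": "101"}]}}},
        {"type": "persons", "id": "4",
         "relationships": {"children": {"data": []}}},
    ],
    "included": [
        {"type": "persons", "id": "101", "attributes": {"name": "Example Child"}},
    ],
}
names = {
    pid: [child["attributes"]["name"] for child in kids]
    for pid, kids in children_by_parent(document).items()
}
print(names)
```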

#8

True, children could be loaded like that. But I don’t think existing js client frameworks would / should learn to load the children of already loaded parent objects that way. Or should they? If I imagine huge parent objects now …


#9

Assuming I’m understanding you correctly, the server can always force pagination for the parents to prevent this.


#10

Pagination partly solves the problem, as long as a single parent object is small.
But assume that the payload for one parent object amounts to, say, 100KB. You would not be able to avoid transferring that data again, even though it is unnecessary because it was already retrieved before.

A request like

GET /api/persons?filter[id]=1,4,6,8,33,234&include=children

may be viable in many situations, but may be suboptimal for others.
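To put a number on that overhead, here is a back-of-the-envelope calculation using the 50 parents from the original scenario and the 100KB per parent assumed above. Both figures are hypothetical, and the function name is only for illustration.

```python
def repeated_parent_overhead_kb(parent_count, parent_size_kb):
    """Kilobytes of parent data sent again when parents are re-requested with an include."""
    return parent_count * parent_size_kb


overhead_kb = repeated_parent_overhead_kb(50, 100)
print(f"{overhead_kb} KB of parent data re-sent, about {overhead_kb / 1000:.0f} MB")
```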


#11

Maybe, but I wouldn’t worry too much about hypothetical problems until you really encounter them.
It’s difficult to suggest a good solution without knowing all the details of your problem, but I can think of a couple of potential options:

  1. GET /api/persons?filter[parent.id]=1,4,6,8,33,234 to get just the children
  2. GET /api/personSummaries?filter[id]=1,4,6,8,33,234&include=children to get an alternative (small) representation of the parent objects but full representations for their children

#12

Your suggestions are all good. But I was aiming more for a general approach to loading related objects on demand, one that client frameworks like ember data could implement against. To my knowledge, every client framework that loads related objects automatically when their data is needed would follow the relationship links, causing one request per parent, which it should not. Currently, the only escape is to avoid automatic retrieval and trigger one efficient request yourself.
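One way a client layer could avoid the request-per-parent pattern is to queue the parents whose children are wanted and resolve them with a single request. The sketch below is not how ember data or any other framework works; `ChildLoader` and `fetch` are hypothetical names, the `filter[parent.id]` parameter is the one suggested earlier in this thread, and it assumes each child exposes a to-one `parent` relationship with resource linkage.

```python
from urllib.parse import urlencode


class ChildLoader:
    """Collects parent IDs and fetches all their children in one request.

    `fetch` is a hypothetical callable that takes a URL and returns the parsed
    response document; a real client would plug in its own HTTP layer.
    """

    def __init__(self, fetch, base="/api/persons"):
        self._fetch = fetch
        self._base = base
        self._pending = []

    def want_children_of(self, parent_id):
        """Queue a parent instead of requesting its children immediately."""
        if parent_id not in self._pending:
            self._pending.append(parent_id)

    def flush(self):
        """Issue a single request for all queued parents and group the result."""
        if not self._pending:
            return {}
        ids = ",".join(str(pid) for pid in self._pending)
        query = urlencode({"filter[parent.id]": ids}, safe="[],")
        document = self._fetch(f"{self._base}?{query}")
        grouped = {str(pid): [] for pid in self._pending}
        self._pending = []
        for child in document["data"]:
            parent_id = child["relationships"]["parent"]["data"]["id"]
            grouped.setdefault(parent_id, []).append(child["id"])
        return grouped


requested = []


def fake_fetch(url):
    """Stand-in for an HTTP call; records the URL and returns made-up data."""
    requested.append(url)
    return {"data": [
        {"type": "persons", "id": "101",
         "relationships": {"parent": {"data": {"type": "persons", "id": "33"}}}},
    ]}


loader = ChildLoader(fake_fetch)
for pid in [1, 4, 33]:
    loader.want_children_of(pid)
children = loader.flush()
print(children)
print(requested)
```

The trade-off is that the caller has to decide when to flush, which is exactly the kind of policy that may differ between clients and endpoints.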


#13

I think trying to do that automatically at a framework level wouldn’t be very helpful - not every client is going to want that behaviour, and even a single client might want different behaviour at different API endpoints.

To me, triggering the one efficient request yourself seems a much safer approach.


#14

I can only speak for myself, but we need to load related objects for multiple objects all the time.
This is where ember data falls short, so I thought about a strategy that would allow for more efficient retrieval.

I still feel that a URL like

GET /api/persons/1,4,6,8,33,234,.../children

is pretty much self-explanatory and does make sense. Anyway, thanks for discussing with me, I appreciate it!
And you brought up very good suggestions to achieve the same thing, where my favorite would be this one:

GET /api/persons?filter[parent.id]=1,4,6,8,33,234 to get just the children

Thank you!