Skip to content

Optimize performance for queries on recursive relations #6513

Closed
@noahsilas

Description

@noahsilas

While investigating parse-server performance, we're noticing a lot of time spent doing queries against the _Role and _Join:roles:_Role collections. One interesting feature of our application is that many of our users have roles that have several layers of children, which makes this a good candidate for optimization, and I think I've found a target.

In Auth._getAllRolesNamesForRoleIds, we follow a strategy that at a high level looks sort of like this:

class Auth {
  ...
  
  async function _getAllRolesNamesForRoleIds(ids) {
    const seen = new Set();
    const names = new Set();
    let horizon = [ids];
    
    while (horizon.length) {
      horizon.forEach((id) => seen.add(id));

      const relatedRoles = horizon.map(id => Parse.Role.createWithoutData(id));
      const owningRoles = await new Parse.Query(Role)
        .containedIn('roles', relatedRoles)
        .find();

      owningRoles.forEach((role) => {
        names.add(role.get('name'));
      });

      horizon = owningRoles
        .map(role => role.id)
        .filter(id => !seen.has(id));
    }
    
    return [...names];
  }
}

One thing that is somewhat hidden here is that querying the Role collection through the relation actually translates to two queries: one against the _Role collection, and one against the _Join:roles:_Role collection. This means that the number of queries needed for a user with a role that has depth of N is going to be roughly 2N.

We could improve on this by recognizing that we don't actually need to fetch the intermediate roles separately - we can follow the pointers in the join table directly, which would look something like this (hypothetical mongo version inlined here):

class Auth {
  ...
  
  async function _getAllRolesNamesForRoleIds(ids) {
    const seen = new Set();
    let horizon = [ids];
    
    while (horizon.length) {
      horizon.forEach((id) => seen.add(id));
      
      horizon = await mongo['_Join:roles:_Role']
        .find({ relatedId: { $in: horizon } })
        .map(join => join.owningId)
        .filter(id => !seen.has(id));
    }

    const roles = await new Parse.Query(Parse.Role)
      .select('name')
      .containedIn('objectId', [...seen]);

    return [roles.map(r => r.get('name'))];
  }
}

By doing our recursive step directly through the Join table, we can cut the number of queries down to N+1 (N queries against the join table, and one final query against the Roles table).

I'm starting to take a look at how I could implement something like this, and finding that it's a bit tricky to work through the relations query paths. I was wondering if anyone has suggestions about how we could implement this?

My initial thought is that I could add some methods to the database adapters to directly access join tables, and then pass a reference to the database adapter into the auth component to take advantage of those new methods.

Another option would be to try to add support for this kind of query directly to RestQuery and the JS SDK (which the auth component uses to do these queries). This potentially supports client applications that want a similar optimization on tree-like data stores (category trees, for instance). That would be less invasive in the Auth component, but entail larger changes in the public API. I'm sort of imagining something like role.relation("roles").query().recursive().find(), but I haven't thought this through at all yet.

I'd love to hear thoughts about approaches to this from the team, or if you think that this is just a bad idea for some reason I haven't thought of yet. Thanks!

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions