fleshyorgans

tryin’ not to be a linux fanboy


Coding: Rails Issues with :include

February 23rd, 2008

This is a sort of code analysis of something I discovered about :include usage in Rails. (This is in Rails 1.1.6, so YMMV. I believe the latest Rails opts to use inner SELECTs rather than JOIN statements. Also, this is kind of long.)

I have the following association set that represents a document repository.


User 1-n Folders
User 1-n Roles
Folder 1-n Documents
Documents n-n Brands


Folders n-n Viewer Roles
Folders n-n Manager Roles


Lastly, Users n -(polymorphic through)- n Brands

To display a tree of the Folders a given User can view, two things must hold
for each Folder: the User must be a manager or viewer of it, and the User’s
Brands must include the Brand of at least one Document under it. (In other words, a Document associated with a Brand lets a user view the enclosing Folder iff that user has the Brand.)
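The rule above can be sketched in plain Ruby. This is a minimal illustration, not the app's code: the Structs and the helper names (role_allows?, brand_allows?, viewable?) are hypothetical stand-ins for the ActiveRecord models, and it assumes "under that Folder" means documents directly in the folder.

```ruby
# Hypothetical plain-Ruby stand-ins for the models (not the real ActiveRecord classes).
Document = Struct.new(:name, :brands)
Folder   = Struct.new(:name, :manager_roles, :viewer_roles, :documents)
User     = Struct.new(:name, :roles, :brands)

# True when the user holds at least one manager or viewer role on the folder.
def role_allows?(user, folder)
  ((folder.manager_roles + folder.viewer_roles) & user.roles).any?
end

# True when at least one document in the folder carries a brand the user has.
def brand_allows?(user, folder)
  folder.documents.any? { |doc| (doc.brands & user.brands).any? }
end

# Both conditions must hold for the folder to show up in the tree.
def viewable?(user, folder)
  role_allows?(user, folder) && brand_allows?(user, folder)
end

reports = Folder.new('Reports', [:admin], [:sales],
                     [Document.new('q1.pdf', [:acme])])
alice = User.new('alice', [:sales], [:acme])
bob   = User.new('bob',   [:sales], [:globex])

puts viewable?(alice, reports)  # => true  (role and brand both match)
puts viewable?(bob, reports)    # => false (role matches, brand does not)
```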

So the way this was originally implemented was a recursive loop
something like this:

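def show_subfolders(user, folder)
  # Walk the children the user may see, then descend into each one.
  user.folders(folder) do |child|
    show_subfolders(user, child)
  end
end

class User
  # Yields each child of +parent+ that passes the brand and permission checks.
  def folders(parent)
    Folder.find(:all, :conditions => ['parent = ?', parent.id],
      :include => [:manager_roles, :viewer_roles, {:documents => :brands}]
    ).each do |folder|
      if method_to_check_brands(folder) && method_to_check_view_perms(folder)
        yield(folder)
      end
    end
  end
end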

The eventual problems that developed with this logic were the following:

  1. Second-level includes are problematic in Rails; this one generated
    outer joins against seven tables.
  2. It’s a massive join, called recursively. I believe the Big O* is (c!)^n,
    glancing quickly.
  3. Indexes don’t help, because the outer joins sort of blow the
    performance gains away.
  4. All data is being selected, so the sheer amount of data transfer
    slowed things down.
  5. The yield adds a lot of crap to the call stack.
  6. Brand and view/manage permissioning was already happening in the code.

With 245 documents under 30 folders, MySQL logged slow queries that
had to examine 320,000 rows, and it did so multiple times.
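To get a feel for why the examined-row count balloons, here is a back-of-the-envelope sketch. The per-folder counts are made up for illustration (they are not the numbers from my database), and it assumes every association on the folder is non-empty. Sibling one-to-many associations pulled back in a single outer join multiply with each other, while separate queries only add up.

```ruby
# Rows returned when every association comes back in one outer join:
# sibling associations multiply with each other.
def joined_rows(managers, viewers, documents, brands_per_document)
  managers * viewers * documents * brands_per_document
end

# Rows returned when each association is fetched by its own query:
# the counts simply add up.
def separate_rows(managers, viewers, documents, brands_per_document)
  managers + viewers + documents + documents * brands_per_document
end

# Hypothetical counts for a single folder, chosen only for illustration.
puts joined_rows(3, 4, 8, 2)    # => 192
puts separate_rows(3, 4, 8, 2)  # => 31
```

And that is for one folder, before the recursion repeats the whole thing at every level of the tree.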

The solution? Remove the :includes. Since permissioning and
restriction was already happening in code, I let Rails do its own
thing and select when necessary. This lowered the response time from
10+ seconds per page view to 0.5 seconds. (And for what it’s worth,
writing the join statements manually, or removing only the second-level :includes, only brought it down to 2 or 3 seconds.)

With less stuff to join, indices are used more
efficiently, and letting Rails do a find for Folder.documents on each
iteration ends up being a lot faster than trying to join with the
view/manage/brands permissions.
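Here is a rough sketch of the load-on-demand shape, runnable without Rails. Everything in it is a hypothetical stand-in: LazyFolder, its in-memory "tables", children_of, and visible_tree are not part of the app or of ActiveRecord. The real version would just be a plain find with no :include plus the normal folder.documents association. It also returns a nested array instead of yielding, which is my own variation.

```ruby
# Hypothetical in-memory stand-in for the Folder model. In the real app the
# two lookups below would be small ActiveRecord queries issued on demand.
class LazyFolder
  ALL  = []                              # pretend "folders" table
  DOCS = Hash.new { |h, k| h[k] = [] }   # pretend "documents" table, keyed by folder id

  class << self
    attr_accessor :document_loads        # counts simulated document queries
  end
  self.document_loads = 0

  attr_reader :id, :parent_id, :name

  def initialize(id, parent_id, name)
    @id, @parent_id, @name = id, parent_id, name
    ALL << self
  end

  # Stands in for a plain find of child folders, with no :include.
  def self.children_of(parent_id)
    ALL.select { |folder| folder.parent_id == parent_id }
  end

  # Stands in for the lazy documents association: loaded on first use only.
  def documents
    @documents ||= begin
      LazyFolder.document_loads += 1
      DOCS[id]
    end
  end
end

# Build the visible tree; the block decides whether a folder is shown.
# Hidden folders are pruned along with everything beneath them.
def visible_tree(parent_id, &visible)
  LazyFolder.children_of(parent_id).select(&visible).map do |folder|
    [folder.name, visible_tree(folder.id, &visible)]
  end
end

LazyFolder.new(1, nil, 'root')
LazyFolder.new(2, 1, 'specs')
LazyFolder.new(3, 1, 'empty')
LazyFolder::DOCS[1] << 'overview.doc'
LazyFolder::DOCS[2] << 'spec.doc'

tree = visible_tree(nil) { |folder| folder.documents.any? }
p tree                           # => [["root", [["specs", []]]]]
puts LazyFolder.document_loads   # => 3
```

More queries overall, but each one is tiny and can use its index, which matches what I saw in practice.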

The More You Know!

* It’s been over a decade since I actually figured out the order of an algorithm. I’m sort of guessing, here.
