This is a sort of code analysis of something I discovered about :include usage in Rails. (This is in Rails 1.1.6, so YMMV. I believe the latest Rails opts to use inner SELECTs rather than JOIN statements.
Also, this is kind of long.)
I have the following association set that represents a document repository.
User 1-n Folders
User 1-n Roles
Folder 1-n Documents
Documents n-n Brands
Folders n-n Viewer Roles
Folders n-n Manager Roles
Lastly, Users n -(polymorphic through)- n Brands
In order to display a tree of the Folders a given User can view, that User must
be a manager or viewer of each Folder, and his Brands must include the Brand of at
least one Document under that Folder. (A Document associated with a Brand under a Folder lets the User view that Folder iff that User has the Brand.)
So the way this was originally implemented was a recursive loop
something like this:
    # Rough reconstruction; the parent_id column name is assumed.
    def show_subfolders(user, parent)
      user.folders(parent) do |folder|
        show_subfolders(user, folder)
      end
    end

    class User
      # Yields each child of parent that passes both permission checks.
      def folders(parent)
        Folder.find(:all,
          :conditions => ['parent_id = ?', parent.id],
          :include => [:manager_roles, :viewer_roles, {:documents => :brands}]
        ).each do |folder|
          if method_to_check_brands(folder) && method_to_check_view_perms(folder)
            yield(folder)
          end
        end
      end
    end
The eventual problems that developed with this logic were the following:
- Second-level includes are problematic in Rails; this generated outer joins against seven tables: a massive join, recursively called. I believe the Big O* is (c!)^n, glancing quickly.
- Indexes don’t help, because the outer joins sort of blow the performance gains away.
- All data is being selected, so just the sheer amount of data transfer slowed things down.
- The yield adds a lot of crap to the call stack.
- Brand and View/Manage permissioning was already happening in the code.
With 245 documents under 30 folders, MySQL logged slow queries that
had to examine 320,000 rows, and that happened multiple times.
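As a rough illustration of how row counts like this can balloon, here is a small plain-Ruby sketch. It is not the original code and does not model MySQL's actual execution plan: the method names and the folder sizes are made up, and it only counts result rows. The idea is that when several independent one-to-many associations are outer-joined into one query, their row counts multiply, whereas separate queries only add.

```ruby
# Result rows for ONE folder when every association is LEFT OUTER JOINed
# into a single query. Independent associations multiply each other.
# (Hypothetical helper; simplified model that ignores join-table lookups.)
def joined_rows(manager_roles, viewer_roles, brands_per_document)
  # An outer join still yields one row when an association is empty.
  doc_rows =
    if brands_per_document.empty?
      1
    else
      brands_per_document.map { |brand_count| [brand_count, 1].max }.sum
    end
  [manager_roles, 1].max * [viewer_roles, 1].max * doc_rows
end

# Rows fetched when each association is loaded by its own small query.
def separate_rows(manager_roles, viewer_roles, brands_per_document)
  manager_roles + viewer_roles + brands_per_document.size + brands_per_document.sum
end

# Made-up folder: 3 manager roles, 4 viewer roles, 8 documents with 5 brands each.
brands = Array.new(8, 5)
puts joined_rows(3, 4, brands)    # 3 * 4 * (8 * 5) = 480
puts separate_rows(3, 4, brands)  # 3 + 4 + 8 + 40  = 55
```

Repeat the joined query at every level of a recursive folder walk and the totals grow quickly, which is at least consistent with the slow-query log above.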
The solution? Remove the :includes. Since permissioning and
restriction was already happening in code, I let Rails do its own
thing and select when necessary. This lowered the response time from
10+ seconds for page view to 0.5 seconds. (And for what it’s worth,
doing the join statements manually, or removing the second-level :includes only brings it down to 2 or 3 seconds.)
With fewer tables to join, indices are used more
efficiently, and letting Rails do a find for Folder.documents on each
iteration ends up being a lot faster than trying to join with the
view/manage/brands permissions.
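For a feel of what "select when necessary" looks like, here is a minimal plain-Ruby sketch of the same tree walk with lazy lookups. It does not use ActiveRecord: SketchFolder, document_brands, and visible_tree are hypothetical stand-ins, the log array just counts simulated queries, and it assumes the Brand check only looks at Documents directly under a Folder.

```ruby
# Hypothetical stand-in for a Folder record (not the real model).
class SketchFolder
  attr_reader :name, :children

  def initialize(name, viewable, brands, children = [], log = [])
    @name = name
    @viewable = viewable  # outcome of the view/manage role check
    @brands = brands      # brands of the documents directly under this folder
    @children = children
    @log = log            # shared list of simulated queries
  end

  def viewable?
    @viewable
  end

  # Simulates a lazy Folder#documents lookup: one small query, only when asked.
  def document_brands
    @log << "SELECT documents for #{@name}"
    @brands
  end
end

# Collects the names of folders the user may see, skipping hidden subtrees.
def visible_tree(folder, user_brands, out = [])
  return out unless folder.viewable?                           # cheap check first
  return out if (folder.document_brands & user_brands).empty?  # query only if needed
  out << folder.name
  folder.children.each { |child| visible_tree(child, user_brands, out) }
  out
end

log = []
reports = SketchFolder.new("reports", true, ["acme"], [], log)
secret  = SketchFolder.new("secret", false, ["acme"], [], log)
other   = SketchFolder.new("other", true, ["globex"], [], log)
root    = SketchFolder.new("root", true, ["acme"], [reports, secret, other], log)

p visible_tree(root, ["acme"])  # ["root", "reports"]
p log.size                      # 3 simulated queries; "secret" never triggers one
```

Each lookup is a small single-table query, and folders that fail the role check never trigger one at all.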
The More You Know!
* It’s been over a decade since I actually figured out the order of an algorithm. I’m sort of guessing, here.
