I know, I know. No one uses Solr anymore. Except me. And the guy who maintains acts_as_solr. And all those people who talk about it on Twitter. Okay, so there are still a few people using it. This is for them.

Just a quick note on how you can improve your users’ browsing experience in an app that uses acts_as_solr. Our app has been using acts_as_solr for a year or more now, and we’ve generally been happy with it. Re-indexing a big table can be painful, but we don’t end up doing that very often.

The one area that was still causing us some pain was the inline indexing of objects. I think the standard solution for this issue has been to defer indexing when creating/updating/deleting and have a cron job scheduled to take care of it at regular intervals. And that both works and is relatively painless. But that solution negates one of the things I like about acts_as_solr, as compared to other search solutions, which is the fact that indexing can be real time.

Anyway, I’m getting a little long-winded. All I wanted to do here was point out something that, in retrospect, might seem obvious to others.

Recently, for reasons unrelated to Solr, we implemented Workling and Starling so that we could background some tasks that were delaying (and in some cases, timing out) our users’ sessions. While I was working on it, I realized that we could use Workling to background the inline Solr indexing, allowing us to speed up save operations while still having what is basically real-time indexing. Here’s how we did it (all of this assumes that you have Workling installed and functioning):

First, we created app/workers/solr_worker.rb:

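class SolrWorker < Workling::Base
 
  # Reload the record by class name and id, then push it to Solr.
  def index_object(options={})
    object = options[:object_type].constantize.find_by_id(options[:object_id])
    unless object.blank?
      object.solr_save
    end
  end
 
  # Reload the record by class name and id, then remove it from the index.
  # If the row is already gone, find_by_id returns nil and this does nothing.
  def destroy_object_index(options={})
    object = options[:object_type].constantize.find_by_id(options[:object_id])
    unless object.blank?
      object.solr_destroy
    end
  end
 
end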

Next, in vendor/plugins/acts_as_solr/lib/instance_methods.rb, we added two methods:

def async_solr_save
  SolrWorker.async_index_object(:object_type => self.class.name, :object_id => self.id)
end
 
def async_solr_destroy
  SolrWorker.async_destroy_object_index(:object_type => self.class.name, :object_id => self.id)
end

And, finally, in vendor/plugins/acts_as_solr/lib/acts_methods.rb, we changed two lines from this:

after_save    :solr_save
after_destroy :solr_destroy

To this:

after_save    :async_solr_save
after_destroy :async_solr_destroy

And that’s it. This solution has been in place for a couple of weeks now, and we’ve seen a real improvement.
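If it helps to see the ordering in isolation, here is a rough plain-Ruby sketch of the idea. Nothing in it is Workling or acts_as_solr code: SketchQueue, SketchDocument, and the solr_save stub are made-up stand-ins, and the queue is just an array. The point is only that save returns before any indexing happens, and that the worker reloads the record from its class name and id.

```ruby
# In-memory stand-in for the job queue (hypothetical; not Workling's API).
class SketchQueue
  def initialize
    @jobs = []
  end

  def push(job)
    @jobs << job
  end

  def size
    @jobs.size
  end

  # Hand each queued job to the block, as a background worker would.
  def drain
    yield @jobs.shift until @jobs.empty?
  end
end

# Stand-in for a model that would normally use acts_as_solr.
class SketchDocument
  STORE   = {}  # pretend database
  INDEXED = []  # pretend Solr index

  attr_reader :id

  def initialize(id, queue)
    @id = id
    @queue = queue
  end

  # Persist the record, then queue the indexing instead of doing it inline.
  def save
    STORE[id] = self
    @queue.push({ :object_type => self.class.name, :object_id => id })
    true
  end

  # Stub for the real indexing call.
  def solr_save
    INDEXED << id
  end

  def self.find_by_id(id)
    STORE[id]
  end
end

queue = SketchQueue.new
doc = SketchDocument.new(1, queue)
doc.save
indexed_before = SketchDocument::INDEXED.dup  # nothing indexed yet

# Later, in the background: look the record up again and index it.
queue.drain do |job|
  record = Object.const_get(job[:object_type]).find_by_id(job[:object_id])
  record.solr_save unless record.nil?
end
indexed_after = SketchDocument::INDEXED.dup
```

The sketch uses Object.const_get where the worker above uses constantize, only so that it runs without Rails loaded.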

Sooner or later in the life cycle of any application, there comes a time when it makes sense to explore areas where backgrounding processes might improve your users’ experience. Right now, if you’re looking to do that, the place to turn is Workling.

The obvious candidates for this kind of backgrounding include sending emails and search engine indexing. In our app, though, the first performance killers to present themselves were several calls to third-party APIs. Due to the limitations of one of these APIs, and some funky accounting requirements, one process can take as many as 24 separate calls (some lookups, some method calls). Now, the API response times are pretty good, but no matter how quick they may be, it’s obvious that this is going to add some inconvenient overhead to what should be a fast, responsive user session.

In working to background this particular process, I came up against some Workling limitations that may not be typical, or well documented. (Actually, they may be well documented, but since Google just kind of assumes that when you type “workling” you really mean “working”, it’s not always easy to find the pertinent posts.) Anyway, without further ado, some questions/problems I came across, and the answers that worked for me:

1. The standard how-to/startup guide has you calling the worker from the controller, but is it possible to call it from within a model instance?

This one was pretty easy to test, and the answer seems to be yes, you can. I understand it may not be the best practice, but should you find it preferable, calling the worker from within a model does not appear to present a problem. (For my purposes, it ended up making sense to call it from the controller, but I did have it working from a model at first.)

2. Can you pass a model as one of the values in the options hash?

No. Trying this approach, you will very quickly run up against errors like “A copy of ModelName has been removed from the module tree but is still active!”
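What worked for me instead was passing only plain values (a class name and an id) and letting the worker look the record up itself. Here is a minimal sketch of that, in plain Ruby so it runs without Rails. Article, job_options_for, and load_record are hypothetical names for illustration, and Object.const_get stands in for constantize.

```ruby
# Hypothetical stand-in for an ActiveRecord model.
class Article
  RECORDS = {}  # pretend database table

  attr_reader :id, :title

  def initialize(id, title)
    @id = id
    @title = title
    RECORDS[id] = self
  end

  def self.find_by_id(id)
    RECORDS[id]
  end
end

# Caller side: build the options hash from plain values, not the object.
def job_options_for(record)
  { :object_type => record.class.name, :object_id => record.id }
end

# Worker side: resolve the class by name, then reload the record by id.
def load_record(options)
  Object.const_get(options[:object_type]).find_by_id(options[:object_id])
end

article  = Article.new(42, "Backgrounding Solr")
options  = job_options_for(article)
reloaded = load_record(options)
missing  = load_record({ :object_type => "Article", :object_id => 99 })
```

A lookup for an id that does not exist comes back as nil, so the worker should check for that before calling anything on the result.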

3. Do ActiveRecord associations work in the worker process?

That would be a no. It’s not a question I thought to ask initially, but I came up against it pretty early in the process. I could be wrong here, but in my experience, it seems like if you want to deal with a given object, you have to load it from within the worker. For me, any model methods that directly interacted with the DB failed (including simple lookups). Which made for some creative, maybe-a-little-hacky coding on my part.

Update: It looks like the reason that ActiveRecord associations were breaking down for me wasn’t a limitation of Workling, but rather an issue with a default setting in the workling plugin. In workling/lib/workling/remote/runners/spawn_runner.rb you’ll find the following line of code:

@@options = { :method => (RAILS_ENV == "test" || RAILS_ENV == "development" ? :thread : :fork) }

Which is probably all well and good, so long as you’re on Rails 2.2, the release that made Rails thread safe. If, however, you’re still running Rails 2.1, you’re going to want to change that line so it always uses :fork. Changing that solved the association-related issues I had been having.
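For anyone still on Rails 2.1, the edited line might look something like the one below. SpawnRunnerSketch and its options reader are stand-ins I made up so the snippet runs on its own; in the plugin, the assignment sits in spawn_runner.rb where the original line was.

```ruby
# Stand-in class for illustration only; not the plugin's real class body.
class SpawnRunnerSketch
  # Always fork, regardless of RAILS_ENV.
  @@options = { :method => :fork }

  # Hypothetical reader, added here just to inspect the setting.
  def self.options
    @@options
  end
end
```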

4. What about calling it from within a transaction?

This is potentially problematic, but not necessarily. For instance, if the worker needs to access a record that was created inside the same transaction, it may be out of luck when it goes looking for it in the database, because the transaction may not have committed yet. In retrospect, it makes sense, but it didn’t occur to me until I hit the wall.
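One way around this is to hold the jobs back until the transaction has finished, and only then hand them to the worker. The sketch below shows that idea in plain Ruby. DeferredJobs is a hypothetical helper, not part of Workling or Rails, and its transaction method only imitates a database transaction (it "commits" if the block finishes and "rolls back" if the block raises).

```ruby
# Hypothetical helper: buffers jobs while a transaction is open and
# releases them only if the block completes without raising.
class DeferredJobs
  attr_reader :dispatched

  def initialize
    @pending = []     # jobs queued inside the open transaction
    @dispatched = []  # jobs that would have been sent to the worker
  end

  def enqueue(job)
    @pending << job
  end

  # Stand-in for a database transaction.
  def transaction
    yield
    # Reached only on success: the records now exist, so release the jobs.
    @dispatched.concat(@pending)
  ensure
    # On success or failure, nothing stays buffered.
    @pending.clear
  end
end

jobs = DeferredJobs.new

jobs.transaction do
  jobs.enqueue({ :object_type => "Order", :object_id => 7 })
end
committed = jobs.dispatched.dup

begin
  jobs.transaction do
    jobs.enqueue({ :object_type => "Order", :object_id => 8 })
    raise "simulated failure"
  end
rescue RuntimeError
  # The transaction rolled back, so the job for id 8 was dropped.
end
after_rollback = jobs.dispatched.dup
```

In a real app the last step of transaction would be the call to the worker's async method rather than appending to an array.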

I think that covers all of the little gotchas I’ve come across so far. We’ve yet to deploy this latest code, or even to add Starling into the mix, so there’s always the possibility of more. If so, I’ll be sure to update this.
