Updated! July 26, 2011

Now covers both the Bamboo and Cedar stacks and incorporates feedback from Scott Watermasysk and our own experiences using Resque on Heroku.

When it comes to background processing, I use resque; I do not even consider the other popular alternative, delayed_job. I seem to be in good company. This is a tweet from @tobi, the author of delayed_job:

I feel like I have to write a imatrix style email about delayed_job and resque…

Unfortunately, on Heroku, the sanctioned way to do background processing is to use a worker with delayed_job. Definitely not an option. A little googling turned up two blog posts[1] that give us almost all the pieces we need to do this very inexpensively[2]. Getting this to work with resque-scheduler was a little tricky, so I have documented the setup here.

There are three parts to this blog post: using resque, using resque-scheduler on Bamboo, and using resque-scheduler on Cedar.

Resque

To start, we need a Redis database. RedisToGo offers a nano version for free. Add it to your Heroku app.

% heroku addons:add redistogo:nano

To use resque, add the gem.

gem "resque"

Make your class(es) work with resque, and then extend the class(es) with HerokuAutoScaler::AutoScaling.

class MyStuff < ActiveRecord::Base
  extend HerokuAutoScaler::AutoScaling

  def self.queue
    :my_queue
  end

  def self.perform(*args)
    # work done here
  end
end
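
Extending the class is enough because Resque looks hooks up by method-name prefix on the job class itself. The sketch below imitates that lookup in plain Ruby so it runs without the resque gem. `FakeScaling`, `FakeJob`, and `hooks_for` are illustrative stand-ins, not Resque's API, and the real lookup may differ in detail.

```ruby
# Stand-in for HerokuAutoScaler::AutoScaling: hook methods defined in a module.
module FakeScaling
  def after_enqueue_scale_up(*args)
    :scale_up
  end

  def after_perform_scale_down(*args)
    :scale_down
  end
end

# Stand-in for a Resque job class.
class FakeJob
  extend FakeScaling # hook methods become class-level methods

  def self.perform(*args); end
end

# Hypothetical helper: list class-level methods whose names start with a prefix,
# roughly how a hook runner can discover hooks by naming convention.
def hooks_for(job_class, prefix)
  job_class.methods.map(&:to_s).select { |name| name.start_with?(prefix) }.sort
end

puts hooks_for(FakeJob, "after_enqueue").inspect
puts hooks_for(FakeJob, "after_perform").inspect
```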

The HerokuAutoScaler module below should be somewhere in your load path (app/models works).

# Based on the ideas from: http://blog.darkhax.com/2010/07/30/auto-scale-your-resque-workers-on-heroku
require 'heroku'

# Scale workers on Heroku automatically as your Resque queue grows.
# Mixin the +AutoScaling+ module into your models to get the behavior.
#
#   class MyModel < ActiveRecord::Base
#     extend HerokuAutoScaler::AutoScaling
#   end
#
# And configure in an initializer +config/initializers/heroku_workers.rb+:
#
#   HerokuAutoScaler.configure do
#     scale_by {|pending| }
#   end
#
# The default scaling is non-linear:
# * 1 job => 1 worker
# * 15 jobs => 2 workers
# * 25 jobs => 3 workers
# * 40 jobs => 4 workers
# * 60+ jobs => 5 workers
module HerokuAutoScaler
  module AutoScaling
    def after_perform_scale_down(*args)
      HerokuAutoScaler.scale_down!
    end

    def after_enqueue_scale_up(*args)
      HerokuAutoScaler.scale_up!
    end

    def on_failure(e, *args)
      Rails.logger.info("Resque Exception for [#{self.to_s}, #{args.join(', ')}] : #{e.to_s}")
      HerokuAutoScaler.scale_down!
    end
  end

  extend self

  attr_accessor :ignore_scaling

  def clear_resque
    Resque::Worker.all.each {|w| w.unregister_worker}
  end

  def configure(&block)
    instance_eval(&block) if block_given?
  end

  def scale_by(&block)
    self.scaling_block = block
  end

  def scale_down!
    Rails.logger.info "Scale down j:#{job_count} w:#{resque_workers}"
    self.heroku_workers = 0 if job_count == 0 && resque_workers == 1
  end

  def scale_up!
    return if ignore_scaling
    pending = job_count
    self.heroku_workers = workers_for(pending) if pending > 0
  end

  private

  attr_accessor :scaling_block

  def heroku
    if ENV['HEROKU_USER'] && ENV['HEROKU_PASSWORD'] && ENV['HEROKU_APP']
      @heroku ||= Heroku::Client.new(ENV['HEROKU_USER'], ENV['HEROKU_PASSWORD'])
    else
      false
    end
  end

  def heroku_workers=(qty)
    heroku.set_workers(ENV['HEROKU_APP'], qty) if heroku
  end

  def job_count
    Resque.info[:pending]
  end

  def resque_workers
    Resque.info[:working]
  end

  def workers_for(pending_jobs)
    if scaling_block
      scaling_block.call(pending_jobs)
    else
      [
        { :workers => 1, # This many workers
          :job_count => 1 # For this many jobs or more, until the next level
        },
        { :workers => 2,
          :job_count => 15
        },
        { :workers => 3,
          :job_count => 25
        },
        { :workers => 4,
          :job_count => 40
        },
        { :workers => 5,
          :job_count => 60
        }
      ].reverse_each do |scale_info|
        # Run backwards so it gets set to the highest value first
        # Otherwise if there were 70 jobs, it would get set to 1, then 2, then 3, etc

        # If we have a job count greater than or equal to the job limit for this scale info
        if pending_jobs >= scale_info[:job_count]
          return scale_info[:workers]
        end
      end
    end
  end
end
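
To see what the default table does, or to try out a rule before handing it to `scale_by`, the lookup can be pulled out into a standalone function. This is a sketch that mirrors the thresholds above. `DEFAULT_LEVELS` and `workers_for_pending` are names chosen for illustration and are not part of the module.

```ruby
# [minimum pending jobs, workers] pairs, mirroring the default table above.
DEFAULT_LEVELS = [[1, 1], [15, 2], [25, 3], [40, 4], [60, 5]].freeze

# Return the worker count for the highest threshold that pending_jobs reaches,
# or 0 when no threshold is reached (nothing to do).
def workers_for_pending(pending_jobs, levels = DEFAULT_LEVELS)
  level = levels.reverse.find { |min_jobs, _workers| pending_jobs >= min_jobs }
  level ? level.last : 0
end

[0, 1, 14, 15, 39, 40, 70].each do |pending|
  puts "#{pending} pending => #{workers_for_pending(pending)} workers"
end
```

A custom rule is just a block that takes the pending count and returns a worker count. For instance, something like `scale_by { |pending| [(pending / 10.0).ceil, 5].min }` would add one worker per ten pending jobs, capped at five.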

Scaling works by calling into the heroku gem and issuing commands to your Heroku application; you need to have the heroku gem and your Heroku credentials available. Add the heroku gem.

# needs to be in your deployment environment, not just dev!
gem "heroku"

You need to add three config variables to Heroku to allow your workers to auto-scale. Check out my previous Heroku blog post for a neat way to manage your config variables. Set the config variables.

HEROKU_APP = your_app
HEROKU_USER = your_user
HEROKU_PASSWORD = your_password
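
Note that the `heroku` helper in the module returns false when any of these variables is missing, so scaling is skipped without raising an error. A small check like the following, run from an initializer or a console, can make that visible. `missing_heroku_config` is a hypothetical helper, not part of the code above.

```ruby
REQUIRED_HEROKU_VARS = %w[HEROKU_APP HEROKU_USER HEROKU_PASSWORD].freeze

# Return the names of required variables that are unset or empty in env.
# (Slightly stricter than the module, which only checks that they are set.)
def missing_heroku_config(env = ENV)
  REQUIRED_HEROKU_VARS.select { |name| env[name].to_s.empty? }
end

missing = missing_heroku_config
if missing.empty?
  puts "Auto-scaling config present"
else
  puts "Auto-scaling disabled, missing: #{missing.join(', ')}"
end
```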

Add this task file to lib/tasks/resque.rake to run as many normal resque workers as needed.

require 'resque/tasks'

task "resque:setup" => :environment do
  ENV['QUEUE'] = '*'
end

desc "Alias for resque:work (To run workers on Heroku)"
task "jobs:work" => "resque:work"

Finally, we also need to make sure resque does not get a stale db connection[3]. Add this to config/initializers/resque.rb.

Resque.after_fork = Proc.new { ActiveRecord::Base.establish_connection }

Deploy, and watch your workers scale as needed![4]
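
The scale-down rule in `scale_down!` is deliberately conservative: workers are only shut off when the queue is empty and the worker finishing up is the only one still working. Here is a standalone sketch of that decision, assuming the same two numbers the module reads from `Resque.info` (`scale_down?` is an illustrative name):

```ruby
# Decide whether it is safe to set the worker count to zero.
# pending: jobs waiting in the queues
# working: workers currently running a job (including the one asking)
def scale_down?(pending, working)
  pending == 0 && working == 1
end

puts scale_down?(0, 1) # last worker, empty queue
puts scale_down?(0, 2) # two workers finishing at the same time
puts scale_down?(3, 1) # jobs still waiting
```

When two workers finish at the same moment, each sees the other as still working, so neither scales down and the decision is deferred to the next completed job.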

Resque Scheduler

Out of the box, resque-scheduler will not work with our auto-scaling code. Specifically, it does not invoke hooks (such as after_enqueue) when adding jobs to the resque queues. @l4rk and I have submitted a patch that has been pulled in, but it does not look like it has been released as a gem yet. Thankfully, bundler lets us specify the github repo directly.

gem 'resque-scheduler', require: 'resque_scheduler', git: 'git://github.com/bvandenbos/resque-scheduler'

If you are using a schedule file, load it in an initializer, config/initializers/scheduler.rb.

Resque.schedule = YAML.load_file(File.join(File.dirname(__FILE__), '../resque_schedule.yml'))
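
For reference, a schedule file maps a job name to its settings. The sketch below parses a small example inline to show the hash that `Resque.schedule` would receive. The entry name, cron expression, and arguments are made up, and it reuses the `MyStuff` class and `my_queue` queue from earlier.

```ruby
require 'yaml'

# Hypothetical contents of config/resque_schedule.yml
SCHEDULE_YAML = <<~YAML
  nightly_my_stuff:
    cron: "0 3 * * *"
    class: MyStuff
    queue: my_queue
    args:
      - cleanup
    description: "Runs MyStuff.perform('cleanup') every night at 03:00"
YAML

schedule = YAML.load(SCHEDULE_YAML)

# Each entry is a plain hash keyed by strings.
schedule.each do |name, config|
  puts "#{name}: #{config['class']} on #{config['queue']} at #{config['cron']}"
end
```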

Using resque-scheduler on the Bamboo stack

resque-scheduler works by having a long-running worker continually pushing jobs to the resque queues as scheduled. The default Bamboo stack on Heroku does not make any allowance for different worker types. This is a problem. When scaling down, Heroku has no way of knowing which worker is ‘working’ or ‘scheduling’ or ‘idle’.

And the solution is not very clean. Use the Cedar stack if you can (see below).

The only way to isolate a long-running worker (the scheduler) from the scaling workers is to run it in a second Heroku application instance. In our case, the easiest way to do this is to redeploy the same codebase to another Heroku instance and set a filter to redirect any HTTP requests to the original app.
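
One way to implement such a filter is a small Rack middleware that answers every request with a redirect to the primary app. This is only a sketch: the class name `RedirectToPrimary` and the `PRIMARY_APP_HOST` variable are assumptions, and a `before_filter` in `ApplicationController` would work as well.

```ruby
# Rack middleware that redirects every request to the primary app's host.
class RedirectToPrimary
  def initialize(app, primary_host)
    @app = app # never called; kept to satisfy the middleware interface
    @primary_host = primary_host
  end

  def call(env)
    location = "http://#{@primary_host}#{env['PATH_INFO']}"
    query = env['QUERY_STRING'].to_s
    location += "?#{query}" unless query.empty?
    [301, { 'Location' => location, 'Content-Type' => 'text/plain' }, ["Moved to #{location}"]]
  end
end

# Simulate a request without starting a server.
middleware = RedirectToPrimary.new(nil, 'your-app.herokuapp.com')
status, headers, _body = middleware.call('PATH_INFO' => '/jobs', 'QUERY_STRING' => 'page=2')
puts status
puts headers['Location']
```

In a Rails app this could be enabled on the second instance only, for example with `config.middleware.use RedirectToPrimary, ENV['PRIMARY_APP_HOST']` guarded by a config variable.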

There are four things to configure on your second Heroku instance:

require 'resque/tasks'
require 'resque_scheduler/tasks'

task "resque:setup" => :environment
task "resque:scheduler_setup" => :environment

task "jobs:work" => "resque:scheduler"

Using the Cedar stack on Heroku

The Cedar stack on Heroku solves this problem by letting you define as many types of worker as you want in the Procfile, which is new to Cedar.

Add this to your Procfile.

worker: QUEUE=* bundle exec rake resque:work
scheduler: bundle exec rake resque:scheduler

Change your task file.

require 'resque/tasks'
require 'resque_scheduler/tasks'

task "resque:setup" => :environment
task "resque:scheduler_setup" => :environment

And run one scheduler worker.

% heroku scale scheduler=1

Deploy!

When we put this all together, we have a scheduler worker monitoring our scheduled/delayed jobs, and any number of workers working our resque queues. Jobs are placed in our resque queues either directly by our application, or by the scheduler at the scheduled time.

[1] First, James Bracy of RedisToGo wrote a nice blog post showing how to use Resque instead of delayed_job with Heroku. Daniel Huckstep then wrote a great blog post on a nifty way to auto-scale workers on Heroku.

[2] Almost. resque-scheduler requires a dedicated worker running all the time, and that will cost some money on Heroku.

[3] Adjust as needed if you are not using ActiveRecord.

[4] Occasionally, you will have two workers clearing the queues simultaneously. In this case, the scaling code will not scale down because it is not safe to do so, and the scale down will have to wait until the next opportunity.
