Updated! July 26, 2011
Now covers both the Bamboo and Cedar stacks and incorporates feedback from Scott Watermasysk and our own experiences using Resque on Heroku.
When it comes to background processing, I use resque; I do not even consider the other popular alternative, delayed_job. I seem to be in good company. This is a tweet from @tobi, the author of delayed_job:
I feel like I have to write a imatrix style email about delayed_job and resque…
Unfortunately, on Heroku, the sanctioned way to do background processing is to use a worker with delayed_job. Definitely not an option. A little googling turned up two blog posts1 that give us almost all the pieces we need to do this very inexpensively2. Getting this to work with resque-scheduler was a little tricky, so I have documented the setup here.
There are three parts to this blog post: using resque, using resque-scheduler on Bamboo, and using resque-scheduler on Cedar.
Resque
To start, we need a Redis database. RedisToGo offers a nano version for free; add it to your Heroku app.
% heroku addons:add redistogo:nano
To use resque, add the gem.
gem "resque"
Make your class(es) work with resque, and then extend the class(es) with HerokuAutoScaler::AutoScaling.
class MyStuff < ActiveRecord::Base
  extend HerokuAutoScaler::AutoScaling

  def self.queue
    :my_queue
  end

  def self.perform(*args)
    # work done here
  end
end
The HerokuAutoScaler module should be somewhere in your load path (app/models works).
# Based on the ideas from: http://blog.darkhax.com/2010/07/30/auto-scale-your-resque-workers-on-heroku
require 'heroku'

# Scale workers on Heroku automatically as your Resque queue grows.
# Mixin the +AutoScaling+ module into your models to get the behavior.
#
#   class MyModel < ActiveRecord::Base
#     extend HerokuAutoScaler::AutoScaling
#   end
#
# And configure in an initializer +config/initializers/heroku_workers.rb+:
#
#   HerokuAutoScaler.configure do
#     scale_by {|pending| }
#   end
#
# The default scaling is non-linear:
# * 1 job => 1 worker
# * 15 jobs => 2 workers
# * 25 jobs => 3 workers
# * 40 jobs => 4 workers
# * 60+ jobs => 5 workers
module HerokuAutoScaler
  module AutoScaling
    def after_perform_scale_down(*args)
      HerokuAutoScaler.scale_down!
    end

    def after_enqueue_scale_up(*args)
      HerokuAutoScaler.scale_up!
    end

    def on_failure(e, *args)
      Rails.logger.info("Resque Exception for [#{self.to_s}, #{args.join(', ')}] : #{e.to_s}")
      HerokuAutoScaler.scale_down!
    end
  end

  extend self

  attr_accessor :ignore_scaling

  def clear_resque
    Resque::Worker.all.each {|w| w.unregister_worker}
  end

  def configure(&block)
    instance_eval(&block) if block_given?
  end

  def scale_by(&block)
    self.scaling_block = block
  end

  def scale_down!
    Rails.logger.info "Scale down j:#{job_count} w:#{resque_workers}"
    self.heroku_workers = 0 if job_count == 0 && resque_workers == 1
  end

  def scale_up!
    return if ignore_scaling
    pending = job_count
    self.heroku_workers = workers_for(pending) if pending > 0
  end

  private

  attr_accessor :scaling_block

  def heroku
    if ENV['HEROKU_USER'] && ENV['HEROKU_PASSWORD'] && ENV['HEROKU_APP']
      @heroku ||= Heroku::Client.new(ENV['HEROKU_USER'], ENV['HEROKU_PASSWORD'])
    else
      false
    end
  end

  def heroku_workers=(qty)
    heroku.set_workers(ENV['HEROKU_APP'], qty) if heroku
  end

  def job_count
    Resque.info[:pending]
  end

  def resque_workers
    Resque.info[:working]
  end

  def workers_for(pending_jobs)
    if scaling_block
      scaling_block.call(pending_jobs)
    else
      [
        { :workers => 1,   # This many workers
          :job_count => 1  # For this many jobs or more, until the next level
        },
        { :workers => 2,
          :job_count => 15
        },
        { :workers => 3,
          :job_count => 25
        },
        { :workers => 4,
          :job_count => 40
        },
        { :workers => 5,
          :job_count => 60
        }
      ].reverse_each do |scale_info|
        # Run backwards so it gets set to the highest value first
        # Otherwise if there were 70 jobs, it would get set to 1, then 2, then 3, etc

        # If we have a job count greater than or equal to the job limit for this scale info
        if pending_jobs >= scale_info[:job_count]
          return scale_info[:workers]
        end
      end
    end
  end
end
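The header comment in the module mentions scale_by but leaves the block empty. Here is a minimal sketch of what a custom rule could look like. The rule itself (one worker per 10 pending jobs, capped at 5) and the names LINEAR_SCALE, MAX_WORKERS and JOBS_PER_WORKER are my own assumptions for illustration, not something from the original setup.

```ruby
# Hypothetical scaling rule (an assumption, not part of the original code):
# one worker per JOBS_PER_WORKER pending jobs, never fewer than 1 worker
# and never more than MAX_WORKERS.
MAX_WORKERS = 5
JOBS_PER_WORKER = 10

LINEAR_SCALE = lambda do |pending|
  wanted = (pending.to_f / JOBS_PER_WORKER).ceil
  [[wanted, 1].max, MAX_WORKERS].min
end

# In config/initializers/heroku_workers.rb this could be plugged in as:
#
#   HerokuAutoScaler.configure do
#     scale_by { |pending| LINEAR_SCALE.call(pending) }
#   end
```

Since scale_up! only asks for a worker count when at least one job is pending, the lower bound of 1 is a safety net rather than something the rule relies on.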
Scaling works by calling into the heroku gem and issuing commands to your Heroku application; you need to have the heroku gem and your Heroku credentials available. Add the heroku gem.
# needs to be in your deployment environment, not just dev!
gem "heroku"
You need to add three config variables to Heroku to allow your workers to auto-scale. Check out my previous Heroku blog post for a neat way to manage your config variables. Set the config variables.
HEROKU_APP = your_app
HEROKU_USER = your_user
HEROKU_PASSWORD = your_password
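Note that the private heroku method in the module returns false when any of these three variables is unset, so auto-scaling quietly does nothing. A small check at boot can make that visible. This is only a sketch: the helper name missing_heroku_vars is hypothetical, and treating blank values as missing is my assumption (the module itself only checks that the variables are set).

```ruby
# Names of the config variables the auto-scaler reads.
REQUIRED_HEROKU_VARS = %w[HEROKU_APP HEROKU_USER HEROKU_PASSWORD].freeze

# Hypothetical helper: returns the names that are unset or blank.
# Accepts any hash-like object so it can be exercised without touching ENV.
def missing_heroku_vars(env = ENV)
  REQUIRED_HEROKU_VARS.select { |name| env[name].to_s.strip.empty? }
end

# Could live in an initializer: warn once at boot instead of failing silently.
missing = missing_heroku_vars
warn "Auto-scaling disabled, missing: #{missing.join(', ')}" unless missing.empty?
```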
Add this task file to lib/tasks/resque.rake to run as many normal resque workers as needed.
require 'resque/tasks'

task "resque:setup" => :environment do
  ENV['QUEUE'] = '*'
end

desc "Alias for resque:work (To run workers on Heroku)"
task "jobs:work" => "resque:work"
Finally, we also need to make sure resque does not get a stale db connection3. Add this to config/initializers/resque.rb.
Resque.after_fork = Proc.new { ActiveRecord::Base.establish_connection }
Deploy, and watch your workers scale as needed!4
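The scale-down rule in scale_down! can be read as a small predicate. The sketch below restates it outside the module so the cases are easy to see; the method name safe_to_scale_down? is hypothetical. My reading, which is an assumption, is that the worker running the after_perform hook still counts itself as working, so a count of 1 means nobody else is busy.

```ruby
# Mirrors the condition used in HerokuAutoScaler.scale_down!:
# only drop to zero workers when nothing is pending and exactly one
# worker (presumably the one running the hook) is still marked as working.
def safe_to_scale_down?(pending_jobs, working_workers)
  pending_jobs == 0 && working_workers == 1
end

safe_to_scale_down?(0, 1) # queue empty, only this worker busy
safe_to_scale_down?(0, 2) # another worker is mid-job, so wait
safe_to_scale_down?(3, 1) # jobs still pending, so keep the workers
```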
Resque Scheduler
Out of the box, resque-scheduler will not work with our auto-scaling code. Specifically, it does not invoke hooks (like after_enqueue!) when adding jobs to the resque queues, so scale_up! never runs for scheduled jobs. @l4rk and I have submitted a patch that has been pulled in, but it does not look like it has been released as a gem yet. Thankfully, bundler lets us specify the github repo directly.
gem 'resque-scheduler', require: 'resque_scheduler', git: 'git://github.com/bvandenbos/resque-scheduler'
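To see why the missing hooks matter, here is a toy model of the two enqueue paths. It is a simplified stand-in and not Resque's or resque-scheduler's actual code; ScalingHooks, DemoJob and both enqueue methods are hypothetical names. It assumes hooks are found by method-name prefix, which matches how the after_enqueue_scale_up and after_perform_scale_down hooks in the auto-scaler are named.

```ruby
# Stand-in for the AutoScaling mixin: records calls instead of scaling Heroku.
module ScalingHooks
  def hook_log
    @hook_log ||= []
  end

  def after_enqueue_scale_up(*args)
    hook_log << [:scale_up, args]
  end
end

class DemoJob
  extend ScalingHooks
end

# Enqueue path that runs every class method whose name starts with
# after_enqueue once the job is on the queue.
def enqueue_with_hooks(queue, job_class, *args)
  queue << [job_class, args]
  job_class.methods.grep(/^after_enqueue/).sort.each do |hook|
    job_class.send(hook, *args)
  end
end

# Enqueue path that pushes the job and nothing else. With this path the
# job is queued but no worker is ever started for it.
def enqueue_without_hooks(queue, job_class, *args)
  queue << [job_class, args]
end
```

With the second path, a scheduled job lands on the queue while the worker count stays at zero, which is the failure described above.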
If you are using a schedule file, load it in an initializer, config/initializers/scheduler.rb:
Resque.schedule = YAML.load_file(File.join(File.dirname(__FILE__), '../resque_schedule.yml'))
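For reference, here is a rough sketch of what a schedule file might contain and what the initializer ends up assigning. The entry name nightly_my_stuff, the cron expression and the description are made up for illustration; the keys (cron, class, queue, args, description) follow the format in the resque-scheduler README as I understand it, so check them against the version you install.

```ruby
require 'yaml'

# Hypothetical contents of config/resque_schedule.yml, inlined as a string
# so the example runs without a file on disk.
SCHEDULE_SOURCE = <<-EOS
nightly_my_stuff:
  cron: "0 3 * * *"
  class: MyStuff
  queue: my_queue
  args: []
  description: "Hypothetical nightly run of MyStuff"
EOS

schedule = YAML.load(SCHEDULE_SOURCE)

# The result is a plain hash keyed by entry name. In the initializer it
# would be assigned with: Resque.schedule = schedule
schedule.each do |name, entry|
  puts "#{name}: #{entry['class']} on #{entry['queue']} at #{entry['cron']}"
end
```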
Using resque-scheduler on the Bamboo stack
resque-scheduler works by having a long-running worker continually pushing jobs to the resque queues as scheduled. The default Bamboo stack on Heroku does not make any allowance for different worker types. This is a problem. When scaling down, Heroku has no way of knowing which worker is ‘working’ or ‘scheduling’ or ‘idle’.
And the solution is not very clean. Use the Cedar stack if you can (see below).
The only way to isolate a long-running worker (the scheduler) from scaling workers is to use a second Heroku application instance to run the long-running worker. In our case, the easiest way to do this is to redeploy the same codebase to another Heroku instance and set a filter to redirect any http requests to the original app.
There are four things to configure on your second Heroku instance:
- point to the same redis instance by setting the RedisToGo variable manually
- make sure that the HEROKU_APP, HEROKU_USER, and HEROKU_PASSWORD variables (explained earlier) are set identically (you want to affect the original Heroku instance)
- use the Heroku app or console to run one worker
- use the following task file
require 'resque/tasks'
require 'resque_scheduler/tasks'

task "resque:setup" => :environment
task "resque:scheduler_setup" => :environment

task "jobs:work" => "resque:scheduler"
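The redirect on the second instance can be done in several ways. The sketch below uses a small Rack middleware rather than the controller filter mentioned above, simply because it is easy to show in isolation. The class name RedirectToPrimaryApp and the host your_app.heroku.com are placeholders, and how you switch it on for the second instance only is left to you.

```ruby
# Hypothetical Rack middleware: answers every request with a permanent
# redirect to the same path on the primary app. The wrapped app is kept
# only to satisfy the middleware signature and is never called.
class RedirectToPrimaryApp
  def initialize(app, primary_host)
    @app = app
    @primary_host = primary_host
  end

  def call(env)
    path = env['PATH_INFO'].to_s
    query = env['QUERY_STRING'].to_s
    location = "http://#{@primary_host}#{path}"
    location += "?#{query}" unless query.empty?
    [301, { 'Location' => location, 'Content-Type' => 'text/plain' }, ["Moved to #{location}"]]
  end
end

# In a Rails app this could be enabled on the second instance only, e.g.:
#
#   config.middleware.use RedirectToPrimaryApp, 'your_app.heroku.com'
```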
Using the Cedar stack on Heroku
The Cedar stack on Heroku solves this problem by letting you define as many types of worker as you want in the Procfile, which is new to Cedar.
Add this to your Procfile
worker: QUEUE=* bundle exec rake resque:work
scheduler: bundle exec rake resque:scheduler
Change your task file.
require 'resque/tasks'
require 'resque_scheduler/tasks'

task "resque:setup" => :environment
task "resque:scheduler_setup" => :environment
And run one scheduler worker.
heroku scale scheduler=1
Deploy!
When we put this all together, we have a scheduler worker monitoring our scheduled/delayed jobs, and any number of workers working our resque queues. Jobs are placed in our resque queues either directly by our application, or by the scheduler at the scheduled time.
1 First, James Bracy of RedisToGo wrote a nice blog post showing how to use Resque instead of delayed_job with Heroku. And Daniel Huckstep then wrote a great blog post on a nifty way to auto-scale workers on Heroku.
2 Almost: resque-scheduler requires a dedicated worker running all the time, and that will cost some money on Heroku.
3 Adjust as needed if you are not using ActiveRecord.
4 Occasionally, you will have two workers clearing the queues simultaneously. In this case, the scaling code will not scale down because it is not safe to do so and the scale down will have to wait until the next opportunity.