+++
date = "2011-01-01"
title = "Rake task to sync your assets to Amazon S3/Cloudfront"
tags = ["amazon", "s3", "cloudfront", "hosting", "cloud"]
slug = "rake-task-to-sync-your-assets-to-amazon-s3cloudfront"
+++
With my move to Heroku I felt bad about having Heroku's app servers serve static content for me. It's not really a problem, but I just like to use the best tool available for the job.
Because _Ariejan.net_ is a rack app, it has a `public` directory with all static assets in one place. There are, however, a few problems that need addressing.
~
These are the problems I want to resolve:
#### Keep my S3 Bucket in sync with my public directory ####
First and foremost, I want to keep my S3 bucket in sync with the content of `public`. I don't care about file deletions, but I do care about new and updated files. Those should be synced to S3 with every deployment.
#### Don't re-upload the entire public directory with every deployment ####
Over time the size of `public` has grown. New images are added all the time. I don't want to re-upload them with every deployment. So, my sync script must be smart enough to not upload unchanged files.
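One way to decide whether a file is unchanged is to compare its MD5 digest with the ETag that S3 reports for the remote object. The sketch below assumes the object was uploaded with a plain single-part PUT and without KMS encryption, since otherwise the ETag is not the MD5 of the content. The helper name `needs_upload?` is hypothetical and not part of any gem.

``` ruby
require 'digest/md5'

# Hypothetical helper (not part of the `s3` gem): decides whether a local
# file differs from its remote copy, given the remote object's ETag.
# Pass nil as remote_etag when the object does not exist yet.
def needs_upload?(local_path, remote_etag)
  return true if remote_etag.nil?

  # Some clients return the ETag wrapped in double quotes; strip them.
  remote_md5 = remote_etag.delete('"').downcase

  # Digest::MD5.file streams the file instead of loading it into memory.
  local_md5 = Digest::MD5.file(local_path).hexdigest

  local_md5 != remote_md5
end
```

With a helper like this, the sync loop only has to fetch each object's ETag, which is far cheaper than re-uploading the file.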
#### Hook the S3 sync into my current deployment rake task ####
My current rake deploy task should be able to call `assets:deploy` or something to trigger an asset sync.
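As a rough sketch of such a hook, Rake lets one task list another as a prerequisite, so the asset sync runs before the deploy body. The task bodies below are placeholders, not my real deploy script, and calling the task explicitly from inside the deploy task would work just as well.

``` ruby
require 'rake'

# Placeholder for the real sync task described in this post.
namespace :assets do
  desc "Sync public/ to S3 (placeholder body)"
  task :deploy do
    puts "== Uploading assets to S3/Cloudfront"
  end
end

namespace :deploy do
  desc "Deploy to production, syncing assets first"
  # Listing assets:deploy as a prerequisite makes Rake run it before the body.
  task :production => "assets:deploy" do
    puts "== Pushing to Heroku (placeholder body)"
  end
end
```

Running `rake deploy:production` with this Rakefile would print the upload line first and the push line second.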
#### Minimal configuration ####
I don't want to configure anything, if possible.
### The script ###
Well, this is the rake task I currently use:
``` ruby
require 's3'
require 'digest/md5'
require 'mime/types'

## These are some constants to keep track of my S3 credentials and
## bucket name. Nothing fancy here.
AWS_ACCESS_KEY_ID = "xxxxx"
AWS_SECRET_ACCESS_KEY = "yyyyy"
AWS_BUCKET = "my_bucket"

## This defines the rake task `assets:deploy`.
namespace :assets do
  desc "Deploy all assets in public/**/* to S3/Cloudfront"
  task :deploy, :env, :branch do |t, args|
    ## Minify all CSS files
    Rake::Task[:minify].execute

    ## Use the `s3` gem to connect to my bucket
    puts "== Uploading assets to S3/Cloudfront"
    service = S3::Service.new(
      :access_key_id => AWS_ACCESS_KEY_ID,
      :secret_access_key => AWS_SECRET_ACCESS_KEY)
    bucket = service.buckets.find(AWS_BUCKET)

    ## Needed to show progress
    STDOUT.sync = true

    ## Find all files (recursively) in ./public and process them.
    Dir.glob("public/**/*").each do |file|
      ## Only upload files, we're not interested in directories
      if File.file?(file)
        ## Strip the leading 'public/' from the filename for use on S3.
        ## Anchoring the pattern leaves any later 'public/' in the path alone.
        remote_file = file.sub(/\Apublic\//, "")

        ## Try to find the remote_file, an error is thrown when no
        ## such file can be found, that's okay.
        begin
          obj = bucket.objects.find_first(remote_file)
        rescue
          obj = nil
        end

        ## If the object does not exist, or if the MD5 Hash / etag of the
        ## file has changed, upload it.
        if !obj || (obj.etag != Digest::MD5.file(file).hexdigest)
          print "U"

          ## Simply create a new object, write the content and set the proper
          ## mime-type. `obj.save` will upload and store the file to S3.
          obj = bucket.objects.build(remote_file)
          obj.content = File.open(file, "rb")

          ## `type_for` returns an array of matches; use the first one and
          ## fall back to a generic type when nothing matches.
          mime = MIME::Types.type_for(file).first
          obj.content_type = (mime || "application/octet-stream").to_s
          obj.save
        else
          print "."
        end
      end
    end

    STDOUT.sync = false # Done with progress output.
    puts
    puts "== Done syncing assets"
  end
end
```
This rake task is hooked into my `rake deploy:production` script and generates the following output. Each `.` is an unchanged file that was skipped, and the `U` is a new file I added just to show you what happens:
``` shell
$ rake deploy:production
(in /Users/ariejan/Code/Sites/ariejannet)
Deploying master to production
== Minifying CSS
== Done
== Uploading assets to S3/Cloudfront
......................................U.........
== Done syncing assets
Updating ariejannet-production with branch master
Counting objects: 40, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (27/27), done.
Writing objects: 100% (30/30), 4.24 KiB, done.
Total 30 (delta 17), reused 0 (delta 0)
-----> Heroku receiving push
```
### Conclusion ###
It's very easy to write your own S3 sync script. My version still has some issues and missing features that I may or may not add later: there's no support for file deletions, and error handling is very poor. Also, `public` is still under version control (where I want it) and is pushed to Heroku. This is nonsense, because most of the assets in `public` are not used there (except `robots.txt` and `favicon.ico`).
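
If file deletions were ever added, the first step would be working out which remote keys no longer have a local counterpart. The sketch below covers only that comparison, assuming the list of bucket keys has already been fetched. `stale_keys` is a hypothetical helper, and the actual listing and deleting are left out because they depend on the S3 client in use.

``` ruby
require 'set'

# Hypothetical helper: given the local asset directory and the keys
# currently stored in the bucket, return the keys with no local file.
def stale_keys(local_dir, remote_keys)
  prefix = File.join(local_dir, "") # e.g. "public/"

  # Build the set of keys the local directory would produce on S3.
  local_keys = Dir.glob(File.join(local_dir, "**", "*"))
                  .select { |path| File.file?(path) }
                  .map { |path| path.delete_prefix(prefix) }
                  .to_set

  remote_keys.reject { |key| local_keys.include?(key) }
end
```

Printing the result before deleting anything would be a sensible dry run, since a wrong `local_dir` would mark every remote file as stale.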