Tag Archives: redis

Just a quick note that Karl Seguin has created a nice mini-Redis book, available for free download. Check his announcement here and grab the book. He’s also posted the source of the book to GitHub.


I have finally had the chance to update my Redis cheat-sheet so that it has the latest commands, including the Hash, Multi/Exec, and Pub/Sub stuff. I’m calling this v2.0 of the cheat-sheet since there are lots of changes and, in theory, this brings it in sync with Redis 2.0, though I’m sure there will be small changes to come.

I’ve also done the right thing and created a new GitHub repository for it, with the OmniGraffle file there (though not the fonts, yet, pending looking at them to see if that’s okay). So, enjoy, and let me know if I’ve missed/goofed anything, of course. The repo is here and the direct link to the PDF if that’s all you want is here.

I just pushed a new project to GitHub, redis_logger. I decided to give this a go and see if it ended up as potentially useful as I thought it might, and I’m pretty pleased with its initial version, limited though it is.

The idea is that by installing the tiny gem you can add logging into Redis to your application, including the ability to group log entries together, and then browse the groups, including intersections between groups.

As an example: let’s say you do like I did, which is to add request logging to your Rails application. I added this to my application_controller.rb:

  before_filter :log_request

  def log_request
    RedisLogger.debug({ "request" => request.url,
                        "remote_ip" => request.remote_ip,
                        "user_id" =>,
                        "username" =>
    }, "requests")
  end

This will create a log entry as a Hash in Redis, containing the key/value pairs that were passed in. A “timestamp” value is also automatically added for you. This entry will be added into two groups, “debug” and “requests” — the “requests” group is passed into the call as the optional second parameter.

I could also add a call in my controller to log a warning if a user tries to access a page and is redirected to the signin page, or an error if an exception is passed up. I could then view all of the log entries in the intersection of “error” and “requests” to see only errors logged in the “requests” group, and not the debug or warn messages.

One of the great things about this is that it was so easy to do using Redis, once I worked out the approach. Using Hashes, each entry is stored with its key/value pairs intact. Using Sets, each log group holds the entries that belong to it, which is what makes the intersections possible. I’m thinking about things like adding some more standard keys, like the current “timestamp” key, and enabling some additional functionality like adding tags for additional searchability within the groups.
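To make the storage idea concrete, here is a minimal sketch that uses a plain Ruby Hash and Set in place of Redis. It is not redis_logger's actual code: the TinyLogStore class and its method names are hypothetical, and in Redis the same steps would be done with hash and set commands.

```ruby
require 'set'

# In-memory stand-in for the Redis structures described above.
# TinyLogStore and its methods are hypothetical names, not redis_logger's API.
class TinyLogStore
  def initialize
    @entries = {}                                  # entry id => Hash of fields
    @groups  = Hash.new { |h, k| h[k] = Set.new }  # group name => Set of entry ids
    @next_id = 0
  end

  # Store the entry and add its id to the level group plus any extra groups.
  def log(level, fields, *extra_groups)
    id = (@next_id += 1)
    @entries[id] = fields.merge("timestamp" => Time.now.to_i)
    ([level] + extra_groups).each { |g| @groups[g] << id }
    id
  end

  # Entries that belong to every named group (a set intersection).
  def entries_in(*group_names)
    ids = group_names.map { |g| @groups[g] }.reduce(:&) || Set.new
    ids.sort.reverse.map { |id| @entries[id] }     # most recent first
  end
end

store = TinyLogStore.new
store.log("debug", { "request" => "/books" }, "requests")
store.log("error", { "request" => "/books/1" }, "requests")
store.log("error", { "message" => "disk full" })
errors_in_requests = store.entries_in("error", "requests")
```

Asking for the intersection of “error” and “requests” returns only the second entry, which is the behaviour described for browsing groups.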

Having the log entries in Redis is great, but browsing them is the fun part. So, there’s also redis_logger-web, a simple Sinatra app that lists the groups and lets you view the log entries. You can click on a group name to see its entries, most recent to oldest, or you can select multiple groups and view the intersection of their entries. Right now that’s limited to just the most recent 100 entries, until I work out the best way to save temporary sets and clean them up in a cron. In reality, though, the most recent 100 entries is what’s generally useful. Adding export functionality is first on my list, because that will be extremely handy for analysis.

Scaling, testing, and adding the essential multi-threading are next, of course. Sound interesting? Please, grab it, fork it, enhance it, let me know what you think.

I put together a cheat-sheet for Redis, so that I didn’t have to keep a browser tab open on the commands page all the time. This is v1.0 — please let me know any suggestions, ideas, critiques, etc. There are a number of exciting new commands coming (hashes! multi!) that I will add as they become available. I hope this proves useful for people.

Download the PDF.

After spending time to get some data into Redis (as documented in some of my previous posts here), I not surprisingly wanted to make the data searchable. After looking around at some of the full-text search solutions available for Ruby, I really liked the look of Sunspot. Well-presented, well-designed, and it even has decent documentation. It uses Solr underneath, which is a very respectable search engine, so that’s all good. Of course, it didn’t take me long to discover that the sunspot_rails plugin makes things drop-and-go when using ActiveRecord, but those of us branching off into alternatives have to put in more effort. Hence, I’ll document my findings here to hopefully make it easier for others.

I won’t bother going into the details of getting things set up, as the Sunspot wiki does a fine job of that. Suffice it to say that we install the gem (and the sunspot_rails gem if you’re going to have some ActiveRecord models as well), start the Solr server, and that’s about it. We’ve got Redis already going, right? So now it’s time to get our model indexed and searchable!

There are a few steps that we need to follow to make this happen. First, we put code in the model to tell Sunspot what fields should be indexed, which ones are just for ordering/filtering, and which ones should be stored if desired for quicker display:

class Book
  require 'sunspot'
  require 'sunspot_helper'

  # Pretend some attributes like number, title, etc are defined here

  Sunspot.setup(Book) do
    text :number, :boost => 2.0
    text :title, :boost => 2.0
    text :excerpt
    text :authors
    string :title, :stored => true
    string :number, :stored => true
    date :publication_date
  end

  def save
    book_key = "book:#{number}:data"
    @redis[book_key] = json_data
    @redis.set_add 'books', number
    # Make searchable
    Sunspot.index( self )
    Sunspot.commit
  end

  # Returns the JSON string stored for the given book number
  def self.find_by_number(redis, number)
    redis["book:#{number}:data"]
  end
end

First, note that we need to require 'sunspot' to get access to the Sunspot class. This isn’t required for ActiveRecord models, but since we’re on our own, we have to specify that. Then, we call setup, passing the name of our model. In the code block, we specify a few text fields: the number, title, excerpt, and authors. Those fields will be indexed and searchable. Then we specify title and number again as strings, asking that they be stored for quicker retrieval. This is so we can display just that data without fetching the whole object, if we want — I won’t get into the details of doing that here because, well, fetching the objects in Redis is so fast that I found it didn’t matter. Last, the publication date is also listed, so we can filter and order by it if we want.

In our save() method, after we store a book in Redis, we tell Sunspot to index it, and commit the updated index. So far, so good. In theory, we should be able to create a Book, save it, and then search for it. Alas, if this were an ActiveRecord model we’d be pretty much done (and wouldn’t even have to do the index/commit part because those are automagically triggered on create and update). Unfortunately, we have some harder work ahead of us.

Sunspot uses what it calls “adapters” to tell it what to do when it wants to identify an object, and when it wants to fetch an object given an id. We have to provide the adapters for our model. To give credit where it’s due, this Linux Magazine article helped me figure out what to do, and then reading through the Sunspot adapter source code filled in the blanks. If you look back at our model, you’ll see that it requires ‘sunspot_helper’. That’s where we’ll put our adapters:


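require 'rubygems'
require 'sunspot'
require 'json'

module SunspotHelper

  class InstanceAdapter < Sunspot::Adapters::InstanceAdapter
    def id
      @instance.number  # return the book number as the id
    end
  end

  class DataAccessor < Sunspot::Adapters::DataAccessor
    # Fetch the JSON for one book and rebuild the Book object from it
    def load( id )
      Book.new(JSON.parse(Book.find_by_number(Redis.new, id)))
    end

    # Same as load(), but for a list of ids, reusing one connection
    def load_all( ids )
      redis = Redis.new
      ids.map { |id| Book.new(JSON.parse(Book.find_by_number(redis, id))) }
    end
  end

end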

So, what’s going on here? We provide two adapters for Sunspot: the InstanceAdapter, and the DataAccessor. The InstanceAdapter just provides a method that returns the ID of the object. Easy enough, we just return the book’s number, which is the unique identifier. The DataAccessor has to provide two methods, load() and load_all(), that take an id and a list of ids, respectively, and expect objects back. In my case, the objects are serialized JSON, so we just call our find_by_number() method to get each object, call JSON.parse() to get the Hash of data, and construct a new Book object. (Note: obviously this requires having an initializer that can take a Hash and create the object, which I’ll leave as an exercise) Now we just register our adapters, by adding a couple of lines of code right before the call to Sunspot.setup():

  Sunspot::Adapters::InstanceAdapter.register(SunspotHelper::InstanceAdapter, Book)

  Sunspot::Adapters::DataAccessor.register(SunspotHelper::DataAccessor, Book)

Now we should be good to go, right? Okay, we construct a Book object, and call save…then search for it:

b = Book.new({ "number" => 8888888, "title" => "My test title"})
=> #<Book:blahblah...
b.save
=> nil
search = Sunspot.search(Book) { keywords 'test' }
=> <Sunspot::Search:{:rows=>1, blahblah…
r = search.results
=> [#<Book:blahblah...
r[0].title
=> "My test title"

And we’re good! Congratulations. So now we want to add the search capability to our controller, right?

# In a view, put in a search form. I have a little search image, so excuse the image_submit_tag:
<% form_for(:book, :url => { :action => "search" }) do |f| %>
      <%= f.label "Search for:" %>
      <input type="text" name="searchterm" id="searchterm" size="20">
      <%= image_submit_tag('search.png', :width => '30', :alt => 'Search', :style => 'vertical-align:middle') %>
<% end %>

# Now in the controller. Note the pagination, which is why we store the search in the session,
# so we can grab it out again if they click forward/back through the pages.
  def search
    @search_term = params[:searchterm] || session[:searchterm]
    if (@search_term)
      session[:searchterm] = @search_term
    end
    page_number = params[:page] || 1
    search = Sunspot.search(Book) do |query|
      query.keywords @search_term
      query.paginate :page => page_number, :per_page => 30
      query.order_by :number, :asc
    end

    @books = search.results
  end

# And then in our search view, display the results:
<% @books.each do |book| %>
    <li><%= book.number %>: <%= book.title %></li>
<% end %>
<br />
Found: <%= @books.total_entries %> - <%= will_paginate @books %>

Yes, Sunspot is so cool that it integrates automatically with will_paginate. So, looking through the above, we have a form that posts to our action (assuming you set the routes up, which you did, yes?). The action then takes the searchterm parameter if it’s there, or extracts it from the session if it’s not there. Note that this is not robust code: if it’s called with no parameter and nothing in the session, it will end up searching for an empty string, which will return every book. In any case, we store the search term in the session, so that when someone clicks through to page 2, we can re-run the search to get the second page. The more important code here, though, is the call to search.

I will give a thousand thanks to this blog post, specifically the fourth item! I was doing this:

    search = Sunspot.search(Book) do
      keywords @search_term
    end

And it didn’t work — it was fetching every object, even though I knew that @search_term was getting set properly. As that blog post notes, though, the search is done in a new scope, so this didn’t work. The code I showed above, using the query argument, fixes that problem. It certainly took me a while to figure that out, though, because nothing is said about it anywhere in the examples in the Sunspot wiki.
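The scoping behaviour is easy to reproduce outside of Sunspot. The sketch below is not Sunspot's code: MiniQuery, mini_search, and FakeController are hypothetical names that only imitate a common Ruby DSL pattern, where a block without an argument is run with instance_eval (so self, and therefore instance variables, belong to the query object) and a block with an argument is simply called.

```ruby
# MiniQuery and mini_search are hypothetical; they only imitate the DSL pattern.
class MiniQuery
  attr_reader :terms

  def keywords(value)
    @terms = value
  end
end

# If the block takes an argument, call it normally (the caller's self is kept).
# Otherwise evaluate it with instance_eval, which makes the query object self.
def mini_search(&block)
  query = MiniQuery.new
  if block.arity == 1
    block.call(query)
  else
    query.instance_eval(&block)
  end
  query
end

class FakeController
  def initialize
    @search_term = "test"
  end

  def without_argument
    # Inside instance_eval, @search_term is looked up on the query object: nil.
    mini_search { keywords @search_term }.terms
  end

  def with_argument
    # With a block argument, self is still the controller.
    mini_search { |q| q.keywords @search_term }.terms
  end
end

controller = FakeController.new
broken = controller.without_argument
fixed  = controller.with_argument
```

Here broken ends up nil while fixed is the search term, which matches the symptom: a keywords call that silently receives nothing and so matches everything.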

So now you should be all set. Put “test” into the form, submit it, the controller will do the search, return the book, and your view will list it. You are searching! Not so bad, and the fetches from Redis are so fast that the whole thing really speeds along. Pretty simple free-text search against any objects that you put into Redis.

A Warning

I had one other hitch when I was working on this, which mysteriously went away. I hate that. So, in case someone else encounters it, I wanted to document the issue here. When I got the adapters in place for the Book model, and tried to work with it, I got an error saying that there was no adapter registered for String. I was very puzzled, wondering if something about the fact that Redis was returning a JSON String was confusing Sunspot. So I made a quick change to the InstanceAdapter:

  class InstanceAdapter < Sunspot::Adapters::InstanceAdapter
    def id
      if (@instance.class.to_s == "String")
        @instance.number  # return the book number as the id

And changed the register lines in my model:

  Sunspot::Adapters::InstanceAdapter.register(SunspotHelper::InstanceAdapter, Book, String)

  Sunspot::Adapters::DataAccessor.register(SunspotHelper::DataAccessor, Book, String)

And that did the trick. I didn’t like it, and intended to try to figure out what was going on. But after getting all the rest of it working, when I put the code back to its pre-String-adapter state, the error didn’t return. Like I said, I hate that. Hopefully it was just due to something that I was unknowingly doing wrong which I fixed along the way, but…just in case, now the quick-fix is documented here for anyone else who runs into the problem.

This afternoon I wanted to add another quick report to the system I’m building, and it was so easy that I thought I’d share some of the details. As I’ve written here before, I’m using the RaphaelJS library for simple charts, and it makes it very simple to create bar charts and pie charts. So I’ve already got a couple of those, using common code as I described in that earlier posting.

When I wanted to create a new chart, then, I knew I could leverage that. First, though, I needed to get at my data. What I was getting was essentially a list of categories, and the number of items in each category. Since this is stored in Redis, the items are key-value entries, and each category is a Set to which items belong. In this particular case, each item belongs to only one set.

So for the sake of an example, let’s say that we have books, divided into categories. We’ll store the books with keys like book:#:title and each category set will be called cat:name:

  • book:1234:title => “Technical Book”
  • book:5678:title => “Another Tech Book”
  • book:9012:title => “Gardening Book”
  • cat:technical => 1234, 5678
  • cat:gardening => 9012

So, for charting purposes, we want to get a list of the categories, and then for each one we want to fetch the number of books in it. In my reports_controller I have this:

  def books_by_category
    @chart = params[:chart] || "bar"
    @chart_title = "Books by Category"

    redis = Redis.new
    # Get the list of the categories, which are the labels for the graph
    keys = redis.keys("cat:*")

    # Now let's iterate through the categories and get the counts
    @labels = []
    @values = []
    keys.sort.each do |cat|
      count = redis.set_count(cat)
      @values << count
      cat_name = cat.scan(/cat:(.*)/)[0][0]
      if (@chart == 'pie')  # Need to add counts to the labels for pie charts
        cat_name << " (#{count})"
      end
      @labels << cat_name
    end
  end

First we get a Redis connection, and ask it for all of the category keys, using the pattern chosen: redis.keys("cat:*"). I want to stress something here: if you read the Redis docs (which of course you should, in depth) you’ll see that they say to never use this command in a production app! Obviously, if you have a lot of keys in the database, this is not a good command. In this particular case, I know that the database being used will not have too many keys, so I’m comfortable doing this — but be careful and make sure it’s okay for your case! If not, the solution is to create a new set that contains the names of all of the categories. Grab that set using SORT and work from there, which is simple. I also want to stress that, as with any reporting, if your data set grows (i.e. you start to have lots and lots of categories), you don’t want to run this frequently! Do the count occasionally and cache the results, create roll-up data from which to do your reporting, etc. This is a very simple case, but is a nice example of some tools.
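As a rough illustration of the alternative to KEYS, the sketch below keeps an index set of category names up to date whenever a book is filed, and builds the chart data from that. It uses Ruby's Set as an in-memory stand-in for Redis sets, and the “categories” set name is an assumption, not something from my app. In Redis the equivalent steps would use SADD to maintain the sets and SCARD for the counts.

```ruby
require 'set'

# In-memory stand-in for Redis sets; "categories" is a hypothetical index set name.
sets = Hash.new { |h, k| h[k] = Set.new }

# Whenever a book is filed under a category, also record the category name.
add_to_category = lambda do |category, book_number|
  sets["cat:#{category}"] << book_number
  sets["categories"] << category
end

add_to_category.call("technical", 1234)
add_to_category.call("technical", 5678)
add_to_category.call("gardening", 9012)

# Build chart data from the index set instead of scanning every key.
labels = sets["categories"].sort
values = labels.map { |name| sets["cat:#{name}"].size }
```

The cost is one extra write per book, in exchange for never having to scan the whole keyspace at report time.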

Okay, so then we have the categories, and we iterate through them. For each, we get the count of entries using redis.set_count(cat). The redis-rb library aliases “set_count” to be the Redis command SCARD, which returns the cardinality of the set, i.e. the number of entries. We add that onto the @values array, and then create the category name by taking everything after “cat:” from the key. If we’re making a pie chart, we add the count to the labels, simply because I found that it’s very friendly that way. We add the category name to the labels array, and continue.

That’s pretty much it then! Using the previous reporting code, I just had to create a new view, which includes the partials I created before — the partials expect the @labels and @values arrays, so they’ll just graph whatever they get. Here’s the actual view:

<%= render :partial => "report_chart" %>


<h1>Reports : Books by Category</h1>
<%= render :partial => "report_links" %>


<div id="holder"></div>

If you refer back to my earlier post about RaphaelJS graphing, the report_chart partial contains the JavaScript to generate the chart. The report_links partial simply has code to create links to the various chart types for this data: pie, bar, and csv. The holder div is where the RaphaelJS JavaScript will render the chart.

And that’s all there is to it. Thanks to the ease of Redis sets, getting the data sliced and diced as needed was extremely simple, and thanks to easy JavaScript reporting from RaphaelJS, the plain old label/value charting couldn’t be much quicker.

As discussed in Part 1, I reached a certain point with MongoDB, and decided that rather than fussing with things I’d move over to Redis and see how things went. The first thing was to change the model: rather than using the mongoid “:field” definitions, for Redis the model becomes a simple PORO (Plain Old Ruby Object). I chose to borrow a nice initialization technique from StackOverflow here so that I didn’t have to hand-code all of the attributes, but basically my initialize() method just sets the attributes and then creates a Redis connection via @redis = Redis.new. So the changes to the model were easy. The harder part was working out how the relationships between objects would work.

Rather than the document-style storage of MongoDB, Redis is purely based on key-value, but with more advanced data structures placed on the top. For my purposes, after some fantastic answers from Salvatore (author of Redis) and folks on the redis mailing list, I worked out how to use Sets to access the data in the ways I needed. So let’s say we have three books, ISBN numbers 123, 456, and 789. Book 123 references book 456, and book 789 references both 123 and 456. We have two authors, “Matsumoto,Yukihiro” who wrote 123 and 456, and “Flanagan,David” who wrote 456 and 789. How do we handle this in a key-value store? By using Sets:

  • Create entries for each book, with key pattern “book:#:data”. The value is a JSON string of data like title, price, etc (see below for note on this).
  • Create a set called “books” which contains the number of every book.
  • Create sets called “backrefs:#” that contain the numbers of the books that reference book #.
  • Create a set called “authors” which contains all of the authors.
  • Create sets called “author:name” that contain the numbers of books written by the author.

Using the set operations in Redis, then, I can display all of the books by using the “books” set; I can display all of the books by a given author by using the “author:name” set; I can display all of the books that reference a given book by using the “backrefs:#” set. In the latter case, you might be thinking that I could just keep an array in the JSON string. Yes, that could work, but I wouldn’t be able to use some of the other interesting set operations, such as intersections to determine references for a given author, for example. Note that right now, since an author is just a name, there actually is no longer any Author model! If I add more meta-data about authors in the future, I can add that easily.
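Here is a small sketch of that layout for the three example books, again using Ruby's Set as an in-memory stand-in for Redis sets (in Redis the intersection would be an SINTER call). The :authors and :references field names are just for this illustration.

```ruby
require 'set'

# In-memory stand-in for Redis sets, keyed by the set names used above.
sets = Hash.new { |h, k| h[k] = Set.new }

# The three example books: who wrote them and which books they reference.
books = {
  123 => { :authors => ["Matsumoto,Yukihiro"], :references => [456] },
  456 => { :authors => ["Matsumoto,Yukihiro", "Flanagan,David"], :references => [] },
  789 => { :authors => ["Flanagan,David"], :references => [123, 456] }
}

books.each do |number, info|
  sets["books"] << number
  # backrefs:<ref> collects the books that reference <ref>
  info[:references].each { |ref| sets["backrefs:#{ref}"] << number }
  info[:authors].each do |name|
    sets["authors"] << name
    sets["author:#{name}"] << number
  end
end

# Which books reference 456, and which of those were written by Flanagan?
citing_456 = sets["backrefs:456"]
flanagan_citing_456 = sets["author:Flanagan,David"] & sets["backrefs:456"]
```

The last line is the kind of query that an array buried inside a JSON string could not answer without loading and parsing every book.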

About that JSON string: this has advantages and disadvantages that I’m still considering. Some would say that every individual attribute (or “column” in RDBMS-speak) should be a separate key-value pair. In that approach, for example, if I have a book title and price, I’d have book:123:title => “The Ruby Programming Language” and book:123:price => “39.99”. Obviously I can then do things like add a book to sets like “Under $50” by adding the price item to the set. The big advantage noted by some is that attributes can be added instantly by just saving a new key. Using a JSON string, adding an attribute requires reading/updating all of the existing keys. On the other hand, it is tidy to have a single key, and working with JSON is easy. For the time being, I’m giving it a try by using “book:123:data” to store the “data” about the book, and separating out certain attributes if it makes sense to use them in other data structures like sets and lists. Is this the best of both worlds or the worst of both? I’m not sure yet.
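The trade-off can be seen in a few lines. This sketch uses a plain Hash as a stand-in for Redis string keys; the “format” attribute and the second book are made up purely for illustration.

```ruby
require 'json'

store = {}  # plain Hash standing in for Redis string keys

# Style 1: one JSON string per book.
store["book:123:data"] = { "title" => "The Ruby Programming Language",
                           "price" => "39.99" }.to_json

# Adding an attribute means read, parse, modify, and write back.
data = JSON.parse(store["book:123:data"])
data["format"] = "paperback"             # made-up attribute for illustration
store["book:123:data"] = data.to_json

# Style 2: one key per attribute; adding an attribute is a single write.
store["book:456:title"]  = "Another Book"   # made-up book
store["book:456:format"] = "paperback"
```

With the JSON style, every existing book needs that read/modify/write cycle when an attribute is introduced; with per-attribute keys, only the new key is written.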

A quick note here before getting into the code: I did this using the redis-rb plugin, which has a lot of functionality but is definitely lacking in documentation. However, the code is extremely clear and easy to read through, so I strongly recommend reading through it, particularly the main lib/redis.rb file. Using it is pretty much just a matter of installing the plugin and then calling Redis.new.

So, my save() method looks like this:

def save
  book_key = "book:#{number}:data"
  @redis[book_key] = json_data          # creates JSON string
  @redis.set_add "books", number        # add to global books set
  if (back_references)
    back_references.each do |ref|
      @redis.set_add "backrefs:#{ref}", number
    end
  end
  if (authors) then
    authors.each do |a|
      a = CGI::escape(a)
      @redis.set_add "authors", a       # add to global authors set
      @redis.set_add "author:#{a}", number
    end
  end
end

Improvements to be made here include handling the author names in a better way; doing a CGI::escape works, but a proper hash would be better. During prototyping, the escaping is nice because I can go in with the redis client and see human-readable names, but it makes the keys too long in my opinion.
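One possible shape for the hashed version, as a sketch only: use a digest of the name in the key and keep the readable name in a separate lookup. The author_id helper and the lookup Hash are hypothetical, SHA1 is just one reasonable choice of digest, and plain Ruby objects stand in for Redis.

```ruby
require 'digest/sha1'

# In-memory stand-ins for Redis string keys and sets.
names = {}                                  # digest => original author name
sets  = Hash.new { |h, k| h[k] = [] }

# Hypothetical helper: a fixed-length identifier for an author name.
def author_id(name)
  Digest::SHA1.hexdigest(name)
end

["Matsumoto,Yukihiro", "Flanagan,David"].each do |name|
  id = author_id(name)
  names[id] = name                          # keeps the readable name recoverable
  sets["author:#{id}"] << 456               # both authors wrote book 456
end

key = "author:#{author_id('Flanagan,David')}"
```

Every key is then the same length regardless of the name, at the cost of no longer being readable from the redis client without the lookup.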

So now the index() action in the Books controller looks like this:

  def index
    redis = Redis.new
    @entries = redis.set_count 'books'
    @pager = Paginator.new(@entries, 20) do |offset, per_page|
      redis.sort('books', { :limit => [ offset, per_page ], :order => "alpha asc" })
    end
    @keys = @pager.page(params[:page])

    @books = {}
    @keys.each do |k|
      @books[k] = redis["book:#{k}:data"]
    end
  end

Here we get a redis connection, and use Paginator to do its thing — we have to get a count of the set, and then we use sort. This is a big part of the magic, and something that took me some time to work out. The sort command in redis (doc here) is the entry point to doing a lot of the interesting operations once you have things in a set. You’ll notice that in the save() method, all I do is add the book number to the set, not the actual key. That’s much more efficient (Redis is especially good with integers), and is enough. In the case above, all it does is call sort on the “books” set, with the “limit” and “order” options — “limit” as shown takes an offset and number of entries to return, which makes pagination a cinch. For “order” you’ll see that I use “alpha asc” which might seem confusing here since we’re dealing with numbers. In my actual use case the “numbers” can have alphanumerics, and I decided to leave this here because it’s a useful variant to see. In reality, the default for the sort command is ascending numeric so you wouldn’t need to even specify the option here.
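To show what the “limit” and “alpha” options amount to, here is a plain-Ruby sketch of the same paging arithmetic over an in-memory list. The page_of helper is hypothetical and only imitates the sort behaviour; it is not how Redis or Paginator implement it.

```ruby
# Hypothetical helper imitating SORT with LIMIT offset count, with or without ALPHA.
def page_of(members, page, per_page, alpha)
  sorted = alpha ? members.sort_by(&:to_s) : members.sort_by { |m| m.to_f }
  offset = (page - 1) * per_page          # page 1 starts at offset 0
  sorted[offset, per_page] || []          # empty list when past the end
end

numbers = ["10", "9", "100", "2"]

alpha_page_one   = page_of(numbers, 1, 2, true)
numeric_page_one = page_of(numbers, 1, 2, false)
numeric_page_two = page_of(numbers, 2, 2, false)
```

The alphabetical ordering puts “10” and “100” ahead of “2”, which is why the choice of option matters once the “numbers” are not purely numeric.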

Once the keys are retrieved, then I iterate on each one and get the actual data. This is very quick with Redis, but still not ideal. Redis supports an MGET command to retrieve multiple items in a single command, but it doesn’t return the keys, which would mean I’d have the data but not know which book number each one goes with. The redis-rb library provides a great mapped_mget() method, but at the moment it doesn’t support passing in an array. I would have to iterate each key and build a string of them. Presumably a fix can be made to accept an array, in which case this can all be collapsed down to a one-liner: @books = redis.mapped_mget(@keys). (By the way, in case you’re wondering why @keys is an instance variable, it’s because it contains Paginator metadata like page number, to display in my view).
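Since MGET returns its values in the same order as the keys that were requested, the pairing can also be done by hand. A minimal sketch, with a Hash standing in for Redis and made-up book data:

```ruby
# Hash standing in for Redis string keys; the book data is made up.
store = {
  "book:123:data" => '{"title":"First"}',
  "book:456:data" => '{"title":"Second"}'
}

keys = ["123", "456"]

# Stand-in for an MGET reply: values come back in the order the keys were asked for.
values = keys.map { |k| store["book:#{k}:data"] }

# Pair each book number with its data.
books = Hash[keys.zip(values)]
```

This relies only on the ordering of the reply, so it would work with any client that exposes a plain multi-get.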

Hopefully it’s pretty obvious that showing a book is pretty straightforward:

    book_data = redis["book:#{@book_number}:data"]
    if (book_data)
      @book = JSON.parse(book_data)
    end

Also simple, here’s the code to get the list of books which reference the current book — that is, the books that have the current book as one of their backward references:

      references = redis.sort("backrefs:#{number}")

That’s pretty easy, isn’t it? Obviously you can add in an “order” option and even a “limit” if necessary. More interesting, here we get the list of authors, with the list of books written by each:

    alist = redis.sort("authors", { :order => "alpha asc" })
    @authors = {}
    alist.each do |a|
      @authors[CGI::unescape(a)] = redis.sort("author:#{a}")
    end

First we do an initial call to sort to get the authors, sorted in ascending alphabetical order (note that this will be a little undependable given my current implementation since the names are CGI::escaped). Then we iterate each one and do a further sort to get each one’s books. This is fine, but it just returns the number of each book by the author: the key, not the value. Do we have to iterate yet again and do a third call to get the data for each book? Not at all, and this is one of the magic bits of the Redis sort command. Instead of the above sort call, we can ask sort to return the values to us rather than the keys. Using the redis client, the difference is like so:

$ ./redis-cli sort authors:Smith%3B+Bob limit 0 5
1. 123456789
2. 465768794
3. 344756635
4. 436485606
5. 347634767

$ ./redis-cli sort authors:Smith%3B+Bob limit 0 5 get book:*:data
1. {"title":"My Book","price":"19.99"}

The second command, as you can see, adds a “get” option. This is a somewhat magic option that instructs Redis to get the values of the keys matching the pattern provided. So what happens, in a sense, is that Redis does the sort, and gets the keys. It then takes each key, plugs it into the pattern, and does a get. So for the first element the sort is in effect augmented with a “get book:123456789:data”, and so on for the others, and the results are returned. This is all done on the Redis side, very quickly indeed. It is, clearly, extremely powerful. So if we change our code to get the data for the list of books, rather than just the keys:

    alist = redis.sort("authors", { :order => "alpha asc" })
    @authors = {}
    alist.each do |a|
      books = []
      a_data = redis.sort("author:#{a}", { :get => "book:*:data" })
      if (a_data)
        a_data.each do |data|
          books << (JSON.parse(data))
        end
      end
      @authors[CGI::unescape(a)] = books
    end

With this, my controller is passing @authors to the view, which is a Hash keyed off the unescaped author names. The value of each entry in the Hash is an Array of book data (each element is itself a Hash, created by the JSON.parse call). In the view, I can do something like this rather silly example:

<% @authors.keys.sort.each do |author| %>
  <% books = @authors[author] %>
  <tr class="<%= cycle("even", "odd") -%>">
    <td><%= author %></td>
    <td>
      <% if (books.length > 0) -%>
        <%= books.length %> :
        <% books.each do |b| -%>
        (<%= truncate(b["title"], :length => 25) %>) |
        <% end -%>
      <% else -%>
        0
      <% end -%>
    </td>
  </tr>
<% end %>

This page simply iterates through the authors, and for each one it displays the number of books they’ve written, and the first 25 characters of each title. If they didn’t write any books, it shows a zero.

There is one problem here, and it’s one that I’m working on a solution for: the “sort” with “get” is very cool, but it returns the value of each entry instead of the key. That means that in the above view, I have access to the book’s title, price, etc — but NOT the number! That’s because the number is embodied in the key. This is obviously a problem, since I need to display the book number. Right now, I’m working around this by storing the number in the JSONified data, but that’s not the right thing to do. Ideally, there would be a way to have the “sort get” return the key along with the data, though I’m not certain what that would look like. Alternately, the app can get the keys, and use them to do an MGET for the data. We’ll see.
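The pattern substitution that “sort get” performs, and one way the key could travel with the data, can be sketched in plain Ruby. Current Redis documentation describes a special “GET #” pattern that returns the sorted element itself alongside other GET patterns; whether that is available in the version I am running is something I have not checked, so treat this as an illustration only. The sort_get helper is hypothetical and uses a Hash in place of Redis.

```ruby
# Hash standing in for Redis string keys; the book data is made up.
store = {
  "book:123:data" => '{"title":"First"}',
  "book:456:data" => '{"title":"Second"}'
}

# Hypothetical imitation of SORT with one or more GET patterns:
# sort the members, then for each member resolve every pattern in turn.
# "#" yields the member itself; otherwise the first "*" is replaced by the member.
def sort_get(members, patterns, store)
  members.sort_by { |m| m.to_f }.flat_map do |m|
    patterns.map { |p| p == "#" ? m.to_s : store[p.sub("*", m.to_s)] }
  end
end

values_only = sort_get(["456", "123"], ["book:*:data"], store)

# With "#" as an extra pattern the reply is a flat list: number, data, number, data...
flat  = sort_get(["456", "123"], ["#", "book:*:data"], store)
pairs = flat.each_slice(2).to_a
```

Slicing the flat reply into pairs gives each book number next to its data, which would remove the need to duplicate the number inside the JSON.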

In any case, we’re now able to display the books and the authors, approaching the objects from either direction to access the others. I’ll post more and/or update this post as I experiment further, but I hope this and the first part serve as a useful introduction to people interested in exploring MongoDB and Redis. For my purposes, I plan to continue forward with Redis rather than MongoDB, but as I’ve shown, they’re not at all the same thing — I can easily see cases where MongoDB might be a better fit. It’s clearly worthwhile to do quick prototyping to make sure you understand your problem set, and then see what the best tool is. One of the most exciting things about the so-called “NoSQL” data stores is that developers now have more tools to work with. If I get the time, I hope to play with Cassandra and Tokyo Cabinet to see how they might fit in. It’s always great to have more options in the tool box.