Tag Archives: ruby

I’ve recently started using Mailgun, which I like quite a bit, but I stumbled on an issue dealing with attachments, because the files I needed to attach are stored in S3. Using RestClient to send the emails, the expectation is that attachments are files. As it turns out, using the aws-sdk gem, S3 objects don’t quite behave like files, so it doesn’t work to simply toss the S3Object instance into the call.

The standard setup for sending an email is like the following:

 data = Multimap.new
 data[:from] = "My Self <no-reply@example.com>"
 data[:subject] = "Subject"
 data[:to] = "#{recipients.join(', ')}"
 data[:text] = text_version_of_your_email
 data[:html] = html_version_of_your_email
 data[:attachment] = File.new(File.join("files", "attachment.txt"))
 RestClient.post "https://api:#{API_KEY}"\
   "@api.mailgun.net/v2/#{MAILGUN_DOMAIN}/messages",
   data

As this shows, the attachment is expected to be a File that can be read in. So the challenge was to make an S3 object readable. One option, of course, is to do this in two steps: read in the S3 object and use Tempfile to write a file which can then be used as the attachment. This seemed pretty unfortunate. For one thing, I’m running stuff on Heroku, and try to avoid using the file system even for temp files. But primarily, it’s really wasteful to have to write, and then read, a transitory file. The better option, of course, was to see if there was a way to trick the client into reading from S3.

Thanks to some very nice help from Mailgun support (thanks Sasha!), the idea of writing a wrapper seemed feasible, and in fact it wasn’t too bad aside from a couple of tricky issues. A side-effect advantage was that it solved another problem: the naming of the attachment. By default, the name of the attachment is the name of the file, which is pretty ugly if you use a temp file. Not user-friendly.

I’ve put the wrapper file in a GitHub Gist at https://gist.github.com/masonoise/5624266. It’s pretty short, and there were only a couple of gotchas, which I describe below. The key for this wrapper is to provide the methods that RestClient needs: #read, #path, #original_filename, and #content_type. It’s pretty obvious what #read and #path are for. The attachment naming problem is solved by #original_filename: whatever it returns will be the name of the attachment in the email. It should be clear what #content_type does, but see below for why it’s important.

Using the wrapper is described in the header comment, but it’s mainly a change to give RestClient the wrapper instead of a File object:

data[:attachment] = MailgunS3Attachment.new(file_name, file_key)

The first gotcha was that RestClient calls #read repeatedly for 8124 bytes, and doesn’t pass a block. This forced me to write a crude sort of buffering — the wrapper reads the whole S3 object in, then hands out chunks of it when asked. This isn’t a problem for me because the files I’m dealing with aren’t very large, but it’s something I warn about in the comments. If you have large files, this may be a problem for you, so beware.
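To make the buffering idea concrete, here is a minimal sketch of that kind of wrapper. It is not the code from the gist: the class name BufferedAttachment is made up, and the object body is passed in as a String rather than fetched from S3, so the example runs on its own. It leans on StringIO to hand out chunks and to return nil once everything has been read, which is the end-of-file signal a chunked reader expects.

```ruby
require 'stringio'

# Hypothetical stand-in for the S3 wrapper: the whole body is held in memory
# and served in chunks, so it is only suitable for small files.
class BufferedAttachment
  attr_reader :original_filename, :content_type

  def initialize(original_filename, body, content_type = 'application/octet-stream')
    @original_filename = original_filename
    @content_type = content_type
    @buffer = StringIO.new(body)
  end

  # Returns at most `length` bytes per call, and nil once the buffer is
  # exhausted (when a length is given), just as IO#read does at end of file.
  def read(length = nil)
    @buffer.read(length)
  end

  # The caller only needs something path-like here; reusing the file name
  # is an assumption made for this sketch.
  def path
    original_filename
  end
end

attachment = BufferedAttachment.new('report.txt', 'a' * 20_000, 'text/plain')
chunk_sizes = []
while (chunk = attachment.read(8124))
  chunk_sizes << chunk.size
end
```

Reading the 20,000-byte body in 8124-byte requests takes three calls, and the fourth returns nil and ends the loop.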

The second gotcha that threw me off for a little while is that the value returned by #content_type is important. I haven’t researched exactly why this is, but I found that if I tried to send a Word document but #content_type returns ‘text/plain’, the attachment comes through corrupted. It was easy enough for me to check the filename suffix and set the content type accordingly, but I can imagine cases where this might not work, so this is something else to beware of.
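The suffix check can be as small as a lookup table. The sketch below is one way to do it, not the gist's code: the constant CONTENT_TYPES, the method name content_type_for, and the choice of extensions are all assumptions, and anything unrecognised falls back to a generic binary type rather than ‘text/plain’.

```ruby
# Hypothetical extension-to-MIME-type table; extend it for the file types you send.
CONTENT_TYPES = {
  '.txt'  => 'text/plain',
  '.pdf'  => 'application/pdf',
  '.doc'  => 'application/msword',
  '.docx' => 'application/vnd.openxmlformats-officedocument.wordprocessingml.document'
}.freeze

# Picks a content type from the file name's suffix, ignoring case.
# Unknown or missing suffixes get a generic binary type.
def content_type_for(filename)
  CONTENT_TYPES.fetch(File.extname(filename).downcase, 'application/octet-stream')
end

word_type    = content_type_for('Quarterly Report.DOC')
unknown_type = content_type_for('mystery_file')
```

This is exactly the case where the approach can fail: a file with a missing or misleading suffix gets the fallback type, so it is worth checking how your mail client handles that.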

Anyway, this solved the issue for me, and hopefully it’ll be useful for others. There are ways to make this a bit more elegant, but it’s a short piece of code that works. Enjoy.

Recently we wanted to add a nice-looking timeline view to some data in a web app, and looked around for good Javascript libraries. Perhaps the coolest one is Timeline JS from Verite, but while gorgeous it’s also super heavyweight, and pretty much demands a page all to itself. We wanted something more economical that could share the page with some other views, and decided on Timeline-Setter, which creates pretty little timelines with just enough view area to provide important information about each event.

However, as-provided, Timeline-Setter wants to exist as a command-line utility. You get a data file ready, run the utility, and it generates a set of HTML (and associated) files that you can drop into an app. That’s dandy if you have a set of data that doesn’t change often, or you want to perhaps run a cron to pre-generate a bunch of timelines. We needed something that could generate a timeline for a dynamic set of data, so we weren’t sure Timeline-Setter would work for us. However, looking it over I thought it seemed potentially usable in a dynamic way. I generated a static example using our data, read through what it had created, and deconstructed what it wanted in order to display the timeline, then wrote some code to dynamically generate the JSON data necessary. It wasn’t too difficult, and fairly shortly we had dynamic timelines going. I wanted to share the info here since it’s a pretty nice library that others should get a lot out of.

We’re using Jammit to handle our static assets, so we simply put “public/javascripts/timeline-setter.js” and “public/stylesheets/timeline-setter.css” into our assets.yml file, but you can use whatever standard approach you take to including JS and CSS into your pages. Once that’s done, you’re ready to go.

Timeline-Setter takes a pretty standard approach to placing itself in the page: it takes the ID of a DIV, and that’s the container which will hold the timeline. One note: we needed to include multiple timelines in a single page, so we had to do a little creative naming of the DIVs that hold the timelines, as you’ll see below.

<div id="timeline-<%= author.id %>" class="timeline_holder"></div>
<script type="text/javascript">
    $(function () {
      var currentTimeline = TimelineSetter.Timeline.boot([<%= author.make_timeline_json %>], {"container":"#timeline-<%= author.id %>", "interval":"", "colorReset":true});
    });
</script>

The code here creates the DIV, gives it an ID, and then places a Javascript call to the Timeline-Setter boot() function, which tells it to generate the timeline. The first parameter is the JSON holding the data for the timeline; the second is a set of options, passed in as JSON. “container” of course is the ID of the DIV which will contain the timeline. Other options include “interval”, “formatter”, and “colorReset” among others. See the library’s page as listed at the beginning for details of the API, and in particular see the section headed “Configuring the Timeline JavaScript Embed” for the basics of calling the boot() function.

Now of course we need the make_timeline_json() method, which will take our object’s data and create the JSON needed by Timeline-Setter. As an example here, let’s pretend that we’re showing a timeline of books written by an author over the years.

class Author < ActiveRecord::Base
  def make_timeline_json
    timeline_list = []
    if (!birth_date.nil?)
      timeline_list << "{'html':'','date':'#{birth_date}','description':'Birth Date:','link':'','display_date':'','series':'Events','timestamp':#{(birth_date.to_time.to_i * 1000)}}"
    end
    books.each do |book|
      author_list = book.authors.map { |auth| "#{auth['name']}" }.join('; ')
      timeline_list << "{'html':'Authors: #{author_list}<br/>Publisher: #{book.publisher}','date':'#{book.pub_date}','description':'#{book.title}','link':'','display_date':'','series':'Publications','timestamp':#{(book.pub_date.to_time.to_i * 1000)}}"
    end
    if (!death_date.nil?)
      timeline_list << "{'html':'','date':'#{death_date}','description':'Death Date:','link':'','display_date':'','series':'Events','timestamp':#{(death_date.to_time.to_i * 1000)}}"
    end
    return "#{timeline_list.join(',')}"
  end
end

Essentially, this method creates a JSON string containing a set of entries, each representing an event to place on the timeline. Each entry has several fields: html, date, description, link, display_date, series, and timestamp. Not all of these are used here, but with the basics you can experiment further. The important fields are:

  • html: This will be displayed in the event’s pop-up when clicked on.
  • description: Just what it says.
  • link: an optional URL which will be associated with a link in the event pop-up.
  • series: which “series” this event belongs to; see below for details on this.
  • timestamp: This is the timestamp associated with the event, used to construct the timeline in order.

A note about the “series” parameter: one very nice feature of Timeline-Setter is that you can display more than one set of events in a single timeline. Each set of events is called a “series”. In our example we’re creating two series: “Events” and “Publications”. Each will be shown in a different color, with a title (so the names of the series need to look nice, as they’ll be displayed) and a checkbox so that a viewer can hide and show each series individually. It’s extremely useful.

In the code above, you’ll see that we create the “Birth Date” and “Death Date” events individually, but in the middle we iterate over the books associated with this author. For each book we build a string of authors, semicolon-delimited, just to demonstrate one way to include another list of information in an event’s HTML. I didn’t dig into why the timestamps need to be multiplied by 1000, but it’s almost certainly because Ruby’s Time#to_i returns seconds since the epoch, while JavaScript dates are measured in milliseconds.
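One weakness of building the entries by string interpolation is that a title containing an apostrophe would break the generated JavaScript. As a hedged alternative sketch, the same entries can be built as Ruby hashes and serialised with the standard json library, which takes care of quoting. The helper name timeline_entry and the sample data are made up for illustration; the field names are the ones used above.

```ruby
require 'json'
require 'date'

# Builds one timeline event as a Hash; to_json handles quoting and escaping.
def timeline_entry(description, date, series, html = '')
  {
    'html' => html,
    'date' => date.to_s,
    'description' => description,
    'link' => '',
    'display_date' => '',
    'series' => series,
    'timestamp' => date.to_time.to_i * 1000 # seconds to milliseconds
  }
end

# Made-up sample data; in the app these values would come from the Author model.
entries = [
  timeline_entry('Birth Date:', Date.new(1975, 4, 14), 'Events'),
  timeline_entry("The Author's First Book", Date.new(1995, 12, 1), 'Publications',
                 'Authors: A. Writer; B. Coauthor<br/>Publisher: Example Press')
]

timeline_json = entries.to_json
```

Note that to_json on an Array already includes the surrounding square brackets, so a view using this version would pass the result to boot() directly instead of wrapping it in [ ].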

And there you are. Hopefully anyone needing an economical, nice-looking timeline with dynamically-generated data can take advantage of this. But certainly, if you can work with static (or infrequently-updated) data, you may be able to use Timeline-Setter out of the box via a cron job or rake task — you could generate the CSV file for its command-line interface, run it, then copy the resulting files into your application. If you need dynamic timelines, though, I hope this post is helpful.

I’ve done some work integrating Rails apps with Salesforce over the past few years, and have been very happy to see the new databasedotcom gem take the place of the community’s older activesalesforce gem. Thanks to work from Heroku and Pivotal Labs, it’s now very easy to push and pull data between a Rails app and your Salesforce organization.

I wrote up an article about using the gem, which is now available on the DeveloperForce site. You can go and check it out at http://wiki.developerforce.com/page/Accessing_Salesforce_Data_From_Ruby. I hope it helps get you started if you’re finding a need to do this sort of work. So many companies rely on Salesforce now for at least their sales pipeline that it can be extremely useful to do things like extract data to show in an internal Rails dashboard, or do more complex reporting. In our case we’re also sourcing data from external places and pushing it into our Salesforce organization so that our sales/support folks can have easy access to it within their Salesforce world.

 

Having the need to profile a rake task in order to figure out why it was taking so long, I decided to take the opportunity to check out the perftools.rb gem. It proved to be interesting, though I’m still working out the best way to get useful information from it. Getting it running on my MacBook Air running Lion (OS X 10.7) was a little involved so I thought I’d write up a post here both for my own memory as well as to help anyone else who might be interested. And it’s been a while since my last post because things have been so busy at work. Time to get back to writing things here.

The perftools.rb gem is of course on GitHub: here. Because I needed to profile a rake task rather than part of our Rails app, I couldn’t use the rack-perftools_profiler, so I had to do it a bit more manually. First thing of course was to add the gem to the Gemfile:

    gem "perftools.rb", "~> 2.0.0"

After a bundle install, I was good to go there. Next I had to put the profiling around the block of code I wanted to investigate:

    require 'perftools'	# At the top of the class

    PerfTools::CpuProfiler.start("/tmp/my_method_profile")
    my_method_call()
    PerfTools::CpuProfiler.stop

It’s pretty easy. I ran my code, and then I had the file /tmp/my_method_profile. But what to do with this binary file? Well, the simplest option is to run pprof.rb on it:

pprof.rb --text /tmp/my_method_profile

That will output a bunch of information, basically a list of the “samples” the profiler took in descending order of percentage of samples in the function — that is, the first entry is the one which was “seen” the most by the profiler. I recommend taking a look at the pprof documentation here.

That textual information is surely useful, but I hoped for more. Using callgrind output seemed very interesting, but I couldn’t use kcachegrind on my Mac. Thanks to this gist I was able to get qcachegrind, a Qt version, running. It’s not quite 100% happy with OS X 10.7 but it works. You’ll need Xcode installed, then follow the directions in that gist. I installed the latest “libs only” version of Qt, 4.8.0. Installing Graphviz was straightforward; you can follow the instructions in the gist or use Homebrew as well. I was then able to do the svn checkout of qcachegrind and build it. Note that the patch gist mentioned to let it open any file is no longer necessary — since that was written, a change to qcachegrind was made to allow it to open any file. Go ahead and do the qmake and make, and you should be good to go.

Okay, so now I could process the profile into callgrind format, and view it:

pprof.rb --callgrind /tmp/my_method_profile > /tmp/my_method_profile.grind
open qcachegrind.app

Using qcachegrind I opened the /tmp/my_method_profile.grind file, and there it was! I could view the call graphs, and get some nice views of what was going on in the code.

If all you want is a straightforward call graph, though, you can also (once you have Graphviz installed) generate a call graph GIF image with “pprof.rb --gif /tmp/my_method_profile > /tmp/my_method_profile.gif”. Open that GIF and you’ll have a very useful view of what was happening in the block of code you’re profiling.

I hope this quick summary is helpful. It’s not always easy to figure out precisely what’s happening in your code, and perftools.rb can help out a great deal.

This took me a while so I thought I should share the solution — however, see the caveat at the end, because there’s an element I haven’t tested yet.

The first requirement here is integrating Devise into Radiant. For the most part, the information at this page will get you there, though I’ll work on a separate post going through the process in detail. Once you have Devise working, then you have a user object, and naturally you’d like to display a “Logged in as…” element in your Radiant layout, right? Not so easy, it turns out.

In my testing I called the Devise model PortalUser since it has to be differentiated from the User model that Radiant uses. I put the authentication stuff into a custom extension, which we’ll call “my_auth”. So, I end up with my_auth_extension.rb:

class MyAuthExtension < Radiant::Extension
  
  SiteController.class_eval do
    include ContentManagement
    prepend_before_filter {|controller| controller.instance_eval {Thread.current[:current_portal_user] = current_portal_user} }
    prepend_before_filter {|controller| controller.instance_eval {authenticate_portal_user! if radiant_page_request?}}
  end

  # activate() method left out for brevity
end

The filter to call authenticate_portal_user! is needed to get Devise working. The other filter is the important one here, and what it does is get the current_portal_user reference in the controller and place it into the current thread for later access. This is the only way I’ve found (so far) to get something from a controller in Radiant to a tag. I’ve tried various instance variable tricks, all sorts of things, with no luck. If anyone has another solution, please do comment below, because yes, this seems like a hack.
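For anyone unfamiliar with the mechanism, here is a minimal plain-Ruby illustration of how a value stashed in Thread.current behaves. There is no Radiant or Devise involved, and the email value is made up. The value is visible later on the same thread but not on any other, which is also why the filter has to set it on every request: a thread reused by the server would otherwise still hold the previous request's user.

```ruby
# Stash a value on the current thread, as the before filter does.
Thread.current[:current_portal_user] = 'alice@example.com' # made-up value

# Later code running on the same thread (such as a tag) can read it back.
same_thread_value = Thread.current[:current_portal_user]

# A different thread has its own storage and sees nothing.
other_thread_value = Thread.new { Thread.current[:current_portal_user] }.value

# Overwriting with nil clears the slot, e.g. for a request with no login.
Thread.current[:current_portal_user] = nil
cleared_value = Thread.current[:current_portal_user]
```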

Now we go create a new tag to display the logged-in user’s email address. In our extension we have lib/user_tags.rb:

module UserTags
  include Radiant::Taggable

  desc "Outputs the user email address"
  tag "user_email" do |tag|
    current_user = Thread.current[:current_portal_user]
    @user_email = current_user.email
    parse_template 'usertags/_email_template'
  end

  private

    def parse_template(filename)
      require 'erb'
      template = ''
      File.open("#{MyAuthExtension.root}/app/views/" + filename + '.html.erb', 'r') { |f|
        template = f.read
      }
      ERB.new(template).result(binding)
    end
end

First, let me give credit for the parse_template() method to Chris Parrish in this post. This tag simply gets the user object from the thread, and sets @user_email accordingly, which can then be used by the ERB template. parse_template() grabs the partial using the filename passed in, and renders it, which ends up being output by the tag. The partial, which lives in your extension as app/views/usertags/_email_template.html.erb, is simply:

<%= @user_email %>

So there’s nothing to that, really. If you modify your Radiant layout to include Logged in as: <r:user_email /> then you should be all set.

At the beginning I mentioned a caveat. I have not tested this yet to see what the effects of Radiant’s caching are — I am assuming that the tag contents will not be cached and thus all is well, but we will see. I’ve been bitten by the caching before in unexpected ways.

Anyway, I hope this helps someone out.

Here’s an oddity that we just ran across this afternoon, and it’s a nice bit of Ruby trivia to break the bit of silence here on the blog. It has to do with using an ordinary each on a Ruby array, and what happens if you change the array while you’re iterating. Here’s an example in irb:

irb(main):004:0> s = [1, 2, 3, 4, 5]
[
    [0] 1,
    [1] 2,
    [2] 3,
    [3] 4,
    [4] 5
]
irb(main):005:0> s.each do |i|
irb(main):006:1* puts "> #{i}"
irb(main):007:1> s.delete(i) if (i == 3)
irb(main):008:1> end
> 1
> 2
> 3
> 5
[
    [0] 1,
    [1] 2,
    [2] 4,
    [3] 5
]

As you can see, we first create an array containing the numbers 1 through 5. Then we do a simple iteration, and print out the value of each array element. If it’s element 3, then we delete that element from the array, and continue on.

What you might expect to happen is that it will print all of the entries, 1 through 5, and at the end the array will simply be missing element 3. That would make sense. However, it’s not what happens. As shown above, instead we end up skipping element 4 altogether!

While unexpected, there is some logic to why this happens. While we iterate, Ruby is essentially holding a pointer, or an offset, into the array. When we’re on element 3, the offset is 2 (counting from zero). When we delete element 3, the array shrinks, but the offset clearly is left at 2. When we continue our iteration, Ruby increments the offset to 3, which ends up pointing to the value “5” because we deleted one element. That means that we end up skipping the value “4”.

We happened to encounter this problem while working in one of our models in a method that needed to delete all the elements in an associated model, and we had code that basically did the following:

other_models.each {|m| other_models.delete(m)}

The test data had three other_model entries, but every time it deleted the first and the third, leaving the second one. As we discovered, that’s because deleting the first one meant that the array shrank, and the each ended up skipping the second element.
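Here is a small plain-Ruby sketch that reproduces the skip and shows two common ways around it: iterating over a copy, or letting the array do the deleting itself with delete_if. It uses ordinary arrays and made-up variable names. An ActiveRecord association is not a plain Array, so treat this as an illustration of the idea rather than a drop-in fix for the model code above.

```ruby
# Reproduce the problem: deleting during each makes the iteration skip 4.
numbers = [1, 2, 3, 4, 5]
visited = []
numbers.each do |i|
  visited << i
  numbers.delete(i) if i == 3
end

# Workaround 1: iterate over a copy, and delete from the original.
safe_numbers = [1, 2, 3, 4, 5]
safe_visited = []
safe_numbers.dup.each do |i|
  safe_visited << i
  safe_numbers.delete(i) if i == 3
end

# Workaround 2: let the array remove the matching elements itself.
items = %w[first second third]
items.delete_if { |_item| true } # removes every element
```

With the copy, every element is visited and the original still ends up without the 3. With delete_if, all three items are removed instead of every other one.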

It’s hard to say whether this should be considered a Ruby bug or not, but I’m feeling a bit inclined to say it is…

I stumbled on this in a random email list archive, and modified it slightly, but I certainly can’t take credit for it. In any case, it’s a really nice way to see what the environment you’re running in looks like so I thought I’d share. For example, I’ll run this in script/console to verify things like the Rails and Ruby version. It can be quite handy:

>> Object.constants.sort.each {|c| cv=Object.const_get(c); print c, "=", cv, "\n" unless Module === cv}; true
ARGF=ARGF
ARGV=
CROSS_COMPILING=nil
ENV=ENV
FALSE=false
NIL=nil
PLATFORM=x86_64-linux
RAILS_CACHE=#<ActiveSupport::Cache::MemoryStore:0x2aaaaf8b71d8>
RAILS_DEFAULT_LOGGER=#<ActiveSupport::BufferedLogger:0x2aaaaf8caee0>
RAILS_ENV=production
RAILS_GEM_VERSION=2.3.5
RAILS_ROOT=/opt/rpx/app/releases/20100525230203
RAILTIES_PATH=/usr/local/lib/ruby/gems/1.8/gems/rails-2.3.5/lib/..
RELATIVE_RAILS_ROOT=/opt/rpx/app/releases/20100525230203/config/..
RELEASE_DATE=2010-01-10
RPM_CONTRIB_LIB=/usr/local/lib/ruby/gems/1.8/gems/rpm_contrib-1.0.10/lib
RUBY_COPYRIGHT=ruby - Copyright (C) 1993-2010 Yukihiro Matsumoto
RUBY_DESCRIPTION=ruby 1.8.7 (2010-01-10 patchlevel 249) [x86_64-linux]
RUBY_PATCHLEVEL=249
RUBY_PLATFORM=x86_64-linux
RUBY_RELEASE_DATE=2010-01-10
RUBY_VERSION=1.8.7
STDERR=#<IO:0x2aaaaab15ab0>
STDIN=#<IO:0x2aaaaab15b00>
STDOUT=#<IO:0x2aaaaab15ad8>
TOPLEVEL_BINDING=#<Binding:0x2aaaaab08f90>
TRUE=true
VERSION=1.8.7
=> true

I put the “true” at the end just to prevent it from spewing out the object info.

For the new project I’m working on, after doing some initial very simple prototyping using MySQL (mainly because I could get from 0 to somewhere very quickly with ActiveScaffold and a few simple migrations), I started to look at alternate data stores. There are real reasons given the type of data being managed, but I have to admit that at least some of it was my desire to get a bit of hands-on experience with some of the new kids on the block, too. After exploring the alternatives, I settled on doing some prototyping with both MongoDB and Redis. There are obviously others that are equally interesting, particularly Cassandra, but there simply isn’t time for everything! I selected Redis because I’d already done some playing with it, understood its basic concepts, and felt that its support for sets would be valuable for what I’m working on. I chose MongoDB as another option after doing some reading on it and finding it to be an interesting combination of key-value with some relational-style support. I also thought the mongoid gem was a nice bit of work that would be pleasant to use.

I want to note that I purposely did not call this “MongoDB vs Redis” — they’re different tools, and have different uses, which is one of the things I hope will be clear from these posts. This isn’t a competition, but just a summary of my experiments in looking at how I might approach my needs using the two.

The “problem” to be solved

I’m not at liberty to divulge the details of what I’m working on, so I have a sort of parallel-world simulation of the problem that replicates the types of issues I have to take care of. The idea, then, is to model a reference library, where we have Books and Authors. A Book can have multiple Authors, while an Author may have written multiple Books, so in a relational schema there would be a many-to-many relationship between them. In addition, a Book can contain references to other Books. We want to build a web app that will:

  • Show all of the Books
  • Show all of the Authors
  • For a Book, show all of the Authors
  • For a Book, show all of the Books that it references
  • For a Book, show all of the Books that reference it
  • For an Author, show all of the Books they’ve authored

MongoDB

As I mentioned above, I liked the look of the mongoid plugin to work with MongoDB, though I did do an initial pass using MongoMapper as well. I just felt that mongoid was a bit smoother, had more support for associations, and had somewhat more documentation, but they both did the job. Using Mongoid, my models looked something like this:

class Book
  include Mongoid::Document

  field :number
  field :title
  field :back_references, :type => Array
  field :forward_references, :type => Array
  index :number
  has_many :authors
end

class Author
    include Mongoid::Document

    field :name
    belongs_to :book, :inverse_of => :authors
end

As you can see, much like with ActiveRecord, you simply specify the fields you want persisted, and use a has_many/belongs_to pair to create an association. Do note that instead of extending a class as you would with AR, for mongoid you simply include Mongoid::Document. When I want to create a Book, it goes something like the following, assuming that I have the book number/title and an array of author names:

    the_book = Book.new(
                        :number => book_number,
                        :title => book_title
    )
    authors.each do |a|
      the_book.authors << Author.new(:name => a)
    end
    the_book.save

But what about the references, then? In the Book model above, I have two arrays, back_references (a list of books that reference this one) and forward_references (a list of books that are referenced by this one). Actually, all it takes for these is to create arrays containing the book numbers, assign them to the instance, and save. That’s one of the nice things about MongoDB, as we’ll see: you can query for items in embedded arrays.

A quick note here: I’ve glossed over the setup and configuration of MongoDB here, somewhat on purpose. Once you’ve installed it, if you’re using mongoid there are very clear instructions on setting up your Rails app to use the db so there’s not much need for me to repeat things here. Let’s just say we’re using a db called “books-development” which will then contain our collection, which is called “books”. Wait, shouldn’t we have another collection called “authors” since we have an Author model? Well, no, because the way we set up the has_many/belongs_to it means that Authors are embedded objects within Books. Let’s see what an entry looks like when we persist it. Running the mongo shell:

> db.books.find({number : "1234567890"});
{ "_id" : "4b58f90c69bef38f8f000720", "number" : "1234567890", "forward_references" : [
        "6215628454",
        "63107472345"
], "back_references" : [
        "39848733434",
        "51895763321",
        "5216434662"
], "authors" : [
        {
                "_id" : "4b58f90569bef38f8f000091",
                "name" : "Matsumoto,Yukihiro",
                "_type" : "Author"
        },
        {
                "_id" : "4b58f90569bef38f8f000092",
                "name" : "Flanagan,David",
                "_type" : "Author"
        }
],  "_type" : "Book", "title" : "The Ruby Programming Language" }

From this, you can see that Mongo has assigned “_id” values to each object, the references are both just arrays of book numbers, and the authors have become embedded objects with their own “_id” and “_type” (used by mongoid). As we’ll see in a bit, the fact that the authors are embedded objects is convenient for some purposes, but problematic for others due to the queries I needed to do. For now, though, let’s see what our queries look like for the various activities listed above.

  # Inside books_controller.rb, index action to list the books
  def index
    @entries = Book.count
    @pager = Paginator.new(@entries, 20) do |offset, per_page|
      Book.criteria.skip(offset).limit(per_page).order_by([[:title, :asc]])
    end
    @books = @pager.page(params[:page]) 
  end

  # show action to display a single book's details
  def show
    @book = Book.find(:first,  :conditions => { :number => params[:number] })
  end

Pretty straightforward stuff, even when bringing Paginator into the picture. Being able to chain the criteria with mongoid is a nice bonus to using it. So when a single book is displayed, the page can show the list of author names by simply iterating the array:

  <tr>
    <td class="label">Authors</td>
    <td class="show">
      <% if (@book.authors)
         @book.authors.each do |author| -%>
        <%= author.name %> |
      <% end -%>
      <% else -%>
        None
      <% end -%>
    </td>
  </tr>

The backward references work exactly the same way. However, I discovered while writing the data entry scripts that the forward references (i.e. the books that the current book references) were not available. No problem, I figured, instead of storing that I’ll just query it:

  def referenced_by
    Book.find(:all, :conditions => { :back_references => number }) 
  end

There’s some nice MongoDB magic. Very simply, that will return any Book entry that contains “number” in its “back_references” attribute — even though that attribute is an array! That ability to query for contents of an array comes in very handy, needless to say. As an aside, I came across a reference that I sadly can’t find now to link to it, but it showed me how to add a super simple search. To make the books searchable, I just took the title and the author, did a split(), and created an array containing each word. I called that “search_words” and made it a new array-type attribute. The search is then a simple query:

  def search_books(search_term)
    Book.find(:all, :conditions => { :search_words => search_term }) 
  end

This is obviously a very simplistic search, but given that it takes about 2 minutes to implement, who’s complaining?
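As a rough sketch of how the “search_words” array might be built, the helper below splits the title and author names into words. It is plain Ruby with a made-up method name, and it goes a little beyond a bare split(): it lowercases the words and drops duplicates, which is my own assumption and means the search term would need to be lowercased too before querying.

```ruby
# Builds the list of searchable words for a book from its title and authors.
# Lowercasing and de-duplicating are assumptions added for this sketch.
def search_words_for(title, author_names)
  ([title] + author_names)
    .flat_map { |text| text.downcase.split(/[^a-z0-9]+/) }
    .reject(&:empty?)
    .uniq
end

words = search_words_for('The Ruby Programming Language',
                         ['Matsumoto,Yukihiro', 'Flanagan,David'])
```

The resulting array would be assigned to the book's search_words attribute before saving, so that the array query above can match on any single word.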

The Author problem

So now we come to where I began to find problems with the approach. I wanted to display the list of all authors. Hmm, the authors are embedded documents within the books. Okay, it is possible:

  def get_author_list
    results = Book.criteria.only(:authors)
    author_list = Hash.new
    results.each do |book|
      book.authors.each do |a|
        if (!author_list.has_key?(a))
          author_list[a] = Book.where(:authors => a)
        end
      end
    end
    return author_list
  end

Pretty ugly, ain’t it? It queries all of the books and gets just the authors attribute, then iterates each book, then iterates the authors. For each one, it does a query to get the list of books (so our page can show each author followed by their books), and creates a Hash with key=author, value=books array. This obviously doesn’t do any pagination, which would make it even messier, plus the results aren’t sorted yet. Nope, I didn’t like it.
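For comparison, here is a plain-Ruby sketch of building the same author-to-books index in a single pass over already-loaded data, rather than issuing one query per author. It uses a made-up Struct in place of the Mongoid documents, so it says nothing about how Mongoid would fetch the data. It only illustrates the grouping and sorting step.

```ruby
# Hypothetical stand-in for a loaded book document.
BookRecord = Struct.new(:title, :author_names)

# Groups book titles under each author name in one pass, sorted by author.
def books_by_author(books)
  index = Hash.new { |hash, name| hash[name] = [] }
  books.each do |book|
    book.author_names.each { |name| index[name] << book.title }
  end
  index.sort.to_h
end

library = [
  BookRecord.new('The Ruby Programming Language', ['Matsumoto,Yukihiro', 'Flanagan,David']),
  BookRecord.new('JavaScript: The Definitive Guide', ['Flanagan,David'])
]

author_index = books_by_author(library)
```

This still loads every book into memory, so it does not address pagination. It just avoids the nested queries.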

The alternative seems to be to make authors a first-level document, and link explicitly with book numbers, which isn’t horrible but means, again, multiple queries to get our list of authors with their books. This was beginning to look like it might be too relational a problem for MongoDB to make sense.

Update: as noted in the comment below by module0000, using distinct(“author”) solves this particular problem in a much cleaner way — thanks for the comment! I’ll still stand by the thought that this is really a relational problem and a document database has shortcomings in that regard (and of course strengths in other ways).

So, I set this aside, since as a prototype it did work. I made a new branch (thanks, git) and converted it to use Redis. Which I’ll cover in part 2, shortly.

This is just an odd little note, since it took me a couple of searches to determine what the deal was. I had an XML string that I got from a SOAP call, and I wanted to do a quick conversion to JSON. Since I know the string’s going to be fairly small, the overhead’s not going to be too bad. So I tried to do a simple:

puts Hash.from_xml(response).to_json

Oddly, I got undefined method `from_xml' for Hash:Class (NoMethodError). That’s weird, I thought; I know that method exists. After checking ruby-docs, though, in fact it doesn’t — if you’re not in Rails. I was running this as a standalone test program. Sure enough, I brought up a Rails console and tested it, and there’s the method.

It just took a quick check then to confirm that, indeed, the from_xml() method is added by ActiveSupport. Putting this in:

require 'active_support'

That took care of the problem, although I also had to call to_s on the response so that from_xml received a String.

UPDATE: Thanks to Andrew for his comment below. With ActiveSupport 3, you may need to use this instead:

require 'active_support/core_ext'

Silly little problem, but in case this will help someone out, here it is.

I’ve been writing a little test SOAP client, which turned out, as with so many things, not to be as straightforward as I expected. SOAP rarely is, but unfortunately the service I’m working with doesn’t provide anything as modern as a JSON-emitting REST interface. SOAP it is, then.

First, I quickly discovered that it took some real searching to find anything reasonable in the way of a SOAP client library for Ruby. There’s a bunch of SOAP crap in the Ruby docs, of course: there’s the SOAP module, there’s SOAP::RPC, and tons more. With no documentation whatsoever. I could almost get something to work with that, except that I needed to add an additional header to the SOAP envelope, and…well, there doesn’t seem to be info anywhere on how to do that. Hopeless.

Thankfully, I finally discovered the Savon gem. It’s pretty well-documented and very simple to use. With that, I was quickly able to put together a request…which didn’t work. Hmm. There was some yucky PHP example code for the service I was trying to access, so I ran that with the cool HTTPScoop program going, and looked at the request that it sent, which did work. Some differences there, indeed, and it wasn’t obvious how to fix it. The problem was that the input wasn’t properly specified, and the parameter didn’t have a namespace declared on it. Instead of:

<SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/" xmlns:ns1="http://example.com/webservice/schema/">
  <SOAP-ENV:Header>
    <ns1:API-KEY>foobar</ns1:API-KEY>
  </SOAP-ENV:Header>
  <SOAP-ENV:Body>
    <ns1:DoSimpleRequest>
      <ns1:uniqueId>12345</ns1:uniqueId>
    </ns1:DoSimpleRequest>
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>

I was getting this:

<env:Envelope xmlns:wsdl="http://example.com/webservice/schema/" xmlns:env="http://schemas.xmlsoap.org/soap/envelope/">
  <env:Header>
    <API-KEY>foobar</API-KEY>
  </env:Header>
  <env:Body>
    <wsdl:DoSimple>
      <uniqueId>12345</uniqueId>
    </wsdl:DoSimple>
  </env:Body>
</env:Envelope>

You can see that the input ended up as “DoSimple” instead of “DoSimpleRequest”. My Ruby code was doing response = client.do_simple which of course generated that. But if I changed it to response = client.do_simple_request I got a method missing error. Something about the WSDL wasn’t matching what I needed to send, which was annoying. In addition, the uniqueId parameter didn’t have the namespace prepended to it.

So, what I ended up needing to do with Savon was to specifically provide the input and action, rather than letting it figure them out from the WSDL; and then I needed to force in the namespace declaration for the parameter. The resulting, working code:

require 'rubygems'
require 'savon'

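# Savon reads the service description from the WSDL.
client = Savon::Client.new "http://example.com/the-soap-API.wsdl"
response = client.do_simple do |soap|
  soap.header["API-KEY"] = "foobar"
  # Set the input and action explicitly instead of relying on the WSDL.
  soap.input = "DoSimpleRequest"
  soap.action = "DoSimpleRequest"
  # Use a String key so the wsdl: prefix ends up on the parameter.
  body = Hash.new
  body["wsdl:uniqueId"] = 12345
  soap.body = body
end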

puts "Response:\n"
puts response

It’s not ideal — normally I’d specify the uniqueId parameter with a nice

  soap.body = {
      :uniqueId => 12345
  }

But that resulted in uniqueId rather than wsdl:uniqueId, and I couldn’t find another way to force it. I looked through the Savon source code, and body is simply a Hash attribute on which it ends up calling @body.to_soap_xml. So I sort of fooled it by creating my own hash and then setting body to it. Obviously a fragile thing to do, so we’ll see if I missed something, and if not whether it makes sense to submit a patch to enable this in a more robust way. In any case, now I get the request that I needed:

<env:Envelope xmlns:wsdl="http://example.com/webservice/schema/" xmlns:env="http://schemas.xmlsoap.org/soap/envelope/">
  <env:Header>
    <API-KEY>foobar</API-KEY>
  </env:Header>
  <env:Body>
    <wsdl:DoSimpleRequest>
      <wsdl:uniqueId>12345</wsdl:uniqueId>
    </wsdl:DoSimpleRequest>
  </env:Body>
</env:Envelope>
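If there were more than one parameter, the hand-built hash could be produced by a small helper instead of setting each prefixed key by hand. This is only a sketch in plain Ruby: with_namespace is a hypothetical name, not part of Savon, and it does nothing beyond building the same String-keyed hash shown above.

```ruby
# Hypothetical helper (not part of Savon): returns a new hash whose keys
# are prefixed with the namespace, e.g. :uniqueId becomes "wsdl:uniqueId".
def with_namespace(params, prefix = "wsdl")
  params.each_with_object({}) do |(key, value), namespaced|
    namespaced["#{prefix}:#{key}"] = value
  end
end

body = with_namespace({ :uniqueId => 12345 })
puts body.inspect
```

Inside the request block this would be used as soap.body = with_namespace({ :uniqueId => 12345 }). Since it yields the same hash as the manual version, it is just as fragile if Savon changes how it serializes the body.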

Hope this saves someone some work, and I’ll be sure to follow up with the eventual outcome of this patchwork solution.
