Our laboratory of open-source Ruby and Rails software.
Ideas blog: on software, productivity, and design. We have written parts of Ruby on Rails, ActiveMerchant, and a number of other open-source projects.

Code Challenge: Combinations

Given an array of arrays of possible values, enumerate all combinations that can occur, preserving order. For instance:

Given: [[1,2,3], [4,5,6], [7,8,9]], calculate the same result as the code below, but do so with an arbitrary size array:

combos = []
[1,2,3].each do |v1|
  [4,5,6].each do |v2|
    [7,8,9].each do |v3|
      combos << [v1, v2, v3]
    end
  end
end
combos

Entries can be written using one or more functions, and may optionally be written as a class extension (i.e. Array).

Points will be given on technique, performance/speed, and difficulty of the program style. Winner receives a gold medal made of pure awesome. Submit a comment with a link to a gist of your code to enter. Comments are moderated, so your entry is safe until a winner is chosen.
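
For reference, here is a minimal sketch of one possible approach, not a winning entry. It assumes a reasonably modern Ruby (one with `flat_map`), and the method name `combinations` is made up for this illustration. Each pass extends every partial combination with each value of the next array, which keeps the same order as the nested loops above.

```ruby
# Hypothetical helper: builds all ordered combinations from an array of arrays.
def combinations(arrays)
  # Start with one empty combination and extend it one array at a time.
  arrays.inject([[]]) do |partials, values|
    partials.flat_map { |partial| values.map { |v| partial + [v] } }
  end
end

combos = combinations([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
combos.length  #=> 27
combos.first   #=> [1, 4, 7]
combos.last    #=> [3, 6, 9]
```

On Rubies that ship Array#product, `first.product(*rest)` should give the same result with less code.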

Announcing a DNS tool without the bullshit

I’ve created a simple DNS tool that lets you easily link to DNS results and share them with others. Using a third-party tool is a great way to show that the DNS settings are not being affected by your local network, but most third-party tools are riddled with advertising and unclear tool names, with some even charging a membership to access some of their tools. Our tool instead uses (and displays in the result) the command line used to generate the results. While not everyone will be familiar with these different commands, users can still experiment to learn and discover more about these valuable tools.

The tool is available under our name, dns.norbauer.com, or at its Heroku hostname, dns.heroku.com, where the tool is hosted.

The site is written using the Sinatra framework and the code is open source.

ERD diagrams from Sequel Pro

If you need a diagram of your MySQL database and you’re on a Mac, generating an ERD diagram is quite easy – and completely free. Sequel Pro can export Graphviz dot files, and then all you need is a few tools to create the diagram.

  • Install graphviz from MacPorts via your Terminal:
    sudo port install graphviz
  • Install Sequel Pro, run the app, connect to your MySQL server and open the database you’d like to diagram.
  • Go to File > Export > Graphviz Dot file, and save the file somewhere convenient.
  • Generate an SVG file of your diagram:
    dot -Tsvg your_database.dot > your_database.svg
  • You can open SVG files with Opera, Safari, Illustrator, etc., and you can generate a PNG file in a number of ways. You can try installing ImageMagick or librsvg from MacPorts, or use Illustrator or Inkscape to open and convert the file.
    • ImageMagick: convert your_database.svg your_database.png
    • librsvg: cat your_database.svg | rsvg-convert -o your_database.png

The result is a basic table-based ERD, but it’s not bad for a few minutes of your time.

Thanks to the geert README for the introduction to this process.

Storing IP addresses as integers

Tying long sessions to an IP address is a good way to ensure some security for your users. But inevitably, you’ll want to store that IP. While you could, of course, store it as a string, such as "123.125.126.127" (called a “dotted quad”), some developers might prefer saving some space and storing it as a compact integer. There’s a right way and a wrong way to do this. The wrong way is as follows:

'12.34.56.78'.gsub('.', '').to_i

This is a bad idea because two IP addresses can result in the same integer: 12.34.56.78 and 123.4.56.78 are both valid IPs. The correct way is as follows:

'12.34.56.78'.split('.').collect(&:to_i).
              pack('C*').unpack('N').first

To understand this seemingly obfuscated train wreck, let’s look at the data in this chain after each method call:

'12.34.56.78'.split('.'). #=> ['12', '34', '56', '78']
  collect(&:to_i).        #=> [12, 34, 56, 78]
  pack('C*').             #=> "\f\"8N"
  unpack('N').            #=> [203569230]
  first                   #=> 203569230

Many developers are unfamiliar with pack and unpack. These allow you to create and extract data into and out of binary-packed strings. In the third method call, you’ll see the result is "\f\"8N". This strange-looking string is really just 4 bytes of data, the numbers 12, 34, 56, and 78 in binary form, put into a string. We then unpack that 4-byte string into a 4-byte integer (with network byte order).
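
The same two calls run in reverse turn the integer back into a dotted quad. The following is a small sketch; the helper names `ip_to_i` and `i_to_ip` are made up for this illustration and are not part of any library.

```ruby
# Hypothetical helper: dotted quad -> 32-bit integer (network byte order).
def ip_to_i(dotted_quad)
  dotted_quad.split('.').collect(&:to_i).pack('C*').unpack('N').first
end

# Hypothetical helper: 32-bit integer -> dotted quad.
def i_to_ip(integer)
  # 'N' writes the integer as 4 big-endian bytes; 'C*' reads them back one by one.
  [integer].pack('N').unpack('C*').join('.')
end

ip_to_i('12.34.56.78')  #=> 203569230
i_to_ip(203569230)      #=> "12.34.56.78"

# Unlike the dot-stripping approach, distinct addresses stay distinct.
ip_to_i('12.34.56.78') == ip_to_i('123.4.56.78')  #=> false
```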

This is also the same problem solved by the C library method inet_aton, which is implemented as part of Ruby’s IPAddr class. Thus, there is a much simpler alternative:

require 'ipaddr'
IPAddr.new('12.34.56.78').to_i   #=> 203569230

But what fun is that!

PS. Yes, you can have multi-line method chains simply by leaving the dot at the end of the line. Ruby then knows to look for a method call on the next line. Both snippets are valid Ruby! You should, of course, indent appropriately.

Snow Leopard: Upgrading for Rails Developers

Upgrading to Snow Leopard is not as easy as Apple would lead you to believe; at least not if you’re a Rails developer. Here are a few select errors you might have encountered:

uninitialized constant MysqlCompat::MysqlRes
dlopen(/Library/Ruby/Gems/1.8/gems/nokogiri-1.3.3/lib/nokogiri/nokogiri.bundle, 9): no suitable image found

These errors seem to stem from the fact that OS X is now fully 64-bit, and unfortunately, all your compiled libraries are 32-bit. Whoops!

So, let’s fix them.

Important: These instructions are tailored for Intel 64-bit machines, as only 64-bit machines should have these issues. Any Intel Core 2 machine, including recent MacBooks, MacBook Pros, and iMacs, and all Mac Pros should work fine with these instructions.

Install 64-bit MySQL

To fix the MySQL problem, you need to install the x86_64 version of MySQL. This will uninstall your old MySQL version, but will not migrate your databases. In order to migrate your data, first make sure MySQL is not running, then:

$ sudo mv /usr/local/mysql/data /usr/local/mysql/data.default
$ sudo mv /usr/local/mysql-oldversion/data /usr/local/mysql/data

Start up MySQL and your databases should be intact.

Reinstall the MySQL gem

Now that you have the correct version of MySQL, you need to reinstall the latest MySQL gem. Make sure to uninstall your current MySQL gem first:

$ sudo gem uninstall mysql
$ sudo env ARCHFLAGS="-arch x86_64" gem install mysql -- --with-mysql-config=/usr/local/mysql/bin/mysql_config

Re-install MacPorts

To fix nokogiri (and other gems with external dependencies), you’ll need to fix your MacPorts install. Unfortunately, the only way to fix MacPorts is to completely reinstall it. You need to completely delete (or at least move) your /opt/local directory. Once that is done, download and install the latest version of MacPorts for Snow Leopard.

You’ll probably want to install two things pretty quickly: libxml2 for nokogiri and git-core:

$ sudo port install libxml2
$ sudo port install git-core +bash_completion +doc

Re-install nokogiri (and other gems)

Any gem that has a compiled component will need to be reinstalled. nokogiri is just one of the libraries; others include ruby-debug, ruby-prof, and a lot of others. To fix them, just run this command:

$ sudo gem pristine --all

Symbol vs String performance in Ruby

A more interesting metric to this discussion is the use of strings versus symbols. Fortunately, these types of discussions can easily be solved by benchmarks:

Results under Ruby 1.8.6:

                           user     system      total        real    less no op
String instantiation   9.050000   0.010000   9.060000 (  9.057162)   5.921219
Symbol use             5.130000   0.000000   5.130000 (  5.131844)   1.995901
new String#to_sym     14.550000   0.020000  14.570000 ( 14.567466)  11.431523
const String#to_sym   11.960000   0.010000  11.970000 ( 11.967217)   8.831274
String const lookup    6.350000   0.010000   6.360000 (  6.358697)   3.222754
Symbol const lookup    6.400000   0.010000   6.410000 (  6.416395)   3.280452
No op                  3.130000   0.010000   3.140000 (  3.135943)   n/a

Results under Ruby 1.9.1:

                           user     system      total        real    less no op
String instantiation   9.170000   0.010000   9.180000 (  9.174451)   4.736163
Symbol use             4.920000   0.010000   4.930000 (  4.930031)   0.491743
new String#to_sym     18.080000   0.030000  18.110000 ( 18.087386)  13.649098
const String#to_sym   13.940000   0.030000  13.970000 ( 13.954099)   9.515811
String const lookup    4.920000   0.000000   4.920000 (  4.927497)   0.489209
Symbol const lookup    4.910000   0.020000   4.930000 (  4.921151)   0.482863
No op                  4.440000   0.000000   4.440000 (  4.438288)   n/a

What do these results mean? Well, first you need to subtract out the “no op” results from all the others, which I’ve added as a column above. We can now see that string instantiation takes about 90 nanoseconds, which means about 11,000 string instantiations per millisecond. Are symbols faster? Considerably so. But the real lesson here is that these numbers are so small that no one in their right mind should spend time worrying about them.

So please, use symbols when you should use symbols, and otherwise use strings.
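
The benchmark script itself is not reproduced here. As a rough sketch of how such a comparison could be set up with Ruby’s standard Benchmark module, something like the following would do. The iteration count, the labels, and the `measure` helper are assumptions made for this illustration, not the original script, and the timings will vary by machine and Ruby version.

```ruby
require 'benchmark'

ITERATIONS = 100_000  # assumed value; the original count is not stated

# Hypothetical helper: time ITERATIONS calls of the given block, in seconds.
def measure
  Benchmark.realtime { ITERATIONS.times { yield } }
end

# The empty loop is the "no op" baseline to subtract from the other rows.
no_op = measure { }

results = {
  'String instantiation' => measure { 'hello' },
  'Symbol use'           => measure { :hello },
  'new String#to_sym'    => measure { 'hello'.to_sym }
}

results.each do |label, seconds|
  puts format('%-22s %10.6f   less no op: %10.6f', label, seconds, seconds - no_op)
end
```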

Remembering lighttpd, nginx, and the Internet as a pipe

We asked ourselves a year ago if lighttpd 1.5.0 was vaporware. It seems that was nearly true. At that time, nginx, apache’s mod_proxy_balancer, and haproxy were flourishing as Rails proxy solutions. The more recent introduction of Phusion Passenger (mod_rails) and Ruby Enterprise Edition, both excellent, free, and open-source products, has now driven most deployments (including our own) away from proxying altogether.

There are, or were, generally two counter-arguments to Passenger. The first is the stability and performance argument, which is well understood and has been discussed at length. I believe it was well summarized by Engine Yard’s discussion on the topic. But this is not the argument I’m interested in.

My early argument against mod_rails was complexity: first, using Apache instead of a smaller, simpler web server, and second, using a platform that I don’t comprehend and can’t debug. This argument relates to an old article I wrote: that Rails applications, and applications in general, should act like a pipe. (The original article, by James Duncan Davidson, is so old it’s only accessible through archive.org.) My presumption was that the same argument should apply to a 3rd-party mod_rails solution: I don’t know anything about the app server, so if I have problems, I’m screwed. How is this better than FastCGI?

It is ironic, then, to note the success of Passenger in light of Rails’ history with FastCGI. The reason, as it turns out, basically comes down to two answers:

  1. Passenger is easy to set up under Apache, and requires less configuration than under nginx.
  2. Passenger always works, in my experience, and thus debugging problems is non-existent.
  3. As a bonus, if you’re using something like monit to ensure your app stays up, monitoring apache2 is a lot easier than individual mongrel processes.

So, better products are better, and users (including myself) will flock to them once they realize it.

The Future

I do wonder, though, whether history will continue to repeat itself in this regard. Where will the next improvements be? What will be our theory (or justifications, as it seems to turn out) behind those improvements? Personally, I’ve been eying threaded solutions for a long time, and given that mod_rails is a multi-process solution, perhaps threads will move things forward in the future. However, given that Ruby 1.9 is still limited by the Global Interpreter Lock, we might not see a threaded answer for a while. The closest we’ll come in the short term is JRuby. Could it be possible that JRuby is the future? Only time will tell.

Garbage collection thresholds in Ruby

I’ve put together a somewhat extensive collection of scripts and results from looking at how often objects get garbage collected in Ruby 1.8 and Ruby 1.9. Performance metrics are also provided from running the scripts under various environments, which I found quite fascinating. irb, for instance, is ridiculously slow. Granted, the vast majority of these scripts’ time is spent performing very inefficient ObjectSpace calls, so the raw numbers should be taken with a grain of salt. The metrics are only interesting for comparison, and are of questionable use.

This was instigated by a String versus Symbol discussion in #ruby on Freenode. The individual was primarily worried about memory usage, and these scripts confirm that strings will get garbage collected often and quickly.
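
The scripts themselves are not reproduced here, but the basic idea can be sketched as follows. This is an illustration only, not one of the original scripts: the `live_strings` helper and the string count are assumptions, and the numbers printed will differ between Ruby versions and environments.

```ruby
# Hypothetical helper: count the String objects currently alive.
# ObjectSpace.each_object returns the number of objects it visited.
def live_strings
  ObjectSpace.each_object(String) { }
end

before = live_strings

# Create many short-lived strings that nothing holds on to.
100_000.times { |i| "temporary string #{i}" }
after_allocation = live_strings

GC.start
after_gc = live_strings

puts "before allocation: #{before}"
puts "after allocation:  #{after_allocation}"
puts "after GC.start:    #{after_gc}"
```

If the count after allocation is far below the number of strings created, the collector has already run during the loop.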

One thing to note is how Ruby 1.8 changes its thresholds for garbage collection under each environment, which may be based on the number of objects in global space that it cannot garbage collect. Under Ruby 1.9, the results are a bit more consistent, although this may be due to better metrics being available in Ruby 1.9.

ls, colors, and Terminal.app

This isn’t a Ruby thing but many of us spend a lot of time in Terminal.app, and I suspect few of you have taken the time to both enable colors and change your LSCOLORS, the setting which affects what colors ls uses when in color mode.

Enable Colors in ls

In order for ls to use colors at all, you need to set up an alias to turn colors on. To do this, open (or create) the .profile file in your home directory using your favorite text editor and add:

alias ls="ls -G"

Now open a new Terminal window and type ls. You will see colors, hurray!

Make Colors Linux-like

If you’re used to Linux-like colors, you will appreciate this setting. This is what I use and it works particularly well on dark Terminal backgrounds (I use the “Pro” theme). I also check off “Use bright colors for bold text” under Terminal > Preferences > Settings. Again, add this to your .profile:

export LSCOLORS="ExGxBxDxCxEgEdxbxgxcxd"

Customize Your Colors

The values in LSCOLORS are codes corresponding to different colors for different types of files. The letter you use indicates which color to use, and the position in the string indicates what type of file should be that color. Each color comes in pairs – a foreground color and a background color. Here is a list of color values:

  • a = black
  • b = red
  • c = green
  • d = brown
  • e = blue
  • f = magenta
  • g = cyan
  • h = grey
  • A = dark grey
  • B = bold red
  • C = bold green
  • D = yellow
  • E = bold blue
  • F = bold magenta
  • G = bold cyan
  • H = white
  • x = default

And here is a list of the positions in LSCOLORS:

  1. directory
  2. symbolic link
  3. socket
  4. pipe
  5. executable
  6. block device
  7. character device
  8. executable with setuid set
  9. executable with setgid set
  10. directory writable by others, with sticky bit
  11. directory writable by others, without sticky bit

The default is “exfxcxdxbxegedabagacad”, which indicates blue foreground with default background for directories, magenta foreground with default background for symbolic links, etc.
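
To check what a given LSCOLORS value means, the two lists above can be combined in a short script. This is only a sketch: the constant and method names are made up for this illustration, and uppercase letters are simply reported as the “bold” variant of the lowercase color.

```ruby
# Lowercase color codes from the list above; uppercase means the bold variant.
COLORS = {
  'a' => 'black', 'b' => 'red',     'c' => 'green', 'd' => 'brown',
  'e' => 'blue',  'f' => 'magenta', 'g' => 'cyan',  'h' => 'grey'
}

# File types in the order their color pairs appear in LSCOLORS.
FILE_TYPES = [
  'directory', 'symbolic link', 'socket', 'pipe', 'executable',
  'block device', 'character device', 'executable with setuid set',
  'executable with setgid set',
  'directory writable by others, with sticky bit',
  'directory writable by others, without sticky bit'
]

# Hypothetical helper: turn one letter into a readable color name.
def describe_color(code)
  return 'default' if code == 'x'
  name = COLORS[code.downcase]
  code == code.downcase ? name : "bold #{name}"
end

# Hypothetical helper: split the value into foreground/background pairs
# and label each pair with its file type.
def decode_lscolors(value)
  FILE_TYPES.zip(value.scan(/../)).map do |type, pair|
    [type, describe_color(pair[0, 1]), describe_color(pair[1, 1])]
  end
end

decode_lscolors('exfxcxdxbxegedabagacad').each do |type, foreground, background|
  puts "#{type}: #{foreground} on #{background}"
end
```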

Interview on the Ruby on Rails podcast.

The interview I did with the incredibly gracious and awesome Geoffrey Grosenbach (of PeepCode) on the official Ruby on Rails podcast just went up.

I talk a little bit about consulting at Norbauer Inc, a bit about RubyRags, and I spend a bit of time kicking social networks in the nuts.

Incidentally, I’m proud to announce that we’re now selling Peepcode shirts at RubyRags.