Why DTrace Makes Leopard a Must-Have Upgrade

Posted by Nick Sieger Tue, 05 Feb 2008 21:12:00 GMT

I feel like I’m actually a relative late-comer to Leopard, at least in my social circle. A lot of the folks in the Ruby community already had it installed the week after it was out, and were showing it off at RubyConf back in November. I just didn’t have a compelling reason to upgrade and disrupt my workflow at the time. Plus, mixed reports were coming out about data loss, UI nits, and other instabilities.

By the time I went out to purchase, 10.5.1 was already the version boxed in the stores, and in retrospect, it seemed worth the wait. I haven’t had a single complaint or major issue with the upgrade so far, and have been enjoying the noticeable zippiness of a freshly-installed system.

Time Machine has been a widely-publicized feature, and has been touted as one of the top reasons to upgrade. So I bought a small portable drive with some leftover holiday gift cards and set out to try it. Initially it seemed promising, except after a day or two of backups the process would stall out during the “preparing” stage. Eventually I noticed that the TM background process, backupd, was eating up 0.5GB of memory and up to 100% of one of the CPUs.

If I wasn’t a nerd making my living having my way with computers, I probably would have given up on Time Machine at this point, after a couple of hours scouring Google and the Apple discussion boards searching for similar problems. But I knew that backupd had to be doing something pathological, and I was compelled to find out what.

On Solaris systems, truss is usually the order of the day for problems like this. It literally vomits an endless listing of system calls invoked by a process into your terminal window. Except there’s no truss on OS X. Is there a replacement? Google mentioned ktrace, present on Tiger systems and earlier, but it’s gone in Leopard. Replaced by? DTrace.

Ahhh, DTrace! Another geeky Leopard-only feature. Certainly DTrace will be able to trace system calls in the same manner as truss. But being a complete DTrace newb, I had no idea where to start. So, like any lazy programmer does, I started shopping around for examples to get me started. Looking around, this article on MacTech looked promising, but didn’t have what I needed. Eventually, I ended up finding the DTrace Toolkit on the OpenSolaris site.

The DTrace Toolkit appears to be your one-stop shop for all things DTrace. If you need a kick-start reason to take a look at DTrace and get you going, this is it. In my case, lo and behold, one of the scripts included in the toolkit is called dtruss!

Many of the scripts in the toolkit are tailored towards a Solaris system, and dtruss is a prime example. It won’t quite work out of the box on Leopard, because a few of the system calls mentioned in the script are non-existent there. Changing the shebang line at the top of the script to #!/bin/bash, and repeatedly running it a few times with sudo ./dtruss -p <pid> will give you an idea of which ones; I simply commented these out until I was successfully able to trace a process.

Now, finally we can pop the stack back to my original problem with Time Machine and backupd. I launched a backup run and waited for the process to start consuming large amounts of CPU and memory. I located the PID of the process in Activity Monitor, and started tracing it with my modified dtruss script. And, sure enough, I saw the following output scrolling by endlessly in my terminal:

mmap(0x0, 0x5000, 0x3)     = 958464 0
getdirentriesattr(..., ..., ...)    = ....
munmap(0xEA000, 0x5000)    = 0 0
getxattr(..., ..., 0x0)     = -1 Err#93
getxattr(..., ..., 0x0)     = -1 Err#93
getxattr(..., ..., 0x0)     = -1 Err#93
getxattr(..., ..., 0x0)     = -1 Err#93
getxattr(..., ..., 0x0)     = -1 Err#93
getxattr(..., ..., 0x0)     = -1 Err#93
getxattr(..., ..., 0x0)     = -1 Err#93
getxattr(..., ..., 0x0)     = -1 Err#93
getxattr(..., ..., 0x0)     = -1 Err#93
getxattr(..., ..., 0x0)     = -1 Err#93
getxattr(..., ..., 0x0)     = -1 Err#93
getxattr(..., ..., 0x0)     = -1 Err#93
getxattr(..., ..., 0x0)     = -1 Err#93

(The ellipsis were actual memory addresses, I didn’t save the output.) What was interesting is that the same chunk of memory (the first argument to getxattr) was floating by repeatedly. Looking at the man page for getxattr, the signature is:

ssize_t
getxattr(const char *path, const char *name, void *value,
         size_t size, u_int32_t position, int options);

So, the first argument contains the path. Now, how can I get the contents of that memory address? The answer is inside dtruss. Closer to the top of the script is this DTrace code:

/* print 3 args, arg0 as a string */
syscall::stat*:return, 
syscall::lstat*:return, 
syscall::open*:return
/* not on leopard -- syscall::resolvepath:return */
/self->start/
{
    /* calculate elapsed time */
    this->elapsed = timestamp - self->start;
    self->start = 0;
    this->cpu = vtimestamp - self->vstart;
    self->vstart = 0;
    self->code = errno == 0 ? "" : "Err#";

    /* print optional fields */
    OPT_printid  ? printf("%6d/%d:  ", pid, tid) : 1;
    OPT_relative ? printf("%8d ", vtimestamp/1000) : 1;
    OPT_elapsed  ? printf("%7d ", this->elapsed/1000) : 1;
    OPT_cpu      ? printf("%6d ", this->cpu/1000) : 1;

    /* print main data */
    printf("%s(\"%S\", 0x%X, 0x%X)\t\t = %d %s%d\n", probefunc,
        copyinstr(self->arg0), self->arg1, self->arg2, (int)arg0,
        self->code, (int)errno);
    OPT_stack ? ustack()    : 1;
    OPT_stack ? trace("\n") : 1;
    self->arg0 = 0;
    self->arg1 = 0;
    self->arg2 = 0;
}

I only had to add syscall::getxattr:return to the list of matched probes, and now I could finally inspect the path argument to getxattr:

munmap(0xEA000, 0x5000)      = 0 0
getxattr("/Users/nicksieger/Library/Mail/IMAP-nicksieger@gmail.com
@imap.gmail.com/[Gmail]/All Mail.imapmbox/Messages/1002680.emlx\0",
0x967288D4, 0x0)         = -1 Err#93
getxattr("/Users/nicksieger/Library/Mail/IMAP-nicksieger@gmail.com
@imap.gmail.com/[Gmail]/All Mail.imapmbox/Messages/1002681.emlx\0",
0x967288D4, 0x0)         = -1 Err#93
getxattr("/Users/nicksieger/Library/Mail/IMAP-nicksieger@gmail.com
@imap.gmail.com/[Gmail]/All Mail.imapmbox/Messages/1002682.emlx\0",
0x967288D4, 0x0)         = -1 Err#93
getxattr("/Users/nicksieger/Library/Mail/IMAP-nicksieger@gmail.com
@imap.gmail.com/[Gmail]/All Mail.imapmbox/Messages/1002683.emlx\0",
0x967288D4, 0x0)         = -1 Err#93
getxattr("/Users/nicksieger/Library/Mail/IMAP-nicksieger@gmail.com
@imap.gmail.com/[Gmail]/All Mail.imapmbox/Messages/1002684.emlx\0",
0x967288D4, 0x0)         = -1 Err#93
getxattr("/Users/nicksieger/Library/Mail/IMAP-nicksieger@gmail.com
@imap.gmail.com/[Gmail]/All Mail.imapmbox/Messages/1002685.emlx\0",
0x967288D4, 0x0)         = -1 Err#93
getxattr("/Users/nicksieger/Library/Mail/IMAP-nicksieger@gmail.com
@imap.gmail.com/[Gmail]/All Mail.imapmbox/Messages/1002686.emlx\0",
0x967288D4, 0x0)         = -1 Err#93
getxattr("/Users/nicksieger/Library/Mail/IMAP-nicksieger@gmail.com
@imap.gmail.com/[Gmail]/All Mail.imapmbox/Messages/1002687.emlx\0",
0x967288D4, 0x0)         = -1 Err#93

D’oh! GMail! That directory has so many files, it took over two minutes just for ls to list all 487356 of them. Hundreds of thousands of email messages, all being re-inspected by Time Machine every time some new messages are added to the directory. I’ll leave it to someone else to point fingers at what the actual problem is here (a side-effect of TM’s usage of hard links? Mail.app inefficiently storing too many messages in a single directory?), but after all this I just decided that I didn’t want a backup of my GMail messages since they’re stored on the server. So I added the directory to the list of excluded directories in TM, wiped my backup, and started over. (TM had similar problems trying to complete an incremental backup with the existing backed-up copy of my mail on the backup disk, so I decided to wipe it and start fresh.) I won’t declare the problem completely solved yet, but if it happens again, I’ll just repeat this process to find the new culprit. Hopefully I don’t end up excluding my entire home directory!

This whole process was a revelation to me -- the fact that I could pinpoint the exact problem in a piece of system software despite having few notions of the internals of that software. The next time I have a nagging issue in Leopard, DTrace will be my tool of choice in tracking it down. Let’s just hope it’s not a problem with iTunes!

Tags , ,  | 9 comments

RubyConf: Mac OS X and Ruby

Posted by Nick Sieger Sat, 21 Oct 2006 18:58:33 GMT

Laurent Sansonetti, who works at Apple Computer, is now in charge of maintaining Ruby inside the OS.

History of Ruby in the OS

  • 10.2 shipped with Ruby 1.6.7
  • 10.3 shipped with Ruby 1.6.8
  • 10.4 shipped with 1.8.4, and the latest software update has Ruby fixes to allow compilation of C extensions

Apple is actively looking to improve Ruby support in future versions of the OS in several areas:

  • Ruby.framework
  • 64 bit support
  • Controlling scriptable applications
  • Cocoa development

Framework

Why a framework? It’s easier for Mac development. It allows multiple versions, and lets you bundle Ruby inside your application. It’s also compatible with original layout /usr/bin/ruby => /System/Library/Frameworks/Ruby.framework/Current/bin/ruby

Misc

  • Include IRB support, RubyGems, plus important gems: Rake, Rails, libxml2 and sqlite3 bindings, DNSSD (Zeroconf)

Controlling scriptable applications

Traditionally applications are scripted by AppleScript. Apple Events are the IPC glue between the two.

RubyAEOSA is a Ruby wrapper for Apple events, that was started in 2001, but not active since 2003. The code required is unnecesarrily verbose. Instead, you could wrap and execute with AppleScript, but it’s slower and limited by knowledge of AppleScript.

RubyOSA is a new project created by Apple, intended to be a successor to RubyAEOSA, under active development and used today. It has a much more Rubyish API, generating Ruby code on the fly and sending events lazily. Apple events are completely hidden.

require 'rbosa'

itunes = OSA.app('itunes')
itunes.current_track
itunes.current_track.name
itunes.current_track.artist
itunes.current_track.size
itunes.play
itunes.pause
itunes.sound_volume = 0
100.times {|i| itunes.sound_volume = i; sleep 0.1 }

0.1.0 release is coming soon, gem install rubyosa.

Cocoa development

Mac development framework, mostly using Objective-C. RubyCocoa is a Ruby/Objective-C bridge, now under active development with support from Apple. It creates Obj-C classes to Ruby proxies and vice-versa. Messages and exceptions are forwarded, and types are converted.

It uses a message convention. ‘:’ are substituted with ‘_’ in the method name (selector). The final ‘_’ can be omitted, but can lead to ambiguities. It may be mandatory in a future release.

Some challenges remain in the Ruby/Obj-C bridge.

Problem: Speed of method invocation. Solution: libffi from gcc, allows invocation of arbitrary foreign function interfaces.

Problem: Maintenance. Currently RubyCocoa needs to be rebuilt to take into consideration new APIs, so to solve, a metadata framework is being built to allow the information to be more dynamic, and a metadata generator tool will help with creating a metadata file for new APIs

These latest developments are available from the apple-unstable branch in SVN from rubyforge.

Future

  • “Toll free bridge” -- implement Ruby primitive types with CoreFoundation objects so that no type conversion is necessary
  • Garbage collection -- Objective-C will be getting garbage collection in the future, and the idea is to implement the Obj-C garbage collector in Ruby through Obj-C APIs.

Tags , ,  | 5 comments | no trackbacks