
I Do?

Oh yes. When it comes to working with the filesystem, Ruby is quite idiosyncratic – not only are there Dir, File (which happens to include FileTest) and FileUtils, but the seemingly simple concept of representing a path is split between the old, trusty, 168-instance-method-strong String and the far more appropriate Pathname (with its humble 141 methods per instance), which is worth using instead.

But… Why?

First, Pathnames are objects that actually represent a path, so using them for this purpose in an object-oriented language is quite a reasonable thing in itself – but there are also pragmatic reasons. Contrary to general-purpose Strings – at best able to describe directory and file names – Pathnames are meant to represent filesystem paths with all the associated baggage: from proper path concatenation and the ability to traverse themselves, through useful methods (Pathname#sub_ext is so convenient when you need it!), to a plethora of predicates like #exist?, #readable? or #symlink?.
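A minimal sketch of a few of those predicates and of traversal, run against a throwaway temporary directory; the notes.md file name is made up for the example:

```ruby
require 'pathname'
require 'tmpdir'

checks = Dir.mktmpdir do |dir|
  root = Pathname.new(dir)
  file = root.join('notes.md')
  file.write '# notes'

  [
    file.exist?,             # the file was just created
    file.readable?,          # and we can read it
    file.symlink?,           # it is a regular file, not a symlink
    file.parent == root,     # traversing up yields the directory
    root.children == [file], # traversing down yields the file
  ]
end

checks #=> [true, true, false, true, true]
```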

Oh OK. Sooo…

…how should you go about making the switch? If your system is split into small, dedicated pieces, you can try doing it piece-by-piece. Parts that share a common object (like configuration that exposes paths) would need to be changed in one go, or keep Strings as the exchange/boundary format for starters.

Pathnames can be created from Strings in two ways, either via a classic Pathname.new constructor or the funnily-cased Kernel.Pathname method:

Pathname.new('/etc/passwd') #=> #<Pathname:/etc/passwd>
Pathname('/etc/shadow')     #=> #<Pathname:/etc/shadow>

Note that both can transparently accept a Pathname argument, so you don’t have to worry about wrapping a String in a Pathname too many times:

Pathname.new(Pathname.new('/etc/passwd')) #=> #<Pathname:/etc/passwd>
Pathname(Pathname('/etc/shadow'))         #=> #<Pathname:/etc/shadow>

Symmetrically, calling #to_s on both Pathnames and Strings will return a String, so you can defensively use it in the transition period.
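Here is a sketch of what that defensive style might look like at a boundary between converted and unconverted code. Both config_path_for and legacy_path_for are hypothetical helpers invented for this example:

```ruby
require 'pathname'

# Hypothetical helper: takes a String or a Pathname, always returns a Pathname.
def config_path_for(base_dir)
  Pathname(base_dir) / 'config.yml'
end

# Hypothetical legacy boundary whose callers still expect a String.
def legacy_path_for(base_dir)
  config_path_for(base_dir).to_s
end

config_path_for('/etc/app')           #=> #<Pathname:/etc/app/config.yml>
config_path_for(Pathname('/etc/app')) #=> #<Pathname:/etc/app/config.yml>
legacy_path_for(Pathname('/etc/app')) #=> "/etc/app/config.yml"
```

Callers do not need to know which of the two types they are holding, which makes it easier to convert one piece at a time.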

File.join calls can be replaced by Pathname#join called on the first argument:

File.join(Dir.tmpdir, 'secrets')     #=> '/tmp/secrets'
Pathname(Dir.tmpdir).join('secrets') #=> #<Pathname:/tmp/secrets>

Strings and Pathnames can also be appended to existing Pathnames via the nicely-looking Pathname#/:

Pathname('/etc') / 'vim' / 'vimrc' #=> #<Pathname:/etc/vim/vimrc>

Commonly-used cases, like getting the current directory and globbing, can also operate on Pathnames from the get-go:

Pathname.pwd #=> #<Pathname:/home/chastell/coding/blog.rebased.pl>
dotfiles = Pathname.glob("#{ENV['HOME']}/.*").reject(&:directory?)

Where Pathnames really shine – in my experience – is handling directory / file / extension splitting:

post = Pathname.glob(Pathname.pwd / '_posts' / '*pathname*.md').first
post.dirname          #=> the directory as a Pathname
post.basename         #=> the file as a Pathname
post.split            #=> an Array with the above two
post.extname          #=> just the extension (as a String)
post.sub_ext('.html') #=> a Pathname with the extension replaced

Finally, Pathnames should work just fine as arguments to existing File and FileUtils methods, but some of them (like renaming or writing to a file) can be made more object-oriented:

config = Pathname('~/.emacs').expand_path
config.rename config.sub_ext('.my-precious') # moves the file; config still points at ~/.emacs
config.write '(message "VIM HOOLS")'         # so this creates a fresh ~/.emacs
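To illustrate both halves of that sentence, here is a small sketch that first hands a Pathname to FileUtils and File, then repeats similar steps by calling methods on the Pathname itself. It works in a temporary directory, and the file names are invented for the example:

```ruby
require 'fileutils'
require 'pathname'
require 'tmpdir'

contents = Dir.mktmpdir do |dir|
  notes = Pathname(dir) / 'drafts' / 'notes.txt'

  # Existing File and FileUtils methods accept a Pathname as-is.
  FileUtils.mkdir_p notes.dirname
  File.write notes, 'hello'

  # Similar steps, called on Pathnames directly.
  copy = Pathname(dir) / 'backup' / 'notes.bak'
  copy.dirname.mkpath
  copy.write notes.read

  [File.read(notes), copy.read]
end

contents #=> ["hello", "hello"]
```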

Anything I Need to Be Wary Of?

I’m glad you asked! Plenty.

First and foremost, a Pathname is not a glorified String – most (107!) of String’s instance methods are not callable on Pathnames, and while some (like #center or #succ) wouldn’t make a lot of sense, the lack of some others might be surprising (filesystem-aware #downcase and #upcase could be useful, #start_with? and #end_with? as well). If this wasn’t problematic enough, methods common to both classes can have vastly different (if reasonable) meanings: Pathname#split splits into directory and file, not on whitespace, and Pathname#size returns the size of the file, not the length of the path. I’m so happy that your app has proper test coverage which will catch any rogue corner cases!
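A quick sketch of those differences, assuming plain Ruby with no extensions loaded that might add String-like methods to Pathname. Pathname#size is left out on purpose, as it needs the file to actually exist:

```ruby
require 'pathname'

path = Pathname('/usr/share/games/fortunes/fortunes')

# String methods that Pathname does not have:
path.respond_to?(:start_with?) #=> false
path.respond_to?(:downcase)    #=> false

# Going through #to_s gets them back:
path.to_s.start_with?('/usr')  #=> true

# Same name, different meaning:
path.to_s.split('/').last      #=> "fortunes"
path.split                     #=> [#<Pathname:/usr/share/games/fortunes>, #<Pathname:fortunes>]
path.to_s.size                 #=> 34 (characters in the path, not bytes in the file)
```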

If your app has any kind of configuration file (a YAML one, perchance?) that is supposed to be editable by hand, it’s usually a better idea to store the paths as Strings:

---
party_starters:
- /usr/share/games/fortunes/fortunes

Storing them as Pathnames is doable, but the file ends up much less user-friendly:

---
party_starters:
- !ruby/object:Pathname
  path: /usr/share/games/fortunes/fortunes

Be sure to diligently serialise and deserialise your Pathnames to and from Strings in your configuration singleton object.
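One way this could look is sketched below. Config is a hypothetical class made up for the example, and it is a plain class rather than a singleton to keep things short. It keeps Strings in the YAML and Pathnames everywhere else:

```ruby
require 'pathname'
require 'yaml'

# Hypothetical configuration wrapper: Strings on disk, Pathnames in the app.
class Config
  def self.load(yaml)
    new(YAML.safe_load(yaml).fetch('party_starters'))
  end

  attr_reader :party_starters

  def initialize(party_starters)
    # Deserialise: wrap every String in a Pathname.
    @party_starters = party_starters.map { |path| Pathname(path) }
  end

  def dump
    # Serialise: turn the Pathnames back into Strings before dumping.
    YAML.dump({ 'party_starters' => party_starters.map(&:to_s) })
  end
end

yaml = <<~YAML
  ---
  party_starters:
  - /usr/share/games/fortunes/fortunes
YAML

config = Config.load(yaml)
config.party_starters #=> [#<Pathname:/usr/share/games/fortunes/fortunes>]
config.dump           # plain Strings again, no !ruby/object:Pathname tags
```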

Finally, note that although Pathname.pwd and Pathname.glob exist (and return Pathnames), __dir__ and Dir.tmpdir return Strings, so they need to be wrapped in a Pathname() call; note also that Pathname#write was added in Ruby 2.1 and Pathname#/ in Ruby 2.2 (but previous Ruby versions can use Pathname#+ for the latter).
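A short sketch of the wrapping and of the Pathname#+ fallback. It sticks to Dir.tmpdir, but __dir__ can be wrapped in the same way:

```ruby
require 'pathname'
require 'tmpdir'

Dir.tmpdir.class         #=> String
tmp = Pathname(Dir.tmpdir)
tmp.class                #=> Pathname

# Pathname#+ does the same job as Pathname#/ and also works on older Rubies.
secrets = tmp + 'secrets'
secrets == tmp / 'secrets' #=> true
```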

Whew. Is There Anything Else to Be Said?

Oh very much, this is a fascinating topic! Do feel free to check out Arne Brasseur’s in-detail write-up of the Pathname API, Rob Miller’s musings on how paths aren’t strings and – if you’re a paying subscriber, which you should consider being – the relevant Ruby Tapas episode.

Piotr Szotkowski



Rebased Blog
