Obscure and Ugly Perlisms in Ruby

Posted by Nick Sieger Sat, 06 Oct 2007 12:39:00 GMT

So, it’s well known that Ruby owes a debt to its predecessor Perl, although some (maybe many) question whether we should repay that debt or even go so far as to put Perl on trial and excise those elements which somehow haphazardly survived the generation gap. It turns out the evidence is mixed.

Update: I use the word “obscure” in the title because, in my experience, they are obscure. “Ugly” is pure opinion, but this is my blog, after all.

Exhibit A: BEGIN/END

Update: Yes, yes, this is an awk-ism, not a perlism, strictly speaking. And I don’t deny its usefulness for pure scripting tasks. I just don’t see its utility in a larger application.

END {
  puts "Bye!"
}

puts "Processing..."

BEGIN {
  puts "One moment while I start your program"
}

Output:

One moment while I start your program
Processing...
Bye!

Why would any sane Ruby programmer do this? Have you ever seen a use for BEGIN that isn’t met by simply executing code at the top level of the main program? Geez, BEGIN even has its own node in the AST!

And how about END? If you really need to hook into interpreter shutdown, just use Kernel#at_exit. (In fact, Rubinius currently uses END simply as an alias for at_exit.)
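As a rough check of that END/at_exit equivalence, here is a minimal sketch that runs two tiny scripts in child interpreters and compares what they print. The helper and method names (run_snippet, end_script, at_exit_script) are made up for illustration, and it assumes a Ruby recent enough to have squiggly heredocs.

```ruby
require "open3"
require "rbconfig"

# Hypothetical helper: run a Ruby snippet in a child interpreter and
# return everything it wrote to standard output.
def run_snippet(source)
  output, _status = Open3.capture2(RbConfig.ruby, "-e", source)
  output
end

# A script that registers its shutdown hook with an END block.
def end_script
  <<~'RUBY'
    END { puts "Bye!" }
    puts "Processing..."
  RUBY
end

# The same script, using Kernel#at_exit instead.
def at_exit_script
  <<~'RUBY'
    at_exit { puts "Bye!" }
    puts "Processing..."
  RUBY
end

# Prints true when both scripts produce identical output.
puts run_snippet(end_script) == run_snippet(at_exit_script)
```

Both hooks fire after the main program has finished, which is why the two outputs match.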

Exhibit B: <> (ARGF)

Thank goodness we didn’t get the diamond operator in Ruby, but we did get ARGF as a replacement. Though obscure, it actually turns out to be useful. Consider this program, which prepends copyright headers in-place (thanks to another perlism, -i) to every file mentioned on the command-line. Any other creative uses of ARGF out there?

#!/usr/bin/env ruby -i

Header = DATA.read

ARGF.each_line do |e|
  puts Header if ARGF.pos - e.length == 0
  puts e
end

__END__
#--
# Copyright (C) 2007 Fancypants, Inc.
#++
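One caveat with the program above: ARGF.pos counts bytes while String#length counts characters, so the first-line test can miss when the first line of a file contains multibyte characters. A hedged alternative is to ask the file currently being read for its own line number. The sketch below collects the result in a string instead of rewriting files with -i, and prepend_headers is a made-up name.

```ruby
# Hypothetical helper: returns the contents of every file ARGF reads,
# with `header` placed before the first line of each file.
def prepend_headers(header, argf = ARGF)
  out = +""
  argf.each_line do |line|
    # argf.file is the file currently being read, so its lineno starts
    # again at 1 for every new file named on the command line.
    out << header if argf.file.lineno == 1
    out << line
  end
  out
end

# Only print when file names were actually given on the command line.
print prepend_headers("# Copyright (C) 2007 Fancypants, Inc.\n") unless ARGV.empty?
```

Run as `ruby headers.rb a.rb b.rb` (file names are placeholders), it prints both files with the header ahead of each one.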

Exhibit C: The Flip-flop

This is a weird beast. I didn’t even know of its existence until Charlie was complaining about having to compile it properly. Apparently we have Perl to thank for this nonsense as well (and, indirectly, sed). With the exception of the sed-ism, I’m not convinced it adds any value -- in fact the code usually ends up looking more verbose.

This program, when run with itself as an argument, prints out everything between BEGIN and END.

#
# BEGIN

ARGF.each_line do |line| 
  if (line =~ /^# BEGIN/)..(line =~ /^# END/)
    puts line
  end
end

# END
# 

This snippet is a long-hand way to do 5.upto(10) {|i| puts i}.

i = 5
while (i == 5)...(i == 10) do
  puts i
  i += 1
end
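Note the three dots in that loop. The two-dot and three-dot forms of the flip-flop differ in when the end condition is first tested, which this small sketch tries to show by using the same marker as both the start and the end. The method names are invented for illustration.

```ruby
# Two dots: once the start test passes, the end test runs on the same
# item, so a single marker both opens and closes the range.
def two_dot_run(items, marker)
  items.select { |item| true if (item == marker)..(item == marker) }
end

# Three dots: the end test is deferred until the following item, so the
# range stays open until the marker shows up a second time.
def three_dot_run(items, marker)
  items.select { |item| true if (item == marker)...(item == marker) }
end

items = %w[a x b x c]
p two_dot_run(items, "x")    # => ["x", "x"]
p three_dot_run(items, "x")  # => ["x", "b", "x"]
```

With two dots, the loop above would still print 5 through 10, because the end condition (i == 10) is false when the range switches on at 5.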

Exhibit D: Output from defined?

Not sure if this came from Perl.

The basic need for defined? in a dynamic language is unquestionable. Instead, I meant to highlight the fact that defined? returns a string value here, which is strange.

Constant = "Constant"
@ivar = [1, 2, 3]
integer = 10

puts "const : #{defined?(Constant)}"
puts "ivar  : #{defined?(@ivar)}"
puts "global: #{defined?($0)}"
puts "local : #{defined?(integer)}"
puts "expr  : #{defined?(Constant + integer)}"

Running this code produces:

const : constant
ivar  : instance-variable
global: global-variable
local : local-variable
expr  : method

Perl at least is sane enough to return true or false for its own defined operator. But method? Looking at the source, I see also expression, local-variable(in-block), assignment, class variable, true, false, and self. But why would this output be useful? As if it isn’t already plainly obvious what is defined?.
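For reference, a small sketch of how those return values behave in a boolean test. Anything that is defined yields a non-empty string, and anything undefined yields nil, so the usual `if defined?(...)` guard works without ever looking at the string. The method name defined_report is invented for illustration.

```ruby
# Hypothetical helper: collects what defined? reports for a few kinds
# of expression. Nothing inside defined?(...) is actually evaluated.
def defined_report
  local = 10
  @ivar = [1, 2, 3]
  {
    "String"         => defined?(String),
    "@ivar"          => defined?(@ivar),
    "$stdout"        => defined?($stdout),
    "local"          => defined?(local),
    "local + 1"      => defined?(local + 1),
    "undefined_name" => defined?(undefined_name),
  }
end

defined_report.each do |expression, kind|
  # kind is a String for defined expressions and nil otherwise.
  puts "#{expression.ljust(14)}: #{kind.inspect} (#{kind ? 'truthy' : 'falsy'})"
end
```

Only the last entry comes back as nil; every other entry is a descriptive string that is treated as true.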

Any other obscure features in Ruby that you love to hate?

24 comments

Comments

  1. Jeremy McAnally said about 1 hour later:

    I think you misspelled your title:

    Obscene and Ugly Perlisms in Ruby

    There we go.

  2. Nathan said about 10 hours later:

    defined? comes in handy sometimes. I’ve even seen the output values used here and there. Since everything but false and nil evaluates to true, it might as well give extra information - and it can be very useful to be able to dynamically determine what’s defined as what.

  3. xan/hot@hot.com said about 10 hours later:

    You should provide the perl version of the ruby code too if you claim it is a Perlism.

    Also, the ‘;’ is not needed: integer = 10;

    Don’t put more Perl into Ruby than in fact exists. I want to see the Perl version of the code above too! ;-)

    Also, $0 is $PROGRAM_NAME. Let’s use the longer names instead of the cryptic and hard-to-google Perl legacy vars (they seem only useful for code golfing and one-liners to me anyway).

  4. Nick said about 15 hours later:

    Nathan: Careful:

    defined?(false) => "false"
    defined?(nil)   => "nil"
    

    Maybe that’s not what you meant. As for its utility, I’ll wait until I see an example (although I’m not denying your claim that there is one). I couldn’t find any usage of defined? in the standard library that used the result for anything other than a boolean test.

    xan: Maybe the Perl code should be here, but the point is not to compare Ruby to Perl at the syntax level. Also, the examples are pretty simplistic (maybe too much so) and the equivalent Perl is not hard to imagine. But thanks for pointing out the semi! At least you can tell I was futzing with Perl a little bit during the preparation of the article.

    Everyone else: I don’t know why I initially said Perl is a “distant relative” -- I think I was getting overly dramatic with my prose. I’ve corrected that, as well as a side comment about quines -- someone seemed to think that was evidence I don’t know what they are. Whatever, it wasn’t adding anything, so it’s gone.

  5. Sixpack dude said about 15 hours later:

    Ex. A is an ‘awk-ism’ rather than a perl-ism. Read a book on Awk first, dude.

  6. Steve Stone said about 16 hours later:

    I actually find END blocks very useful in one-liners, particularly with the -n and -p switches.

    BEGIN blocks can have similar uses... though they’re really more useful for manipulating the interpreter at compile time. It’s important to understand the difference between “gets executed as the first thing in this code” versus “gets executed before any of the rest of this code is even compiled”.

  7. Robert Fischer said about 17 hours later:

    BEGIN blocks are extremely useful for generating methods at run-time or other cute stunts. I’ve never found a lot of use for END blocks, although I’ve heard some people use them to make sure all the filehandles are closed before a program exits.

    In either case, you use them because either 1) you don’t have access to the main thread of code being executed, or 2) you want to execute something before any other code even has a chance to be executed.

    It’s the kind of thing that seems worthless and silly until it is critical and necessary.

  8. Nathan said about 17 hours later:

    Nick:

    defined?(false) and defined?(nil) return strings, so something like

    if defined?(nil)
      puts "It's defined!"
    else
      puts "It's not."
    end

    works as you’d expect.

    One of the places I’ve used it is when writing plugins for Rails that can also operate outside of Rails. In this case I might do

    if defined?(ActiveRecord)
      class ActiveRecord::Base
        # Add relevant functionality
      end
    end

  9. Koz said about 20 hours later:

    Yeah, defined? is really useful when writing ‘really reflective’ code that gets used in places you don’t intend. For example the new routing optimisation code uses if defined?(request) to know if it can rely on using the request object to provide the host, protocol and port.

    However the particular string values returned don’t seem particularly useful. I thought it was a boolean until I used it in irb one day :).

  10. Ola Bini said 1 day later:

    Hi Nick,

    I kinda agree with some of what you write - defined? is totally more gross than it needs to be, but highly useful in some cases. For example, doing defined? on constants can be very nice. The string result is totally unnecessary, though.

    BEGIN can be useful in libraries sometimes, while END seems to not give anything more than at_exit.

    Flip-flops should DIEDIEDIE.

  11. Matthieu Riou said 1 day later:

    What about this?

    $ irb
    foo = "foo" unless defined?(foo)
    foo
    => nil

    But:

    $ irb
    unless defined?(foo)
      foo = "foo"
    end
    foo
    => "foo"

  12. riffraff said 1 day later:

    My 2c: I agree that the flip-flop operator is rather obscure and doesn’t deserve special syntax (it can be factored out into a class), but its behaviour is quite useful, and the equivalent Ruby code without the flip-flop hardly looks shorter and cleaner:

    tmp = false
    ARGF.each_line do |line|
      if tmp
        puts line
        tmp = !(line =~ /^# END/)
      else
        tmp = (line =~ /^# BEGIN/)
      end
    end

    Anyway while we’re at perlisms the example could also use $_ and regexps in conditionals :)

  13. Matthieu Riou said 1 day later:

    Mmmh the formatting didn’t quite work (guess I should have used pre blocks). The first snippet is with an unless as a single, the second inside a block. The => isn’t a hash, it’s the result produced by irb when you ask for the value of foo.

  14. Marsvin said 1 day later:

    I’m no expert, but I think BEGIN is used in Perl modules to do fancy stuff with the syntax tree and END is used to clean up things that the user of the module may not know about.

  15. Claus said 1 day later:

    <>, BEGIN/END and defined are natural and non-ugly in perl. It’s hard to convince a perl hacker that anything valuable is gained by throwing out the brevity of file I/O perl has. You’re missing the point of BEGIN/END - they are guaranteed to run, so they capture failure/exception conditions also, which makes them essential. Obviously defined is of less value in a pure OO language.

  16. Bruno Goncalves said 1 day later:

    The BEGIN and END clauses seem to have come straight out of awk where they actually make sense.

    http://www.bgoncalves.com/notes/2007/04/16/gawk-for-dummies-part-i/

    There they are executed before the first file is read and after the last file is read, respectively. Of course, in a more general purpose language they are mostly useless.

  17. Mark Thomas said 1 day later:

    I’ve used BEGIN and END in perl and found them very useful. In a persistent environment (PPerl, PerPerl, etc), the BEGIN clause is executed once only, whereas the body of the code is executed N times, with variables persisting between invocations. This lets you use BEGIN to establish connection pools, parse config files, and other one-time-only things. Likewise tear-down can go into END.

  18. Roger said 1 day later:

    Wow, a blogger ignorant of both basic Ruby and Perl! While it’s fine posing questions about a language’s syntax while you are learning it, claiming that what you don’t understand is ‘obscure’, ‘ugly’ and flawed is a red flag of a true poser.

  19. 13az said 2 days later:

    Re: first comment....

      I think you misspelled your title:

      Obscure and Beautiful Perlisms in Ruby

    As most peeps have already mentioned, these aren’t all Perlisms (historically), and Perl does more things with these “Perlisms” than the context provided in your code.

    If Ruby is the bastard son of Perl then please treat its lineage with more respect ;-)

  20. Michael Bar-Sinai said 2 days later:

    Good post. Here’s my “favorite”. The ||= operator. Generally it means “populate a variable with a value unless the variable is already populated”.

    irb(main):029:0> a_var = 3
    => 3
    irb(main):030:0> a_var
    => 3
    irb(main):031:0> a_var ||="four"
    => 3
    irb(main):032:0> a_var
    => 3

    So far so good, right?

    irb(main):033:0> a_var = false
    => false
    irb(main):034:0> a_var
    => false
    irb(main):035:0> a_var ||=true
    => true
    irb(main):036:0> a_var
    => true

    Now imagine you have a view that uses a (hopefully) boolean instance variable set in the controller (we’re talking rails here), and you want to set it to true if it is not already set. So you use ||=. So you spend the next 3 hours banging your head at the keyboard.

  21. andreas said 3 days later:

    @Michael Bar-Sinai

    irb(main):038:0> f_var = false
    => false

    irb(main):039:0> t_var = :somethingOtherValueExceptNilAndFalse
    => :somethingOtherValueExceptNilAndFalse

    irb(main):040:0> t_var
    => :somethingOtherValueExceptNilAndFalse

    irb(main):041:0> f_var ||= true
    => true

    irb(main):042:0> t_var ||= true
    => :somethingOtherValueExceptNilAndFalse

    So be careful with the BooleanGang!

  22. andreas said 3 days later:

    @Michael Bar-Sinai

    The ||= Operator in
    a ||= b
    stands for:
    a = a || b
    Meaning: Set a = a if a is not nil or false, otherwise set a = b (the second argument of the logical OR operator, which uses short-circuit evaluation).
    The second argument b of the logical OR || operator will only be evaluated if the first argument a is false (or nil).

  23. bjc said 3 days later:

    Your accusation of the “flip-flop” is way off.

    “This program, when run with itself as an argument, prints out everything between BEGIN and END.”

    Which is exactly what the toggle code does.

    Your ruby code, on the other hand, prints out all the lines between 5 and 10, which in this example happens to be those lines between BEGIN and END.

    To call the programs equivalent would be a huge stretch of the word. And it also happens to show exactly why the “flip-flop” is useful.

  24. na said 3 days later:

    I’m a long-time Perl developer and have just recently switched to Ruby full-time.

    I love it. Great language. I’m surprised at how many Ruby developers are ignorant of how much Perl has been incorporated. It’s a very simple switch from Perl to Ruby.

    I think there are a lot of great features of Ruby and hopefully it will get a bit faster (ahem).

    I never found Perl obscure and still don’t. I think most of that criticism comes from people versed in other more traditional languages who never really learned Perl well.