A prettier truncate helper

Tired of Rails’ truncate() method cutting off mid-word? Me neither, but my clients are.

Enter awesome_truncate:

# Awesome truncate
# First regex truncates to the length, plus the rest of that word, if any.
# Second regex removes any trailing whitespace or punctuation (except ;).
# Unlike the regular truncate method, this avoids the problem with cutting
# in the middle of an entity, e.g.: truncate("this &amp; that", 11) => "this &am..."
# though it will not be the exact length.
def awesome_truncate(text, length = 30, truncate_string = "...")
  return if text.nil?
  l = length - truncate_string.chars.length
  text.chars.length > length ? text[/\A.{#{l}}\w*\;?/m][/.*[\w\;]/m] + truncate_string : text
end

The comments say it all, as the code is pretty much voodoo. The bulk of it is based on the standard truncate, so the magic is in the regular expressions.

The resulting string will always end with a full word, not punctuation or whitespace or the middle of an entity. It means you don’t get the exact number of characters, but usually something very close. I prefer this to truncating to a number of words; you get closer to the same number of characters each time.
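
To make the two regex passes easier to follow, here is a rough sketch that runs them separately. The helper name truncate_steps is made up for illustration and is not part of the post's code; it assumes plain Ruby strings and the default "..." suffix.

```ruby
# Hypothetical helper (not from the post): returns the result of each pass.
def truncate_steps(text, length = 30, truncate_string = "...")
  l = length - truncate_string.length
  # Pass 1: take l characters, then finish the current word (and a trailing ;).
  first = text[/\A.{#{l}}\w*\;?/m]
  # Pass 2: greedy match up to the last word character or ;,
  # which drops trailing whitespace and punctuation.
  second = first[/.*[\w\;]/m]
  [first, second]
end

p truncate_steps("The quick brown fox jumps over the lazy dog", 20)
# => ["The quick brown fox", "The quick brown fox"]
p truncate_steps("Hello, world! This is fine.", 16)
# => ["Hello, world!", "Hello, world"]
```

In the first call the cut lands inside "fox", so the word is completed and the final string runs a couple of characters over the requested length. In the second, the cut lands on "!" and the second pass trims it.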

I’ve seen top apps (Basecamp) get bit by truncating in the middle of an entity. This is my favorite bit.
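
A small comparison shows the entity case. The naive_truncate method below is only a stand-in for a plain character cut, assumed to behave like the built-in helper of the time; it is not Rails' actual source.

```ruby
def awesome_truncate(text, length = 30, truncate_string = "...")
  return if text.nil?
  l = length - truncate_string.length
  text.length > length ? text[/\A.{#{l}}\w*\;?/m][/.*[\w\;]/m] + truncate_string : text
end

# Assumed stand-in for a plain character cut (illustration only).
def naive_truncate(text, length = 30, truncate_string = "...")
  return if text.nil?
  l = length - truncate_string.length
  text.length > length ? text[0...l] + truncate_string : text
end

escaped = "this &amp; that"
puts naive_truncate(escaped, 11)    # this &am...
puts awesome_truncate(escaped, 11)  # this &amp;...
```

The plain cut leaves a broken "&am" in the markup, while the regex version finishes the entity at the cost of a slightly longer string.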

I’m curious if anyone can think of a more efficient method.

7 Comments

kamal — July 10, 2007

There’s also a patch at http://dev.rubyonrails.org/ticket/8682 which truncates to a separator before the limit.
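
For readers who want a feel for the separator idea, here is a minimal sketch of cutting at the last separator before the limit. It is not the code from that ticket; the method name and parameters are assumptions, and it expects length to be larger than the suffix.

```ruby
# Hypothetical sketch (not the patch itself): cut at the last separator
# found at or before the character limit.
def truncate_at_separator(text, length = 30, truncate_string = "...", separator = " ")
  return if text.nil?
  return text if text.length <= length
  l = length - truncate_string.length
  # String#rindex searches backward from position l; fall back to a hard cut.
  cut = text.rindex(separator, l) || l
  text[0...cut] + truncate_string
end

puts truncate_at_separator("The quick brown fox jumps over the lazy dog", 20)
# The quick brown...
```

Unlike awesome_truncate, this never exceeds the requested length, but it can come out noticeably shorter when the last word before the limit is long.
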
Chris Blow — October 20, 2007

Awesome indeed!

I also wanted the last word to be tacked on (aka the way that OS X handles truncating) so I added this bit:

last_word = text.split.last

and then added it to the string output:

truncate_string + last_word

and it werks … beautifully.
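
One possible reading of that change, as a sketch: the method name is invented here, and exactly where last_word was wired into the original helper is a guess based on the two snippets above.

```ruby
# Hypothetical reconstruction: awesome_truncate with the final word appended.
def awesome_truncate_with_tail(text, length = 30, truncate_string = "...")
  return if text.nil?
  return text if text.length <= length
  l = length - truncate_string.length
  last_word = text.split.last
  text[/\A.{#{l}}\w*\;?/m][/.*[\w\;]/m] + truncate_string + last_word
end

puts awesome_truncate_with_tail("The quick brown fox jumps over the lazy dog", 20)
# The quick brown fox...dog
```
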
Alan Miles — December 02, 2007

Nice, Daniel. Used it already on my site – and wrote a little blog post about it too.
Nuno Job — February 14, 2008

This is my approach, word based and not char based.

def truncate_words(text, length = 30, separator = ' ', truncate_string = '...')
  # Return a single space for nil input.
  return ' ' if text.nil?
  # Exclusive range, so at most `length` words are kept.
  truncated_text = text.split[0...length].join(separator)
  if truncated_text == text
    text
  else
    truncated_text + ' ' + truncate_string
  end
end

What do you think? It looks more efficient than that nasty regexp.
Daniel Morrison — February 14, 2008

Nuno,

My suspicion is that the regexp would be quicker.

I stuck with characters because I wanted it to be the same interface as the built-in truncate. Given that some words are much longer than others, I can think better in terms of characters than words.

Of course that’s just me. Your method looks fine and is certainly easier to read.
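
Neither guess about speed is measured in this thread. A rough way to check is Ruby's standard Benchmark module; the sample text and iteration count below are arbitrary choices, and results will vary by Ruby version and input.

```ruby
require "benchmark"

def awesome_truncate(text, length = 30, truncate_string = "...")
  return if text.nil?
  l = length - truncate_string.length
  text.length > length ? text[/\A.{#{l}}\w*\;?/m][/.*[\w\;]/m] + truncate_string : text
end

def truncate_words(text, length = 30, separator = " ", truncate_string = "...")
  return " " if text.nil?
  truncated = text.split[0...length].join(separator)
  truncated == text ? text : truncated + " " + truncate_string
end

# Arbitrary sample input and iteration count (assumptions for illustration).
sample = "The quick brown fox jumps over the lazy dog. " * 20
n = 10_000

Benchmark.bm(18) do |x|
  x.report("awesome_truncate:") { n.times { awesome_truncate(sample, 30) } }
  x.report("truncate_words:")   { n.times { truncate_words(sample, 30) } }
end
```

Note that the two methods measure length differently (characters versus words), so the timings compare the approaches, not identical outputs.
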
Joe Van Dyk — March 20, 2008

Hm, it seems to fail if the text is all symbols (and no letters).

Try that function with this text:
(> ^.(> ^.(> ^.(> ^.(> ^.(> ^.(> ^.(> ^.(> ^.(> ^.(> ^.(> ^.(> ^.(> ^.(> ^.(> ^.(> ^.(> ^.(> ^.(> ^.(> ^.(> ^.(> ^.(> ^.(> ^.(> ^.(> ^.(> ^.(> ^.(> ^.(> ^.(> ^.(> ^.(> ^.(> ^.(> ^.(> ^.(> ^.(> ^.(> ^.(> ^.(> ^.(> ^.(> ^.(> ^.(> ^.(> ^.(> ^.(> ^.(> ^.(> ^.(> ^.(> ^.(> ^.(> ^.(> ^.(> ^.(> ^.(> ^.(> ^.(> ^.( O_o)>
Seems to fail for me on my site.
Daniel Morrison — March 20, 2008

Joe:

Yes, that’s because the regexp is specifically looking for word characters (\w). I guess I could refactor it to look for spaces instead, but I don’t have that immediate need.
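
A sketch of what a space-based variant might look like follows. It was not written or tested by the post's author; space_truncate is a hypothetical name, and the sample string only approximates the text from Joe's comment.

```ruby
# Why the original fails: with no word character or ; in the truncated text,
# the second regex finds nothing, and nil + "..." raises NoMethodError.
p "(> ^.(> ^."[/.*[\w\;]/m]  # => nil

# Hypothetical space-based variant: finish the current run of
# non-whitespace characters, then strip trailing whitespace.
def space_truncate(text, length = 30, truncate_string = "...")
  return if text.nil?
  return text if text.length <= length
  l = [length - truncate_string.length, 0].max
  text[/\A.{#{l}}\S*/m].rstrip + truncate_string
end

symbols = "(> ^." * 20 + "( O_o)>"
puts space_truncate(symbols)  # (> ^.(> ^.(> ^.(> ^.(> ^.(>...
```

The trade-off is that trailing punctuation is no longer trimmed, since anything that is not whitespace counts as part of the word.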