A prettier truncate helper

written by daniel on July 10th, 2007 @ 12:23 AM

Tired of Rails’ truncate() method cutting off mid-word? Me neither, but my clients are.

Enter awesome_truncate:

# Awesome truncate
# First regex truncates to the length, plus the rest of that word, if any.
# Second regex removes any trailing whitespace or punctuation (except ;).
# Unlike the regular truncate method, this avoids the problem with cutting
# in the middle of an entity ex.: truncate("this &amp; that",9)  => "this &am..."
# though it will not be the exact length.
def awesome_truncate(text, length = 30, truncate_string = "...")
  return if text.nil?
  l = length - truncate_string.chars.length
  text.chars.length > length ? text[/\A.{#{l}}\w*\;?/m][/.*[\w\;]/m] + truncate_string : text
end

The comments say it all, as the code is pretty much voodoo. The bulk of it is based on the standard truncate, so the magic is in the regular expressions.

The resulting string will always end with a full word, not punctuation or whitespace or the middle of an entity. It means you don’t get the exact number of characters, but it is usually very close. I prefer this to truncating to a number of words; you get closer to the same number of characters each time.
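For anyone who finds the one-liner hard to read, here is a rough sketch of the same two regexes pulled apart into named steps. The name awesome_truncate_steps is made up for illustration, and it assumes a Ruby where String#length counts characters (1.9 or later), so it uses text.length instead of text.chars.length.

```ruby
# Illustrative only: `awesome_truncate_steps` is a hypothetical name,
# not part of Rails or of the helper above.
def awesome_truncate_steps(text, length = 30, truncate_string = "...")
  return if text.nil?
  return text if text.length <= length

  l = length - truncate_string.length

  # Step 1: take the first l characters, then finish the current word (\w*)
  # and keep one trailing ";" so an entity like "&amp;" stays whole.
  head = text[/\A.{#{l}}\w*;?/m]

  # Step 2: match greedily up to the last word character or ";",
  # which drops any trailing whitespace or other punctuation.
  trimmed = head[/.*[\w;]/m]

  trimmed + truncate_string
end

awesome_truncate_steps("this &amp; that", 9)   # => "this &amp;..."
awesome_truncate_steps("one two, three", 11)   # => "one two..."
```

Note that step 2 returns nil when the truncated part has no word characters at all, so this sketch shares that limitation with the original.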

I’ve seen top apps (Basecamp) get bit by truncating in the middle of an entity. This is my favorite bit.

I’m curious if anyone can think of a more efficient method.

Comments

  • kamal on 10 Jul 03:26

    There’s also a patch at http://dev.rubyonrails.org/ticket/8682 which truncates to a separator before the limit.

  • Chris Blow on 20 Oct 00:11

    Awesome indeed!

    I also wanted the last word to be tacked on (aka the way that OS X handles truncating) so I added this bit:

    last_word = text.split.last

    and then added it to the string output:

    truncate_string + last_word

    and it werks … beautifully.

  • Alan Miles on 02 Dec 18:28

    Nice Daniel. Used it already on my site – and had a little blog about it too.

  • Nuno Job on 14 Feb 11:25

    This is my approach, word based and not char based.

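    def truncate_words(text, length = 30, separator = ' ', truncate_string = '...')
      # return early; without "return" this line has no effect
      return ' ' if text.nil?
      # exclusive range, so at most `length` words are kept
      truncated_text = text.split[0...length].join(separator)
      if(truncated_text == text)
        text
      else
        truncated_text + ' ' + truncate_string
      end
    end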

    What do you think? It looks more efficient than that nasty regexp.

  • Daniel Morrison on 14 Feb 15:29

    Nuno,

    My suspicion is that the regexp would be quicker.

    I stuck with characters because I wanted it to be the same interface as the built-in truncate. Given that some words are much longer than others, I can think better in terms of characters than words.

    Of course that’s just me. Your method looks fine and is certainly easier to read.

  • Joe Van Dyk on 20 Mar 03:12

    Hm, it seems to fail if the text is all symbols (and no letters).

    Try that function with this text:

    (> ^.(> ^.(> ^.(> ^.(> ^.(> ^.(> ^.(> ^.(> ^.(> ^.(> ^.(> ^.(> ^.(> ^.(> ^.(> ^.(> ^.(> ^.(> ^.(> ^.(> ^.(> ^.(> ^.(> ^.(> ^.(> ^.(> ^.(> ^.(> ^.(> ^.(> ^.(> ^.(> ^.(> ^.(> ^.(> ^.(> ^.(> ^.(> ^.(> ^.(> ^.(> ^.(> ^.(> ^.(> ^.(> ^.(> ^.(> ^.(> ^.(> ^.(> ^.(> ^.(> ^.(> ^.(> ^.(> ^.(> ^.(> ^.(> ^.(> ^.(> ^.( O_o)>

    Seems to fail for me on my site.

  • Daniel Morrison on 20 Mar 18:59

    Joe:

    Yes, that’s because the regexp is specifically looking for word characters (\w). I guess I could refactor it to look for spaces instead, but I don’t have that immediate need.
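The space-based refactor mentioned in that last reply could look something like the sketch below. It is only one possible reading of the idea, not code from the post: space_truncate is a hypothetical name, and it assumes Ruby 1.9 or later. Instead of looking for word characters, it runs on to the next whitespace, so a string made entirely of symbols still truncates.

```ruby
# Hypothetical whitespace-based variant; `space_truncate` is an illustrative name.
def space_truncate(text, length = 30, truncate_string = "...")
  return if text.nil?
  return text if text.length <= length

  # Guard against a truncate_string longer than the requested length.
  l = [length - truncate_string.length, 0].max

  # Take l characters, then run on to the next whitespace (\S*), so the cut
  # never lands inside a run of non-space characters, letters or not.
  head = text[/\A.{#{l}}\S*/m]

  # Drop trailing whitespace only; trailing punctuation is kept.
  head.rstrip + truncate_string
end

space_truncate("(> ^." * 10, 12)        # => "(> ^.(> ^.(>..."
space_truncate("this &amp; that", 9)    # => "this &amp;..."
space_truncate("one two, three", 11)    # => "one two,..."
```

The trade-off is visible in the last call: entities still survive because they contain no spaces, but trailing punctuation is no longer trimmed the way it is in awesome_truncate.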
