We recently needed to show a truncated version of existing HTML content. Although there are several issues [1] when dealing with HTML content, our specific concern was maintaining the integrity of the HTML. Some quick googling led to a nice helper written by Henrik Nyh last year. We tweaked the original a bit to append the ellipsis within the tag at the truncation point and to truncate at a word (or tag) boundary. Here it is, enjoy.
```ruby
# By Henrik Nyh <http://henrik.nyh.se> 2008-01-30.
# Free to modify and redistribute with credit.
# Word truncation and fixes by Les Hill <http://blog.leshill.org> 2009-06-02
#

require "rubygems"
require "hpricot"

module TextHelper

  # Like the Rails _truncate_ helper but doesn't break HTML tags or entities.
  def truncate_html(text, max_length = 30, ellipsis = "...")
    return if text.nil?
    doc = Hpricot(text.to_s)
    doc.inner_text.chars.length > max_length ? doc.truncate(max_length, ellipsis).inner_html : text.to_s
  end

  def self.truncate_at_space(text, max_length, ellipsis = '...')
    l = [max_length - ellipsis.length, 0].max
    stop = text.rindex(' ', l) || 0
    (text.length > max_length ? text[0...stop] + ellipsis : text).to_s
  end
end

module HpricotTruncator
  module NodeWithChildren
    def truncate(max_length, ellipsis)
      return self if inner_text.chars.length <= max_length
      truncated_node = dup
      truncated_node.name = name
      truncated_node.raw_attributes = raw_attributes
      truncated_node.children = []
      each_child do |node|
        break if max_length <= 0
        node_length = node.inner_text.chars.length
        truncated_node.children << node.truncate(max_length, ellipsis)
        max_length = max_length - node_length
      end
      truncated_node
    end
  end

  module TextNode
    def truncate(max_length, ellipsis)
      self.content = TextHelper.truncate_at_space(content, max_length, ellipsis)
      self
    end
  end

  module IgnoredTag
    def truncate(max_length, ellipsis)
      self
    end
  end
end

Hpricot::Doc.send(:include, HpricotTruncator::NodeWithChildren)
Hpricot::Elem.send(:include, HpricotTruncator::NodeWithChildren)
Hpricot::Text.send(:include, HpricotTruncator::TextNode)
Hpricot::BogusETag.send(:include, HpricotTruncator::IgnoredTag)
Hpricot::Comment.send(:include, HpricotTruncator::IgnoredTag)
```
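To see the word-boundary step on its own, here is a minimal standalone sketch that mirrors the logic of `TextHelper.truncate_at_space` from the listing above. It is a top-level method with no Hpricot dependency, written purely for illustration; the sample strings are made up.

```ruby
# Standalone sketch of the word-boundary truncation step.
# Mirrors the logic of TextHelper.truncate_at_space, minus the Hpricot setup.
def truncate_at_space(text, max_length, ellipsis = "...")
  # Short enough already: return the text unchanged.
  return text.to_s if text.length <= max_length

  # Leave room for the ellipsis, never going below zero.
  limit = [max_length - ellipsis.length, 0].max

  # Last space at or before the limit; fall back to 0 when there is none.
  stop = text.rindex(" ", limit) || 0

  text[0...stop] + ellipsis
end

puts truncate_at_space("The quick brown fox jumps", 15)  # => The quick...
puts truncate_at_space("short", 15)                      # => short
puts truncate_at_space("Supercalifragilistic", 10)       # => ...
```

Note the last case: when there is no space before the cut-off, the whole text node collapses to just the ellipsis. That may or may not be what you want for content with very long words or URLs.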
[1] For example: preventing XSS attacks and maintaining coherent styling.