HpricotScrub
Hpricot Scrub is a wrapper around Hpricot that adds methods to scrub HTML tags from a document.
To Install
    gem install hpricot_scrub

Now you can use the following to remove all tags from an HTML document:

    require 'rubygems'
    require 'open-uri'   # lets open() fetch a URL
    require 'hpricot_scrub'

    doc = Hpricot(open('http://slashdot.org/').read)
    text = doc.scrub
Scrub the doc based on a config hash (see the sample config in /examples/config.yml):

    doc.scrub(hash)

Strip all links (a tags), leaving the text inside intact:
    (doc/:a).strip

The gem version also has a couple of new convenience methods on String:

    String#scrub(config={})
    String#scrub!(config={})
    >> str = '<a href="http://example.com/">example.com</a>'
    => "<a href=\"http://example.com/\">example.com</a>"
    >> str.scrub
    => "example.com"
    >> str
    => "<a href=\"http://example.com/\">example.com</a>"
    >> str.scrub!
    => "example.com"
    >> str
    => "example.com"
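
The difference between the plain and bang variants can be illustrated without the gem. The sketch below is only a stand-in: `NaiveScrub`, `strip_tags`, and `strip_tags!` are hypothetical names, and the regex is a crude approximation rather than the Hpricot-based parsing the gem performs. The helpers are named differently on purpose, since Ruby 2.1 and later ship a core `String#scrub` that replaces invalid byte sequences.

```ruby
# Hypothetical stand-in for illustration only; this is NOT hpricot_scrub's code.
module NaiveScrub
  # Crude tag matcher: anything between '<' and the next '>'.
  TAG = /<[^>]*>/

  # Non-destructive: returns a new string and leaves the argument untouched.
  def self.strip_tags(html)
    html.gsub(TAG, '')
  end

  # Destructive: modifies the string in place and returns it.
  def self.strip_tags!(html)
    html.gsub!(TAG, '') # gsub! returns nil when nothing matched
    html
  end
end

str = '<a href="http://example.com/">example.com</a>'.dup

plain = NaiveScrub.strip_tags(str)
puts plain # prints "example.com"; str still holds the original markup

NaiveScrub.strip_tags!(str)
puts str   # prints "example.com"; str itself was changed
```

A regex like this will mishandle comments, scripts, and attribute values containing `>`, which is one reason to prefer a parser-backed scrubber for real documents.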







