Description: A swift, liberal HTML parser with a fantastic library

Home

A Fast, Enjoyable HTML Parser for Ruby

Hpricot is a very flexible HTML parser, based on Tanaka Akira’s HTree and John Resig’s jQuery, but with the scanner recoded in C. I’ve borrowed (what I believe to be) the best ideas from these wares to make Hpricot heaps of fun to use.

 # load the Family Guy home page (needs hpricot and open-uri)
 require "hpricot"
 require "open-uri"
 # on Ruby 3.0+, open-uri no longer patches open(); call URI.open there
 doc = Hpricot(open("http://www.fox.com/familyguy/index.htm"))
 # change the CSS class on the ul.site-nav list
 (doc/"ul.site-nav").set("class", "new-site-nav")
 # remove the header
 (doc/"#header").remove
 # print the altered HTML
 puts doc

A Proper Start

Last edited by andhapp, Sun Sep 06 17:00:23 -0700 2009