Thursday, March 15, 2007

Using net/https with Hpricot to parse HTML

My Rails app, which fetches web pages and extracts images from them, started throwing OpenSSL::SSL::SSLError. I was using open-uri to fetch the pages.

I changed my code to use Net::HTTP (from net/https) to fetch pages over SSL. Here is sample code that fetches a web page over SSL and prints its contents to the terminal.


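require 'rubygems'
require 'hpricot'   # not used in this snippet; only needed for the parsing step
require 'uri'
require 'net/https'

url = "https://www.orkut.com/GLogin.aspx?done=http%3A%2F%2Fwww.orkut.com%2F"
uri = URI.parse(url)

# Create the connection object and switch on SSL before connecting.
http = Net::HTTP.new(uri.host, uri.port)
http.use_ssl = true

# request_uri keeps the query string; uri.path alone would drop "?done=...".
req = Net::HTTP::Get.new(uri.request_uri)

# Open the connection, send the request, and close the connection afterwards.
res = http.start do |conn|
  conn.request(req)
end

puts res.body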
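
One detail that is easy to miss when building the request by hand: URI#path returns only the path, while URI#request_uri returns the path plus the query string. The sketch below shows the difference on the same URL and wraps the setup in a small helper. The helper name build_https_get is my own invention for illustration and is not part of Net::HTTP. Nothing here touches the network, since Net::HTTP.new only creates the object and no connection is opened until start or request is called.

```ruby
require 'uri'
require 'net/https'

# Hypothetical helper (name is an assumption, not a library method):
# builds a connection object and a GET request for the given URL.
# No connection is opened here.
def build_https_get(url)
  uri = URI.parse(url)

  http = Net::HTTP.new(uri.host, uri.port)
  # Enable SSL only when the URL scheme asks for it.
  http.use_ssl = (uri.scheme == 'https')

  # request_uri is the path plus the query string.
  req = Net::HTTP::Get.new(uri.request_uri)

  [http, req]
end

url = "https://www.orkut.com/GLogin.aspx?done=http%3A%2F%2Fwww.orkut.com%2F"
uri = URI.parse(url)

puts uri.path         # prints /GLogin.aspx
puts uri.request_uri  # prints /GLogin.aspx?done=http%3A%2F%2Fwww.orkut.com%2F

http, req = build_https_get(url)
puts http.use_ssl?    # prints true
puts req.path         # same string as uri.request_uri
```

To actually fetch the page, pass the pair on as before: http.start { |conn| conn.request(req) }.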
