Blog | Phodal - A Growth Engineer

Parsing RSS with Ruby: WordPress WeChat Development Notes

2014-06-17T05:05:17+00:00

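When the database and server are not up to the job, the only option is to pull services out of WordPress piece by piece.

## WordPress RSS images

This uses the `rss-image-feed` plugin:

> The RSS Image Feed adds the first image of a post to your feeds, even in firefox and even if you only display the excerpt.

In other words, RSS Image Feed adds the first image of each post to the feed. So install it; the result can be seen at [http://www.xuntayizhan.com/feed/](http://www.xuntayizhan.com/feed/)

## Parsing RSS with Ruby

Ruby's RSS library, combined with Nokogiri and open-uri, is enough to build the results we need to return to the WeChat official account.

```ruby
require 'rss'
require 'open-uri'
require 'nokogiri'

class Get_RSS
  def get_new
    result = []
    url = 'http://www.xuntayizhan.com/feed'
    # URI.open comes from open-uri; a bare Kernel#open no longer accepts URLs on Ruby 3.0+
    URI.open(url) do |rss|
      feed = RSS::Parser.parse(rss)
      feed.items.each do |item|
        # Parse the description once and reuse the document
        doc = Nokogiri::HTML(item.description)
        # Fall back to a default picture when the post has no image
        image_req = 'http://www.xuntayizhan.com/xt.jpg'
        img = doc.at_css('img')
        image_req = img['src'] if img
        result << {
          :title => item.title,
          :description => doc.at_css('p'),
          :picture_url => image_req,
          :url => item.link
        }
      end
    end
    # Only the first 7 entries are returned
    result.take(7)
  end
end
```

## Other notes

The problem is not completely solved, though: we can download the feed to a local directory instead of fetching it again on every request. So download the feed and change the URL to the local path.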
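The local-copy idea can be sketched roughly as follows. This is only a minimal sketch under stated assumptions: `cached_feed`, the `cache/feed.xml` path, and the 10-minute `max_age` are hypothetical names and values chosen for illustration, not part of the original setup.

```ruby
require 'fileutils'

# Hypothetical helper: return the feed XML from a local cache file.
# The block is called to download the feed only when the cache file
# is missing or older than max_age seconds.
def cached_feed(cache_path, max_age: 600)
  fresh = File.exist?(cache_path) &&
          (Time.now - File.mtime(cache_path)) < max_age
  unless fresh
    xml = yield
    FileUtils.mkdir_p(File.dirname(cache_path))
    File.write(cache_path, xml)
  end
  File.read(cache_path)
end

# Possible usage (assumes open-uri and rss are loaded):
#   xml  = cached_feed('cache/feed.xml') { URI.open('http://www.xuntayizhan.com/feed').read }
#   feed = RSS::Parser.parse(xml)
```

Taking the download step as a block keeps the helper independent of how the feed is fetched, so a cron job or a different HTTP client could fill the same cache file.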

Ruby Nokogiri: Parsing HTML with Ruby

2014-03-29T02:39:33+00:00

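This post uses the nokogiri library to read content out of a piece of HTML. Interestingly, once we add a few specific features, the same code can act as a crawler that searches for material all over the web.

## Ruby Nokogiri ##

Install nokogiri the usual way:

```
gem install nokogiri
```

## Parsing HTML with Ruby ##

We want to parse the img tag out of this piece of HTML: `phodal [caption id="attachment_23" align="alignnone" width="240"]`

So:

```ruby
require 'nokogiri'

doc = Nokogiri::HTML('phodal [caption id="attachment_23" align="alignnone" width="240"]')
p doc.css('img').first['src']
```

That is a simple example. Suppose we also want to grab the content we need from a web page: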


```ruby
require 'nokogiri'
require 'open-uri'

# URI.open comes from open-uri; a bare Kernel#open no longer accepts URLs on Ruby 3.0+
page = Nokogiri::HTML(URI.open("http://www.phodal.com/"))
puts page.css('img').first['src']
```

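This prints the src of the first img tag on this site, in other words the source of the image. The code below parses the response from the [WordPress WeChat][0] setup described in the previous post.

[0]: http://www.phodal.com/blog/wordress-wechat-to-build-a-blog-wechat/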


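```ruby
require 'json'
require 'net/http'
require 'nokogiri'

post_id = 1
# Interpolate the id; concatenating an Integer onto a String raises TypeError
path = "/?wpapi=get_posts&dev=1&comment=1&content=1&id=#{post_id}"
image_response = Net::HTTP.get_response('localhost', path)
# The API returns JSON; take the HTML content of the first post
image_response_content = JSON.parse(image_response.body)['posts'][0]['content']
# Extract the src of the first image in that content
image_req = Nokogiri::HTML(image_response_content).css('img').first['src']
```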
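The lookup above raises an error if the API returns no posts or a body that is not JSON. A more defensive version might look like the sketch below. `first_post_content` is a hypothetical helper, and the sample bodies are made up; they only assume the `posts[0]['content']` shape used above.

```ruby
require 'json'

# Hypothetical helper: pull the HTML content of the first post out of the
# API's JSON body, returning nil instead of raising when it is missing.
def first_post_content(body)
  data = JSON.parse(body)
  return nil unless data.is_a?(Hash)
  posts = data['posts']
  return nil unless posts.is_a?(Array) && posts.first.is_a?(Hash)
  posts.first['content']
rescue JSON::ParserError
  # The body was not valid JSON (for example an HTML error page)
  nil
end

# Made-up sample bodies, shaped like the response used above
with_post    = '{"posts":[{"content":"<p>hello</p>"}]}'
without_post = '{"posts":[]}'

p first_post_content(with_post)
p first_post_content(without_post)
p first_post_content('not json')
```

The caller can then check for nil before handing the content to Nokogiri, and fall back to a default picture in the same way as the RSS example.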