ruby - Best way to DRY up code using procs and blocks and/or dynamic methods
I am writing a way to parse websites. Each "scraper" has its own way of gathering information, but there is plenty of common functionality between the two methods.
Differences:
- One scraper uses Nokogiri to open a page and select elements via CSS selectors
- The other scraper uses an RSS feed to gather information
Similarities:
- Each scraper creates an "Event" object that has the following attributes:
  - title
  - date
  - description
In the Nokogiri scraper, I do something like this:

    event_selector = page.css(".div-class")
    event_selector.each_with_index do |event, index|
      date = Date.parse(event.text)
      # code I want to share
    end
For the RSS scraper, I do something like this:

    open(url) do |rss|
      feed = RSS::Parser.parse(rss)
      feed.items.each do |event|
        description = Sanitize.fragment(event.description)
        date = description[/\d{2}-\d{2}-20\d{2}/]
        date = Date.strptime(date, '%m-%d-%Y')
        # code I want to share
      end
    end
^^ The date is grabbed from the description via a regex, then converted to a Date object via the .strptime method.
As you can see, each scraper uses a different method call/way to find the date. How can I abstract this information into a class?
I was thinking of something like this:
    class Scrape
      attr_accessor :scrape_url, :title, :description, :date, :url

      def initialize(options = {})
      end

      def find_date(&block)
        # process block??
      end
    end
and in each of the scraper methods do something like

    scrape = Scrape.new
    date_proc = Proc.new { Date.parse(event.text) }
    scrape.find_date(&date_proc)
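Is this the right way to go about this problem? In short, I want to keep the common functionality of the two website parsers in one place and pass the scraper-specific code into an instance method of a "Scrape" class. I would appreciate any tips on how to tackle this scenario.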
Edit: Maybe it would make more sense if I said that I want to find the "date" of an event, but the way I find it (the behavior, or the specific code to run) is different.
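For illustration, here is a minimal sketch of the block-passing idea described in the question. It assumes a simplified Scrape class, and the sample strings stand in for `event.text` and the sanitized RSS description; none of this is from a real scraper.

```ruby
require 'date'

# Hypothetical sketch: Scrape holds the shared attributes, and each scraper
# supplies its own date-finding logic as a block.
class Scrape
  attr_accessor :title, :description, :date

  # Runs the given block and stores whatever it returns as the date.
  def find_date
    raise ArgumentError, 'a block is required' unless block_given?
    self.date = yield
  end
end

# Nokogiri-style scraper: the whole text is a parseable date.
html_text = '2014-03-15' # stand-in for event.text
html_scrape = Scrape.new
html_scrape.find_date { Date.parse(html_text) }

# RSS-style scraper: the date is embedded in the description.
rss_description = 'Spring concert on 03-15-2014 at the town hall' # stand-in
rss_scrape = Scrape.new
rss_scrape.find_date do
  raw = rss_description[/\d{2}-\d{2}-20\d{2}/]
  Date.strptime(raw, '%m-%d-%Y')
end

# An existing Proc is passed with &, not as a plain argument.
date_proc = Proc.new { Date.parse(html_text) }
html_scrape.find_date(&date_proc)
```

With this shape, the code shared by both scrapers can live in Scrape, while only the block differs per source.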
You could use an event builder. Something like this:

    class Event::Builder
      def date(raw)
        @date = Date.strptime(raw, '%m-%d-%Y')
      end

      # ... more setters (title, description) ...

      def build
        Event.new(date: @date, ... more arguments ..)
      end
    end
and then, inside the scraper:

    open(url) do |rss|
      builder = Event::Builder.new
      feed = RSS::Parser.parse(rss)
      feed.items.each do |event|
        description = Sanitize.fragment(event.description)
        date = description[/\d{2}-\d{2}-20\d{2}/]
        builder.date(date)
        # ... set other attributes ...
        event = builder.build
        # do something with the event ...
      end
    end
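To make the builder idea concrete, here is a self-contained sketch that both scrapers could share. The Event struct, the optional `format` argument, the chaining via `self`, and the sample strings are all assumptions added for illustration; the answer above does not specify them.

```ruby
require 'date'

# Hypothetical Event; its real definition is not shown in the post.
Event = Struct.new(:title, :date, :description, keyword_init: true)

class Event::Builder
  # Accepts a Date, or a raw string plus an optional strptime format.
  def date(raw, format = nil)
    @date = if raw.is_a?(Date)
              raw
            elsif format
              Date.strptime(raw, format)
            else
              Date.parse(raw)
            end
    self # returning self allows the setters to be chained
  end

  def title(text)
    @title = text.strip
    self
  end

  def description(text)
    @description = text.strip
    self
  end

  def build
    Event.new(title: @title, date: @date, description: @description)
  end
end

# RSS-style input: the date uses a month-day-year format.
rss_event = Event::Builder.new
  .title(' Spring concert ')
  .description('Held on 03-15-2014')
  .date('03-15-2014', '%m-%d-%Y')
  .build

# Nokogiri-style input: the text can go straight to Date.parse.
html_event = Event::Builder.new
  .title('Spring concert')
  .description('Held at the town hall')
  .date('2014-03-15')
  .build
```

Creating a fresh builder per item, as in this sketch, avoids attributes from a previous event carrying over into the next one.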