Not very elegant, but it’s my first script in Ruby: bits and pieces of code mashed into a beautiful chaos of workability. It runs a test for google.com on webpagetest.org and reports the load time. I’m using Selenium to drive the browser, Nokogiri to parse the results page and Action Mailer to send the result.

Next project is to translate this script into Python. Hmm… maybe do this in Python and parse the data from XML instead? That would be cool. I’ll mull it over…
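
On the XML idea, here is a minimal sketch of what pulling one value out of an XML document could look like, staying in Ruby for now and using REXML, which ships with Ruby. The sample XML is made up for illustration: the element names `response`, `data`, `testUrl` and `loadTime` are assumptions, not webpagetest.org's actual format, and `extract_load_time` is a hypothetical helper.

```ruby
require 'rexml/document'

# Made-up sample data; the real result format may look quite different.
SAMPLE_XML = <<~XML
  <response>
    <data>
      <testUrl>https://google.com</testUrl>
      <loadTime>1234</loadTime>
    </data>
  </response>
XML

# Hypothetical helper: returns the first <loadTime> value as an Integer,
# or nil when the document has no such element.
def extract_load_time(xml)
  doc = REXML::Document.new(xml)
  node = REXML::XPath.first(doc, '//loadTime')
  node && Integer(node.text)
end

puts extract_load_time(SAMPLE_XML)
```

Compared with scraping the rendered page, this would not depend on element ids copied from the browser inspector, only on the structure of the XML itself.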

webscraper.rb

require 'rubygems'
require 'nokogiri'
require 'open-uri'
require 'selenium-webdriver'
require_relative 'SimpleMailer'

@driver = Selenium::WebDriver.for :firefox
@driver.navigate.to 'https://www.webpagetest.org'
wait = Selenium::WebDriver::Wait.new(:timeout => 5)

input = wait.until {
  element = @driver.find_element(:id, 'url')
  element if element.displayed?
}
input.clear # reuse the element the wait already returned
input.send_keys('google.com')
@driver.find_element(:id, 'start_test-container').click

wait = Selenium::WebDriver::Wait.new(:timeout => 450) # seconds
wait.until {
  @driver.find_element(:id, 'test_results-container')
}

html_source = @driver.page_source
# Check that the id exists
# starttest = wait.until {
#   element = @driver.find_element(:id, 'test_results-container')
#   element if element.displayed?
# }
# puts "Test Passed: ID found" if starttest.displayed?
# @driver.quit
# Then you can use Nokogiri to parse the html:

#####################################################################
# doc = Nokogiri::HTML(html_source)

# PREVIOUS WORKING SCRIPT
# doc.css('tr').each do |el|
#   puts el.text
#   email = SimpleMailer.simple_message('mygmailaddress', 'Best I can do', el.text)
#   email.deliver
# end
# @driver.quit
######################################################################
# copied xpath from chrome inspector
doc = Nokogiri::HTML.parse(html_source)
# xpath accepts several expressions and returns all matches as one NodeSet
content = doc.xpath(
  '//*[@id="header_data"]/h2/span',
  '//*[@id="LoadTime"]',
  '//*[@id="tableResults"]/tbody/tr[2]/th[2]'
).to_a.join(' ')

puts content
email = SimpleMailer.simple_message('recipientemailaddress',
                                    '1st script parsed and spaced',
                                    content)
email.deliver
@driver.quit
#########################################################
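
The two `wait.until` calls in the script both rely on the same idea: run a block over and over until it returns something truthy, and give up after a timeout. As a rough illustration of that polling pattern, here is a plain-Ruby sketch. It is not Selenium's implementation; `wait_until`, `NotReadyYet` and `WaitTimeout` are hypothetical names invented for this example.

```ruby
# Hypothetical error classes for this sketch only.
class NotReadyYet < StandardError; end
class WaitTimeout < StandardError; end

# Calls the block repeatedly until it returns a truthy value, which is then
# returned. NotReadyYet raised by the block is treated as "try again".
# Raises WaitTimeout once `timeout` seconds have passed without success.
def wait_until(timeout:, interval: 0.2)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  loop do
    begin
      result = yield
      return result if result
    rescue NotReadyYet
      # Not available yet; fall through and retry after a short sleep.
    end
    if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
      raise WaitTimeout, "timed out after #{timeout} seconds"
    end
    sleep interval
  end
end

# Usage: the block only succeeds on its third attempt.
attempts = 0
value = wait_until(timeout: 2, interval: 0.01) do
  attempts += 1
  raise NotReadyYet if attempts < 3
  "ready after #{attempts} attempts"
end
puts value
```

This is why the script can use a short timeout for the input field and a much longer one for the results: the wait returns as soon as the block succeeds, so a generous timeout only costs time when the element never appears.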

SimpleMailer.rb

require 'action_mailer'

ActionMailer::Base.smtp_settings = {
    :address => 'smtp.gmail.com',
    :port => 587,
    :domain => 'gmail.com',
    :user_name => 'mygmailaddress',
    :password => 'myapppassword',
    :authentication => :plain,
}

class SimpleMailer < ActionMailer::Base
  def simple_message(recipient, subject, message)
    mail(:from => 'mygmailaddress',
         :to => recipient,
         :subject => subject,
         :body => message)
  end
end

My first script: hacked webscraper
