Selenium WebDriver API

In this article, I summarize some Selenium WebDriver API commands that are useful when coding a web spider.

locating element

Selenium WebDriver offers several ways to locate elements:

  • id
  • name
  • class name
  • tag name
  • link text
  • partial link text
  • xpath
  • css selector

These correspond to the following methods:

  • find_element_by_id()
  • find_element_by_name()
  • find_element_by_class_name()
  • find_element_by_tag_name()
  • find_element_by_link_text()
  • find_element_by_partial_link_text()
  • find_element_by_xpath()
  • find_element_by_css_selector()
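
The helpers listed above are thin wrappers around the generic find_element(by, value) call, and Selenium 4.3 and later removed the helpers and kept only the generic form. Here is a minimal sketch of how the eight strategies map onto that call. The names STRATEGIES and locate are made up for illustration, and the strategy strings are meant to mirror the constants on selenium.webdriver.common.by.By (By.ID is "id", and so on).

```python
# Locator strategy strings; these mirror the constants on
# selenium.webdriver.common.by.By (By.ID == "id", and so on).
STRATEGIES = {
    "id": "id",
    "name": "name",
    "class_name": "class name",
    "tag_name": "tag name",
    "link_text": "link text",
    "partial_link_text": "partial link text",
    "xpath": "xpath",
    "css_selector": "css selector",
}


def locate(driver, strategy, value):
    """Find the first element matching `value` under the given strategy.

    `strategy` is the suffix of the old helper name, e.g. "id" for
    find_element_by_id or "css_selector" for find_element_by_css_selector.
    """
    if strategy not in STRATEGIES:
        raise ValueError("unknown locator strategy: %r" % strategy)
    return driver.find_element(STRATEGIES[strategy], value)
```

With this helper, locate(driver, "id", "username") does the same job as driver.find_element_by_id("username").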

Once an element has been located, these are the four most commonly used operations:

  • clear: clears the content, if possible
  • send_keys: simulates keyboard input. To type Chinese characters, we usually declare the "utf-8" encoding at the beginning of the file and pass a unicode string, e.g. send_keys(u"中文内容")
  • click: simulates a mouse click; it can click any text/image link, button, option …
  • submit: submits a form, but the element should be a form

For example, the following is a typical sequence for logging a user in to a website:


# clear each field before typing so that no stale text remains
driver.find_element_by_id("username").clear()
driver.find_element_by_id("username").send_keys("username")
driver.find_element_by_id("password").clear()
driver.find_element_by_id("password").send_keys("password")
# click the login button
driver.find_element_by_id("loginBtn").click()

Please note that the find_elements_by… methods return a list of all the elements that match, not just the first one. We can then loop over that list to select the elements we want.
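
As a rough sketch of that loop, the helper below collects the link target of every anchor whose text contains a keyword. The function name link_targets and the keyword filter are my own choices for illustration; it uses the generic find_elements(by, value) call, where "tag name" is the string behind By.TAG_NAME.

```python
def link_targets(driver, keyword):
    """Return the href of every <a> element whose text contains `keyword`."""
    targets = []
    # find_elements returns a list, which is empty when nothing matches.
    for link in driver.find_elements("tag name", "a"):
        if keyword in link.text:
            targets.append(link.get_attribute("href"))
    return targets
```

Because an empty list is returned when nothing matches, the loop simply does nothing in that case and no exception handling is needed.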

keyboard operation

The most commonly used keyboard keys are:


from selenium.webdriver.common.keys import Keys

send_keys(Keys.BACK_SPACE)    # delete key (BackSpace)
send_keys(Keys.SPACE)         # space key (Space)
send_keys(Keys.TAB)           # tab key (Tab)
send_keys(Keys.ESCAPE)        # escape key (Esc)
send_keys(Keys.ENTER)         # enter key (Enter)
send_keys(Keys.CONTROL, 'a')  # select all (Ctrl+A)
send_keys(Keys.CONTROL, 'c')  # copy (Ctrl+C)
send_keys(Keys.CONTROL, 'x')  # cut (Ctrl+X)
send_keys(Keys.CONTROL, 'v')  # paste (Ctrl+V)
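
To show how these keys combine in practice, here is a small sketch that empties an input box with Ctrl+A and BackSpace and then types a new value. The helper name replace_text and its optional keys parameter are assumptions of mine; the parameter only exists so the helper can be tried with a stand-in object, and by default it falls back to the real Keys class. On macOS the select-all shortcut uses Command rather than Control, so this sketch assumes Windows or Linux.

```python
def replace_text(element, new_text, keys=None):
    """Select everything in an input, delete it, type new_text, press Enter."""
    if keys is None:
        # Imported lazily so the helper can be exercised with a stand-in.
        from selenium.webdriver.common.keys import Keys as keys
    element.send_keys(keys.CONTROL, "a")  # Ctrl+A: select all
    element.send_keys(keys.BACK_SPACE)    # delete the selection
    element.send_keys(new_text)           # type the new content
    element.send_keys(keys.ENTER)         # confirm with Enter
```

A typical call would be replace_text(driver.find_element("id", "username"), "new name"), where the "username" id is borrowed from the login example.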

We can also use WebDriver to get page information, such as the title, the current URL, and the text content of an element. For example:


# get the page title
title = driver.title
print(title)
# get the current page URL
now_url = driver.current_url
print(now_url)

# get the text content of the "userid" element
now_user = driver.find_element_by_id("userid").text
print(now_user)
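
For a spider it is often handy to gather these values in one place. The sketch below bundles them into a dictionary; the function name page_info and the dictionary keys are hypothetical, and "id" is the string behind By.ID.

```python
def page_info(driver, element_id):
    """Collect the title, current URL, and one element's text into a dict."""
    return {
        "title": driver.title,
        "url": driver.current_url,
        "text": driver.find_element("id", element_id).text,
    }
```

Calling page_info(driver, "userid") would return the same three values that the snippet above prints one by one.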

multiple windows operation

If we need to switch between several windows opened in the same browser session, we can use current_window_handle, window_handles, and switch_to_window().
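
A minimal sketch of the usual pattern: remember the current handle, look through window_handles for a different one, and switch to it. The helper name switch_to_other_window is made up, and the sketch uses driver.switch_to.window(), which is the spelling newer Selenium releases use in place of switch_to_window().

```python
def switch_to_other_window(driver):
    """Switch to the first window that is not the current one.

    Returns the handle that was switched to, or None when no other
    window is open (in which case the driver stays where it is).
    """
    original = driver.current_window_handle
    for handle in driver.window_handles:
        if handle != original:
            driver.switch_to.window(handle)
            return handle
    return None
```

To go back afterwards, save driver.current_window_handle before calling the helper and pass that value to driver.switch_to.window() when done.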

figure 1

javascript operation

In general, we can use driver.execute_script() to run JavaScript code in the current page.

Scrollbar operation: sometimes (for example, when signing up for a website) we need to control the scrollbar.

code for scroll:
<body onload="document.body.scrollTop=0">
<body onload="document.body.scrollTop=100000">

import time

# drag the page scrollbar to the bottom
js = "var q=document.documentElement.scrollTop=10000"
driver.execute_script(js)
time.sleep(3)
# move the scrollbar back to the top of the page
js_ = "var q=document.documentElement.scrollTop=0"
driver.execute_script(js_)
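
A fixed offset such as 10000 may not reach the bottom of a long or lazily loaded page. As one possible alternative, the sketch below keeps scrolling to the current page height until that height stops growing. The function name scroll_to_end and the pause and max_rounds parameters are my own assumptions; it relies only on execute_script(), which returns the value of a script's return statement and passes extra arguments to the script as arguments[0], arguments[1], and so on.

```python
import time


def scroll_to_end(driver, pause=1.0, max_rounds=20):
    """Scroll down repeatedly until the page height stops growing.

    Returns the number of scroll steps performed.
    """
    get_height = "return document.documentElement.scrollHeight"
    last_height = driver.execute_script(get_height)
    rounds = 0
    while rounds < max_rounds:
        # Jump to the bottom as it is currently known.
        driver.execute_script("window.scrollTo(0, arguments[0]);", last_height)
        rounds += 1
        time.sleep(pause)  # give lazily loaded content time to arrive
        new_height = driver.execute_script(get_height)
        if new_height == last_height:
            break  # nothing new was loaded, so this is the real bottom
        last_height = new_height
    return rounds
```

The max_rounds cap is there so that a page with endless content cannot keep the spider scrolling forever.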

cookie operation

# get the cookie info of an already running driver
cookie = driver.get_cookies()
# print it
print(cookie)

import time
from selenium import webdriver

driver = webdriver.Firefox()
driver.get("http://www.youdao.com")
# add a cookie with a name and a value
driver.add_cookie({'name': 'key-aaaaaaa', 'value': 'value-bbbb'})
# loop over the cookies and print the name and value of each
for cookie in driver.get_cookies():
    print("%s -> %s" % (cookie['name'], cookie['value']))
# there are two ways to delete cookies
# delete a specific cookie by name
driver.delete_cookie("CookieName")
# delete all cookies
driver.delete_all_cookies()
time.sleep(2)
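
For a spider, one common use of these calls is to keep a logged-in session between runs. The sketch below writes the cookies to a JSON file and adds them back later; the function names save_cookies and load_cookies and the file format are my own choices. Note that add_cookie() generally only accepts cookies for the domain of the page that is currently open, so the driver should already have loaded a page from the site before load_cookies() is called.

```python
import json


def save_cookies(driver, path):
    """Write all cookies of the current session to a JSON file.

    Returns the number of cookies saved.
    """
    cookies = driver.get_cookies()
    with open(path, "w") as f:
        json.dump(cookies, f)
    return len(cookies)


def load_cookies(driver, path):
    """Add every cookie stored in the JSON file to the current session.

    Returns the number of cookies added.
    """
    with open(path) as f:
        cookies = json.load(f)
    for cookie in cookies:
        driver.add_cookie(cookie)
    return len(cookies)
```

After load_cookies() it is usually necessary to reload the page so that the site sees the restored cookies.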