If you use Selenium for automation, you may need to get the content of the whole page. Selenium returns the HTML source of the current page with a single line of code:
- python
driver.page_source
- java / groovy
driver.getPageSource();
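A common follow-up is to keep the page source for later inspection. The sketch below assumes `driver` is an already started WebDriver instance; `save_page_source` is a hypothetical helper name, not part of Selenium.

```python
from pathlib import Path


def save_page_source(driver, path):
    """Write the current page's HTML to `path` and return its length in characters.

    `driver` is assumed to be a started Selenium WebDriver; only its
    `page_source` attribute is used here.
    """
    html = driver.page_source  # HTML of the current page as a string
    Path(path).write_text(html, encoding="utf-8")
    return len(html)
```

Usage would look like `save_page_source(driver, "page.html")` after `driver.get(...)` has loaded the page.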
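You can get only the contents of the body by reading its innerHTML. Note that innerHTML returns the HTML markup inside the body, not just the visible text; for the visible text, read the element's text instead, as in the full example below:
- python
from selenium.webdriver.common.by import By
# Selenium 4 removed the find_element_by_* helpers; use find_element with By
element = driver.find_element(By.TAG_NAME, "body")
element.get_attribute('innerHTML')
- java / groovy
WebElement element = driver.findElement(By.tagName("body"));
element.getAttribute("innerHTML");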
The code above works in most cases but may fail with some drivers (such as HtmlUnitDriver). The following alternative produces similar output and works more widely, because it reads innerHTML through JavaScript:
WebElement element = driver.findElement(By.id("foo"));
String contents = (String)((JavascriptExecutor)driver).executeScript("return arguments[0].innerHTML;", element);
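The same JavaScript approach can be sketched in Python with `execute_script`. This assumes `driver` is a started WebDriver and `element` is a WebElement you have already located; `inner_html` is a hypothetical helper name.

```python
def inner_html(driver, element):
    """Return the innerHTML of `element` by running JavaScript in the browser.

    Extra arguments given to execute_script are exposed to the script
    as arguments[0], arguments[1], and so on.
    """
    return driver.execute_script("return arguments[0].innerHTML;", element)
```

Since the value is read by the browser's JavaScript engine, this only helps with drivers that have JavaScript enabled.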
Full example for python:
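from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
# Selenium 4 takes the chromedriver path through a Service object
driver = webdriver.Chrome(service=Service('./chromedriver_linux64/chromedriver'))
driver.maximize_window()
driver.get("https://www.google.com/ncr")
# .text returns the visible text of the element
print(driver.find_element(By.TAG_NAME, "body").text)
driver.quit()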
result:
Gmail
Images
Sign in
Google offered in: french
A privacy reminder from Google
REMIND ME LATER
REVIEW NOW
France
PrivacyTermsSettings
AdvertisingBusinessAbout
Note that if you don't provide the path to your chromedriver executable, and it is not on your PATH, you may get an error like:
FileNotFoundError: [Errno 2] No such file or directory: 'chromedriver': 'chromedriver'
os.path.basename(self.path), self.start_error_message)
selenium.common.exceptions.WebDriverException: Message: 'chromedriver' executable needs to be in PATH. Please see https://sites.google.com/a/chromium.org/chromedriver/home
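One way to catch this earlier is to check for the executable before creating the driver. The following is a minimal sketch using only the Python standard library; `resolve_chromedriver` is a hypothetical helper, and the check assumes the driver binary is named `chromedriver`.

```python
import os
import shutil


def resolve_chromedriver(explicit_path=None):
    """Return a usable chromedriver path, or raise FileNotFoundError."""
    if explicit_path is not None:
        # An explicit path must point at an existing, executable file.
        if os.path.isfile(explicit_path) and os.access(explicit_path, os.X_OK):
            return explicit_path
        raise FileNotFoundError(f"No executable chromedriver at {explicit_path!r}")
    # Otherwise search the directories listed in the PATH environment variable.
    found = shutil.which("chromedriver")
    if found is None:
        raise FileNotFoundError("'chromedriver' executable needs to be in PATH")
    return found
```

The returned path can then be passed to the driver constructor, so a missing binary fails with a short message instead of a long traceback.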