If you want to get information from a table or a similar structure (like the Jupyter notebook file list), you need to extract the data in a synchronized order, so that the values from each row stay together. Otherwise you can lose data consistency. Let's say that we want to extract several columns from the Jupyter notebook file list:
Let's have a closer look at the HTML code:
<div class="list_item row">
<div class="col-md-12"><input type="checkbox" title="Click here to rename, delete, etc.">
<i class="item_icon notebook_icon icon-fixed-width"></i>
<a class="item_link" href="/notebooks/test/Untitled.ipynb" target="_blank">
<span class="item_name">Untitled.ipynb</span></a>
<span class="file_size pull-right">5.96 kB</span>
<span class="item_modified pull-right" title="2018-08-17 17:26">18 days ago</span>
<div class="item_buttons pull-right">
<div class="running-indicator" style="visibility: hidden;">Running</div>
</div>
</div>
</div>
If you want to extract the information about:
- name - <a class="item_link">
- date/time - <span class="item_modified pull-right">
then you can try to find all names and then all dates with:
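// Wait up to 30 seconds until the first link is present in the DOM
WebDriverWait wait = new WebDriverWait(driver, 30);
wait.until(ExpectedConditions.presenceOfElementLocated(By.xpath("//a[@class='item_link']")));
// Collect all names and all dates as two independent lists
List<WebElement> names = driver.findElements(By.xpath("//a[@class='item_link']"));
List<WebElement> dates = driver.findElements(By.xpath("//span[@class='item_modified pull-right']")); // call getAttribute("title") on each element for the full timestamp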
This code can cause confusion if some elements are missing: the two lists then have different lengths, and a name no longer lines up with its date by index. If you want to be sure that you get name - date pairs, first find all rows as elements - <div class="list_item row">:
List<WebElement> rows = driver.findElements(By.xpath("//div[@class='list_item row']"));
and then iterate over the rows and read the child elements of each row:
for (WebElement row : rows) {
    String link = row.findElement(By.tagName("a")).getAttribute("href");
    // Spans in the HTML above: 0 = item_name, 1 = file_size, 2 = item_modified; adjust the index if your markup differs
    List<WebElement> spans = row.findElements(By.tagName("span"));
    String date = spans.size() > 2 ? spans.get(2).getAttribute("title") : null;
}
This way, with Selenium you can extract child elements relative to their parent element, so each name stays paired with the date from the same row.
This is a Groovy example of the same Selenium code:
def listDates = [] // must be initialized, otherwise << fails on null
def rows = driver.findElements(By.xpath("//div[@class='list_item row']"))
rows.each {
    def link = it.findElement(By.tagName("a")).getAttribute("href")
    // Spans in the HTML above: 0 = item_name, 1 = file_size, 2 = item_modified; adjust the index if your markup differs
    def date = it.findElements(By.tagName("span"))[2]?.getAttribute("title")
    listDates << [link, date]
}
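To illustrate why pairing by row is safer than pairing two flat lists by index, here is a minimal sketch in plain Java that runs without a browser. It is not Selenium code: the `Row` class is a hypothetical stand-in for one `<div class="list_item row">`, and the sample rows (apart from `Untitled.ipynb` and its timestamp) are made up. It assumes that the second row has no date element at all.

```java
import java.util.ArrayList;
import java.util.List;

class RowPairing {
    // Hypothetical stand-in for one row element: a name and a date (null if the date element is missing)
    static final class Row {
        final String name;
        final String date;
        Row(String name, String date) { this.name = name; this.date = date; }
    }

    // Fragile approach: collect names and dates separately, then pair them by index
    static List<String> pairByIndex(List<Row> rows) {
        List<String> names = new ArrayList<>();
        List<String> dates = new ArrayList<>();
        for (Row r : rows) {
            names.add(r.name);
            if (r.date != null) {
                dates.add(r.date); // a missing element is simply not in the list
            }
        }
        List<String> pairs = new ArrayList<>();
        for (int i = 0; i < Math.min(names.size(), dates.size()); i++) {
            pairs.add(names.get(i) + " -> " + dates.get(i));
        }
        return pairs;
    }

    // Row-scoped approach: read name and date from the same row
    static List<String> pairByRow(List<Row> rows) {
        List<String> pairs = new ArrayList<>();
        for (Row r : rows) {
            pairs.add(r.name + " -> " + r.date);
        }
        return pairs;
    }

    // Sample data; only the first row comes from the HTML above, the rest is invented
    static List<Row> sampleRows() {
        List<Row> rows = new ArrayList<>();
        rows.add(new Row("Untitled.ipynb", "2018-08-17 17:26"));
        rows.add(new Row("data", null)); // assumed: this row has no date element
        rows.add(new Row("Other.ipynb", "2018-09-01 09:00"));
        return rows;
    }

    public static void main(String[] args) {
        List<Row> rows = sampleRows();
        System.out.println(pairByIndex(rows)); // "data" receives the date of "Other.ipynb"
        System.out.println(pairByRow(rows));   // "data" is paired with null, the other rows stay correct
    }
}
```

With the index-based pairing, every row after the gap is shifted by one and the last name is dropped. With the row-scoped pairing, only the row that lacks a date ends up with an empty value.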