Extracting the text between span tags in a JavaScript-rendered page using Selenium in Python

Question

I am trying to scrape every instance of text between tags with a particular class on a web page that updates dynamically. I am using Selenium with a Chrome WebDriver in Python.

In a normal browser, if I right click on the elements I want and go to 'Developer Tools>Inspect', I can see the tags I want, for example, as:

<span class="sCell valX poolX">2112</span>

The number 2112 is what I want. These spans are nested within dozens of other outer tags. Note that if I choose 'Page Source' instead of 'Inspect' in the browser, it shows:

<span class="sCell valX poolX" <% if(poolState !== "Y"){%> style="display: none"<%}%>><%=xPool%></span>

The problem is that I get an empty list when I use XPath to find these elements.

Here is the relevant code in the simplest iteration of what I have tried:

from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait

options = Options()
options.headless = True
driver = webdriver.Chrome(
    options=options, 
    executable_path=chrome_path
)

x_path = '//span[@class="sCell valX poolX"]'

wait = WebDriverWait(driver, 20)
driver.get(url)

wp = driver.find_elements(By.XPATH, x_path)

for n in wp: 
     print(wp.text)

I receive the error: AttributeError: 'list' object has no attribute 'text'

Please note that when I use:

from selenium.webdriver.support import expected_conditions as EC

wait.until(EC.visibility_of_element_located((By.XPATH, x_path)))

I get a TimeoutException.

I can't help but assume I'm missing something very simple here. I don't have much experience with this, but it seems like a straightforward scrape.

Note that if I print driver.page_source, I get the same tags as 'Developer Tools>Page Source':

<span class="sCell valX poolX" <% if(poolState !== "Y"){%> style="display: none"<%}%>><%=xPool%></span>

Toka47 · Accepted Answer · 2024-07-07 21:42:12Z

First, if you are using "WebDriverWait" to wait for the page to load, set it up after "driver.get", not before. Second, your for loop references the wrong variable: "wp" is the list returned by "find_elements", so "wp.text" raises the AttributeError. Change "wp.text" to "n.text".

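from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait

# chrome_path and url are defined elsewhere, as in the question
options = Options()
options.headless = True
driver = webdriver.Chrome(
    options=options,
    executable_path=chrome_path
)

x_path = '//span[@class="sCell valX poolX"]'

# Request the page first, then create the wait object
driver.get(url)
# The wait object only blocks when wait.until(...) is called
wait = WebDriverWait(driver, 200)

# find_elements returns a list of WebElement objects
wp = driver.find_elements(By.XPATH, x_path)

# Read .text from each element (n), not from the list (wp)
for n in wp:
    print(n.text)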

Thanks for the help! A little embarrassing to have two sloppy mistakes forever immortalized on Stack Overflow, but that will motivate me to be more careful in the future :). It's now running - and is even finding the correct number of tags - but unfortunately the print(n.text) statement is producing a blank for each tag instead of the text that is visible on the site and the 'inspect' code. Also the 'wait.until' still times out for the same xpath. Any thoughts are appreciated.
– zicari
Commented Jul 8 at 16:23
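
One plausible reason the visibility wait still times out: `EC.visibility_of_element_located` only succeeds once Selenium considers the element displayed, and the page template shown in the question sets `style="display: none"` on these spans whenever `poolState` is not "Y". Waiting for the elements to be present rather than visible sidesteps that. The sketch below is a hand-rolled polling loop, not Selenium's own `WebDriverWait`, so that it can be read and exercised without a browser. `wait_for_elements` and its parameters are hypothetical names; with Selenium itself, `EC.presence_of_all_elements_located` expresses the same condition.

```python
import time


def wait_for_elements(driver, by, value, timeout=20.0, poll=0.5):
    """Poll driver.find_elements until it returns a non-empty list.

    Hypothetical helper: checks presence in the DOM only, not visibility.
    Raises TimeoutError if nothing matches before the timeout expires.
    """
    deadline = time.monotonic() + timeout
    while True:
        elements = driver.find_elements(by, value)
        if elements:
            return elements
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"no elements matched {value!r} within {timeout} s"
            )
        time.sleep(poll)
```

With the question's variables this could be called as `wp = wait_for_elements(driver, By.XPATH, x_path)`.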

In addition to the corrections found by @Toka47, I finally figured out that I needed the "inner text" of the elements in question: for n in wp: print(n.get_attribute('innerText'))
– zicari
Commented Jul 8 at 19:20
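
`WebElement.text` returns only the text Selenium treats as visible, which is why it can come back empty for spans that are present in the DOM but hidden. A small helper along these lines falls back to the `innerText` attribute used in the comment above, and then to `textContent`. The helper name `element_text` and the `textContent` fallback are assumptions added for illustration, not part of the original fix.

```python
def element_text(element):
    """Return an element's text, even if Selenium considers it hidden.

    `element` is expected to behave like a Selenium WebElement: it has a
    `.text` property and a `.get_attribute(name)` method.
    """
    # .text is empty when the element is not displayed
    text = (element.text or "").strip()
    if text:
        return text
    # Fall back to DOM properties that do not depend on visibility
    for name in ("innerText", "textContent"):
        value = element.get_attribute(name)
        if value and value.strip():
            return value.strip()
    return ""
```

The loop from the answer would then become `for n in wp: print(element_text(n))`.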
