I am currently trying to scrape a local web page generated by my EV charger. I access it through its IP address, which requires me to sign in. After signing in, I want to retrieve the data from the JS chart below. The data is shown in chunks (less than one complete day visible at a time), but it goes back a long way (1 year+). I want to use this data to compare my EV charging sessions with the available power in my house.
However, so far I have struggled to extract the data shown in the chart and then iterate back in time by clicking the arrow below it.
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

driver_path = r"C:\chromedriver-win64\chromedriver.exe"
ACCOUNT = "[email protected]"
PASSWORD = "pw"

driver = webdriver.Chrome(service=Service(driver_path))
driver.maximize_window()
wait = WebDriverWait(driver, 10)
driver.get("http://192.168.1.245/#!/login")

# log in to my charger
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//input[@type='text']"))).send_keys(ACCOUNT)
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//input[@type='password']"))).send_keys(PASSWORD)
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//button[@type='submit']"))).click()
# Using this code, I can extract the x-axis and y-axis labels that are currently visible.
el = driver.find_element(By.XPATH, "//div[@id='powerManagementDashboard']//*[local-name()='svg']").text
print(el)
08:00
09:00
10:00
11:00
12:00
13:00
-10
0
10
20
30
40
50
kW
# go back in time by clicking the left arrow
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//i[@ng-click='ChartAddHour(-6)']"))).click()
But I can't work out how to scrape the most important section of the HTML, which contains an abundance of <g> tags, only one of which I need.
The image below shows the data. I want to retrieve the datapoints that contain the time and the measured kW at that point.
How do I get to that specific <g>? Or should I scrape all of those <g> tags and clean the data later?
Wondering if anyone can help me out. Thanks in advance.
EDIT 1:
I've managed to access all the data inside. However, when I iterate back in time, it returns a StaleElementReferenceException: stale element not found in current frame.
This occurs right away when iterating for the second time, using this code:
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//i[@ng-click='ChartAddHour(-6)']"))).click()
try:
    chartdata = driver.find_elements(By.XPATH, "//div[@id='powerManagementDashboard']//*[local-name()='svg']/*[name()='g']/*[name()='g']/*[name()='g']/*[name()='g']")
except Exception:
    chartdata = []
chdata = []
for i in chartdata:
    store = {
        'kW data': i.get_attribute("aria-label")
    }
    chdata.append(store)
print(chdata)
which gives me:
[{'kW data': 'Building power 20:40:00 3.711'}, {'kW data': 'Building power 20:45:00 5.235'}, {'kW data': 'Building power 20:50:00 5.241'}, {'kW data': 'Building power 20:55:00 5.346'}, {'kW data': 'Building power 21:00:00 5.375'}, {'kW data': 'Building power 21:05:00 5.28'},
etcetera.
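For the cleaning step, those aria-label strings can be split into a timestamp and a kW value. A sketch, assuming every label follows the "Building power HH:MM:SS value" pattern shown above (parse_label is a hypothetical helper name, not part of the question's code):

```python
from datetime import time

def parse_label(label):
    """Split e.g. 'Building power 20:40:00 3.711' into (time, kW)."""
    _, stamp, kw = label.rsplit(" ", 2)  # peel off the last two fields
    h, m, s = (int(part) for part in stamp.split(":"))
    return time(h, m, s), float(kw)

print(parse_label("Building power 20:40:00 3.711"))
# → (datetime.time(20, 40), 3.711)
```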
Not sure if this is the most efficient way to do it, but now I need to loop back tens to hundreds of times (clicking the left arrow) to collect all the data.
Any comments on the method so far?
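One thing to watch in that loop: successive 6-hour windows may overlap at the edges, so it is worth deduplicating on the timestamp while merging chunks. A sketch, assuming each chunk is a list of (timestamp, kW) pairs (merge_chunks is a hypothetical helper, not part of the code above):

```python
def merge_chunks(chunks):
    """Merge scraped chunks, keeping one kW reading per timestamp."""
    seen = {}
    for chunk in chunks:
        for stamp, kw in chunk:
            seen.setdefault(stamp, kw)  # first occurrence wins
    return sorted(seen.items())

older = [("20:40:00", 3.711), ("20:45:00", 5.235)]
newer = [("20:45:00", 5.235), ("20:50:00", 5.241)]
print(merge_chunks([older, newer]))
# → [('20:40:00', 3.711), ('20:45:00', 5.235), ('20:50:00', 5.241)]
```

Sorting on the "HH:MM:SS" strings only works within a single day; once the loop crosses midnight you would need full dates as keys.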
You get the StaleElementReferenceException because Selenium gives you references to objects in the browser's memory, and when you click, JavaScript replaces those objects in memory; the old references can no longer find them. So after every click you have to run find_elements() again to get new references to the objects in memory.
As for the method: you could simplify the storage to chdata.append( i.get_attribute("aria-label") ), and even trim the labels with i.get_attribute("aria-label").replace("Building power ", "") or i.get_attribute("aria-label").split(" ", 2)[-1].
Also add a sleep() before you run find_elements() again, because you may still get the older references before JavaScript has replaced the objects in memory. Alternatively, catch the error and run find_elements() again until you get correct data, but time.sleep(1) is simpler.
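The catch-and-retry alternative can be wrapped in a small helper. A sketch (retry_stale is a hypothetical name; in the real script you would pass StaleElementReferenceException from selenium.common.exceptions as the exception type and a lambda wrapping the find_elements() call):

```python
import time

def retry_stale(fetch, exceptions, retries=3, delay=1.0):
    """Call fetch(); on a stale-reference error, wait and try again."""
    for attempt in range(retries):
        try:
            return fetch()
        except exceptions:
            if attempt == retries - 1:
                raise
            time.sleep(delay)

# Demo with a stand-in exception: the first two calls go "stale"
calls = {"n": 0}
def fake_fetch():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("stale element")
    return ["fresh reference"]

print(retry_stale(fake_fetch, (RuntimeError,), retries=5, delay=0))
# → ['fresh reference']
```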