All Questions

0 votes
0 answers
19 views

Is there a way to mimic the Element.closest() function from JavaScript in Scrapy (Python)?

I am trying to convert a web scraper I built in JavaScript with the puppeteer library into a Python-based web scraper running on Scrapy. I want to be able to do something similar to JavaScript's ...
Christopher Cho
1 vote
1 answer
72 views

Can't Scrape a webpage whose contents are dynamically generated through JavaScript

I am trying to scrape table data from a webpage but it's not a normal webpage that can be scraped using its html tags and CSS class or ID. The contents of the webpage are dynamically generated using ...
Abhinay
  • 11
0 votes
0 answers
50 views

Why is Scrapy-splash not returning expected HTML from dynamic javascript page?

I'm attempting to scrape the Market table data from the following page utilizing scrapy-splash: "manta.layerbank.finance/bank" (Put in quotes because might be causing spam issue?) So far I'm ...
Kody F
  • 1
0 votes
2 answers
52 views

CSS Notation for a Scrapy Spider Script

I wrote the below python script to return the item name, price, and link for items listed on https://shop.doverstreetmarket.com/collections/shops-noah import scrapy class DSMUKSpider(scrapy.Spider): ...
Teron
  • 23
0 votes
1 answer
364 views

How to scrape location data from a leaflet map?

I want to access the location (latitude, longitude) of the water level sensor markers found on this website, but I can't find any HTML tags that contain their locations. Any guidance would be very ...
Msh. Niyaz
0 votes
0 answers
62 views

Using Scrapy together with Selenium

I was trying to integrate Selenium into my Scrapy project. I had a middleware set up for Selenium Chrome as such. It works fine: it loads the page and collects the data needed. Though I couldn't figure a ...
Low LiFe
-1 votes
1 answer
84 views

How can I loop through infinite-scroll sites to extract every page?

I don't want to use an API to extract data; I just want to learn this way for the project. The element for the next page is not visible and the website has infinite scroll. I have scraped the first page but ...
Anish Thapa
0 votes
0 answers
84 views

Playwright does not return anything?

import scrapy from scrapy_playwright.page import PageMethod class PositionsSpider(scrapy.Spider): name = "positions" allowed_domains = ["https://trafigura.com/"] ...
Anish Thapa
0 votes
0 answers
174 views

PyCharm JavaScript heap out of memory on Ubuntu

Extracting data from multiple URLs using scrapy-playwright leads to the following error after parsing about 1000 URLs. FATAL ERROR: Reached heap limit Allocation failed - JavaScript heap out of memory I ...
Michael
  • 367
1 vote
0 answers
919 views

Button not clicking with scrapy-playwright

I am attempting to click on an SSO login for a platform by testing its button functionality with scrapy-playwright. I have entered an incorrect email, so after clicking the button, it should throw ...
Dollar Tune-bill
1 vote
0 answers
43 views

Is there any way of forcing a JavaScript GET request before rendering HTML with Splash?

I'm getting stuck loading dynamic content from this website with Splash: https://www.fravega.com/l/tv-y-video/tv/?categorias=tv-y-video%2Ftv&page=1 I'm trying to get the href attribute from ...
Drupman
  • 11
0 votes
0 answers
92 views

Data Scraping: Integrate Scrapy [Python library to build spiders] with puppeteer- or playwright-extra [JS headless browser automation]

I am currently developing multiple scrapers that should be maintained for the next couple of years. My typical approach to traverse large pages is to use Scrapy, a well-maintained Python framework to ...
Rondo Bohrens
0 votes
1 answer
73 views

Is it possible to extract the download syllabus link with requests or Scrapy, without Selenium?

I am trying to extract the download syllabus link from this website: https://www.simplilearn.com/big-data-and-analytics/python-for-data-science-training The link is not available in the page source, and I ...
nikhil kumar
-2 votes
1 answer
36 views

I want to fetch the details from the Payload tab of the developer tools

I want to fetch the details of the URLs that differ from the ones present in the anchor tab. Accessing the href link directs to the previous page rather than the next page. How do I fetch the following view ...
Reema
  • 11
1 vote
0 answers
55 views

Support with Scrapy: get data from a double postback

I am looking for help with a specific problem: getting data from a postback table. I need to access a table that is loaded after pressing a button with a JavaScript PostbackWithOption. I think I am ...
crianopa