Concurrent Web Scraping with Selenium Grid and Docker Swarm
Run a Python and Selenium-based web scraper in parallel with Selenium Grid and Docker Swarm.
Web scraping is the process of downloading pages from the web and extracting structured data from them with a program. It's a useful skill when you need data from a website that does not have a public API.
The tutorials and articles on TestDriven.io show how to use parallelism and concurrency to speed up web scrapers that collect large amounts of data.
Speed up a Python web scraping and crawling script with multithreading via the concurrent.futures module.
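As a rough illustration of the multithreaded approach, here is a minimal sketch that uses ThreadPoolExecutor from concurrent.futures. It is not taken from the article itself: the helper names fetch_title and scrape_all, the placeholder URLs, and the choice to pull out each page's title are all assumptions made for the example.

```python
import re
from concurrent.futures import ThreadPoolExecutor, as_completed
from urllib.request import urlopen


def fetch_title(url, timeout=10):
    """Download a page and return the text of its <title> tag, or None.

    Hypothetical helper for this sketch; a real scraper would likely
    use a proper HTML parser instead of a regular expression.
    """
    with urlopen(url, timeout=timeout) as response:
        html = response.read().decode("utf-8", errors="replace")
    match = re.search(r"<title[^>]*>(.*?)</title>", html, re.IGNORECASE | re.DOTALL)
    return match.group(1).strip() if match else None


def scrape_all(urls, fetch=fetch_title, max_workers=4):
    """Call fetch(url) for every URL on a thread pool; return {url: result}.

    A failed fetch is stored as the exception instance, so one bad page
    does not abort the whole run.
    """
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # Submit every URL up front and remember which future belongs to which URL.
        future_to_url = {executor.submit(fetch, url): url for url in urls}
        # Collect results in completion order rather than submission order.
        for future in as_completed(future_to_url):
            url = future_to_url[future]
            try:
                results[url] = future.result()
            except Exception as exc:
                results[url] = exc
    return results


if __name__ == "__main__":
    # Placeholder URLs (assumption); replace with the pages you want to scrape.
    pages = ["https://example.com/", "https://example.org/"]
    for url, title in scrape_all(pages).items():
        print(url, "->", title)
```

Threads suit this workload because each worker spends most of its time waiting on the network, so several downloads can overlap even under Python's global interpreter lock. Passing fetch as a parameter also makes it possible to swap in a different download function, such as one driven by Selenium.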