Speed up a Python web scraping and crawling script with multithreading via the concurrent.futures module.
Web scraping is the process of downloading web pages and extracting structured data from them with a program. It's a useful skill when you need data from a website that does not offer a public API.
The tutorials and articles on TestDriven.io teach how to use parallelism and concurrency to speed up web scrapers that collect large amounts of data.
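As a rough illustration of the multithreading idea, here is a minimal sketch that downloads several pages at once with a thread pool from the standard library's concurrent.futures module. It is not taken from the tutorials themselves: the function names `fetch` and `scrape_all`, the `fetch_fn` parameter, and the default of 8 workers are all assumptions made for this example.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from urllib.request import urlopen


def fetch(url, timeout=10):
    """Download one page and return its body as text (hypothetical helper)."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


def scrape_all(urls, fetch_fn=fetch, max_workers=8):
    """Fetch every URL on a thread pool.

    Returns a dict mapping each URL to its page body, or to the exception
    raised while fetching it, so one failed page does not stop the rest.
    """
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Submit one task per URL and remember which future belongs to which URL.
        futures = {pool.submit(fetch_fn, url): url for url in urls}
        # Handle results in completion order, not submission order.
        for future in as_completed(futures):
            url = futures[future]
            try:
                results[url] = future.result()
            except Exception as exc:  # keep going if a single page fails
                results[url] = exc
    return results
```

Calling `scrape_all(list_of_urls)` returns the dict once every download has finished or failed. Threads tend to help here because a scraper spends most of its time waiting on the network. Passing a different `fetch_fn` lets you swap in another HTTP client or a stub for testing.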
Latest Posts (2)
Run a Python and Selenium-based web scraper in parallel with Selenium Grid and Docker Swarm.