Web Scraping
Frameworks
- BeautifulSoup
- MechanicalSoup
- Scrapy
- Photon → collects URLs, files, and specific data (emails, social network accounts)
- Puppeteer
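To make the kind of collection described for Photon concrete, here is a minimal sketch that pulls links and email addresses out of raw HTML. It is not Photon's implementation; the `harvest` function, the `Harvest` type, and the regular expressions are assumptions made for illustration, and a real crawler would use an HTML parser rather than regexes.

```typescript
// Illustrative only: NOT how Photon works internally, just the same idea in miniature.
interface Harvest {
  urls: string[];
  emails: string[];
}

// Hypothetical helper: collect absolute http(s) links and email addresses from raw HTML.
function harvest(html: string, baseUrl: string): Harvest {
  const urls = new Set<string>();
  const emails = new Set<string>();

  // Values of href="..." or href='...' attributes.
  const hrefPattern = /href\s*=\s*["']([^"']+)["']/gi;
  let match: RegExpExecArray | null;
  while ((match = hrefPattern.exec(html)) !== null) {
    try {
      // Resolve relative links against the page URL.
      const resolved = new URL(match[1], baseUrl);
      if (resolved.protocol === "http:" || resolved.protocol === "https:") {
        urls.add(resolved.href);
      }
    } catch {
      // Not a parsable URL: ignore it.
    }
  }

  // Simplified email pattern; it also catches addresses inside mailto: links.
  const emailPattern = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;
  while ((match = emailPattern.exec(html)) !== null) {
    emails.add(match[0]);
  }

  return {
    urls: Array.from(urls).sort(),
    emails: Array.from(emails).sort(),
  };
}
```

Calling `harvest(html, "https://example.com/")` on a downloaded page returns deduplicated, sorted lists; the base URL is only used to resolve relative links.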
Puppeteer
Drives a Chromium instance from Node.js.
- automate tasks (forms, data monitoring)
- browse web pages (tests, scraping)
- take screenshots or export web pages to PDF
- capture a chronological trace of a site to diagnose performance problems
- test Chrome extensions
- can display the browser window (non-headless mode) to follow the navigation
- Chrome DevTools → Recorder
Recorder
Records a navigation session and exports it as Puppeteer code.
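The screenshot and PDF export tasks listed under Puppeteer can be sketched as follows. This is a minimal sketch, not a definitive script: `capturePage`, `PageLike`, and `CaptureResult` are hypothetical names, and the output file naming is an assumption. The function is written against a small interface that mirrors the Puppeteer `Page` methods `goto`, `title`, `screenshot`, and `pdf`, so the logic can be tried without launching a browser; a real Puppeteer page is expected to fit this shape, possibly with a cast depending on the Puppeteer version.

```typescript
// Subset of the Puppeteer Page API used here (assumed shape, kept deliberately loose).
interface PageLike {
  goto(url: string, options?: { waitUntil?: "load" | "domcontentloaded" | "networkidle0" | "networkidle2" }): Promise<unknown>;
  title(): Promise<string>;
  screenshot(options: { path: string; fullPage?: boolean }): Promise<unknown>;
  pdf(options: { path: string; format?: string }): Promise<unknown>;
}

interface CaptureResult {
  title: string;
  screenshotPath: string;
  pdfPath: string;
}

// Hypothetical helper: open a URL, then save a full-page screenshot and a PDF.
async function capturePage(page: PageLike, url: string, outDir: string): Promise<CaptureResult> {
  // Wait until network activity has mostly settled before capturing.
  await page.goto(url, { waitUntil: "networkidle2" });
  const title = await page.title();

  // Derive a file-system-friendly name from the host, e.g. "example-com".
  const slug = new URL(url).hostname.replace(/[^a-z0-9]+/gi, "-");
  const screenshotPath = `${outDir}/${slug}.png`;
  const pdfPath = `${outDir}/${slug}.pdf`;

  await page.screenshot({ path: screenshotPath, fullPage: true });
  await page.pdf({ path: pdfPath, format: "A4" });

  return { title, screenshotPath, pdfPath };
}

// With Puppeteer installed, the wiring would look roughly like this
// (the output directory "out" is assumed to exist already):
//   import puppeteer from "puppeteer";
//   const browser = await puppeteer.launch();
//   const page = await browser.newPage();
//   await capturePage(page, "https://example.com", "out");
//   await browser.close();
```

Passing `{ headless: false }` to `puppeteer.launch` shows the browser window so the navigation can be followed, although PDF export has historically been limited to headless mode.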