Hands-on web scraping with Python : extract quality data from the web using effective Python techniques
Author / Creator: | Chapagain, Anish, author. |
---|---|
Edition: | Second edition. |
Imprint: | Birmingham, UK : Packt Publishing Ltd., 2023. |
Description: | 1 online resource (324 pages) : illustrations |
Language: | English |
Format: | E-Resource Book |
URL for this record: | http://pi.lib.uchicago.edu/1001/cat/bib/13712774 |
Table of Contents:
- Cover
- Title page
- Copyright and Credits
- Contributors
- Table of Contents
- Preface
- Part 1: Python and Web Scraping
- Chapter 1: Web Scraping Fundamentals
- Technical requirements
- What is web scraping?
- Understanding the latest web technologies
- HTTP
- HTML
- XML
- JavaScript
- CSS
- Data-finding techniques used in web pages
- HTML source page
- Developer tools
- Summary
- Further reading
- Chapter 2: Python Programming for Data and Web
- Technical requirements
- Why Python (for web scraping)?
- Accessing the WWW with Python
- Setting things up
- Creating a virtual environment
- Installing libraries
- Loading URLs
- URL handling and operations
- requests - Python library
- Implementing HTTP methods
- GET
- POST
- Summary
- Further reading
- Part 2: Beginning Web Scraping
- Chapter 3: Searching and Processing Web Documents
- Technical requirements
- Introducing XPath and CSS selectors to process markup documents
- The Document Object Model (DOM)
- XPath
- CSS selectors
- Using web browser DevTools to access web content
- HTML elements and DOM navigation
- XPath and CSS selectors using DevTools
- Scraping using lxml, a Python library
- lxml by example
- Web scraping using lxml
- Parsing robots.txt and sitemap.xml
- The robots.txt file
- Sitemaps
- Summary
- Further reading
- Chapter 4: Scraping Using PyQuery, a jQuery-Like Library for Python
- Technical requirements
- PyQuery overview
- Introducing jQuery
- Exploring PyQuery
- Installing PyQuery
- Loading a web URL
- Element traversing, attributes, and pseudo-classes
- Iterating using PyQuery
- Web scraping using PyQuery
- Example 1 - scraping book details
- Example 2 - sitemap to CSV
- Example 3 - scraping quotes with author details
- Summary
- Further reading
- Chapter 5: Scraping the Web with Scrapy and Beautiful Soup
- Technical requirements
- Web parsing using Python
- Introducing Beautiful Soup
- Installing Beautiful Soup
- Exploring Beautiful Soup
- Web scraping using Beautiful Soup
- Web scraping using Scrapy
- Setting up a project
- Creating an item
- Implementing the spider
- Exporting data
- Deploying a web crawler
- Summary
- Further reading
- Part 3: Advanced Scraping Concepts
- Chapter 6: Working with the Secure Web
- Technical requirements
- Exploring secure web content
- Form processing
- Cookies and sessions
- User authentication
- HTML processing using Python
- User authentication and cookies
- Using proxies
- Summary
- Further reading
- Chapter 7: Data Extraction Using Web APIs
- Technical requirements
- Introduction to web APIs
- Types of API
- Benefits of web APIs
- Data formats and patterns in APIs
- Example 1 - sunrise and sunset
- Example 2 - GitHub emojis
- Example 3 - Open Library
- Web scraping using APIs
- Example 1 - holidays from the US calendar
- Example 2 - Open Library book details
- Example 3 - US cities and time zones