Using Python to Access Web Data

Description

This course will show how one can treat the Internet as a source of data. We will scrape, parse, and read web data as well as access data using web APIs. We will work with HTML, XML, and JSON data formats in Python. This course will cover Chapters 11-13 of the textbook “Python for Everybody”. To succeed in this course, you should be familiar with the material covered in Chapters 1-10 of the textbook and the first two courses in this specialization. These topics include variables and expressions, conditional execution (loops, branching, and try/except), functions, Python data structures (strings, lists, dictionaries, and tuples), and manipulating files. This course covers Python 3.

What you will learn

Getting Started

In this section you will install Python and a text editor. In previous classes in the specialization this was an optional assignment, but in this class it is the first requirement to get started. From this point forward we will stop using the browser-based Python grading environment because the browser-based Python environment (Skulpt) is not capable of running the more complex programs we will be developing in this class.

Regular Expressions (Chapter 11)

Regular expressions are a very specialized language that allow us to succinctly search strings and extract data from strings. Regular expressions are a language unto themselves. It is not essential to know how to use regular expressions, but they can be quite useful and powerful.

Networks and Sockets (Chapter 12)

In this section we learn about the protocols that web browsers use to retrieve documents and web applications use to interact with Application Program Interfaces (APIs).

Programs that Surf the Web (Chapter 12)

In this section we learn to use Python to retrieve data from web sites and APIs over the Internet.

What’s included