Using Python to Access Web Data by University of Michigan
About this Course
This course will show how one can treat the Internet as a source of data. We will scrape, parse, and read web data as well as access data using web APIs. We will work with HTML, XML, and JSON data formats in Python. This course will cover Chapters 11-13 of the textbook “Python for Informatics”. To succeed in this course, you should be familiar with the material covered in Chapters 1-10 of the textbook and the first two courses in this specialization. These topics include variables and expressions, conditional execution (loops, branching, and try/except), functions, Python data structures (strings, lists, dictionaries, and tuples), and manipulating files.
Thank you for enrolling in Using Python to Access Web Data.
We have put a lot of detail about the course into the document titled 'Syllabus', which can be found in Week 1, so please take a moment to review it. Week 1 also contains instructions for installing Python so you can develop and run the programs needed to complete this course. In previous courses, installing Python was optional, but the programs we will write in this class are too complex for the Python Playground and auto-grader to handle.
If you like what you see, you are welcome to encourage your friends to join the course. We leave registration open for several weeks and delay the first due dates to give late joiners a chance to join and catch up. If you are not joining late, the extended due dates for the first graded quiz and assignment give you a good chance to 'get ahead' in case you later need to take a week off from the class.
We will try to do live online office hours or face-to-face office hours from time to time so we have a chance to talk one way or another. Keep an eye out in your email for any announcements regarding office hours.
This is the third of four courses in the Python for Everybody specialization. We hope that you are successful in this course and continue with the rest of the courses in the specialization.
Again, welcome to the course, and best of luck in the class.
Goal Accomplishment Criteria
Completed the entire course and all of the assignments
-
Getting Started
In this section you will install Python and a text editor. In previous classes in the specialization this was an optional assignment, but in this class it is the first requirement to get started. From this point forward we will stop using the browser-based Python grading environment (Skulpt), because it is not capable of running the more complex programs we will be developing in this class.
-
Welcome to The Course
-
Python Textbook
-
Lecture Slides
-
Privacy Policy
-
Welcome to Python - Guido van Rossum
-
Notes on Python 2.x versus Python 3.x
-
Notes on Choice of Text Editor
-
Peer Graded Assignment: Installing Python Screen Shots
-
Review Your Peers: Installing Python Screen Shots
-
Windows 8: Installing Python and Writing A Program
-
Windows 8: Taking Screen Shots
-
-
Regular Expressions (Chapter 11)
Regular expressions are a very specialized language, one unto itself, that allows us to succinctly search strings and extract data from them. It is not essential to know how to use regular expressions, but they can be quite useful and powerful.
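As a rough illustration of searching and extracting with regular expressions, here is a minimal sketch using Python's standard `re` module. The sample lines, the header name, and the helper names `extract_addresses` and `extract_confidence` are hypothetical and chosen only for this example; they are not taken from the course materials.

```python
import re

# Hypothetical sample lines, loosely shaped like entries in a mail log.
lines = [
    "From alice@example.com Sat Jan  5 09:14:16 2008",
    "X-DSPAM-Confidence: 0.8475",
    "From bob@example.org Fri Jan  4 18:10:48 2008",
]


def extract_addresses(text_lines):
    """Return every substring that looks like an email address."""
    found = []
    for line in text_lines:
        # \S+ matches a run of non-whitespace characters on each side of the @.
        found.extend(re.findall(r"\S+@\S+", line))
    return found


def extract_confidence(text_lines):
    """Return the numbers that follow 'X-DSPAM-Confidence:' as floats."""
    values = []
    for line in text_lines:
        # The parentheses mark the part of the match we want to pull out.
        match = re.search(r"^X-DSPAM-Confidence: ([0-9.]+)", line)
        if match:
            values.append(float(match.group(1)))
    return values


print(extract_addresses(lines))
print(extract_confidence(lines))
```

The pattern `\S+@\S+` is deliberately loose: it finds address-like text but does not validate it.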
-
Regular Expressions - Part 1
-
Regular Expressions - Part 2
-
Python Regular Expression Quick Guide
-
Quiz: Regular Expressions
-
Extracting Data With Regular Expressions
-
Bonus: Office Hours - Den Haag
-
Bonus Interview: Bjarne Stroustrup - C++
-
-
Networks and Sockets (Chapter 12)
In this section we learn about the protocols that web browsers use to retrieve documents and that web applications use to interact with Application Program Interfaces (APIs).
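To make the request/response idea concrete, here is a minimal sketch of a very small HTTP client built on Python's standard `socket` module. It assumes plain HTTP/1.0 over port 80 with no TLS, and the helper names `build_get_request`, `split_response`, and `fetch` are hypothetical. It is an illustration under those assumptions, not the course's own program.

```python
import socket


def build_get_request(host, path="/"):
    """Build the bytes of a minimal HTTP/1.0 GET request."""
    request = f"GET {path} HTTP/1.0\r\nHost: {host}\r\n\r\n"
    return request.encode("ascii")


def split_response(raw):
    """Split raw response bytes into (status_line, headers, body)."""
    # A blank line (CRLF CRLF) separates the headers from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    header_lines = head.decode("iso-8859-1").split("\r\n")
    status_line = header_lines[0]
    headers = {}
    for line in header_lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return status_line, headers, body


def fetch(host, path="/", port=80):
    """Send the request over a TCP socket and return the raw response."""
    with socket.create_connection((host, port), timeout=10) as sock:
        sock.sendall(build_get_request(host, path))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closed the connection
                break
            chunks.append(data)
    return b"".join(chunks)
```

With network access, `split_response(fetch("example.com"))` would return the status line, a dictionary of headers, and the body bytes; the host name here is only a placeholder.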
-
Networked Programs
-
From Sockets to Applications
-
Let’s Write a Web Browser
-
If You Want to Learn More
-
Quiz: Networks and Sockets
-
Understanding the Request / Response Cycle
-
Bonus Video: Leonard Kleinrock - The First Two Packets on the ARPANET
-
Bonus Video: Robert Cailliau - co-Inventor of the Web
-
Bonus: Office Hours - Atlanta GA (Buckhead)
-
Fun: Dr. Chuck @ CNN Reading the News
-
-
Programs that Surf the Web (Chapter 12)
In this section we learn to use Python to retrieve data from web sites and APIs over the Internet.
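As a sketch of what retrieving and scanning a web page can look like, the example below pulls the link targets out of an HTML document. The course itself uses BeautifulSoup for this; to stay self-contained, this sketch swaps in the standard library's `html.parser` and `urllib.request` instead. The class `LinkCollector`, the helpers `extract_links` and `fetch_links`, and the sample markup are all hypothetical.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class LinkCollector(HTMLParser):
    """Collect the href value of every anchor tag that is fed in."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.links.append(value)


def extract_links(html_text):
    """Return the href targets found in a string of HTML."""
    collector = LinkCollector()
    collector.feed(html_text)
    collector.close()
    return collector.links


def fetch_links(url):
    """Download a page and return its link targets (needs network access)."""
    with urlopen(url, timeout=10) as response:
        html_text = response.read().decode("utf-8", errors="replace")
    return extract_links(html_text)


# Hypothetical markup used only to exercise the parser offline.
sample = '<p><a href="http://example.com/one.html">One</a> and <a href="two.html">Two</a></p>'
print(extract_links(sample))
```

Following a chain of links, as in the assignment listed in this section, amounts to calling something like `fetch_links` repeatedly on a chosen entry from the previous result.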
-
Understanding HTML
-
Parsing HTML with BeautifulSoup
-
Quiz: Reading Web Data From Python
-
Notes Regarding the Use of Beautiful Soup
-
Scraping HTML Data with BeautifulSoup
-
Assignment: Following Links in HTML Using BeautifulSoup
-
Bonus: Office Hours - Montreal
-
Bonus Interview: Tim Berners-Lee - Inventing the Web
-
Fun: I Got My Mojo Working - Geneva, Switzerland
-
-
Web Services and XML (Chapter 13)
In this section, we learn how to retrieve and parse XML (eXtensible Markup Language) data.
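A minimal sketch of parsing XML in Python with the standard `xml.etree.ElementTree` module follows. The document structure (`commentinfo`, `comment`, `name`, `count`) and the helper names are hypothetical, chosen only to show how elements are located and their text converted.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML document; the tag names are assumptions for this example.
sample_xml = """
<commentinfo>
  <comments>
    <comment><name>Alice</name><count>97</count></comment>
    <comment><name>Bob</name><count>3</count></comment>
  </comments>
</commentinfo>
"""


def comment_names(xml_text):
    """Return the text of each name element, in document order."""
    root = ET.fromstring(xml_text)
    # A relative path walks from the root element down to each comment.
    return [c.find("name").text for c in root.findall("comments/comment")]


def sum_counts(xml_text):
    """Add up the integer text of every count element in the document."""
    root = ET.fromstring(xml_text)
    # './/count' finds count elements at any depth below the root.
    return sum(int(node.text) for node in root.findall(".//count"))


print(comment_names(sample_xml))
print(sum_counts(sample_xml))
```

Element text always arrives as a string, so numeric values need an explicit conversion such as `int()`.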
-
Web Services Overview
-
Interview: Roy Fielding - Understanding the REST Architecture
-
eXtensible Markup Language - XML
-
XML Schema
-
Parsing XML in Python
-
Quiz: eXtensible Markup Language
-
Extracting Data from XML
-
Bonus: Office Hours - Boston
-
Bonus Video: Ian Horrocks / RDF / OWL (Advanced)
-
-
JSON and the REST Architecture (Chapter 13)
In this module, we work with Application Program Interfaces / Web Services using the JavaScript Object Notation (JSON) data format.
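As an illustration of handling a JSON response from a web service, here is a minimal sketch using the standard `json` module. The payload shape (a `status` field and a `results` list with nested `location` values) is a hypothetical stand-in, not the documented format of any particular API, and `first_location` is a made-up helper name.

```python
import json

# Hypothetical response body; the field names are assumptions for this example.
sample_json = """
{
  "status": "OK",
  "results": [
    {"name": "Ann Arbor", "location": {"lat": 42.28, "lng": -83.74}}
  ]
}
"""


def first_location(json_text):
    """Return (lat, lng) of the first result, or None if there is none."""
    data = json.loads(json_text)  # JSON objects become dicts, arrays become lists
    if data.get("status") != "OK" or not data.get("results"):
        return None
    location = data["results"][0]["location"]
    return location["lat"], location["lng"]


print(first_location(sample_json))
```

Checking the status field before indexing into the results keeps the function from raising an error when the service reports a failure or returns an empty list.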
-
JavaScript Object Notation
-
Interview: Douglas Crockford - Discovering JSON
-
Service Oriented Approach
-
Video: Service Oriented Architectures
-
Accessing APIs in Python
-
API Security and Rate Limiting
-
Quiz: REST, JSON, and APIs
-
Extracting Data from JSON
-
Using the GeoJSON API
-
Bonus: Office Hours - Melbourne, AU
-
Bonus: Office Hours - Santa Monica, CA
-
Bonus: Class Reunion at Bletchley Park
-
- 21 June 2016, 11:04