Capstone: Retrieving, Processing, and Visualizing Data with Python
Welcome to the Python for Everybody Capstone! Here you will build a series of applications to retrieve, process, and visualize data using elements you have learned from each course within the specialization. You will create visualizations to become familiar with the technologies, and will have optional assignments for exploring course concepts in even more detail and sharing your results with other students. Chapter 15 from the book “Python for Informatics” will serve as the backbone for the capstone.
We will also have live online office hours every week so that we’ll have a chance to talk and to share questions and ideas. Keep an eye out in your email for any announcements regarding office hours.
Congratulations on making it this far!
- Chuck and the Teaching Staff of Python for Everybody
Completion criterion
Completed the project, received the certificate
-
Welcome to the Capstone
Congratulations to everyone for making it this far. Before you begin, please view the Introduction video and read the Capstone Overview. The Course Resources section contains additional course-wide material that you may want to refer to in future weeks.
-
Video: Introduction: Welcome to the Class
-
Reading: Capstone Overview
-
Video: Office Hours in Den Haag, Netherlands
-
Video: Interview: John Resig and Pam Fox - Khan Academy
-
-
Exploring Data Sources (Project)
The optional Capstone project is your opportunity to select, process, and visualize the data of your choice, and receive feedback from your peers. The project is not graded, and can be as simple or complex as you like. This week's assignment is to identify a data source and make a short discussion forum post describing the data source and outlining some possible analysis that could be done with it. You will not be required to use the data source presented here for your actual analysis.
-
Reading: Identifying Your Data Source - Introduction
-
Reading: List of Data Sources (Instructional Staff Curated)
-
Discussion Prompt: Identifying a Data Source
-
Video: Dr. Chuck's New Kitten - Sakaiger
-
Video: Interview: Bruce Schneier - The Security Mindset
-
-
Spidering and Modeling Email Data
In our second required assignment, we will retrieve and process email data from the Sakai open source project. Video lectures will walk you through the process of retrieving, cleaning up, and modeling the data.
-
Reading: Spidering and Modeling Email Data - Introduction
-
Video: Gmane Introduction
-
Video: Gmane Loading from the Web
-
Video: Gmane Data Cleanup/Modeling
-
Video: Gmane Looking at Modeled Data
-
Video: Office Hours Baltimore, MD
-
Video: Interview: Bruce Schneier - Building Cryptographic Systems
-
Graded: Loading and Modeling Mail Data
-
-
Accessing New Data Sources (Project)
The task for this week is to make a discussion thread post that reflects the progress you have made to date in retrieving and cleaning up your data source so that you can perform your analysis. Feedback from other students is encouraged to help you refine the process.
-
Reading: Accessing New Data Sources - Introduction
-
Discussion Prompt: Analyzing a Data Source
-
Video: Office Hours: Dr. Chuck Pretends to be Anthony Bourdain
-
-
Visualizing Email Data
In the final required assignment, we will do two visualizations of the email data you have retrieved and processed: a word cloud to visualize the frequency distribution and a timeline to show how the data is changing over time.
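As a rough illustration of the frequency distribution behind a word cloud, here is a minimal sketch in Python. It is not the course's own code: the function name `word_frequencies`, the sample subject lines, and the minimum word length are all assumptions made for this example.

```python
import re
from collections import Counter


def word_frequencies(subjects, min_length=4, top_n=3):
    """Count how often each word appears across a list of subject lines.

    Words shorter than min_length are skipped (an assumed, simple way to
    drop noise such as "re"). Returns the top_n (word, count) pairs.
    """
    counts = Counter()
    for subject in subjects:
        # Lowercase the text and keep only runs of letters.
        for word in re.findall(r"[a-z]+", subject.lower()):
            if len(word) >= min_length:
                counts[word] += 1
    return counts.most_common(top_n)


# Hypothetical subject lines, made up for illustration only.
sample_subjects = [
    "Re: Portal build failing",
    "Portal release schedule",
    "Build server down",
]
top_words = word_frequencies(sample_subjects)
print(top_words)
```

A word cloud then simply draws each word at a size proportional to its count. A timeline works the same way, except that messages are counted per time period (for example, per month) instead of per word.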
-
Reading: Visualizing Email Data
-
Video: Gmane Basic Statistics and Word Cloud
-
Video: Gmane Visualizing Lines
-
Video: Office Hours, Montreal, Canada
-
Video: Interview: Nathaniel Borenstein - The Father of MIME
-
Graded: Visualizing Email Data
-
-
Visualizing new Data Sources (Project)
This week you will present the analysis of your data to the class. While many of the projects will result in a visualization of the data, any other results of analyzing the data are equally valued, so use whatever form of analysis and display is most appropriate to the data set you have selected.
-
Reading: Visualizing new Data Sources - Introduction
-
Discussion Prompt: Data Analysis and Visualization
-
Video: Office Hours - Dr. Chuck's Office - Ann Arbor, Michigan
-
Video: Steve Jobs, NeXT and the Internet
-
-
Building a Search Engine
This week we will download and run a simple version of the Google PageRank Algorithm and practice spidering some content. The assignment is peer-graded, and is the first of three required assignments in the course. This is a continuation of the material covered in Course 4 of the specialization, and is based on Chapter 15 of the textbook.
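To give a feel for the idea before you run the course code, here is a minimal sketch of a simplified page rank iteration in Python. It is not the program used in the assignment: the function name `page_rank`, the three-page link graph, and the choice to omit a damping factor are all assumptions made for this example. It also assumes every link target appears as a key in the graph.

```python
def page_rank(links, iterations=20):
    """Simplified page rank without a damping factor.

    links maps each page to the list of pages it links to. Every page
    starts with rank 1.0 and, on each pass, splits its current rank
    evenly among the pages it links to.
    """
    pages = list(links)
    ranks = {page: 1.0 for page in pages}
    for _ in range(iterations):
        new_ranks = {page: 0.0 for page in pages}
        for page, outbound in links.items():
            if not outbound:
                continue  # a page with no links passes on nothing
            share = ranks[page] / len(outbound)
            for target in outbound:
                new_ranks[target] += share
        # Rank held by pages with no outbound links would otherwise vanish,
        # so spread it evenly to keep the total constant.
        lost = sum(ranks.values()) - sum(new_ranks.values())
        for page in pages:
            new_ranks[page] += lost / len(pages)
        ranks = new_ranks
    return ranks


# Hypothetical link graph: a links to b and c, b links to c, c links to a.
sample_links = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
ranks = page_rank(sample_links)
for page, rank in sorted(ranks.items()):
    print(page, round(rank, 3))
```

In this small graph, page c ends up ranked above page b because it receives links from both other pages. Spidering is the step that comes first: it is how you collect the link graph that a calculation like this runs on.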
-
Reading: Building a Search Engine - Introduction
-
Video: Page Rank Introduction
-
Video: Page Rank Spidering
-
Video: Computing Page Rank
-
Video: Page Rank - Visualization
-
Video: Office Hours Detroit, Michigan
-
Video: Interview: Anil Jain - Image Processing
-
Graded: Peer Grade: Page Rank
-
- 22 October 2016, 10:04