Stage 1: Exploratory Data Analysis
Stage 2: Reproducible Research
Stage 3: Statistical Inference
Stage 4: Regression Models
Stage 5: Practical Machine Learning
Stage 6: Developing Data Products
Stage 7: Capstone Project
Stage 8: The Data Scientist’s Toolbox (April 22 to May 17)
Stage 9: R Programming (April 22 to May 17)
Stage 10: Getting and Cleaning Data (June 15 to July 6)

April 28, 2014 to December 31, 2014
Goal completed February 9, 2015
Education

Specialization on Coursera "Data Science"

What You'll Learn

Formulate context-relevant questions and hypotheses to drive data scientific research

Identify, obtain, and transform a data set to make it suitable for the production of statistical evidence communicated in written form

Build models based on new data types, experimental design, and statistical inference

Incentives & Benefits

At completion, students will have a portfolio demonstrating their mastery of the material. The top 10 students for the capstone will get the chance to video-conference with instructors and ask them questions. In addition, top students will be profiled on the Simply Statistics Blog, a widely read data science blog.

Courses

This specialization covers the concepts and tools you'll need throughout the entire data science pipeline, from asking the right kinds of questions to making inferences and publishing results. The Specialization concludes with a Capstone project that allows you to apply the skills you've learned throughout the courses.

  1. Exploratory Data Analysis

    About the Course

    This course covers the essential exploratory techniques for summarizing data. These techniques are typically applied before formal modeling commences and can help inform the development of more complex statistical models. Exploratory techniques are also important for eliminating or sharpening potential hypotheses about the world that can be addressed by the data. We will cover in detail the plotting systems in R as well as some of the basic principles of constructing data graphics. We will also cover some of the common multivariate statistical techniques used to visualize high-dimensional data.

    Course Syllabus

    After successfully completing this course you will be able to make visual representations of data using the base, lattice, and ggplot2 plotting systems in R, apply basic principles of data graphics to create rich analytic graphics from different types of datasets, construct exploratory summaries of data in support of a specific question, and create visualizations of multidimensional data using exploratory multivariate statistical techniques.

  2. Reproducible Research

    About the Course

    This course focuses on the concepts and tools behind reporting modern data analyses in a reproducible manner. Reproducible research is the idea that data analyses, and more generally, scientific claims, are published with their data and software code so that others may verify the findings and build upon them. The need for reproducibility is increasing dramatically as data analyses become more complex, involving larger datasets and more sophisticated computations. Reproducibility allows people to focus on the actual content of a data analysis, rather than on superficial details reported in a written summary. In addition, reproducibility makes an analysis more useful to others because the data and code used to conduct the analysis are available. This course will focus on literate statistical analysis tools, which allow one to publish a data analysis in a single document that others can easily execute to obtain the same results.

    Course Syllabus

    In this course you will learn to write a document using R markdown, integrate live R code into a literate statistical program, compile R markdown documents using knitr and related tools, and organize a data analysis so that it is reproducible and accessible to others.

  3. Statistical Inference

    About the Course

    Statistical inference is the process of drawing conclusions about populations or scientific truths from data. There are many modes of performing inference, including statistical modeling, data-oriented strategies, and explicit use of designs and randomization in analyses. Furthermore, there are broad theories (frequentist, Bayesian, likelihood, design-based, …) and numerous complexities (missing data, observed and unobserved confounding, biases) for performing inference. A practitioner can often be left in a debilitating maze of techniques, philosophies and nuance. This course presents the fundamentals of inference in a practical approach for getting things done. After taking this course, students will understand the broad directions of statistical inference and use this information for making informed choices in analyzing data.

    Course Syllabus

    In this class students will learn the fundamentals of statistical inference. Students will receive a broad overview of the goals, assumptions and modes of performing statistical inference. Students will be able to perform inferential tasks in highly targeted settings and will be able to use the skills developed as a roadmap for more complex inferential challenges.
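As a small illustration of the randomization-based mode of inference mentioned above, here is a minimal sketch of an exact two-sample permutation test. It is written in Python purely for illustration (the course itself works in R), the function name `permutation_test` and the toy data are hypothetical, and it is not taken from the course materials.

```python
from itertools import combinations
from statistics import mean


def permutation_test(group_a, group_b):
    """Exact two-sided permutation test for a difference in means.

    Enumerates every way of relabelling the pooled observations into
    groups of the original sizes and returns the fraction of relabellings
    whose absolute mean difference is at least as large as the observed one.
    """
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    observed = abs(mean(group_a) - mean(group_b))

    extreme = 0
    total = 0
    for idx in combinations(range(len(pooled)), n_a):
        chosen = set(idx)
        a = [pooled[i] for i in range(len(pooled)) if i in chosen]
        b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        # Small tolerance guards against floating-point rounding in the means.
        if abs(mean(a) - mean(b)) >= observed - 1e-12:
            extreme += 1
    return extreme / total


# Toy data (made up for this example): two fully separated groups.
p_value = permutation_test([1, 2, 3, 4], [5, 6, 7, 8])
print(p_value)
```

Full enumeration is only practical for small samples; with larger groups one would typically sample random relabellings instead.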

  4. Regression Models

    About the Course

    Linear models, as their name implies, relate an outcome to a set of predictors of interest using linear assumptions. Regression models, a subset of linear models, are the most important statistical analysis tool in a data scientist’s toolkit. This course covers regression analysis, least squares and inference using regression models. Special cases of the regression model, ANOVA and ANCOVA, will be covered as well. Analysis of residuals and variability will be investigated. The course will cover modern thinking on model selection and novel uses of regression models including scatterplot smoothing.

    Course Syllabus

    In this course students will learn how to fit regression models, how to interpret coefficients, and how to investigate residuals and variability. Students will further learn special cases of regression models, including use of dummy variables and multivariable adjustment. Extensions to generalized linear models, especially Poisson and logistic regression, will be reviewed.
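To make the least squares idea concrete, the sketch below fits a one-predictor regression line, reads off its coefficients, and computes the residuals. It uses Python for illustration only (the course teaches this in R); the helper names and the toy data are assumptions made for this example.

```python
from statistics import mean


def fit_simple_regression(x, y):
    """Return (intercept, slope) minimizing the sum of squared residuals."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def residuals(x, y, intercept, slope):
    """Observed outcome minus fitted value for each observation."""
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


# Toy data (made up for this example).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

intercept, slope = fit_simple_regression(x, y)
res = residuals(x, y, intercept, slope)

# The slope is the expected change in y for a one-unit increase in x.
print(f"intercept={intercept:.2f}, slope={slope:.2f}")
```

With an intercept in the model, least squares residuals sum to zero, which is a quick sanity check before looking at them for patterns.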

  5. Practical Machine Learning

    About the Course

    Prediction and machine learning are among the most common tasks performed by data scientists and data analysts. This course will cover the basic components of building and applying prediction functions with an emphasis on practical applications. The course will provide basic grounding in concepts such as training and test sets, overfitting, and error rates. The course will also introduce a range of model-based and algorithmic machine learning methods including regression, classification trees, Naive Bayes, and random forests. The course will cover the complete process of building prediction functions including data collection, feature creation, algorithms, and evaluation.

    Course Syllabus

    Upon completion of this course you will understand the components of a machine learning algorithm. You will also know how to apply multiple basic machine learning tools. You will also learn to apply these tools to build and evaluate predictors on real data.
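The training/test set idea can be sketched in a few lines: hold out part of the data, fit on the rest, and compare error rates on both parts. The example below is a Python illustration only (the course uses R). It uses a nearest-centroid classifier as a deliberately simple stand-in, not one of the methods listed in the course description, and the synthetic data and function names are assumptions.

```python
import random


def train_test_split(rows, test_fraction, rng):
    """Shuffle a copy of the rows and hold out a fraction as the test set."""
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def fit_centroids(train):
    """Compute the mean feature value for each class label."""
    sums, counts = {}, {}
    for x, label in train:
        sums[label] = sums.get(label, 0.0) + x
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}


def predict(centroids, x):
    """Assign the label whose centroid is closest to x."""
    return min(centroids, key=lambda label: abs(x - centroids[label]))


def error_rate(centroids, rows):
    """Fraction of rows whose predicted label differs from the true label."""
    wrong = sum(1 for x, label in rows if predict(centroids, x) != label)
    return wrong / len(rows)


# Synthetic one-feature data: two overlapping classes (made up for this example).
rng = random.Random(0)
data = [(rng.gauss(0.0, 1.0), 0) for _ in range(100)]
data += [(rng.gauss(3.0, 1.0), 1) for _ in range(100)]

train, test = train_test_split(data, 0.3, rng)
centroids = fit_centroids(train)
train_error = error_rate(centroids, train)
test_error = error_rate(centroids, test)
print(f"train error={train_error:.3f}, test error={test_error:.3f}")
```

The test error is the honest estimate of how the predictor behaves on new data; a training error far below the test error is the usual symptom of overfitting.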

  6. Developing Data Products

    About the Course

    A data product is the production output from a statistical analysis. Data products automate complex analysis tasks or use technology to expand the utility of a data-informed model, algorithm or inference. This course covers the basics of creating data products using Shiny, R packages, and interactive graphics. The course will focus on the statistical fundamentals of creating a data product that can be used to tell a story about data to a mass audience.

    Course Syllabus

    Students will learn how to communicate using statistics and statistical products. Emphasis will be placed on communicating uncertainty in statistical results. Students will learn how to create simple Shiny web applications and R packages for their data products.

  7. Capstone Project

    The capstone project class will allow students to create a usable, public data product that can be used to show their skills to potential employers. Projects will be drawn from real-world problems and will be conducted with industry, government, and academic partners. The capstone project will be four weeks long, offered in conjunction with the series. The capstone class will be offered three times a year.

  • 3076
  • April 28, 2014, 16:10