Stage 1

Introduction. Examples, data science articulated, history and context, technology landscape

Stage 2

Databases and the relational algebra

Stage 3

Parallel databases, parallel query processing, in-database analytics

Stage 4

MapReduce, Hadoop, relationship to databases, algorithms, extensions, languages

Stage 5

Key-value stores and NoSQL; tradeoffs of SQL and NoSQL

Stage 6

Topics in statistical modeling: basic concepts, experiment design, pitfalls

Stage 7

Topics in machine learning

Stage 8

Visualization, data products, visual data analytics

Stage 9

Provenance, privacy, ethics, governance

Stage 10

Guest Lectures

Stage 11

Graph Analytics

July 10, 2014
Goal completed July 9, 2015
Knowledge and Skills

Introduction to Data Science on Coursera

Commerce and research are being transformed by data-driven discovery and prediction. Skills required for data analytics at massive levels – scalable data management on and off the cloud, parallel algorithms, statistical modeling, and proficiency with a complex ecosystem of tools and platforms – span a variety of disciplines and are not easy to obtain through conventional curricula. Tour the basic techniques of data science, including both SQL and NoSQL solutions for massive data management (e.g., MapReduce and contemporaries), algorithms for data mining (e.g., clustering and association rule mining), and basic statistical modeling (e.g., linear and non-linear regression).

  1. Introduction. Examples, data science articulated, history and context, technology landscape

    Readings

  2. Databases and the relational algebra

    Readings
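
    As an illustration of the relational algebra named in this stage, here is a minimal sketch of selection, projection, and natural join over lists of dictionaries. It is not taken from the course materials; the function names and the `students` / `enrolled` sample relations are hypothetical.

```python
def select(rows, predicate):
    """Selection: keep only the rows that satisfy the predicate."""
    return [r for r in rows if predicate(r)]


def project(rows, columns):
    """Projection: keep only the named columns and drop duplicate rows."""
    seen, out = set(), []
    for r in rows:
        key = tuple(r[c] for c in columns)
        if key not in seen:
            seen.add(key)
            out.append(dict(zip(columns, key)))
    return out


def natural_join(left, right):
    """Natural join: pair rows that agree on every shared column name."""
    out = []
    for l in left:
        for r in right:
            shared = l.keys() & r.keys()
            if all(l[c] == r[c] for c in shared):
                out.append({**l, **r})
    return out


# Hypothetical sample relations, invented for this example.
students = [{"sid": 1, "name": "Ann"}, {"sid": 2, "name": "Bob"}]
enrolled = [
    {"sid": 1, "course": "DB"},
    {"sid": 1, "course": "ML"},
    {"sid": 2, "course": "DB"},
]

# Names of students enrolled in "DB": project(select(join)).
db_students = project(
    select(natural_join(students, enrolled), lambda r: r["course"] == "DB"),
    ["name"],
)
```

    The nested-loop join is quadratic and is chosen only for readability; a real engine would typically use hashing or sorting.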

  3. Parallel databases, parallel query processing, in-database analytics

    Readings for steps 3-4-5

    Data cleaning, entity resolution, data integration, information extraction (NOT COVERED IN LECTURES)

    Readings / Talks

    Elmagarmid et al., Duplicate Record Detection: A Survey; Koudas et al., Record Linkage: Similarity Measures and Algorithms
  4. MapReduce, Hadoop, relationship to databases, algorithms, extensions, languages
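
    To make the MapReduce programming model concrete, the following is a single-process sketch of the map, shuffle, and reduce phases applied to word counting. It only imitates the model in memory and does not use Hadoop; all function names and the sample documents are assumptions made for this example.

```python
from collections import defaultdict


def map_phase(doc_id, text):
    """Map: emit a (word, 1) pair for every word in one document."""
    for word in text.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(word, counts):
    """Reduce: combine the values for one key into a single result."""
    return word, sum(counts)


def word_count(documents):
    pairs = [
        kv
        for doc_id, text in documents.items()
        for kv in map_phase(doc_id, text)
    ]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


# Hypothetical input: document id -> text.
docs = {"d1": "big data big ideas", "d2": "data science"}
counts = word_count(docs)
```

    In a real cluster the map and reduce calls run on different machines and the shuffle moves data over the network; here all three phases are ordinary function calls.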

  5. Key-value stores and NoSQL; tradeoffs of SQL and NoSQL

  6. Topics in statistical modeling: basic concepts, experiment design, pitfalls

    Readings

  7. Topics in machine learning

    1. Supervised learning (rules, trees, forests, nearest neighbor, regression),
    2. Optimization (gradient descent and variants),
    3. Unsupervised learning
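
    The optimization topic above can be illustrated with a minimal sketch of batch gradient descent fitting a line y = w*x + b by minimizing mean squared error. The learning rate, step count, and sample data are assumptions chosen so that this toy problem converges; they are not values from the course.

```python
def fit_line(xs, ys, lr=0.05, steps=2000):
    """Fit y = w*x + b with batch gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Gradients of (1/n) * sum((w*x + b - y)^2) with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        # Step against the gradient.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Hypothetical noise-free data generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
w, b = fit_line(xs, ys)
```

    With a learning rate that is too large the updates diverge instead of converging, which is one reason variants such as adaptive step sizes exist.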

    Readings

    Unsupervised learning: k-means, multi-dimensional scaling
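
    A minimal sketch of k-means (Lloyd's iteration) follows. It assumes a naive initialization that takes the first k points as centroids, which keeps the example deterministic but is not a recommended practice; the helper names and sample points are hypothetical.

```python
def dist2(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coord) / n for coord in zip(*points))


def kmeans(points, k, iters=20):
    # Assumption: naive initialization with the first k points.
    centroids = list(points[:k])
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: dist2(p, centroids[c])) for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = mean(members)
    return centroids, labels


# Hypothetical data: two well-separated groups of three points.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centroids, labels = kmeans(points, k=2)
```

    The result depends on the initial centroids, so practical implementations usually run several random restarts or use a smarter seeding scheme.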

    Readings

  8. Visualization, data products, visual data analytics

    Readings (well, watchings)

  9. Provenance, privacy, ethics, governance

    Backlash: Ethics, privacy, unreliable methods, irreproducible results
    (NOT COVERED IN LECTURES)

  10. Guest Lectures

  11. Graph Analytics

    • structure
    • traversals
    • analytics
    • PageRank
    • community detection
    • recursive queries
    • semantic web
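
    As one example of the analytics listed above, here is a minimal sketch of PageRank computed by power iteration on an adjacency-list graph. The damping factor of 0.85 is the commonly cited default; the iteration count, the uniform handling of dangling nodes, and the three-node sample graph are assumptions made for this example.

```python
def pagerank(graph, damping=0.85, iters=100):
    """Power-iteration PageRank.

    `graph` maps each node to the list of nodes it links to.
    Assumption: every node appears as a key, even if it has no out-links.
    """
    nodes = list(graph)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iters):
        # Every node receives the "random jump" share first.
        new_rank = {v: (1.0 - damping) / n for v in nodes}
        for v, outs in graph.items():
            if outs:
                # Split this node's rank evenly among its out-links.
                share = damping * rank[v] / len(outs)
                for u in outs:
                    new_rank[u] += share
            else:
                # Dangling node: spread its rank uniformly over all nodes.
                for u in nodes:
                    new_rank[u] += damping * rank[v] / n
        rank = new_rank
    return rank


# Hypothetical graph: a -> b, a -> c, b -> c, c -> a.
graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
ranks = pagerank(graph)
```

    In this graph node c collects links from both a and b, so it ends up with the highest score, followed by a and then b.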

    Readings

    Sherif Sakr, Processing large-scale graph data: A guide to current technology, June 2013 (more to come)