Step 1

Introduction. Examples, data science articulated, history and context, technology landscape

Step 2

Databases and the relational algebra

Step 3

Parallel databases, parallel query processing, in-database analytics

Step 4

MapReduce, Hadoop, relationship to databases, algorithms, extensions, languages

Step 5

Key-value stores and NoSQL; tradeoffs of SQL and NoSQL

Step 6

Topics in statistical modeling: basic concepts, experiment design, pitfalls

Step 7

Topics in machine learning

Step 8

Visualization, data products, visual data analytics

Step 9

Provenance, privacy, ethics, governance

Step 10

Guest Lectures

Step 11

Graph Analytics

July 10, 2014
Goal completed July 9, 2015

Goal author

Алексей

Russia, Moscow

42 years old

Knowledge and Skills

Introduction to Data Science on Coursera

Commerce and research are being transformed by data-driven discovery and prediction. Skills required for data analytics at massive levels – scalable data management on and off the cloud, parallel algorithms, statistical modeling, and proficiency with a complex ecosystem of tools and platforms – span a variety of disciplines and are not easy to obtain through conventional curricula. Tour the basic techniques of data science, including both SQL and NoSQL solutions for massive data management (e.g., MapReduce and contemporaries), algorithms for data mining (e.g., clustering and association rule mining), and basic statistical modeling (e.g., linear and non-linear regression).
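The description above names MapReduce as one of the approaches to massive data management. As a rough illustration of the programming model only, here is a minimal single-process sketch in Python. Word count is the conventional teaching example and is an assumption here, not something taken from the course materials; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical, and a real framework such as Hadoop would run the map and reduce phases in parallel across machines.

```python
from collections import defaultdict


def map_phase(doc_id, text):
    """Map: emit a (word, 1) pair for every word in one document."""
    for word in text.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(word, counts):
    """Reduce: combine all values for one key into a single result."""
    return word, sum(counts)


def word_count(documents):
    """Run map, shuffle, and reduce in sequence over a dict of documents."""
    pairs = (
        pair
        for doc_id, text in documents.items()
        for pair in map_phase(doc_id, text)
    )
    return dict(reduce_phase(word, counts) for word, counts in shuffle(pairs).items())


if __name__ == "__main__":
    docs = {"d1": "big data big ideas", "d2": "data science"}
    print(word_count(docs))
```

The point of the split is that `map_phase` sees one document at a time and `reduce_phase` sees one key at a time, which is what would let a distributed runtime spread the work across many workers.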

  1. Introduction. Examples, data science articulated, history and context, technology landscape

    Readings

  2. Databases and the relational algebra

    Readings

  3. Parallel databases, parallel query processing, in-database analytics

    Readings for step 3-4-5

    Data cleaning, entity resolution, data integration, information extraction
    (NOT COVERED IN LECTURES)

    Readings / Talks

    Elmagarmid et al., Duplicate Record Detection: A Survey
    Koudas et al., Record Linkage: Similarity Measures and Algorithms
  4. MapReduce, Hadoop, relationship to databases, algorithms, extensions, languages

  5. Key-value stores and NoSQL; tradeoffs of SQL and NoSQL

  6. Topics in statistical modeling: basic concepts, experiment design, pitfalls

    Readings

  7. Topics in machine learning

    1. Supervised learning (rules, trees, forests, nearest neighbor, regression)
    2. Optimization (gradient descent and variants)
    3. Unsupervised learning

    Readings

    Unsupervised learning: k-means, multi-dimensional scaling

    Readings

  8. Visualization, data products, visual data analytics

    Readings (well, watchings)

  9. Provenance, privacy, ethics, governance

    Backlash: Ethics, privacy, unreliable methods, irreproducible results
    (NOT COVERED IN LECTURES)

  10. Guest Lectures

  11. Graph Analytics

    • structure
    • traversals
    • analytics
    • PageRank
    • community detection
    • recursive queries
    • semantic web

    Readings

    Sherif Sakr, Processing large-scale graph data: A guide to current technology, June 2013 (more to come)
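
Step 11 lists PageRank among the graph analytics topics. A minimal sketch of the usual power-iteration formulation might look like the following. The damping factor of 0.85, the fixed iteration count, the handling of dangling nodes, and the function name `pagerank` are all assumptions made for illustration, not details from the course; the graph is assumed to be a dict in which every node, including every link target, appears as a key.

```python
def pagerank(graph, damping=0.85, iterations=50):
    """Approximate PageRank by power iteration.

    graph maps each node to a list of nodes it links to. Every link
    target must also be a key of graph (an assumption of this sketch).
    """
    nodes = list(graph)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}

    for _ in range(iterations):
        # Every node starts each round with the "random jump" share.
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node, out_links in graph.items():
            if out_links:
                # Split this node's rank evenly among its out-links.
                share = damping * rank[node] / len(out_links)
                for target in out_links:
                    new_rank[target] += share
            else:
                # Dangling node: spread its rank evenly over all nodes.
                share = damping * rank[node] / n
                for target in nodes:
                    new_rank[target] += share
        rank = new_rank
    return rank


if __name__ == "__main__":
    web = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
    for node, score in sorted(pagerank(web).items()):
        print(node, round(score, 3))
```

Because each round redistributes all of the existing rank, the scores keep summing to 1, so they can be read as the share of time a random surfer would spend on each node.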
  • July 10, 2014, 11:15