Goal abandoned
The author has not posted in this goal for 4 years 5 months 13 days
Spark: The Definitive Guide
I work on the support team for a project that was written many years ago on Hadoop. I have learned a lot in that time, but it is time to move on. My main goal is to move this project onto Spark.
This book will be the starting point for building up my expertise.
Goal Accomplishment Criteria
Work through the book Spark: The Definitive Guide.
- I. Gentle Overview of Big Data and Spark
  - What Is Apache Spark?
  - A Gentle Introduction to Spark
  - A Tour of Spark’s Toolset
- II. Structured APIs—DataFrames, SQL, and Datasets
  - Structured API Overview
  - Basic Structured Operations
  - Working with Different Types of Data
  - Aggregations
  - Joins
  - Data Sources
  - Spark SQL
  - Datasets
- III. Low-Level APIs
  - Resilient Distributed Datasets (RDDs)
  - Advanced RDDs
  - Distributed Shared Variables
- IV. Production Applications
  - How Spark Runs on a Cluster
  - Developing Spark Applications
  - Deploying Spark
  - Monitoring and Debugging
  - Performance Tuning
- V. Streaming
  - Stream Processing Fundamentals
  - Structured Streaming Basics
  - Event-Time and Stateful Processing
  - Structured Streaming in Production
- VI. Advanced Analytics and Machine Learning
  - Advanced Analytics and Machine Learning Overview
  - Preprocessing and Feature Engineering
  - Classification
  - Regression
  - Recommendation
  - Unsupervised Learning
  - Graph Analytics
  - Deep Learning
- VII. Ecosystem
  - Language Specifics: Python (PySpark) and R (SparkR and sparklyr)
  - Ecosystem and Community
- 07 June 2020, 17:13