Hands-on Practical Spark Part 1

By the end of this course, you will be able to implement Apache Spark projects in your organization.

22 learners are taking this course

  • nexthumans
Hands-on Focused
Commands
Data Engineer
Data Processing
Apache Spark
Big Data
Machine Learning(ML)
data-transformation

What you will learn!

  • Using Spark Core Commands

  • Spark-based Data Science

Hands-on Practical Spark Part 1

Course Introduction

"Hands-on Practical Spark Part 1" is a practice-oriented course for everyone from learners new to data science to professionals preparing for real-world Spark projects. It moves systematically from Spark's basic concepts to practical applications, with particular focus on the commands and data processing methods that Spark projects depend on.

@Apache Spark, @Big Data, @Machine Learning, @Data Engineering, @Data Transformation

Course Objectives

  • Spark Fundamentals and Environment Setup: Learn how Spark works and how to configure it so you can use it efficiently in both local and Docker environments.

  • Distributed Data Processing and Optimization: Learn the fundamentals of large-scale data processing through hands-on practice with Spark's distributed processing concepts, data partitioning, shuffling, and cluster resource configuration.

  • Practical Data Processing Skills: Learn advanced data processing techniques by loading, transforming, filtering, and combining data with a range of Spark commands.

  • Developing Data Analysis and Visualization Skills: Analyze data using Spark's DataFrames and SQL commands, and visualize the results.


Curriculum Structure

  1. Orientation

    • Introduces Spark's core concepts and practical uses, and outlines the learning path.

  2. Spark Environment Setup

    • Learn how to install and configure Spark in a local environment and with Docker to set up a practice environment.

  3. Distributed Processing Concepts

    • Learn how Spark processes large-scale data and the basic principles of distributed processing.

  4. Understanding Spark Operations

    • See core operating principles such as lazy evaluation, partitioning, and shuffling in action through Jupyter Notebook and the Spark UI.

  5. Essential Spark Commands for Real-World Practice

    • Learn commands used frequently in practice, such as data loading, date filtering, joins, aggregation, UDFs, and saving data.

    • Also covers how to use SQL commands efficiently.

  6. Advanced Data Processing

    • You will learn advanced techniques for handling common real-world problems such as string data processing, null value handling, JSON data manipulation, and partition optimization.
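
The lazy evaluation, partitioning, and shuffling concepts named in item 4 can be sketched in plain Python. This is not Spark code and not taken from the course material: `LazyDataset`, `partition_for`, `shuffle_by_key`, and `sum_by_key` are hypothetical names invented for illustration, and the CRC32-based partitioner is an assumption chosen for repeatability, not Spark's own hashing.

```python
import zlib
from collections import defaultdict


class LazyDataset:
    """Hypothetical toy dataset: transformations are recorded, not executed."""

    def __init__(self, rows, steps=()):
        self._rows = list(rows)
        self._steps = tuple(steps)

    def map(self, fn):
        # Transformation: returns a new plan and runs nothing yet.
        return LazyDataset(self._rows, self._steps + (("map", fn),))

    def filter(self, pred):
        # Transformation: also only extends the recorded plan.
        return LazyDataset(self._rows, self._steps + (("filter", pred),))

    def collect(self):
        # Action: only now is the recorded plan applied to each row.
        out = []
        for row in self._rows:
            keep = True
            for kind, fn in self._steps:
                if kind == "map":
                    row = fn(row)
                elif not fn(row):
                    keep = False
                    break
            if keep:
                out.append(row)
        return out


def partition_for(key, num_partitions):
    # Deterministic hash partitioner (assumption: CRC32 of the key's string form).
    return zlib.crc32(str(key).encode("utf-8")) % num_partitions


def shuffle_by_key(input_partitions, num_partitions):
    # Shuffle: move every (key, value) pair to the partition its key hashes to,
    # so all values for one key end up in the same partition.
    output = [[] for _ in range(num_partitions)]
    for partition in input_partitions:
        for key, value in partition:
            output[partition_for(key, num_partitions)].append((key, value))
    return output


def sum_by_key(partition):
    # Per-partition aggregation, safe once a key lives in a single partition.
    totals = defaultdict(int)
    for key, value in partition:
        totals[key] += value
    return dict(totals)


calls = []


def double(x):
    calls.append(x)  # record each invocation to show when work really happens
    return x * 2


plan = LazyDataset([1, 2, 3, 4]).map(double).filter(lambda x: x > 4)
calls_before_action = len(calls)  # still 0: building the plan ran nothing
result = plan.collect()           # the action triggers execution

input_partitions = [[("a", 1), ("b", 2)], [("a", 3), ("c", 4)]]
shuffled = shuffle_by_key(input_partitions, num_partitions=2)
merged = {}
for partition in shuffled:
    merged.update(sum_by_key(partition))

print(calls_before_action, result)
print(merged)
```

In Spark itself the same split appears as transformations, which only build a plan, and actions, which run it. Key-based operations such as joins and aggregations are the ones that typically trigger a shuffle.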


Who is this course for?

  • Beginners who want to learn Spark from the basics through to practical application

  • Data engineers who want to learn data analysis and engineering techniques using Spark

  • Working professionals who want to carry out corporate Spark projects or build scalable data pipelines


Expected Benefits After Taking the Course

  • You will develop data processing and analysis skills with Spark and gain the competence to run Spark projects in enterprise environments.

  • You will acquire practical know-how for efficiently processing large-scale data by loading, transforming, and storing data in real-world scenarios.

  • You will build a solid foundation for the cloud-based Spark projects covered in Part 2.


If you're just starting with Spark or want to learn practical data processing skills, "Hands-on Practical Spark Part 1" will be the perfect starting point. Let's move forward together into the world of data science! 🎓✨

Who is this course right for?

  • People who are new to Spark

  • People who want to carry out a corporate Spark project

What do you need to know before starting?

  • Python basics (a very basic level is enough)

Hello
This is

127 learners ∙ 11 reviews ∙ 24 answers ∙ 4.9 rating ∙ 3 courses

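I currently lead development and provide consulting on projects like those listed below, mainly for large enterprises. I am still active in the field.

I also serve as an adjunct professor of artificial intelligence at the Korea University graduate school.

My goal is to teach field-tested programming skills that you can put to use right away. I look forward to building enjoyable classes together with many of you.

  • Enterprise AI architecture and service design

  • Machine learning service implementation

  • Backend service development

  • Database construction and service development in cloud environments, including Azure Databricks, ETL, and Fabric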

Curriculum

48 lectures ∙ (10hr 18min)

$77.00