Big Data Cluster Build Package; Roadmap to Success

This is a code-lab-based class in which you build a highly available big data distributed processing cluster (HDFS, Zookeeper, Spark, Zeppelin) yourself.

(4.8) 20 reviews

113 learners

  • jphil
cluster
Hands-on
Big Data
Apache Spark
Hadoop
Data Engineering

What you will learn!

  • Big Data Cluster Setup

  • Distributed File and Processing Systems

  • High Availability

  • Hadoop

  • HDFS

  • Apache Spark

  • Apache Zeppelin

  • Apache Zookeeper

  • AWS (EC2, AMI, Security Group)

Building a big data distributed cluster through code labs:
Big Data Cluster Build Package 👨🏻‍🎓

Hello, this is J.PHIL 🍏

A semester has passed and a good opportunity has come up, so this season I am opening a course titled 'Big Data Cluster Build Package', in which you build a big data distributed cluster yourself 📚

Thanks to your support for my previous "Big Data Pipeline Master" class, I asked myself, "Is there a more challenging yet more meaningful course I could offer?" After much deliberation, I put this course together with care.

Keywords: Big Data Cluster, Distributed System, High Availability, Hadoop, HDFS, Apache Spark, Zookeeper, Zeppelin, AWS EC2 & AMI

Why take this course? 🙇🏻

Over the past decade, rapid technological advances have led to a proliferation of platforms and services, allowing us to collect and analyze the vast amounts of data generated in our daily lives and, in turn, to improve our quality of life.

As shown in Figure 1 below, both large domestic corporations and global giants openly emphasize the importance of big data storage and big data processing, and they expect engineers to have the corresponding analysis and cluster-building skills.

001.png

002.png

However, before entering the industry, it is difficult to gain hands-on experience building or managing a big data cluster. As a result, when a valuable opportunity does arise, a lack of experience can lead to disappointing results.

When I was a researcher, I had to build a big data cluster for 50 people by myself while writing a paper for a top-tier data conference. I carried the burden of setting an example for the members and the stress of covering the costs, and I worked day and night for two weeks, focusing solely on building the cluster.

Of course, I learned a lot from that experience, and it became valuable nourishment for my future. However, I don't want you to spend your time as inefficiently as I did. I created this course in the hope that, instead of spending a precious 200 hours building a cluster, you can devote that time to running experiments or analyzing customer data on top of it. 📝

Above all, I hope that the cluster-building experience you gain from this course will be a great help when you build a big data cluster in the field or in graduate school, as I did. Access to the lectures is unlimited, so please refer back to them whenever you need. 💓

What will we learn? 📚

📝

Experience of writing a paper for a top-tier data conference

👨🏻‍💼

Valuable experience in building and analyzing big data systems gained from the field

🧑🏻‍🏫

Many years of experience teaching students at university

With this valuable experience, we hope to help you create a ⚔️ powerful weapon in your field.

1. HDFS, a distributed file system that guarantees high availability (see the daemon example below), as the foundation

2. On top of it, Apache Spark, a flagship big data system, and Zeppelin, a notebook dedicated to big data

We will build this cluster package ourselves through theory and solid code labs.

image.png

Do the high-availability file system daemon configurations above seem a bit daunting? Seeing architecture and system configuration diagrams for the first time can be overwhelming.

But don't worry. Drawing on valuable feedback from excellent students over the past six years and on the experience of launching my previous two Inflearn courses, I have organized the material step by step so that it is as easy to understand as possible and matched to the students' level. Feel free to follow along.
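As a small aid for reading a high-availability daemon diagram, here is a minimal sketch of the majority-quorum arithmetic that coordination services such as a ZooKeeper ensemble or a group of HDFS JournalNodes rely on. It is not taken from the course materials: the function names are hypothetical, and whether the course's layout uses exactly these daemons is an assumption.

```python
# Illustrative sketch only: majority-quorum arithmetic for an ensemble of
# coordination daemons (for example ZooKeeper servers or HDFS JournalNodes).
# The function names are hypothetical and not part of any Hadoop/ZooKeeper API.


def quorum_size(nodes: int) -> int:
    """Return the smallest number of nodes that forms a strict majority."""
    if nodes < 1:
        raise ValueError("an ensemble needs at least one node")
    return nodes // 2 + 1


def tolerated_failures(nodes: int) -> int:
    """Return how many nodes may fail while a majority is still available."""
    return nodes - quorum_size(nodes)


if __name__ == "__main__":
    # Compare a few ensemble sizes side by side.
    for n in (3, 4, 5):
        print(
            f"{n} nodes: quorum={quorum_size(n)}, "
            f"tolerated failures={tolerated_failures(n)}"
        )
```

Under this arithmetic, an even-sized ensemble tolerates no more failures than the odd size just below it, which is one common reason for choosing an odd number of coordination nodes.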

Special thanks to my lovely students 👨🏻‍🎓

Please tell me about the curriculum 🧑🏻‍🏫

Rather than jumping straight into the code labs, we begin with the theory behind building a high-availability cluster. For students unfamiliar with AWS or Linux environments, video tutorials cover the background knowledge before we move on to the in-depth code labs.

curri-1.jpg

Anyone interested in big data or distributed processing can take this course 🧑🏻‍🎓

What is the training environment like? 💻

The environment below is enough to follow the class comfortably.

  • OS: Ubuntu 20.04 LTS

  • Editor: Vim (or whichever editor you prefer)

  • Machine specifications

    • AWS EC2 c5.large (2 vCPUs, 4 GB RAM), 4 or 5 instances

Please see the course curriculum for more details 😊

Introducing J.PHIL 👨‍👨‍👧‍👦

image.png

Who is this course right for?

  • Students who want to experience building a big data processing system cluster

  • Students interested in data analysis and systems and who wish to pursue a career in this field

  • Developers who want hands-on practice with high-availability clusters

  • Job seekers who want to build strengths in big data analysis and cluster building

Need to know before starting?

  • Basic Python coding

  • Basic knowledge of Linux commands

  • Database Basics

450 learners ∙ 40 reviews ∙ 50 answers ∙ 4.9 rating ∙ 2 courses

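Hello, this is J.PHIL 🧑🏻‍🎓

As my first course, I opened "Mastering Big Data Processing: Tools and Techniques for Success" for [beginners interested in building and analyzing big data systems].

Details about the class and my profile are written up on the course detail page, so please refer to it 🙏🏻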

Curriculum

36 lectures ∙ (4hr 51min)

Reviews

20 reviews

4.8

  • 귤껍데기

    Reviews 3

    Average Rating 4.3

    5

    44% enrolled

    The content is substantial, and I think it is a great course to start with. Thank you for preparing a course like this.

    • one831

      Reviews 1

      Average Rating 5.0

      5

      19% enrolled

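      I'm a computer science student about to graduate who hopes to become a data engineer. While building my job portfolio, I spent a lot of time wondering how to structure a pipeline and architecture for processing big data, and how to set up an AWS environment so I could use it efficiently at the lowest possible cost. This course gave me tremendous insight and know-how. In particular, I learned a great deal about the various frameworks for handling big data, and I'm glad to have found inspiration for which direction to dig into next. It was like welcome rain after a long drought. I recommend this course to students who, like me, are aiming for this field.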

      • J.PHIL
        Instructor

        Hello one831, thank you for your valuable review. I wish you continued good results. Keep it up!

    • 권영미

      Reviews 3

      Average Rating 5.0

      5

      100% enrolled

      Thank you!

      • J.PHIL
        Instructor

        Hello 권영미, thank you for your valuable review! Keep it up!

    • Jason.king

      Reviews 2

      Average Rating 5.0

      5

      36% enrolled

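      I took the earlier pipeline course and am now taking this one, and it sinks in really well, which I love~ Thank you for a compact course that is useful in practice~ I think I'll finish this one quickly too, and I'm looking forward to seeing whether there will be other courses.

      • It took me two days. Because it is in lab format it moves fairly quickly, and I struggled because the NameNode would not start (I probably made a mistake somewhere). Later I found that the troubleshooting guide section covers the startup procedure scripts and how to read the logs. If I had seen that, I would have recovered from my mistake much sooner. For anyone else going through the course, it might be better to read through once carefully before following along, rather than just typing everything. Thank you, instructor, for another great course~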

      • J.PHIL
        Instructor

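        Hello Jason.King, thank you for working through this course so diligently :) Running into bugs and troubleshooting firsthand, then thinking them over and retracing your steps, can often help a great deal, so I believe this experience will actually be a big help later on. Now that you have built a cluster from major open-source projects yourself, you should be able to set up other open-source systems quickly and well. Keep it up!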

    • Yeonwoo Jung

      Reviews 4

      Average Rating 5.0

      5

      31% enrolled

      From theory to code labs, this is a course I truly recommend for beginners!! I recommend it as a must-take course on building big data clusters!!

      • J.PHIL
        Instructor

        Hello Yeonwoo Jung, thank you for your valuable review. When you get the chance, I hope you can invest a day or two in following the labs on AWS and get good results. Happy New Year :)

    $77.00
