![[개정판] 파이썬 머신러닝 완벽 가이드Course Thumbnail](https://cdn.inflearn.com/public/courses/324238/cover/7e380aa0-48ba-4ee7-a6b2-8da7900568d6/324238-eng.png?w=420)
[개정판] 파이썬 머신러닝 완벽 가이드
권 철민
이론 위주의 머신러닝 강좌에서 탈피하여 머신러닝의 핵심 개념을 쉽게 이해함과 동시에 실전 머신러닝 애플리케이션 구현 능력을 갖출 수 있도록 만들어 드립니다.
Basic
Python, 머신러닝, 통계
If you want to be recognized as a machine learning expert based on large-scale data, from understanding the core framework of Spark machine learning, to SQL-based data processing through difficult practical problems, to data analysis through business domain analysis, and to the ability to implement optimized machine learning models, please join this course.
938 learners

Implementing Machine Learning Models in Spark
Detailed understanding of DataFrame, the foundation of Spark's data processing
Understand the various technical elements that make up the Spark Machine Learning Framework
Mastering Spark's Machine Learning Pipeline
Ability to use SQL for data analysis
SQL-based Feature Engineering Techniques
Implementing models with XGBoost and LightGBM in Spark
Model hyperparameter tuning method based on Bayesian optimization
Improve your data analysis and ML model implementation skills simultaneously through challenging real-world problems.
Data analysis method based on analysis domain
Various data visualization techniques
Since the changes to the local environment are limited to specific parts of the practice code, most of the lecture videos from Section 1 to Section 10 will continue to use the existing recorded videos from Databricks Community, and the course has been newly structured with practice videos on local Spark only for the major changes. Additionally, from Section 11 onwards, all practice videos will be on local Spark, and the course will be newly structured by January 15, 2026, so please keep this in mind when selecting the course.
Data Analysis + Feature Engineering + ML Implementation,
Master all three skills at once.
The strongest open-source large-scale distributed processing solution, Apache Spark, meets Machine Learning.
Many large domestic corporations and financial institutions are using Apache Spark to analyze large-scale data and build machine learning models. Since Spark is based on a distributed data processing framework, it can process large volumes of data and create ML models while scaling capacity across anywhere from a few to dozens of servers. This allows you to overcome the limitations of scikit-learn, which can only implement machine learning models on a single server.
The 'Complete Guide to Spark Machine Learning - Part 1' course will help you grow into a machine learning expert proficient in data processing and analysis, going beyond just learning how to implement machine learning models in Spark.
To grow into a true machine learning expert, it's crucial not only to have ML implementation skills but also to develop the ability to process and combine business data to create ML models. To achieve this, you will learn through hands-on practice how to process data using SQL, which is most commonly used for handling large-scale data in real-world settings, and data analysis techniques based on business domain analysis.
Implementing machine learning models on a Spark-based platform is not easy. This is because you encounter many problems that existing data scientists and machine learning experts have not experienced before, such as unique machine learning APIs and frameworks based on Spark's architectural specificity, and SQL-based data processing.
Through this course, Spark Machine Learning Complete Guide, I will help you develop the ability to solve the problems you face.
The first half of the course consists of detailed theoretical explanations and abundant hands-on practice covering various components that make up the Spark Machine Learning Framework, including DataFrame, SQL, Estimator, Transformer, Pipeline, and Evaluator. Through this, you will be able to implement ML models in Spark easily and quickly.
Additionally, I will explain in detail how to use XGBoost and LightGBM in Spark, and how to tune hyperparameters using HyperOpt based on Bayesian optimization.
Currently, the latter part of the course consists of a hands-on practice using Kaggle's Instacart Market Basket Analysis competition, but as the Instacart Market Basket Analysis competition has been removed from Kaggle, it will be changed to a hands-on practice using Kaggle's Home Credit Default Risk competition (scheduled to be completed by January 15, 2026)
Through implementing models for Kaggle's Home Credit Default Risk competition, a highly challenging competition, we will simultaneously improve your practical data processing/analysis skills and machine learning model implementation abilities.
Through this dataset, you will learn in detail how to process and analyze business data in SQL, how to perform Feature Engineering, how to derive analytical domains in business contexts, and how to build models based on these derived features.
💻 Please check before enrolling!
This course uses Docker to set up a hands-on environment based on local Spark and Jupyter. The practice environment is configured by installing Docker Desktop on your local PC, and the course is structured so that you can set up the practice environment without any problems even if you don't know Docker.
The practice code and lecture materials can be downloaded from 'Download Practice Code and Materials'.
This course is designed assuming that students have knowledge equivalent to Chapter 5 (Regression) of the Python Machine Learning Perfect Guide, and also have a very basic understanding of SQL. Please refer to these requirements when selecting this course.
Spark is good to know the basics of, but even if you don't, you should have no problem following the course.
Curious about the instructor's interview? (Click)
Who is this course right for?
Anyone who wants to implement machine learning using Spark
Those who want to implement machine learning based on large-scale data
Anyone who wants to improve their data processing techniques for machine learning using SQL
Anyone who wants to learn the entire process of processing data into the desired format and creating an ML model based on it in practice
Anyone who wants to improve data analysis, feature engineering capabilities, and ML implementation
Need to know before starting?
Understanding up to Chapter 5 (Regression) of the Complete Guide to Python Machine Learning or equivalent prior knowledge
Understanding SQL Basics
26,975
Learners
1,376
Reviews
4,011
Answers
4.9
Rating
14
Courses
(전) 엔코아 컨설팅
(전) 한국 오라클
AI 프리랜서 컨설턴트
파이썬 머신러닝 완벽 가이드 저자
All
122 lectures ∙ (24hr 23min)
Course Materials:
All
28 reviews
4.9
28 reviews
Reviews 7
∙
Average Rating 5.0
5
파이썬 머신러닝 완벽가이드 통해서 권철민선생님을 처음 알게 되었습니다. 그 강의를 통해서 비전공자였던 저는 포기하려고 했던 이 분야를 포기하지 않을 수 있었습니다. 현재 이 분야에서 일을 하면서 이렇게 인프런 강의를 들으며 공부도 꾸준히 하고 있습니다. 선생님께 감사하다는 말씀을 전하고 싶어서 처음에 질문답변 사안에 선생님께 감사하다는 말씀을 드렸었는데, 선생님께서 꾸준히 하면 노력한 바를 이룰 수 있을 거라고 응원하면서 말씀해주셨습니다. 앞으로도 선생님께서 강의하시는 것 꾸준히 들을 예정입니다. ^^ㅎㅎ 그만큼 정말 잘 가르쳐주십니다. 권철민 선생님 이 자리를 빌러, 진심으로 정말 감사합니다.
이렇게 가슴 뭉클한 수강평을 남겨 주시다니 제가 더 감명 받았습니다. 강의를 만드는 수고를 한 순간에 보상받는 글이여서 제가 오히려 감사드려야 할 것 같습니다. 앞으로도 계속 이렇게 정진하신다면, 원하는 모든 일 확실히 다 성취 하실 것입니다. 감사합니다.
Reviews 54
∙
Average Rating 5.0
Reviews 13
∙
Average Rating 5.0
Reviews 8
∙
Average Rating 4.9
Reviews 1
∙
Average Rating 5.0
$77.00
Check out other courses by the instructor!
Explore other courses in the same field!