Everyone's Korean Text Analysis and Natural Language Processing with Python

Python Korean Text Analysis and Natural Language Processing: Word Cloud Visualization, Morphological Analysis, Topic Modeling, Clustering, Similarity Analysis, Bag of Words and TF-IDF for Text Data Vectorization, Text Classification Using Machine Learning and Deep Learning, and How to Use Hugging Face

(4.8) 20 reviews

587 learners

  • todaycode
Theory and practice
Text analysis
Natural language processing
NLP
Text Mining
Machine Learning(ML)
data-clustering
Big Data


What you will learn!

  • Word cloud visualization

  • Morphological analysis

  • Topic modeling

  • Clustering

  • Similarity analysis

  • Bag of Words and TF-IDF for Text Data Vectorization

  • Text classification using machine learning and deep learning (RNN, LSTM)

  • Utilizing BERT and koGPT2 with Hugging Face

📚 Get insights from complex documents with text analysis and natural language processing!

  • 💻 Understand the essence of language and learn how to effectively preprocess and analyze text data.

  • 🚀 Learn powerful NLP tools and text mining techniques, and build practical skills for making better decisions in business settings. 🛠 📊

  • 🗝 Find the key to transforming your business with Python text analytics. 💬 🔍

Recommended for

📊 Planner, Marketer, Analyst 🕵‍♂️

  • Qualitative analysis of customer feedback, focus group interviews (FGI), inquiries, and complaints

  • Understanding market trends by analyzing online product reviews

  • Brand monitoring through market research and social media analysis of competing products

🔬 Researcher 🧪

  • Understanding social interactions and cultural phenomena through social media data

  • Extracting information from research papers

  • Analysis of extensive news articles, speeches, patents, and legal policy documents.

🎓 Student 📚

  • Complete text projects and assignments

  • Analysis of academic papers

  • Cultivating data literacy through information exploration based on text big data

  • Anyone who wants to gain insight from text and develop problem-solving skills

Course contents

Text preprocessing

  • Regular Expressions, Text Cleaning


  • Tokenization


  • Korean Morphological Analyzer KoNLPy

  • Pure Python Korean Morphological Analyzer PeCab

  • Noun extraction and part-of-speech tagging

  • Stemming and Lemmatization

  • Stop Words
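The preprocessing steps listed above can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the course's own code: the stop word list is hypothetical, and whitespace splitting stands in for real tokenization. For Korean text, a morphological analyzer such as KoNLPy or PeCab would normally replace the whitespace split.

```python
import re

# Hypothetical stop word list, for illustration only.
STOP_WORDS = {"the", "a", "is", "and", "was"}


def clean_text(text: str) -> str:
    """Lowercase the text, replace every character that is not a digit,
    a Latin letter, a Hangul syllable, or whitespace with a space,
    then collapse runs of whitespace."""
    text = text.lower()
    text = re.sub(r"[^0-9a-z가-힣\s]", " ", text)
    return re.sub(r"\s+", " ", text).strip()


def tokenize(text: str) -> list[str]:
    """Split cleaned text on whitespace and drop stop words."""
    return [tok for tok in clean_text(text).split() if tok not in STOP_WORDS]


tokens = tokenize("The food was GREAT!!!  And the service is fast :)")
print(tokens)  # ['food', 'great', 'service', 'fast']
```

Cleaning before tokenizing keeps punctuation and emoji from ending up as separate tokens, which matters for the frequency counts built on top of them.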

Word Cloud Visualization

Word vectorization

  • Term Frequency Calculation

  • TF-IDF (Term Frequency-Inverse Document Frequency)

  • Word Embedding
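To make term frequency and TF-IDF concrete, here is a small sketch of the arithmetic on a made-up three-document corpus. It assumes the textbook definition idf = log(N / df); library implementations such as scikit-learn's TfidfVectorizer apply a smoothed variant and normalization by default, so their numbers will differ.

```python
import math
from collections import Counter

# Made-up, already tokenized documents (illustration only).
docs = [
    ["food", "great", "service", "fast"],
    ["food", "cold", "service", "slow"],
    ["great", "price", "great", "taste"],
]


def term_frequency(tokens):
    """Count how often each term occurs in one document."""
    return Counter(tokens)


def inverse_document_frequency(docs):
    """idf(term) = log(N / df), where df is the number of documents containing the term."""
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    return {term: math.log(n_docs / df) for term, df in doc_freq.items()}


def tf_idf(docs):
    """Weight each term count by the term's inverse document frequency."""
    idf = inverse_document_frequency(docs)
    return [
        {term: count * idf[term] for term, count in term_frequency(doc).items()}
        for doc in docs
    ]


weights = tf_idf(docs)
for doc_weights in weights:
    print({term: round(w, 3) for term, w in doc_weights.items()})
```

Terms that appear in only one document, such as "fast", receive the highest idf, while terms shared across documents, such as "food", are weighted down.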

Topic modeling through word vectorization

Topic modeling, clustering, and similarity analysis

  • Latent Dirichlet Allocation (LDA)

  • Non-Negative Matrix Factorization (NMF)

  • Grouping similar documents with document clustering

  • Finding and recommending similar documents through similarity analysis
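Similarity analysis is often based on cosine similarity between document vectors. The sketch below is an assumption about one common approach, not the course's implementation: it compares raw bag-of-words counts, and the FAQ-style documents and the helper name most_similar are invented for illustration.

```python
import math
from collections import Counter


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(query, docs):
    """Return (index, score) of the document closest to the query tokens."""
    q = Counter(query)
    scores = [cosine_similarity(q, Counter(d)) for d in docs]
    best = max(range(len(docs)), key=scores.__getitem__)
    return best, scores[best]


# Invented, already tokenized FAQ-style documents.
docs = [
    ["how", "to", "renew", "passport"],
    ["bus", "fare", "refund"],
    ["passport", "renew", "fee"],
]
index, score = most_similar(["renew", "my", "passport"], docs)
print(index, round(score, 3))  # 2 0.667
```

The same function works on TF-IDF weights instead of raw counts, which usually reduces the influence of very common words.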

NMF Topic Modeling

Text classification

  • Text classification techniques using machine learning

  • Hyperparameter tuning methods to improve machine learning performance

  • How to measure classification quality

  • Deep learning-based classification with TensorFlow (DNN, RNN, LSTM)
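Classification quality is commonly measured with accuracy, precision, recall, and F1. The following is a minimal standard-library sketch of those formulas on made-up labels. In practice a library such as scikit-learn provides equivalent metric functions, and which metrics the course emphasizes is not stated here.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def precision_recall_f1(y_true, y_pred, positive):
    """Precision, recall, and F1 for one class treated as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if precision + recall else 0.0
    return precision, recall, f1


# Made-up sentiment labels for six reviews.
y_true = ["pos", "pos", "neg", "neg", "pos", "neg"]
y_pred = ["pos", "neg", "neg", "pos", "pos", "neg"]

acc = accuracy(y_true, y_pred)
precision, recall, f1 = precision_recall_f1(y_true, y_pred, positive="pos")
print(round(acc, 3), round(precision, 3), round(recall, 3), round(f1, 3))
```

Accuracy alone can be misleading when classes are imbalanced, which is why precision and recall are usually reported alongside it.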


Difference between text vectorization and embedding

  • Understanding the difference between vectorization and embedding

  • Using the Embedding Projector

  • Using deep learning models

  • Measuring model performance with TensorBoard

  • Text classification and visualization using the BERT model

Model performance evaluation using TensorBoard

Embedding Projector Visualization

Word distance via embedding projector

How to use Hugging Face and key language model tasks

  • Natural Language Generation

  • 📖 Document Summarization


  • 🌐 Language Translation

  • Latest text analysis trends and practical applications

How to use the official Hugging Face tutorial

What will you understand and be able to do after taking the course?

  • 📝 Tokenization

    • Split text into individual words, phrases, sentences, etc.

  • 🏷 Part-of-Speech Tagging

    • Learn how to tag each token (word) with a part of speech (noun, verb, etc.) and remove particles, punctuation, etc.


  • 📚 Topic Modeling, Clustering, Similarity Analysis

    • Extract hidden topics from a set of documents.

    • Cluster similar texts.

    • Find or recommend similar texts.

  • 📊 Text Classification

    • Assign documents to predefined categories.

  • 😃 Sentiment Analysis

    • Analyze positive, negative, and neutral sentiments in text.

  • 🔑 Keyword Extraction

    • Extract important keywords or phrases from text.


Practice Materials - Available in two versions: practice and completed

Practice notebook without code (*_input.ipynb)

Practice notebook with code (*_output.ipynb)

Each exercise comes as two notebooks. The practice file (*_input.ipynb) contains only the explanations, so you can write the code yourself. The completed file (*_output.ipynb) contains both the explanations and the code, so you can follow along or check your work.

Theory materials

More than 200 pages of slides explaining the core concepts of natural language processing (NLP)

This course was created by a co-author of Everyone's Korean Text Analysis.

Have questions? Check these first! 🙋‍♀

Q. Can non-majors also take the course?

Yes. If you understand basic Python syntax, you can follow along easily even without a related degree, because the course mainly uses the APIs of morphological analyzers, scikit-learn, and pandas. It is designed for planners, marketers, analysts, and non-IT researchers who want to apply text analysis in business and other fields. It may therefore not suit those who want to build AI models themselves or work through the underlying formulas from the ground up.

Q. Is it the same as the Everyone's Text Analysis book videos released on YouTube?

Most of this course is newly filmed. It overlaps with the videos on the YouTube channel only in parts of the Python basics, pandas basics, and classification basics. Topic modeling, clustering, similarity analysis, dimensionality reduction, and the use of deep learning are covered in much more detail than on YouTube. Before purchasing, watch the YouTube videos to check whether the course is what you are looking for. => https://bit.ly/pytextbook-youtube

Q. Is the content the same as the book? Do I need to buy the book too?

Some parts overlap with the book and some do not. Topic modeling, clustering, and similar topics are covered in more detail than in the book, and not every example from the book is covered.
You can take the course without the book. The book is recommended for those who want to review the material in an organized, written form.

Q. What level of computer performance is required to take the course?

Any PC or laptop with at least 8 GB of memory and about 20 GB of free storage will do. If your computer is less powerful, you can practice in Google Colaboratory instead.

Q. How much ground does the course cover?

Starting from a small example corpus of food reviews, we work with Seoul 120 FAQ data, shopping reviews, and KLUE news topic data.
We cover tokenization, morphological analysis, topic modeling, clustering, similarity analysis, and machine learning.
We also cover how to use models that have already been shared on Hugging Face.

Q. Does it cover mathematics, probability, or statistics?

Rather than teaching math, probability, and statistics directly, the course relies on scikit-learn, pandas, TensorFlow, PyTorch, and Hugging Face.

Things to note before taking the class

Not recommended for 🚫

  • 🙅‍♂ Anyone who wants to learn the mathematics and principles behind LLMs and build an LLM from scratch

  • 🙅‍♂ Anyone who wants to develop LLM-based AI services

Practice environment

  • Operating System and Version (OS): Any operating system is fine as long as Python is installed and Jupyter or Colab is used.

  • Tools used: Jupyter or Google Colab.

  • PC specifications: At least 8 GB of RAM and 20 GB of free storage space is enough to take the course comfortably.

Learning Materials

  • We provide Colab links and Jupyter notebook files for the hands-on exercises.

  • Each exercise comes as two files: one with explanations and code, and one with explanations only so you can write the code yourself.

Please watch some of the lessons available through Inflearn preview or the YouTube channel before deciding whether to enroll, and check that the course matches the direction you want to learn in. ( => https://bit.ly/pytextbook-youtube ) If you have any questions, please ask through the inquiry form before enrolling. Compared with the YouTube content, the course covers a much wider range of tasks and deep learning techniques, and in more detail.

Prerequisite Knowledge and Notes

  • An understanding of basic Python syntax is required.

  • You should know how to use Jupyter or Google Colaboratory.


Who is this course right for?

  • Business professionals who need text analysis

  • Researchers who need topic modeling or similarity analysis for research and papers

  • Students who want to do a text analysis project

  • Job seekers looking to build a text analysis portfolio

What do you need to know before starting?

  • Basic Python Syntax

  • How to use Jupyter or Google Colab

Hello
This is

Learners: 18,830

Reviews: 787

Answers: 1,334

Rating: 4.9

Courses: 6

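  • Microsoft MVP (Python Developer Technologies)

  • todaycode (오늘코드) YouTube 📺 https://youtube.com/todaycode

  • “Hoping for the day when everyone is comfortable with data” – Micro Software (마이크로소프트웨어)

  • Course designer and instructor for the data science courses on Naver Connect Foundation's Boostcourse

  • Lectures at many educational institutions and companies, including the Seoul National University Big Data Innovation Sharing University, Seoul National University Institute of Lifelong Education, Yonsei University DX Academy, Hanshin University ABC Camp, Hanyang University Graduate School, Chonnam National University, Korea Management Association, Samsung SDS Multicampus, Likelion, Fast Campus, and Modulabs

  • Corporate data analysis across various domains (pharmaceuticals, telecommunications, automotive, commerce, education, government agencies, etc.)

  • More than 20 years of industry experience as a web backend developer and data analyst in domains such as games, advertising, and education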

Curriculum

53 lectures ∙ (18hr 6min)

Course Materials:

Lecture resources

Reviews

4.8 (20 reviews)

  • ckfs8971 (7 reviews, average rating 5.0)

    Rating: 5, 100% of the course completed

    I took this course after instructor Park Jo-eun's class on Naver Boostcourse and after learning the basics of PyTorch and TensorFlow. It is easy to follow and good for non-majors, and about 90% of it is project-focused practice, so I recommend it. It also helped me easily understand deep learning concepts that I had struggled to study. It would be great if the marketing analysis shown on YouTube were also offered as a course.

  • geogeo20205381 (2 reviews, average rating 5.0)

    Rating: 5, 100% of the course completed

  • sygogu4600 (2 reviews, average rating 5.0)

    Rating: 5, 30% of the course completed

  • iklee7286 (1 review, average rating 5.0)

    Rating: 5, 60% of the course completed

  • jisulim8819 (1 review, average rating 5.0)

    Rating: 5, 100% of the course completed

    I enjoyed the good content.

$59.40
