CUDA Programming (0) - C/C++/GPU Parallel Computing - Open Sample Lecture

✅ This introductory lecture (0) introduces the entire series, parts (1) to (6). ✅ NVIDIA GPU + CUDA programming is explained step by step from the basics. ✅ Arrays, matrices, image processing, statistical processing, sorting, and more are processed very quickly with parallel computing in C++/C.

(4.9) 45 reviews

1,398 learners

  • onemoresipofcoffee
CUDA
GPU
Parallel Processing
C
C++

What you will learn!

  • Full Series - Massively Parallel Computing with CUDA on GPUs

  • This lecture: Part (0), Introduction to Massively Parallel Computing and CUDA

  • Update - June 2023, "Remastered" 🍀 (some audio, intro video)

  • ✅ Bundle Discount Coupon✳️ provided in the roadmap "CUDA Programming"

Speed is everything in a program!
Make it fast with massive parallel processing techniques 🚀

They say massive parallel computing is important 🧐

CUDA = The most widely used GPU parallel computing technology
Step by step + abundant examples + detailed explanations = This is the course!

Large-scale parallel computing based on GPUs and graphics cards is actively used in AI, deep learning, big data processing, and image/video/audio processing. Currently, the most widely applied technology in GPU parallel computing is NVIDIA's CUDA architecture.

Among parallel computing technologies, large-scale parallel computing and CUDA are considered crucial. However, courses that teach this field systematically are hard to find, so even getting started is difficult. This course teaches CUDA programming step by step. CUDA and parallel computing require a theoretical background and can be challenging, so the course pairs rich examples and background explanations with a thorough treatment of the fundamentals. It is produced as a series so that each topic gets ample lecture time.

This lecture explains how C++/C programmers can use the CUDA library and C++/C functions to accelerate a wide range of problems with massively parallel processing techniques. The approach can be used to speed up existing C++/C programs, or to dramatically accelerate new algorithms and programs by developing them entirely with parallel computing.

📢 Please check before taking the class!

  • For this course, please make sure you have a hardware environment that supports NVIDIA CUDA. You will need a PC or laptop equipped with an NVIDIA GeForce graphics card.
  • NVIDIA GeForce graphics cards can be used in some cloud environments, but cloud configurations change frequently and usually cost money. If you use a cloud environment, be sure to secure one that supports these graphics cards.
  • The practice environment is described in detail in the <00. Preparation before lecture> lesson of the curriculum.

Lecture Features ✨

#1. Rich examples and explanations

CUDA and large-scale parallel computing require extensive examples and explanations. This series of lectures covers parts (0) through (6), totaling over 24 hours.

#2. Practice is essential!

Since it is a computer programming subject, it emphasizes abundant practical training and provides actual working source code so that you can follow along step by step.

#3. Focus on the important parts!

During the lectures, source code that has already been explained is not explained again wherever possible, so that you can focus on the parts that changed or that need emphasis.


Who this course is recommended for 🙋‍♀️

Programmers who want to dramatically improve existing programs

Researchers who want to know how various applications are accelerated

College students who want to add new technologies to their portfolio before getting a job.

Anyone who wants to learn about the theory and practice of parallel processing such as AI, deep learning, and matrix calculations.

Preview lecture reviews 🏃

*The reviews below are for an external lecture given by the instructor on the same topic.

"I knew nothing about parallel algorithms or parallel computing.
After taking the course, I feel more confident in parallel computing."

"There were many algorithms that could not be solved with existing C++ programs.
Through this lecture, I was able to improve them to run in real time!"

"After attending the lecture, when I said in an interview that I had experience with parallel computing, the interviewers were very surprised.
I heard that it's not easy to find CUDA or parallel computing courses at the college level."


Roadmap to Mastering CUDA Programming 🛩️

  • To keep each topic focused, the CUDA programming course is split into a series of seven parts, (0) through (6), totaling over 24 hours of lectures.
  • A roadmap titled "CUDA Programming" is also available. Be sure to check it out.
  • Each part is divided into six or more sections, each covering a separate topic. (The current lecture, Part 0, consists of only two introductory sections.)
  • Slides used in the lecture are provided as PDF files, and the source code of the programs used is provided in the sections explaining the practical examples.

Part 0 (1-hour free lecture) Current lecture

  • Introduction to MPC and CUDA - An overall introduction to massively parallel computing (MPC) and CUDA.

Part 1 (3 hours 40 minutes)

  • CUDA Kernel Concepts - Learn the concepts of CUDA Kernels, the starting point of CUDA programming, and see parallel computing in action.

Part 2 (4 hours 15 minutes)

  • Vector addition - Various examples of operations between vectors, which are one-dimensional arrays, are presented, and AXPY routines are actually implemented in CUDA.
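
To give a feel for what an AXPY routine looks like in CUDA, here is a minimal sketch of single-precision AXPY (y = a·x + y). It is not taken from the course materials: the names `saxpy_kernel` and `saxpy`, the block size of 256, and the simplified error handling are all assumptions made for illustration.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Each thread updates one element: y[i] = a * x[i] + y[i].
__global__ void saxpy_kernel(int n, float a, const float* x, float* y) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {  // guard: the last block may have spare threads
        y[i] = a * x[i] + y[i];
    }
}

// Hypothetical host wrapper: copies x and y to the GPU, launches the
// kernel, and copies y back. Returns false if a CUDA call fails.
bool saxpy(float a, const std::vector<float>& x, std::vector<float>& y) {
    if (x.size() != y.size()) return false;
    const int n = static_cast<int>(x.size());
    if (n == 0) return true;
    const size_t bytes = n * sizeof(float);

    float* d_x = nullptr;
    float* d_y = nullptr;
    if (cudaMalloc((void**)&d_x, bytes) != cudaSuccess) return false;
    if (cudaMalloc((void**)&d_y, bytes) != cudaSuccess) {
        cudaFree(d_x);
        return false;
    }
    cudaMemcpy(d_x, x.data(), bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(d_y, y.data(), bytes, cudaMemcpyHostToDevice);

    const int threads = 256;                         // assumed block size
    const int blocks = (n + threads - 1) / threads;  // round up
    saxpy_kernel<<<blocks, threads>>>(n, a, d_x, d_y);

    cudaError_t err = cudaGetLastError();  // catches launch errors
    if (err == cudaSuccess) {
        // This blocking copy also waits for the kernel to finish.
        err = cudaMemcpy(y.data(), d_y, bytes, cudaMemcpyDeviceToHost);
    }
    cudaFree(d_x);
    cudaFree(d_y);
    return err == cudaSuccess;
}
```

Calling `saxpy(2.0f, x, y)` with two equally sized vectors overwrites `y` in place. Compiling the sketch requires `nvcc` and running it requires a CUDA-capable GPU.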

Part 3 (4 hours 5 minutes)

  • Memory Hierarchy - Learn about the memory structure at the heart of CUDA programming. Implement examples such as matrix addition and adjacent difference.

Part 4 (3 hours 45 minutes)

  • Matrix transpose & multiply - Presents various examples of operations on matrices, which are two-dimensional arrays, and implements the GEMM routine in CUDA.

Part 5 (3 hours 55 minutes)

  • Atomic Operation & Reduction - Learn CUDA control flow, from problem definition to solution, including atomic operations and reductions. You'll also implement GEMV routines in CUDA.
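
As a rough illustration of how a reduction and an atomic operation can work together, the sketch below sums an integer array: each block reduces its slice in shared memory, then one thread per block adds the block's partial sum to the global result with `atomicAdd`. This is one common pattern, not necessarily the approach taught in the course; `sum_kernel`, `gpu_sum`, and `kThreads` are hypothetical names.

```cuda
#include <cuda_runtime.h>
#include <vector>

constexpr int kThreads = 256;  // assumed block size; must be a power of two

__global__ void sum_kernel(const int* in, int n, int* out) {
    __shared__ int partial[kThreads];
    int tid = threadIdx.x;
    int i = blockIdx.x * blockDim.x + tid;

    // Load one element per thread, padding with 0 past the end.
    partial[tid] = (i < n) ? in[i] : 0;
    __syncthreads();

    // Tree reduction in shared memory: halve the active threads each step.
    for (int stride = blockDim.x / 2; stride > 0; stride /= 2) {
        if (tid < stride) {
            partial[tid] += partial[tid + stride];
        }
        __syncthreads();
    }

    // One atomic update per block combines the partial sums safely.
    if (tid == 0) {
        atomicAdd(out, partial[0]);
    }
}

// Hypothetical host wrapper. Returns false if a CUDA call fails.
bool gpu_sum(const std::vector<int>& values, int& result) {
    result = 0;
    const int n = static_cast<int>(values.size());
    if (n == 0) return true;

    int* d_in = nullptr;
    int* d_out = nullptr;
    if (cudaMalloc((void**)&d_in, n * sizeof(int)) != cudaSuccess) return false;
    if (cudaMalloc((void**)&d_out, sizeof(int)) != cudaSuccess) {
        cudaFree(d_in);
        return false;
    }
    cudaMemcpy(d_in, values.data(), n * sizeof(int), cudaMemcpyHostToDevice);
    cudaMemset(d_out, 0, sizeof(int));  // the accumulator starts at zero

    const int blocks = (n + kThreads - 1) / kThreads;
    sum_kernel<<<blocks, kThreads>>>(d_in, n, d_out);

    cudaError_t err = cudaGetLastError();
    if (err == cudaSuccess) {
        err = cudaMemcpy(&result, d_out, sizeof(int), cudaMemcpyDeviceToHost);
    }
    cudaFree(d_in);
    cudaFree(d_out);
    return err == cudaSuccess;
}
```

Using one `atomicAdd` per block rather than one per element keeps contention on the single output address low, which is the main reason the two techniques are usually combined.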

Part 6 (3 hours 45 minutes)

  • Search & Sort - Learn examples of how to effectively implement search-all problems, even-odd sort, bitonic sort, and counting merge sort using the CUDA architecture.
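
For orientation, even-odd sort (also known as odd-even transposition sort) can be sketched as follows. Each phase compares disjoint neighbouring pairs in parallel, and n phases are enough to sort n elements. This is a simple illustrative version under assumed names (`odd_even_phase`, `odd_even_sort`), launching one kernel per phase; the course's implementation may differ.

```cuda
#include <cuda_runtime.h>
#include <vector>

// One phase: thread k handles the pair (i, i + 1), where i is even in
// even phases and odd in odd phases. Pairs never overlap within a phase.
__global__ void odd_even_phase(int* data, int n, int phase) {
    int k = blockIdx.x * blockDim.x + threadIdx.x;
    int i = 2 * k + (phase & 1);
    if (i + 1 < n) {
        int a = data[i];
        int b = data[i + 1];
        if (a > b) {  // swap so the smaller value comes first
            data[i] = b;
            data[i + 1] = a;
        }
    }
}

// Hypothetical host wrapper: sorts `values` in ascending order on the GPU.
// Returns false if a CUDA call fails.
bool odd_even_sort(std::vector<int>& values) {
    const int n = static_cast<int>(values.size());
    if (n < 2) return true;
    const size_t bytes = n * sizeof(int);

    int* d_data = nullptr;
    if (cudaMalloc((void**)&d_data, bytes) != cudaSuccess) return false;
    cudaMemcpy(d_data, values.data(), bytes, cudaMemcpyHostToDevice);

    const int threads = 256;  // assumed block size
    const int pairs = n / 2;
    const int blocks = (pairs + threads - 1) / threads;

    // Kernels in the same stream run in launch order, so each phase
    // sees the result of the previous one.
    for (int phase = 0; phase < n; ++phase) {
        odd_even_phase<<<blocks, threads>>>(d_data, n, phase);
    }

    cudaError_t err = cudaGetLastError();
    if (err == cudaSuccess) {
        err = cudaMemcpy(values.data(), d_data, bytes, cudaMemcpyDeviceToHost);
    }
    cudaFree(d_data);
    return err == cudaSuccess;
}
```

The algorithm does O(n²) comparisons in total, so it is mainly a teaching example of how a sort can be restructured into independent parallel steps.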

Conquer CUDA programming and massively parallel computing!


Q&A 💬

Q. What are the reviews of the paid lectures?

The paid courses, (1) through (6), are being released one at a time, so their reviews are spread across separate course pages and are not yet publicly visible in one place. So far, the paid courses have received reviews such as the following:

  • It was very helpful that you explained in detail the process of maximizing performance by applying various techniques in one example.
  • It was much easier to understand because you explained the memory structure and logic visually.
  • I had been studying AI without a clear picture of the hardware, so it was good to add in-depth knowledge about the device.
  • The software installation was well explained and the source code was provided, making it easy to practice.

Q. Is this a course that non-majors can also take?

  • Some experience with C++ programming is required. At the very least, some C programming experience is expected. While all examples are written in a simple, straightforward manner, they are provided in C++/C code, and the functionality provided by functions like malloc and memcpy is not specifically explained.
  • If you have an understanding of computer architecture (registers, cache memory, etc.), operating systems (time sharing, etc.), and compilers (code generation, code optimization), you will be able to understand the lecture content more deeply.
  • This course was originally designed for advanced study by seniors in computer science at four-year universities.

Q. Is there anything I need to prepare before attending the lecture? Are there any notes regarding the course (necessary environment, other considerations, etc.)?

  • For the hands-on practice, you must first secure a hardware environment that supports NVIDIA CUDA: either a PC/laptop equipped with an NVIDIA GeForce graphics card, or a cloud environment.
  • Some cloud environments let you use NVIDIA GeForce graphics cards, but cloud settings change often and usually cost money, so please choose an environment where such a graphics card is actually available.

Q. What level of content is covered in the class?

  • As you progress from Part (0) through Part (1) and on to Part (6), deeper theory and greater understanding are required.
  • We strongly recommend that you take the courses in order, from Part (0) to Part (6).
  • The counting merge sort covered in the final part of Part (6) is a challenging topic, even for expert researchers. However, many students who followed along step by step found it easy to understand, building on their previous learning.

Q. Is there a reason for setting a course deadline?

  • The reason for setting a deadline for the course is that, given the nature of the computer field, the course content is likely to become outdated after that amount of time.
  • By then, I'll be back with a new lecture. 😄

Q. Are there subtitles in the video?

  • Yes, all videos have subtitles!
  • Videos added in future updates may initially lack subtitles until they are added, but for now, all videos have subtitles.

Who is this course right for?

  • Those who want to accelerate arrays/matrices/image processing/statistical processing/sorting, etc. with C++/C-based parallel computing/parallel processing

  • Those who want to accelerate their own programs with parallel computing/CUDA

  • Those who want to study NVIDIA CUDA programming/CUDA computing from the basics

  • Those who want to study the theory and practice of GPU parallel processing/parallel computing

What do you need to know before starting?

  • C++ or C programming experience

  • Knowledge of computer architecture, registers, caches, time sharing, etc. would be helpful.

Hello, this is onemoresipofcoffee.

9,108 learners ∙ 221 reviews ∙ 64 answers ∙ 4.9 rating ∙ 30 courses

One more cup of drip coffee for the road

Curriculum

15 lectures ∙ (1hr 8min)

Reviews

45 reviews ∙ Average rating 4.9

  • wayfarecru0581 (25 reviews, average rating 5.0)
    Rating: 5 ∙ 7% enrolled

    As someone else wrote in their course review... I'm really grateful that you made this course available in Korean.

    Reply from onemoresipofcoffee (Instructor): Hello. Thank you for the kind review. I will see you again with more content.

  • asdasv (2 reviews, average rating 5.0)
    Rating: 5 ∙ 33% enrolled

  • junoio7614 (1 review, average rating 5.0)
    Rating: 5 ∙ 33% enrolled

  • ninety25296 (112 reviews, average rating 5.0)
    Rating: 5 ∙ 100% enrolled

  • noojun105977 (6 reviews, average rating 3.8)
    Rating: 5 ∙ 33% enrolled

Free
