I didn't take Diffusion 1 and 2. I work in ML and already know diffusion to some extent, but I took this course to save time compared to studying on my own.
Honestly, the lecture quality is quite disappointing for the price.
Overall issues:
There's a lot of stuttering, making it hard to concentrate. At 60,000 won per hour, this was quite disappointing.
Easy parts are explained in too much detail, while difficult and important parts are glossed over.
Specifically lacking areas:
CLIP/T5
The course description says "CLIP/T5 integration and token flow understanding," but the lecture just shows loading and calling the models, and that's it.
There's no explanation of how CLIP and T5 differ, why they're used together, or why the sequence length is set to 77.
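For reference, this is roughly the level I expected the lecture to at least reach. A minimal sketch using Hugging Face transformers (the checkpoint names are just small public stand-ins, not necessarily what the course uses; real pipelines typically pair CLIP with a much larger T5 variant):

```python
from transformers import CLIPTextModel, CLIPTokenizer, T5EncoderModel, T5Tokenizer

prompt = "a photo of a cat"

# CLIP's text encoder was trained with a fixed context of 77 tokens,
# which is why pipelines pad/truncate every prompt to length 77.
clip_tok = CLIPTokenizer.from_pretrained("openai/clip-vit-large-patch14")
clip_enc = CLIPTextModel.from_pretrained("openai/clip-vit-large-patch14")
clip_ids = clip_tok(prompt, padding="max_length", max_length=77,
                    truncation=True, return_tensors="pt").input_ids
clip_feats = clip_enc(clip_ids).last_hidden_state  # (1, 77, 768)

# T5 has no hard 77-token limit and produces purely textual features,
# which is the usual motivation for using it alongside CLIP.
t5_tok = T5Tokenizer.from_pretrained("google/t5-v1_1-base")
t5_enc = T5EncoderModel.from_pretrained("google/t5-v1_1-base")
t5_ids = t5_tok(prompt, return_tensors="pt").input_ids
t5_feats = t5_enc(t5_ids).last_hidden_state        # (1, seq_len, 768)
```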
RoPE
There's almost no explanation of RoPE itself.
Some attention blocks apply RoPE and others don't, but this difference is never explained. Caching also appears in the code, yet there's no explanation of when or why it's done.
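For context, the caching I'm referring to is presumably the standard pattern of precomputing the sin/cos tables once per sequence length and reusing them across forward passes. A minimal sketch of what I mean (my own code, not the course's):

```python
import torch

class RoPECache:
    """Precompute RoPE sin/cos tables once per sequence length and reuse them."""
    def __init__(self, head_dim: int, base: float = 10000.0):
        self.head_dim = head_dim
        self.base = base
        self._cache = {}  # seq_len -> (cos, sin)

    def get(self, seq_len: int):
        # The point of caching: the tables depend only on seq_len and head_dim,
        # so recomputing them on every forward pass is wasted work.
        if seq_len not in self._cache:
            inv_freq = 1.0 / (self.base ** (torch.arange(0, self.head_dim, 2).float() / self.head_dim))
            t = torch.arange(seq_len).float()
            freqs = torch.outer(t, inv_freq)         # (seq_len, head_dim/2)
            emb = torch.cat([freqs, freqs], dim=-1)  # (seq_len, head_dim)
            self._cache[seq_len] = (emb.cos(), emb.sin())
        return self._cache[seq_len]

def apply_rope(x, cos, sin):
    # x: (batch, heads, seq_len, head_dim); rotates channel pairs by position-dependent angles
    x1, x2 = x.chunk(2, dim=-1)
    return x * cos + torch.cat([-x2, x1], dim=-1) * sin
```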
AdaLN
Self-attention (SA) and cross-attention (CA), which were already covered, are explained in detail again, but important new concepts like AdaLN-Single are only described as "same as before, using zero initialization in the cross-attention projection."
I don't understand what this means or why it's done.
When I looked it up myself, zero initialization refers to AdaLN-Zero, which seems to be a different concept from AdaLN-Single... but the lecture made no such distinction and gave no explanation at all.
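For anyone else stuck here, my current understanding from the DiT and PixArt-α papers (not from the course material): AdaLN-Zero zero-initializes the modulation layer so each block starts as the identity map, which stabilizes training; AdaLN-Single instead computes the modulation once from the timestep and shares it across all blocks, keeping only small per-block learnable offsets to save parameters. A simplified, hypothetical sketch of the zero-init part (attention path only):

```python
import torch
import torch.nn as nn

class AdaLNZeroBlock(nn.Module):
    """AdaLN-Zero (DiT-style), simplified to the attention path: the modulation
    layer is zero-initialized so shift/scale/gate all start at 0 and the block
    begins training as the identity map."""
    def __init__(self, dim: int, cond_dim: int, num_heads: int = 8):
        super().__init__()
        self.norm = nn.LayerNorm(dim, elementwise_affine=False)
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.adaln = nn.Linear(cond_dim, 3 * dim)  # -> shift, scale, gate
        nn.init.zeros_(self.adaln.weight)  # the "zero initialization":
        nn.init.zeros_(self.adaln.bias)    # gate = 0, so the residual branch adds nothing

    def forward(self, x, cond):
        # x: (batch, seq, dim); cond: (batch, cond_dim), e.g. a timestep embedding
        shift, scale, gate = self.adaln(cond).chunk(3, dim=-1)
        h = self.norm(x) * (1 + scale.unsqueeze(1)) + shift.unsqueeze(1)
        h, _ = self.attn(h, h, h, need_weights=False)
        return x + gate.unsqueeze(1) * h  # equals x at init: identity block

# AdaLN-Single (PixArt-alpha) would instead run one shared modulation MLP on the
# timestep embedding for all blocks, with only a small learnable per-block offset,
# rather than a separate modulation layer in every block.
```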
Linear Attention (SANA)
The preliminary explanation is okay, but when walking through the code, you never explain how it differs from vanilla attention; you only point out the parts that are the same as before (the qkv projections) and move on.
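What I wanted spelled out is the actual difference: vanilla attention materializes an N×N softmax matrix, which is O(N²) in the token count, while SANA-style linear attention replaces softmax with a ReLU kernel and reassociates the matrix products so the N×N matrix is never formed, giving O(N). A rough sketch (my own code, not the course's):

```python
import torch
import torch.nn.functional as F

def vanilla_attention(q, k, v):
    # q, k, v: (batch, heads, N, dim); builds an explicit N x N matrix -> O(N^2)
    scores = (q @ k.transpose(-2, -1)) / q.shape[-1] ** 0.5
    return F.softmax(scores, dim=-1) @ v

def linear_attention(q, k, v, eps=1e-6):
    # SANA-style ReLU linear attention: apply the kernel to q and k separately,
    # then reassociate q @ (k^T v) so the N x N matrix is never formed -> O(N)
    q, k = F.relu(q), F.relu(k)
    kv = k.transpose(-2, -1) @ v                                     # (batch, heads, dim, dim)
    denom = q @ k.sum(dim=-2, keepdim=True).transpose(-2, -1) + eps  # (batch, heads, N, 1)
    return (q @ kv) / denom
```

The reassociation, computing q @ (kᵀv) instead of (qkᵀ) @ v, is the whole point, and it's exactly the part the code walkthrough never mentioned.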
Errors:
When explaining the SANA scheduler, I think you said "0.5 to x" when it should have been "0.5 to t." It's a small mistake, but it's disappointing that a 60,000 won per hour lecture wasn't even reviewed before release.
Conclusion:
I can pick up a few keywords from the course and then study on my own by reading the papers and code, but I doubt that's worth 60,000 won per hour. My satisfaction was lower than with free YouTube lectures, which is very disappointing... Even the responses to course reviews appear to be automated with an LLM...