Reinforcement Learning for Programmers (Author's Direct Lecture)
The easiest and most detailed lecture on reinforcement learning, the core technology for business innovation!!! We will put reinforcement learning in your hands within 17 days, dedicating 2 hours a day (2 lectures). From now on, reinforcement learning will not be a difficult problem to understand, but a great tool for you.
471 learners
Level Basic
Course period Unlimited

Program Error Action Guide (December 10, 2022)
This notice, dated December 10, 2022, covers errors you may see when running the example programs.
The related packages have changed a lot since I posted the lecture.
There are three types of errors that can occur:
Error 1 is caused by a change in the protobuf package.
You can solve it by uninstalling the protobuf package and installing version 3.8.
Error 2 is a problem with the reset function provided by the gym package. Newer versions return a tuple whose first element is the state and whose second element is a dictionary of extra information, so the problem can be solved by adding the code state = state[0], which selects the first value.
Error 3 occurs because the step function provided by the gym package now returns one more value than before (five instead of four). You can solve it by adding one more variable, none2, to the receiving side of the assignment.
1. When running the example program, the following error occurs:
TypeError: Descriptors cannot be created directly.
If this call came from a _pb2.py file, your generated code is out of date and must be regenerated with protoc >= 3.19.0.
If you cannot immediately regenerate your protos, some other possible workarounds are:
1. Downgrade the protobuf package to 3.20.x or lower.
2. Set PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python (but this will use pure-Python parsing and will be much slower).
1. Solution for error 1 (protobuf version)
pip uninstall protobuf
pip install protobuf==3.8
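Before reinstalling, it can help to confirm which protobuf version is actually installed. The following is a minimal sketch, not part of the lecture code. It assumes Python 3.8 or later (for importlib.metadata) and relies on the error message above, which says that 3.20.x or lower avoids the problem. The helper names needs_downgrade and installed_protobuf_version are hypothetical.

```python
from importlib import metadata


def needs_downgrade(version: str) -> bool:
    """Return True if the version is newer than the 3.20.x series."""
    major, minor = (int(part) for part in version.split(".")[:2])
    return (major, minor) > (3, 20)


def installed_protobuf_version():
    """Return the installed protobuf version string, or None if it is absent."""
    try:
        return metadata.version("protobuf")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_protobuf_version()
    if version is None:
        print("protobuf is not installed")
    elif needs_downgrade(version):
        print(f"protobuf {version} is newer than 3.20.x; a downgrade is needed")
    else:
        print(f"protobuf {version} should not trigger this error")
```

If the script reports a version newer than 3.20.x, run the two pip commands above and check again.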
2. Solution for error 2 (reset return value)
state = env.reset()
state = state[0]  # Added line: keep only the state, the first returned value
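If the same program has to run with both older and newer gym versions, the indexing can be made conditional. This is a rough sketch under the assumption that newer versions return a (state, info) pair with info as a dictionary, while older versions return the state alone. The helper name reset_env is hypothetical and is not part of gym.

```python
def reset_env(env):
    """Reset env and return only the state, for either gym return style."""
    result = env.reset()
    # Newer gym: (state, info), where info is a dictionary.
    if isinstance(result, tuple) and len(result) == 2 and isinstance(result[1], dict):
        return result[0]
    # Older gym: the state itself.
    return result
```

With this helper, the two lines above become state = reset_env(env). The check would misfire for an environment whose state is itself a two-element tuple ending in a dictionary, which is not the case for the simple environments this sketch has in mind.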
3. Solution for error 3 (extra return value from step)
state_next, reward, done, none, none2 = self.env.step(action)  # Added none2 for the fifth return value
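In newer gym versions the five values returned by step are the next state, the reward, a terminated flag, a truncated flag, and an info dictionary, so the line above stores the truncated flag in none and never looks at it. The sketch below treats an episode as finished when either flag is set, and still accepts the older four-value form. The helper name step_env is hypothetical, and combining the two flags into one done value is an assumption about what the lecture code expects.

```python
def step_env(env, action):
    """Step env and return (state_next, reward, done, info) for either gym return style."""
    result = env.step(action)
    if len(result) == 5:
        # Newer gym: (state, reward, terminated, truncated, info).
        state_next, reward, terminated, truncated, info = result
        done = terminated or truncated
    else:
        # Older gym: (state, reward, done, info).
        state_next, reward, done, info = result
    return state_next, reward, done, info
```

Usage inside the agent class would look like state_next, reward, done, info = step_env(self.env, action).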




