Question 1
In Reinforcement Learning, what does the term 'policy gradient' refer to?
Question 2
Which of the following algorithms is primarily used for policy optimization in Reinforcement Learning?
Question 3
What is the primary challenge of applying Reinforcement Learning in real-world robotics?
Question 4
In the context of Reinforcement Learning, what is the main advantage of using 'experience replay'?
Question 5
What is the primary purpose of 'domain randomization' in simulation-to-real transfer?