When training in an OpenAI Gym classic-like environment, the model yields good results and completes the task. Validating on unseen data, however, yields considerably lower results. Tried:
None of the above made a difference; I still get the same results. Any idea what the cause could be, or what else I can try?