Using Simulation to Enable Real World Robotics
Abstract
Real-world robotics spans several fields, including simulation, semantic and scene understanding, reinforcement learning, and domain randomization. Ideally, a simulator would capture the real world accurately while running much faster than real time, giving predictive power over how a robot will interact with its environment. Unfortunately, no current simulator offers both the speed and the accuracy to fully support this. Instead, simulators such as Gazebo, Webots, and OpenRAVE are supplemented with machine-learned models of the environment to solve specific tasks such as scene understanding and path planning; this remains far cheaper in money and time than experimenting on physical hardware alone. Advances in virtual reality offer new ways for humans to provide training data to robotic systems in simulation, and modern datasets such as SUNCG and Matterport3D give us more ability than ever to train robots in virtual environments. By understanding how simulation is applied today, better robotic platforms can be designed to address some of the most pressing challenges of modern robotics.
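The core promise of any of these simulators is cheap predictive rollout: load a world, step the physics faster than real time, and read back the resulting state. A minimal sketch of that loop, using PyBullet as a freely available stand-in engine (not one of the simulators surveyed below; the URDF assets ship with the pybullet_data package):

```python
# Minimal simulate-then-predict loop. PyBullet stands in for the simulators
# discussed below; plane.urdf and r2d2.urdf are bundled with pybullet_data.
import pybullet as p
import pybullet_data

p.connect(p.DIRECT)                      # headless mode: no GUI, faster than real time
p.setAdditionalSearchPath(pybullet_data.getDataPath())
p.setGravity(0, 0, -9.81)
p.loadURDF("plane.urdf")
robot = p.loadURDF("r2d2.urdf", basePosition=[0, 0, 0.5])

# Roll the world forward and read back state -- the "predictive power"
# a simulator offers before any physical hardware exists.
for _ in range(240):                     # 240 steps = 1 simulated second at the default dt
    p.stepSimulation()
pos, orn = p.getBasePositionAndOrientation(robot)
print("predicted pose after 1 s:", pos, orn)
p.disconnect()
```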
Organization
First I discuss the background and motivation for simulation in robotics, starting from a foundational 1983 paper on simulating the motion elements of robot arms. I then weigh virtual simulation against the drawbacks of purely physical experimentation, and give an in-depth analysis of different simulation architectures and the applications built on top of their backends. Next I survey the simulation datasets in use today and how they supplement current research, followed by the human-in-the-loop training data that these simulators make easy to generate through virtual reality applications. I then cover how these sophisticated datasets are used to train scene-understanding network architectures and to enable sim-to-real transfer. Finally, I discuss the impact simulation has had on deep learning and reinforcement learning applications.
Papers
Motivation
- Derby, S. (1983). Simulating Motion Elements of General-Purpose Robot Arms. The International Journal of Robotics Research, 2(1), 3–12. Retrieved from https://journals.sagepub.com/doi/pdf/10.1177/027836498300200101
- Gu, S., Holly, E., Lillicrap, T., & Levine, S. (2017). Deep reinforcement learning for robotic manipulation with asynchronous off-policy updates. In Proceedings - IEEE International Conference on Robotics and Automation (pp. 3389–3396). http://doi.org/10.1109/ICRA.2017.7989385
Simulators
- Miller, A. T., & Allen, P. K. (2004). GraspIt!: A versatile simulator for robotic grasping. IEEE Robotics and Automation Magazine, 11(4), 110–122. http://doi.org/10.1109/MRA.2004.1371616
- Michel, O. (2004). Webots: professional mobile robot simulation. International Journal of Advanced Robotic Systems, 1(1), 39–42.
- Diankov, R., & Kuffner, J. (2008). OpenRAVE: A Planning Architecture for Autonomous Robotics. Tech. Rep. CMU-RI-TR-08-34, Robotics Institute, Carnegie Mellon University.
- Todorov, E., Erez, T., & Tassa, Y. (2012). MuJoCo: A physics engine for model-based control. IEEE International Conference on Intelligent Robots and Systems, 5026–5033. http://doi.org/10.1109/IROS.2012.6386109
- Koenig, N., & Howard, A. (2004). Design and use paradigms for Gazebo, an open-source multi-robot simulator. 2004 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 3, 2149–2154. http://doi.org/10.1109/IROS.2004.1389727
- Xia, F., Zamir, A. R., He, Z., Sax, A., Malik, J., & Savarese, S. (2018). Gibson Env: Real-World Perception for Embodied Agents. IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
- Savva, M., Chang, A. X., Dosovitskiy, A., Funkhouser, T., & Koltun, V. (2017). MINOS: Multimodal Indoor Simulator for Navigation in Complex Environments. arXiv:1712.03931.
- Shah, S., Dey, D., Lovett, C., & Kapoor, A. (2017). AirSim: High-Fidelity Visual and Physical Simulation for Autonomous Vehicles. Field and Service Robotics. http://doi.org/10.1007/978-3-319-67361-5_40
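Despite their different backends, these simulators are typically driven the same way: an external control loop publishes commands and reads back sensor state. As a hedged sketch, this is how a Gazebo-simulated mobile robot is commonly commanded through ROS; the /cmd_vel topic is the usual convention for differential-drive robots, but whether a given world exposes it is an assumption:

```python
# Sketch of commanding a robot simulated in Gazebo via ROS topics.
# Assumes a robot already spawned in Gazebo that subscribes to /cmd_vel.
import rospy
from geometry_msgs.msg import Twist

rospy.init_node("sim_driver")
pub = rospy.Publisher("/cmd_vel", Twist, queue_size=1)
rate = rospy.Rate(10)                    # 10 Hz command loop

cmd = Twist()
cmd.linear.x = 0.2                       # drive forward at 0.2 m/s
while not rospy.is_shutdown():
    pub.publish(cmd)                     # Gazebo integrates the motion; sensors
    rate.sleep()                         # come back on their own topics
```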
Simulation Datasets
- Chang, A., Dai, A., Funkhouser, T., Halber, M., Nießner, M., Savva, M., Song, S., Zeng, A., & Zhang, Y. (2017). Matterport3D: Learning from RGB-D Data in Indoor Environments. International Conference on 3D Vision (3DV).
- Song, S., Yu, F., Zeng, A., Chang, A. X., Savva, M., & Funkhouser, T. (2017). Semantic Scene Completion from a Single Depth Image. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 1746–1754. http://doi.org/10.1109/CVPR.2017.28
- Calli, B., Walsman, A., Singh, A., Srinivasa, S., Abbeel, P., & Dollar, A. M. (2015). Benchmarking in Manipulation Research: The YCB Object and Model Set and Benchmarking Protocols. arXiv:1502.03143.
- Chang, A. X., Funkhouser, T., Guibas, L., Hanrahan, P., Huang, Q., Li, Z., … Yu, F. (2015). ShapeNet: An Information-Rich 3D Model Repository. arXiv:1512.03012.
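These datasets are, at bottom, large collections of annotated 3D assets, and most pipelines begin by sweeping them into a mesh library. A sketch of such a loader using the trimesh package; the directory layout under datasets/shapenet is a hypothetical stand-in for the real dataset's schema:

```python
# Hypothetical loader for a ShapeNet-style mesh collection; the on-disk
# layout is an assumption for illustration, not the dataset's actual schema.
from pathlib import Path
import trimesh

def iter_meshes(root: str):
    """Yield (model_id, mesh) pairs for every OBJ file under root."""
    for obj_path in Path(root).rglob("*.obj"):
        mesh = trimesh.load(obj_path, force="mesh")  # collapse scenes into one mesh
        yield obj_path.stem, mesh

for model_id, mesh in iter_meshes("datasets/shapenet"):
    print(model_id, len(mesh.vertices), "vertices, watertight:", mesh.is_watertight)
```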
Virtual Reality
- Mandlekar, A., Zhu, Y., Garg, A., Booher, J., Spero, M., Tung, A., … Fei-Fei, L. (2018). ROBOTURK: A Crowdsourcing Platform for Robotic Skill Learning through Imitation. Conference on Robot Learning (CoRL). Retrieved from http://vision.stanford.edu/pdf/mandlekar2018corl.pdf
- Whitney, D., Rosen, E., Phillips, E., Konidaris, G., & Tellex, S. (2017). Comparing Robot Grasping Teleoperation across Desktop and Virtual Reality with ROS Reality. International Symposium on Robotics Research (ISRR).
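The common thread in these systems is that teleoperation reduces to logging (state, action) pairs at the device's polling rate, which later become supervision for imitation learning. A sketch of that logging pattern; env and read_vr_pose() are hypothetical placeholders for a simulator binding and a VR controller driver:

```python
# Demonstration logging behind crowdsourced teleoperation platforms such as
# RoboTURK: every device poll becomes a (state, action) pair for imitation
# learning. env and read_vr_pose() are hypothetical placeholders.
import time

def collect_demo(env, read_vr_pose, seconds=10.0, hz=20.0):
    trajectory = []
    state = env.reset()
    t_end = time.time() + seconds
    while time.time() < t_end:
        action = read_vr_pose()                  # controller pose -> end-effector target
        next_state, reward, done, _ = env.step(action)
        trajectory.append({"state": state, "action": action})
        state = next_state
        if done:
            break
        time.sleep(1.0 / hz)                     # poll the device at a fixed rate
    return trajectory                            # cheap to serialize and replay later
```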
Vision Based Methods
- Brook, P., Ciocarlie, M., & Hsiao, K. (2011). Collaborative grasp planning with multiple object representations. Proceedings - IEEE International Conference on Robotics and Automation, 2851–2858. http://doi.org/10.1109/ICRA.2011.5980490
- Li, Y., Yue, Y., Xu, D., Grinspun, E., & Allen, P. K. (2015). Folding Deformable Objects using Predictive Simulation and Trajectory Optimization. IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 6000–6006.
- Shao, L., Tian, Y., & Bohg, J. (2018). ClusterNet: 3D Instance Segmentation in RGB-D Images. arXiv:1807.08894.
- Varley, J., Watkins-Valls, D., & Allen, P. (2018). Multi-Modal Geometric Learning for Grasping and Manipulation. arXiv:1803.07671.
- Wang, T.-C., Liu, M.-Y., Zhu, J.-Y., Tao, A., Kautz, J., & Catanzaro, B. (2018). High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs. IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
- Wang, T.-C., Liu, M.-Y., Zhu, J.-Y., Liu, G., Tao, A., Kautz, J., & Catanzaro, B. (2018). Video-to-Video Synthesis. arXiv:1808.06601.
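Several of these papers share a multi-modal pattern: encode RGB and depth separately, fuse the features, and predict per-pixel labels. A minimal PyTorch sketch of that pattern; the layer sizes and 13-class output are illustrative assumptions, not any cited paper's architecture:

```python
# Toy RGB-D fusion network: separate encoders per modality, channel-wise
# feature concatenation, then a 1x1 conv head for per-pixel class logits.
import torch
import torch.nn as nn

class RGBDSegNet(nn.Module):
    def __init__(self, num_classes: int = 13):
        super().__init__()
        self.rgb_enc   = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU())
        self.depth_enc = nn.Sequential(nn.Conv2d(1, 16, 3, padding=1), nn.ReLU())
        self.head      = nn.Conv2d(32, num_classes, 1)   # fuse by concatenation

    def forward(self, rgb, depth):
        feats = torch.cat([self.rgb_enc(rgb), self.depth_enc(depth)], dim=1)
        return self.head(feats)                          # per-pixel class logits

logits = RGBDSegNet()(torch.rand(1, 3, 64, 64), torch.rand(1, 1, 64, 64))
print(logits.shape)  # torch.Size([1, 13, 64, 64])
```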
Sim-to-real
- Warnell, G., Waytowich, N., Lawhern, V., & Stone, P. (2017). Deep TAMER: Interactive Agent Shaping in High-Dimensional State Spaces. Retrieved from http://arxiv.org/abs/1709.10163
- Tan, J., Zhang, T., Coumans, E., Iscen, A., Bai, Y., Hafner, D., … Vanhoucke, V. (2018). Sim-to-Real: Learning Agile Locomotion For Quadruped Robots. arXiv:1804.10332.
- Sadeghi, F., & Levine, S. (2016). CAD2RL: Real Single-Image Flight without a Single Real Image. http://doi.org/10.15607/RSS.2017.XIII.034
- Lee, R., Mou, S., Dasagi, V., Bruce, J., Leitner, J., & Sünderhauf, N. (2018). Zero-shot Sim-to-Real Transfer with Modular Priors. Retrieved from http://arxiv.org/abs/1809.07480
- Lee, M. A., Zhu, Y., Srinivasan, K., Shah, P., Savarese, S., Fei-Fei, L., … Bohg, J. (2018). Making Sense of Vision and Touch: Self-Supervised Learning of Multimodal Representations for Contact-Rich Tasks. arXiv:1810.10191.
- Tobin, J., Biewald, L., Duan, R., Andrychowicz, M., Handa, A., Kumar, V., … Abbeel, P. (2017). Domain Randomization and Generative Models for Robotic Grasping.
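The domain-randomization recipe in Tobin et al. can be stated compactly: re-sample the simulator's physical (and visual) parameters every episode so a policy cannot overfit to one wrong set of constants. A hedged sketch using PyBullet's changeDynamics call; robot_id, link_indices, and train_one_episode() are hypothetical placeholders:

```python
# Per-episode physics randomization. The PyBullet calls are real and assume
# an already-connected session; the training loop itself is hypothetical.
import random
import pybullet as p

def randomize_dynamics(robot_id, link_indices):
    for link in link_indices:
        p.changeDynamics(
            robot_id, link,
            mass=random.uniform(0.5, 2.0),            # perturb around nominal values
            lateralFriction=random.uniform(0.3, 1.2),
        )

# for episode in range(num_episodes):
#     randomize_dynamics(robot_id, link_indices)     # new physics every episode
#     train_one_episode(policy)
```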
Machine Learning
- OpenAI, Andrychowicz, M., Baker, B., Chociej, M., Jozefowicz, R., … Zaremba, W. (2018). Learning Dexterous In-Hand Manipulation. arXiv:1808.00177.
- Ha, D., & Schmidhuber, J. (2018). World Models. http://doi.org/10.5281/zenodo.1207631
- Li, T., Rai, A., Geyer, H., & Atkeson, C. G. (2018). Using Deep Reinforcement Learning to Learn High-Level Policies on the ATRIAS Biped. Retrieved from http://arxiv.org/abs/1809.10811
- Faust, A., Ramirez, O., Fiser, M., Oslund, K., Francis, A., Davidson, J., & Tapia, L. (2018). PRM-RL: Long-range Robotic Navigation Tasks by Combining Reinforcement Learning and Sampling-based Planning. IEEE International Conference on Robotics and Automation (ICRA). http://doi.org/10.1109/ICRA.2018.8461096
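At their core, these systems scale up the same recipe: collect simulated rollouts, weight actions by their returns, and repeat. The simplest instance is REINFORCE; the sketch below assumes the classic OpenAI Gym API (pre-0.26 reset/step signatures) and omits the discounting and baselines that real systems need:

```python
# Minimal REINFORCE on CartPole: sample an episode from the simulator,
# then increase the log-probability of actions in proportion to return.
import gym
import torch
import torch.nn as nn

env = gym.make("CartPole-v1")
policy = nn.Sequential(nn.Linear(4, 64), nn.Tanh(), nn.Linear(64, 2))
opt = torch.optim.Adam(policy.parameters(), lr=1e-2)

for episode in range(200):
    obs, done = env.reset(), False
    log_probs, rewards = [], []
    while not done:
        logits = policy(torch.as_tensor(obs, dtype=torch.float32))
        dist = torch.distributions.Categorical(logits=logits)
        action = dist.sample()
        obs, reward, done, _ = env.step(action.item())
        log_probs.append(dist.log_prob(action))
        rewards.append(reward)
    # Undiscounted return-to-go as the (high-variance) policy-gradient weight.
    returns = torch.cumsum(torch.tensor(rewards[::-1]), dim=0).flip(0)
    loss = -(torch.stack(log_probs) * returns).sum()
    opt.zero_grad()
    loss.backward()
    opt.step()
```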
In Progress
- Degrave, J., Hermans, M., Dambre, J., & wyffels, F. (2019). A Differentiable Physics Engine for Deep Learning in Robotics. Frontiers in Neurorobotics, 13(6), 1–9. http://doi.org/10.3389/fnbot.2019.00006
- Hummel, J., Wolff, R., Stein, T., Gerndt, A., & Kuhlen, T. (2012). An Evaluation of Open Source Physics Engines for Use in Virtual Reality Assembly Simulations. In Advances in Visual Computing (ISVC 2012), 346–357. http://doi.org/10.1007/978-3-642-33191-6_34
- Savva, M., Kadian, A., Maksymets, O., Zhao, Y., Wijmans, E., Jain, B., … Batra, D. (2019). Habitat: A Platform for Embodied AI Research. Retrieved from http://arxiv.org/abs/1904.01201
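The differentiable-physics idea in Degrave et al. is worth a toy illustration: when the simulator's step function is written in an autodiff framework, a task loss at the end of a rollout yields gradients with respect to controls or parameters. The point-mass Euler dynamics below are an assumption for illustration, not the paper's engine:

```python
# Gradients through a simulated rollout: differentiate the final-position
# error of a ballistic point mass with respect to its launch velocity.
import torch

dt, steps = 0.01, 100
v0 = torch.tensor([1.0, 1.0], requires_grad=True)   # learnable initial velocity
target = torch.tensor([0.8, 0.0])

pos, vel = torch.zeros(2), v0
for _ in range(steps):                               # explicit Euler integration
    acc = torch.tensor([0.0, -9.81])                 # gravity only
    vel = vel + dt * acc
    pos = pos + dt * vel

loss = ((pos - target) ** 2).sum()
loss.backward()                                      # d(loss)/d(v0) via autodiff
print("gradient of final-position error w.r.t. launch velocity:", v0.grad)
```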