Using Simulation to Enable Real World Robotics
Abstract
Real-world robotics spans several fields, including simulation, semantic and scene understanding, reinforcement learning, and domain randomization. Ideally, a simulator would capture the real world accurately while running much faster than real time, giving predictive power over how a robot will interact with its environment. Unfortunately, no current simulator offers both the speed and the accuracy to fully support this. Instead, simulators such as Gazebo, Webots, and OpenRAVE are supplemented with machine-learned models of the environment to solve specific tasks such as scene understanding and path planning; this remains far cheaper in money and time than experimenting on physical hardware alone. Advances in virtual reality offer new ways for humans to provide training data to robotic systems in simulation, and modern datasets such as SUNCG and Matterport3D give us more ability than ever to train robots in virtual environments. By understanding how simulation is applied today, better robotic platforms can be designed to address some of the most pressing challenges of modern robotics.
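The core promise of any of these simulators is cheap predictive rollout: load a world, step the physics faster than real time, and read back the resulting state. A minimal sketch of that loop, using PyBullet as a freely available stand-in engine (not one of the simulators surveyed below; the URDF assets ship with the pybullet_data package):

```python
# Minimal simulate-then-predict loop. PyBullet stands in for the simulators
# discussed below; plane.urdf and r2d2.urdf are bundled with pybullet_data.
import pybullet as p
import pybullet_data

p.connect(p.DIRECT)                      # headless mode: no GUI, faster than real time
p.setAdditionalSearchPath(pybullet_data.getDataPath())
p.setGravity(0, 0, -9.81)
p.loadURDF("plane.urdf")
robot = p.loadURDF("r2d2.urdf", basePosition=[0, 0, 0.5])

# Roll the world forward and read back state -- the "predictive power"
# a simulator offers before any physical hardware exists.
for _ in range(240):                     # 240 steps = 1 simulated second at the default dt
    p.stepSimulation()
pos, orn = p.getBasePositionAndOrientation(robot)
print("predicted pose after 1 s:", pos, orn)
p.disconnect()
```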
Organization
First I discuss the background and motivation for simulation in robotics, starting from a foundational 1983 paper on simulating the motion elements of robot arms. I then weigh virtual simulation against the drawbacks of purely physical experimentation, and give an in-depth analysis of different simulation architectures and the applications built on top of their backends. Next I survey the simulation datasets in use today and how they supplement current research, followed by the human-in-the-loop training data that these simulators make easy to generate through virtual reality applications. I then cover how these sophisticated datasets are used to train scene-understanding network architectures and to enable sim-to-real transfer. Finally, I discuss the impact simulation has had on deep learning and reinforcement learning applications.
Papers
Motivation
- Derby, S. (1983). Simulating Motion Elements of General-Purpose Robot Arms. The International Journal of Robotics Research, 2(1), 3–12. Retrieved from https://journals.sagepub.com/doi/pdf/10.1177/027836498300200101
- Gu, S., Holly, E., Lillicrap, T., & Levine, S. (2017). Deep reinforcement learning for robotic manipulation with asynchronous off-policy updates. In Proceedings - IEEE International Conference on Robotics and Automation (pp. 3389–3396). http://doi.org/10.1109/ICRA.2017.7989385
Simulators
- Miller, A. T., & Allen, P. K. (2004). GraspIt!: A versatile simulator for robotic grasping. IEEE Robotics and Automation Magazine, 11(4), 110–122. http://doi.org/10.1109/MRA.2004.1371616
- Michel, O. (2004). Webots: professional mobile robot simulation. International Journal of Advanced Robotic Systems, 1(1), 39–42.
- Diankov, R., & Kuffner, J. (2008). OpenRAVE: A Planning Architecture for Autonomous Robotics. Tech. Rep. CMU-RI-TR-08-34, Robotics Institute, Carnegie Mellon University.
- Todorov, E., Erez, T., & Tassa, Y. (2012). MuJoCo: A physics engine for model-based control. IEEE International Conference on Intelligent Robots and Systems, 5026–5033. http://doi.org/10.1109/IROS.2012.6386109
- Koenig, N., & Howard, A. (2004). Design and use paradigms for Gazebo, an open-source multi-robot simulator. 2004 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 3, 2149–2154. http://doi.org/10.1109/IROS.2004.1389727
- Xia, F., Zamir, A. R., He, Z., Sax, A., Malik, J., & Savarese, S. (2018). Gibson Env: Real-World Perception for Embodied Agents. IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
- Savva, M., Chang, A. X., Dosovitskiy, A., Funkhouser, T., & Koltun, V. (2017). MINOS: Multimodal Indoor Simulator for Navigation in Complex Environments. arXiv:1712.03931.
- Shah, S., Dey, D., Lovett, C., & Kapoor, A. (2017). AirSim: High-Fidelity Visual and Physical Simulation for Autonomous Vehicles. Field and Service Robotics. http://doi.org/10.1007/978-3-319-67361-5_40
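Despite their different backends, these simulators are typically driven the same way: an external control loop publishes commands and reads back sensor state. As a hedged sketch, this is how a Gazebo-simulated mobile robot is commonly commanded through ROS; the /cmd_vel topic is the usual convention for differential-drive robots, but whether a given world exposes it is an assumption:

```python
# Sketch of commanding a robot simulated in Gazebo via ROS topics.
# Assumes a robot already spawned in Gazebo that subscribes to /cmd_vel.
import rospy
from geometry_msgs.msg import Twist

rospy.init_node("sim_driver")
pub = rospy.Publisher("/cmd_vel", Twist, queue_size=1)
rate = rospy.Rate(10)                    # 10 Hz command loop

cmd = Twist()
cmd.linear.x = 0.2                       # drive forward at 0.2 m/s
while not rospy.is_shutdown():
    pub.publish(cmd)                     # Gazebo integrates the motion; sensors
    rate.sleep()                         # come back on their own topics
```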
Simulation Datasets
- Chang, A., Dai, A., Funkhouser, T., Halber, M., Nießner, M., Savva, M., Song, S., Zeng, A., & Zhang, Y. (2017). Matterport3D: Learning from RGB-D Data in Indoor Environments. International Conference on 3D Vision (3DV).
- Song, S., Yu, F., Zeng, A., Chang, A. X., Savva, M., & Funkhouser, T. (2017). Semantic Scene Completion from a Single Depth Image. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 1746–1754. http://doi.org/10.1109/CVPR.2017.28
- Calli, B., Walsman, A., Singh, A., Srinivasa, S., Abbeel, P., & Dollar, A. M. (2015). Benchmarking in Manipulation Research: The YCB Object and Model Set and Benchmarking Protocols. arXiv:1502.03143.
- Chang, A. X., Funkhouser, T., Guibas, L., Hanrahan, P., Huang, Q., Li, Z., … Yu, F. (2015). ShapeNet: An Information-Rich 3D Model Repository. arXiv:1512.03012.
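These datasets are, at bottom, large collections of annotated 3D assets, and most pipelines begin by sweeping them into a mesh library. A sketch of such a loader using the trimesh package; the directory layout under datasets/shapenet is a hypothetical stand-in for the real dataset's schema:

```python
# Hypothetical loader for a ShapeNet-style mesh collection; the on-disk
# layout is an assumption for illustration, not the dataset's actual schema.
from pathlib import Path
import trimesh

def iter_meshes(root: str):
    """Yield (model_id, mesh) pairs for every OBJ file under root."""
    for obj_path in Path(root).rglob("*.obj"):
        mesh = trimesh.load(obj_path, force="mesh")  # collapse scenes into one mesh
        yield obj_path.stem, mesh

for model_id, mesh in iter_meshes("datasets/shapenet"):
    print(model_id, len(mesh.vertices), "vertices, watertight:", mesh.is_watertight)
```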
Virtual Reality
- Mandlekar, A., Zhu, Y., Garg, A., Booher, J., Spero, M., Tung, A., … Fei-Fei, L. (2018). ROBOTURK: A Crowdsourcing Platform for Robotic Skill Learning through Imitation. Conference on Robot Learning (CoRL). Retrieved from http://vision.stanford.edu/pdf/mandlekar2018corl.pdf
- Whitney, D., Rosen, E., Phillips, E., Konidaris, G., & Tellex, S. (2017). Comparing Robot Grasping Teleoperation across Desktop and Virtual Reality with ROS Reality. International Symposium on Robotics Research (ISRR).
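The common thread in these systems is that teleoperation reduces to logging (state, action) pairs at the device's polling rate, which later become supervision for imitation learning. A sketch of that logging pattern; env and read_vr_pose() are hypothetical placeholders for a simulator binding and a VR controller driver:

```python
# Demonstration logging behind crowdsourced teleoperation platforms such as
# RoboTURK: every device poll becomes a (state, action) pair for imitation
# learning. env and read_vr_pose() are hypothetical placeholders.
import time

def collect_demo(env, read_vr_pose, seconds=10.0, hz=20.0):
    trajectory = []
    state = env.reset()
    t_end = time.time() + seconds
    while time.time() < t_end:
        action = read_vr_pose()                  # controller pose -> end-effector target
        next_state, reward, done, _ = env.step(action)
        trajectory.append({"state": state, "action": action})
        state = next_state
        if done:
            break
        time.sleep(1.0 / hz)                     # poll the device at a fixed rate
    return trajectory                            # cheap to serialize and replay later
```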
Vision Based Methods
- Brook, P., Ciocarlie, M., & Hsiao, K. (2011). Collaborative grasp planning with multiple object representations. Proceedings - IEEE International Conference on Robotics and Automation, 2851–2858. http://doi.org/10.1109/ICRA.2011.5980490
- Li, Y., Yue, Y., Xu, D., Grinspun, E., & Allen, P. K. (2015). Folding Deformable Objects using Predictive Simulation and Trajectory Optimization. IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 6000–6006.
- Shao, L., Tian, Y., & Bohg, J. (2018). ClusterNet: 3D Instance Segmentation in RGB-D Images. arXiv:1807.08894.
- Varley, J., Watkins-Valls, D., & Allen, P. (2018). Multi-Modal Geometric Learning for Grasping and Manipulation. arXiv:1803.07671.
- Wang, T.-C., Liu, M.-Y., Zhu, J.-Y., Tao, A., Kautz, J., & Catanzaro, B. (2018). High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs. IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
- Wang, T.-C., Liu, M.-Y., Zhu, J.-Y., Liu, G., Tao, A., Kautz, J., & Catanzaro, B. (2018). Video-to-Video Synthesis. arXiv:1808.06601.
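Several of these papers share a multi-modal pattern: encode RGB and depth separately, fuse the features, and predict per-pixel labels. A minimal PyTorch sketch of that pattern; the layer sizes and 13-class output are illustrative assumptions, not any cited paper's architecture:

```python
# Toy RGB-D fusion network: separate encoders per modality, channel-wise
# feature concatenation, then a 1x1 conv head for per-pixel class logits.
import torch
import torch.nn as nn

class RGBDSegNet(nn.Module):
    def __init__(self, num_classes: int = 13):
        super().__init__()
        self.rgb_enc   = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU())
        self.depth_enc = nn.Sequential(nn.Conv2d(1, 16, 3, padding=1), nn.ReLU())
        self.head      = nn.Conv2d(32, num_classes, 1)   # fuse by concatenation

    def forward(self, rgb, depth):
        feats = torch.cat([self.rgb_enc(rgb), self.depth_enc(depth)], dim=1)
        return self.head(feats)                          # per-pixel class logits

logits = RGBDSegNet()(torch.rand(1, 3, 64, 64), torch.rand(1, 1, 64, 64))
print(logits.shape)  # torch.Size([1, 13, 64, 64])
```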
Sim-to-real
- Warnell, G., Waytowich, N., Lawhern, V., & Stone, P. (2017). Deep TAMER: Interactive Agent Shaping in High-Dimensional State Spaces. Retrieved from http://arxiv.org/abs/1709.10163
- Tan, J., Zhang, T., Coumans, E., Iscen, A., Bai, Y., Hafner, D., … Vanhoucke, V. (2018). Sim-to-Real: Learning Agile Locomotion For Quadruped Robots. arXiv:1804.10332.
- Sadeghi, F., & Levine, S. (2016). CAD2RL: Real Single-Image Flight without a Single Real Image. http://doi.org/10.15607/RSS.2017.XIII.034
- Lee, R., Mou, S., Dasagi, V., Bruce, J., Leitner, J., & Sünderhauf, N. (2018). Zero-shot Sim-to-Real Transfer with Modular Priors. Retrieved from http://arxiv.org/abs/1809.07480
- Lee, M. A., Zhu, Y., Srinivasan, K., Shah, P., Savarese, S., Fei-Fei, L., … Bohg, J. (2018). Making Sense of Vision and Touch: Self-Supervised Learning of Multimodal Representations for Contact-Rich Tasks. arXiv:1810.10191.
- Tobin, J., Biewald, L., Duan, R., Andrychowicz, M., Handa, A., Kumar, V., … Abbeel, P. (2017). Domain Randomization and Generative Models for Robotic Grasping.
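The domain-randomization recipe in Tobin et al. can be stated compactly: re-sample the simulator's physical (and visual) parameters every episode so a policy cannot overfit to one wrong set of constants. A hedged sketch using PyBullet's changeDynamics call; robot_id, link_indices, and train_one_episode() are hypothetical placeholders:

```python
# Per-episode physics randomization. The PyBullet calls are real and assume
# an already-connected session; the training loop itself is hypothetical.
import random
import pybullet as p

def randomize_dynamics(robot_id, link_indices):
    for link in link_indices:
        p.changeDynamics(
            robot_id, link,
            mass=random.uniform(0.5, 2.0),            # perturb around nominal values
            lateralFriction=random.uniform(0.3, 1.2),
        )

# for episode in range(num_episodes):
#     randomize_dynamics(robot_id, link_indices)     # new physics every episode
#     train_one_episode(policy)
```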
Machine Learning
- OpenAI, Andrychowicz, M., Baker, B., Chociej, M., Jozefowicz, R., … Zaremba, W. (2018). Learning Dexterous In-Hand Manipulation. arXiv:1808.00177.
- Ha, D., & Schmidhuber, J. (2018). World Models. http://doi.org/10.5281/zenodo.1207631
- Li, T., Rai, A., Geyer, H., & Atkeson, C. G. (2018). Using Deep Reinforcement Learning to Learn High-Level Policies on the ATRIAS Biped. Retrieved from http://arxiv.org/abs/1809.10811
- Faust, A., Ramirez, O., Fiser, M., Oslund, K., Francis, A., Davidson, J., & Tapia, L. (2018). PRM-RL: Long-range Robotic Navigation Tasks by Combining Reinforcement Learning and Sampling-based Planning. IEEE International Conference on Robotics and Automation (ICRA). http://doi.org/10.1109/ICRA.2018.8461096
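At their core, these systems scale up the same recipe: collect simulated rollouts, weight actions by their returns, and repeat. The simplest instance is REINFORCE; the sketch below assumes the classic OpenAI Gym API (pre-0.26 reset/step signatures) and omits the discounting and baselines that real systems need:

```python
# Minimal REINFORCE on CartPole: sample an episode from the simulator,
# then increase the log-probability of actions in proportion to return.
import gym
import torch
import torch.nn as nn

env = gym.make("CartPole-v1")
policy = nn.Sequential(nn.Linear(4, 64), nn.Tanh(), nn.Linear(64, 2))
opt = torch.optim.Adam(policy.parameters(), lr=1e-2)

for episode in range(200):
    obs, done = env.reset(), False
    log_probs, rewards = [], []
    while not done:
        logits = policy(torch.as_tensor(obs, dtype=torch.float32))
        dist = torch.distributions.Categorical(logits=logits)
        action = dist.sample()
        obs, reward, done, _ = env.step(action.item())
        log_probs.append(dist.log_prob(action))
        rewards.append(reward)
    # Undiscounted return-to-go as the (high-variance) policy-gradient weight.
    returns = torch.cumsum(torch.tensor(rewards[::-1]), dim=0).flip(0)
    loss = -(torch.stack(log_probs) * returns).sum()
    opt.zero_grad()
    loss.backward()
    opt.step()
```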
In Progress
- Degrave, J., Hermans, M., Dambre, J., & wyffels, F. (2019). A Differentiable Physics Engine for Deep Learning in Robotics. Frontiers in Neurorobotics, 13(6), 1–9. http://doi.org/10.3389/fnbot.2019.00006
- Hummel, J., Wolff, R., Stein, T., Gerndt, A., & Kuhlen, T. (2012). An Evaluation of Open Source Physics Engines for Use in Virtual Reality Assembly Simulations. In Advances in Visual Computing (ISVC 2012), 346–357. http://doi.org/10.1007/978-3-642-33191-6_34
- Savva, M., Kadian, A., Maksymets, O., Zhao, Y., Wijmans, E., Jain, B., … Batra, D. (2019). Habitat: A Platform for Embodied AI Research. Retrieved from http://arxiv.org/abs/1904.01201
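The differentiable-physics idea in Degrave et al. is worth a toy illustration: when the simulator's step function is written in an autodiff framework, a task loss at the end of a rollout yields gradients with respect to controls or parameters. The point-mass Euler dynamics below are an assumption for illustration, not the paper's engine:

```python
# Gradients through a simulated rollout: differentiate the final-position
# error of a ballistic point mass with respect to its launch velocity.
import torch

dt, steps = 0.01, 100
v0 = torch.tensor([1.0, 1.0], requires_grad=True)   # learnable initial velocity
target = torch.tensor([0.8, 0.0])

pos, vel = torch.zeros(2), v0
for _ in range(steps):                               # explicit Euler integration
    acc = torch.tensor([0.0, -9.81])                 # gravity only
    vel = vel + dt * acc
    pos = pos + dt * vel

loss = ((pos - target) ** 2).sum()
loss.backward()                                      # d(loss)/d(v0) via autodiff
print("gradient of final-position error w.r.t. launch velocity:", v0.grad)
```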