A game that presents scenarios of impoverishment in America, written as part of a collaboration with researchers at Yale to encourage subjects to gain an alternative perspective on inequality in America.
Recommended citation: David Watkins-Valls, Chaiwen Chou, Caroline Weinberg, Jacob Varley, Kenneth Lyons, Sanjay Joshi, Lynne Weber, Joel Stein, and Peter Allen. "Human Robot Interface for Assistive Grasping." arXiv preprint arXiv:1804.02462 (2018). paper
Recommended citation: David Watkins-Valls, Chaiwen Chou, Caroline Weinberg, Jacob Varley, Lynne Weber, Adam Blanchard, Peter Allen, and Joel Stein. "Human Robot Interface for Assistive Grasping (Poster)". In: New England Manipulation Symposium (2017). paper
Recommended citation: Jacob Varley, David Watkins, and Peter Allen. “Visual-Tactile Geometric Reasoning (Abstract and Poster)”. In: Data-Driven Manipulation workshop, Robotics: Science and Systems (2017). paper
Recommended citation: D. Watkins-Valls, J. Varley and P. Allen, "Multi-Modal Geometric Learning for Grasping and Manipulation," 2019 International Conference on Robotics and Automation (ICRA), 2019, pp. 7339-7345, doi: 10.1109/ICRA.2019.8794233. paper
Recommended citation: Wu, Bohan, Akinola, Iretiayo, Gupta, Abhi, Xu, Feng, Varley, Jacob, Watkins-Valls, David, and Allen, Peter K. Generative Attention Learning: a “GenerAL” framework for high-performance multi-fingered grasping in clutter. Retrieved from https://par.nsf.gov/biblio/10164432. Autonomous Robots. Web. doi:10.1007/s10514-020-09907-y. paper
Recommended citation: I. Akinola et al., "Accelerated Robot Learning via Human Brain Signals," 2020 IEEE International Conference on Robotics and Automation (ICRA), 2020, pp. 3799-3805, doi: 10.1109/ICRA40945.2020.9196566. paper
Recommended citation: D. Watkins-Valls, J. Xu, N. Waytowich and P. Allen, "Learning Your Way Without Map or Compass: Panoramic Target Driven Visual Navigation," 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2020, pp. 5816-5823, doi: 10.1109/IROS45743.2020.9341511. paper
Recommended citation: Goecks, Vinicius G., et al. “Combining Learning from Human Feedback and Knowledge Engineering to Solve Hierarchical Tasks in Minecraft.” ArXiv:2112.03482 [Cs], Dec. 2021. arXiv.org, http://arxiv.org/abs/2112.03482. paper
Recommended citation: Watkins-Valls, D., Maia, H., Varley, J., Seshadri, M., Sanabria, J., Waytowich, N., & Allen, P. (2022). Mobile Manipulation Leveraging Multiple Views. 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2022). paper
The world of cryptocurrency is currently dominated by Bitcoin, due in part to its support from ASICs. There is an opportunity to build ASICs for Scrypt-based cryptocurrencies so that they can flourish as well.
Many videos on the Web about international events are maintained in different countries, and some come with text descriptions written from different cultural points of view. We apply a spectral decomposition algorithm to cluster these videos based on their visual memes and their written tag identifiers. The spectral decomposition produces matrices in which tags are clustered with tags and coclustered with visual memes, and visual memes are clustered with visual memes and coclustered with tags. We take one of these coclustered matrices and provide a Web service for visualizing the clustering as scatterplots, force-directed graph layouts, and histograms. In addition, we demonstrate that applying algorithms such as Reverse Cuthill-McKee allows the viewer to see a band-diagonal representation of the matrix.
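As a minimal sketch of that reordering step (the co-occurrence data below is synthetic; the real matrix would come from the spectral co-clustering described above), SciPy's Reverse Cuthill-McKee routine can permute a sparse tag/meme matrix so that its nonzeros gather into a band around the diagonal:

```python
# Sketch: reorder a sparse tag/meme co-occurrence matrix with Reverse
# Cuthill-McKee so that clustered rows and columns gather near the diagonal.
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import reverse_cuthill_mckee

# Toy symmetric co-occurrence matrix (tags x tags); in the pipeline this would
# come from the spectral co-clustering step.
cooc = csr_matrix(np.array([
    [0, 3, 0, 0, 1],
    [3, 0, 0, 0, 0],
    [0, 0, 0, 2, 0],
    [0, 0, 2, 0, 4],
    [1, 0, 0, 4, 0],
]))

perm = reverse_cuthill_mckee(cooc, symmetric_mode=True)
banded = cooc[perm][:, perm]      # apply the permutation to rows and columns
print(perm)
print(banded.toarray())           # nonzeros concentrated near the diagonal
```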
This research examines the impact that consistency between a startup's internal culture and its projected external culture has on the startup's success, using case studies of the New York startups Betaworks, PowerToFly, and Estimize. It also examines the impact of the average sentiment surrounding a startup on its success. The research draws from in-person interviews, online articles written about these startups, and a sentiment analysis dataset of 26 startups in New York City. The report finds that the largest impact on a startup's success is the consistency between its internal and external culture: PowerToFly had the lowest consistency score and the least funding, while Estimize and Betaworks both had higher consistency scores and more funding. Furthermore, the research found a negative correlation between positive sentiment and startup success, indicating that positive sentiment surrounding a company is not a good indicator of the company's present or future success.
SMT solvers have, in recent years, undergone optimizations that allow them to be considered for use in commercial software. Uses for such SMT solvers include program verification, buffer overflow detection, bit-width prediction, and loop unrolling. Companies such as Microsoft have pioneered SMT research through their Z3 solver. In this paper I investigate the techniques used to implement these solvers as well as provide examples of potential applications of SMT solvers.
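As a small illustration of the buffer overflow use case (the index expression and bounds check below are hypothetical, not drawn from any particular program), Z3's Python bindings can be asked whether any input drives a computed index past the end of a buffer:

```python
# Sketch using Microsoft's Z3 SMT solver (pip install z3-solver): ask whether
# any input lets a computed index run past the end of a fixed-size buffer.
from z3 import BitVec, Solver, ULT, UGE, sat

BUF_SIZE = 16
x = BitVec('x', 32)                 # untrusted 32-bit input
idx = (x * 4) + 3                   # index expression taken from the program

s = Solver()
s.add(ULT(x, 8))                    # the program only checks x < 8 (unsigned)
s.add(UGE(idx, BUF_SIZE))           # can the index still reach past the buffer?

if s.check() == sat:
    print("possible overflow, e.g. x =", s.model()[x])
else:
    print("index always stays in bounds")
```

Here the check x < 8 still admits x = 7, which yields idx = 31 ≥ 16, so the solver reports a concrete overflowing input.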
Expanding on the work done by Jake Varley et al. on Shape Completion Enabled Robotic Grasping [3], I performed a series of optimizations that enhance the pipeline, increasing its performance and flexibility. The marching cubes algorithm has been rewritten to support GPU operations, preliminary code has been written for completing entire scenes based on work done by Evan Shelhamer et al. [2], and a headless depth renderer has been written to help generate scenes for training data much faster than the current pipeline. These three contributions push the shape completion project toward a much more usable state, not only for our lab but also for any labs that may choose to use this software in the future.
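The GPU rewrite itself is not reproduced here, but as a rough CPU reference for the same mesh-extraction step, scikit-image's marching cubes turns a completed occupancy grid into a triangle mesh (the grid below is a synthetic sphere rather than a network output):

```python
# Rough CPU reference for the mesh-extraction step: run marching cubes over a
# completed occupancy grid to obtain a triangle mesh for grasp planning.
import numpy as np
from skimage import measure

# Synthetic 40^3 occupancy grid containing a sphere; in the pipeline this would
# be the shape-completion network's output voxel grid.
dim = 40
z, y, x = np.mgrid[:dim, :dim, :dim]
occupancy = ((x - 20) ** 2 + (y - 20) ** 2 + (z - 20) ** 2 < 12 ** 2).astype(np.float32)

# Extract the 0.5 iso-surface; verts and faces define the mesh.
verts, faces, normals, values = measure.marching_cubes(occupancy, level=0.5)
print(verts.shape, faces.shape)
```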
Compiling high-level programming languages into hardware is no small task. It requires dividing the program into constituent parts that are representable by a hardware circuit and creating a proper memory management system that can fit on a single hardware circuit. Designing a memory system that reduces contention requires analyzing the dataflow circuit generated from the high-level program; a partitioning can be determined with a graph coloring algorithm, using a separate memory system for each color of the graph. This reduces memory contention and allows the system to work faster overall.
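A minimal sketch of the coloring idea, with hypothetical array names: build a conflict graph whose edges connect arrays accessed in the same cycle, then greedily color it so that each color maps to its own memory bank.

```python
# Sketch (array names are hypothetical): color a dataflow "memory conflict"
# graph so that arrays accessed concurrently land in different memory banks.
import networkx as nx

# Nodes are arrays in the high-level program; an edge means two arrays are
# accessed in the same cycle and would contend for one memory port.
conflicts = nx.Graph()
conflicts.add_edges_from([
    ("weights", "activations"),
    ("activations", "partial_sums"),
    ("weights", "partial_sums"),
    ("partial_sums", "output"),
])

# Greedy coloring; each color becomes a separate on-chip memory bank.
bank_of = nx.coloring.greedy_color(conflicts, strategy="largest_first")
print(bank_of)   # e.g. {'partial_sums': 0, 'weights': 1, 'activations': 2, 'output': 1}
```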
This work describes a new human-in-the-loop (HitL) assistive grasping system for individuals with varying levels of physical capability. We investigated the feasibility of using four potential input devices with our assistive grasping system interface, using able-bodied individuals to define a set of quantitative metrics that could be used to assess an assistive grasping system. We then took these measurements and created a generalized benchmark for evaluating the effectiveness of an arbitrary input device in a HitL grasping system. The four input devices were a mouse, a speech recognition device, an assistive switch, and a novel sEMG device developed by our group that was attached either to the subject's forearm or behind the ear. These preliminary results provide insight into how different interface devices perform for generalized assistive grasping tasks and also highlight the potential of sEMG-based control for severely disabled individuals.
This work provides an architecture that uses a learning algorithm incorporating depth and tactile information to create rich and accurate 3D models from single depth images. These models can then be used for robotic manipulation tasks. This is accomplished through the use of a 3D convolutional neural network (CNN). Offline, the network is provided with both depth and tactile information and trained to predict the object's geometry, filling in the occluded regions of the object. At runtime, the network is provided a partial view of an object and produces an initial object hypothesis using depth alone. A grasp is planned using this hypothesis and a guarded move takes place to collect tactile information. The network then improves the system's understanding of the object's geometry by utilizing the newly collected tactile information.
This work provides an architecture that incorporates depth and tactile information to create rich and accurate 3D models useful for robotic manipulation tasks. This is accomplished through the use of a 3D convolutional neural network (CNN). Offline, the network is provided with both depth and tactile information and trained to predict the object’s geometry, thus filling in regions of occlusion. At runtime, the network is provided a partial view of an object and tactile information is acquired to augment the captured depth information. The network can then reason about the object’s geometry by utilizing both the collected tactile and depth information. We demonstrate that even small amounts of additional tactile information can be incredibly helpful in reasoning about object geometry. This is particularly true when information from depth alone fails to produce an accurate geometric prediction. Our method is benchmarked against and outperforms other visual-tactile approaches to general geometric reasoning. We also provide experimental results comparing grasping success with our method.
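As a loose sketch of the kind of network involved (illustrative only, not the exact architecture from either work above), a small 3D encoder-decoder CNN can map a two-channel voxel grid, depth-derived occupancy plus sparse tactile contacts, to a completed occupancy grid:

```python
# Loose PyTorch sketch (not the papers' exact architecture): a small 3D CNN
# that maps a two-channel voxel grid (depth-derived occupancy plus sparse
# tactile contacts) to a completed per-voxel occupancy prediction.
import torch
import torch.nn as nn

class VisualTactileCompletion(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv3d(2, 32, kernel_size=4, stride=2, padding=1),            # 40 -> 20
            nn.ReLU(inplace=True),
            nn.Conv3d(32, 64, kernel_size=4, stride=2, padding=1),           # 20 -> 10
            nn.ReLU(inplace=True),
            nn.ConvTranspose3d(64, 32, kernel_size=4, stride=2, padding=1),  # 10 -> 20
            nn.ReLU(inplace=True),
            nn.ConvTranspose3d(32, 1, kernel_size=4, stride=2, padding=1),   # 20 -> 40
            nn.Sigmoid(),                       # per-voxel occupancy probability
        )

    def forward(self, voxels):                  # voxels: (B, 2, 40, 40, 40)
        return self.net(voxels)

model = VisualTactileCompletion()
depth_and_tactile = torch.rand(1, 2, 40, 40, 40)
completed = model(depth_and_tactile)            # (1, 1, 40, 40, 40)
```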
The objective is to learn grasp synergies with high fidelity given object pose and geometry. However, under-actuated, anthropomorphic hands require complex, high-dimensional control strategies, and including object pose and geometry further increases the size of the state space. Grasping in unstructured environments in the same fashion as humans therefore proves to be non-trivial.
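For illustration only: grasp synergies are often modeled as a low-dimensional linear basis over hand joint angles, and PCA is a common stand-in for extracting such a basis (this is not necessarily the method used in this work):

```python
# Illustrative only: extract a low-dimensional "synergy" basis over hand joint
# angles with PCA. Not necessarily the method used in this work.
import numpy as np
from sklearn.decomposition import PCA

n_grasps, n_joints = 500, 20            # e.g. a 20-DoF anthropomorphic hand
joint_angles = np.random.rand(n_grasps, n_joints)    # recorded grasp postures

pca = PCA(n_components=3)               # keep the first few synergies
synergy_coords = pca.fit_transform(joint_angles)     # (500, 3) low-dim control space

# A new grasp can be commanded by choosing 3 synergy coefficients and mapping
# them back to the full joint space.
full_joint_command = pca.inverse_transform(synergy_coords[:1])
print(full_joint_command.shape)         # (1, 20)
```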
Despite recent advances in virtual reality technology, teleoperating a high-DoF robot to complete dexterous tasks in cluttered scenes remains difficult.
We present a robot navigation system that uses an imitation learning framework to successfully navigate in complex environments. Our framework takes a pre-built 3D scan of a real environment and trains an agent from pre-generated expert trajectories to navigate to any position given a panoramic view of the goal and the current visual input without relying on map, compass, odometry, or relative position of the target at runtime. Our end-to-end trained agent uses RGB and depth (RGBD) information and can handle large environments (up to 1031 m²) across multiple rooms (up to 40) and generalizes to unseen targets. We show that when compared to several baselines our method (1) requires fewer training examples and less training time, (2) reaches the goal location with higher accuracy, and (3) produces better solutions with shorter paths for long-range navigation tasks.
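A hedged sketch of such an agent (not the paper's exact network): a policy that encodes the current RGBD frame and the panoramic goal image with small CNN trunks and outputs discrete navigation actions suitable for imitation learning.

```python
# Hedged sketch (not the paper's exact network): fuse the current RGBD frame
# with a panoramic image of the goal and output discrete navigation actions
# (forward / turn-left / turn-right / stop).
import torch
import torch.nn as nn

def conv_encoder(in_channels):
    # Small CNN trunk used for both the current view and the goal panorama.
    return nn.Sequential(
        nn.Conv2d(in_channels, 32, 8, stride=4), nn.ReLU(inplace=True),
        nn.Conv2d(32, 64, 4, stride=2), nn.ReLU(inplace=True),
        nn.Conv2d(64, 64, 3, stride=1), nn.ReLU(inplace=True),
        nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    )

class PanoramicGoalPolicy(nn.Module):
    def __init__(self, num_actions=4):
        super().__init__()
        self.current_enc = conv_encoder(4)   # RGB + depth of the current view
        self.goal_enc = conv_encoder(3)      # RGB panorama of the goal
        self.head = nn.Sequential(
            nn.Linear(128, 256), nn.ReLU(inplace=True),
            nn.Linear(256, num_actions),
        )

    def forward(self, rgbd, goal_pano):
        feats = torch.cat([self.current_enc(rgbd), self.goal_enc(goal_pano)], dim=1)
        return self.head(feats)              # action logits for behavior cloning

policy = PanoramicGoalPolicy()
logits = policy(torch.rand(1, 4, 128, 128), torch.rand(1, 3, 128, 512))
```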
While both navigation and manipulation are challenging topics in isolation, many tasks require the ability to both navigate and manipulate in concert. To this end, we propose a mobile manipulation system that leverages novel navigation and shape completion methods to manipulate an object with a mobile robot. Our system utilizes uncertainty in the initial estimation of a manipulation target to calculate a predicted next-best-view. Without the need for localization, the robot then uses the predicted panoramic view at the next-best-view location to navigate to the desired location and capture a second view of the object. It then creates a new model that predicts the shape of the object more accurately than a single image alone and uses this model for grasp planning. We show that the system is highly effective for mobile manipulation tasks through simulation experiments using real-world data, as well as ablations on each component of our system.
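One way to picture the next-best-view step (function names and the scoring rule below are assumptions for illustration, not the system's exact method) is to score candidate viewpoints by how much of the shape-completion model's uncertainty each one would observe:

```python
# Illustrative sketch (names and scoring are assumptions): choose the candidate
# viewpoint that sees the largest amount of per-voxel prediction uncertainty.
import numpy as np

def next_best_view(voxel_variance, candidate_views, visible_mask_fn):
    """Return the candidate viewpoint that observes the most uncertain voxels.

    voxel_variance : (D, D, D) array of per-voxel prediction variance
    candidate_views: list of candidate camera poses
    visible_mask_fn: pose -> boolean (D, D, D) visibility mask (e.g. ray casting)
    """
    scores = [voxel_variance[visible_mask_fn(pose)].sum() for pose in candidate_views]
    return candidate_views[int(np.argmax(scores))]

# Toy usage: random uncertainty, viewpoints on a circle around the object.
variance = np.random.rand(32, 32, 32)
views = [{"yaw": yaw} for yaw in np.linspace(0, 2 * np.pi, 8, endpoint=False)]
dummy_visibility = lambda pose: np.random.rand(32, 32, 32) > 0.5
print(next_best_view(variance, views, dummy_visibility))
```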
Real-world tasks of interest are generally poorly defined by human-readable descriptions and have no pre-defined reward signal unless one is defined by a human designer. Conversely, data-driven algorithms are often designed to solve a specific, narrowly defined task with performance metrics that drive the agent's learning. In this work, we present the solution that won first place and was awarded the most human-like agent in the 2021 NeurIPS Competition MineRL BASALT Challenge: Learning from Human Feedback in Minecraft, which challenged participants to use human data to solve four tasks defined only by a natural language description and no reward function. Our approach uses the available human demonstration data to train an imitation learning policy for navigation and additional human feedback to train an image classifier. These modules, together with an estimated odometry map, are then combined into a state machine designed based on human knowledge of the tasks, which breaks them down into a natural hierarchy and controls which macro behavior the learning agent should follow at any instant. We compare this hybrid intelligence approach to both end-to-end machine learning and purely engineered solutions, which are then judged by human evaluators. The codebase is available at this https URL.
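A rough sketch of the state-machine idea (behavior names and transition rules here are illustrative, not the competition entry's exact logic): learned modules supply observations, and hand-written rules select the macro behavior.

```python
# Rough sketch of the hybrid approach: learned modules (navigation policy,
# image classifier, estimated odometry) provide observations, and hand-written
# rules pick which macro behavior the agent follows. Behavior names and
# transition thresholds are illustrative only.
from enum import Enum, auto

class Behavior(Enum):
    EXPLORE = auto()          # run the imitation-learned navigation policy
    APPROACH_TARGET = auto()
    ACT_ON_TARGET = auto()
    FINISH = auto()

def select_behavior(target_visible, distance_to_goal, steps_remaining):
    """Hand-written hierarchy over the learned components."""
    if steps_remaining <= 0:
        return Behavior.FINISH
    if target_visible and distance_to_goal > 2.0:
        return Behavior.APPROACH_TARGET
    if target_visible:
        return Behavior.ACT_ON_TARGET
    return Behavior.EXPLORE

print(select_behavior(True, 5.3, 1200))   # -> Behavior.APPROACH_TARGET
```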
Real-world robotics is a multifarious process spanning several fields, including simulation, semantic/scene understanding, reinforcement learning, and domain randomization, to name just a few. Ideally, simulators would capture the real world accurately and run much faster than real time, providing predictive power over how a robot will interact with its environment. Unfortunately, simulators have neither the speed nor the accuracy to support this. Simulators such as Gazebo, Webots, and OpenRAVE are therefore supplemented with machine-learned models of their environment to solve specific tasks such as scene understanding and path planning. This can be compared to a physical-only solution, which can be costly in terms of both money and time. Advances in virtual reality allow for new ways for humans to provide training data for robotic systems in simulation. Using modern datasets such as SUNCG and Matterport3D, we now have more ability than ever to train robots in virtual environments. Through understanding modern applications of simulation, better robotic platforms can be designed to solve some of the most pressing challenges of modern robotics.
Providing mobile robots with the ability to manipulate objects has, despite decades of research, remained a challenging problem. The problem is approachable in constrained environments where there is ample prior knowledge of the environment and the objects to be manipulated. The challenge is in building systems that scale beyond specific situational instances and gracefully operate in novel conditions. In the past, heuristic and simple rule-based strategies were used to accomplish tasks such as scene segmentation or reasoning about occlusion. These heuristic strategies work in constrained environments where a roboticist can make simplifying assumptions about everything from the geometries of the objects to be interacted with to the level of clutter, camera position, lighting, and a myriad of other relevant variables. In this thesis we demonstrate how a system for mobile manipulation can be built that is robust to changes in these variables. This robustness is enabled by recent simultaneous advances in the fields of Big Data, Deep Learning, and Simulation. The ability of simulators to create realistic sensory data enables the generation of massive corpora of labeled training data for various grasping and navigation tasks. We show that it is now possible to build systems that work in the real world, trained using deep learning almost entirely on synthetic data. The ability to train and test on synthetic data allows for quick iterative development of new perception, planning, and grasp execution algorithms that work in a large number of environments.