TR2023-009
H-SAUR: Hypothesize, Simulate, Act, Update, and Repeat for Understanding Object Articulations from Interactions
- "H-SAUR: Hypothesize, Simulate, Act, Update, and Repeat for Understanding Object Articulations from Interactions", IEEE International Conference on Robotics and Automation (ICRA), DOI: 10.1109/ICRA48891.2023.10160575, May 2023, pp. 7272-7278.BibTeX TR2023-009 PDF
@inproceedings{Ota2023may,
  author = {Ota, Kei and Tung, Hsiao-Yu and Smith, Kevin and Cherian, Anoop and Marks, Tim K. and Sullivan, Alan and Kanezaki, Asako and Tenenbaum, Joshua B.},
  title = {H-SAUR: Hypothesize, Simulate, Act, Update, and Repeat for Understanding Object Articulations from Interactions},
  booktitle = {IEEE International Conference on Robotics and Automation (ICRA)},
  year = 2023,
  pages = {7272--7278},
  month = may,
  publisher = {IEEE},
  doi = {10.1109/ICRA48891.2023.10160575},
  url = {https://www.merl.com/publications/TR2023-009}
}
Abstract:
The world is filled with articulated objects whose use is difficult to determine from vision alone, e.g., a door might open inwards or outwards. Humans handle these objects with strategic trial-and-error: first pushing a door, then pulling if that doesn't work. We enable these capabilities in autonomous agents by proposing "Hypothesize, Simulate, Act, Update, and Repeat" (H-SAUR), a probabilistic generative framework that simultaneously generates a distribution of hypotheses about how objects articulate given input observations, captures certainty over hypotheses over time, and infers plausible actions for exploration and goal-conditioned manipulation. We compare our model with existing work in manipulating objects after a handful of exploration actions, on the PartNet-Mobility dataset. We further propose a novel PuzzleBoxes benchmark that contains locked boxes that require multiple steps to solve. We show that the proposed model significantly outperforms the current state-of-the-art articulated object manipulation framework, despite using zero training data. We further improve the test-time efficiency of H-SAUR by integrating a learned prior from learning-based vision models.
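The hypothesize-simulate-act-update loop described in the abstract can be illustrated with a small belief-update sketch. This is an illustrative assumption rather than the authors' implementation: H-SAUR operates with a physics simulator and visual observations over articulated parts, whereas the hypothetical names below (simulate_outcome, choose_action, the three door hypotheses, the 0.9/0.1 noise model) are simple stand-ins for those components.

# Minimal sketch of the "Hypothesize, Simulate, Act, Update, and Repeat" loop.
# Assumption: the toy forward model, hypotheses, and noisy-observation
# likelihood below are illustrative stand-ins, not the paper's simulator.

HYPOTHESES = ["opens_outward", "opens_inward", "slides"]
ACTIONS = ["push", "pull", "slide_sideways"]
GROUND_TRUTH = "opens_outward"   # hidden articulation model of the real object


def simulate_outcome(hypothesis, action):
    """Toy forward model: would the object move under this hypothesis/action?"""
    moves = {
        ("opens_outward", "pull"): True,
        ("opens_inward", "push"): True,
        ("slides", "slide_sideways"): True,
    }
    return moves.get((hypothesis, action), False)


def choose_action(belief):
    """Act: pick the action with the highest simulated success probability."""
    def expected_success(action):
        return sum(p * simulate_outcome(h, action) for h, p in belief.items())
    return max(ACTIONS, key=expected_success)


# Hypothesize: start from a uniform belief over articulation hypotheses.
belief = {h: 1.0 / len(HYPOTHESES) for h in HYPOTHESES}

for step in range(3):
    # Simulate + Act: roll each hypothesis forward, then act on the real object.
    action = choose_action(belief)
    moved = simulate_outcome(GROUND_TRUTH, action)  # observed outcome

    # Update: Bayes rule with a simple noisy observation likelihood.
    likelihood = {
        h: 0.9 if simulate_outcome(h, action) == moved else 0.1
        for h in HYPOTHESES
    }
    z = sum(likelihood[h] * belief[h] for h in HYPOTHESES)
    belief = {h: likelihood[h] * belief[h] / z for h in HYPOTHESES}
    print(f"step {step}: action={action}, moved={moved}, belief={belief}")
    # Repeat until the belief is confident enough for goal-conditioned use.

In this toy run the agent first pushes, observes no motion, shifts its belief away from "opens_inward", and then pulls successfully, mirroring the trial-and-error behavior the abstract describes.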
Related News & Events
NEWS: MERL Researchers Present Thirteen Papers at the 2023 IEEE International Conference on Robotics and Automation (ICRA)
Date: May 29, 2023 - June 2, 2023
Where: 2023 IEEE International Conference on Robotics and Automation (ICRA)
MERL Contacts: Anoop Cherian; Radu Corcodel; Siddarth Jain; Devesh K. Jha; Toshiaki Koike-Akino; Tim K. Marks; Daniel N. Nikovski; Arvind Raghunathan; Diego Romeres
Research Areas: Computer Vision, Machine Learning, Optimization, Robotics
Brief: MERL researchers will present thirteen papers, including eight main conference papers and five workshop papers, at the 2023 IEEE International Conference on Robotics and Automation (ICRA) to be held in London, UK from May 29 to June 2. ICRA is one of the largest and most prestigious conferences in the robotics community. The papers cover a broad set of topics in robotics including estimation, manipulation, vision-based object recognition and segmentation, tactile estimation and tool manipulation, robotic food handling, robot skill learning, and model-based reinforcement learning.
In addition to the paper presentations, MERL robotics researchers will also host an exhibition booth and look forward to discussing our research with visitors.