Manipulating arbitrary objects in unstructured environments is a significant challenge in robotics, primarily due to the difficulty of determining an object's center of mass (CoM). This paper introduces U-GRAPH: Uncertainty-Guided Rotational Active Perception with Haptics, a novel framework that enhances CoM estimation through active perception. Traditional methods often rely on a single interaction and are limited by the inherent inaccuracies of Force-Torque (F/T) sensors. Our approach circumvents these limitations by integrating a Bayesian Neural Network (BNN) to quantify uncertainty and guide the robotic system through multiple, information-rich interactions, using a grid search and a neural network that scores each candidate action. We demonstrate the generalizability and transferability of our method: although it is trained on a small dataset with limited variation, it still performs well on unseen, complex real-world objects.
After grasping the object, we define its CoM by the displacements dx, dy, and dz from the grasping point.
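For intuition, under a static-equilibrium model the F/T reading at the grasp point satisfies tau = d x F, where d = (dx, dy, dz) is the CoM displacement and F is the gravitational force expressed in the sensor frame. A single reading leaves d unobservable along the direction of F, which is one reason measurements at multiple orientations are informative. The sketch below is a simplified analytical baseline for illustration only, not the learned model used in U-GRAPH; the function name com_from_wrenches and the assumption of noise-free, gravity-compensated readings are ours.

```python
import numpy as np

def skew(v):
    """Skew-symmetric matrix so that skew(a) @ b == np.cross(a, b)."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def com_from_wrenches(forces, torques):
    """Least-squares CoM displacement d from one or more static F/T readings.

    Each reading satisfies torque = d x force, i.e. -skew(force) @ d = torque.
    One reading leaves d unobservable along the force (gravity) direction;
    stacking readings taken at different orientations resolves the ambiguity.
    """
    A = np.vstack([-skew(f) for f in forces])   # (3N, 3)
    b = np.concatenate(torques)                 # (3N,)
    d, *_ = np.linalg.lstsq(A, b, rcond=None)
    return d                                    # (dx, dy, dz) in the sensor frame
```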
The hardware system features a 6-DoF UR5e robot arm. Attached to the robot's wrist is a 6-axis NRS-6050-D80 F/T sensor from Nordbo Robotics with a sampling rate of 1000 Hz. The arm is also equipped with a WSG-50 2-fingered gripper from Weiss Robotics with customized PLA 3D-printed fingers. For data collection, we designed and 3D-printed two objects with dimensions of 15 cm × 15 cm × 8 cm, each including two 4 cm × 4 cm holders for placing AprilTags. The plate object weighs 127.36 grams and is designed to be grasped at its center. The box object weighs 185.36 grams and is designed to be grasped on its side. We use standard laboratory weights for the experiments, specifically two 100-gram weights and one 200-gram weight.
Targeting a generalized and robust CoM estimation framework, we propose U-GRAPH: Uncertainty-Guided Rotational Active Perception with Haptics. The system incorporates a BNN that processes 6-dimensional force-torque readings and 2-dimensional orientation data to yield a 3-dimensional CoM estimate. U-GRAPH also features ActiveNet, which uses the output of the BNN to determine the next best action. Assuming the robot has already grasped the object, we perform two measurements at different orientations to accurately estimate its CoM. The BNN supplies both a prior prediction and a quantification of its uncertainty through the standard deviation of the predictive distribution. ActiveNet takes this prior estimation distribution as input and, through a grid search, computes a score for each candidate action to determine the best one. Specifically, the action space is the 2-dimensional orientation of the grasping pose, and executing an action corresponds to changing that pose. At inference time, we first use a fixed orientation to generate a prior estimate of the CoM location; ActiveNet then performs a grid search over the entire action space, scoring each action with the prior estimate as input.
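The following sketch illustrates this two-stage inference loop under stated assumptions: the BNN's predictive mean and standard deviation are approximated here with Monte Carlo dropout, and the layer sizes, grid range and resolution, and the names BNN, ActiveNet, and select_next_action are illustrative placeholders rather than the paper's actual implementation.

```python
import math
import torch
import torch.nn as nn

class BNN(nn.Module):
    """Maps an 8-D input (6-D F/T reading + 2-D grasp orientation) to a 3-D CoM estimate.
    Dropout kept active at test time gives a Monte Carlo approximation of a BNN."""
    def __init__(self, hidden=128, p=0.1):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(8, hidden), nn.ReLU(), nn.Dropout(p),
            nn.Linear(hidden, hidden), nn.ReLU(), nn.Dropout(p),
            nn.Linear(hidden, 3))

    def forward(self, x):
        return self.net(x)

    def predict(self, x, samples=50):
        """Predictive mean and standard deviation of the CoM via MC dropout."""
        self.train()  # keep dropout active during sampling
        with torch.no_grad():
            draws = torch.stack([self(x) for _ in range(samples)])
        return draws.mean(0), draws.std(0)

class ActiveNet(nn.Module):
    """Scores a candidate 2-D orientation given the prior CoM mean and std."""
    def __init__(self, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(8, hidden), nn.ReLU(),
            nn.Linear(hidden, 1))

    def forward(self, prior_mean, prior_std, action):
        return self.net(torch.cat([prior_mean, prior_std, action], dim=-1))

def select_next_action(bnn, active_net, ft_reading, init_orientation, grid_steps=19):
    """Prior estimate at the fixed initial orientation, then a grid search over the
    2-D orientation action space to pick the highest-scoring next measurement."""
    x = torch.cat([ft_reading, init_orientation]).unsqueeze(0)
    prior_mean, prior_std = bnn.predict(x)
    angles = torch.linspace(-math.pi / 2, math.pi / 2, grid_steps)
    best_score, best_action = float("-inf"), None
    for a in angles:
        for b in angles:
            action = torch.tensor([[a, b]])
            score = active_net(prior_mean, prior_std, action).item()
            if score > best_score:
                best_score, best_action = score, action
    return best_action
```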
In our first experiment, we use the same plate and box setup but vary the weight configurations. We test five different weight configurations across both objects: no weight, a single 100-gram weight, two 100-gram weights placed together, two 100-gram weights placed separately, and two separate weights totaling 300 grams. For each configuration, we conduct five randomly selected grasps. The data for this experiment are captured using the same overhead camera and AprilTag setup as in training.
We also perform experiments on a set of 12 real-world objects commonly seen in daily life. We predefine the grasping point for each object and obtain the ground-truth CoM by balancing the object along each axis with the gripper.