A Right to Bear (Robo) Arms Doesn't Mean You Have To

Bye bye octopi,
Your appendages can go,
This shell is sure swell

Automation often gets associated with robotic manipulators, especially the serial-chain variety, which also forms the basis for any intro robotics course. Arms and AGVs probably comprise the core components of nearly all robotic automation setups. The arm is the platform on which we place our end-effectors as well as our hopes and dreams. Companies like UR have made it increasingly easy to plug-and-play arms into various roles previously reserved for humans, and despite the (imo) market dominance of the major manipulator vendors (e.g. UR, Fanuc, ABB), as well as the ever-growing list of failures (e.g. Rethink, Carbon, Innfos, Pneubotics), we continue to see new players developing new iterations on the humble kinematic chain (e.g. Ally, Flexiv, Dobot). Aside from a race to the bottom in terms of price (that somehow magically doesn't compromise key performance capabilities), new arm offerings strive for more compliant/reactive capabilities, higher payload, improved usability, and novel form factors. The arm seemingly remains a key component for positioning your tooling in your workspace.

I've dealt with a range of robotic arms for the whole duration of my career now (not sure if I should be grateful), handling everything from integration, motion control, and basic planning to routing air/electrical lines, mounting custom end-effectors (so many end-effectors), and long-term mechanical maintenance/repair. With each arm, there have been new interfaces, both hardware and software, that needed to be understood and then adapted for the target task. The idiosyncrasies in terms of reachable workspace and joint constraints are different enough to be frustrating, but consistent enough to be expected. Despite all the setup and resources we pour into building the application solution around a robot arm, I would argue that it's woefully under-utilized in most cases, and we should probably replace it with far simpler positioning devices, if only they were easier to customize/construct.

Do We Really Need 6-DOF?

I'm not convinced that the majority of applications employing robotic arms need the full workspace that a typical 6-dof serial chain arm provides, but let me be clear: the capability to position and pose the end of a manipulator in all 6 degrees of freedom is substantial, and it's not something that can be easily re-added or bandaided into existence on top of other motion systems. That reason alone may be sufficient justification for integrating a robotic arm wherever possible. I'll admit that I'm still a bit tainted by academia, where we're always trying to reduce the dimensionality of the problem. Entire theses have been written and will continue to be written on how principal component analysis (or some variation) reduces a super complex problem into something way more tractable (under certain conditions, blah blah blah).

That said, I think we would still mostly be in agreement in saying that you'd be hard-pressed to make up an application where you absolutely needed the robot arm to exhaustively move through its workspace in 6-dof. You may, with some difficulty, conceptualize a task where you need a highly dexterous arm, especially if the environment is dynamic. However, I'd ask: at what point does that operation venture into the nebulous world of 'general purpose' tasks, something so vague that by definition you need the full workspace range at your disposal? Once the task scope begins to crystallize and narrow, I'd argue that the utility of a 6-dof motion platform starts to quickly depreciate.

Palletizing and Container Handling

Boxes or box-like things meant to be stacked on pallets seem highly structured when compared to other candidate objects in more generalized pick-and-place problems, with consistent reference surfaces and stable resting configurations. In their typical use cases, they're likely to be well organized or at least singulated as well, so why would we need 6-dof arms to handle these things? Whether it's loading a pallet, re-organizing racks of totes, handling trays, moving luggage, mail-handling, or other similar applications, the motion itself seems predominantly (if not exclusively) Cartesian, with the end-effector approaching from the top or bottom, and never needing to tilt the box itself in any way. There are even 3rd-party attachments sold to "fix" the payload-limitation issue of serial-chain arms. In a lot of these use-cases, we could probably even get away with chutes, and it seems like the primary benefit you get from serial-chain arms is the slightly more compact space usage.
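To make the "predominantly Cartesian" point concrete, here is a minimal sketch of what a palletizing target list could look like. Everything in it is made up for illustration (the Placement type, the pallet_placements helper, the pallet and box dimensions, the simple column-stacked pattern); it isn't taken from any vendor's software. The thing to notice is that each target only needs x, y, z, and a yaw about the vertical axis, so roll and pitch never show up.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Placement:
    """A hypothetical box drop-off target: position plus rotation about vertical only."""
    x_mm: int      # box centre along the pallet length
    y_mm: int      # box centre along the pallet width
    z_mm: int      # height of the surface the box is set down on
    yaw_deg: int   # always 0 here; an interlocked pattern would also use 90


def pallet_placements(pallet_l: int, pallet_w: int,
                      box_l: int, box_w: int, box_h: int,
                      layers: int) -> List[Placement]:
    """Enumerate drop-off targets for a simple column-stacked pallet (all units mm)."""
    nx = pallet_l // box_l   # boxes that fit along the length
    ny = pallet_w // box_w   # boxes that fit along the width
    placements = []
    for layer in range(layers):
        for ix in range(nx):
            for iy in range(ny):
                placements.append(Placement(
                    x_mm=ix * box_l + box_l // 2,
                    y_mm=iy * box_w + box_w // 2,
                    z_mm=layer * box_h,
                    yaw_deg=0,
                ))
    return placements


# Assumed example: 1200 x 800 mm pallet, 400 x 300 x 250 mm boxes, two layers.
targets = pallet_placements(1200, 800, 400, 300, 250, layers=2)
print(len(targets), targets[0], targets[-1])
```

A three-axis gantry with a single rotary wrist could, at least on paper, visit every target this produces.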

The more "exotic" box-handling solutions are even more confusing to me. I don't understand the advantage that a humanoid like Digit or Atlas has over more conventional solutions. Even the more simplified/industrial implementation from Dexterous Robotics has me scratching my head. Boston Dynamics' Stretch implementation may provide benefits in terms of payload and speed, but is it still worth the complexity? How much additional utility do these solutions offer beyond the humble forklift?

I'm a much bigger fan of more "dedicated" box-handlers, so to speak, like Invia Robotics or Hai Robotics' solutions. Those seem like a much better middle ground between a complex manipulator system that is able to handle more chaotic package clutters and the complete "robotic warehouse" solutions like Symbotic and Ocado. At the smaller scale, more established examples of pharmacy automation seem to have leaned more towards the "smart vending machine" approach, with gantries and custom shelves to dispense pillboxes, instead of larger setups with industrial arms, though demonstrations of the latter still pop up from time to time.

I would actually also bucket the majority of the recent wave of food-robotics setups into the container-handling space. In my opinion, very few of them interface with the food directly; most attempts just move some container (e.g. tray, bowl, wire basket) between standalone dispensers and different heating zones. For example, I don't see why, functionally, the arms of CafeX and Bobacino couldn't be replaced by the automated drink conveyors already developed by Miso (Sippy) and Cornelius (Automatic Beverage System), the latter of which has been around for ages. This isn't to say that all startups in that space lean a certain way: whereas companies like Pazzi and Hyper Robotics use 6-dof arms to handle certain tools, we see places like Stellar use more bespoke conveyors and gantries.

Bin-Picking and Stowing

Bin-picking, especially with objects in clutter, presents a more complicated challenge, albeit one that predominantly only needs an approach from the top. For me, the key cop-out distinction that should be kept in mind for this task is that we only need to somehow attach the object to the end-effector, not necessarily in a particular relative pose. Now, there may be a limited set of (or perhaps only one) relative pre-grasp poses in which the selected end-effector can reliably pick the object, but in my opinion, that particular scenario is far less likely with the move towards compliant suction cups with adaptive bellows. Somehow connecting an object to the end-effector doesn't always necessitate a secure and locked relative transform between the two.

Take your pick of some of the current stalwarts in the bin-picking space: Righthand, Berkshire Grey, Soft Robotics, Nimble AI, Kindred AI (and a whole bunch more I'm missing). For all the talk of optimized approach vectors, I'm not convinced that jamming a compliant suction cup into the object's center of mass from directly above wouldn't return a similar if not the same end result, given that the demonstrated results are pretty much just that. I have zero numbers to back up that accusation, but I hope the reader can accept my point that for the dominant, semi-"unstructured", commercial task that's challenging the robotics community at the moment, the required range of motion is fairly minimal, if not only Cartesian. Also, because it's convenient for my argument, I'd like to point out that while the winner of the initial Amazon Robotics Picking Challenge used a 7-dof Barrett WAM, the last winner used a far simpler, custom gantry of the team's own design. I think it's also interesting that the problem of sorting recyclables/waste, arguably a more unstructured problem (with less object knowledge known a priori) than bin-picking, has stuck with limited-mobility gantries over 6-dof arms, as shown by implementations from AMP Robotics, Glacier, and Waste Robotics.
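For what it's worth, the "jam a suction cup in from directly above" strategy fits in a few lines. The sketch below is a toy under loud assumptions, not a description of what any of the companies above actually run: the top_down_pick name and the sample points are made up, and the centroid of the visible points stands in for the center of mass (which a camera can't observe directly anyway). The approach direction is hard-coded to straight down, so the whole "plan" is a Cartesian target.

```python
from statistics import fmean
from typing import Dict, Sequence, Tuple

Point = Tuple[float, float, float]  # (x, y, z) in metres, z pointing up


def top_down_pick(points: Sequence[Point]) -> Dict[str, Point]:
    """Return a straight-down suction target for one segmented object.

    Assumption: the x/y centroid of the visible points is a good-enough
    proxy for the object's center of mass.
    """
    if not points:
        raise ValueError("need at least one point for the segmented object")
    cx = fmean(p[0] for p in points)
    cy = fmean(p[1] for p in points)
    top_z = max(p[2] for p in points)  # first contact as the cup descends
    return {
        "position": (cx, cy, top_z),
        "approach": (0.0, 0.0, -1.0),  # always vertical; no wrist tilt
    }


# Made-up points on the top face of a small box sitting in a bin.
pick = top_down_pick([
    (0.0, 0.0, 0.10),
    (0.2, 0.0, 0.10),
    (0.0, 0.2, 0.10),
    (0.2, 0.2, 0.12),
])
print(pick)
```

The compliance of the cup and bellows is doing the work that a tilted approach vector would otherwise do, which is the whole argument.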

Part-handling and Assembly

I'm a bit conflicted on whether we should consider part-handling and assembly a more difficult robotics problem than bin-picking. On one hand, scope is fairly limited and usually the entire problem scene is well modeled and known ahead of time. On the other hand, the object pose relative to the end-effector is now significantly more critical, and the task may only be completable within a very narrow band of system states. However, a high degree of motion fidelity is not the same as range. As the integrator generally has significant control over the design of the support structures, functional steps, and additional tooling, the kinematic flow of the target part can be made to be incredibly simple, often locked to a single or limited set of nominal orientations.

While it's true that robotic arms may place components at some angle or perform some minor adjustments as part of the assembly motion, the trajectories are most commonly run open-loop, with end-effector modifications handling any slight deviations from the nominal trajectory where necessary. Even in applications like applying adhesive/primer to a vehicle windshield or de-flashing the plastic from an injection-mold part, the arm's primary motion should remain identical (or identical with some simple offset).

A lot of assembly demos you can find online nowadays largely show a robotic arm responsible for shuttling the part between various stations, each of which is custom-built for a particular task in the assembly sequence. The individual stations typically need alignment features and guides to ensure reliable operation, so the robotic part-handling element has some allowable error in dropping off and picking up the parts in question. We typically see gravity (or some other constant external force like a driven conveyor surface) leveraged to further drive the part against physical hardstops to better guarantee the system state, avoiding the need for the robotic subsystem to actively detect and adjust for errors mid-task.

What really troubles me (as much as this thought experiment should trouble anyone) is that we've had highly complex manufacturing assembly lines devoid of robotic arms for ages now, and as far as the fundamental kinematic motions are concerned, I'm at a loss for explaining what significant benefit having a serial-chain arm brings to the task. I feel that part of a possible explanation is that we have a tendency (and aptitude) to break down tasks into simpler motions that typically end up being rectilinear or grid-like. Take additive manufacturing for example (maybe one of the more freeform robotic assembly processes): three-dimensional shapes with any sort of curvature are simply resolved to stacks of 2D contours without (arguably) significant loss in the end result.
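As a toy illustration of that last point, here is how a sphere (about as curved as shapes get) collapses into a stack of flat circles once you fix a layer height. This is only a sketch: the slice_sphere helper and the numbers are made up, and real slicers work on arbitrary meshes rather than a closed-form shape.

```python
import math
from typing import List, Tuple


def slice_sphere(radius: float, layer_height: float) -> List[Tuple[float, float]]:
    """Return (z, contour_radius) per layer for a sphere resting on z = 0."""
    n_layers = math.ceil(2 * radius / layer_height)
    slices = []
    for i in range(n_layers):
        # Sample each layer at its mid-height, clamped to the top of the sphere.
        z = min((i + 0.5) * layer_height, 2 * radius)
        dz = z - radius  # vertical offset from the sphere's centre
        slices.append((z, math.sqrt(max(radius ** 2 - dz ** 2, 0.0))))
    return slices


# Assumed example: 10 mm radius sphere, 2 mm layers.
layers = slice_sphere(10.0, 2.0)
for z, r in layers:
    print(f"z = {z:5.1f} mm  circle radius = {r:6.3f} mm")
```

Each layer is a purely planar toolpath; the only motion out of the plane is a fixed step in z between layers.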

Now, does this mean that we, as the literal puppet masters of these high-dof positioning tools, lack the imagination to maximize our tools' potential? Or are physical tasks really much simpler than we initially assume?

Where We (May) Need 6-DOF

The easy cop-out answer (for me) is that when the objects and environment get increasingly unstructured, or our system state is especially prone to errors, increased kinematic reachability becomes a lot more necessary. Mobile robotic platforms like TRI's system, PAL Robotics' TIAGo, or any of the DARPA Robotics Challenge entrants probably would not get very far without dexterous manipulators. The arm motion in those cases (I think) is necessarily predicated on the system's estimation of the world state and needs to compensate for the mobility system's shortcomings. That said, I suppose I could point to Kevin Robot (built on Care-O-Bot's base) and Hello Robot's Stretch as counterpoints where the platform makes do with much simpler arms by either limiting the scope of the problem or leveraging the mobility system as part of the manipulation task.

Beyond adaptive motions/tasks, I would be excited to see applications that employ the same arm for multiple operations in a single setup, to maximize the utilized uptime of the machine. Even in manufacturing lines running 24/7, it's not uncommon to see arms/gantries paused between steps. Various vendors sell multi-gripper setups, primarily to increase throughput, and many systems have implemented multi-tool setups to avoid multiple arms or tool changers in executing more complex tasks, but I don't think it's too much of a stretch to shift an otherwise idle arm to a secondary task. In that scenario, at least we'd be utilizing the arm for multiple toolpaths.

However, I'd still say that sensory feedback is key to fully utilizing the range of motion that high-dof arms have to offer. Several RaaS startups in recent years have focused their offerings on high-mix/low-volume welding, polishing, and grinding jobs: contact-rich tasks that ideally require active probing during execution. These examples require various degrees of motion compensation that may be difficult to predict a priori (maybe less so in the welding case). I also feel that the auxiliary motion needed in these operations (and I'll admit that whether it's truly needed can still be open to debate) should be driven by the sensory elements themselves, not an arbitrary offset or buffer decided by the human operators before the task starts. At the end of the day, these sorts of tasks (imo) just aren't that prevalent yet, and so I continue to be more than a bit befuddled that integrators don't do more with modular linear robots or reconfigurable gantries.