We present a novel low-cost assistive robotic feeding system developed in collaboration between UC Berkeley's Recursive Pioneers and Tukuypaj, a Chilean non-profit organization serving individuals with severe motor disabilities. We learnt that the organization often lacks the staff needed to feed its quadriplegic community during mealtimes. Our primary project objective was therefore to build a robotic system that can reach for food and feed quadriplegic beneficiaries. We aspired to create a robotic system that grants the quadriplegic community greater self-sufficiency, while devising a solution that can be replicated at scale. Our work demonstrates that capable feeding assistance can be achieved using affordable hardware - specifically the SO-ARM100 robotic arm platform ($110 per arm), which we enhanced through custom modifications and learned control. Prior to system development, we conducted extensive interviews with Tukuypaj's beneficiaries and caregivers, gathering crucial insights about the challenges and requirements of assisted feeding, which directly informed our technical specifications and design choices. Our technical implementation leverages the LeRobot framework for data collection and an Action Chunking with Transformers (ACT) policy trained on demonstration data, exploring the feasibility of combining imitation learning with affordable hardware to create accessible robotic solutions. Initial development focused on establishing reliable control mechanisms and movement patterns using the LeRobot framework's tools for robot control and training, with an emphasis on system stability and movement precision while preserving the low-cost advantage of the SO-ARM100 platform.
Through systematic testing in controlled environments, we are laying the groundwork for future real-world applications, representing an important step toward making assistive robotics technology more accessible through cost-effective solutions. By combining affordable hardware with sophisticated control frameworks, we demonstrate the potential for bringing advanced robotics assistance to resource-constrained settings, setting the stage for future field testing and deployment in therapeutic environments across Tukuypaj's centers and similar care facilities worldwide.
The first steps of our project involved going through the design process: conducting interviews to gather data on the problem and to gain insight into the target users we were designing for. We learnt two key insights about the beneficiaries during the interview process:
The desired functionality of our system is to create path trajectories along which the robot brings food to the beneficiary's mouth, while enabling the beneficiary to initiate the movement on their own. Our main design criterion was a simplified system that is easy to implement and robust in reproducibility and reliability, while not deviating too far from the beneficiaries' usual feeding routine, so as to preserve a sense of familiarity and ease adoption. The design we chose aligns with this design philosophy.
We first considered constrained situations with fixed types of food and fixed relative positions between the food item and the beneficiary. We knew that when our robotic system was deployed, it would need to perform its function in various environments and situations. Furthermore, we also wanted direct human input over the path trajectories of the robot arm, so that caretakers can directly dictate how the robot brings the food to the beneficiary and personalize the trajectories for each of them. To fulfill these conditions, we decided our system should employ a path-planning setup with two robot arms - one that a caretaker moves to manually create the path trajectory, and another that replicates and stores the motion. We knew that building and sourcing hardware for two arms was much harder than for one, but we felt that this design choice would make the path-planning element of the project easier to implement and give caretakers greater influence over the movement of the robot arms as they interact with the beneficiaries in the physical world.
Secondly, we also wanted to create a bridge through which the beneficiaries can directly communicate with the robot system. Learning that they mainly communicate through facial expressions, we knew that incorporating computer vision to read those expressions was crucial to our design. Deciding on a communication mechanism was hard because, in reality, every beneficiary communicates their needs through facial expressions differently (some smile while others chuff), so we chose to standardize the signal: the robot system reacts to a beneficiary opening their mouth and brings the food to them.
In tandem, the overall flow of the use of the robot system was intended as follows: the caretakers can manually create the path trajectories, and the beneficiaries can execute the movement of the arms along these trajectories by opening their mouths.
We also aspired to incorporate a second tier into our design philosophy, where we considered dynamic situations involving different types of food, positions of food relative to the beneficiary, beneficiary interactions with the robot system, and so on. We wanted to create a form of general intelligence where the robot would "know" how to bring the food item to the beneficiary, regardless of the situation presented. Therefore, we wanted to add a learning component to the project. By creating various scenarios and using demonstrations of how the food is brought to the beneficiary, we hoped to apply imitation learning, in which the robot learns the intended action for a given scenario and develops a policy that enables it to autonomously create pathways and bring the food item to the beneficiary.
We felt that our design choices fulfilled our objectives: a simplified design that was efficient to implement and highly reproducible in executing what we wanted the system to do in serving the needs of the beneficiaries. We addressed both the constrained and the dynamic tiers of our design philosophy, at varying levels of difficulty, while keeping user-friendliness our top priority so that the resulting system would be durable. We were unsure how robust the system would be at executing the task in the real world, but felt it was a good starting point to build on.
We spent a considerable amount of time choosing a design for the robotic arm. We carefully reviewed and selected a few key constraints we wanted to apply to the robot:
Due to the time constraints of the class, we decided it was best to use an open-source robot found online. We eventually settled on the SO-ARM100, as it satisfied all of the requirements outlined above and fit the overall design philosophy we wanted to employ. The SO-ARM100 uses a teleoperation setup with two arms: a leader arm that a human operator moves to manually create the intended path trajectories, and a follower arm that copies the created motion synchronously (TheRobotStudio et al., 2024).
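At its core, a leader-follower teleoperation tick reads the leader arm's joint angles and commands the follower to the same pose. A minimal sketch of that loop is below; the read_leader_joints stub and FollowerArm class are illustrative stand-ins for the real Feetech servo-bus driver, and the six fixed joint angles are made-up values.

```python
# Hypothetical stand-in for the servo-bus read the real driver performs;
# here it returns a fixed six-joint pose so the sketch is self-contained.
def read_leader_joints():
    """Read the six leader-arm joint angles (degrees)."""
    return [0.0, 45.0, -30.0, 10.0, 0.0, 5.0]

class FollowerArm:
    """Minimal follower that stores the last commanded pose."""
    def __init__(self, n_joints=6):
        self.goal = [0.0] * n_joints

    def write_goal_positions(self, angles):
        self.goal = list(angles)

def teleop_step(follower):
    """One tick of the teleoperation loop: mirror the leader onto the follower."""
    angles = read_leader_joints()
    follower.write_goal_positions(angles)
    return angles

follower = FollowerArm()
teleop_step(follower)  # in practice this runs in a loop at tens of Hz
```

In the real system this loop runs continuously while a caretaker guides the leader arm, which is what makes trajectory creation feel direct and physical.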
The SO-ARM100 uses STS3215 servos, which have metal gears and magnetic encoders. With the encoders in the servos and the LeRobot codebase (running on Linux) used to operate the SO-ARM100, the rotation angle of each servo joint over time could be easily recorded, stored, and replayed. This step was crucial for simulating real-world scenarios accurately and aligning the robot's behavior with the actions and intentions of the caregiver.
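The record-store-replay capability described above can be sketched as follows. The TrajectoryRecorder class and its JSON storage format are illustrative assumptions, not the actual LeRobot dataset format; the real system reads the angles from the STS3215 magnetic encoders.

```python
import json

class TrajectoryRecorder:
    """Record timestamped joint angles so a motion can be stored and replayed."""

    def __init__(self):
        self.frames = []  # list of {"t": seconds, "angles": [degrees, ...]}

    def record(self, t, angles):
        self.frames.append({"t": t, "angles": list(angles)})

    def save(self, path):
        # Illustrative on-disk format; LeRobot uses its own dataset format.
        with open(path, "w") as f:
            json.dump(self.frames, f)

    def replay(self, send_goal):
        # Send each stored pose back to the servos in recorded order.
        for frame in self.frames:
            send_goal(frame["angles"])

rec = TrajectoryRecorder()
rec.record(0.00, [0, 45, -30])
rec.record(0.05, [1, 44, -29])
sent = []
rec.replay(sent.append)  # collect the poses a replay would command
```

Replaying a caretaker-created trajectory is then just iterating over the stored frames and re-commanding each pose.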
We printed the SO-ARM100 parts in PLA, a durable, accurate, and relatively cheap material. According to our calculations, both robots together (the follower and the leader) cost a total of 511 grams of PLA * $0.025/gram * 2 robots = $25.55 in filament. The servos are somewhat more expensive than a typical servo (roughly $30 each), but we agreed that the cost difference was worth both the increase in safety for the user and the longevity of the robot overall.
Properly tolerancing the prints quickly became the next challenge. The SO-ARM100 is designed to have the STS3215 servos press-fit into it to ensure precision. All parts were printed at Jacobs Hall on Prusa i3 MK3 printers over a span of 72 hours, with a total print time of around 48 hours.
The STS3215 servos were connected via a daisy chain and configured with unique IDs. We utilized scripts from the official repository (Alibert et al., 2024) for:
For the mouth-open detection system, we adapted an existing camera mount design built for the Innomaker 720p USB 2.0 UVC camera, modifying it to fit the SO-ARM100 (Chekuri, 2024). We learnt that the camera needed to sit right next to the gripper so that we could capture footage of the gripper opening and closing. This provides visual feedback for the robot's operation and user interaction.
To develop a robust and reliable robotic feeding assistant, we implemented a comprehensive data collection strategy utilizing our teleoperation setup. The system was designed to capture both the physical movements of the robot and visual feedback from multiple perspectives, enabling us to create a rich dataset for training and validation purposes.
Our data collection setup consisted of two strategically placed cameras:
This dual-camera configuration allowed us to simultaneously monitor the robot's precise movements and maintain a comprehensive view of the interaction space. The overhead camera proved particularly valuable for pick-and-place tasks, providing clear visibility of object positions and the target locations.
During each teleoperated session, we recorded:
The data collection process yielded approximately 300 episodes across various tasks, including pick-and-place operations and feeding motions with baby carrots. Each episode was automatically processed to calculate key statistical metrics, including mean trajectories and standard deviations, which were subsequently uploaded to Hugging Face for further analysis.
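The per-timestep mean trajectories and standard deviations mentioned above can be computed as sketched below. The toy single-joint episodes are illustrative; the real pipeline computes these statistics over full multi-joint, multi-camera episodes before uploading to Hugging Face.

```python
from statistics import mean, pstdev

def episode_stats(episodes):
    """Per-timestep mean and (population) standard deviation of a joint
    angle across episodes of equal length.

    episodes: list of trajectories, each a list of angles per timestep.
    """
    T = len(episodes[0])
    means = [mean(ep[t] for ep in episodes) for t in range(T)]
    stds = [pstdev([ep[t] for ep in episodes]) for t in range(T)]
    return means, stds

# Three toy 4-step episodes of a single joint angle (degrees):
eps = [[0, 10, 20, 30],
       [2, 12, 22, 32],
       [4, 14, 24, 34]]
means, stds = episode_stats(eps)
```

Statistics like these make it easy to spot outlier demonstrations: an episode whose trajectory strays several standard deviations from the mean is a candidate for re-recording.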
To ensure data quality, we implemented a rigorous validation process:
The replay verification step was particularly crucial, as it allowed us to confirm the accuracy of our recorded trajectories and ensure that the robot could faithfully reproduce the intended motions. This double-validation approach significantly enhanced the reliability of our dataset.
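A replay verification check of this kind can be sketched as a per-joint, per-timestep error bound between the recorded and replayed trajectories. The 2-degree tolerance and the trajectory values below are illustrative assumptions, not the project's actual thresholds.

```python
def verify_replay(recorded, replayed, tol_deg=2.0):
    """Accept a replay when every joint at every timestep lands within
    tol_deg of the recorded motion. Both arguments are lists of
    per-timestep joint-angle lists. Returns (passed, worst_error)."""
    errors = [abs(a - b)
              for rec_t, rep_t in zip(recorded, replayed)
              for a, b in zip(rec_t, rep_t)]
    worst = max(errors)
    return worst <= tol_deg, worst

recorded = [[0, 45], [1, 44], [2, 43]]
replayed = [[0.5, 45.2], [1.1, 43.8], [2.0, 42.5]]
ok, worst = verify_replay(recorded, replayed)
```

Running this check after every recorded episode is one way to implement the double-validation described above: an episode only enters the dataset if its replay passes the bound.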
The elalcance directory contains five files in total; the __pycache__ and __init__.py files allow functions created within this folder to be exported and executed elsewhere. The mouth_recognition.py file opens a window with a live demonstration of the code implementing the face mesh for mouth-open detection, and mouth_recog.py wraps that code in an exportable function named mouth_open_activate, which returns a boolean value: it returns True only when it detects an open mouth, and while the mouth remains closed the program continues to run.
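The decision rule inside a mouth_open_activate-style function can be sketched as a lip-gap-to-mouth-width ratio over face-mesh landmarks. The landmark coordinates and the 0.35 threshold below are illustrative assumptions; the real mouth_recog.py obtains its landmarks from a face-mesh library and may use a different rule.

```python
import math

def mouth_open_activate(upper_lip, lower_lip, mouth_left, mouth_right,
                        ratio_threshold=0.35):
    """Return True when the mouth is open.

    Takes (x, y) pixel coordinates of four lip landmarks (as a face-mesh
    library such as MediaPipe would supply) and compares the vertical lip
    gap to the mouth width, making the check roughly scale-invariant:
    a face closer to the camera has a bigger gap but also a bigger width.
    """
    gap = math.dist(upper_lip, lower_lip)
    width = math.dist(mouth_left, mouth_right)
    return width > 0 and gap / width > ratio_threshold

# Illustrative landmark positions: mouth open vs. mouth closed.
opened = mouth_open_activate((100, 200), (100, 220), (80, 202), (120, 202))
closed = mouth_open_activate((100, 200), (100, 205), (80, 202), (120, 202))
```

Normalizing by mouth width is what lets a single threshold work across beneficiaries seated at different distances from the camera.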
(Please refer to our README.md for more information on the codebase) Through this systematic approach to data collection and validation, we established a comprehensive dataset that effectively captures the nuances of human-guided robotic manipulation. This data forms the foundation for developing robust path-planning algorithms and ensuring consistent, reliable performance in assistive feeding scenarios.
We implemented Action Chunking with Transformers (ACT) policy training using a comprehensive dataset collected from our teleoperation setup. The training process was conducted using Google Colab's computational resources, although the extensive nature of the transformer architecture resulted in significant training durations. Our primary focus was the pick-and-place task, for which we had collected robust demonstration data.
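ACT policies predict chunks of future actions rather than single steps, and overlapping chunks are typically combined at execution time with an exponentially weighted temporal ensemble. The sketch below shows that combination step for scalar actions; the weighting w_i = exp(-m*i), with i = 0 for the oldest prediction, follows the original ACT formulation, while the chunk values and m = 0.1 are illustrative.

```python
import math

def ensembled_action(chunks, t, m=0.1):
    """Temporal ensembling of overlapping ACT action chunks.

    chunks: list of (start_timestep, [a_start, a_start+1, ...]) predictions,
    ordered oldest first. Every chunk that covers timestep t contributes
    its action for t; contributions are averaged with exponential weights
    w_i = exp(-m * i), i = 0 for the oldest prediction, so earlier
    predictions dominate and the executed motion stays smooth.
    """
    acts = [chunk[t - start] for start, chunk in chunks
            if start <= t < start + len(chunk)]
    weights = [math.exp(-m * i) for i in range(len(acts))]
    return sum(w * a for w, a in zip(weights, acts)) / sum(weights)

# Two overlapping chunks both predict an action for timestep 1:
a = ensembled_action([(0, [1.0, 2.0, 3.0]), (1, [2.5, 3.5])], t=1)
```

In the full system this averaging runs per joint at every control step, trading a little reactivity for noticeably smoother arm motion.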
The training pipeline incorporated several key components:
Our approach to policy conditioning followed a structured data collection template:
The language conditioning was implemented through semantic signal association. For instance, we collected 50 episodes with the signal "feed carrot" and another 50 with "feed cherry," enabling the model to develop object-specific manipulation strategies. This methodology creates a semantic mapping between natural language commands and corresponding manipulation behaviors.
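The signal-to-episode association described above amounts to grouping demonstrations by their language label before training. A minimal sketch, with illustrative episode dictionaries:

```python
from collections import defaultdict

def group_by_signal(episodes):
    """Group demonstration episodes by their language signal so a policy
    can be conditioned on commands like "feed carrot" vs. "feed cherry".
    Each episode is a dict with a "signal" string and a "trajectory"."""
    grouped = defaultdict(list)
    for ep in episodes:
        grouped[ep["signal"]].append(ep["trajectory"])
    return grouped

# Illustrative mini-dataset; the real one holds ~50 episodes per signal.
episodes = [
    {"signal": "feed carrot", "trajectory": [[0, 45], [1, 44]]},
    {"signal": "feed cherry", "trajectory": [[0, 50], [2, 48]]},
    {"signal": "feed carrot", "trajectory": [[0, 46], [1, 45]]},
]
grouped = group_by_signal(episodes)
```

Keeping the label attached to every episode is what lets the trained model associate each command string with its object-specific manipulation strategy.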
Based on insights from interviews with beneficiaries and their families at the non-profit organization, we identified mouth opening as a critical signal for feeding initiation. Our system architecture comprises sophisticated computer vision components working in concert with robotic control systems.
The technical implementation includes:
The training process yielded interesting insights into the challenges of robotic manipulation learning:
The observed noise in the loss function can be attributed to several factors:
To ensure the reliability of our trained policies, we conducted extensive validation through physical replay on the original setup, verifying the model's ability to generalize across different scenarios while maintaining safety constraints.
Our implemented system successfully demonstrated the core functionality of trajectory storage and replication, activated through a computer vision-based mouth-opening detection system. The current implementation achieves single-trajectory actuation per mouth-open event, though further development is required to enable multiple trajectory replications across repeated activations.
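Single-trajectory actuation per mouth-open event amounts to triggering on the closed-to-open transition rather than on the open state itself, so holding the mouth open does not queue extra replays. A minimal sketch of that edge-triggered behavior (the frame sequence is illustrative):

```python
class MouthTrigger:
    """Fire exactly one replay per mouth-open event by detecting the
    closed-to-open transition (a rising edge) in the detector output."""

    def __init__(self):
        self.was_open = False

    def step(self, mouth_open):
        """Call once per camera frame with the detector's boolean output.
        Returns True only on the frame where the mouth first opens."""
        fire = mouth_open and not self.was_open
        self.was_open = mouth_open
        return fire

trigger = MouthTrigger()
frames = [False, True, True, True, False, True]  # per-frame detector output
fires = [trigger.step(f) for f in frames]  # two distinct open events
```

Extending this to multiple trajectory replications across repeated activations would mainly mean resetting the replay state machine after each completed trajectory rather than after the first one.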
A significant technical challenge emerged in the end-effector design phase. Initial attempts focused on developing CAD-designed spoon holder attachments for the SO-ARM100's end effector, with the intention of enabling food scooping and delivery operations. However, practical testing revealed considerable limitations in this approach:
Given resource and time constraints, we pivoted to direct food manipulation using the existing gripper system, focusing on handling discrete food items such as carrots or broccoli. While this approach demonstrated system viability, our user research indicates that spoon-based feeding remains the preferred method for beneficiaries. Future iterations should prioritize:
These improvements would align the system more closely with established feeding practices while maintaining the core benefits of our current implementation.
The picture above was taken at 5:30 am on the Tuesday of RRR week.
The entire team contributed countless hours with all-nighters for data collection, extensive robot maintenance, and shared many laughs along the way.
[1] TheRobotStudio, Moss, J., Cadene, R., & Alibert, S. (2024). TheRobotStudio/so-ARM100: Standard open arm 100. SO-ARM100. https://github.com/TheRobotStudio/SO-ARM100
[2] Alibert, S., Cadene, R., Soare, A., Gallouedec, Q., Zouitine, A., & Wolf, T. (2024). Lerobot/examples/10_use_so100.md at main · Huggingface/lerobot. LeRobot: State-of-the-art Machine Learning for Real-World Robotics in Pytorch. https://github.com/huggingface/lerobot
[3] Chekuri, S. R. (2024). Meetsitaram/koch-V1-1: A version 1.1 of the alexander koch low cost robot arm with some small changes. Low-Cost Robot Arm: Koch v1.1. https://github.com/meetsitaram/koch-v1-1
[4] Francesco C., Bryan S., Claire B., & Boris. (2024). Le-Alcance: A Fork of LeRobot for Low-Cost Robotic Learning. https://github.com/francescocrivelli/le-alcance