In a somewhat unsettling video shared by Boston Dynamics, Spot, the company’s robot dog, not only runs, jumps, and dances but also holds a conversation. Sporting a top hat, mustache, and googly eyes, Spot chats with staff members in a British accent while giving them a tour of the company’s facilities.
“Shall we embark on our voyage?” inquires Spot. “Our initial point of interest will be the charging stations, where Spot robots rest and recharge. Please, gentlemen, follow me.” As demonstrated in the video, Spot exhibits the ability to respond to queries and even simulates speech by opening its “mouth.”
Boston Dynamics got Spot to “talk” by using OpenAI’s ChatGPT API along with open-source large language models (LLMs) to generate its responses. The robot was fitted with a speaker and text-to-speech capabilities, while its “mouth,” in reality its gripper, was made to open and close like a puppet’s as it spoke.
Matt Klingensmith, principal software engineer at Boston Dynamics, says the team gave Spot a short script for each room in its facilities. Spot then combines these scripts with visual data from its gripper and body cameras, giving it a fuller picture of its surroundings before it generates a response. According to Boston Dynamics, Spot uses visual question answering (VQA) models to caption images and answer questions about them.
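To make the idea concrete, here is a minimal Python sketch of how a room script, a camera caption, and a persona might be merged into a single text prompt for a language model. It is an illustration only, not Boston Dynamics’ code: the `Stop` class, the `build_prompt` function, and the prompt wording are all hypothetical, and the caption string stands in for whatever a VQA model would return.

```python
from dataclasses import dataclass


@dataclass
class Stop:
    """One tour location. All values used here are made-up illustration data."""
    name: str
    script: str  # short, human-written note about the room


def build_prompt(persona: str, stop: Stop, caption: str, question: str = "") -> str:
    """Merge persona, room script, and an image caption into one text prompt."""
    lines = [
        f"You are a robot tour guide. Persona: {persona}.",
        f"Current location: {stop.name}.",
        f"Location notes: {stop.script}",
        # In a real system this caption would come from a VQA/captioning model.
        f"What the cameras see: {caption}",
    ]
    if question:
        lines.append(f"A visitor asks: {question}")
    lines.append("Reply in one or two spoken sentences.")
    return "\n".join(lines)


stop = Stop("charging stations", "Spot robots rest and recharge here.")
prompt = build_prompt("fancy butler", stop, "a row of docks with robots on them")
print(prompt)
```

The resulting string would then be sent to a language model, and the reply passed to a text-to-speech engine; both steps are left out here because they depend on the specific services used.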
During the video, Spot assumes various personae beyond that of a “fancy butler.” The quadrupedal robot adopts the identities of a 1920s archaeologist, a teenager, and a Shakespearean time traveler, and even assumes a sarcastic demeanor, as evidenced when asked to compose a haiku: “Generator hums low in a room devoid of joy. Much like my soul.”
Boston Dynamics recounts several surprises from its experiments with Spot as a tour guide. For instance, when asked about its “parents,” Spot walked to a spot in the company’s office where older Spot models were on display. The company also acknowledges that the LLM sometimes fabricated information, for example by suggesting that Stretch, its box-moving robot, was designed for yoga.
Expressing enthusiasm, Klingensmith writes in a post on Boston Dynamics’ website, “We eagerly anticipate further exploration at the junction of artificial intelligence and robotics. These models [LLMs] can furnish cultural context, general common sense knowledge, and versatility, which could prove invaluable for various robotic tasks. The ability to assign tasks to a robot merely through verbal communication would significantly reduce the learning curve associated with operating these systems.”
While the video may depict Spot as somewhat comical, one cannot help but ponder the dog-like robot’s capacity to open doors and observe people. Indeed, Spot serves as a tool employed by both the police and the military.