When ChatGPT gets it wrong
I recently tested ChatGPT with a list of things to do in San Francisco on a Wednesday. I listed the places I wanted to visit and the time I wanted to spend at each location. Having done these things before, I wanted to know how ChatGPT would allocate the time and sequence the route. I gave it a start time, all the bits in between, and a request to catch the sundown at Ocean Beach.
It got the route sequencing right, and it was shaping up to be a perfect Wednesday in the city. It included my preference for a two-hour museum visit to SFMOMA and lunch at the museum cafe. Except the museum is closed on Wednesdays, and that fact is published everywhere online. That misguided recommendation would have amounted to a three-hour waste of time.
When this error was brought to ChatGPT’s attention, it responded: “I recommended SFMOMA earlier based on general museum hours, which I verified afterward. Upon further checking, I found that the museum is closed on Wednesdays. My initial recommendation didn’t account for the specific day of your visit, but now you can adjust your itinerary accordingly. Let me know if you’d like alternatives for your Wednesday visit!”
What puzzled me the most was the line, “My initial recommendation didn’t account for the specific day of your visit…” I assumed that specifying the day of the week would help set filters for the recommendation.
ChatGPT informed me of the Wednesday closure only after I caught the error and asked about the hours.
So I changed the day to Thursday, and the revised recommendation kept the two-hour museum visit from 10:15 AM to 12:15 PM, with a cafe lunch from 12:15 PM to 1:15 PM. Except SFMOMA does not open until noon on Thursdays, a fact it already knew. So even though it describes me as “exploring the museum’s art exhibitions” for two hours, I would have had only 15 minutes after the museum opened before running to lunch at 12:15 PM. This is basic math, folks.
Even though ChatGPT was aware that SFMOMA opened at noon on Thursdays, the timeline still had me "touring the museum" starting at 10:15 AM.
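The check that was skipped is small enough to sketch in a few lines of Python. This is only an illustration of the arithmetic, not a description of how ChatGPT works. The function name `usable_minutes` and the `OPENING_HOURS` table are made up for this example, and the Thursday closing time is a placeholder assumption; only the Wednesday closure and the noon Thursday opening come from the experience described above.

```python
from datetime import datetime

# Hypothetical opening-hours table. Only two facts come from the article:
# closed on Wednesdays, opens at noon on Thursdays. The 20:00 closing
# time is an assumed placeholder, not a verified value.
OPENING_HOURS = {
    "Wednesday": None,                 # closed all day
    "Thursday": ("12:00", "20:00"),    # closing time is an assumption
}


def _parse(hhmm):
    """Turn an 'HH:MM' string into a datetime so times can be compared."""
    return datetime.strptime(hhmm, "%H:%M")


def usable_minutes(day, start, end):
    """Return how many minutes of a planned visit fall inside opening hours.

    Days that are closed, or missing from the table, count as zero.
    """
    hours = OPENING_HOURS.get(day)
    if hours is None:
        return 0
    open_t, close_t = (_parse(t) for t in hours)
    overlap_start = max(_parse(start), open_t)
    overlap_end = min(_parse(end), close_t)
    minutes = int((overlap_end - overlap_start).total_seconds() // 60)
    return max(0, minutes)


# The two itineraries from the experiment: a 10:15 AM to 12:15 PM visit.
print(usable_minutes("Wednesday", "10:15", "12:15"))  # 0
print(usable_minutes("Thursday", "10:15", "12:15"))   # 15
```

A planner that ran a comparison like this before presenting the schedule would have flagged both days: zero usable minutes on Wednesday and 15 of the requested 120 on Thursday.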
ChatGPT kept signing off with “Let me know if you’d like to adjust any part!” So I tested its trip-planning abilities by asking it to squeeze in a stop at Tartine Bakery, which it did, but it also dropped my museum lunch without asking or bringing it to my attention. When I asked why the lunch had disappeared, it cheerily replied, “You're right! I unintentionally dropped lunch at Café 5 in SFMOMA. Here's an updated itinerary with that included.”
Another omission from ChatGPT: it dropped lunch from the itinerary without notice.
This is unlike tricking Google’s AI into recommending glue in pizza sauce to keep the cheese from sliding off. Getting basic operating hours wrong is a foundational error that erodes trust in the rest of the advice. Regardless of the sourcing, we need our recommendations to be trustworthy, accurate, and safe. Google claims to be able to plan, schedule, and rearrange complex trip segments, but based on this small, low-stakes experiment, I’m not prepared to relinquish control anytime soon.
*Caveat: This test was done in mid-September 2023, and ChatGPT will undoubtedly continue to evolve and learn.