Forgotten Polygons: Multimodal Large Language Models are Shape-Blind - err, oh, no they're not
Summary
In short, the recently published paper "Forgotten Polygons: Multimodal Large Language Models are Shape-Blind" was blown out of the water by OpenAI's o3 within a few days of its publication.
In Depth
The most problematic example, Arrow-Plus, which scored 0% in the paper, was solved perfectly by o3.
o3's output:
Thought for 2m 30s
The outline consists of 15 straight‑line segments, so the shape is a 15‑sided polygon (a pentadecagon).