07
Aug
Something’s off with that chart on the left. During its big GPT-5 livestream on Thursday, OpenAI showed off a few charts that made the model seem quite impressive — but if you look closely, some graphs were a little bit off. In one, ironically showing how well GPT-5 does in “deception evals across models,” the scale is all over the place. For “coding deception,” for example, the chart shown onstage says GPT-5 with thinking apparently gets a 50.0 percent deception rate, but that’s compared to OpenAI’s smaller 47.4 percent o3 score which somehow has a larger bar. OpenAI appears to…
