Google Gemini breaks the 90% mark on MMLU, which is beyond expert human level for this set of tests. For the first time, a large language model has passed this threshold on MMLU, a benchmark designed to be very difficult for AI. Gemini Ultra scored 90.04%; average humans are at 34.5% (AGI) while expert humans are at …

