Futurology

When AI is tested on questions it can't model from pre-existing answers on the internet, it only scores 10% in the test. (qz.com)

submitted 7 months ago by Lugh to c/futurology

15 comments

[–] Lugh 39 points 7 months ago (4 children)

Some people are naively amazed at AI scoring 99% in bar and medical exams, when all it is doing is reproducing correct answers from internet discussions of the exam questions. A new AI benchmark called "Humanity's Last Exam" has stumped top models. It will take independent reasoning to get 100% on this test. When that day comes, will it mean AGI is here?

[–] NuraShiny@hexbear.net 6 points 7 months ago (1 children)

No, because this test will now be discussed online, which invalidates it for that purpose.

[–] Lugh 8 points 7 months ago

They say their answer to this issue is that they've released only sample questions publicly, while the real questions are kept private.

https://agi.safe.ai/
