Got any Fun Problems that are Too Hard for ChatGPT? (www.linkedin.com)

Model Evaluation & Threat Research (METR) is offering a reward for new problems to test the abilities of large language models (LLMs).

They want new, unusual problems for which there are no answers, or even hints, anywhere on the Internet.

The tasks should be at least moderately hard and likely to stay hard over time, and the quality of a solution should be easy to evaluate.

Details on the types of problems they’re looking for and the bounties they are offering are here:

Bounty: Diverse hard tasks for LLM agents

 

#metr #llm #chatgpt #ModelEvaluationAndThreatResearch

Posted by Russell Brand

Russell has started three successful companies, one of which helped agencies of the federal government become very early adopters of open source software, long before that term was coined. His first project saved the American taxpayer 250 million dollars. In his work within federal agencies, he was often called “the arbiter of truth,” helping historically hostile groups and factions work together effectively towards common goals.

 
