farm-bitcoin.com
Reflection 70B’s performance questioned, accused of ‘fraud’
Technology


By admin | March 18, 2025 | 6 Mins Read


It took just one weekend for the new, self-proclaimed king of open source AI models to have its crown tarnished.

Reflection 70B is a variant of Meta’s Llama 3.1 open source large language model (LLM) (or wait, was it a variant of the older Llama 3?) that was trained and released by small New York startup HyperWrite (formerly OthersideAI) and boasted impressive, leading results on third-party benchmarks. It has now been aggressively questioned, as other third-party evaluators have failed to reproduce some of those performance measures.

The model was triumphantly announced in a post on the social network X by HyperWrite AI co-founder and CEO Matt Shumer on Friday, September 6, 2024 as “the world’s top open-source model.”

I’m excited to announce Reflection 70B, the world’s top open-source model.

Trained using Reflection-Tuning, a technique developed to enable LLMs to fix their own mistakes.

405B coming next week – we expect it to be the best model in the world.

Built w/ @GlaiveAI.

Read on ⬇️: pic.twitter.com/kZPW1plJuo

— Matt Shumer (@mattshumer_) September 5, 2024

In a series of public X posts documenting some of Reflection 70B’s training process, and in a subsequent interview with VentureBeat over X Direct Messages, Shumer explained more about how the new LLM used “Reflection Tuning,” a previously documented technique developed by researchers outside the company. The technique has LLMs check the correctness of, or “reflect” on, their own generated responses before outputting them to users, improving accuracy on a number of tasks in writing, math, and other domains.

However, on Saturday, September 7, a day after the initial HyperWrite announcement and VentureBeat article were published, Artificial Analysis, an organization dedicated to “Independent analysis of AI models and hosting providers,” posted its own analysis on X. It stated that “our evaluation of Reflection Llama 3.1 70B’s MMLU score” (a reference to the commonly used Massive Multitask Language Understanding benchmark) “resulted in the same score as Llama 3 70B and significantly lower than Meta’s Llama 3.1 70B,” showing a major discrepancy with the results HyperWrite and Shumer originally posted.

Our evaluation of Reflection Llama 3.1 70B’s MMLU score resulted in the same score as Llama 3 70B and significantly lower than Meta’s Llama 3.1 70B.

A LocalLLaMA post (link below) also compared the diff of Llama 3.1 & Llama 3 weights to Reflection Llama 3.1 70B and concluded the… pic.twitter.com/hqvFp2TyCC

— Artificial Analysis (@ArtificialAnlys) September 7, 2024

On X that same day, Shumer stated that Reflection 70B’s weights (the learned parameters of the open source model) had been “fucked up during the upload process” to Hugging Face, the third-party AI code hosting repository and company, and that this issue could have resulted in worse performance than HyperWrite’s “internal API” version.

We’ve figured out the issue. The reflection weights on Hugging Face are actually a mix of a few different models — something got fucked up during the upload process.

Will fix today. https://t.co/rKuOlTApRK

— Matt Shumer (@mattshumer_) September 7, 2024

On Sunday, September 8, 2024 at around 10 pm ET, Artificial Analysis posted on X that it had been “given access to a private API which we tested and saw impressive performance but not to the level of the initial claims. As this testing was performed on a private API, we were not able to independently verify exactly what we were testing.”

Reflection 70B update: Quick note on timeline and outstanding questions from our perspective

Timeline:
– We tested the initial Reflection 70B release and saw worse performance than Llama 3.1 70B.

– We were given access to a private API which we tested and saw impressive…

— Artificial Analysis (@ArtificialAnlys) September 9, 2024

The organization detailed two key questions that cast serious doubt on HyperWrite and Shumer’s initial performance claims, namely:

  • “We are not clear on why a version would be published which is not the version we tested via Reflection’s private API.
  • We are not clear why the model weights of the version we tested would not be released yet.

As soon as the weights are released on Hugging Face, we plan to re-test and compare to our evaluation of the private endpoint.”

All the while, users on various machine learning and AI Reddit communities, or subreddits, have also called into question Reflection 70B’s stated performance and origins. Some have pointed out that, based on a model comparison posted on GitHub by a third party, Reflection 70B appears to be a Llama 3 variant rather than a Llama 3.1 variant, casting further doubt on Shumer and HyperWrite’s initial claims.

This has led at least one X user, Shin Megami Boson, to openly accuse Shumer of “fraud in the AI research community” as of 8:07 pm ET on Sunday, September 8, posting a long list of screenshots and other evidence.

A story about fraud in the AI research community:

On September 5th, Matt Shumer, CEO of OthersideAI, announces to the world that they’ve made a breakthrough, allowing them to train a mid-size model to top-tier levels of performance. This is huge. If it’s real.

It isn’t. pic.twitter.com/S0jWT8rDVb

— Shin Megami Boson (@shinboson) September 9, 2024

Others accuse the model of actually being a “wrapper,” or application, built atop Claude 3, a proprietary, closed-source model from rival Anthropic.

However, other X users have spoken up in defense of Shumer and Reflection 70B, and some have posted about the model’s impressive performance on their end.

I know @mattshumer_ and this does not mesh with my understanding of him. He knows his stuff and is super pragmatic and works around problems in impressive ways that most people get bogged down on for months. I would say maybe give the guy a little more time before you say stuff…

— Sasha krecinic (@SashaKrecinic) September 9, 2024

Regardless, the model’s rollout, lofty claims, and now criticism show how rapidly the AI hype cycle can come crashing down.

For now, the AI research community waits with bated breath for Shumer’s response and updated model weights on Hugging Face. VentureBeat has also reached out to Shumer for a direct response to these allegations of fraud and will update when we hear back.
