Is OpenAI A Victim — DeepSeek And IP Rights — A Playbook For Leaders


Is OpenAI A Victim - Learn From Sam Altman's Playbook On How To Protect Your Business

Lutz Finger

Microsoft investigates whether DeepSeek improperly used OpenAI APIs? Wait, what? OpenAI’s ChatGPT has become the victim? It’s ironic. Didn’t OpenAI train on Forbes, NYT, and other sources? Essentially, we can assume that some of my articles helped OpenAI’s initial models. Now they complain that DeepSeek might have used them? The complaint about DeepSeek follows a classic playbook on how to protect a business.

Distillation — Small AI Learns From Big AI’s IP

The practice of one AI learning from another is called "distillation." Essentially, the model being trained (the student, DeepSeek’s model in this case) extracts knowledge from a teacher model, for example one of OpenAI’s. The process is very similar to how I teach students in my eCornell certificate program on AI: students ask an AI-Lutz questions, the AI-Lutz answers, and the students learn. LLMs training other LLMs first became widely known with Alpaca, which was 46,000 times (!) cheaper to train than GPT-3 because it used distillation. See my article on “Competitive Advantage of LLMs” or the discussion on “DeepSeek.”
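To make the idea concrete, here is a minimal sketch of the data-collection half of response-based distillation: ask a teacher questions, record its answers, and save the pairs as training examples for a smaller student. It is an illustration only, not a description of how DeepSeek or Alpaca actually worked. The names toy_teacher, build_distillation_set, and to_jsonl are hypothetical, the teacher is a stub rather than a real model, and the fine-tuning of the student on the resulting file is left out.

```python
import json
from typing import Callable, Dict, List


def toy_teacher(prompt: str) -> str:
    """Hypothetical stand-in for a large teacher model.

    A real setup would query an actual LLM here; this stub only
    echoes the prompt so the example runs on its own.
    """
    return f"Answer to: {prompt}"


def build_distillation_set(
    prompts: List[str], teacher: Callable[[str], str]
) -> List[Dict[str, str]]:
    """Ask the teacher every prompt and keep the (prompt, response) pairs."""
    return [{"prompt": p, "response": teacher(p)} for p in prompts]


def to_jsonl(examples: List[Dict[str, str]]) -> str:
    """Serialize the pairs as JSON Lines, one training example per line."""
    return "\n".join(json.dumps(example) for example in examples)


# Questions the student should learn to answer (illustrative only).
prompts = [
    "What is distillation?",
    "Why are smaller models cheaper to train?",
]

dataset = build_distillation_set(prompts, toy_teacher)
training_file = to_jsonl(dataset)
print(training_file)
```

The student never sees the teacher’s internals. It only sees questions and answers, which is why the classroom comparison fits and why this kind of learning is hard to tell apart from ordinary usage.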

We Don’t Know Whether DeepSeek Used Distillation — But It Is Simple To Do

DeepSeek has not publicly acknowledged this tactic, but I’d be surprised if it hadn’t used it. Technically, distillation is easy to do. One only needs the ChatGPT API or, if OpenAI blocks that approach, a simple chat client. It is also not uncommon. This type of ‘training’ has happened before: startups in the early 2000s learned to index the web simply by querying Google. If done carefully, it is easy to disguise and hard to detect. Yes, Microsoft will now investigate, but I would be equally surprised if we ever get proof of such tactics.

Is OpenAI The Victim? Violation Of IP Rights?

Surely, “distillation” violates most terms of service, but that did not stop OpenAI from committing the same kind of violation against news organizations such as Forbes or the NYT. Knowledge is free. I personally believe that this is good and the basis for much of our growth. On the other hand, it means that knowledge is hard to protect.

Classical Playbook To Protect Your Business

Let’s review some of Sam Altman’s core statements over the past few months. All of them come down to one focus: OpenAI is trying to protect its market position. That’s all.

AGI is here, and OpenAI is the only one who knows how to build it. This is OpenAI’s way of positioning itself as the thought leader. It has to, because others have better access to the market. We now see that OpenAI is not the only one: DeepSeek also knows how to build a reasoning model. Read more in “Sam Altman Says, ‘We Know How To Build AGI’ — Not So Fast.”
Stargate’s $500B investment: a message to competitors. This is OpenAI’s attempt to intimidate potential challengers with overwhelming financial backing. I voiced my skepticism, since the cost of training has come down faster than anyone thought possible. Read more in “Stargate: The $500 Billion Message To China — The Future Of AI Regulation.”
AI is extremely dangerous; only a select few should be allowed to develop it. This is OpenAI’s storyline: use safety concerns to justify a regulatory setup that creates barriers to entry and gives it more monopolistic power. Elon Musk used a similar tactic. When he realized that OpenAI was faster, he called for stopping investments. Those calls have quieted now that he is back in the driver’s seat.
From now on, models should respect IP rights. OpenAI ignored IP rights to train and gain a competitive edge. This message is meant to secure OpenAI’s advantage while shutting the door on newcomers.

The race has just begun.

