How Open Models Are Transforming AI Asset Creation


Kshitij Dixit, SaaS Founder at Zeo, YC Alum, is building AI-driven products used by over a million users globally.


A few years ago, frontier laboratories like OpenAI, Google DeepMind, Meta and Tencent dominated AI asset generation. They invested massively in computing power and proprietary data, and they generally keep their most advanced models behind application programming interfaces (APIs) and nondisclosure agreements.

Open-source initiatives struggled to compete at that scale. However, the landscape has shifted. Epoch AI's analysis found that open-weight models now trail closed-weight models by only three months on key capability indexes.

From my perspective as a founder and CEO, the improvement of open-source models can open several opportunities for creators and developers. These models allow you to self-host professional-grade asset generation tools without recurring fees or vendor lock-in.

Adopting these models, though, requires considering factors like what kind of support you need and whether you have the necessary skills in-house to manage the tools.

Let's take a look at a few of the key advancements in open-source models before I share some insights into what developers should know before implementing these tools.

Video Generation

Models like OpenAI's Sora and Kuaishou's Kling defined the standard of fluid motion and detail for video-generation models.

Open-source models have improved significantly in recent years. For example, Wan2.2 can handle complex motion and aesthetics, according to industry research from SiliconFlow. A WhiteFiber analysis found that LTX-Video could generate "24 FPS videos at 768x512 resolution at speeds faster than it takes to watch them on an NVIDIA H200."

Open-source models often don't require users to pay for access to AI inference services, which lowers running costs. Research from MIT found that the price of running inference is 87% less on open models.
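To make that percentage concrete, here is a minimal Python sketch of the arithmetic. The 87% reduction is the figure cited above; the monthly bill and the function name are hypothetical, and the calculation covers inference price only, not the hardware or staffing needed to self-host.

```python
def open_model_cost(closed_cost: float, reduction: float = 0.87) -> float:
    """Return the inference cost on an open model, given the closed-model
    cost and a fractional price reduction (0.87 means 87% cheaper)."""
    if not 0.0 <= reduction <= 1.0:
        raise ValueError("reduction must be between 0 and 1")
    return closed_cost * (1.0 - reduction)


# Hypothetical monthly bill for a closed API, in dollars.
monthly_api_bill = 10_000.0
print(f"Estimated open-model inference cost: ${open_model_cost(monthly_api_bill):,.2f}")
```

Any real comparison would add GPU rental or purchase, engineering time and monitoring to the open-model side before declaring a saving.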

Three-Dimensional Assets

3D asset generation models are designed to turn a photo or text prompt into a textured, ready-to-render 3D mesh for game engines or AR experiences.

Microsoft's TRELLIS.2 converts images into high-fidelity textured meshes using an OmniVoxel structure. Stability AI's TripoSR reconstructs complete objects from single photographs, outperforming alternatives on accuracy benchmarks.

When teams select open models for production work, data ownership is often a deciding factor. Third-party APIs require sharing inputs outside your infrastructure, where they’re subject to vendor policies. For teams working under NDAs or regulations like the General Data Protection Regulation, self-hosting can help ensure ownership of data.

Images And Textures

Image-generation models can create high-resolution visuals and textures based on prompts. Use cases range from ads and e-commerce listings to marketing content and in-game assets.

Open-source image generation, pioneered by Stable Diffusion, has been improving in recent years. For instance, Black Forest Labs' Flux 2 Dev is a text-to-image tool that can handle complex anatomy and high-resolution compositions through rectified flow transformers. Ubisoft's CHORD prototype generates complete physically based rendering material packs from text prompts.

Open models revolve around checkpoints, which are snapshots of model weights, usually fine-tuned by communities for specific aesthetics or business use cases.

Because many checkpoints are open-weight, you can tweak, inspect or even combine them, which makes iteration faster than working within a closed API. On some platforms, new checkpoints tailored to niches like vintage film styles or clean product renders show up nearly every day.
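One simple way to combine two checkpoints is to interpolate their weights linearly. The sketch below is an illustration under stated assumptions, not a description of any specific tool: plain Python lists stand in for tensors, both checkpoints are assumed to come from the same architecture, and the names `merge_checkpoints` and `alpha` are hypothetical.

```python
from typing import Dict, List

# A checkpoint is modeled here as parameter name -> flat list of weights.
Checkpoint = Dict[str, List[float]]


def merge_checkpoints(a: Checkpoint, b: Checkpoint, alpha: float = 0.5) -> Checkpoint:
    """Blend two checkpoints: alpha=0 returns a's weights, alpha=1 returns b's."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    if a.keys() != b.keys():
        raise ValueError("checkpoints must share the same parameter names")
    merged: Checkpoint = {}
    for name in a:
        if len(a[name]) != len(b[name]):
            raise ValueError(f"shape mismatch for parameter {name!r}")
        # Weighted average of each pair of corresponding weights.
        merged[name] = [(1.0 - alpha) * x + alpha * y for x, y in zip(a[name], b[name])]
    return merged


# Toy example with made-up weights for two style checkpoints.
film_style = {"layer.weight": [0.0, 2.0]}
product_style = {"layer.weight": [1.0, 4.0]}
print(merge_checkpoints(film_style, product_style, alpha=0.5))
```

With real models the same idea is applied tensor by tensor, and the blended result still needs to be evaluated, since averaging weights does not guarantee a usable model.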

Audio And Voice

With audio AI, there's more at stake than the data itself, including brand tone, customer interactions and regional nuance. Creating natural-sounding voices, whether for cloning, narration or syncing speech to video, used to require a trade-off between quality and data control.

However, open models have become better at handling subtle details like intonation and rhythm, which makes outputs feel much closer to real human speech. For instance, several open-weight models are now included in Artificial Analysis's leaderboard of the top-ranked text-to-speech models.

Since similar results are now available with self-hosted tools, open models deserve serious consideration, especially for teams managing privacy, compliance and long-term ownership concerns.

A Practical Roadmap For Adoption

Working with open-source models changes the shape of your responsibilities. You can gain control over data, customization and cost structure, but you also take on infrastructure, reliability and safety.

Swapping the tools is only part of the work. The main focus is on building the capabilities to manage these models in-house. Many open-weight models lack the support that frontier labs offer, and debugging them often demands internal machine learning expertise. In my experience, the initial setup can take weeks. The upside is full-stack control and innovation velocity.

Having used these models in my organization to ship SaaS products, here are five principles that can help when making the switch to open source:

• Pilot strategically. Start with a non-critical workflow. Open models need tuning and early outputs may not be production-ready. Testing internally helps evaluate quality and cost without risk.

• Secure infrastructure. Treat infrastructure as something you design. Open models run on your compute resources, so scaling, monitoring and cost control are your responsibility. Without limits, GPU and token costs can erase savings.

• Build guardrails. Open models often don’t come with built-in safeguards. Logging outputs, versioning models and adding human review can help to maintain reliability.

• Upskill teams. Open source blurs the line between engineering and creative work. Teams that understand both move faster and make better decisions.

• Track ROI rigorously. Look beyond cost per query. The real gains often show up in faster iteration, stronger data control and better compliance.
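The infrastructure and guardrail principles above can be sketched as a thin wrapper around a self-hosted model. This is a minimal illustration, not a production design: `GuardedGenerator`, its fields and the GPU-seconds budget are hypothetical names, and `generate` stands in for whatever model call your stack actually exposes.

```python
import hashlib
import time
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class GuardedGenerator:
    generate: Callable[[str], str]  # hypothetical call into a self-hosted model
    model_version: str              # recorded so every output is traceable
    gpu_seconds_budget: float       # hard spending limit for this workflow
    gpu_seconds_used: float = 0.0
    audit_log: List[Dict] = field(default_factory=list)

    def run(self, prompt: str, est_gpu_seconds: float) -> str:
        # Refuse the job before spending compute if it would exceed the budget.
        if self.gpu_seconds_used + est_gpu_seconds > self.gpu_seconds_budget:
            raise RuntimeError("GPU budget exceeded; raise the limit or wait")
        output = self.generate(prompt)
        self.gpu_seconds_used += est_gpu_seconds
        # Log enough to audit the output later and queue it for human review.
        self.audit_log.append({
            "timestamp": time.time(),
            "model_version": self.model_version,
            "prompt_sha256": hashlib.sha256(prompt.encode("utf-8")).hexdigest(),
            "output": output,
            "needs_human_review": True,
        })
        return output


# Stand-in model: upper-cases the prompt instead of generating an asset.
pilot = GuardedGenerator(generate=str.upper, model_version="demo-v1", gpu_seconds_budget=10.0)
print(pilot.run("vintage film texture", est_gpu_seconds=6.0))
```

Starting with a wrapper like this on a non-critical workflow gives you the cost and quality data needed for the pilot and ROI steps before anything reaches customers.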

While frontier labs still set the pace, open-source communities are closing the gap and bringing several key advantages to AI asset generation.

For teams that understand the requirements, open-source models can often reduce the risks of API dependency, such as rate limits, price hikes and data exposure. That is making them more competitive options as the AI landscape evolves.

As open-weight models catch up on leaderboards and in benchmark testing, the biggest consideration is whether organizations are ready to commit to the infrastructure responsibility of managing AI models in-house.

Forbes Technology Council is an invitation-only community for world-class CIOs, CTOs and technology executives.
