Gemma 4 12B, GPT-Rosalind & frame.md: This Week in AI

👋 Hello hello,

Your laptop is having a moment. Google just shipped a model you can actually run on it — no cloud, no monthly bill, no asking permission. Just 16GB of VRAM and you're in business.

OpenAI went full scientist. They upgraded GPT-Rosalind, their life sciences model, with agentic capabilities purpose-built for drug discovery. And HeyGen launched frame.md, a spec that finally teaches AI agents how to make branded videos (not just repurpose slide decks with a logo slapped on).

Let’s dig in.

🔥🔥🔥 Three Curated AI Updates

1. Google ships Gemma 4 12B — a full multimodal model that runs on your laptop

Most powerful AI models live in the cloud. You pay per token, your data leaves your machine, and latency is someone else's problem. Gemma 4 12B flips that.

It runs locally on just 16GB of VRAM and handles text, images, and audio natively. That last part is the real story. Traditional multimodal models bolt on separate encoders for each input type — one for vision, one for audio — which bloats memory and adds latency. Google removed those entirely. Vision processing now runs through the core model backbone. Audio signals get projected directly into the same space as text tokens.
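To make "projected into the same space as text tokens" concrete, here is a toy sketch in plain Python. It is not Gemma's code: the dimensions, the random weights, and the helper names (`make_matrix`, `project`) are all hypothetical, chosen only to show the shape of the idea. A single projection matrix turns audio frames into vectors of the same width as text embeddings, so both can sit in one sequence.

```python
import random

D_AUDIO = 4   # hypothetical size of one audio feature frame
D_MODEL = 6   # hypothetical width of the model's token embeddings


def make_matrix(rows, cols, rng):
    """Random matrix standing in for learned weights or input features."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def project(frames, weights):
    """Map each frame (length D_AUDIO) to a vector of length D_MODEL."""
    return [
        [sum(x * w for x, w in zip(frame, column)) for column in zip(*weights)]
        for frame in frames
    ]


rng = random.Random(0)
audio_frames = make_matrix(3, D_AUDIO, rng)      # 3 audio frames
text_embeddings = make_matrix(5, D_MODEL, rng)   # 5 text tokens, already embedded
projection = make_matrix(D_AUDIO, D_MODEL, rng)  # the only audio-specific weights here

audio_tokens = project(audio_frames, projection)

# Both modalities now share one width, so they can form a single sequence
# for one backbone to process.
sequence = text_embeddings + audio_tokens
print(len(sequence), len(sequence[0]))  # prints: 8 6
```

The point of the design: there is no separate audio model to load, only a small projection in front of the shared backbone.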

The result is a 12B-parameter model that performs close to Google's larger Gemma models, at a fraction of the footprint. It's open under Apache 2.0, so you can use it, modify it, and ship with it.
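Curious how a 12B model squeezes into 16GB? Here is a back-of-envelope sketch for the weights alone. The assumptions are ours, not Google's: exactly 12 billion parameters, 1 GB counted as 1e9 bytes, and nothing budgeted for activations or context cache. We also don't know which precision the 16GB figure assumes.

```python
PARAMS = 12e9  # assumption: "12B" taken as exactly 12 billion parameters
VRAM_GB = 16   # the figure quoted above

# Bytes per parameter at common precisions (weights only).
PRECISIONS = {"fp16/bf16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_gb(params, bytes_per_param):
    """Weight memory in gigabytes (1 GB = 1e9 bytes)."""
    return params * bytes_per_param / 1e9


for name, size in PRECISIONS.items():
    gb = weight_gb(PARAMS, size)
    verdict = "fits" if gb <= VRAM_GB else "does not fit"
    print(f"{name}: {gb:.0f} GB of weights, {verdict} in {VRAM_GB} GB")
```

By this rough math, 16-bit weights alone come to 24 GB, so running on a 16GB card most likely means a quantized build (8-bit or lower). Treat that as an estimate, not a spec.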

2. HeyGen launches frame.md — branded video, finally in spec form

If you've used design.md, you know the idea: give AI your brand rules once, and it keeps everything consistent. HeyGen saw a problem — when agents tried applying design.md to video, they just recreated the assets as webpages or decks. Useful. But not video.

frame.md is built to fix that. It's a spec for video and motion that teaches agents what your brand actually looks and moves like. Feed it your design.md and it converts it into motion-ready instructions that agents can act on.

This matters because branded video has always been the hard part of AI content workflows. Everything else scaled. Video stayed manual. This is a real step toward changing that.

3. OpenAI upgrades GPT-Rosalind with agentic reasoning for life sciences

GPT-Rosalind is OpenAI's model built specifically for life sciences research at enterprise scale. It was already strong. Now it's stronger — and more autonomous.

The upgrade brings GPT-5.5's agentic coding and tool use into Rosalind's core. That means it can now reason through multi-step workflows: drug discovery, experimental design, data analysis, and molecular work. It doesn't just answer questions. It runs tasks.

This is the kind of capability that used to require a full team of bioinformaticians and a lot of compute budget. Now it's a model call.

🔥🔥 Two Pro AI Tools

1. 🫘 Dreambeans by Google Labs

Dreambeans is a personalized daily digest that connects to your Google apps — Gmail, Calendar, Search history — and surfaces things you'd otherwise scroll past. Every day it builds a curated set of stories and topics based on what's actually relevant to you.

It's built on Google's Personal Intelligence tech, so it learns from your existing data rather than asking you to set up preferences from scratch. For now it's available only to Google AI Ultra users in the US, and there's an open waitlist.

2. 🎨 Ideogram 4.0

Ideogram 4.0 just dropped, and it debuted as the top open-weight text-to-image model on DesignArena's third-party leaderboard, ahead of every other open model listed. It generates images natively at 2K resolution, handles dense text accurately inside images, supports transparent backgrounds, and gives you precise layout control through bounding-box prompting. You can download the weights, fine-tune on your own data, and run it on your own hardware. Full data privacy, no cloud dependency. It's live on every Ideogram plan and the API today.

🔥 Things You Didn’t Know You Could Do With AI

Turn any video's animations into working code

MagicPath is a shared workspace where humans and agents collaborate on creative and design work. One of its lesser-known features is external agent support, which lets tools like OpenAI's Codex interact directly with MagicPath to generate and manipulate animations.

Here's how to do it:

1. Open Codex and upload or paste the video you want to recreate.
2. Tell it: "Recreate these animations in MagicPath."
3. Codex analyzes the motion, maps it to MagicPath's animation system, and writes the implementation.
4. Review the output in MagicPath, then adjust timing or easing as needed.
5. Export or hand off directly from MagicPath.

No manual keyframing. No frame-by-frame analysis. Just describe what you want and watch it build.
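If "timing" and "easing" are new terms, this small Python sketch shows what you'd be adjusting in the review step. It is a generic illustration, not MagicPath's or Codex's actual output: the function names and the 10 fps default are our own. Timing is how long the motion lasts; easing is the curve that decides how fast it moves along the way.

```python
def ease_in_out_cubic(t):
    """Map linear progress t in [0, 1] to eased progress in [0, 1]."""
    if t < 0.5:
        return 4 * t ** 3
    return 1 - ((-2 * t + 2) ** 3) / 2


def animate(start, end, duration_s, fps=10):
    """Return eased values from start to end, one per frame."""
    frames = round(duration_s * fps)
    return [
        start + (end - start) * ease_in_out_cubic(i / frames)
        for i in range(frames + 1)
    ]


values = animate(start=0.0, end=100.0, duration_s=0.4)
print(values)  # prints: [0.0, 6.25, 50.0, 93.75, 100.0]
```

Notice the slow start and slow finish with a fast middle. Changing `duration_s` changes the timing; swapping the curve changes the easing.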


Until next time,
Team @PracticalyAI
