
🧠 NotebookLM Cinematic Videos, Perplexity Voice Mode, and Microsoft’s Phi-4 Vision

Today in AI: research notebooks turn into films, Perplexity gets hands-free computing, and Microsoft reveals lessons from training a multimodal reasoning model.

👋 Hello hello,

The most interesting AI updates lately aren’t always giant new apps. Sometimes they show up as small features that quietly change how we work.

This week is a good example. One research notebook now generates cinematic videos, a search engine is turning into a voice-driven assistant, and Microsoft is sharing how it trained a model that can reason across images and text.

Let’s get into it.

🔥🔥🔥 Three Highly Curated AI Stories

NotebookLM just launched Cinematic Video Overviews, a new feature inside NotebookLM Studio that turns your source materials into immersive videos.

Unlike the standard templates people are used to, these videos are generated using a combination of Google’s advanced models. The goal is to transform research documents and source material into a more engaging visual narrative.

This could make NotebookLM even more useful for people who work with large research sets. Instead of reading long summaries, you could generate a visual overview that explains the material. The feature is rolling out now to Ultra users in English.

Perplexity continues pushing beyond search. The company just introduced Voice Mode inside Perplexity Computer, which allows users to interact with the system by simply speaking.

The idea is straightforward. Instead of typing instructions, you can talk to the assistant and have it carry out actions.

Voice interfaces are becoming a common direction for AI products. As these systems become more capable, the keyboard becomes less central to how we interact with them.

Microsoft Research published a detailed look at Phi-4 Reasoning Vision, a multimodal reasoning model that can work across text and visual inputs.

The research focuses on the training process behind the model and the challenges involved in teaching systems to reason about images and language together.

For developers and researchers, the blog offers insight into how multimodal reasoning models are built and what lessons emerged during training.

🔥🔥 Two Pro Tips Worth Knowing

This creator built a Claude skill that generates full presentation slides directly on the web, and the results are surprisingly polished.

Instead of manually designing slides, Claude interviews you about the visual style you want. It then generates multiple design directions with animations, transitions, and interactive elements.

Because the slides are web-based, they automatically adapt to different screen sizes. The tool can also convert existing PPTX files into web-based slides while preserving images and brand assets.

Another example, shared by a creator on X, is a custom X video downloader skill. Instead of searching for random download sites, the user asked Claude Code to create a small automation.

Now they simply paste an X post URL, and the script automatically downloads the video into a designated folder. It’s a simple example, but it shows how AI coding assistants can help you create small utilities that remove repetitive tasks. Access the skills file here.
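The newsletter doesn't include the script itself, but here is a minimal sketch of how such a utility could work — assuming the `yt-dlp` command-line tool (which supports X post URLs) is installed. The function names and default folder are illustrative, not taken from the original skill:

```python
import subprocess
from pathlib import Path


def build_download_command(post_url: str, out_dir: str) -> list[str]:
    """Build a yt-dlp command that saves the video from an X post into out_dir."""
    target = Path(out_dir).expanduser()
    target.mkdir(parents=True, exist_ok=True)  # create the folder if it's missing
    return [
        "yt-dlp",
        "-o", str(target / "%(id)s.%(ext)s"),  # name the file after the post ID
        post_url,
    ]


def download_x_video(post_url: str, out_dir: str = "~/Videos/x-downloads") -> None:
    """Run yt-dlp and raise if the download fails."""
    subprocess.run(build_download_command(post_url, out_dir), check=True)
```

Wrapped in a Claude skill, this is the whole workflow: paste a URL, the assistant calls the script, and the video lands in the designated folder.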

When people use AI coding tools, they often jump straight into building. A better starting point is collecting design inspiration first.

The SuperDesign.dev prompt library makes this easy. It contains prompts that generate interactive website layouts, which you can reuse in your own projects.

1. Visit the SuperDesign prompt library and browse through the available website layouts.
2. Pick a design you like that matches the style you want.
3. Copy the prompt used to generate that design.
4. Paste the prompt into your AI coding tool and customize the result for your project.

This simple step gives you strong design direction before you start building with vibe-coding tools.

Before you go, did today's newsletter help you stay ahead?


💬 Quick poll: What’s one task you’d want AI to run automatically for you?

Until next time,
Kushank @PracticalyAI
