Happy Lunar New Year of the Horse: NVIDIA AI That Talks With Us, Google AI That Creates Music with Lyrics for Us

Jack Lau
February 20
3 min read

First of all, Happy Lunar New Year of the Horse to all readers.

I have always liked the symbolism of the Horse — energy, motion, and momentum, but also the importance of direction. Technology often runs fast; leadership is about deciding where it should go.

Recently, two AI developments caught my attention. On the surface they look unrelated, but together they tell an interesting story about where AI is heading.

One is NVIDIA’s PersonaPlex, a conversational voice AI that sounds remarkably natural. The other is Google Gemini’s Lyria 3, which can generate original music instantly from a simple prompt.

One changes how AI interacts. The other changes how AI creates.

Together, they hint that AI is shifting from being a tool we operate to something we increasingly work alongside.

When Conversation Starts Feeling Natural: NVIDIA PersonaPlex

Here’s the demo if you haven’t seen it:

https://www.youtube.com/watch?v=KcSSMsZTz6Y

When I watched this, what struck me was not the intelligence of the responses, but the feel of the interaction.

The AI reacts quickly. It acknowledges naturally. It sounds present.

Most voice assistants today still feel like machines:

You speak. It pauses. Then it answers.

That pause is enough to remind you that you are interacting with software.

PersonaPlex aims to remove that feeling by allowing the AI to listen and speak simultaneously, closer to how humans converse.

This small technical change creates a big experiential shift:

interruptions feel natural
responses feel immediate
rhythm feels human

When conversation flows naturally, the interface disappears.

We stop navigating software. We simply talk.

For organizations, this matters enormously. Customer engagement, internal workflows, training, healthcare intake, education — all of these depend heavily on interaction quality.

The breakthrough is not just smarter AI.

It is AI that feels present in the moment.

When Creation Becomes Conversational: Gemini + Lyria 3

Now consider this demo:

https://www.youtube.com/watch?v=Op8X8RmiE98

With Gemini’s Lyria 3, you describe a mood or idea, and within seconds the system generates:

lyrics
melody
vocals
arrangement
even artwork

When I first saw this, I didn’t think “AI can compose music.”

I thought:

Creation is becoming conversational.

For most of history, making music required instruments, software, and technical skills. Now the primary requirement is simply being able to describe the idea.

The AI handles execution.

We have already seen this pattern with writing, design, and coding. Music is just the latest creative domain to follow.

But the deeper change is this:

Creation no longer happens before the conversation. It happens inside the conversation.

A teacher can create a soundtrack while planning a lesson. A marketing team can test campaign ideas instantly. A creator can explore variations in real time.

Creation becomes part of thinking itself.

The Signal Behind Both

PersonaPlex improves how AI interacts. Lyria 3 improves how AI creates.

Put together, they point in the same direction:

AI is becoming less like software and more like a collaborator.

In the past, we gave software commands. Today, we ask AI for help. Tomorrow, we may simply work with it continuously.

For leaders, this is the more important question:

Not “Which AI model should we use?”
But “How do we redesign work when interaction is natural and creation is instant?”

That answer will shape organizations far more than any individual technology choice.

And perhaps that is a fitting reflection for the Year of the Horse:

Speed is valuable. Direction is decisive.

Hugging Face Snapshot: Popular Open Models by Downloads (Past 7 Days)

(Hugging Face is the world’s largest open AI model hub, where developers share and download models used in real applications.)

Model (recently trending)	What it does	Organization
Qwen3 / Qwen3.5 variants	Multimodal language, reasoning, coding	Alibaba
DeepSeek reasoning models	Advanced reasoning & problem-solving LLMs	DeepSeek
MiniMax agentic model	Workflow automation & agent-style tasks	MiniMax
GLM / ChatGLM family	Dialogue & multilingual language tasks	Z.ai
Meta segmentation foundation model	Image/video understanding	Meta
Mistral small & efficient LLM variants	Lightweight general language models	Mistral AI
Sentence-Transformer updates	Embeddings & semantic search	Community / UKPLab
Stable Diffusion fine-tunes	Image generation variants	Stability ecosystem
Whisper derivatives	Speech recognition / transcription	OpenAI ecosystem
Lightweight reranker & retrieval models	Search ranking & RAG pipelines	Various community teams

A Blog on Technology and Business

for Our Curious Mind

"E pur si muove"
"And yet it moves" (Galileo, 1633)
