
Happy Lunar New Year of the Horse: NVIDIA AI That Talks With Us, Google Creates Music with Lyrics for Us

  • Jack Lau
  • 22 hours ago
  • 3 min read

First of all, Happy Lunar New Year of the Horse to all readers.


I have always liked the symbolism of the Horse — energy, motion, and momentum, but also the importance of direction. Technology often runs fast; leadership is about deciding where it should go.


Recently, two AI developments caught my attention. On the surface they look unrelated, but together they tell an interesting story about where AI is heading.


One is NVIDIA’s PersonaPlex, a conversational voice AI that sounds remarkably natural. The other is Google Gemini’s Lyria 3, which can generate original music instantly from a simple prompt.


One changes how AI interacts. The other changes how AI creates.

Together, they hint that AI is shifting from being a tool we operate to something we increasingly work alongside.


When Conversation Starts Feeling Natural: NVIDIA PersonaPlex

Here’s the demo if you haven’t seen it:


When I watched this, what struck me was not the intelligence of the responses, but the feel of the interaction.


The AI reacts quickly. It acknowledges naturally. It sounds present.

Most voice assistants today still feel like machines:


You speak. It pauses. Then it answers.


That pause is enough to remind you that you are interacting with software.

PersonaPlex aims to remove that feeling by allowing the AI to listen and speak simultaneously, closer to how humans converse.


This small technical change creates a big experiential shift:

  • interruptions feel natural

  • responses feel immediate

  • rhythm feels human


When conversation flows naturally, the interface disappears.


We stop navigating software. We simply talk.


For organizations, this matters enormously. Customer engagement, internal workflows, training, healthcare intake, education — all of these depend heavily on interaction quality.


The breakthrough is not just smarter AI.

It is AI that feels present in the moment.


When Creation Becomes Conversational: Gemini + Lyria 3

Now consider this demo:

With Gemini’s Lyria 3, you describe a mood or idea, and within seconds the system generates:

  • lyrics

  • melody

  • vocals

  • arrangement

  • even artwork


When I first saw this, I didn’t think “AI can compose music.”


I thought:


Creation is becoming conversational.


For most of history, making music required instruments, software, and technical skills. Now the primary requirement is simply being able to describe the idea.

The AI handles execution.


We have already seen this pattern with writing, design, and coding. Music is just the latest creative domain to follow.


But the deeper change is this:

Creation no longer happens before the conversation. It happens inside the conversation.


A teacher can create a soundtrack while planning a lesson. A marketing team can test campaign ideas instantly. A creator can explore variations in real time.

Creation becomes part of thinking itself.


The Signal Behind Both


PersonaPlex improves how AI interacts. Lyria 3 improves how AI creates.


Put together, they point in the same direction:

AI is becoming less like software and more like a collaborator.

In the past, we gave software commands. Today, we ask AI for help. Tomorrow, we may simply work with it continuously.


For leaders, this is the more important question:


Not “Which AI model should we use?” But “How do we redesign work when interaction is natural and creation is instant?”


That answer will shape organizations far more than any individual technology choice.

And perhaps that is a fitting reflection for the Year of the Horse:

Speed is valuable. Direction is decisive.


Hugging Face Snapshot: Popular Open Models by Downloads (Past 7 Days)


(Hugging Face is the world’s largest open AI model hub, where developers share and download models used in real applications.)


| Model (recently trending) | What it does | Organization |
|---|---|---|
| Qwen3 / Qwen3.5 variants | Multimodal language, reasoning, coding | Alibaba |
| DeepSeek reasoning models | Advanced reasoning & problem-solving LLMs | DeepSeek |
| MiniMax agentic model | Workflow automation & agent-style tasks | MiniMax |
| GLM / ChatGLM family | Dialogue & multilingual language tasks | |
| Meta segmentation foundation model | Image/video understanding | Meta |
| Mistral small & efficient LLM variants | Lightweight general language models | Mistral AI |
| Sentence-Transformer updates | Embeddings & semantic search | Community / UKPLab |
| Stable Diffusion fine-tunes | Image generation variants | Stability ecosystem |
| Whisper derivatives | Speech recognition / transcription | OpenAI ecosystem |
| Lightweight reranker & retrieval models | Search ranking & RAG pipelines | Various community teams |





 
 
 
