Undergrads’ AI Speech Model Rivals Google’s NotebookLM

Two Korea-based undergrads have created an openly available AI model called Dia, which can generate podcast-style clips similar to Google’s NotebookLM. The model, developed by Nari Labs, offers more control over generated voices and “freedom in the script.” Dia can run on most modern PCs and generates a random voice unless prompted with a description of an intended style.

Forecast for 6 months: Expect an increase in the adoption of synthetic speech tools, with more startups and companies investing in voice AI technology. Nari Labs may release a technical report for Dia and expand its support to languages beyond English.
Forecast for 1 year: The market for synthetic speech tools is expected to keep growing as more players enter it. Nari Labs may release a synthetic voice platform with a “social aspect” built on Dia and larger future models. The use of Dia for disinformation or scam recordings may become a concern.
Forecast for 5 years: The use of AI-generated voices will become more widespread, with applications in various industries such as entertainment, education, and marketing. Nari Labs may become a major player in the synthetic speech market, and the company may face increased competition from other startups and established companies.
Forecast for 10 years: AI-generated voices will become indistinguishable from human voices, and the technology will be used in applications such as virtual assistants, customer service, and even politics. This may raise concerns about authenticity and trust, and regulations may be put in place to govern its use.