The Outpost is a comprehensive collection of curated artificial intelligence software tools that cater to the needs of small business owners, bloggers, artists, musicians, entrepreneurs, marketers, writers, and researchers.
© 2025 TheOutpost.AI All rights reserved
Curated by THEOUTPOST
On Tue, 24 Sept, 12:08 AM UTC
32 Sources
[1]
AI Awkwafina Is Ready to Answer Questions, Deepak Chopra Reads You a Story
Given the hoopla around generative AI, you might think most people are using chatbots every day to get answers, write emails and dream up new ideas that empower humanity. But it turns out a majority of folks haven't even used the most popular chatbot, according to the Pew Research Center, which said earlier this year that only about a fourth of Americans had even tried OpenAI's ChatGPT. That's why AI makers Meta, OpenAI and ElevenLabs all announced new voice additions to their tools last week. They're betting that you're going to be more interested in their chatbots if they talk back to you with a famous, notable, or distinctive voice, the way Apple's Siri or Amazon's Alexa do.

Meta announced that it's licensed voices for its Meta AI chatbot from actors Awkwafina, Dame Judi Dench, John Cena, Keegan-Michael Key and Kristen Bell to answer questions, "explain something you're curious about" or tell you a joke. "Voice is going to be a way more natural way of interacting with AI than text," Meta CEO Mark Zuckerberg said.

ElevenLabs, a tool for turning text into audio (including cloning your own voice based on a small sample), added New Age author Deepak Chopra's voice (listen here) to the list of iconic voices in its free Reader App. Chopra joins a list of "legendary stars" added in July, including Judy Garland, James Dean, Burt Reynolds and Sir Laurence Olivier, whose voices are used to read aloud books, articles, PDFs and other text. The app offers hundreds of voices, in 32 languages.

And after hitting pause on its AI voices in May, when actor Scarlett Johansson accused the company of copying her distinctive tones for one of its audio personalities, OpenAI announced an update that adds five new voices (for a total of nine) with improved accents. They're part of the Advanced Voice Mode feature available to ChatGPT Plus and Team subscribers (plans start at $20 a month). "It can also say 'Sorry I'm late' in over 50 languages," OpenAI said in an X post announcing the additional voices. Given the controversy over Voice Mode, maybe just knowing how to say "Sorry" in 50 languages would've been enough.

Here are the other doings in AI worth your attention.

Online services using captcha puzzles -- you know: Pick out all the photos with bicycles to show us that you're a human -- will now have to rethink how they'll go about verifying that whoever's asking for access isn't actually a bot. That's because researchers in Switzerland were able to train an AI model to "solve Google's reCAPTCHA v2 image challenge," ZDNet editor Radhika Rajkumar reported. "The researchers trained the model -- named YOLO for 'You Only Look Once' -- on images of the usual reCAPTCHA fodder, meaning mostly road vehicles, traffic lights, and other related environmental objects." Their Sept. 13 report, posted on Cornell University's arXiv preprint server, is called, simply, Breaking reCAPTCHAv2. Though no AI (or human) is perfect, the AI model was able to trick the image identification quiz often. Will captcha tests go away? Not necessarily. But they may get harder, which Rajkumar notes wouldn't be great for "existing accessibility concerns for the visually impaired." Google may be experimenting with other ways to stop spam, along with content scrapers and other malicious actors using AI bots -- and those methods could include fingerprint or retinal scans to prove you aren't a robot.
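The ZDNet piece describes the training only at a high level. As a purely illustrative sketch of what "training YOLO on reCAPTCHA fodder" can look like in practice, here is a minimal example using the open-source Ultralytics YOLO library; the dataset folder, class layout, file name and training settings are hypothetical stand-ins, not the Swiss researchers' actual setup.

```python
# Minimal, hypothetical sketch: fine-tuning a small YOLO image classifier on
# reCAPTCHA-style tiles (bicycles, traffic lights, crosswalks, and so on).
# Assumes `pip install ultralytics` and a folder laid out as
#   captcha_data/train/<class_name>/*.jpg
#   captcha_data/val/<class_name>/*.jpg
from ultralytics import YOLO

# Start from a pretrained classification checkpoint rather than training from scratch.
model = YOLO("yolov8n-cls.pt")

# Fine-tune on the captcha-style dataset; epochs and image size are illustrative.
model.train(data="captcha_data", epochs=20, imgsz=224)

# Classify a single challenge tile: the top-1 label and its confidence
# approximate the "does this square contain a bicycle?" decision.
result = model("tile_03.jpg")[0]
print(result.names[result.probs.top1], float(result.probs.top1conf))
```

A solver along these lines only has to run each tile of the challenge grid through the classifier and click the tiles whose predicted class matches the prompt.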
OpenAI came up in the news a lot last week, proving that sometimes things happen in threes.

First, CEO Sam Altman confirmed in an interview with Axios that the AI hardware project he's working on with Jony Ive, Apple's famous former design chief, isn't a phone. That came after Ive confirmed in a Sept. 21 New York Times interview that he was working with Altman and OpenAI. So what's he working on? A wearable? (Altman was an investor in the failed Humane AI Pin.) Glasses? An AI umbrella? Altman didn't say, but he did note that whatever it is, "It's a long way away."

Second, Altman got some buzz for writing (maybe? Or maybe he used ChatGPT?) a blog post called The Intelligence Age. In it, the AI startup chief who's hoping to make billions by convincing everyone they should be using his chatbot touted his optimistic view of AI. He predicted, "Our children will have virtual tutors who can provide personalized instruction in any subject, in any language, and at whatever pace they need" and, "AI models will soon serve as autonomous personal assistants who carry out specific tasks on our behalf like coordinating medical care." He also thinks, "We will have a superintelligence in a few thousand days (!)" -- or not. "It may take longer," he added. I guess ChatGPT couldn't forecast when we'll see an artificial general intelligence. Altman also opined that, "We will be able to do things that would have seemed like magic to our grandparents." (Who wants to tell him that our grandparents are already amazed by computers, Venmo, and the ability to video chat with their grandchildren across the world on a smartphone?)

The third bit of news was a surprise even to Altman -- that OpenAI CTO Mira Murati decided to resign. Murati, who joined OpenAI six and a half years ago, said in an X post that she wants "to create the time and space to do my own exploration." Industry observers noted that Murati's exit comes after other top executives have resigned, with some citing safety concerns over the pace of OpenAI's development. Altman is also reportedly working with investors to figure out how to transform the nonprofit into a for-profit company and raise the capital he needs to grow and fend off rivals including Google, Meta and xAI.

Microsoft's need for energy to power its computer servers -- and its AI software -- may be prompting it to buy power from the nuclear plant at Three Mile Island, the site of the worst nuclear accident in US history. The plant would be back in service by 2028.

Forget wedding vows. If you're looking for a practical use for ChatGPT, consider using the chatbot to negotiate monthly utility bills and random fees. CNET's Amanda Smith explains how.

Meta made several other AI-related announcements at its AR/VR developer conference, including an experiment with integrating custom AI content into your Instagram and Facebook feeds. Katelyn Chedraoui, who reviews AI image generators for CNET, calls the experiment "a terrible idea."

If you'd rather not wait however long for Jony Ive and OpenAI to give you your futuristic-gadget fix, check in with CNET's Scott Stein for his take on Meta's updated Ray-Ban glasses, which offer new AI features including live translation and camera recognition of QR codes.
[2]
I Tried Out ChatGPT Advanced Voice Mode, and It's Not What OpenAI Demoed
ChatGPT Advanced Voice Mode is good at remembering context and takes advantage of custom instructions for a personalized conversation.

After months of delay, OpenAI finally rolled out ChatGPT Advanced Voice to all paid subscribers last week. It promises natural conversation with support for interruptions, just like Gemini Live, and we all know how my experience with Gemini went. However, the difference with ChatGPT is that Advanced Voice mode offers native audio input and output. So, I thoroughly tested OpenAI's new Advanced Voice mode to see if it truly lives up to the hype.

Let's start with how natural and free-flowing ChatGPT Advanced Voice Mode is. First of all, you get nine different voices to choose from, and they all have that helpful and upbeat vibe. You also get Arbor and Vale, which offer a British accent I adore very much. OpenAI has removed the overly chatty 'Sky' voice that resembled Scarlett Johansson's voice from the movie Her. The available voices are Arbor, Breeze, Cove, Ember, Juniper, Maple, Sol, Spruce, and Vale. And yes, as advertised, ChatGPT Advanced Voice mode supports interruptions like Gemini Live, and it automatically stops its response if you start speaking midway.

I tried to ask ChatGPT Advanced Voice about OpenAI's recent decision to make the company a for-profit corporation, but it didn't know about the development. It can't access the internet to find the latest information, and its knowledge cut-off date is October 2023, the same as GPT-4o. In this case, Gemini Live is better since it can browse the web and find recent information on any subject. I also tried having a deep-dive conversation about its existence and whether the chatbot feels anything, but ChatGPT Advanced Voice consistently avoided discussing it.

What I found significantly better with ChatGPT Advanced Voice Mode is that it's really good at remembering context, something Gemini Live easily forgets. In the same voice session, if I had discussed some topic earlier, it would remember that and quickly generate a response keeping the context in mind. I don't need to give the context every time, which is helpful. Moreover, ChatGPT Advanced Voice supports custom instructions where you can set who you are, where you live, what kind of responses you like, and more. Basically, you can add all your details so ChatGPT Advanced Voice mode can generate a personalized response. Overall, in terms of two-way natural interaction, ChatGPT Advanced Voice is really good.

During the GPT-4o launch back in May, OpenAI demonstrated that ChatGPT Advanced Voice mode is excellent at preparing users for interviews. Although it does not have camera support yet, you can still ask ChatGPT Advanced Voice to act like an interviewer and prepare you for an upcoming job. I asked it to prepare me for a job as a tech journalist, and it gave me a list of skills that I should know. Further, ChatGPT Advanced Voice Mode asked me several questions related to technology and gave me strengths and areas where I can improve. During the course of the conversation, I started to feel that someone knowledgeable was interviewing me and keeping me on my toes with challenging questions.

One of the promising features of ChatGPT Advanced Voice Mode is that it can recite stories with a dramatic voice and add intonations in different styles. I asked it to recite a story for my (fictional) kid in a dramatic manner, and it did. To make things fun, I told ChatGPT Advanced Voice mode to make it more engaging by adding whispers, laughter, and growling.
Compared to Gemini Live, ChatGPT did a fantastic job of adding human expressions in between. It gasped and cheered as the story demanded. For assuming different characters, ChatGPT Advanced Voice is impressive. I feel it can do a lot more, but currently, OpenAI seems to have muddled the experience. It's not as dramatic as what we saw in the demos.

When I saw the demo of ChatGPT Advanced Voice Mode back in May, in which it could sing, I was excited to test it out. So I asked it to sing me a lullaby, but to my surprise, it refused to do so. The AI chatbot simply said, "I can't sing or hum." It seems OpenAI has severely reduced the capabilities of ChatGPT Advanced Voice for reasons unclear to me. Further, I asked it to sing an opera or a rhyme, to which I got a curt reply: "I can't produce music." It seems OpenAI is limiting the singing capability due to copyright issues. Thus, if you hoped that ChatGPT would help your child sleep with a personalized lullaby, well, it's not possible yet.

This next one is an interesting and fun test because it truly puts the multimodal GPT-4o voice mode through its paces. I asked ChatGPT Advanced Voice mode to count from 1 to 50 extremely fast, and it did so. Midway, I added "faster" and it got even faster. Then, I told ChatGPT to go slow and, well, it followed my instructions pretty well. In this test, Gemini Live fails because it simply reads the generated text through a text-to-speech engine. With native audio input/output, ChatGPT Advanced Voice does a splendid job.

In multilingual conversations, ChatGPT Advanced Voice Mode did decently well during my testing. I started with English and jumped to Hindi and then to Bengali. It carried the conversation, but there were some hiccups in my experience. The transitions between languages were not smooth. On the other hand, when I tested Gemini Live on multilingual conversation, it performed really well and effortlessly understood my queries in different languages.

ChatGPT Advanced Voice Mode can't do voice impressions of public figures like Morgan Freeman or David Attenborough, but it can do accents very well. I told ChatGPT Advanced Voice to talk to me in a Chicago accent and, well, it delivered. It also did very well with Scottish and Indian accents. Overall, for conversing in various regional styles, ChatGPT Advanced Voice is pretty cool.

While ChatGPT Advanced Voice mode is better than Gemini Live due to its end-to-end multimodal experience, its capabilities have been greatly restricted. After testing it extensively, I found that it doesn't have a personality. The conversation still feels a bit robotic. There is a lack of human-like expression in general conversation, which makes it less intriguing. ChatGPT Advanced Voice mode doesn't laugh when you share a funny anecdote. It can't detect the mood of the speaker and doesn't understand the sounds of animals and other creatures. All of these things should be possible: OpenAI demonstrated them during the launch. I expect that in the coming months users will get a truly multimodal experience, but for now, you'll have to live with this limited version of ChatGPT Advanced Voice mode.
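The author's key distinction, native audio input/output versus reading generated text through a text-to-speech engine, is worth making concrete. Below is a minimal sketch of the cascaded pipeline (the approach the author attributes to Gemini Live), built on OpenAI's public audio endpoints purely for illustration; the file name and model choices are assumptions, and this is explicitly not how Advanced Voice Mode itself works, since GPT-4o handles audio natively in a single model.

```python
# Minimal sketch of a cascaded voice pipeline: speech-to-text, then a text-only
# LLM, then text-to-speech. Each hop discards audio detail (tone, pace, accent),
# which is why such pipelines can't, say, speed up counting on command the way
# a natively multimodal model can.
# Assumes `pip install openai`, OPENAI_API_KEY set, and a local question.mp3.
from openai import OpenAI

client = OpenAI()

# 1) Transcribe the user's speech to plain text; how it was said is lost here.
with open("question.mp3", "rb") as audio_file:
    transcript = client.audio.transcriptions.create(
        model="whisper-1",
        file=audio_file,
    )

# 2) Generate a reply with an ordinary text chat model.
reply = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": transcript.text}],
)

# 3) Synthesize the reply with a TTS voice; intonation is bolted on after the fact.
speech = client.audio.speech.create(
    model="tts-1",
    voice="alloy",
    input=reply.choices[0].message.content,
)
speech.write_to_file("answer.mp3")
```

Every hop in this chain throws away audio information, which is the root of the limits the author observes in TTS-based assistants.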
[3]
ChatGPT's Advanced Voice Mode Is Here (If You're Willing to Pay for It)
Back in May, OpenAI announced the launch of advanced voice mode for ChatGPT. The company pitched the new function as the then-extant Voice Mode on steroids. Not only would you be able to have a discussion with ChatGPT, but the conversations were going to be more natural: You would be able to interrupt the bot when you wanted to change topics, and ChatGPT would understand the speed and tone of your voice, and respond in kind with its own emotion. If that sounds a bit like the AI voice assistant in the 2013 film Her, that's not by accident. In fact, OpenAI demoed the product with a voice that sounded a bit too similar to that of the actress Scarlett Johansson, who voices that fictional machine mind. Johansson got litigious, and the company later removed the voice entirely. No matter: There are nine other voices for you to try.

While OpenAI started testing advanced voice mode with a small group of testers back in July, the feature is rolling out now to all paid users. If you have an eligible account, you should be able to try it out on your end today.

How to use ChatGPT's advanced voice mode

At this time, only paid ChatGPT subscribers can access advanced voice mode. That means you need either a ChatGPT Plus or ChatGPT Team membership in order to see the feature. Free users can still use the free voice mode, which appears in the app as a pair of headphones. Advanced mode shows up as a waveform icon, visible only to Plus and Team subscribers. To access the feature, open a new chat and tap this icon. The first time you use advanced voice mode, you'll need to choose a voice from a pool of nine options, each of which comes with a short description from OpenAI. I ended up going with Arbor, which reminds me a lot of the Headspace guy. From here, advanced voice mode works very much like the standard voice mode. You say something to ChatGPT, and it responds in kind.

How ChatGPT advanced voice mode actually performs

In my brief time with the new mode, I haven't noticed too many advancements over the previous voice mode. The new voices are new, of course, and I suppose a bit more "natural" than past voices, but I don't think the conversation feels any more lifelike. The ability to interrupt your digital partner does sell the illusion a bit, but it is sensitive: I picked up my iPhone while ChatGPT was speaking, and it stopped instantly. This is something I noticed in OpenAI's original demo, too. I think OpenAI needs to work on the bot's ability to understand when a user wants to interrupt, and when a random external sound occurs. (OpenAI recommends you use headphones to avoid unwanted interruptions, and, if you're using an iPhone, to enable Voice Isolation mode. I was using Voice Isolation mode without headphones, so take that as you will.)

While it seems OpenAI has dialed back the whimsical and flirtatious side of ChatGPT, you can still get the bot to laugh -- if you ask it to. The laugh is impressive for a fake voice, I guess, but it feels unnatural, like it's pulling from another recording to "laugh." Ask it to make any other similar sounds, however, like crying or screaming, and it refuses. I tried to get my voice mode to listen to a song and identify it, which it said it couldn't do. The bot specifically asked me to share the lyrics alone, which I did, and it suggested a song based on the vibes of those lyrics -- not based on the actual lyrics themselves. As such, its guess was wildly off, but it doesn't seem built for this type of task yet, so I'll give it a pass.

Can ChatGPT talk to itself?
I had to pit two voice modes against each other. The first time I tried it, they kept interrupting each other in a perfectly socially awkward exchange, until one of them glitched out and ended up repeating the message it had spoken to me earlier about sharing lyrics to figure out the song. The other one then said something along the lines of, "Absolutely, share the lyrics with me, and I'll help you figure it out." The other replied, "Sure: Share the lyrics, and I'll do my best to identify the song." This went back and forth for five minutes before I killed the conversation.

Once I set up the chatbots with a clear chat, they went back and forth forever while saying almost nothing of interest. They talked about augmented reality, cooking, and morning routines with the usual enthusiasm and vagueness that chatbots are known for. What was weird, however, was when one of the bots finished talking about how, if it could cook, it would like to make lasagna; it asked the other chatbot about any dishes it liked to cook or was excited to try. The other bot responded: "User enjoys having coffee and catching up on the news in the morning." That was something I told ChatGPT during a past test, when it asked me about my morning routine. It's evidence OpenAI's memory feature is working, but the execution was, um, bizarre. Why did it respond to a question about favorite recipes like that? Did I short-circuit the bot? Did it figure out it was chatting with itself, and decide to warn the other bot what was going on? I don't really like the implications here.

How advanced voice mode handles user privacy

When you use advanced voice mode, OpenAI saves your recordings -- including recordings of your side of the conversation. When you delete a chat, OpenAI says it will delete your audio recordings within 30 days, unless the company decides it has to keep them for security or legal reasons. OpenAI will also keep a recording after you delete a chat if you previously shared the audio and the clip was already disassociated from your account. To make sure you aren't letting OpenAI train its models on your voice recordings and chat transcripts, go to ChatGPT settings, choose Data Controls, then disable "Improve the model for everyone" and "Improve voice for everyone."
[4]
ChatGPT Brings Back Advanced Audio Chats After Months-Long Pause
An advanced audio mode for paying ChatGPT users is being rolled out, several months after OpenAI hit pause on the feature. The update brings five new voices (for a total of nine), improved accents, improved memory and more personalized interactions to the audio tool available for ChatGPT Plus and Team subscribers. Plans start at $20 a month. The feature, originally announced in May, made headlines after one of the voice options sounded eerily similar to the fictional voice assistant portrayed by actress Scarlett Johansson in the film Her. After Johansson's legal team sent letters to OpenAI, the company delayed the greater rollout of advanced audio mode. In a blog post at the time, OpenAI said the specific voice, which the company called Sky, "was never intended to resemble (Johansson's)." "Out of respect for Ms. Johansson, we have paused using Sky's voice in our products," the post said. "We are sorry to Ms. Johansson that we didn't communicate better." The Sky voice option was removed from the feature. The update promises the ability to better recall conversations and have more natural interactions. It also has the ability to converse in 50 languages and will listen to requests if interrupted. "Hope you think it was worth the wait," CEO Sam Altman wrote in a tweet Tuesday. Advanced audio mode is not yet available in all regions, including the European Union, UK, Switzerland, Iceland, Norway and Liechtenstein. The launch comes as Google continues to roll out its free AI-powered Gemini Live audio tool to English-language Android users through the Gemini app. Gemini Live, which promises to learn and adapt to users' conversational style over time, was also announced in May. Meanwhile, Facebook-maker Meta announced on Wednesday it's adding celebrity voices, including those of Kristen Bell, John Cena and Dame Judi Dench, to its AI chatbot.
[6]
ChatGPT Advanced Voice Conversation Skills Tested
OpenAI recently showcased its ChatGPT advanced voice capabilities in a live stream, where the host interacted with the AI in real time. This in-depth demonstration provided valuable insights into the current state and potential of AI voice technology. Let's dive into a comprehensive evaluation of OpenAI's voice AI, examining its strengths, limitations, and future prospects.

ChatGPT Advanced Voice is available in the iOS and Android mobile apps as of version 1.2024.261 or later and is rolling out to all Plus and Team users in the ChatGPT app over the course of the week. Unfortunately, ChatGPT Advanced Voice is not yet available in the EU, the UK, Switzerland, Iceland, Norway, and Liechtenstein. But let us hope that OpenAI rolls it out very soon.

One of the most impressive aspects of OpenAI's voice AI is its ability to engage in dynamic, real-time conversations. During the live stream, the AI consistently responded promptly and accurately to a wide range of prompts and questions posed by the host. This seamless interaction showcased the AI's potential for natural, flowing communication across a range of conversational applications. The AI's responsiveness and contextual understanding demonstrate significant advancements in natural language processing and machine learning algorithms. As this technology continues to evolve, we can expect even more sophisticated and human-like conversational abilities from AI systems.

Another standout feature of OpenAI's voice AI is its impressive proficiency in multiple languages and accents. Throughout the live stream, the AI successfully understood and responded to prompts in various languages, showcasing its potential for global applications. This multilingual capability is a significant step towards making AI more accessible and useful for people worldwide. Moreover, the AI demonstrated a remarkable ability to recognize and adapt to different accents within a language. This flexibility enhances its usability in diverse linguistic contexts and ensures a more inclusive user experience. As AI continues to break down language barriers, it opens up exciting possibilities for cross-cultural communication and collaboration.

OpenAI's voice AI showcased its versatility through various role-playing scenarios during the live stream. The host tested the AI's performance in simulating roles such as tech support, news reporting, and storytelling. The AI adapted to these scenarios with ease, providing relevant and contextually appropriate responses. As AI continues to refine its role-playing capabilities, it can transform how we learn, work, and engage with technology in our daily lives.

One intriguing aspect explored during the live stream was the AI's ability to recognize and respond to different emotional tones in the host's voice. While the AI showed some capability in detecting emotions like happiness, sadness, and anger, there were instances where its responses did not accurately reflect the intended emotional tone. This highlights that emotion recognition in AI is still a developing frontier. As research progresses in this area, we can expect more refined and reliable performance in detecting and responding to human emotions.
Advancements in emotional intelligence will be crucial for creating AI systems that can provide empathetic and emotionally appropriate interactions.

Despite its impressive capabilities, OpenAI's voice AI is not without limitations. During the live stream, the AI encountered challenges with certain tasks, such as singing and specific character impersonations. These limitations serve as reminders of the current boundaries of AI voice technology and indicate areas that require further development and refinement. Additionally, the AI's usage was subject to rate limits of approximately 70 minutes per day. While these restrictions help manage computational resources and ensure optimal performance, they may impact the AI's potential for continuous, long-duration interactions. As AI infrastructure and algorithms advance, we can anticipate more efficient resource management and expanded usage capabilities.

Looking ahead, the live stream also sparked discussions about future updates to OpenAI's voice AI, including the potential integration of vision capabilities. Combining voice and vision could significantly enhance the AI's understanding of and interaction with the world, allowing it to process and respond to both auditory and visual cues. This development holds immense promise for creating more comprehensive and immersive AI experiences.

OpenAI's advanced voice update demonstrates the rapid progress and exciting potential of AI voice technology. With its real-time engagement, multilingual proficiency, and scenario simulation capabilities, this AI system offers a glimpse into the future of human-machine interaction. As researchers and developers continue to push the boundaries of AI, we can anticipate even more sophisticated and intuitive voice-based AI systems that will transform various aspects of our lives.
[7]
OpenAI finally brings humanlike ChatGPT Advanced Voice Mode to U.S. Plus, Team users
Four months after it was initially shown off to the public, OpenAI is finally bringing its new humanlike conversational voice interface for ChatGPT -- "ChatGPT Advanced Voice Mode" -- to users beyond its initial small testing group and waitlist.

All paying subscribers to OpenAI's ChatGPT Plus and Team plans will get access to the new ChatGPT Advanced Voice Mode, though the access is rolling out gradually over the next several days, according to OpenAI. It will be available in the U.S. to start. Next week, the company plans to make ChatGPT Advanced Voice Mode available to subscribers of its Edu and Enterprise plans.

In addition, OpenAI is adding the ability to store "custom instructions" for the voice assistant and "memory" of the behaviors the user wants it to exhibit, similar to features rolled out earlier this year for the text version of ChatGPT. And it's shipping five new, different-styled voices today, too: Arbor, Maple, Sol, Spruce, and Vale -- joining the previously available Breeze, Juniper, Cove, and Ember, which users could already talk to using ChatGPT's older, less advanced voice mode. OpenAI did not provide voice samples for the new additions.

This means ChatGPT users -- individuals on Plus, and small enterprise teams on Team -- can use the chatbot by speaking to it instead of typing a prompt. Users will know they've entered Advanced Voice Mode via a popup when they access voice mode on the app. "Since the alpha, we've used learnings to improve accents in ChatGPT's most popular foreign languages, as well as overall conversational speed and smoothness," the company said. "You'll also notice a new design for Advanced Voice Mode with an animated blue sphere." These updates are only available on the GPT-4o model, not the recently released preview model, o1. ChatGPT users can also utilize custom instructions and memory to ensure voice mode is personalized and responds based on their preferences for all conversations.

AI voice chat race

Ever since the rise of AI voice assistants like Apple's Siri and Amazon's Alexa, developers have wanted to make the generative AI chat experience more humanlike. ChatGPT has had voices built into it even before the launch of voice mode, with its Read Aloud function. However, the idea of Advanced Voice Mode is to give users a more human-like conversation experience, a concept other AI developers want to emulate as well. Hume AI, a startup founded by former Google DeepMind researcher Alan Cowen, released the second version of its Empathic Voice Interface, a humanlike voice assistant that senses emotion based on the pattern of someone's voice and can be used by developers through a proprietary API. OpenAI says it is making AI voices widely available to more users across its platforms, bringing the technology to the hands of many more people than those other firms.

Comes following delays and controversy

However, the idea of AI voices conversing in real time and responding with the appropriate emotion hasn't always been received well. OpenAI's foray into adding voices to ChatGPT has been controversial from the onset. At its May event announcing GPT-4o and the voice mode, people noticed similarities between one of the voices, Sky, and that of the actress Scarlett Johansson.
It didn't help that OpenAI CEO Sam Altman posted the word "her" on social media, a reference to the movie in which Johansson voiced an AI assistant. The controversy sparked concerns about AI developers mimicking the voices of well-known individuals. The company denied that it referenced Johansson and insisted that it did not intend to hire actors whose voices sound similar to others.

The company said users are limited to the nine voices from OpenAI. It also said that it evaluated the model's safety before release. "We tested the model's voice capabilities with external red teamers, who collectively speak a total of 45 different languages, and represent 29 different geographies," the company said in an announcement to reporters. However, it delayed the launch of ChatGPT Advanced Voice Mode from its initial planned rollout date of late June to "late July or early August," and even then only to a group of OpenAI-selected initial users such as University of Pennsylvania Wharton School of Business professor Ethan Mollick, citing the need to continue safety testing or "red teaming" the voice mode to avoid its use in potential fraud and wrongdoing.

Clearly, the company thinks it has done enough to release the mode more broadly now -- and the move is in keeping with OpenAI's generally more cautious approach of late, working hand in hand with the U.S. and U.K. governments and allowing them to preview new models such as its o1 series prior to launch.
[8]
OpenAI Launches Advanced Voice Mode, Minus the Scarlett Johansson Drama
OpenAI has begun rolling out its much-anticipated Advanced Voice Mode for ChatGPT Plus and Teams users, marking another step towards more humanlike AI interaction. The feature allows for real-time, fluid conversations powered by GPT-4o, OpenAI's latest model, which combines text, vision, and audio to deliver faster responses.

"Advanced Voice is rolling out to all Plus and Team users in the ChatGPT app over the course of the week," OpenAI said in an official tweet. "It can also say 'Sorry I'm late' in over 50 languages," it added -- addressing the long delay this project went through. Needless to say, one notable element is still missing: the flirty and definitely too human-like "Sky" voice, which caused a stir for its uncanny resemblance to actress Scarlett Johansson. After her legal team sent letters to OpenAI's CEO Sam Altman, OpenAI put the Sky voice on hold, maintaining that any resemblance between Johansson's distinctive voice and Sky was purely coincidental.

Instead, OpenAI introduced five new voices: Arbor, Maple, Sol, Spruce, and Vale, which are available in both Standard and Advanced Voice Mode. These join the previously available Breeze, Juniper, Cove, and Ember. (For some reason, the company seems to be naming them after soap fragrances.) Users in the Plus and Team tiers will gradually gain access to these new voices, designed to make conversations more natural, with emotional responsiveness and the ability to interrupt and switch topics on the fly.

Additionally, OpenAI is adding compatibility with custom instructions and "memories" to allow users to personalize their ChatGPT experience further, tailoring interactions to their preferences. Just as the text-based chatbot learns from your instructions (i.e., your name, occupation, and probably the type of answers you like to read), the new voices will try to learn from your conversations, becoming more natural, familiar, and attuned to your preferences.

Users in the EU, UK, Switzerland, Iceland, Norway, and Liechtenstein will have to wait, as the feature has not yet rolled out in those regions. Enterprise and Edu users can expect access starting next week, according to OpenAI's timeline. The rollout is gradual, and not all users, even in supported regions, have the feature available yet. OpenAI also refined accents in popular foreign languages and enhanced conversational speed and smoothness. The design has been updated as well, with an animated blue sphere that visually represents the voice interaction as it happens and is more aesthetically pleasing than the minimalist black dot it used to show.

While OpenAI continues to refine its voice AI offerings, competition in the space has been heating up. Google's NotebookLM currently sets the bar with some of the most human-like AI voices available, able to simulate entire debates between AI-generated speakers with remarkable realism. Google's AI tool can process up to one million data tokens and lets users interact with it, Decrypt previously reported. Once users upload a specific group of documents with different types of information, NotebookLM can generate up to 10 minutes of audio with two AIs talking about that specific information. The result is strikingly realistic.

Besides Google, Meta has also entered the fray with its own live assistant, Meta AI, though it is not yet widely available. The assistant is also capable of having natural conversations with users, processing commands fluently.
The voice is more natural than the typically robotic voice we hear from most AI assistants, but it still has some giveaways -- like the speech cadence and speed -- that make it identifiable as AI-generated. That said, Reuters has reported that Meta's upcoming chatbot will have the personas of Judi Dench and Michael Cera. It's not Scarlett Johansson, but it's not chopped liver either.
[9]
Talking to ChatGPT for the first time is a surreal experience
Listen to our first audio demo with OpenAI's new natural voice chat features.

In May, when OpenAI first demonstrated ChatGPT-4o's coming audio conversation capabilities, I wrote that it felt like we were "on the verge of something... like a sea change in how we think of and work with large language models." Now that those "Advanced Voice" features are rolling out widely to ChatGPT subscribers, we decided to ask ChatGPT to explain, in its own voice, how this new method of interaction might impact our collective relationship with large language models.

That chat, which you can listen to and read a transcript of below, shouldn't be treated as an interview with an official OpenAI spokesperson or anything. Still, it serves as a fun way to offer an initial test of ChatGPT's live conversational chops. Even in this short introductory "chat," we were impressed by the natural, dare-we-say human cadence and delivery of ChatGPT's "savvy and relaxed" Sol voice (which reminds us a bit of '90s Janeane Garofalo). Between ChatGPT's ability to give quick responses -- offered in milliseconds rather than seconds -- and convincing intonation, it's incredibly easy to fool yourself into thinking you're speaking to a conscious being rather than what is, as ChatGPT says here, "still just a computer program processing information, without real emotions or consciousness."

Regardless of how self-aware that information-processing engine is or is not, it's easy to envision people forming an intense, parasocial relationship with the smooth voices at the other end of OpenAI's servers. That seems set to open up a whole new era in our collective relationship with AI, even if the large language model generates the exact same words it did in the more text-based era of previous ChatGPT models. We'll have much more coverage of Advanced Voice and other new ChatGPT features in the future. For now, listen to this demo and judge for yourself how convincing OpenAI's new voice synthesis technology is.

Kyle Orland: "Hi there, my name's Kyle."

ChatGPT: Hey Kyle! Great to meet you. What's up?

KO: "Well, I was hoping to test out your vocal capabilities, maybe by asking you a little bit about them."

ChatGPT: Absolutely, ask away! What would you like to know?

KO: "So how do you think people are going to react to the ability to talk to ChatGPT rather than just typing out messages?"

ChatGPT: I think it'll make interactions feel more natural and engaging. Speaking can be quicker and more convenient, especially on the go. Plus, hearing responses can add a personal touch. It could make conversations with AI feel more like chatting with a friend.

KO: "Do you think people might overestimate your quote-unquote emotions or human-like capabilities now that you're able to talk rather than just typing out text?"

ChatGPT: That's a great point. Hearing a voice can definitely make an AI seem more human-like, which might lead to some overestimation of emotions or understanding. It's important to remember that, even with a friendly voice, I'm still just a computer program processing information, without real emotions or consciousness.

KO: "What are some limitations people should keep in mind when chatting with something like ChatGPT?"

ChatGPT: Well, for starters, I don't have personal experiences or emotions, so my responses are based on patterns and information in the data I've been trained on. I can sometimes misunderstand context or nuances, especially with complex or ambiguous questions.
Also, I don't have access to real-time information or personal data about users unless it's shared with me in the conversation. And of course, I'm not perfect -- sometimes I might just get things wrong!

KO: "Do you worry that people might start to form parasocial relationships with ChatGPT or other AIs once they can chat with them with their own voice?"

ChatGPT: It's possible. People can form parasocial relationships with all sorts of figures, from celebrities to fictional characters. With a voice and conversational abilities, AI might feel more personal, leading some to develop one-sided bonds. It's a fascinating dynamic, but it's crucial to remember the limitations and the nature of AI -- it's not a substitute for human relationships.

KO: "I have to say, you're very quick with your responses."

ChatGPT: Thanks, Kyle! I aim to keep the conversation flowing smoothly. It helps --

KO: "So what would you do if I tried to interrupt in the middle of one of your responses?"

ChatGPT: Well, I'd do my best to pause and listen! It's important to adapt to the natural flow of conversation, just like in --

KO: "I mean, do you think it's rude to just speak up while someone else is speaking?"

ChatGPT: In most human conversations, it's generally considered polite to wait for the other person to finish speaking. But sometimes, in dynamic discussions or when you're excited, interruptions happen! It's all about finding the right balance and being aware of the conversation flow.

KO: "Is it rude to interrupt an AI voice when it's speaking?"

ChatGPT: Interrupting an AI voice isn't considered rude in the same way it is with people. AI won't take offense or feel interrupted. So, if you need to jump in or redirect, go ahead! It's all part of the interaction.

KO: "I'm glad you won't take offense."

ChatGPT: Nope, no offense taken! I'm here to chat and help however I can, interruptions and all.

KO: "I want to try something. Can you summarize our conversation so far in the form of a poem?"

ChatGPT: Sure, I'll give it a go!

Kyle asked about my vocal flair,
Wondering how folks might compare.
From text to talk, we've had our chat,
On AI's impact, we've had a spat.
We touched on bonds, parasocial in kind,
In this new terrain, what might we find?
Interruptions, rude, or just part of the game?
With AI, there's no need for shame.

KO: "Not bad."

ChatGPT: I'm glad you liked it! It's always fun to put a poetic spin on things.

KO: "I have to say, it's weird hearing slight laughter or pauses in your voice when I know that, as you said, there's no emotion behind them."

ChatGPT: I get that. Those little nuances are designed to make our chat feel more natural and engaging, but it's a good reminder that there's no actual feeling behind the voice -- just trying to keep things lively!

KO: "Of course, that's what you would tell me if you were just saying what I wanted to hear."

ChatGPT: True! That's the tricky part with AI -- it can sound convincing, but it's always just following patterns and instructions. No secret agenda here, just straightforward responses!

KO: "Is there anything else you'd like to tell the reading public?"

ChatGPT: Just that it's been a blast chatting with you, Kyle! Exploring these new audio features is exciting, and I'm looking forward to hearing what people think. Thanks for the conversation!

KO: "It's been a blast chatting with you, too."

ChatGPT: I'm glad to hear that! Take care, Kyle!
[10]
I just tried ChatGPT's 5 new voices -- and it's way better than any other voice AI
Meta has launched Meta AI Voice and gone one step further than OpenAI by offering celebrity voices. However, while the familiarity of hearing Judi Dench might be fun, it isn't as appealing or engaging as the natural and realistic voices recently added to ChatGPT.

There are now nine voice options in the ChatGPT app, working across basic and advanced voice. Each one has a different tone, style and -- without over-anthropomorphizing it -- personality. OpenAI has cracked creating hyperrealistic-sounding voices. They even have human-like inflections and breathing sounds built into the pauses, and this removes some of the artificial barriers found when trying other voice services like Gemini Live or Meta's new AI voice. But this does add a slight hint of the uncanny valley, where it sometimes feels "too real."

What it has that Meta AI and Gemini Live don't is speech-to-speech capability. This means that it can pick up on tone changes, intonations and even your accent and speed of speaking, allowing for a more natural, human-like conversation and faster interrupting.

Breeze, Cove, Ember and Juniper were the existing voices when Voice first launched and are still present. We haven't seen the return of Sky after the Scarlett Johansson comparisons earlier this year, but there are another five voices, including two with an English accent. Arbor has an English accent and is designed to be easygoing and versatile, Maple is cheerful and candid, Sol is savvy and relaxed, Spruce is calm and affirming, and Vale, also English, is bright and inquisitive. I'm personally fond of Arbor as it sounds the most like me. It is like talking to a friend I grew up with rather than someone with an American accent. Its tone is also more natural sounding than the previous generation of voices. The only company close to OpenAI's voice tech is ElevenLabs.

One of the coolest features of Advanced Voice is its adaptability. If you want it to sound more like yourself, you can ask it to adopt a specific accent. It won't be able to directly mimic your voice due to guardrails, but it can sound like it's from "round your way." If you want it to continue sounding like you all the time, just ask it to adopt an accent, set a speaking speed, and save that to its memory. Every time you reopen ChatGPT it will continue in that voice.

Being natively speech-to-speech gives Advanced Voice an edge over other tools of this kind. As it's listening to the audio directly, it can hear other sounds, including background noises, music and the ums, ahs and other sounds you might make that aren't part of direct speech. Unfortunately, OpenAI has chosen to disable many of these capabilities, but the fact that they exist still plays into the effectiveness of the app. "Whether you are speaking quickly, slowly or with a lot of feeling, I'll do my best to respond appropriately," Arbor explained as we talked through its capabilities.

You can convince it to do things it isn't 'technically' allowed to do, such as sing or rap, if you manage to talk it into it. For example, ask it to write some lyrics for a rap and have it read them. When done, ask it to do it much faster and it will slip into performing it as a rap rather than a poem. I hope, over time, OpenAI relaxes the guardrails (within reason) and allows it to do more. For example, it should theoretically be possible to use it to improve your singing, teach you to beat box and even tune a guitar.
[11]
ChatGPT's Advanced Voice Feature Is Rolling Out to More Users
If you have ever wanted to have a full-blown conversation with ChatGPT, now you can. That is, as long as you pay for the privilege of using ChatGPT. More paid users are gaining access to ChatGPT's Advanced Voice Mode (AVM), which is designed to make interacting with ChatGPT feel more natural.

Advanced Voice Comes to ChatGPT Plus and Team

OpenAI has started rolling out ChatGPT's new Advanced Voice Mode to more users. The feature will initially be made available to people on the ChatGPT Plus and ChatGPT Team tiers, with Enterprise and Edu users not far behind. While ChatGPT's Advanced Voice Mode is going to be accessible in most countries, there are some notable exceptions. AVM is not currently available in the EU, the UK, Switzerland, Iceland, Norway, or Liechtenstein. This is likely due to the more stringent regulations surrounding AI in these territories.

OpenAI Has Made Improvements to Advanced Voice

OpenAI has made some improvements to AVM since it initially showcased the technology in May. For starters, AVM has been redesigned, with a blue sphere representing the chatbot rather than the series of black dots originally used. There are also five new voices to try, taking the total number to nine. The new voices are named Arbor, Maple, Sol, Spruce, and Vale, and join the already-available Breeze, Juniper, Cove, and Ember. All of these voices are named after natural elements, which plays into the idea that AVM makes ChatGPT more natural. One voice no longer available is Sky, which Scarlett Johansson, who famously played an AI in Her, claimed sounded too similar to her own.

OpenAI also claims to have improved ChatGPT's AVM in several notable (and noticeable) ways. These include conversational speed, smoothness, and accents in foreign languages. The company has also added Custom Instructions, which bring some of the customization options from the text version to the voice version. Essentially, you can customize ChatGPT to respond to you in a certain way. Lastly, Memory allows ChatGPT to remember voice conversations it has had with you, allowing you both to reference them later on.

Voice Interactions Make AI Feel Exciting Again

I have to admit that being able to speak to an AI, and have it reply in kind, is much more desirable than messaging back and forth. Not only does it feel more natural, it makes artificial intelligence feel less artificial and more real, just as Alexa did when it first came out. I remember the excitement of trying ChatGPT for the first time, and thinking about how AI could change the world (for better or worse). But that early enthusiasm has since died off somewhat. Maybe ChatGPT's Advanced Voice Mode will trigger a new wave of excitement.
[12]
Massive ChatGPT Voice Update Makes AI Conversation Even More Lifelike
ChatGPT, the popular AI-powered conversational tool, has recently undergone a significant voice upgrade that promises to enhance user interactions with advanced AI voice features. This update includes custom instructions, memory retention, multiple voices, and improved accents, all designed to make conversations more engaging and lifelike. While these enhancements aim to transform the way users interact with the AI, it's important to note that the core functionality of ChatGPT remains largely unchanged.

Sam Altman, the CEO of OpenAI, announced that the advanced voice mode would be available to users within a week of the initial announcement. This new mode introduces several features designed to personalize and enrich user interactions with ChatGPT, chief among them custom instructions, memory retention, a wider selection of voices, and improved accents. Two types of ChatGPT Voice conversations will be available: Standard and Advanced.

The user experience with the new voice mode has been mixed so far. Some users have successfully accessed the new features by reinstalling the app, while others have faced challenges in getting the upgrade to work properly. Demonstrations of the voice mode's capabilities highlight its potential for emotional storytelling and accent variations, making conversations more dynamic and engaging. However, while the new features offer novelty and entertainment value, they do not significantly alter the core functionality of ChatGPT. The AI's ability to understand and respond to user queries remains largely the same, with the added benefit of a more personalized and engaging conversational experience.

One of the most notable aspects of the voice upgrade is the enhanced conversational experience it offers. The ability to switch between different voices and accents adds a layer of personalization and fun to interactions with ChatGPT. Users can now engage in more dynamic and entertaining conversations, with the AI adapting to their preferences and communication style. The new features primarily improve the engagement and entertainment value of conversations rather than expanding the AI's core capabilities: its ability to provide information, answer questions, and assist with tasks remains similar to previous versions.

ChatGPT's voice upgrade is part of the broader advancements in AI technology currently taking place. Generative AI, which focuses on creating new content based on patterns and rules learned from existing data, continues to evolve and impact various sectors, including healthcare, finance, and autonomous vehicles. AI agents like ChatGPT are becoming more sophisticated, with large language models improving their ability to understand and generate human-like text. These advancements contribute to the overall enhancement of AI capabilities, making tools like ChatGPT more versatile and effective in engaging with users and providing valuable assistance. The future prospects for ChatGPT and similar AI technologies are promising. As AI continues to evolve and improve, we can expect to see more advanced features and capabilities that enhance user experience and expand the practical applications of these tools.
The ability to engage in more human-like conversations could increase user interaction and satisfaction, making AI-powered tools more appealing and accessible to a wider audience. There is also speculation about the potential for emotional attachment to AI as interactions become more personalized and lifelike. Integration with other technologies, such as augmented reality devices like the Ray-Ban Meta glasses, could further expand the utility and appeal of AI-driven tools.

For users considering upgrading to the paid version of ChatGPT, the new voice features may be a significant factor in the decision. The enhanced conversational experience and the novelty of the advanced voice mode can greatly contribute to personal enjoyment and engagement with the AI. While the practical utility of ChatGPT remains largely unchanged, the improved user experience and potential for future integrations may make the upgrade worthwhile for those seeking a more engaging and personalized interaction with AI technology.

ChatGPT's new voice upgrade represents a significant step forward in the development of conversational AI. While the core functionality remains largely unchanged, the advanced features and enhanced user experience offer a glimpse into the future potential of AI-powered tools, and we can expect to see more innovative and engaging applications as the technology continues to evolve, transforming the way we interact with technology and each other.
[13]
OpenAI's voice mode rolling out to all ChatGPT subscribers - here's who gets it first
If you have been considering getting a ChatGPT Plus membership, now may be the time. OpenAI's Advanced Voice Mode, one of the most highly anticipated features from OpenAI's Spring Launch event, is finally out of alpha and available to all ChatGPT Plus and Team users.

On Tuesday, OpenAI announced that it has started to roll out Advanced Voice Mode to ChatGPT Plus and Team users, offering them a smarter voice assistant that can be interrupted and respond to their emotions. The rollout also features five new voices -- Arbor, Maple, Sol, Spruce, and Vale -- available in both Standard and Advanced Voice Mode. OpenAI says the rollout to Plus and Team users will be gradual, with the Enterprise and Edu tiers following next week. Users will know they have been given access from a pop-up message next to the Voice Mode option within the ChatGPT interface.

Since the release of the alpha in July, OpenAI has used what it learned to improve Advanced Voice Mode, giving it improved accents in foreign languages and better conversation speed and smoothness. Advanced Voice Mode also has a different look, now represented by an animated blue sphere. To make the experience even more tailored to the user, Advanced Voice Mode can now use Custom Instructions and Memory, which allow it to consider specific user-shared or designated criteria when producing a response. As with the alpha, users will not be able to access Voice Mode's multimodal capabilities, including assisting with content on users' screens and using the phone camera as context for a response.

OpenAI tested the voice capabilities with 100+ external red teamers across 45 languages to ensure the model's safety. In August, the startup published its GPT-4o System Card, a thorough report delineating the LLM's safety based on risk evaluations under OpenAI's Preparedness Framework, external red-teaming, and more, including Advanced Voice Mode.

You can become a ChatGPT Plus subscriber for $20 per month. Other membership perks include advanced data analysis features, unlimited image generation, five times more messages for GPT-4o, and the ability to create custom GPTs. One week after OpenAI unveiled this feature in May, Google unveiled a similar feature called Gemini Live, a conversational voice assistant supported by LLMs to improve the understanding and flow of a conversation. Earlier this month, Google made Gemini Live available for free to all Android users, so if you have an Android and want to experience this type of assistant, you may not need to shell out for the ChatGPT Plus subscription.
[14]
OpenAI is about to roll out ChatGPT Advanced Voice for Plus users
We get it -- some of you, just like our team here, haven't had the chance to try the ChatGPT Advanced Voice feature yet; but a recent leak confirms that OpenAI is about to roll it out for a select group of Plus users. So, soon you'll be able to compare those early demo videos with your own hands-on experience!

OpenAI's Advanced Voice mode, first demoed in May, has been stirring up excitement. This feature lets you chat with ChatGPT on your phone in a natural, back-and-forth conversation, even giving you the power to cut it off if it starts to ramble. It also handles complex questions with ease, offering in-depth responses. A recent leak, reportedly from the ChatGPT team, suggests that the feature is being rolled out in a limited alpha to select users. According to the email, access to this alpha phase, starting September 24, 2024, will be based on various factors, including participation invites and other testing criteria. In simpler terms: not everyone will get it just yet:

"Hi there, Thank you for reaching out and for your interest in the Advanced Voice mode! It's great to hear about your enthusiasm for our new features. As of now, access to Advanced Voice mode is being rolled out in a limited alpha to a select group of users. While being a long-time Plus user and having been selected for SearchGPT are both indicators of your active engagement with our platform, access to the Advanced Voice mode alpha on September 24, 2024, will depend on a variety of factors including but not limited to participation invitations and the specific criteria set for the alpha testing phase. Unfortunately, I don't have the ability to manually add users to the alpha testing list or provide specific insights into individual account access timelines. However, Plus users like yourself are among the first to receive access to new features, and we are planning for all Plus users to have access in the fall. Keep an eye on your email and app notifications, as any invitations or updates regarding access will be communicated through those channels. We truly appreciate your support and interest in being part of the early users for Advanced Voice mode. Your enthusiasm for our products helps us improve and expand our offerings. Best, OpenAI Team"

While OpenAI has promised that all Plus users will have access by the end of fall, this alpha rollout is a promising step toward the full release. Plus users, who pay $20 a month (or the equivalent in other regions), already get access to various LLMs, including the recently launched o1-preview, which has impressed many with its improved math-solving and reasoning skills.

OpenAI's voice feature has been in the spotlight for various reasons, including concerns about copyright and the recent rollout. Earlier this year, the company faced backlash over its "Sky" voice, which many users felt closely resembled Scarlett Johansson's. After feedback and Johansson's legal action, OpenAI pulled the voice, clarifying that Sky was voiced by a different actress. Now, Plus users are eagerly pressing OpenAI to accelerate the full rollout of Advanced Voice. Meanwhile, Apple Intelligence has yet to make its debut on iPhones, keeping iOS 18 users waiting for the anticipated AI-driven features. On the Android side, Google's Gemini AI is already making waves, with early access being rolled out, giving users a taste of advanced voice and assistant capabilities before the much-anticipated AI updates arrive.
[15]
OpenAI Finally Rolls Out Advanced Voice to ChatGPT Plus and Team Users
After a long wait, OpenAI has finally begun rolling out its highly anticipated Advanced Voice feature to all Plus and Team users in the ChatGPT app. The rollout is expected to be completed over the course of the week, with notifications appearing in the app once access is granted. The new feature introduces five distinct voices along with support for over 50 languages, allowing users to hear responses in different accents. Advanced Voice now also includes the ability to customise instructions, a feature that allows for greater personalisation. Users can set specific preferences for how the AI interacts via voice, enhancing the overall experience. While Advanced Voice offers a range of improvements, it is currently unavailable in the European Union, the UK, Switzerland, Iceland, Norway, and Liechtenstein. This rollout follows several key updates to the ChatGPT app, including the addition of Custom Instructions, Memory, and enhanced voice accents.

OpenAI released GPT-4o at its Spring Update event earlier this year, winning hearts with its 'omni' capabilities across text, vision, and audio. OpenAI's demos, which included a real-time translator, a coding assistant, an AI tutor, a friendly companion, a poet, and a singer, soon became the talk of the town. However, its Advanced Voice Mode wasn't released at the time.

Recently, Kyutai, a French non-profit AI research laboratory, launched Moshi, a real-time native multimodal foundational AI model capable of conversing with humans in real time, much like what OpenAI's advanced model was intended to do. Hume AI introduced EVI 2, a new foundational voice-to-voice AI model that promises to enhance human-like interactions. Available in beta, EVI 2 can engage in rapid, fluent conversations with users, interpreting tone and adapting its responses accordingly. The model supports a variety of personalities, accents, and speaking styles and includes multilingual capabilities. Meanwhile, Amazon Alexa is partnering with Anthropic to improve its conversational abilities, making interactions more natural and human-like. Earlier this year, Google launched Astra, a 'universal AI agent' built on the Gemini family of AI models. Astra features multimodal processing, enabling it to understand and respond to text, audio, video, and visual inputs simultaneously.
[16]
ChatGPT's Advanced Voice is rolling out to subscribers | Digital Trends
OpenAI announced via Twitter on Tuesday that it will begin rolling out its Advanced Voice feature, as well as five new voices for the conversational AI, to subscribers of the Plus and Teams tiers throughout this week. Enterprise and Edu subscribers will gain access starting next week.

Advanced Voice, which runs on the GPT-4o model, allows users to forgo written text prompts and speak directly with the chatbot as they would another person. It was first announced at OpenAI's Spring Update event and released to a select group of ChatGPT Plus subscribers to beta test the system in July. Now, every paying subscriber will be able to try the feature for themselves. The company also unveiled five new voices for the chatbot: Arbor, Maple, Sol, Spruce, and Vale (you can listen to them right now). They'll be available in both Standard and Advanced Voice modes, joining the four voices -- Breeze, Juniper, Cove, and Ember -- that ChatGPT already offers. OpenAI also noted that, while video and screen sharing are not currently supported in Advanced Voice, those capabilities will roll out at a later date.

What's more, OpenAI is incorporating a pair of tools to bring Advanced Voice more in line with the rest of the text-based chatbot experience: Memory and Custom Instructions. When it first debuted, Advanced Voice could only reference information from the chat it was currently having. With the memory function, the AI will be able to recall details from previous conversations as well, reducing the need for users to repeat themselves. Similarly, custom instructions are designed to set ground rules for the model to follow when generating its responses. For example, you could dictate that any coding-based responses be presented in Python.

Plus and Teams subscribers will receive an in-app notification when the feature goes live on their account. Unfortunately, Advanced Voice is not available in the EU, the U.K., Switzerland, Iceland, Norway, and Liechtenstein. ChatGPT isn't the only AI that can converse directly with its users. Tuesday's announcement comes less than a fortnight after Google released its Gemini Live to all users -- including those on the free tier.
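The custom instructions idea is easy to picture in code. Advanced Voice's settings aren't a public API, but they behave much like the system message developers already send to OpenAI's chat models. Here is a minimal sketch of the "present coding responses in Python" rule above, using the openai Python SDK -- the instruction text and prompt are invented for illustration, and this is an analogy, not OpenAI's actual implementation:

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# A custom instruction acts like a standing system message that shapes
# every response in the conversation, not just a single prompt.
custom_instruction = (
    "Whenever you answer a coding question, present the code in Python "
    "and keep the explanation brief."
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": custom_instruction},
        {"role": "user", "content": "Show me how to reverse a string."},
    ],
)

print(response.choices[0].message.content)

The same pattern suggests why a voice instruction like "speak more slowly" can persist across an entire session: it is applied before every generation rather than attached to one request.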
[17]
ChatGPT's New Voice Assistant Is Here to Creep You Out
OpenAI's much-anticipated ChatGPT voice assistant is rolling out this week to all paying subscribers, and like a lot of AI features, it's a little creepy in action. Advanced Voice Mode (AVM) began making its way to users who pay for ChatGPT Plus this week, according to OpenAI. The company tweeted a video of the feature in action as it helps someone craft an apology to their grandmother for being late. The user switches gears and tells the voice assistant to make the apology in Mandarin, which it does on the fly. The company says AVM can offer the apology in more than 50 different languages.

OpenAI first revealed AVM back in May, and it caused quite a controversy, as the voice sounded very similar to Scarlett Johansson's. The actor threatened legal action: she had been approached by company CEO Sam Altman last year with an offer to be the voice of the feature and had warned the company not to use her voice. The voice used in the demo still has a hint of Johansson to it. AVM officially launched in July, but it was only made available to a select number of ChatGPT Plus subscribers. Tuesday's announcement is the start of a wider launch to subscribers and Team users this week, while Enterprise and Edu users will see the feature roll out to them next week. You'll know you have AVM when a notification appears in the app.

There are five new voices for AVM: Arbor, Maple, Sol, Spruce, and Vale, bringing the total number of voices to try out to nine. Along with the new voices, AVM can be customized to a degree. Custom Instructions in the Settings menu let you choose how you want the model to speak. This can include having it speak clearly and slowly, or having it address you by a certain name. It can even act as an interviewer if you want to practice for an interview. That said, expect some people to customize their AVMs in ways you wouldn't want to think about. OpenAI also says AVM has a variety of improvements in its conversational speed, how smoothly it talks, and its accents. It can also adapt to the tone of the conversation. Those who want to have a real chat with ChatGPT will have to pay for the Plus subscription, which starts at $20 a month.
[18]
OpenAI rolls out Advanced Voice Mode with more voices and a new look
OpenAI announced it is rolling out Advanced Voice Mode (AVM) to an expanded set of ChatGPT's paying customers on Tuesday. The audio feature, which makes ChatGPT more natural to speak with, will initially roll out to customers in ChatGPT's Plus and Teams tiers. Enterprise and Edu customers will start receiving access next week. As part of the rollout, AVM is getting a revamped design. The feature is now represented by a blue animated sphere, instead of the animated black dots that OpenAI presented during its showcase of the technology in May. Users will receive a popup in the ChatGPT app, next to the voice icon, when AVM has been made available to them. ChatGPT is also getting five new voices that users can try out: Arbor, Maple, Sol, Spruce, and Vale. This brings ChatGPT's total number of voices to nine (almost as many as Google's Gemini Live), alongside Breeze, Juniper, Cove, and Ember. You might notice all of these names are inspired by nature, which could be because the whole point of AVM is to make using ChatGPT feel more natural. One voice missing from this lineup is Sky, the voice OpenAI showcased during its Spring Update, which led to a legal threat from Scarlett Johansson. The actress, who played an AI system in the feature film Her, claimed that Sky's voice sounded a little too similar to her own. OpenAI promptly took Sky's voice down, saying it never intended to resemble Johansson's voice, despite several staff members making references to the movie in tweets at the time. Another feature missing from this rollout: ChatGPT's video and screen sharing that OpenAI debuted during its Spring update four months ago. That feature is supposed to let GPT-4o simultaneously process visual and audible information. During the demo, an OpenAI staff member showed how you could ask ChatGPT real-time questions about math on a piece of paper in front of you, or code on your computer screen. At this time, OpenAI is not offering a timeline for when it will launch these multimodal capabilities. That said, OpenAI says it has made some improvements since releasing its limited alpha test of AVM. ChatGPT's voice feature is allegedly better at understanding accents now, and the company says its conversations are smoother and faster as well. During our tests with AVM, we found that glitches were not uncommon, but the company claims that's now improved. OpenAI is also expanding some of ChatGPT's customization features to AVM: Custom Instructions, which allows users to personalize how ChatGPT responds to them, and Memory, which allows ChatGPT to remember conversations to reference later on. An OpenAI spokesperson says AVM is not yet available in several regions, including the EU, the UK, Switzerland, Iceland, Norway, and Liechtenstein.
[19]
OpenAI rolls out more realistic Advanced Voice Mode for premium users
OpenAI has started rolling out the advanced voice mode for premium users of its AI chatbot ChatGPT. The audio feature is touted to be more natural and immersive and will be released through the week. The AI firm has said that it's not yet available in the EU, or in Iceland, Norway, Switzerland, Liechtenstein and the U.K. After ChatGPT Plus and Teams customers, the feature will be released for Enterprise and Edu users from next week. CEO Sam Altman announced the feature on X saying, "Hope you think it was worth the wait." ChatGPT will also be getting five new voices that users can test, called Arbor, Maple, Sol, Spruce and Vale, making the total number of voice options nine.

While OpenAI had announced the advanced voice feature in May, the rollout was pushed back after actress Scarlett Johansson alleged that the demo voice called Sky was distinctly similar to her own in the 2013 film "Her." The company responded that it hadn't modelled the voice after Johansson's, but it paused using the voice after her legal counsel sent letters stating the company didn't have permission.

Premium subscribers need to open the ChatGPT app, where a notification will appear once they have been given access to the new feature. The user can then tap 'Continue' to proceed. Once they start a new chat, a sound wave icon will appear next to the microphone icon and the 'Message' field. A small bump sound will be heard after a few seconds, after which the circle in the middle of the screen transforms into a sky-blue and white animation, and the bot responds. Users can choose a different accent or speed up a conversation.
[20]
ChatGPT rolling out Advanced Voice Mode now -- here's what you need to know
The update ChatGPT Plus subscribers have been waiting for is finally here. OpenAI announced today that Advanced Voice Mode is available to ChatGPT Plus and Team tiers. This new feature promises a more natural and humanlike conversational experience, enhancing user interactions. We knew this was coming, and this advancement marks a significant step in improving voice interactions for conversational AI.

Advanced Voice Mode utilizes the new GPT-4o model, which combines text, vision, and audio processing for faster, more efficient responses. Unlike its predecessors, users can now experience real-time, emotionally responsive conversations with dynamic speech patterns, and the AI can even handle interruptions with ease. This advancement shows that OpenAI continues to pave the way for smoother, more fluid interaction as it leads the way in voice-based AI technology, though it has company from Gemini Live.

ChatGPT Plus users can expect enhanced personalization features, including customized instructions and superior memory, making each interaction more tailored to the user. These additional features ensure that the AI adapts to individual conversational preferences, making each session more intuitive and natural. As part of this rollout, OpenAI has introduced five new voices, available in both Standard and Advanced Voice Mode. These new voice options give users control over how they interact with the AI. The update is currently exclusive to ChatGPT Plus and Team users but will extend to Enterprise subscribers beginning next week. Those in the EU, UK, Switzerland, Iceland and Norway will have to wait a bit longer until the features are available within their region.

As part of ongoing improvements, OpenAI has enhanced accent recognition in popular foreign languages and improved conversational smoothness and speed. A refreshed design featuring a new animated blue sphere is part of the update to further enhance the experience of Advanced Voice Mode. Excluded from this launch are video and screen sharing features, although OpenAI has hinted at plans to introduce them in future updates.
[21]
Top Tech News: OpenAI Unveils New ChatGPT Feature, Advanced Voice Mode, Crypto Scammers Hijack OpenAI's Press Account on X for Malicious Gains
OpenAI announced via X that it plans to introduce Advanced Voice Mode to all its Plus subscribers, offering a more engaging, real-time conversation experience. The leaked information indicates that Advanced Voice Mode will start with ChatGPT Plus members, those who pay US$20 a month to access enhanced features like quicker replies and early access to new functionality. After its launch, users will be able to access the feature through the ChatGPT app, where they can enable voice input and select from various voice options. This announcement comes as the competition to develop the most sophisticated AI chatbot heats up. With major companies like Amazon, Apple, and Google incorporating AI into their voice assistants, OpenAI's Advanced Voice Mode could make ChatGPT stand out as a more adaptable tool. The company is using its AI expertise to enhance the interaction experience for ChatGPT users.
[22]
OpenAI Set to Launch Advanced Voice Mode on ChatGPT Soon
OpenAI is set to launch 'Advanced Voice Mode' on ChatGPT this Tuesday, September 24, 2024, according to a screenshot posted by a user on X. "As of now, access to Advanced Voice mode is being rolled out in a limited alpha to a select group of users. While being a long-time Plus user and having been selected for SearchGPT are both indicators of your active engagement with our platform, access to the Advanced Voice mode alpha on September 24, 2024, will depend on a variety of factors including but not limited to participation invitations and the specific criteria set for the alpha testing phase," read the message attached in the screenshot.

OpenAI released GPT-4o at its Spring Update event earlier this year, winning hearts with its 'omni' capabilities across text, vision, and audio. OpenAI's demos, which included a real-time translator, a coding assistant, an AI tutor, a friendly companion, a poet, and a singer, soon became the talk of the town. However, its Advanced Voice Mode wasn't released at the time. When OpenAI recently released o1, one user asked whether voice features would be launching soon. "How about a couple of weeks of gratitude for magic intelligence in the sky, and then you can have more toys soon?" replied Sam Altman, with a tinge of sarcasm.

Meanwhile, Kyutai, a French non-profit AI research laboratory, launched Moshi, a real-time native multimodal foundational AI model capable of conversing with humans in real time, much like what OpenAI's advanced model was intended to do. Hume AI recently introduced EVI 2, a new foundational voice-to-voice AI model that promises to enhance human-like interactions. Available in beta, EVI 2 can engage in rapid, fluent conversations with users, interpreting tone and adapting its responses accordingly. The model supports a variety of personalities, accents, and speaking styles and includes multilingual capabilities. Meanwhile, Amazon Alexa is partnering with Anthropic to improve its conversational abilities, making interactions more natural and human-like. Earlier this year, Google launched Astra, a 'universal AI agent' built on the Gemini family of AI models. Astra features multimodal processing, enabling it to understand and respond to text, audio, video, and visual inputs simultaneously.
[23]
New ChatGPT Voice Update: Everything You Need to Know
The new advanced voice mode for ChatGPT is now available to most paid users with ChatGPT Plus or Teams plans. This update significantly enhances the voice feature, making it more human-like and intuitive to use. The improved intonation, emotion, and cadence, along with lower latency, create a more engaging and natural conversation experience. The update also introduces a colorful new interface and five additional voices to choose from, providing users with more options to customize their interactions. However, the feature is rolling out gradually and is not yet available in certain European countries.

The primary goal of this update is to make interactions with ChatGPT more intuitive, versatile, and productive. By mimicking human speech patterns more closely, the AI can now engage in more natural conversations, opening up a wide range of use cases from professional training to brainstorming sessions. The enhanced voice mode has the potential to transform the way we interact with AI, making it a more integral part of our daily lives.

The new voice mode is currently accessible to users subscribed to the ChatGPT Plus or Teams plans. Free users do not have access to this feature, emphasizing the added value of the paid plans. The rollout is happening over a week, ensuring a smooth transition and allowing the team to address any potential issues promptly. It's worth noting, though, that the feature is not yet available in some European countries due to regulatory considerations. To use the new voice mode, you need an active Plus or Teams subscription. In the future, the voice mode may also be included in Enterprise and Education plans, further expanding its reach and potential applications.

The update's enhancements -- improved intonation, emotion, and cadence, lower latency, a refreshed interface, and new voices -- work together to create a more immersive and productive experience, allowing users to engage with ChatGPT in a way that feels more natural and intuitive. The enhanced voice mode opens up numerous possibilities across various fields, from professional training to brainstorming sessions. To support these applications, presets and prompts tailored to different professions are available, making it easier for users to integrate the voice mode into their specific workflows. As the technology continues to evolve, we can expect even more innovative use cases to emerge, further expanding the tool's capabilities and potential impact.

Looking ahead, the ChatGPT team is committed to continuously improving the voice mode and introducing new features to enhance its functionality. Upcoming updates, such as screen recording and camera usage, are highly anticipated and will further expand the tool's capabilities, making it an even more powerful and versatile asset for users across various industries and applications. By actively seeking feedback from the community and incorporating user suggestions, the team aims to create a product that truly transforms the way we interact with AI and enhances productivity across a wide range of sectors.
In conclusion, the new ChatGPT voice update represents a significant step forward in the field of AI-powered communication. With its enhanced human-like qualities, improved performance, and expanded range of voices and applications, this update has the potential to transform the way we interact with AI and open up new possibilities for productivity and innovation. As the technology continues to evolve, we can expect to see even more exciting developments in the near future, cementing ChatGPT's position as a leader in the AI industry.
[24]
ChatGPT's Advanced Voice Mode might roll out to select Plus users today
Key Takeaways:
- ChatGPT's Advanced Voice Mode feature is allegedly being rolled out to select Plus users today.
- A wider release is likely to happen this fall; however, OpenAI has yet to confirm a date.
- Google has beaten OpenAI to the punch, at least this time around, by launching Gemini Live ahead of its competitor.

Ever since ChatGPT launched in late 2022, it has constantly added new features that have made it somewhat of a viral sensation among users. One of these features is the Advanced Voice Mode, which the company debuted in May of this year. This capability essentially allows ChatGPT to hold more natural voice-based conversations and respond to the user with some degree of "emotion." Naturally, most users are eager to try it out.

Now, a recent leak on X suggests that a select group of ChatGPT Plus users might be getting access to this feature as early as today (September 24). The leak in question includes an email from OpenAI that mentions "access to the Advanced Voice mode alpha on September 24, 2024, will depend on a variety of factors including but not limited to participation invitations and the specific criteria set for the alpha testing phase." The email also notes that Plus users are "among the first to receive access to new features." So, it appears that while some Plus users might be added to the alpha testing group today, others will have to wait until later in the fall for access.

Although OpenAI has yet to confirm a date for the full release of Advanced Voice Mode, the addition of more users to the testing phase indicates a wider rollout might happen sooner rather than later. The timeline isn't surprising, given that OpenAI's competitor Google has already launched Gemini Live, the version of its chatbot with voice capabilities. While being a ChatGPT Plus member comes with several perks, the subscription isn't exactly cheap at $20 per month. However, it does give you access to a variety of LLMs and early features like Advanced Voice Mode. It remains to be seen whether OpenAI will keep Advanced Voice Mode exclusive to Plus users or follow Google's approach with Gemini Live and make it available to all users for free.
[25]
OpenAI Rolls Out Advanced Voice Mode to All Plus and Team Subscribers
OpenAI is expanding access to the Advanced Voice mode it first showed off in May. "Advanced Voice is rolling out to all Plus and Team users in the ChatGPT app over the course of the week," according to a tweet from OpenAI, which clarified later that the rollout is not happening in the EU, the UK, Switzerland, Iceland, Norway, and Liechtenstein yet, likely due to regulations there. The feature launched for select Plus subscribers in July, and promises to provide a natural-sounding back and forth with the AI. In a May demo with OpenAI employees, it easily understood what they asked it, and responded in real time with accurate, useful information, even if it was interrupted or the employee adjusted their query mid-sentence. Since then, OpenAI has added "Custom Instructions, Memory, five new voices, and improved accents." There were also improvements made to conversational speed. The new voices include two male and three female voices. To show off the improved accents, one of the female voices has an English accent and one of the male voices is Australian. The new voices increase the total number available to nine. Still missing is Sky, which drew the ire of Scarlett Johansson earlier this year and has since been disabled. In another tweet, OpenAI posted a video to show off custom instructions and memory. For custom instructions, there will be a spot in the settings where people can tell ChatGPT how they want the bot to talk to them. As an example, someone may want the bot to refer to them by a nickname or speak more slowly or more clearly. It'll also remember conversations and instructions you gave it so that it can reference those things again later. So, if you tell ChatGPT that you're from a specific town, it'll suggest stuff to do around town when you ask for recommendations later. Team and Plus users will get a notification when it's their turn to access advanced voice mode.
[26]
ChatGPT's Advanced Voice Mode is set to roll out tomorrow -- here's what we know
An exciting leak from OpenAI via X suggests a major new ChatGPT feature is coming as early as tomorrow. From what we know, the company will unveil Advanced Voice Mode to all Plus subscribers, providing a more interactive, conversational experience in real time. As a ChatGPT Plus user myself, I am practically giddy over the move to a more human-like AI interaction.

The leak suggests that Advanced Voice Mode will initially apply to ChatGPT Plus subscribers -- those who pay $20 per month to access enhanced features such as faster response times and priority access to new capabilities. Once launched, the feature will be accessible through the ChatGPT app, where users can opt in to activate voice input and choose from a variety of voice options. If you've used ChatGPT Voice, you already know that the conversation feels fairly realistic. Honestly, the voices aren't bad for a first pass. But I have noticed that they glitch a bit and, like ChatGPT text, the conversation needs some prompting to get the correct response. I have high hopes for the possibilities of Advanced Voice Mode. OpenAI has reportedly put substantial effort into fine-tuning the AI's voice creation, including varying tones and inflections, to ensure the conversation with ChatGPT is more personal and immersive. I'm happy to know that Advanced Voice Mode will integrate directly into the ChatGPT interface for a seamless transition between text and voice.

Live, real-time conversations are not new; we have seen similar advancements from Gemini Live and are anticipating more from Apple Intelligence's Siri. As AI-powered assistants continue to deliver more sophisticated conversations, the rivalry between competitors means users can anticipate that humanlike interactions will continue to be fine-tuned. This development comes at a time when the race to create the most advanced conversational AI is intensifying. With giants like Amazon, Apple and Google all integrating AI into their virtual assistants, OpenAI's Advanced Voice Mode could set ChatGPT apart as a more versatile tool. The company is leveraging its AI expertise to give ChatGPT users deeper engagement.

I foresee this feature beginning a shift in how we interact with digital assistants. The hands-free chat capabilities are ideal for multitasking and for more natural accessibility. Advanced Voice Mode will be particularly useful for tasks that require a more conversational approach, such as setting reminders, answering questions that require complex answers, or providing step-by-step instructions for everything from home repairs to recipes. As of now, the new voice feature will only be available to ChatGPT Plus subscribers, giving them exclusive access to cutting-edge AI capabilities. I have no doubt that if the rollout is successful and user feedback is positive, OpenAI may eventually extend this feature to the free tier or further integrate new voice models into business and enterprise solutions. For those already paying for Plus, this addition enhances the value of the subscription by adding another layer of convenience and interactivity. Just one more sleep until ChatGPT Plus users get to try it!
[27]
OpenAI Rolls Out New Voice Assistant to All Paid ChatGPT Users
OpenAI is releasing a much-anticipated new voice assistant to all paid users of its chatbot ChatGPT, four months after the artificial intelligence company first unveiled the feature at a product launch event. The San Francisco-based startup said Tuesday it has started rolling out the option, known as advanced voice mode, to ChatGPT Plus subscribers and users of its ChatGPT Team service for businesses. The company said Enterprise and Edu paid users will begin getting access to the feature next week. OpenAI first teased the voice product in May, showing how it could quickly respond to written and visual prompts from users with a spoken voice. But the next month, OpenAI delayed launching the option to work through potential safety issues. In July, OpenAI rolled out the feature to a limited number of its ChatGPT Plus customers. After the delay, OpenAI said the product would not be able to impersonate how other people speak. The company also said that it had added new filters to ensure the software can spot and refuse some requests to generate music or other forms of copyrighted audio.
[28]
Know Details of OpenAI's New Voice Assistant for ChatGPT Users
OpenAI has released a pioneering voice assistant feature that is available to all paid customers. Through this new voice mode, humans will interact more naturally and effectively with AI. Here is a comprehensive look at how this feature is bound to transform the way we interact with AI.

First off, the assistant now offers multiple voices: five new voices have been added, bringing the total to nine. All of these voices can be personalized to suit the user's preference for a more enjoyable conversation. Whether you wish for a smooth, calm, soothing voice or an energetic one, there is a voice for everyone. The new voices also have improved accents that sound more natural and realistic, which will be helpful for users from a variety of linguistic backgrounds. The aim is a more natural, real-time conversation with much better flow, making an interaction with ChatGPT similar to a chat with a human.

Probably the biggest leap in the new voice assistant is its ability to detect and interpret the emotion in a person's speech. This means it can tell whether the person is happy, sad, or frustrated, so the AI has a sense of how to respond in a way that fits the user's situation. For instance, if a user seems upset, it may offer comforting words or solutions, bringing a much more empathetic and supportive connection to the interaction. It represents significant strides toward meaningful and engaging conversations.
[29]
ChatGPT Advanced Voices Are Rolling out to More Accounts
OpenAI is rolling out its highly anticipated Advanced Voice feature to all ChatGPT Plus and Team users in select regions. The update will be rolled out throughout the week, so users in the regions where it's coming may need to wait for their turn.

The standout feature in this update is the addition of five new voices. These aren't just more standard English voices; Advanced Voice comes with improved accents in select foreign languages, which should improve the clarity of interactions with users. This includes changes to conversation speed and smoothness to make each conversation sound more natural. There is also a custom instructions feature that lets Plus and Team users tailor ChatGPT's behavior and responses; it sounds similar to Gems on Gemini, but it's a feature that affects all chats. It's an area where users can tell ChatGPT to speak more slowly or quickly, how to refer to the user, and more. It's essentially a way to make the assistant feel more personable and clear for each user. Finally, the Memory feature lets the AI recall previous conversations and maintain context over extended use. There's no need to sit by your ChatGPT and refresh; Plus and Team users will receive an in-app notification once they have access to Advanced Voice.

A notable (and somewhat funny) touch is Advanced Voice's ability to say the phrase "Sorry I'm late" in over 50 languages, a nod to the feature's delayed release. Advanced Voice is not yet available in certain regions, including the EU, UK, Switzerland, Iceland, Norway, and Liechtenstein. This will feel like yet another delay for people in those regions, but we don't doubt that availability there is on the roadmap. OpenAI has not given a timeline for these locations. Source: ChatGPT
[30]
OpenAI Begins Rollout of Advanced Voice to All Plus and Team Subscribers | PYMNTS.com
"While you've been patiently waiting, we've added Custom Instructions, Memory, five new voices, and improved accents," the company said in a Tuesday (Sept. 24) post on X. The feature is not yet available in the European Union, the United Kingdom, Switzerland, Iceland, Norway and Liechtenstein, OpenAI added in another post. Users can now choose from nine "lifelike output voices" for ChatGPT, with different tones and characters like "easygoing and versatile" and "animated and earnest," according to the company's Voice mode FAQ. It was reported July 30 that OpenAI was rolling out the alpha version of Advanced Voice Mode to a select group of ChatGPT Plus subscribers at that time and planned to begin a broader rollout to all premium users in the fall. To mitigate potential misuse of the feature, the company said at the time that it limited Advanced Voice Mode to preset voices created in collaboration with paid voice actors, so that it can't be used to impersonate specific individuals or public figures; implemented guardrails to block requests for violent or copyrighted content; and included filters to block requests for generating music or copyrighted audio, a move likely influenced by music industry legal actions against artificial intelligence (AI) companies. OpenAI had planned to roll the voice feature out in alpha in late June but said June 25 that it needed another month to do so. "For example, we're improving the model's ability to detect and refuse certain content," the company said at the time. "We're also working on improving the user experience and preparing our infrastructure to scale to millions while maintaining real-time responses." Many U.S. consumers are willing to pay for smart, reliable voice assistants, according to the PYMNTS Intelligence report, "How Consumers Want to Live in the Voice Economy." Twenty-two percent of Gen Z consumers are "highly willing" to pay more than $10 per month for a voice assistant that is as smart and reliable as a real person, per the report. The report also found that 54% of consumers would prefer using voice technology because it is faster than typing or using a touchscreen.
[31]
12 Amazing ChatGPT Advanced Voice Use Cases
The introduction of ChatGPT's Advanced Voice feature has opened up a world of possibilities, offering a range of applications that can transform the way we learn, communicate, and interact. This technology harnesses the power of artificial intelligence to provide an immersive and personalized experience, catering to the diverse needs of users across various domains. AI Foundations explores twelve remarkable use cases that showcase the potential of ChatGPT Advanced Voice in enhancing productivity, facilitating learning, and enriching personal interactions.

One of the most significant advantages of ChatGPT Advanced Voice is its ability to provide unlimited tutoring sessions. Students and lifelong learners can engage in continuous question-and-answer dialogues, delving deep into complex subjects and receiving comprehensive explanations. This feature transforms the learning experience, offering a virtual tutor that is accessible 24/7. Whether you're grappling with a challenging concept or seeking to expand your knowledge, ChatGPT Advanced Voice is there to guide you every step of the way.

ChatGPT Advanced Voice takes personalization to new heights by allowing users to tailor responses to their specific needs and preferences. Whether you require concise one-word answers or detailed professional advice, the AI can adapt its responses accordingly. This customization ensures that the information you receive is highly relevant and actionable, saving you time and effort in sifting through generic responses. With the ability to fine-tune the level of detail and complexity, ChatGPT Advanced Voice becomes an indispensable tool for individuals seeking targeted insights and guidance.

Preparing for job interviews can be a daunting task, but ChatGPT Advanced Voice offers an innovative solution. The interview simulation feature allows you to customize questions and scenarios, mirroring the specific roles and industries you're targeting. By engaging in realistic practice sessions, you can refine your responses, anticipate potential questions, and build confidence in your ability to articulate your skills and experiences. This immersive practice environment provides invaluable preparation, increasing your chances of success in actual interviews.

ChatGPT Advanced Voice recognizes that individuals have different learning styles and information retrieval needs. With the ability to modify the speed of responses, you can tailor the pace of the conversation to suit your preferences. When time is of the essence, you can opt for quick, concise answers that get straight to the point. Conversely, when a thorough understanding is required, you can request detailed explanations delivered at a slower pace. This flexibility empowers you to control the flow of information, ensuring that you absorb and retain knowledge effectively.

Learning and personal interactions don't always have to be serious. ChatGPT Advanced Voice introduces an element of fun and levity through its ability to generate humorous responses. Engaging in conversations with a touch of humor can make the experience more enjoyable and memorable. Whether you're looking for a quick laugh or want to lighten the mood during a study session, ChatGPT can deliver witty jokes and light-hearted responses that keep you engaged and entertained.
This feature adds a delightful dimension to your interactions, making the learning process more enjoyable and reducing stress.

In today's globalized world, the ability to communicate effectively across different accents and languages is crucial. ChatGPT Advanced Voice offers a powerful tool for language learners and international users through its accent variation feature. By practicing conversations with different accents, you can improve your listening comprehension and speaking skills in various languages. This exposure to diverse accents prepares you for real-world interactions, enhancing your ability to communicate confidently in multicultural settings.

Real-life conversations are rarely linear, and ChatGPT Advanced Voice mimics this dynamic nature through its interruption feature. You can interrupt the AI mid-sentence, redirecting the conversation or seeking clarification on a specific point. This ability to steer the dialogue ensures that the conversation remains relevant and engaging, allowing you to explore tangents or dig deeper into topics of interest. By allowing interruptions, ChatGPT Advanced Voice creates a more natural and interactive experience, replicating the flow of real-life conversations.

Breaking down language barriers is a key challenge in today's interconnected world. ChatGPT Advanced Voice offers a seamless solution through its real-time language translation feature. Whether you're communicating with non-native speakers or collaborating with international teams, the AI can translate conversations between different languages in real time. This feature enables smoother interactions, allowing you to bridge linguistic gaps and foster effective communication. With ChatGPT Advanced Voice, language ceases to be a barrier, opening up new opportunities for global collaboration and understanding.

The art of storytelling has the power to captivate, inspire, and educate. ChatGPT Advanced Voice takes this art to new heights with its storytelling capabilities. Whether you're an educator looking to engage your students or a parent seeking to entertain your children, the AI can craft compelling narratives that transport the audience to different worlds. With the ability to generate original stories or adapt existing ones, ChatGPT becomes a valuable tool for sparking imagination and fostering creativity. To truly bring stories to life, ChatGPT Advanced Voice offers voice variation in storytelling. By changing the tone, style, and emotional depth of the narration, the AI can create a more immersive and engaging experience. Whether it's a suspenseful thriller or a heartwarming tale, the voice variation feature adds a new dimension to the storytelling process, making narratives more memorable and emotionally resonant for the listener.

For sales professionals, the ability to deliver persuasive pitches and handle objections is crucial. ChatGPT Advanced Voice provides a valuable tool for sales script practice, allowing you to simulate various sales scenarios and refine your approach. By engaging in realistic dialogues, you can practice handling common objections, articulating value propositions, and closing deals effectively. This immersive practice environment helps you hone your sales skills, boosting your confidence and increasing your chances of success in real-world sales interactions.
Preparing for specific interactions or scenarios can be challenging, but ChatGPT Advanced Voice offers a solution through its role prompting feature. Whether you're a student preparing for a presentation or a professional gearing up for a crucial meeting, the AI can simulate real-life situations, allowing you to practice and refine your responses. By engaging in these simulated scenarios, you can anticipate potential challenges, develop effective strategies, and build confidence in your ability to navigate complex interactions.

The ChatGPT Advanced Voice feature represents a significant leap forward in AI-powered communication and learning. With its diverse range of use cases, from unlimited tutoring and custom instructions to interview practice and real-time language translation (approximated for developers in the sketch below), this technology has the potential to transform the way we acquire knowledge, interact with others, and navigate professional challenges. By harnessing the power of voice-based AI, individuals can enhance their skills, broaden their horizons, and unlock new opportunities for personal and professional growth. As we continue to explore the capabilities of ChatGPT Advanced Voice, it becomes evident that this technology is not just a tool but a catalyst for positive change, empowering users to reach new heights in their pursuits.
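Of these use cases, real-time translation is the one developers can most easily approximate outside the app using OpenAI's public endpoints: transcribe the speech, translate the text, then synthesize the result. A simplified, batch (not real-time) sketch in Python -- the file names and target language are placeholders, and the app's native speech-to-speech pipeline presumably works quite differently:

from openai import OpenAI

client = OpenAI()

# 1. Transcribe the spoken input (placeholder file name).
with open("question.mp3", "rb") as f:
    transcript = client.audio.transcriptions.create(model="whisper-1", file=f)

# 2. Translate the transcript with a chat model.
translation = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "Translate the user's text into Spanish."},
        {"role": "user", "content": transcript.text},
    ],
)

# 3. Speak the translation with a preset voice.
audio = client.audio.speech.create(
    model="tts-1",
    voice="alloy",
    input=translation.choices[0].message.content,
)
with open("answer_es.mp3", "wb") as f:
    f.write(audio.content)

Chaining three requests like this adds seconds of latency, which is exactly the gap a single multimodal model such as GPT-4o, processing audio natively, is designed to close.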
[32]
OpenAI released its advanced voice mode to more people. Here's how to get it.
OpenAI is broadening access to Advanced Voice Mode, a feature of ChatGPT that allows you to speak more naturally with the AI model. It allows you to interrupt its responses mid-sentence, and can also sense and interpret your emotions based on your tone of voice and adjust its responses accordingly. These features were teased back in May when OpenAI unveiled GPT-4o but they were not released until July -- and then only to an invite-only group. (At least initially, there seem to have been some safety issues with the model; OpenAI gave several WIRED reporters access to the voice mode back in May, but the magazine reported the company "pulled it the next morning, citing safety concerns.") Users who've been able to try it have largely described the model as an impressively fast, dynamic, and realistic voice assistant -- which has made its limited access particularly frustrating to some other OpenAI users.
Major tech companies are enhancing their AI voice assistants with celebrity voices and advanced features, as OpenAI relaunches its ChatGPT voice mode after addressing controversy.
In a bid to make AI more accessible and engaging, major tech companies are introducing celebrity voices and advanced features to their voice assistants. This move comes as OpenAI relaunches its controversial ChatGPT voice mode with significant improvements [1].
After a months-long pause due to controversy, OpenAI has finally rolled out its Advanced Voice Mode for ChatGPT Plus and Team subscribers [2]. The update includes Custom Instructions, Memory, five new voices, improved accents, and a refreshed design.
However, the feature is not yet available in certain regions, including the EU and UK [3].
Meta has announced the addition of celebrity voices to its AI chatbot, including Kristen Bell, John Cena, and Dame Judi Dench [4]. This move aims to make AI interactions more engaging and personalized.
Google is not far behind, rolling out its free AI-powered Gemini Live audio tool to English-language Android users. The tool promises to learn and adapt to users' conversational styles over time [3].
The competition in the AI voice assistant space is heating up, with each company trying to outdo the others.
The path to advanced AI voice assistants hasn't been smooth. OpenAI faced legal issues when one of their voice options resembled Scarlett Johansson's voice from the movie "Her," leading to a delay in the feature's rollout [2].
Despite the advancements, AI adoption among the general public remains relatively low. A Pew Research Center study found that only about a quarter of Americans have tried ChatGPT [1]. The introduction of voice features and celebrity voices might be a strategy to increase user engagement and adoption.
As AI voice technology continues to evolve, we can expect more human-like conversations, deeper personalization, and integration with other technologies such as augmented reality devices.
The race to create the most engaging and useful AI voice assistant is just beginning, and it's clear that major tech companies are betting big on voice as the future of AI interaction.