OK Google, show me the money!

04th August 2021

August 4 2021: The global business of (smart) voice assistants is expected to reach $19.4 billion in transaction value by 2023, finds a recent study by Juniper Research.
In 2021, voice assistants may generate $4.6 billion in transactions. Transaction value should grow by over 300% in the next two years, and smart speakers may also see an increase of over 50% in their installed base, finds the study.
A complementary Juniper whitepaper examines possible use cases and pathways to monetisation for voice assistant platforms, and suggests how the market will grow. We bring you highlights:
Usage of voice commands in general has increased since the COVID-19 outbreak, with 52% of voice assistant users saying they use voice tech several times a day or nearly every day, compared to 46% before the outbreak.
The popularity of voice assistants is driven by their ability to facilitate touch-free human–computer interactions in a natural and intuitive way, similar to conversations between human beings.
However, the global COVID-19 pandemic caused a significant decline in hardware manufacturing operations during 2020, which, alongside the suspension of some ecommerce platforms (due to localised lockdowns imposed by many governments), negatively impacted the voice assistant industry on the hardware side. Moreover, ongoing problems with software interoperability and globally available connectivity, along with security and privacy concerns, will continue to act as partial growth restraints going forward.
Juniper Research considers a digital voice assistant to be an artificial intelligence-powered software programme designed with the intent of filling some or all of the role of a personal assistant, which is given its instructions by the user through the medium of voice interactions.
Vendor Landscape
The mass adoption of artificial intelligence in users’ everyday lives is fuelling the rapid shift towards voice applications. IoT devices and ever-expanding use cases are giving voice assistants tangible value in a connected user’s life. Microphones are everywhere and access to voice assistants has become ubiquitous, with B2C and B2B companies eager to monetise this trend.
Opportunities for voice technology arising from the global COVID-19 pandemic have also brought more players into the voice space. As the pandemic created a need for less touch and more voice applications, more organisations emerged to offer these capabilities.
The vendor landscape is increasingly fragmented, with a growing number of task-specific assistants. Generally speaking, the voice assistant landscape is divided into four categories of vendors:
• Task-specific niche assistants such as Aider that provide very narrow capabilities.
• Branded in-house assistants such as those offered by BBC and Starbucks.
• White-label solutions such as Houndify that provide lots of capabilities and configurable tool sets.
• General-purpose platforms including tech giants Amazon, Google, Baidu, Samsung and Apple.
Thanks to their general purpose/broad capabilities platform approach, Amazon, Google and Apple remain market leaders in the overall VA space.
Key Participants & Value Chain Model

Digital voice assistants are a classic example of multi-player products: technological platforms that enable the interaction of two or more stakeholders. The rapid spread of digital assistants is prompting many companies to increase their investments in the field. However, while designing and developing a single service for these platforms is becoming more easily achievable from a technological point of view, monetisation models in this area present many companies with significant challenges.
Voice assistants’ ecosystems are usually complex. The technology provider of the voice assistant acts as a platform operator. Basic user groups consist of the end users of services and processes, the providers of those services and processes, and the companies that provide the applications through which users access them. Although the providers of processes and services are often also the ones who provide the application, there are also platforms on which software developers offer the services and processes of other users. The complexity of these interlocking levels makes it difficult to achieve straightforward monetisation.
As the market for voice assistants matures, multi-faceted revenue models emerge. Monetisation models need to fit the usage scenario. Some scenarios lend themselves well to subscriptions, while others are better for one-time purchases. Others yet can be used to drive customers to another monetisation platform (such as mobile apps). As the investment in voice assistants grows, so does the desire to show more of a direct impact on the bottom line rather than relying solely on increased customer satisfaction and growing NPS scores. In this section, we will look at the most important participants in the value chain, and the different business strategies that allow for successful monetisation of voice assistant products and services.
Leading Use Cases
Voice is being used to add value to more use cases than ever before, from smart speakers and virtual assistants in office environments to machine interfaces in contact centres and automated ordering in drive-throughs. Organisations are looking for more intuitive and engaging methods of interacting with customers, and consumers are looking for convenience at almost any cost. Leading applications of voice technology include subtitling and closed captions, customer experience and analytics, media and entertainment applications, compliance, digital asset management, chatbots and medical transcription.
In the following section, five of the currently most prevalent use cases are discussed with examples of existing solutions.
Customer Experience & Analytics
The real change we have seen in 2020, with the world adapting to COVID-19, is the increased uptake of contactless engagement systems to avoid spreading germs. Speech has a big part to play here, and this is reflected, among other trends, in the large growth seen in chatbot usage.
The finance and retail sectors especially are progressively adopting more chatbot solutions to improve customer experience through voice data analytics. Customer calls can be transformed into valuable insights to help with practices such as issue resolution and providing agent knowledge bases. Turning the customer voice into text enables contact centres to analyse their call content and understand the mood, tone and overall sentiment of customers. This supports continuous improvements in customer experience. Other key drivers and motivations for contact centres adopting voice technology include improved productivity of support staff, increased customer satisfaction, reduced staffing needs, and improved agent-related KPIs and knowledge.
Finance and Retail
Between its inception in June 2018 and May 2019, Bank of America’s voice assistant Erica, for example, completed over 50 million client requests, engaging with more than 500,000 new users per month and doubling the ways in which clients can ask financial questions through the ongoing expansion of its conversational knowledge. Meanwhile, following up on its voice ordering pilot in the US in 2017, Starbucks has made voice ordering available on Alibaba’s ELE.me food delivery platform. Alibaba has even created a Starbucks-branded speaker, complete with the signature ‘Bearista’ on top. Alternatively, users with a Tmall Genie speaker can place an order with a nearby Starbucks and have coffee and food delivered to their location. In future, Starbucks may provide personalised recommendations for orders based on past user experience.
Hospitality
Hospitality is another sector heavily leaning on voice UIs, with travellers expecting increased safety and hygiene measures. Hospitality offers a large variety of voice assistant instances through different channels such as mobile apps, in-room control of smart devices, lights, and appliances as well as through information kiosks. With voice assistants, hotel rooms can become completely touchless - eliminating guest concerns about contact with common surfaces. Lighting, room temperature, TVs, drapes, and music can all be controlled through a custom, embedded voice assistant.
Hotels using Volara’s customised Google Nest and Amazon Echo smart speakers and smart displays, for example, now offer guests the option of digital tips. The company was the Alexa for Hospitality launch partner and the exclusive holder of the Amazon Alexa for Business Service Delivery Designation for the hospitality industry. Volara faces plenty of competition, however. Google is now also bringing Nest Hub smart displays and Google Assistant to hotel rooms, while SoundHound is already integrating a version of its Houndify platform into JBL smart speakers at hotels worldwide.
Media and Entertainment Personalisation
The huge volume of entertainment content created today by OTT services and film studios means that media providers will need to focus on content curation to manage their assets. Consumers will require additional tools and capabilities to help them navigate the plethora of content available to them. Content Streaming Services, film studios and TV channels have the opportunity to leverage automation tools like speech to text and transcription to unlock the keywords, themes, and other elements contained within their media content to further help with consumption and accessibility.
Voice interaction can reveal deeper levels of insight across a large volume of content and unlock metadata, providing enough insight to personalise the viewing experience. Done correctly, this service element of entertainment consumption has become a clear competitive advantage, and it offers an additional opportunity for engagement through recommendations. These solutions are built on an understanding of elements that can be extracted through speech to text and other machine-learning and AI-derived tools. For brands, this provides an opportunity to optimise customer retention and revenue.
Market Forecast Summary: Total Voice Assistant Transactional Revenue
eCommerce transaction values via voice assistants will reach $19.4 billion by 2023, rising from just $4.6 billion in 2021. The availability of voice assistant devices with screens will be imperative to increasing the monetisation of voice assistant commerce services, by increasing the efficiency of the checkout process.
Increasing the size and accessibility of the content domain libraries will be critical to increasing the number of transactions processed by voice assistant services. In turn, this will increase the value proposition of voice commerce to third-party retailers, and generate new revenue streams for voice assistant platforms.
Leaders in the voice assistants space, including Amazon, Apple and Google, are encouraged to open up their platform-based commerce services to third-party retailers, in addition to leveraging their own ecosystems to expand their monetisation capabilities. However, a key hurdle to attracting third-party retailers is the absence of a screen in many smart speakers; limiting the contextual information presented to users. Implementing omnichannel retail strategies, where users’ retail interactions are managed across multiple channels, is recommended to enable retailers to display further information on a product.
The global installed base of smart speakers will rise by over 50% between 2021 and 2023, assisting the adoption of monetisation strategies. While smartphone-based voice assistants will be dominant in usage terms, the rising number of standalone smart speakers means that the potential for commerce is growing rapidly, but this must be targeted with the right partnerships to achieve success.
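For readers who want to check the arithmetic behind the headline forecast, the short Python sketch below works out what a rise from $4.6 billion (2021) to $19.4 billion (2023) implies. The function name `growth_summary` and the two-year compounding assumption are ours, for illustration only; they are not taken from the Juniper study.

```python
def growth_summary(start_value, end_value, years):
    """Return (multiple, percentage increase, compound annual growth rate)."""
    multiple = end_value / start_value        # how many times larger the end value is
    pct_increase = (multiple - 1) * 100       # total increase over the period, in percent
    cagr = multiple ** (1 / years) - 1        # annual rate that compounds to the same multiple
    return multiple, pct_increase, cagr


# Figures quoted in the forecast summary, in billions of US dollars.
# Assumption: two full years of compounding between 2021 and 2023.
multiple, pct_increase, cagr = growth_summary(4.6, 19.4, 2)

print(f"Multiple: {multiple:.2f}x")
print(f"Total increase: {pct_increase:.0f}%")
print(f"Implied annual growth: {cagr:.1%}")
```

On these figures, transaction value ends up roughly 4.2 times its 2021 level, which means it would have to slightly more than double in each of the two years.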
------------------------------------------------------------------------------------------------------------------------
Users will generally use voice assistants to initially explore a product, before completing
the purchase via a device with a screen. Voice assistant platforms must ensure that the
user experience is so seamless that transactions are carried out via these platforms,
rather than requiring additional devices.
-------------------------------------------------------------------------------------------------------------------------