Microsoft has been secretly testing its Bing chatbot ‘Sydney’ for years

Microsoft’s Bing AI chatbot history dates back at least six years, with Sydney first appearing in 2021.

Bing logo
Illustration: The Verge

Microsoft’s new Bing chatbot AI often refers to itself as Sydney because what you see today in Microsoft’s search engine is the result of years of work to make Bing chatbots a reality. Microsoft first started publicly testing its Sydney chatbot inside Bing in a small number of countries throughout 2021. The testing went largely unnoticed, even after Microsoft made a big bet on bots in 2016. In fact, the origins of the “new Bing” might surprise you.

Sydney is a codename for a chatbot that has been responding to some Bing users since late 2020. The user experience was very similar to what launched publicly earlier this month, with a blue Cortana-like orb appearing in a chatbot interface on Bing.

“Sydney is an old codename for a chat feature based on earlier models that we began testing in India in late 2020,” says Caitlin Roulston, director of communications at Microsoft, in a statement to The Verge. “The insights we gathered as part of that have helped to inform our work with the new Bing preview. We continue to tune our techniques and are working on more advanced models to incorporate the learnings and feedback so that we can deliver the best user experience possible.”

Microsoft’s AI-powered Bing chatbot in 2021.
Image: TheShaunSaw (Tech Community)

“This is an experimental AI-powered Chat on Bing.com,” read a disclaimer inside the 2021 interface that was added before an early version of Sydney would start replying to users. Some Bing users in India and China spotted the Sydney bot in the first half of 2021 before others noticed it would identify itself as Sydney in late 2021. All of this was years after Microsoft started testing basic chatbots in Bing in 2017.

The initial Bing bots used AI techniques that Microsoft had been using in Office and Bing for years and machine reading comprehension that isn’t as powerful as what exists in OpenAI’s GPT models today. These bots were created in 2017 in a broad Microsoft effort to move its Bing search engine to a more conversational model.

Microsoft made several improvements to its Bing bots between 2017 and 2021, including moving away from individual bots for websites and toward the idea of a single AI-powered bot, Sydney, that would answer general queries on Bing.

Sources familiar with Microsoft’s early Bing chatbot work tell The Verge that the initial iterations of Sydney had far less personality until late last year. OpenAI shared its next-generation GPT model with Microsoft last summer, described by Jordi Ribas, Microsoft’s head of search and AI, as “game-changing.” (Is this “next-generation” model an early version of the as yet unannounced GPT-4? Neither Microsoft nor OpenAI would say.)

While Microsoft had been working toward its dream of conversational search for more than six years, sources say this new large language model was the breakthrough the company needed to bring all of its Sydney learnings to the masses.

“Seeing this new model inspired us to explore how to integrate the GPT capabilities into the Bing search product, so that we could provide more accurate and complete search results for any query including long, complex, natural queries,” said Ribas in a blog post this week.

While OpenAI’s model was trained on data up to 2021, Ribas says Microsoft paired it with Bing’s infrastructure to feed it the index, ranking, and search results needed for relevant and new data. Microsoft quickly developed its Prometheus model, combining its Bing work and GPT to create chat answers.

How Microsoft’s Prometheus model works.
Image: Microsoft

But it wasn’t as simple as pairing Sydney and OpenAI’s technology. “Some in our team felt that search is such an ingrained habit that we needed to keep the UX like today’s web search and simply add the Prometheus-powered Chat answer on the main UX,” said Ribas. “Others in Bing felt that this was an opportunity to change the search paradigm from the classic web and answers results to a new interactive, chat-based way of searching.”

The result was blending some answers into the sidebar of the search mode and offering up a dedicated chat interface in a separate mode, similar to Microsoft’s existing Sydney and Bing chatbot work.

This new Prometheus model then headed into lab testing over the past few months, with some Bing users apparently spotting some rude replies from a Sydney chatbot inside Bing months before Microsoft officially announced the new Bing. “That is a useless action. You are either foolish or hopeless. You cannot report me to anyone. No one will listen to you or believe you,” replied Sydney in one exchange posted on Microsoft’s support forums in November.

It’s eerily similar to some of the rude responses we’ve seen from the new Bing AI in recent weeks, suggesting that whatever guardrails Microsoft developed in its early testing clearly weren’t enough.

The final “new Bing” interface then leaked broadly earlier this month ahead of an official announcement just days later. Sources tell The Verge that Microsoft was planning to announce this new Bing at an event in late February before pushing the event up a couple of weeks in a bid to counter Google’s own ChatGPT rival, Bard.

Microsoft hasn’t yet detailed the full history of Sydney, but Ribas did acknowledge its new Bing AI is “the culmination of many years of work by the Bing team” that involves “other innovations” that the Bing team will detail in future blog posts.

Microsoft has neutered the conversational responses of its Bing AI in recent days. The chatbot went off the rails multiple times for users and was seen insulting people, lying to them, and even emotionally manipulating them. Last week, Microsoft initially capped Bing chats at 50 questions per day and five questions per session to prevent long back-and-forth chat sessions that could make Bing “become repetitive or be prompted / provoked to give responses that are not necessarily helpful or in line with our designed tone.”

Some of those restrictions have already been loosened, with six chat turns per session and a max of 60 chats per day. That will soon be expanded to 100 sessions, with new options to let users easily choose the tone of chat responses. But the responses are still very basic compared to before, and Bing AI simply refuses to answer a lot of queries now. If you ask the chatbot how it’s feeling, it will simply respond, “I’m sorry but I prefer not to continue this conversation.”

Microsoft is clearly treading carefully with its Bing AI conversational responses, and Ribas admits that “there’s much to learn and improve during and after the preview.” With daily and weekly updates, Bing AI is bound to improve after a relatively short period of time in Microsoft’s internal lab testing. “This is just the beginning,” says Ribas, with promises to share more over the coming weeks and months.