Archive for February, 2011

Microsoft’s Bing uses Google search results—and denies it

01 Feb
By now, you may have read Danny Sullivan’s recent post: “Google: Bing is Cheating, Copying Our Search Results” and heard Microsoft’s response, “We do not copy Google's results.” However you define copying, the bottom line is, these Bing results came directly from Google.

I’d like to give you some background and details of the experiments that led us to understand just how Bing is using Google web search results.

It all started with tarsorrhaphy. Really. As it happens, tarsorrhaphy is a rare surgical procedure on eyelids. And in the summer of 2010, we were looking at the search results for an unusual misspelled query [torsorophy]. Google returned the correct spelling—tarsorrhaphy—along with results for the corrected query. At that time, Bing had no results for the misspelling. Later in the summer, Bing started returning our first result to their users without offering the spell correction (see screenshots below). This was very strange. How could they return our first result to their users without the correct spelling? Had they known the correct spelling, they could have returned several more relevant results for the corrected query.



This example opened our eyes, and over the next few months we noticed that URLs from Google search results would later appear in Bing with increasing frequency for all kinds of queries: popular queries, rare or unusual queries and misspelled queries. Even search results that we would consider mistakes of our algorithms started showing up on Bing.

We couldn’t shake the feeling that something was going on, and our suspicions became much stronger in late October 2010 when we noticed a significant increase in how often Google’s top search result appeared at the top of Bing’s ranking for a variety of queries. This statistical pattern was too striking to ignore. To test our hypothesis, we needed an experiment to determine whether Microsoft was really using Google’s search results in Bing’s ranking.
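The kind of statistical pattern described here can be illustrated with a simple agreement rate: the fraction of queries for which two engines return the same top result. The following is a minimal sketch only, not the analysis that was actually run. The function name, the data layout (a dict mapping each query to a ranked list of URLs) and the sample URLs are all assumptions made for illustration.

```python
def top_result_agreement(results_a, results_b):
    """Fraction of shared queries whose top-ranked URL is identical in both engines.

    Each argument is a hypothetical dict: query string -> ranked list of URLs.
    Queries missing from either engine, or with an empty result list, are skipped.
    """
    shared = [q for q in results_a
              if q in results_b and results_a[q] and results_b[q]]
    if not shared:
        return 0.0
    matches = sum(1 for q in shared if results_a[q][0] == results_b[q][0])
    return matches / len(shared)


# Made-up sample data; the URLs are placeholders, not real results.
engine_a = {
    "tarsorrhaphy": ["a.example/1", "a.example/2"],
    "eyelid surgery": ["b.example/1", "b.example/2"],
    "rare query": ["c.example/1"],
    "no results elsewhere": ["d.example/1"],
}
engine_b = {
    "tarsorrhaphy": ["a.example/1", "x.example/9"],
    "eyelid surgery": ["z.example/1", "b.example/1"],
    "rare query": ["c.example/1"],
    "no results elsewhere": [],
}

rate = top_result_agreement(engine_a, engine_b)
print(f"Top-result agreement: {rate:.2f}")
```

Tracking a number like this over time is one way a sudden jump in agreement could become visible, although a high rate alone does not show why two engines agree.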

We created about 100 “synthetic queries”—queries that you would never expect a user to type, such as [hiybbprqag]. As a one-time experiment, for each synthetic query we inserted as Google’s top result a unique (real) webpage which had nothing to do with the query. Below is an example:


To be clear, the synthetic query had no relationship with the inserted result we chose—the query didn’t appear on the webpage, and there were no links to the webpage with that query phrase. In other words, there was absolutely no reason for any search engine to return that webpage for that synthetic query. You can think of the synthetic queries with inserted results as the search engine equivalent of marked bills in a bank.
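The "marked bills" idea can be sketched in a few lines of code: pair each unrelated page with a unique nonsense query, then check whether that page later turns up for the same query somewhere else. This is an illustration under stated assumptions, not the tooling used in the experiment. The helper names, the fixed query length, the random seed and the example URLs are all hypothetical.

```python
import random
import string


def make_synthetic_query(rng, length=10):
    """Build a random lowercase string that no user would plausibly type."""
    return "".join(rng.choice(string.ascii_lowercase) for _ in range(length))


def plant_markers(pages, rng):
    """Pair each unrelated page with its own unique synthetic query."""
    markers = {}
    for page in pages:
        query = make_synthetic_query(rng)
        while query in markers:  # regenerate on the (unlikely) collision
            query = make_synthetic_query(rng)
        markers[query] = page
    return markers


def leaked_markers(markers, observed_results):
    """Return the queries whose planted page appears in another engine's results.

    observed_results is a hypothetical dict: query -> list of URLs returned.
    """
    return sorted(q for q, page in markers.items()
                  if page in observed_results.get(q, []))


rng = random.Random(2011)  # fixed seed so the sketch is repeatable
pages = [
    "https://theater.example/seating",
    "https://creditunion.example/",
    "https://jewelry.example/bling",
]
markers = plant_markers(pages, rng)
queries = list(markers)

# Pretend the other engine returned the planted page for the first query only.
observed = {
    queries[0]: [markers[queries[0]]],
    queries[1]: ["https://unrelated.example/"],
}
leaked = leaked_markers(markers, observed)
print(leaked)
```

Because nothing else links a synthetic query to its page, any match found by `leaked_markers` points back to the place where the pair was planted.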

We gave 20 of our engineers laptops with a fresh install of Microsoft Windows running Internet Explorer 8 with Bing Toolbar installed. As part of the install process, we opted in to the “Suggested Sites” feature of IE8, and we accepted the default options for the Bing Toolbar.

We asked these engineers to enter the synthetic queries into the search box on the Google home page and click on the results, i.e., the results we inserted. We were surprised that within a couple of weeks of starting this experiment, our inserted results started appearing in Bing. Below is an example: a search for [hiybbprqag] on Bing returned a page about seating at a theater in Los Angeles. As far as we know, the only connection between the query and result is Google’s result page (shown above).


We saw this happen for multiple queries. For the query [delhipublicschool40 chdjob] we inserted a search result for a credit union:


The same credit union soon showed up on Bing for that query:


For the query [juegosdeben1ogrande] we inserted a page of hip hop bling jewelry:


And the same hip hop bling page showed up in Bing:


As we see it, this experiment confirms our suspicion that Bing is using some combination of the Suggested Sites feature in Internet Explorer 8, the Bing Toolbar, or possibly some other means to send data to Bing on what people search for on Google and the Google search results they click. Those results from Google are then more likely to show up on Bing. Put another way, some Bing results increasingly look like an incomplete, stale version of Google results: a cheap imitation.

At Google we strongly believe in innovation and are proud of our search quality. We’ve invested thousands of person-years into developing our search algorithms because we want our users to get the right answer every time they search, and that’s not easy. We look forward to competing with genuinely new search algorithms out there—algorithms built on core innovation, and not on recycled search results from a competitor. So to all the users out there looking for the most authentic, relevant search results, we encourage you to come directly to Google. And to those who have asked what we want out of all this, the answer is simple: we'd like for this practice to stop.

Posted by Amit Singhal, Google Fellow
 
 

You Can Now Buy Up to 16 Terabytes of Storage from Google

01 Feb

Yesterday we told you how Google Docs is inching closer to being the mythical "Gdrive." Today Google announced that users can now buy up to 16 terabytes of storage for $4,096.00 per year. The storage can be used with Gmail, Picasa Web Albums and Google Docs. Those who don't need quite that much can choose from cheaper options:

  • 20 GB ($5.00 per year)
  • 80 GB ($20.00 per year)
  • 200 GB ($50.00 per year)
  • 400 GB ($100.00 per year)
  • 1 TB ($256.00 per year)
  • 2 TB ($512.00 per year)
  • 4 TB ($1,024.00 per year)
  • 8 TB ($2,048.00 per year)
  • 16 TB ($4,096.00 per year)
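A quick calculation shows that every tier in the list works out to the same per-gigabyte rate. The sketch below assumes 1 TB = 1,024 GB, which the power-of-two prices suggest but the announcement as quoted here does not state. The variable names are illustrative.

```python
# Plan sizes in GB mapped to yearly price in USD, taken from the list above.
# Assumption: 1 TB is counted as 1,024 GB.
plans_gb_to_usd = {
    20: 5.00,
    80: 20.00,
    200: 50.00,
    400: 100.00,
    1 * 1024: 256.00,
    2 * 1024: 512.00,
    4 * 1024: 1024.00,
    8 * 1024: 2048.00,
    16 * 1024: 4096.00,
}

# Yearly cost per gigabyte for each tier.
rates = {gb: usd / gb for gb, usd in plans_gb_to_usd.items()}

for gb, rate in rates.items():
    print(f"{gb:>6} GB: ${rate:.2f} per GB per year")
```

Under that assumption there is no volume discount: the largest plan costs the same $0.25 per gigabyte per year as the smallest.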

Sixteen terabytes may seem like an insane amount, but consider the benefit this will provide to media professionals who work with large video and audio files or architects and engineers working with 3D modeling software. However, there is still a limit of one gigabyte per file.

This offering, which comes as Mozy is reducing its users' storage limits, demonstrates the importance of storage to Google's long-term strategy. We've written previously about how storage plays into Google's plans:

The devices will be secondary in value. Storage will be critical. And that's what these big companies recognize.

The Cloud Storage blog refers to a 451 Group report, which projects that the cloud computing market will reach $16.3 billion by 2013. Storage will drive that growth.

Google's even going to bat for other companies over copyright issues to ensure the future viability of the cloud storage market.
