This post poses the question: Should Internet Explorer use Google’s malware list as well? Tests suggest that Microsoft’s malware list performs better than Google’s. There are, however, some items on the Google list that the Microsoft filter does not pick up.
I wanted to write this post because someone asked this question on Slashdot:
"Tests show that IE’s malware filter performs well against other browsers that use the Safe Browsing blacklist from Google. But wouldn’t IE’s filter be even more effective if it used both filter lists at the same time? And are the political obstacles to that really so insurmountable?"
Read on for the rest of a plan that seems a lot more than half-baked.
Most major browsers now come with a built-in blacklist of malware-infected or phishing websites, and display a warning if the user tries to access one of them. Internet Explorer 8 uses Microsoft’s SmartScreen filter, while Firefox, Safari and Chrome all use Google’s Safe Browsing API.
Recent tests from NSS Labs reported that IE’s filter blocked 81% of "socially engineered malware sites" from the lab’s sample, while Firefox, in second place, blocked only 27%, and other browsers trailed even further behind.
When NSS Labs ran a test of the different browsers’ efficiency at blocking phishing sites, IE and Firefox scored about the same, both blocking about 80% of the sites in the sample. These results left a lot of unanswered questions, such as why Firefox, Safari and Chrome got such different scores in both tests (since they supposedly all use the Safe Browsing blacklist), and why there was such a huge gap between IE’s and Firefox’s performance in the malware test but such close scores for the two browsers in the phishing test (the Google Safe Browsing API page says that the database is an attempt to list both malware and phishing sites, after all).
But I had a different question: Since Google allows anybody to use the Safe Browsing API, why doesn’t Internet Explorer use it as well, in conjunction with their own blacklist, so that a site will be blocked by IE if it’s present on either list? This would almost certainly increase the block rate for IE (unless the set of sites blocked by Safe Browsing was entirely a subset of the sites blocked by SmartScreen, which is extremely unlikely).
Google might well offer to service the queries for free, just for the prestige of being able to say that the Safe Browsing database provided protection for almost all major browsers on the market. (Microsoft’s SmartScreen team declined to comment on the record about their reasons for not using the Safe Browsing list in addition to their own database. I couldn’t get an official response from Google about what position they would have on Internet Explorer using the Safe Browsing list, although unofficially an employee said the team would probably be "delighted" if IE were to use it.)
It’s worth underlining what a strong statement Microsoft is making by not using the Safe Browsing list. They’re not just saying that their own list is better. They’re saying that the Safe Browsing list is of such low quality that adding it to their own product would actually make the product worse.
This is different from, for example, what McAfee and Symantec might say about each other’s anti-virus lists. Consider the set of all viruses that McAfee blocks and the set of all viruses that Symantec blocks. Let List X be the overlap: the huge swath of viruses that are blocked by both McAfee and Symantec. Then let List Y be the set of all viruses that are blocked by McAfee but not by Symantec, and let List Z be the set of all viruses that are blocked by Symantec but not by McAfee. (So McAfee blocks viruses in the set X+Y, and Symantec blocks viruses in the set X+Z.) Now, representatives from McAfee and Symantec will each say that their list is the better one, which they may or may not believe. But even McAfee is not claiming that List Z (the portion of the list that is blocked by Symantec but not by McAfee) is so worthless that McAfee wouldn’t incorporate it into their own product if they could get it for free. If Symantec allowed any anti-virus maker to download Symantec’s anti-virus signature database, then presumably McAfee would scratch their heads a bit about why Symantec would do this, but if they cared about giving their users maximum protection, they would incorporate it into their product as well (so that McAfee would then be blocking all viruses in the set X+Y+Z, instead of just the set X+Y as they were before).
Symantec doesn’t make it available for free, so McAfee doesn’t have the option of using it and the issue doesn’t come up. Other than each company claiming their product is the better one (which is par for the course for competitors), the two companies’ positions are not contradicting each other.
But consider the analogous situation for anti-malware lists, where X is the set of all sites blocked by both IE’s SmartScreen and the Google Safe Browsing API, Y is the set of all sites blocked by SmartScreen but not by the Safe Browsing API, and Z is the set of all sites blocked by the Safe Browsing API but not by SmartScreen. When Microsoft says that they don’t want to use the Safe Browsing list in addition to their own (that they would rather block just X+Y than block X+Y+Z), they’re implicitly estimating that List Z is of such poor quality (too great a risk of false positives) that it would be better not to block it at all.
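The X, Y, Z bookkeeping maps directly onto set operations. Here is a minimal Python sketch, using invented hostnames in place of the real blacklists. Neither SmartScreen nor the Safe Browsing API hands out its list as a plain set like this, so the sample data and the is_blocked helper are assumptions made purely for illustration.

```python
# Hypothetical sample data: these hostnames are invented for illustration
# and do not come from either real blacklist.
smartscreen = {"bad-a.example", "bad-b.example", "bad-c.example"}
safe_browsing = {"bad-b.example", "bad-c.example", "bad-d.example"}

x = smartscreen & safe_browsing  # List X: blocked by both
y = smartscreen - safe_browsing  # List Y: blocked only by SmartScreen
z = safe_browsing - smartscreen  # List Z: blocked only by Safe Browsing


def is_blocked(host, *blocklists):
    """Return True if the host appears on at least one of the given lists."""
    return any(host in blocklist for blocklist in blocklists)


# Blocking on either list is the same as blocking X+Y+Z.
combined = smartscreen | safe_browsing
assert combined == x | y | z

print(is_blocked("bad-d.example", smartscreen))                 # only X+Y consulted
print(is_blocked("bad-d.example", smartscreen, safe_browsing))  # X+Y+Z consulted
```

In this toy data, the one site on List Z slips past a browser that consults only the first list and is caught by a browser that consults both.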
In this case, Microsoft’s position really is contradicting that of Google, Firefox, Safari, and others who use the Google Safe Browsing API. To achieve the best tradeoff between user safety and convenience, should the sites on List Z — the set of sites on the Safe Browsing API blacklist but not on the SmartScreen blacklist — be blocked, or not? If the answer is Yes, then IE should use the Safe Browsing API in addition to their own SmartScreen list. If the answer is No, then Google should take the URLs in the Safe Browsing API list, run them through IE using some automated script, and then remove all the URLs that weren’t blocked by IE — in other words, remove all the URLs on List Z from the Safe Browsing blacklist. But I can think of no consistent set of assumptions that would lead one to recommend that both companies continue doing what they’re doing now — that IE should continue not to use the Safe Browsing API, and that Google should continue publishing the Safe Browsing API without trimming URLs that aren’t also blocked by IE. Microsoft is saying that the URLs on List Z should not be blocked; Google is saying that they should be.
(Note that this argument is independent of the relative weights that you assign to the benefit of blocking a genuinely malicious site, versus the cost of accidentally blocking a site which is not malicious. Different users might assign different values to these costs and benefits, and depending on what values they assign, those users would want different thresholds to be used in deciding whether to block a site or not. And Microsoft and Google have picked default thresholds that they estimate will meet the needs of the average user. But no matter what values you assign to the benefit of blocking a malicious site and the penalty for blocking a false positive, it’s still the case that blocking the sites on List Z either does increase the total cost/benefit score (in which case IE should block sites on the Safe Browsing list in addition to its own) or it doesn’t (in which case Google should remove sites from the Safe Browsing list that aren’t blocked by SmartScreen).)
I suspect, of course, that the answer is the former: that the sites on List Z, those which are blocked by the Safe Browsing API but not by SmartScreen, are probably approximately as likely to be malware as the rest of the sites on the list, and that it would make Internet Explorer safer if Microsoft augmented SmartScreen to use the Safe Browsing API as well. So why don’t they?
The answer is probably what people have been shouting out from the back of the classroom since the first paragraph: That for political reasons, Microsoft doesn’t want to be seen incorporating anything from Google into their own flagship application. It’s not news that a company would prefer to promote its products over its rivals’. But this goes beyond, for example, Microsoft bundling Internet Explorer with Windows instead of Google’s Chrome browser. Chrome and Internet Explorer do virtually the same thing, so it would look positively odd for Microsoft to promote both. But IE’s SmartScreen list and Google’s Safe Browsing list can be used simultaneously, providing more protection than either one by itself.
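To make the either/or concrete, here is a small Python sketch of that reasoning. The function name, its parameters, and the idea of scoring List Z as a simple weighted sum are all assumptions introduced for illustration; neither company has published how it weighs these factors.

```python
def recommendation(malicious_in_z, benign_in_z,
                   benefit_per_block, cost_per_false_positive):
    """Decide what a consistent policy says about List Z.

    All four inputs are hypothetical: counts of genuinely malicious and
    wrongly listed sites on List Z, and the weights a user assigns to
    each kind of outcome.
    """
    net_score = (malicious_in_z * benefit_per_block
                 - benign_in_z * cost_per_false_positive)
    if net_score > 0:
        # Blocking List Z helps, so IE should block it too.
        return "IE should add the Safe Browsing list"
    # Blocking List Z does not help, so nobody should block it.
    return "Google should drop List Z"


# Two users with very different weights still each get exactly one answer.
print(recommendation(900, 100, benefit_per_block=10, cost_per_false_positive=1))
print(recommendation(900, 100, benefit_per_block=1, cost_per_false_positive=50))
```

Whatever weights are plugged in, the function returns one of the two recommendations. No input yields "both companies should keep doing what they do now", which is the point of the argument.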
Still, Microsoft has already calculated that it would be an unwise move politically to use Google’s Safe Browsing list. So I’m not trying to second-guess the calculation that they made, based on data that was available to them at the time. Rather, I think that if some publicity can increase the political benefit that they could get from using Google’s Safe Browsing list in conjunction with SmartScreen (and increase the political cost of not using it), that might lead them to recalculate and make a different decision. To that end, let me raise up a banner that people can gather under if they want to:
Microsoft, we will not think any less of you if you use the Google Safe Browsing API in Internet Explorer in conjunction with the SmartScreen filter! We’ll give you credit for setting aside petty rivalries and using the technology of a competitor in order to make users safer.
The IE team’s blog post about the initial success of the SmartScreen filter, from March 2009, cited statistics showing 10 million malware blocks in the previous six months, and asked readers to think about those numbers in terms of their impact on real humans and the grief it saved them: "These are BIG numbers — each malicious download blocked helps prevent compromise of that user’s computer." Since then, Microsoft has released new statistics showing that SmartScreen has delivered about 70 million blocks since IE8 was officially released. Of course, not every one of those blocks made the difference between infecting a machine with spyware and keeping it clean (many users wouldn’t have downloaded or installed the software that the website was trying to send them), but the IE team is right to be proud anyway. However, that also means that if adding Safe Browsing support to IE resulted in only a small percentage increase in the filter’s effectiveness, it would mean several million additional malware blocks over the same period, and cumulatively tens of millions more in the years ahead. Isn’t that worth Microsoft forming an alliance with Google, especially if doing that would make them look good?
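As a rough back-of-the-envelope check on that claim, the sketch below applies a few improvement rates to the 70 million figure. The rates are hypothetical (no measured value exists for how much Safe Browsing would add), and the sketch assumes that additional blocks scale in direct proportion to the filter's effectiveness.

```python
# Figure cited above: about 70 million SmartScreen blocks since IE8's release.
BLOCKS_TO_DATE = 70_000_000


def additional_blocks(blocks, percent_gain):
    """Extra blocks from a given percentage gain (integer arithmetic)."""
    return blocks * percent_gain // 100


# Hypothetical gains, in percent; these are assumptions, not measurements.
for percent_gain in (1, 5, 10):
    extra = additional_blocks(BLOCKS_TO_DATE, percent_gain)
    print(f"{percent_gain}% gain: {extra:,} additional blocks")
```

Under these assumptions, even a 5% gain works out to 3.5 million additional blocks over the same period.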