Wednesday, May 11, 2011

AdsBot BadBot

It was recently brought to my attention that the user agent string of the mobile version of Google AdsBot was causing an exception on a .NET site running Sitecore CMS. The stack trace is:

Stack trace:
   at System.Number.StringToNumber(String str, NumberStyles options, NumberBuffer& number, NumberFormatInfo info, Boolean parseDecimal)
   at System.Number.ParseInt32(String s, NumberStyles style, NumberFormatInfo info)
   at System.Web.Configuration.HttpCapabilitiesBase.get_MajorVersion()
   at Sitecore.UIUtil.SupportsInlineEditing()
   at Sitecore.Sites.SiteContext.SetDisableWebEditEditing(String value)
   at Sitecore.Sites.SiteContext.ProcessQueryString()
   at Sitecore.Sites.SiteContext..ctor(SiteInfo info, Boolean processQueryString)
   at Sitecore.Sites.SiteContextFactory.GetSiteContext(String hostName, String fullPath, Int32 portNumber)
   at Sitecore.Pipelines.HttpRequest.SiteResolver.ResolveSiteContext(HttpRequestArgs args)
   at Sitecore.Pipelines.HttpRequest.SiteResolver.Process(HttpRequestArgs args)
   at (Object , Object[] )
   at Sitecore.Pipelines.CorePipeline.Run(PipelineArgs args)
   at Sitecore.Nexus.Web.HttpModule.?(Object sender, EventArgs e)
   at System.Web.HttpApplication.SyncEventExecutionStep.System.Web.HttpApplication.IExecutionStep.Execute()
   at System.Web.HttpApplication.ExecuteStep(IExecutionStep step, Boolean& completedSynchronously)

I made a test project and after some time managed to get the same error from a plain ASP.NET site:

 [ArgumentNullException: Value cannot be null.
Parameter name: String]
   System.Number.StringToNumber(String str, NumberStyles options, NumberBuffer& number, NumberFormatInfo info, Boolean parseDecimal) +7470778
   System.Number.ParseInt32(String s, NumberStyles style, NumberFormatInfo info) +119
   System.Web.Configuration.HttpCapabilitiesBase.get_MajorVersion() +113
   UserAgentTest._Default.Page_Load(Object sender, EventArgs e) in d:\Basho\Documents\Visual Studio 2010\Projects\UserAgentTest\UserAgentTest\Default.aspx.cs:14
   System.Web.Util.CalliHelper.EventArgFunctionCaller(IntPtr fp, Object o, Object t, EventArgs e) +14
   System.Web.Util.CalliEventHandlerDelegateProxy.Callback(Object sender, EventArgs e) +35
   System.Web.UI.Control.OnLoad(EventArgs e) +99
   System.Web.UI.Control.LoadRecursive() +50
   System.Web.UI.Page.ProcessRequestMain(Boolean includeStagesBeforeAsyncPoint, Boolean includeStagesAfterAsyncPoint) +627

The user agent string in question is the following:
AdsBot-Google-Mobile (+http://www.google.com/mobile/adsbot.html) Mozilla (iPhone; U; CPU iPhone OS 3 0 like Mac OS X) AppleWebKit (KHTML, like Gecko) Mobile Safari

Google thinks this is normal:
http://adwords.google.com/support/aw/bin/answer.py?hl=en&answer=38197


It turns out that Microsoft .NET Framework 3.5 disagrees with Google's choice of user agent string. The problem arises when parsing the majorversion and minorversion capabilities defined in .NET's *.browser files. The browser is recognized as default/mozilla/gecko and both capabilities end up null, so HttpCapabilitiesBase.MajorVersion hands a null string to the integer parser and throws the ArgumentNullException shown above.

This problem is fixed in .NET Framework 4, where the browser is recognized as default/mozilla/safari and both values are 0.
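
For code you control yourself, one way to sidestep the exception on 3.5 is to read the raw capability string and parse it defensively instead of calling Request.Browser.MajorVersion. The following is only a minimal sketch: the BrowserVersion class and its method names are hypothetical (they are not part of .NET or Sitecore), and it assumes the capabilities indexer returns null for the missing value, which matches what I saw above. It does nothing for Sitecore's own call to MajorVersion, so the browser file further down is still needed there.

```csharp
using System.Globalization;
using System.Web;

// Hypothetical helper for illustration; not part of .NET or Sitecore.
public static class BrowserVersion
{
    // Parses a raw capability value.
    // Returns 0 when the value is null, empty, or not an integer.
    public static int ParseOrZero(string raw)
    {
        int value;
        return int.TryParse(raw, NumberStyles.Integer, CultureInfo.InvariantCulture, out value)
            ? value
            : 0;
    }

    // Reads "majorversion" through the capabilities indexer rather than the
    // MajorVersion property, so a null value never reaches the strict parser.
    public static int SafeMajorVersion(HttpBrowserCapabilities browser)
    {
        return browser == null ? 0 : ParseOrZero(browser["majorversion"]);
    }
}
```

In a page's Page_Load this would be called as BrowserVersion.SafeMajorVersion(Request.Browser) in place of Request.Browser.MajorVersion.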

Now if you're stuck with .NET 3.5, you can follow the ideas described in this post (you might need the refresh hack):
http://www.jasonlinham.co.uk/2009/06/sitecore-xhtml-validation-attribute.html

The browser file you need (it goes in the site's App_Browsers folder) looks something like this:
<browsers>
  <browser id="AdsBot" parentID="Gecko">
    <identification>
      <userAgent match="AdsBot" />
    </identification>
    <capabilities>
      <capability name="browser" value="AdsBot" />
      <capability name="majorversion" value="0" />
      <capability name="minorversion" value="0" />
    </capabilities>
  </browser>
</browsers>

While researching the problem I didn't find any pages describing it or offering a solution, which is why I wrote this post. I have also filed a report with Google; let's see what happens.
