Tuesday, September 04, 2007

HttpWebRequest in C# for Web Traffic Simulation, Watch out for Expect 100 Continue and Nagle

Over the past few months we've been busy creating our scalable Intelligent Chat application using C# .NET 2.0 and AJAX. To load test the application, we wrote a simple traffic-generating Windows Forms application that makes asynchronous HTTP requests using the HttpWebRequest object. It is pretty straightforward and has surfaced many problems in our main application. We did come across a big "gotcha" in our simulator, though. Looking at the generated network traffic, we noticed that the simulator was sending "Expect: 100-continue" in the header of each request, after which the server would send a 100 Continue response to the client. Since this was not the behavior we wanted, we added the following line of code:
System.Net.ServicePointManager.Expect100Continue = false;
As indicated in the following blog posts:
HttpWebRequest and the Expect: 100-continue Header Problem
HttpWebRequest and Expect 100 Continue

After this change, however, the headaches started. Everything was fine when we tested in our development environment (IIS on Windows 2003), but when we deployed the simulator into the test environment (as far as I could tell, exactly the same server configuration), the simulator was so slow that requests would time out almost immediately.

After using Wireshark to view the network traffic in the test environment, we saw some strange behavior: the simulator appeared to send the headers of the POST request and then wait for the acknowledgment (ACK) from the server, which took more than 200 milliseconds, before sending the body of the POST. Thanks to our brilliant network engineer, we were able to determine that the client caused this behavior by using the Nagle algorithm for its requests. As we understand it, Nagle holds back a small outgoing segment until the data already sent has been acknowledged, so the body sat waiting on the server's ACK of the headers.

That is when we put one more line into our simulator:

System.Net.ServicePointManager.UseNagleAlgorithm = false;
It seemed to fix the problem in the test environment. I have no idea what is different in our development environment: some IIS configuration, or maybe the fact that our test environment servers have .NET 3.0 installed? It is still a mystery to me. I figured I would share our experiences with Expect 100-continue and the Nagle algorithm to perhaps save others some of the pain we went through debugging this issue.
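
For anyone who wants to see the two settings side by side, here is a minimal sketch of how they might be applied before building a request. This is not our actual simulator code: the class name SimulatorRequestSketch, the helper names ConfigureDefaults and CreatePost, and the form-encoded content type are all assumptions made up for illustration.

```csharp
using System.Net;

// Hypothetical helper class, not part of the original simulator.
public static class SimulatorRequestSketch
{
    // Process-wide defaults. They only affect ServicePoint objects created
    // after they are set, so call this before the first request is created.
    public static void ConfigureDefaults()
    {
        // Do not send "Expect: 100-continue" and wait before sending the body.
        ServicePointManager.Expect100Continue = false;
        // Do not let Nagle hold back small segments while waiting for an ACK.
        ServicePointManager.UseNagleAlgorithm = false;
    }

    // Builds (but does not send) a POST request for the given body.
    // The content type is an assumption; use whatever your application expects.
    public static HttpWebRequest CreatePost(string url, byte[] body)
    {
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
        request.Method = "POST";
        request.ContentType = "application/x-www-form-urlencoded";
        request.ContentLength = body.Length;
        return request;
    }
}
```

A caller would run ConfigureDefaults() once at startup, then write the body to the stream from GetRequestStream() (or BeginGetRequestStream() for asynchronous requests) and read the response as usual. If changing the defaults for the whole process is too broad, the same two properties also exist per endpoint on request.ServicePoint.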

Update September 7th, 2007:
We found the difference: the Windows 2003 server in our development environment is R2 with Service Pack 1 (SP1), while the Windows 2003 server in the test lab is R2 with Service Pack 2 (SP2). After viewing further network traces, this seems to stem from IIS sending a 100 Continue under SP1 even when we don't send the Expect: 100-continue header. What we then see is that the client waits a long time before sending the ACK back to the server (I suspect because it has already sent the entire body along with the initial POST headers).

This blog post points to a hotfix that addresses this issue in SP1:
HTTP.SYS, IIS, and the 100 continue

I've put together a simple table to show all the combinations related to 100 - Continue and Nagle, and the results of the test (it was just too hard to keep it all straight in my head):




| Results | Server sends 100 - Continue (Windows 2003 SP1 && XP IIS 5.1) | Client sends Expect: 100 - Continue | Use Nagle Algorithm |
|---|---|---|---|
| Fast (extra traffic sent) | 1 | 1 | 1 |
| Fast (extra traffic sent) | 1 | 1 | 0 |
| Fast (didn't send POST body; after the 100 - Continue it has something to send?) | 1 | 0 | 1 |
| Slow (waits before sending ACK to 100 - Continue) | 1 | 0 | 0 |
| Fast responses | 0 | 1 | 1 |
| Fast (extra traffic sent) | 0 | 1 | 0 |
| Slow (Nagle causes a ~200 ms wait for ACK) | 0 | 0 | 1 |
| Fast responses | 0 | 0 | 0 |

(Note: A zero in column one means Windows 2003 SP2)
