[Web10g-user] Maximum polling frequency with web10g

Sebastian Zander szander at swin.edu.au
Tue Mar 11 20:06:15 EDT 2014


Hi Chris,

Sorry for crashing into this thread, but we just stumbled over what 
looks like exactly the same problem.

We run a series of experiments. In each experiment we generate TCP 
traffic for a few minutes, with about 10 flows per experiment, plus SSH 
control traffic before and after. We use openSuSE with a vanilla Linux 
3.9.8 kernel with the 0.7 kernel patch and 2.0.7 userland code. We poll 
web10g often; the time between polls is 10-20ms. Our machines (with 4GB 
RAM and 1GB swap) crash after a number of experiments, on the order of 
50-100.

Different error messages, but at least some point to memory issues:
"Kernel panic - not syncing: Out of memory and no killable processes..."
"Kernel panic - not syncing: Watchdog detected hard LOCKUP on cpu 0"
I have screen dumps of two crashes, can send if you want them.

A quick and not very thorough look at memory consumption seemed to 
indicate we're losing memory somewhere, albeit very slowly. With the same 
kernel but without polling web10g we seem to have no issues. Since it 
takes a while to crash, as a workaround we reboot more often now...
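
One way to put a number on a slow loss like this is to sample 
/proc/meminfo between experiments and fit a straight line to the used 
memory over time. The following is only a minimal sketch in Python, not 
the script we actually used. The function names are made up for 
illustration, "used" is assumed to mean MemTotal minus MemFree, Buffers 
and Cached, and nothing in it is web10g-specific.

```python
import re
import time


def parse_meminfo(text):
    """Parse /proc/meminfo text into a dict of field name -> integer value.

    Most values are in kB; a few (e.g. HugePages_Total) are plain counts.
    """
    fields = {}
    for line in text.splitlines():
        m = re.match(r"^([^:]+):\s+(\d+)", line)
        if m:
            fields[m.group(1)] = int(m.group(2))
    return fields


def used_kb(fields):
    """Assumed definition of 'used': total minus free, buffers and page cache."""
    return (fields["MemTotal"] - fields["MemFree"]
            - fields.get("Buffers", 0) - fields.get("Cached", 0))


def sample(path="/proc/meminfo"):
    """Return one (timestamp in seconds, used kB) sample."""
    with open(path) as f:
        return time.time(), used_kb(parse_meminfo(f.read()))


def leak_rate_kb_per_s(samples):
    """Least-squares slope of used memory over time, in kB per second."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_u = sum(u for _, u in samples) / n
    num = sum((t - mean_t) * (u - mean_u) for t, u in samples)
    den = sum((t - mean_t) ** 2 for t, _ in samples)
    if den == 0:
        raise ValueError("all samples have the same timestamp")
    return num / den
```

Calling sample() once after each experiment and passing the collected 
list to leak_rate_kb_per_s() gives a rough growth rate. A slope that 
stays positive with polling enabled and close to zero without it would 
be consistent with what we saw, though it would not show where the 
memory goes.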

Unfortunately, I'm a bit limited on accessing the actual testbed at the 
moment due to experiments running, but let me know if you need more 
information.

Cheers,

Sebastian

On 12/03/2014 3:25 AM, rapier wrote:
> I'd like to apologize for the delay on this. Since you aren't subscribed
> to the list this ended up being held by our mailing list software. I've
> been swamped with grant writing work and getting web10g finalized.
>
> Could you tell me a bit more about the parameters of your experiment?
> How long is the tcp flow running? At what point do you run out of
> memory? Are there any error messages that you are seeing on crash? Which
> version of the kernel patch set and userland are you using?
>
> Let me know and we'll look into this. If you have specific code that you
> are using to run this experiment, would you mind sending it to me so I
> can try to recreate the problems?
>
> Chris Rapier
>
>
> On 2/24/14, 6:10 AM, Hadrien Hours wrote:
>> Hi,
>>
>> I am not sure whether this is the correct mailing list to ask this
>> question or not but that's the best I have found.
>>
>> I am currently using the latest version of web10g on Ubuntu 13.04
>>
>> I am trying to track the evolution of TCP state by polling TCP stack
>> parameters every 100 ms. When I do so, the machine crashes (memory
>> shortage, although the machine has 4GB).
>>
>> With the polling interval increased to 1s, the machine still ends up
>> crashing, but after a longer running time (up to several hours so far).
>>
>> Has anyone already experienced the same problem and found a
>> solution?
>>
>> Thank you very much !
>>
>> Hadrien
>> _______________________________________________
>> Web10g-user mailing list
>> Web10g-user at web10g.org
>> https://lists.psc.edu/mailman/listinfo/web10g-user
>>
>> To UNSUBSCRIBE visit https://lists.psc.edu/mailman/unsubscribe/web10g-user
>>


-- 
Sebastian Zander
http://caia.swin.edu.au/cv/szander/

