[Web10g-user] Maximum polling frequency with web10g

Chris Rapier rapier at psc.edu
Tue Mar 11 21:27:37 EDT 2014


No need to apologize. This is the sort of information we need. I'll be 
discussing this with the team this week and running some of our own tests. 
I'll be back in touch late tomorrow to get more information.
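
For anyone on the list who wants to put a number on the slow memory loss 
described below, here is a minimal sketch, not part of web10g. It samples 
/proc/meminfo and fits a straight line to the reclaimable memory over time. 
The function names, the 60-second interval and the choice of 
MemFree + Buffers + Cached as the metric are all assumptions made for 
illustration (MemAvailable only appeared in Linux 3.14, so it is not present 
on a 3.9.8 kernel).

```python
import time


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a {field: value} dict (kB for sized fields)."""
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


def reclaimable_kb(info):
    # Assumption: approximate "available" memory on kernels without MemAvailable.
    return info.get("MemFree", 0) + info.get("Buffers", 0) + info.get("Cached", 0)


def slope(samples):
    """Least-squares slope of (seconds, value) pairs, in value units per second."""
    n = len(samples)
    if n < 2:
        return 0.0
    mean_t = sum(t for t, _ in samples) / n
    mean_v = sum(v for _, v in samples) / n
    num = sum((t - mean_t) * (v - mean_v) for t, v in samples)
    den = sum((t - mean_t) ** 2 for t, _ in samples)
    return num / den if den else 0.0


def sample_loop(interval_s=60.0, count=10, path="/proc/meminfo"):
    """Take `count` samples `interval_s` apart and return the trend in kB/s."""
    samples = []
    for i in range(count):
        with open(path) as f:
            info = parse_meminfo(f.read())
        samples.append((time.monotonic(), reclaimable_kb(info)))
        if i + 1 < count:
            time.sleep(interval_s)
    return slope(samples)
```

Calling sample_loop() once with polling enabled and once without gives two 
slopes to compare. A persistently negative value only in the polling run 
would be consistent with the reports below, though it does not show where 
the memory is going.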



On March 11, 2014 8:06:37 PM Sebastian Zander <szander at swin.edu.au> wrote:

> Hi Chris,
>
> Sorry for crashing into this thread, but we just stumbled over what looks 
> like exactly the same problem.
>
> We run a series of experiments. In each experiment we generate TCP 
> traffic for a few minutes, with about 10 flows per experiment, plus SSH 
> control traffic before and after. We use openSuSE with a vanilla Linux 
> 3.9.8 kernel, the 0.7 kernel patch and the 2.0.7 userland code. We poll 
> web10g often; the time between polls is 10-20 ms. Our machines (with 4 GB 
> RAM and 1 GB swap) crash after a number of experiments, on the order of 
> 50-100.
>
> The error messages differ, but at least some point to memory issues:
> "Kernel panic - not syncing: Out of memory and no killable processes..."
> "Kernel panic - not syncing: Watchdog detected hard LOCKUP on cpu 0"
> I have screen dumps of two crashes and can send them if you want.
>
> A quick and not very thorough look at memory consumption seemed to 
> indicate that we're losing memory somewhere, albeit very slowly. With the 
> same kernel but without polling web10g we seem to have no issues. Since it 
> takes a while to crash, as a workaround we reboot more often now...
>
> Unfortunately, my access to the actual testbed is a bit limited at the 
> moment because experiments are running, but let me know if you need more 
> information.
>
> Cheers,
>
> Sebastian
>
> On 12/03/2014 3:25 AM, rapier wrote:
> > I'd like to apologize for the delay on this. Since you aren't subscribed
> > to the list this ended up being held by our mailing list software. I've
> > been swamped with grant writing work and getting web10g finalized.
> >
> > Could you tell me a bit more about the parameters of your experiment?
> > How long is the tcp flow running? At what point do you run out of
> > memory? Are there any error messages that you are seeing on crash? Which
> > version of the kernel patch set and userland are you using?
> >
> > Let me know and we'll look into this. If you have specific code that you
> > are using to run this experiment, would you mind sending it to me so I
> > can try to recreate the problems?
> >
> > Chris Rapier
> >
> >
> > On 2/24/14, 6:10 AM, Hadrien Hours wrote:
> >> Hi,
> >>
> >> I am not sure whether this is the correct mailing list to ask this
> >> question or not but that's the best I have found.
> >>
> >> I am currently using the latest version of web10g on Ubuntu 13.04.
> >>
> >> I am trying to track the evolution of TCP state by polling TCP stack
> >> parameters every 100 ms. When I do so, the machine crashes (memory
> >> shortage, even though the machine is configured with 4 GB).
> >>
> >> With the polling interval increased to 1 s, the machine still ends up
> >> crashing, but after a longer running time (up to several hours so far).
> >>
> >> Has anyone already experienced the same problem and found a
> >> solution?
> >>
> >> Thank you very much !
> >>
> >> Hadrien
> >> _______________________________________________
> >> Web10g-user mailing list
> >> Web10g-user at web10g.org
> >> https://lists.psc.edu/mailman/listinfo/web10g-user
> >>
> >> To UNSUBSCRIBE visit https://lists.psc.edu/mailman/unsubscribe/web10g-user
> >>
>
>
> --
> Sebastian Zander
> http://caia.swin.edu.au/cv/szander/
