rsyslog performance

This is the place for developers to discuss bugs, new features and everything else about code changes.

Google Ads


Re: rsyslog performance

Postby rgerhards » Tue Sep 16, 2008 9:25 pm

dlang wrote:
rgerhards wrote:
dlang wrote:grabbing the snapshot of bcb97650683bbcb1ecdacd0c2a6052ef836d3eda doesn't work as it doesn't include the configure program. I'll unpack it on top of my 3.21.4 image to get it working


You don't need configure.

autoreconf -fvi

builds it for you.

I looked at the README, but didn't see this. I ended up unpacking it on top of 3.21.4 and doing a make clean then a configure.


It's not in README because that's nothing that's usually of interest for a user. But maybe it makes sense to add a note ;) On top of 3.21.4 should work fine.

Rainer
rgerhards
Site Admin
 
Posts: 3252
Joined: Thu Feb 13, 2003 11:57 am

Urgent Question?

  • Pulling out your Hair?
  • Wasting Time and Money?
  • Deadline Approaching?

Re: rsyslog performance

Postby rgerhards » Thu Sep 18, 2008 11:50 am

OK, I think there was good reason to be curios about these futex calls. It also looks like it was a good idea to look at the performance and the stability issue together. I found a small, but obviosly (at least performance-wise) far reaching thread synchronization this morning. I have now merged its fix into the perf[ormance improvement] git branch. And now, at least in my environment, the futex calls look much, much more than what I expected. So I think what we saw was evidence of a data race. As a side-effect, performance should now also have been improved a bit (those futex calls were very quick).

I'd appreciate if you could give the version a try. So far, it is in the "perf" branch only, I am a bit hesitant at this stage to merge it into master. Some feedback from others would be preferred before doing so ;)

Rainer
rgerhards
Site Admin
 
Posts: 3252
Joined: Thu Feb 13, 2003 11:57 am

Re: rsyslog performance

Postby dlang » Thu Sep 18, 2008 12:02 pm

I'm just getting home (about 4am local time) so I won't try to test this before you call it a night. but I should have a report for you by the time you get up again
dlang
Frequent Poster
 
Posts: 126
Joined: Mon Sep 15, 2008 7:44 am

Re: rsyslog performance

Postby rgerhards » Thu Sep 18, 2008 12:03 pm

dlang wrote:I'm just getting home (about 4am local time) so I won't try to test this before you call it a night.


;)

I'll probably post some more thoughts, be sure to review the thread. There are some ways to safe a bit more on performance. Have a good sleep ;)

Rainer
rgerhards
Site Admin
 
Posts: 3252
Joined: Thu Feb 13, 2003 11:57 am

Re: rsyslog performance

Postby rgerhards » Thu Sep 18, 2008 4:14 pm

I have done one more commit to the perf branch:

http://git.adiscon.com/?p=rsyslog.git;a ... 974f71835f

This removes the time() call on the dequeue side. However, there are some subleties (see source comments). Anyhow, I'd like to know what the overall effect of this change is.

Thanks,
Rainer
rgerhards
Site Admin
 
Posts: 3252
Joined: Thu Feb 13, 2003 11:57 am

Re: rsyslog performance

Postby dlang » Fri Sep 19, 2008 3:51 am

very interesting results
instead of one process at 100% cpu and the other at 80% cpu I get one at 100% and the second at ~15%

however, when I distrub things by doing and strace, the one that was at 15% goes to 100% (writer) and the other goes to ~80% with the total throughput dropping (about the right amount for the listener to be at 80% instead of 100%)

so you are in the right area, but it's not stable yet.

I also had a case where I did an strace of a thread just before it got the HUP signal and the thread didn't process it, so didn't release the fiel and open a new one.

one thing you could look at doing for signal handlers is to have the handler itself just set a flag variable and have the main loops of all the threads do the appropriate processing on their next pass through the loop. with multiple processes you would want to have the flag be a bitfield with the handler setting all bits and then each thread zeroing out their bit in the processing loop (this approach has the side benifit that you no longer care how many times you recieve a signal, you process it once when you can get to it, it also lets you skip any need to disable signals in the program)


with your question of how to deal with recording times that you mention in the commit., I'm not quite understanding it.

when you queue a message for output do you queue just the final string to be sent? or do you queue the Msg structure?
dlang
Frequent Poster
 
Posts: 126
Joined: Mon Sep 15, 2008 7:44 am

Re: rsyslog performance

Postby dlang » Fri Sep 19, 2008 4:25 am

I also did a test where I commented out the yield call and the writer ate a lot more cpu.

doing a quick strace doesn't show any specific problems, the reader in doing three futex calls (on different futexes) and the writer is doind one futex call
dlang
Frequent Poster
 
Posts: 126
Joined: Mon Sep 15, 2008 7:44 am

Re: rsyslog performance

Postby rgerhards » Fri Sep 19, 2008 8:21 am

dlang wrote:I also did a test where I commented out the yield call and the writer ate a lot more cpu.

doing a quick strace doesn't show any specific problems, the reader in doing three futex calls (on different futexes) and the writer is doind one futex call

I need to review the case that lead to introduction of this yield call. It may be that the real cause was what I had fixed yesterday. Anyways, I am very careful in changing this because I know it could lead to severe starvation situations under some circumstances. But... we are making progress, which is good news ;)

Rainer
rgerhards
Site Admin
 
Posts: 3252
Joined: Thu Feb 13, 2003 11:57 am

Re: rsyslog performance

Postby rgerhards » Fri Sep 19, 2008 1:57 pm

dlang wrote:one thing you could look at doing for signal handlers is to have the handler itself just set a flag variable and have the main loops of all the threads do the appropriate processing on their next pass through the loop. with multiple processes you would want to have the flag be a bitfield with the handler setting all bits and then each thread zeroing out their bit in the processing loop (this approach has the side benifit that you no longer care how many times you recieve a signal, you process it once when you can get to it, it also lets you skip any need to disable signals in the program)


Actually, this is what is done today. Except, that only the "main" thread reacts to signals, all others are blocked. The main thread is the one that is started by the OS and it coordinates all activities inside rsyslog (very few ;)).

dlang wrote:with your question of how to deal with recording times that you mention in the commit., I'm not quite understanding it.


Simply said: At that point in processing, I do not have all information I need to reconstruct a Unix timestamp. DST info is missing. Adding it requires additional changes.

dlang wrote:when you queue a message for output do you queue just the final string to be sent? or do you queue the Msg structure?

Conceptually, it depends. The main message queue must always queue the msg structure, because at this point we do not know what the various outputs expect. The action message queues can, with the current output module interface, store just the relevant string. However, they store the complete msg structure right now for two reasons:

1. reduced complexity
Until this is ultra-stable, I do not like to add additional code pathes where not strictly necessary

2. potential output interface change/improvement
there are ideas to enable output plugins to access message properties. If so, we need the structure in any way. So no point in doing code now which we may (may!) throw out tomorrow in the foreseeable future ;)

I hope this clarifies. Obviously, I am still puzzled about the HUP issue you mention plus some other bug reports you have seen on the mailing list. I am still consolidating things in the hope to come down to the root cause. I am still of the view that performance optimization and bugfixing this time closely relate to each other...

Rainer
rgerhards
Site Admin
 
Posts: 3252
Joined: Thu Feb 13, 2003 11:57 am

Re: rsyslog performance

Postby dlang » Sat Sep 20, 2008 1:55 am

rgerhards wrote:
dlang wrote:one thing you could look at doing for signal handlers is to have the handler itself just set a flag variable and have the main loops of all the threads do the appropriate processing on their next pass through the loop. with multiple processes you would want to have the flag be a bitfield with the handler setting all bits and then each thread zeroing out their bit in the processing loop (this approach has the side benifit that you no longer care how many times you recieve a signal, you process it once when you can get to it, it also lets you skip any need to disable signals in the program)


Actually, this is what is done today. Except, that only the "main" thread reacts to signals, all others are blocked. The main thread is the one that is started by the OS and it coordinates all activities inside rsyslog (very few ;)).
so how do the child threads handle the results of the signals? do they need to have something in their main loop? if so, it's not documented in the imtemplate file. If not it could be this mechanism that's getting tripped up
I hope this clarifies. Obviously, I am still puzzled about the HUP issue you mention plus some other bug reports you have seen on the mailing list. I am still consolidating things in the hope to come down to the root cause. I am still of the view that performance optimization and bugfixing this time closely relate to each other...

definantly
dlang
Frequent Poster
 
Posts: 126
Joined: Mon Sep 15, 2008 7:44 am

Re: rsyslog performance

Postby rgerhards » Mon Sep 22, 2008 1:38 pm

dlang wrote:so how do the child threads handle the results of the signals? do they need to have something in their main loop? if so, it's not documented in the imtemplate file. If not it could be this mechanism that's getting tripped up


They need not and they can not react to signals. HUP, as with sysklogd, is "just" an internal restart of the service. That means all threads are terminated, everything closed and then we go through the config and start up again those things that are now configured. It's actually a bad way of doing things, but there are so many scripts that depend on SIGHUP AND that expect that not only files are closed but the config is reloaded so that there seems to be no way around it.

An alternative may be to define another signal just for closing files, but the bad thing is that SIGUSR1 and SIGUSR2 are already taken for the debugging system...

Rainer
rgerhards
Site Admin
 
Posts: 3252
Joined: Thu Feb 13, 2003 11:57 am

Re: rsyslog performance

Postby rgerhards » Mon Sep 22, 2008 2:22 pm

Well, I have looked around a bit more. I could probably abuse SIGTTIN and SIGTTOU, so that I could spare one signal "just" for closing files. I think that's probably a smart addition in any case (avoiding all the overhead of a full restart). Now I "just" need to check if that's something that can be done rather easily (I don't want to do anything intrusive unless I've found the source of the annoying abort case).
rgerhards
Site Admin
 
Posts: 3252
Joined: Thu Feb 13, 2003 11:57 am

Re: rsyslog performance

Postby dlang » Tue Sep 23, 2008 10:27 am

rgerhards wrote:Well, I have looked around a bit more. I could probably abuse SIGTTIN and SIGTTOU, so that I could spare one signal "just" for closing files. I think that's probably a smart addition in any case (avoiding all the overhead of a full restart). Now I "just" need to check if that's something that can be done rather easily (I don't want to do anything intrusive unless I've found the source of the annoying abort case).

you do have a note that SIGUSR1 may not be used for debugging in the future

the problems I am having with rsyslog not cycling log files properly all seem to have to do with the HUP not killing all of the processes. even if you don't do a 'log rotate' signal, it may make sense to change the while(true) loops in all the threads to be while(!shutdown)
dlang
Frequent Poster
 
Posts: 126
Joined: Mon Sep 15, 2008 7:44 am

Re: rsyslog performance

Postby rgerhards » Tue Sep 23, 2008 10:31 am

dlang wrote:the problems I am having with rsyslog not cycling log files properly all seem to have to do with the HUP not killing all of the processes. even if you don't do a 'log rotate' signal, it may make sense to change the while(true) loops in all the threads to be while(!shutdown)


... that would not be a valid cure. It may be a tactical though. The reason is that the interface specification tells that input modules are terminated by being pthread_cancel()ed. So there is no valid reason the keep inside the while() loop after a HUP. If they are not shut down, something works other than designed, and we must find what causes this. From a tactical point of view, this may "solve" the current issues, but would probably move the problem to an area where we simply do not (yet) experience it. That would be a regression-bomb ;) So better fix the root cause, even if it is not easy...

Rainer
rgerhards
Site Admin
 
Posts: 3252
Joined: Thu Feb 13, 2003 11:57 am

Re: rsyslog performance

Postby rgerhards » Tue Sep 23, 2008 10:44 am

oh, and I forgot to mention that a while(!Shutdown) does not solve the case for an input to be sitting in a select() at time of shutdown. This is (one of) the root causes why the interface is as it is today.
rgerhards
Site Admin
 
Posts: 3252
Joined: Thu Feb 13, 2003 11:57 am

Google Ads


PreviousNext

Return to Developer's Corner

Who is online

Users browsing this forum: No registered users and 0 guests

cron