rsyslog performance

This is the place for developers to discuss bugs, new features and everything else about code changes.

Google Ads


rsyslog performance

Postby dlang » Mon Sep 15, 2008 7:46 am

I've got a couple projects I'm working on that rsyslog is looking like a
good answer to. I'm going to send questions about them in seperate
messages.

the first project is a traditional syslog server. I've been testing
various syslog implementations to find out what sort of performance they
can sustain.

I've gotten the (almost) standard sysklogd to sustail almost 30,000
messages/sec (udp) before it starts loosing significant numbers of logs

rsyslog is able to handle bursts of around 150,000 messages/sec, but it
doesn't seem to write them out very fast, and over time seems to be
limited to ~22,000 messages/sec

the hardware I am running this test on is insanely overpowered (four
dual-core opterons, 16G of ram, battery-backed cache on a raid card with
high-speed SAS drives). in production I will be working with slower
hardware, but I'm trying to find the limits of the software before I start
introducing lower hardware limits.

what can I look at doing to speed up the writing of data from the queues?
does rsyslog write the messages one at a time, or does it have an option
for writing them in batches (pulling a bunch off of the queue and sending
them all at once)?

some sort of batch mode would be critical for writes to a database where
you can frequently get 1000x speedups if you do the inserts as a single
transaction as opposed to individual transactions.

David Lang
dlang
Frequent Poster
 
Posts: 969
Joined: Mon Sep 15, 2008 7:44 am

Urgent Question?

  • Pulling out your Hair?
  • Wasting Time and Money?
  • Deadline Approaching?

Re: rsyslog performance

Postby dlang » Mon Sep 15, 2008 7:47 am

rsyslog.conf


$ModLoad immark
$MarkMessagePeriod 1200
$ModLoad imuxsock
$ModLoad imudp
$ModLoad imklog
$UDPServerRun 514
$AllowedSender UDP, 127.0.0.1, 192.0.0.0/8
$umask 0000
$FileGroup root
$FileOwner root
$template TraditionalFormat,"%timegenerated% %HOSTNAME% %syslogtag%%msg:::drop-last-lf%\n"
*.* /var/log/messages;TraditionalFormat # log to a file in the traditional format
dlang
Frequent Poster
 
Posts: 969
Joined: Mon Sep 15, 2008 7:44 am

Re: rsyslog performance

Postby dlang » Mon Sep 15, 2008 7:48 am

this is a reply from Rainer Gerhards, who asked me to move the discussion to here.

> what can I look at doing to speed up the writing of data from the
> queues?
> does rsyslog write the messages one at a time, or does it have an
> option
> for writing them in batches (pulling a bunch off of the queue and
> sending
> them all at once)?

The design is that each individual message is pushed to the output. The
output than writes the message. There has been discussion about a lazy
write, but nothing in this regard has yet been implemented.

Note that this is a good time to request new things. I am finished with
the last major thing (TLS) support and in front of the next one
(enhanced scripting), but scripting, I think, could be moved to a later
time in favour of something more important (like performance ;)).
>
> some sort of batch mode would be critical for writes to a database
> where
> you can frequently get 1000x speedups if you do the inserts as a
> single
> transaction as opposed to individual transactions.

I am not (any longer) so much a database guy. Can I really do a batch
insert with them? Can you point me to an API?
dlang
Frequent Poster
 
Posts: 969
Joined: Mon Sep 15, 2008 7:44 am

Re: rsyslog performance

Postby rgerhards » Mon Sep 15, 2008 7:51 am

Hi David,

thanks for your lightning-fast reply and moving the thread to over here :) One question I forgot to ask: which version of rsyslog are you currently using? And if it is not the current devel, would it be possible to move to that (because feature enhancements happen only in devel by policy - performance optimization is feature enhancement - it may break things that worked ;)).

Rainer
rgerhards
Site Admin
 
Posts: 3795
Joined: Thu Feb 13, 2003 11:57 am

Re: rsyslog performance

Postby rgerhards » Mon Sep 15, 2008 8:04 am

David,

one more question. During your tests, can you see if the bottleneck is CPU- or IO-related? I am wondering if there is something inside the (now quite complex) code path that leads to the generation of the output buffer or if it is the actual write process. I assume it is the later, but I would prefer to have as much proof as possible. I'll also start to review the file output writer. Any information you have is highly appreciated. I assume you would be willing to try out patches I create in order to speed up things (over a period of ... well ... time ;)).

Rainer
rgerhards
Site Admin
 
Posts: 3795
Joined: Thu Feb 13, 2003 11:57 am

Re: rsyslog performance

Postby rgerhards » Mon Sep 15, 2008 11:14 am

I have now looked at the (file writer) code and found the right location for a change that potentially could greatly impact performance. I would like to do a first quick and dirty test. This would result in not all messages being persisted at shutdown and messages being persisted only when a buffer is full, but it would provide good indication of the potential performance gain. Would you be able to test such a version (again, multiple tests are required until we reach our ultimate goal)?

Rainer
rgerhards
Site Admin
 
Posts: 3795
Joined: Thu Feb 13, 2003 11:57 am

Re: rsyslog performance

Postby dlang » Mon Sep 15, 2008 5:43 pm

I've been working with the stable version (3.18.3), not the development version. I'll try changing that and see what happens.

I'd be happy to test a new version

I don't know if the bottleneck is cpu or disk, in part becouse I don't fully understand how the multiple threads can operate. I have 8 high-speed cores so I am not running out of total CPU, but I could have a thread eating up one CPU entirely.
dlang
Frequent Poster
 
Posts: 969
Joined: Mon Sep 15, 2008 7:44 am

Re: rsyslog performance

Postby rgerhards » Mon Sep 15, 2008 5:45 pm

dlang wrote:I don't know if the bottleneck is cpu or disk, in part becouse I don't fully understand how the multiple threads can operate. I have 8 high-speed cores so I am not running out of total CPU, but I could have a thread eating up one CPU entirely.


Most of the logic is executed on multiple threads, only after the action itself is entered, we have a single thread. With just one action, There are at most 2 threads running concurrently, one doing the processing and one doing the action. That, of course, assumes that the processing threads waits on the action, so the action is slower. If the action is faster, multiple threads can be running at one time (but I doubt this with the configuration given).

Rainer
rgerhards
Site Admin
 
Posts: 3795
Joined: Thu Feb 13, 2003 11:57 am

Re: rsyslog performance

Postby dlang » Mon Sep 15, 2008 6:15 pm

when I am hammering it top shows rsyslog at ~144%, which probably means it's saturating one cpu and using half of another.
dlang
Frequent Poster
 
Posts: 969
Joined: Mon Sep 15, 2008 7:44 am

Re: rsyslog performance

Postby dlang » Mon Sep 15, 2008 6:31 pm

I told top to display threads and there is definantly one thread that's got the cpu pegged.

doing an strace on it, it looks like this is the thread that's recieving the logs messages from the network. I have a strace of that thread.

just after I looked at it the logs failed to roll, it looks like the old thread did not die off like it was supposed to, so the new thread is not getting the data to feed on into rsyslog
dlang
Frequent Poster
 
Posts: 969
Joined: Mon Sep 15, 2008 7:44 am

Re: rsyslog performance

Postby dlang » Mon Sep 15, 2008 6:33 pm

trying to upload the strace
Attachments
rsyslog.pegged.txt.bz2
trace of pegged process
(45.1 KiB) Downloaded 337 times
dlang
Frequent Poster
 
Posts: 969
Joined: Mon Sep 15, 2008 7:44 am

Re: rsyslog performance

Postby dlang » Mon Sep 15, 2008 6:45 pm

one problem I am seeing is that the writing thread is using sched_yield(), on linux boxes this is generally a bad idea. the implementation and effect of sched_yield can change over time, and may result in the kernel choosing to run the same thread immediatly (as if the call was a noop)
dlang
Frequent Poster
 
Posts: 969
Joined: Mon Sep 15, 2008 7:44 am

Re: rsyslog performance

Postby dlang » Mon Sep 15, 2008 7:50 pm

thoughts on performance (in part from looking at the strace)

there are three gettimeofday calls for each message recieved, one should be enough

it may be worth implementing a name lookup cache rather than calling out for every single lookup (there would need to be some way to flush the cache, say SIGHUP or SIGUSR1/2). Is there an easy way to disable name lookups to test this? (a quick calculation suggests that avoiding the system calls involved with the name lookup could come close to doubling the speed)

would it make sense to define SIGUSR1/2 as a signal to close and re-open outputs? SIGHUP tells it to re-read it's config file, which I think unloads and reloads plugins, a much more expensive task
dlang
Frequent Poster
 
Posts: 969
Joined: Mon Sep 15, 2008 7:44 am

Re: rsyslog performance

Postby dlang » Mon Sep 15, 2008 8:21 pm

dlang wrote:thoughts on performance (in part from looking at the strace)

it may be worth implementing a name lookup cache rather than calling out for every single lookup (there would need to be some way to flush the cache, say SIGHUP or SIGUSR1/2). Is there an easy way to disable name lookups to test this? (a quick calculation suggests that avoiding the system calls involved with the name lookup could come close to doubling the speed)


as a point of comparison, I just hacked sysklogd to not do any name lookups and instead just log the IP address (or existing name), and I got close to double the performance from it (~50k logs/sec instead of ~22K logs/sec)
dlang
Frequent Poster
 
Posts: 969
Joined: Mon Sep 15, 2008 7:44 am

Re: rsyslog performance

Postby rgerhards » Mon Sep 15, 2008 8:52 pm

dlang wrote:
dlang wrote:thoughts on performance (in part from looking at the strace)

it may be worth implementing a name lookup cache rather than calling out for every single lookup (there would need to be some way to flush the cache, say SIGHUP or SIGUSR1/2). Is there an easy way to disable name lookups to test this? (a quick calculation suggests that avoiding the system calls involved with the name lookup could come close to doubling the speed)


as a point of comparison, I just hacked sysklogd to not do any name lookups and instead just log the IP address (or existing name), and I got close to double the performance from it (~50k logs/sec instead of ~22K logs/sec)


This is on my long term todo list, but to do it right is more complex than it initially sounds. Most importantly, a cache must (should? ;)) make sure no expired DNS entries are used.

From what I have read in this thread, this is also not the current bottleneck. As you initially said, the problem seems to be rooted in the file writer. If I look a the CPU consumption, the 100% thread looks like the input, which is fine when running under big stress. But the remaining CPU is the rule engine and the action, and being far less than 200% (2 CPUs @ 100%), this means there is some non-CPU waiting involved. Under the circumstances described, this can only be IO bound. So I conclude that the problem is indeed the output, which emits a lot of relatively small write calls. I'll modify this code so that it does buffered writes (let's say with 64K buffers) and that should both improve performance and also get us higher CPU utilization. I'll try to do a partial implementation tomorrow. That will not be something fully correct, but should be a good test if my analysis goes into the right direction.

I'll comment on a few other things raised a bit later ;)
Rainer
rgerhards
Site Admin
 
Posts: 3795
Joined: Thu Feb 13, 2003 11:57 am

Google Ads


Next

Return to Developer's Corner

Who is online

Users browsing this forum: No registered users and 2 guests

cron