rsyslog performance

This is the place for developers to discuss bugs, new features and everything else about code changes.

Re: rsyslog performance

Postby dlang » Wed Oct 22, 2008 9:35 pm

interesting results from the testing.

when sending a stead ~78,500 logs/sec (roughly what the output can keep up with) the first second of a min is spread between the two files.

as time goes on it seems like more and more of that second makes it into the logs (more testing to follow)

but I am seeing

15616
62511

17015
61131

18342
29756

19742
57986

21305
56866

23062
55119

24350
53801

so it's getting ~99% of the logs from that second, but the time of the cutover seems to be drifting forward.

when I opened the sender up to 300,000 logs/sec it looses about 90% of the first second of logs after the restart.

are you going to modify the HUP so that it doesn't clear the queue? that would seem like the correct thing to do (you are wanting to roll the output file, but there's no need to change the queue at all)

the approach I would have taken would be to have the output module check to see if the FD exists when it goes to do a write, and if it's not initialize the FD. then a HUP just closes all the FDs. This would work with network connects as well as files (fixed and dynamic names)
dlang
Frequent Poster
 
Posts: 126
Joined: Mon Sep 15, 2008 7:44 am

Professional Services Information

  • Custom written rsyslog.conf?
  • Maintenance Contract?
  • Installation support?

Re: rsyslog performance

Postby dlang » Wed Oct 22, 2008 9:45 pm

oops, the report above was without setting the $HUPisRestart off option.

with this option enabled it seems to be working well, it's maintaining the queue across HUPs, and dosn't seem to be missing any noticable number of logs when it does the HUP.

as always, more testing continuing.
dlang
Frequent Poster
 
Posts: 126
Joined: Mon Sep 15, 2008 7:44 am

Re: rsyslog performance

Postby dlang » Wed Oct 22, 2008 11:23 pm

more testing. I set the message size to 1k and upped the queue size to 2M entries. I then flooded the box (sending at 300,000/sec), waited until the queue was full (at which point the receiving thread droppes to <10% cpu as it's waiting for the writing thread)

logs timestamp that the logs were rolled (so *1404 contains logs from roughly 14:03:00-14:03:59)

started flood
2334986 messages.20081022.1404
4523447 messages.20081022.1405
3954685 messages.20081022.1406
647508 messages.20081022.1407
1909078 messages.20081022.1408
510823 messages.20081022.1409
390953 messages.20081022.1410
stopped flood
3841329 messages.20081022.1411
4346636 messages.20081022.1412
5423597 messages.20081022.1413
3123059 messages.20081022.1414
77574 messages.20081022.1415
77368 messages.20081022.1416
76389 messages.20081022.1417
76222 messages.20081022.1418
76064 messages.20081022.1419
312420 messages.20081022.1420
2261019 messages.20081022.1421


I then tried setting the queue type to linked list instead of the default fixed array and got the following

started flood
4446808 messages.20081022.1428
2747456 messages.20081022.1429
513478 messages.20081022.1430
362899 messages.20081022.1431
halted flood
308304 messages.20081022.1432
272395 messages.20081022.1433
242726 messages.20081022.1434
218416 messages.20081022.1435
205826 messages.20081022.1436
192946 messages.20081022.1437
182845 messages.20081022.1438
176920 messages.20081022.1439
171213 messages.20081022.1440
165738 messages.20081022.1441
161121 messages.20081022.1442
157311 messages.20081022.1443
152684 messages.20081022.1444
147426 messages.20081022.1445
143677 messages.20081022.1446
139318 messages.20081022.1447
136504 messages.20081022.1448
128910 messages.20081022.1449
122839 messages.20081022.1450
120708 messages.20081022.1451
117602 messages.20081022.1452
115857 messages.20081022.1453
112884 messages.20081022.1454
110278 messages.20081022.1455
1904265 messages.20081022.1456
5918456 messages.20081022.1457
5887505 messages.20081022.1458
2450368 messages.20081022.1459

the fact that with no new logs arriving the output varies so drastically is concerning.
dlang
Frequent Poster
 
Posts: 126
Joined: Mon Sep 15, 2008 7:44 am

Re: rsyslog performance

Postby dlang » Wed Oct 22, 2008 11:49 pm

it turns out that the numbers above were with pthread_yield enabled (I thought the new option defaulted to off)

I disabled pthread yield and got the following

Started flood
1676923 messages.20081022.1534
1931218 messages.20081022.1535
3838667 messages.20081022.1536
stopped flood
6135610 messages.20081022.1537
6067789 messages.20081022.1538
5949201 messages.20081022.1539
1874775 messages.20081022.1540
dlang
Frequent Poster
 
Posts: 126
Joined: Mon Sep 15, 2008 7:44 am

Re: rsyslog performance

Postby dlang » Thu Oct 23, 2008 7:56 am

I tried expanding my testing to include relaying messages as well as just receiving them to disk and ran into problems

first performance, adding a destination to relay to via UDP dropped the performance down <4000 messages/sec

second when I tried to relay via tcp rsyslog has one thread go into a loop, eating all the CPU but not doing anything other than a select.

this is basicly the same symptom that we had early on with the klog problem

I'm giving up for tonight and heading home. when I get there I will check for updates and see if I can do any quick tests.
dlang
Frequent Poster
 
Posts: 126
Joined: Mon Sep 15, 2008 7:44 am

Re: rsyslog performance

Postby rgerhards » Thu Oct 23, 2008 8:03 am

dlang wrote:oops, the report above was without setting the $HUPisRestart off option.

with this option enabled it seems to be working well, it's maintaining the queue across HUPs, and dosn't seem to be missing any noticable number of logs when it does the HUP.

as always, more testing continuing.


Draining of the queue is a result of the restart. With the new option, there is no restart and thus the queue does not need to be drained. Note that the non-restart type of HUP is very lightweight and does not interfere much with processing. So in essence, it is hard to believe that it in itself causes any message loss which would not have happened anyhow. What it actually does is a single close and open.
User avatar
rgerhards
Site Admin
 
Posts: 2647
Joined: Thu Feb 13, 2003 11:57 am

Re: rsyslog performance

Postby rgerhards » Thu Oct 23, 2008 8:11 am

dlang wrote:I tried expanding my testing to include relaying messages as well as just receiving them to disk and ran into problems

first performance, adding a destination to relay to via UDP dropped the performance down <4000 messages/sec


That's interesting. Just for completness, can you post your config at this time?

dlang wrote:second when I tried to relay via tcp rsyslog has one thread go into a loop, eating all the CPU but not doing anything other than a select.

this is basicly the same symptom that we had early on with the klog problem

Are you talking about the input side on the TCP receiver? If I remember correctly, there was an issue with the TCP stack on that system. At least I did nothing to fix anything and then it worked for you...

In any case, I'll keep focused on the initial environment for now, because jumping into anything else while this is not finished will probably be counter-productive.
User avatar
rgerhards
Site Admin
 
Posts: 2647
Joined: Thu Feb 13, 2003 11:57 am

Re: rsyslog performance

Postby dlang » Thu Oct 23, 2008 8:12 am

#$ModLoad immark
#$MarkMessagePeriod 1200
#$ModLoad imuxsock
$ModLoad imudp
$UDPServerRun 514
$UDPServerTimeRequery 10
$MainMsgQueueSize 20000000
$MaxMessageSize 1k
#$MainMsgQueueType LinkedList
#$MainMsgQueueWorkerThreads 2
#$OptimizeForUniprocessor off
$AllowedSender UDP, 127.0.0.1, 192.0.0.0/8
$umask 0000
$FileGroup root
$FileOwner root
#$HUPisRestart off
$template TraditionalFormat,"%timegenerated% %HOSTNAME% %syslogtag%%msg:::drop-last-lf%\n"
*.* /var/log/messages;TraditionalFormat # log to a file in the traditional format
#*.* @192.168.255.3;TraditionalFormat # log to a file in the traditional format
#*.* @192.168.255.4;TraditionalFormat # log to a file in the traditional format
*.* @@192.168.255.4:514;TraditionalFormat # log to a file in the traditional format
dlang
Frequent Poster
 
Posts: 126
Joined: Mon Sep 15, 2008 7:44 am

Re: rsyslog performance

Postby dlang » Thu Oct 23, 2008 8:15 am

rgerhards wrote:
dlang wrote:second when I tried to relay via tcp rsyslog has one thread go into a loop, eating all the CPU but not doing anything other than a select.

this is basicly the same symptom that we had early on with the klog problem

Are you talking about the input side on the TCP receiver? If I remember correctly, there was an issue with the TCP stack on that system. At least I did nothing to fix anything and then it worked for you...

I think that this is the first time I have tried to use TCP on this system
In any case, I'll keep focused on the initial environment for now, because jumping into anything else while this is not finished will probably be counter-productive.
sounds good. I'll go back to an older version for one location I need to install in the next day or two. (relativly low volume, ~6000 logs/sec ;) )
dlang
Frequent Poster
 
Posts: 126
Joined: Mon Sep 15, 2008 7:44 am

Re: rsyslog performance

Postby rgerhards » Thu Oct 23, 2008 8:16 am

thanks, so it is really plain UDP syslog. Well, I'll have a least a peek at the code (but refrain from changing it until at least the lightweight HUP is done ;)). strace will probably be entertaining ;)
User avatar
rgerhards
Site Admin
 
Posts: 2647
Joined: Thu Feb 13, 2003 11:57 am

Re: rsyslog performance

Postby rgerhards » Thu Oct 23, 2008 8:19 am

mmhhh... very compact code, very few things done, the most expensive probably being the unavoidable sendto() call... so nothing obvious...
User avatar
rgerhards
Site Admin
 
Posts: 2647
Joined: Thu Feb 13, 2003 11:57 am

Re: rsyslog performance

Postby dlang » Thu Oct 23, 2008 8:54 am

dlang wrote:
rgerhards wrote:
dlang wrote:second when I tried to relay via tcp rsyslog has one thread go into a loop, eating all the CPU but not doing anything other than a select.

this is basicly the same symptom that we had early on with the klog problem

Are you talking about the input side on the TCP receiver? If I remember correctly, there was an issue with the TCP stack on that system. At least I did nothing to fix anything and then it worked for you...

I think that this is the first time I have tried to use TCP on this system.

I remembered on the drive home, this is that IPv6 problem.

I don't have any IPv6 IP addresses defined on the system, but i do have ipV6 support in the distro kernel. the versions before you started the changes did not have a problem, but about the time you added the capability to add more than one message to a queue at a run it loops in a select loop when the first message arrives.

sorry for the distraction.
dlang
Frequent Poster
 
Posts: 126
Joined: Mon Sep 15, 2008 7:44 am

Re: rsyslog performance

Postby rgerhards » Thu Oct 23, 2008 9:01 am

While I am careful with such wording, I tend to think this is a kernel issue. The select should not awake if there is nothing to process, and I do not see how I could prevent it from doing so. Mabye I am overlooking something obvious, but I actually have no clue what could be wrong. Plus, I do not see any problems in any of my environments...
User avatar
rgerhards
Site Admin
 
Posts: 2647
Joined: Thu Feb 13, 2003 11:57 am

Re: rsyslog performance

Postby dlang » Thu Oct 23, 2008 11:52 am

rgerhards wrote:While I am careful with such wording, I tend to think this is a kernel issue. The select should not awake if there is nothing to process, and I do not see how I could prevent it from doing so. Mabye I am overlooking something obvious, but I actually have no clue what could be wrong. Plus, I do not see any problems in any of my environments...

it's possible, but the system that I run into problems on had no problems with previous builds, and the new build has no problems on other systems that don't have IPv6. so I suspect that what's happening is that with imudp trying to listen to multiple sockets (one IPv4, one IPv6) it's getting confused when a packet arrives, tries to read it from the wrong socket, then does a select again and repeating the loop.

I'll try to install a couple other distros to get IPv6 support with different kernels and see what happens (I disable Ipv6 in all the kernels I compile as I don't use it yet)

David Lang
dlang
Frequent Poster
 
Posts: 126
Joined: Mon Sep 15, 2008 7:44 am

Re: rsyslog performance

Postby rgerhards » Thu Oct 23, 2008 1:59 pm

FYI: I have just uploaded the "official" patch for the new leightweight HUP. It is in nextmaster.
User avatar
rgerhards
Site Admin
 
Posts: 2647
Joined: Thu Feb 13, 2003 11:57 am

Google Ads


PreviousNext

Return to Developer's Corner

Who is online

Users browsing this forum: No registered users and 0 guests

cron