Almost same query two different results

General discussions here

Moderator: rgerhards

Google Ads


Almost same query two different results

Postby luismg » Tue Nov 29, 2016 9:16 pm

Hello I have a really poor performance on my MariaDB rsyslog server, I've been tesing several my.cnf variable options but it doesn't help.
The most impressive thing is this query

MariaDB [rsyslogdb]> SELECT id, devicereportedtime, facility, priority, fromhost, syslogtag, processid, infounitid, message FROM `SystemEvents` WHERE ( fromhost = 'IC1000F001' ) ORDER BY id DESC LIMIT 50;
50 rows in set (0.00 sec)
worst case 0.02 sec

And a simpler one takes way much longer.

SELECT id, devicereportedtime, fromhost, priority, message FROM `SystemEvents` WHERE ( fromhost = 'IC1000F001' ) ORDER BY id LIMIT 50 ;
50 rows in set (4 min 37.04 sec)

the fewer select causes I do I get the longer transaction time.
is there anything I am doing wrong?

if loganalyzer uses DESC it goes pretty well if DESC doesn't exist I am dead in time

thanks in advance
luismg
New
 
Posts: 7
Joined: Fri Nov 25, 2016 9:59 am

Urgent Question?

  • Pulling out your Hair?
  • Wasting Time and Money?
  • Deadline Approaching?

Re: Almost same query two different results

Postby dlang » Wed Nov 30, 2016 11:11 pm

I suspect that the difference is that recent log messages are still cached in RAM, so the database doesn't have to go to disk to fetch the last 50, but if you ask for the earliest 50 in the log, it has to fetch them from disk.

I'll bet that if you repeat the slow query, it will be much faster the second time.
dlang
Frequent Poster
 
Posts: 1001
Joined: Mon Sep 15, 2008 7:44 am

Re: Almost same query two different results

Postby luismg » Mon Dec 05, 2016 8:28 pm

could I set the date as an index? in that case searches with date will go extremly fast compared to how they go now, I plan to have a really big database, several TBs of data.
will that break rsyslog or loganalyzer?
I mean the variable devicereportedtime

regards
luismg
New
 
Posts: 7
Joined: Fri Nov 25, 2016 9:59 am

Re: Almost same query two different results

Postby dlang » Tue Dec 06, 2016 12:36 am

having a date on the index will let the database know what records it needs to retrieve faster, it will still need to retrieve them from disk rather than from RAM, so it won't solve all your speed problems.

rsyslog doesn't care how big your database is, it's only concern is that the database is fast enough to keep up :-) it's possible to have very large databases and have them fast, it's also possible for large databases to be slow, that's out of the scope of rsyslog [1]

I don't use loganalyzer, so I can't say for sure, but I suspect the database is what's going to matter here as well.

[1] personal opinion, I think the idea of storing everything in a database and then doing queries on it is very useful for when you have no idea what you are going t need ahead of time, but very broken for things that you know you are going to be interested in. I've seen very large and expensive Splunk installations killed because they were being used inappropriately.

I also question if MySQL is appropriate at that sort of scale, Splunk or ElasticSearch would seem to be far better options

https://www.usenix.org/publications/log ... dashboards
https://www.usenix.org/publications/login/april14/lang
dlang
Frequent Poster
 
Posts: 1001
Joined: Mon Sep 15, 2008 7:44 am

Re: Almost same query two different results

Postby luismg » Wed Dec 07, 2016 9:18 am

I've added two keys, fromhost and devicereportedtime
if I search using the advanced search button the fromhost is not exactly what I want so it takes ages
SELECT id, devicereportedtime, fromhost, priority, facility, message FROM `SystemEvents` WHERE devicereportedtime > '2016-12-06 09:00:00' AND devicereportedtime < '2016-12-06 09:10:00' AND ( fromhost LIKE '%XXX000F001%' ) AND infounitid IN (1) AND ( syslogtag LIKE '%10.161.7.240%' ) ORDER BY id DESC LIMIT 100

If i just filter the host on the main screen is way much faster almost instanct (because it is using the indexes)
SELECT id, devicereportedtime, fromhost, priority, facility, message FROM `SystemEvents` WHERE ( fromhost = 'XXX000F001' ) ORDER BY id DESC LIMIT 100

Image

Is there any change to the code that I can make so the advance search takes exactly what I imput?

kind regards
luismg
New
 
Posts: 7
Joined: Fri Nov 25, 2016 9:59 am

Re: Almost same query two different results

Postby luismg » Wed Sep 27, 2017 1:14 pm

I have to add that now my db is almost 900GB and the search is slow as hell.
Even connecting to the db
SELECT * FROM `SystemEvents` WHERE devicereportedtime BETWEEN '2017-09-26 16:00:00' AND '2017-09-26 22:00:00' AND ( fromhost = 'IXXX000F001' ) AND ( message LIKE '%ip%' ) ORDER BY id DESC

takes forever, forever I mean in 12 hours id hasn't finished. Are there any recomendatios for massive amount of storage?

regards.
luismg
New
 
Posts: 7
Joined: Fri Nov 25, 2016 9:59 am

Google Ads



Return to General

Who is online

Users browsing this forum: No registered users and 2 guests

cron