Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make Visitor Log load very fast even with hundreds of thousands of visitors, or if a visitor views thousands of pages #3097

Closed
mattab opened this issue Apr 9, 2012 · 5 comments
Labels
Bug For errors / faults / flaws / inconsistencies etc. c: Performance For when we could improve the performance / speed of Matomo. Critical Indicates the severity of an issue is very critical and the issue has a very high priority.
Milestone

Comments

@mattab
Copy link
Member

mattab commented Apr 9, 2012

On a test Piwik, select year period, then visitor log, loading the page is really slow and I suppose could even crash Mysql in some cases.

When looking at year or month or day it should be as fast in all cases, since we select the last N visitors for the last day of the period.

I thought this was fixed already, but it isn't, we should run a clever SQL that is fast just like when looking at one day data.

@timo-bes
Copy link
Member

As far as I can see, the problem is as follows:

Take a look at Piwik_Live_API::loadLastVisitorDetailsFromDatabase().
It only does a LIMIT if $filter_limit is set, but there is none.

It loads all visits in the period and the data table widget does pagination.
It's not hard to see why this fails for many visits.

If you do echo $subQuery['sql'];, you get:
SELECT log_visit.* FROM piwik_log_visit AS log_visit WHERE log_visit.idsite = ? AND log_visit.visit_last_action_time >= ? AND log_visit.visit_last_action_time <= ? ORDER BY idsite, visit_last_action_time DESC

It selects everything in the range (which can be millions of rows), sorts it, ships it to PHP and builds a data table. The controller method getVisitorLog does $view->setSortedColumn('idVisit', 'ASC');, which might even sort it again in PHP before applying pagination. Obviously, that won't work for millions of rows in the data table.

What we would have to do is this:

  • Use LIMIT in the SQL query
  • Use calc_found_rows in SQL and add new methods to the data table to generate the pagination without having the actual data
  • Maybe we should avoid sorting on high traffic websites and rely on the ording of the table. Not sure whether this is possible because a GROUP BY is applied to the select query quoted above and AFAIK there is no specified ordering after a GROUP BY

@mattab
Copy link
Member Author

mattab commented Dec 17, 2012

I would propose instead to keep this SQL and call it, for 1 day at a time until there are enough visitors for the display? that would be simpler I think...

@timo-bes
Copy link
Member

One day still has an unlimited number of rows. Doing it for one day at a time is an improvement but still not a fix for very high traffic websites.

@timo-bes
Copy link
Member

The problem was not as described here. Nevertheless, it should be fixed in [3ac47e8]

@mattab
Copy link
Member Author

mattab commented Mar 26, 2013

Great work on this one!

@mattab mattab added this to the 1.12 - The Great 1.x Backlog milestone Jul 8, 2014
This issue was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug For errors / faults / flaws / inconsistencies etc. c: Performance For when we could improve the performance / speed of Matomo. Critical Indicates the severity of an issue is very critical and the issue has a very high priority.
Projects
None yet
Development

No branches or pull requests

2 participants