Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Finish Live! plugin: bugs and small enhancements #1120

Closed
mattab opened this issue Jan 20, 2010 · 58 comments
Closed

Finish Live! plugin: bugs and small enhancements #1120

mattab opened this issue Jan 20, 2010 · 58 comments
Assignees
Labels
Enhancement For new feature suggestions that enhance Matomo's capabilities or add a new report, new API etc. Major Indicates the severity or impact or benefit of an issue is much higher than normal but not critical.

Comments

@mattab
Copy link
Member

mattab commented Jan 20, 2010

The Live! plugin is now bundled in Piwik core but disabled by default. Here is the list of bugs or changes that we would like to address for a public release:

Live! widget todo list

Bugs

  • reduce AJAX requests to make it faster. Fixes also problem with duping (race conditions in requests)

Visitor log

  • show Page titles + add a link to the URL
  • Put menu entry last.
  • Add a few dropdowns to filter visitor-log for specific purposes, e.g show only visitors that:
    • converted a goal
    • did a minimum of xx Actions on the page
    • bounced immediately (identify bad traffic sources)
  • Possibility to label a visitor IP with a name, and display the label in the row matching this IP.
  • Do you have more ideas for features in the Visitor Log? Let us know

Fixed bugs

  • slightly highlight visitors which converted a goal
  • ~~A proposal was made for the Live! widget design, see at: http://synaptique.ca/visitpathinfo/pathinfo.htm ~~
  • the top table "Today / Last 30 minutes" shows a cursor on hover but is not clickable: it should show the normal pointer.
  • see Live! widget does not show correct number of visits for 'today' #1375
  • the CSS is not applied in Safari, the table displays black and white (Tested in Safari5 [sic])
  • the top table "Today / Last 30 minutes" shows a cursor on hover but is not clickable: it should show the normal pointer.
  • ~~Performance issues: the visitor log just loads 1000 visits in memory. Instead it should use queries like WHERE id < (last min ID displayed) and idsite=X LIMIT 30. The search field should then be removed. ~~
  • referer URL could be clickable in the visitor log
  • the Visitor log should display icons for browsers, search engines and OS.
  • the icon for returning/known visitor takes up one row for each visitor; it could maybe be written after the browser/OS icons?
  • there could be a link "more" that when hover, a tooltip is displayed with the extra information about the visitor: the IP (rather than displaying it by default), the Time on site, etc. The code for the existing tooltip can be used (the blue tooltip used when page names are too long in the Piwik Page reports)
  • show the goal icon http://piwik.org/demo/themes/default/images/goal.png when a goal is converted. On hover, show the tooltip with the goal name, the URL where it was triggered, the goal revenue
  • ~~for downloads, use a download icon, eg. http://icons3.iconfinder.netdna-cdn.com/data/icons/fugue/icons/drive-download.png ~~
  • for outlinks, use a out click icon, eg. http://icons3.iconfinder.netdna-cdn.com/data/icons/tango/16x16/actions/system-log-out.png
  • I noticed that the code in the API wasn't refactored and added some TODOs there.
  • if I enable the plugin on piwik.org, it is very slow - did you have a chance to test the plugin with a big Piwik installation? I think there are some performance issues.
  • the search box doesn't seem to do anything, should it be removed?
  • idem for the graphs buttons and the advanced table button, they should be removed
  • deduping visits doesn't work all the time. If you try on piwik.org/demo to add the Live! widget, and wait for a while, you will see the same visits being displayed again and again.

See original ticket at #44

Keywords: visitor log, live plugin

@anonymous-matomo-user
Copy link

http://www.wpwp.org/ -> nice idea for live statistics

@mattab
Copy link
Member Author

mattab commented Jan 26, 2010

This wordpress stats plugin looks nice indeed: http://www.wpwp.org/whatis/ the spyview with the google maps is interesting.

@robocoder
Copy link
Contributor

(In [1803]) refs #1120 - suppress broken image link where no search engine icon defined

@anonymous-matomo-user
Copy link

Now added the actual Version:

open issues:

Live Widget

  • if I enable the plugin on piwik.org, it is very slow - did you have a chance to test the plugin with a big Piwik installation? I think there are some performance issues.

Visitor Log

  • in 0.5, we now have the page URL and the page title for each page view. Could it be added to the list of fields returned in the API?
  • referer URL could be clickable in the visitor log

@anonymous-matomo-user
Copy link

Attachment:
piwik.patch

@robocoder
Copy link
Contributor

(In [1868]) refs #1120 - merged from Live.zip as there were too many conflicts with piwik.patch against the svn trunk

@robocoder
Copy link
Contributor

Thomas: please "svn up" and sanity check my merge. Thanks.

@mattab
Copy link
Member Author

mattab commented Feb 24, 2010

Performance issues

  • When the query takes longer than 5s to return, the new request is still thrown to fetch more visits. This means that you keep adding new requests to the queue of requests. Instead, you should never send a new request until the previous one was received.
  • on piwik.org, the query http://piwik.org/demo/index.php?module=Live&action=ajaxTotalVisitors&idSite=1&period=day&date=yesterday takes more than 10 s to return. Also, enabling the Live! plugin simply causes the mysql serve to lock all tables and crashes (there are too many slow queries trying to JOIN tables at the same time)

To understand the problem, you can enable mysql slow query log and try to generate the Live! on a test website with 50,000 visits per day, generated using the visits generator.

To solve the issue, we can either: add new INDEXes on the log_ tables (not really something we would like to do), or we can denormalize the log_ tables so that we can fetch the data without doing JOINs.

@anonymous-matomo-user
Copy link

Attachment:
Live.patch

@anonymous-matomo-user
Copy link

Performance issues

We need an new index:

ALTER TABLE piwik_log_visit ADD INDEX index_visit_last_action_time ( idsite , visit_last_action_time )

So i added and replaced the live.zip with the actual stand of the svn.
I also added a patch named live.patch. I can't delete the piwik.patch file :-(

@mattab
Copy link
Member Author

mattab commented Mar 2, 2010

jr-ewing, adding a new index is possible but very heavy update task. Can you please document why this index is needed and why is is sufficient to make Live! plugin work fine on high traffic website? for example, can you post mysql slow logs before and after the INDEX, details of the website you are testing against (how much traffic today, yesterday, the last N days, etc.)

Thanks!

@robocoder
Copy link
Contributor

Reported on the forum: when the Live widget refreshes, it reverts back to displaying visits from idsite=1, while the rest of the dashboard is displaying a different idsite (e.g., 2).

@peterbo
Copy link
Contributor

peterbo commented Mar 24, 2010

Problem:
Live-Plugin triggers slow query, that locks up the table. This is especially critical, when it is used over the desktop client since it doesn't wait for a query to finish. The query stack then fills up until - in worst case - the database crashes.

Adding an index to the single col visit_last_action_time may solve the problem.

For testing, I used a simplified SQL-Query in a simplified test-case (query cache off / flushed):

("SELECT * FROM
piwik_log_visit WHERE
idsite= 2 ANDvisit_last_action_time` > DATE_SUB( NOW( ) , INTERVAL 30
MINUTE )")
There are ~2Mio rows in the whole table.

Testing results:

  • 7.6 s
  • 7.5 s
  • 7.8 s

Then I added an index to visit_last_action_time (ALTER TABLE piwik_log_visit ADD INDEX ( visit_last_action_time ) )(query time ~35 s):

  • 0.000x s
  • 0.000x s
  • 0.000x s

A deeper view in this explains, why this key changes the query times so much:
SQL: (EXPLAIN SELECT [...] INTERVAL 10 DAYS..) - changed interval to 10 days, because the effect is manifesting better here:
previosly (without the index on visit_last_action_time), MySQL uses the index_idsite_date_config-key for the query (since idSite is in the where-clause). MySQL then scans all Rows, where idSite is x and then sorts out the rows where the date-range definition of the query doesn't fit. This causes a big data throughput because many rows are queried that aren't needed in the end.

With an index on visit_last_action_time, MySQL changes the query-type to "range" and uses the visit_last_action_time-index for the query. So it doesn't fetch any other rows that aren't needed in the result set. A lot of data overhead could be saved here to speed the query up extremely.

Cheers

@peterbo
Copy link
Contributor

peterbo commented Mar 24, 2010

A deeper look at the problem seems to be tracked down to this (extracted Query):

("SELECT piwik_log_visit.* , piwik_goal.match_attribute
FROM piwik_log_visit
LEFT JOIN piwik_log_conversion ON piwik_log_visit.idvisit = piwik_log_conversion.idvisit
LEFT JOIN piwik_goal ON piwik_goal.idgoal = piwik_log_conversion.idgoal AND piwik_goal.deleted = 0
WHERE piwik_log_visit.idsite = '3' AND piwik_log_visit.idvisit > ''
ORDER BY idvisit DESC
LIMIT 10")

These multiple Joins are extremly expensive for performance because MySQL is using nested loops. In this case, MySQL joins over a Million rows (using a tmp table) and then limits them to the ten highest idvisit-id's. This is a huge piece of data to process. As we can't know how large the tables will grow in a high-traffic environment, we should avoid table joins here but rather use a stepwise approach: fetching the relevant idvisit-id's from the first table, query them from the second one and so on. So it will be possible to build an array of correlated data and it's mich faster.

Cheers

@anonymous-matomo-user
Copy link

Replying to peterb:

Hello Peterb,
this i write in Comment #12. So thanks for your support but we also done some performance issues.
Please can you try it with the actual Live.zip in the attachment of this ticket ?

best regards

Tom

@anonymous-matomo-user
Copy link

Attachment:
Live.zip

@anonymous-matomo-user
Copy link

Attachment:
Live2.patch

@mattab
Copy link
Member Author

mattab commented Mar 28, 2010

Note, the work on the timezone see #5476, will add an index on ( idsite , visit_last_action_time ) which should help with Live! performance issues correct?

the new index will be available in the next public release (likely 0.6)

With this new INDEX, are there still performance bottlenecks on high traffic websites?

@mattab
Copy link
Member Author

mattab commented Mar 28, 2010

Actually, the exact INDEX is likely to be

'ALTER TABLE ' . Piwik::prefixTable('log_visit') . ' ADD INDEX `index_idsite_datetime_config`  ( `idsite` , `visit_last_action_time`  , `config_md5config` ( 8 ) ) ;' => false,

but we will remove the existing INDEX piwik_log_visit.visit_server_date - will this pose some problem to Live! plugin?

@anonymous-matomo-user
Copy link

Replying to matt:

Yes the index will help enormous

With this new INDEX, are there still performance bottlenecks on high traffic websites?
No - i have 6000 Visits on a Webseite a day with response time of 2,4s

piwik_log_visit.visit_server_date is not using by the live plugin.

@anonymous-matomo-user
Copy link

but it would be useful to test it on the Piwik Demo

@mattab
Copy link
Member Author

mattab commented Mar 28, 2010

2.4s sounds reasonnable, but still quite slow; is the INDEX fully hit during the query?

Also, would it be possible to add the request throttle, so that you only send one request at a time, and not queue up requests on the server? Currently, I believe that if the request hits the DB and takes 60s to return, the Live! plugin will queue up dozens of requests, killing the webserver. What do you think?

@anonymous-matomo-user
Copy link

We need one Request to get

  1. the number of last visits for the day
  2. the number of last visits for the last 30 Minutes
  3. the pageviews
  4. the pageviews in the last 30 min
  5. to get the last 10 visits

For all these Requests we need the index because we only need the last visitors.

Point 1-4 is repeat every 20 Sec.
Point 5 is also performant, because getting the goals for every pageview for the last 10 visitors. For this feature 2 Joins are needed :-(

So i dont know how to get a good performance for so many data.

@mattab
Copy link
Member Author

mattab commented Mar 29, 2010

The new INDEX was committed to trunk: see http://dev.piwik.org/trac/changeset/2006#file14

@mattab
Copy link
Member Author

mattab commented Mar 29, 2010

jr-ewing, my comment meant that the Javascript should only send the request to the server if the previous AJAX request was received. You could add a test in the response handler that would set a flag, allowing other AJAX to trigger. If the flag is not set, it means that the AJAX request wasn't received yet and no other requests should go to the DB until the first request was received.

@anonymous-matomo-user
Copy link

Attachment:
Live-0-5-5.zip

@anonymous-matomo-user
Copy link

Attachment:
Piwik-3.patch

@robocoder
Copy link
Contributor

(In [2019]) refs #1120 - commiting update from Live-0.5.5.zip

@mattab
Copy link
Member Author

mattab commented Apr 13, 2010

I enabled Live! plugin on piwik.org/demo and it broke down the server.

Here is the mysql Slow query analysis:

Tab Visitors>Visitor Log

  • query takes 3 minutes to complete
  • it does a LIMIT 1000 - this should be limit 10 or 20? why 1000?
  • slow query log:
Count         : 2  (7.69%)
Time          : 319 s total, 159.5 s avg, 140 s to 179 s max  (9.71%)
Lock Time (s) : 0 total, 0 avg, 0 to 0 max  (0.00%)
Rows sent     : 1.00k avg, 1.00k to 1.00k max  (44.37%)
Rows examined : 7.51M avg, 7.51M to 7.51M max  (47.51%)
Database      :
Users         :
        user@localhost  : 100.00% (2) of query, 92.31% (24) of all users

Query abstract:
SELECT piwik_log_visit.* , piwik_goal.match_attribute FROM piwik_log_visit LEFT JOIN piwik_log_conversion ON piwik_log_visit.idvisit = piwik_log_conversion.idvisit LEFT JOIN pi
wik_goal ON piwik_goal.idgoal = piwik_log_conversion.idgoal AND piwik_goal.deleted = N WHERE piwik_log_visit.idsite = 'S' ORDER BY idvisit DESC LIMIT N;

Query sample:
SELECT  piwik_log_visit.* ,
                                                piwik_goal.`match_attribute`
                                FROM piwik_log_visit
                                        LEFT JOIN piwik_log_conversion
                                        ON piwik_log_visit.`idvisit` = piwik_log_conversion.`idvisit`
                                        LEFT JOIN piwik_goal
                                        ON piwik_goal.`idgoal` = piwik_log_conversion.`idgoal`
                                        AND piwik_goal.`deleted` = 0
                                 WHERE piwik_log_visit.idsite = '1'
                                ORDER BY idvisit DESC
                                LIMIT 1000;

Fix

  • add a server_date > $sevenDaysAgo to ensure that the limit is bounded and doesn't full scan of the log table.

Live! Widget

Queries now run fast, except the first one that is the same as the one above.

fixing this query should therefore fix performance issues with the Live! widget. I'm not sure whether to limit the Visitor log to the last day, or more. Users might want to access logs from older dates too. Maybe the query for the Live! widget only should restrict to the last 24 hours, when the visitor log LIMIT 1000 would limit to the last 7 days?

The proper solution would be to have the visitor log fetch each day separately automatically when clicking Next or Previous (and have the last date as a parameter).

@mattab
Copy link
Member Author

mattab commented Apr 13, 2010

(In [2097]) Refs #1120

  • Fixes performance issue with the Live! widget by changing the primary key on the log_visit table: all sql queries must hit the idsite first, then the idvisit, to ensure the INDEX is used for the ORDER BY LIMIT (without doing full table scan).
  • Lowering timeout to 8s for more real time results

@peterbo
Copy link
Contributor

peterbo commented Jul 29, 2010

Attachment:
#1120-#v2-1.patch

@peterbo
Copy link
Contributor

peterbo commented Jul 29, 2010

(In [2787]) Coding style, Refs #1120

@peterbo
Copy link
Contributor

peterbo commented Jul 29, 2010

(In [2788]) Modification for new Minify-API, Refs #1120

@mattab
Copy link
Member Author

mattab commented Jul 30, 2010

(In [2808]) Refs #1120
Adding missing js to hook + Minor updates

@mattab
Copy link
Member Author

mattab commented Jul 30, 2010

Some great progress has been made on this widget by Peter. Postponing next requests to post 1.0...

@peterbo
Copy link
Contributor

peterbo commented Jul 31, 2010

Attachment: Show goal details per action; Highlight action field in the row, where conversion happened; + Minor Updates
#1120-#20100731-v2.patch

@peterbo
Copy link
Contributor

peterbo commented Jul 31, 2010

(In [2832]) Refs #1120; Show goal details per action; Highlight action field in the row, where conversion happened; + Minor Updates; Must refresh assets;

@mattab
Copy link
Member Author

mattab commented Aug 3, 2010

From email: A very cool UI improvement for the same page could be to display the whole referrer link but only the domain-name in black, the URI in light grey beneath.

@mattab
Copy link
Member Author

mattab commented Aug 5, 2010

(In [2865]) Refs #1120
simplifying Visitor log line for conversions, removing date and some bold.
Reusing same goal icon as common goal icon.

@mattab
Copy link
Member Author

mattab commented Aug 6, 2010

Idea from forum: in visitor log, would be interesting to see a link for returning visitors that would show all previous visits.

@anonymous-matomo-user
Copy link

I'm working on adding customizable icons to the Visitor Log - basically, if a specified regex matches the URL, use a specified icon instead of the default. This can be set up to display the main types of pages on a site at-a-glance or to highlight key pages.

I'm doing this inside visitorLog.tpl. Couple questions - first, where should I be storing/loading the configuration from? Also, and more importantly for this ticket, what needs to be done to make this something that could be rolled in to the core plugin?

@anonymous-matomo-user
Copy link

Excuse me, that should be "inside liveVisits.tpl" - oops.

@mattab
Copy link
Member Author

mattab commented Nov 16, 2010

ian, I would propose something around adding a hook that would modify the table of visitor details, at the end of getCleanedVisitorsFromDetails ?

@mattab
Copy link
Member Author

mattab commented Nov 24, 2010

(In [3331]) Fixes #1794, Refs #1120

  • Fixing deduping error: now visitors converting more than one goal only appear once as expected (limitation: only one goal icon will show, rather than one icon per converted goal)
  • Removed the confusing "more..." text. Now displaying Visit duration after server datatetime, and displaying IP (for super users and admin) after the line of icons
  • Now showing the IP to super user and admin users (but not to anonymous or 'view' users)

@mattab
Copy link
Member Author

mattab commented Nov 24, 2010

(In [3332]) Refs #1120

  • replace last hardcoded strings i18n
  • refactor identical html table

@mattab
Copy link
Member Author

mattab commented Nov 24, 2010

Closing the ticket, and putting open issues in #1838 and #1839

@anonymous-matomo-user
Copy link

I thought to reopen this ticket by detailing an improvement / optimization that can be done (by moving count() functionality into SQL query):

  1. In API.php I've made the following changes...

a) From "SELECT " . Piwik_Common::prefixTable('log_visit') . ".idvisit" to "SELECT count(" . Piwik_Common::prefixTable('log_visit') . ".idvisit) AS count"
b) From "SELECT " . Piwik_Common::prefixTable('log_link_visit_action') . ".idaction_url" to "SELECT count(" . Piwik_Common::prefixTable('log_link_visit_action') . ".idaction_url) AS count"

  1. In Controller.php I've made the following changes...

a) From "return count($visitors_halfhour)" to "return $visitors_halfhour[0]['count']"
b) From "return count($visitors_today)" to "return $visitors_today[0]['count']"
c) From "return count($pis_halfhour)" to "return $pis_halfhour[0]['count']"
d) From "return count($pis_today)" to "return $pis_today[0]['count']"

That's a huge difference, unless I'm missing something and this functionality is used also somewhere else (in that case I'd suggest creating separate functions for counting visits and pageviews and fetching those data).

@robocoder
Copy link
Contributor

Can you attach an actual diff (i.e., "svn diff") to #1839?

@robocoder
Copy link
Contributor

Replying to eistrati:

Yes, where the view is only displaying the number of visitors, it would be more efficient to use a SQL query that COUNT()s. However, changing that private method in API.php changes the semantics of the Live API.

This issue was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Enhancement For new feature suggestions that enhance Matomo's capabilities or add a new report, new API etc. Major Indicates the severity or impact or benefit of an issue is much higher than normal but not critical.
Projects
None yet
Development

No branches or pull requests

4 participants