Ticket #5 (new Bug)

Opened 2 years ago

Last modified 5 weeks ago

The _log_* tables are not purged after archiving is done

Reported by: matt
Owned by:
Priority: major
Milestone: 2 - Piwik 0.6 - DigitalVibes
Component: Core
Keywords:
Cc:
Sensitive: no

Description (last modified by matt) (diff)

The data in the _log_* tables should be purged once archiving has completed successfully.

We need to purge the _log_* tables regularly (daily?) to keep their size under control. Otherwise they significantly slow down the stats logging process (MySQL rebuilding indexes) and the archiving process (selecting from a table with millions of rows takes longer).

We also need to be able to compute the number of unique visitors over weeks, months, years and, later, custom date ranges. We therefore have to make sure the data needed for this is kept in another table.

An example implementation could be as follows:

  • create a table _log_unique_visitors_id
  • every day, at the end of the archiving process, copy all the (visitor_idcookie, idsite, server_date) from the _log_visit table into this _log_unique_visitors_id table
  • delete all the processed logs from _log_visit
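
The daily step above could be sketched as follows. This is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module in place of MySQL, the unprefixed table names log_visit and log_unique_visitors_id, the idvisit key, the text date format and the helper names create_tables and purge_archived_day are all hypothetical, and the real Piwik schema may differ. On MySQL the SQLite-specific INSERT OR IGNORE would be written INSERT IGNORE.

```python
import sqlite3


def create_tables(conn):
    """Create simplified stand-ins for the two tables (hypothetical schema)."""
    conn.executescript("""
        CREATE TABLE log_visit (
            idvisit INTEGER PRIMARY KEY,
            visitor_idcookie TEXT NOT NULL,
            idsite INTEGER NOT NULL,
            server_date TEXT NOT NULL          -- 'YYYY-MM-DD'
        );
        CREATE TABLE log_unique_visitors_id (
            visitor_idcookie TEXT NOT NULL,
            idsite INTEGER NOT NULL,
            server_date TEXT NOT NULL,
            PRIMARY KEY (visitor_idcookie, idsite, server_date)
        );
    """)


def purge_archived_day(conn, day):
    """Copy the visitor ids of one archived day, then delete that day's logs.

    Both statements run in a single transaction, so the logs are only
    deleted if the copy succeeded. Returns the number of deleted log rows.
    """
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute(
            "INSERT OR IGNORE INTO log_unique_visitors_id "
            "(visitor_idcookie, idsite, server_date) "
            "SELECT DISTINCT visitor_idcookie, idsite, server_date "
            "FROM log_visit WHERE server_date = ?",
            (day,),
        )
        cur = conn.execute(
            "DELETE FROM log_visit WHERE server_date = ?", (day,)
        )
    return cur.rowcount


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    create_tables(conn)
    conn.execute(
        "INSERT INTO log_visit (visitor_idcookie, idsite, server_date) "
        "VALUES ('abc', 1, '2008-01-01')"
    )
    conn.commit()
    print("deleted rows:", purge_archived_day(conn, "2008-01-01"))
```

Keeping only one row per (visitor_idcookie, idsite, server_date) means the side table grows with the number of daily unique visitors rather than with the number of visits.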

When computing a monthly archive, to get the number of unique visitors (see #8):

  • count the distinct visitor_idcookie values for the given month in the _log_unique_visitors_id table
  • (delete these rows from the _log_unique_visitors_id table)
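
The monthly count could look roughly like the sketch below. It again relies on assumptions that are not part of this ticket: sqlite3 stands in for MySQL, the table is named log_unique_visitors_id without a prefix, server_date is stored as 'YYYY-MM-DD' text, and the function name monthly_unique_visitors and its delete_after flag are hypothetical.

```python
import sqlite3
from datetime import date


def monthly_unique_visitors(conn, idsite, year, month, delete_after=False):
    """Count distinct visitor_idcookie values for one site and one month.

    If delete_after is true, the rows that were counted are removed in the
    same transaction (the optional second step of the proposal).
    """
    first_day = date(year, month, 1)
    # First day of the following month, used as an exclusive upper bound.
    if month == 12:
        next_month = date(year + 1, 1, 1)
    else:
        next_month = date(year, month + 1, 1)
    params = (idsite, first_day.isoformat(), next_month.isoformat())
    where = "WHERE idsite = ? AND server_date >= ? AND server_date < ?"

    with conn:  # one transaction for the count and the optional delete
        count = conn.execute(
            "SELECT COUNT(DISTINCT visitor_idcookie) "
            "FROM log_unique_visitors_id " + where,
            params,
        ).fetchone()[0]
        if delete_after:
            conn.execute("DELETE FROM log_unique_visitors_id " + where, params)
    return count
```

Note that deleting the rows after the monthly archive would make later yearly or custom date range counts impossible from this table, which is presumably why the deletion step is given in parentheses.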

Change History

Changed 2 years ago by matt

  • milestone set to Future releases

Changed 2 years ago by matt

  • description modified (diff)

Changed 2 years ago by matt

  • description modified (diff)

Changed 2 years ago by matt

  • description modified (diff)

Changed 17 months ago by matt

  • milestone changed from Stable release to DigitalVibes

Changed 11 months ago by matt

  • description modified (diff)

Changed 7 months ago by spomoni

Changed 7 months ago by spomoni

Changed 6 weeks ago by weddingdress

Changed 5 weeks ago by vipsoft

  • sensitive unset

After pruning, we can also compress the corresponding archive tables in MySQL (for MyISAM tables, for example with myisampack). A side effect is that a compressed archive table is read-only, but that is acceptable if the raw visit information needed to regenerate those archives no longer exists.
