Opened 5 years ago

Closed 2 years ago

Last modified 16 months ago

#584 closed New feature (fixed)

New Metric: visit count until conversion

Reported by: matt Owned by:
Priority: major Milestone: 1.12.x - Piwik 1.12.x
Component: Core Keywords:
Cc: goal, tracking, goals Sensitive: no

Description

We could provide a new report: visit count until conversion

Example
You have 1000 conversions

  • 500 conversions happened on the first visit
  • 200 conversions happened on the second visit
  • 60 conversions happened on the third to fifth visit
  • etc.

Attachments (4)

issue584.diff (30.4 KB) - added by capedfuzz 3 years ago.
Patch for this issue. Made for revision 5193.
issue584.tar.gz (14.7 KB) - added by capedfuzz 3 years ago.
Patch for issues #584, #536, #2031. Made for revision 5256.
issue584v2.tar.gz (15.9 KB) - added by capedfuzz 3 years ago.
New patch for issues #584, #536, #2031. Made for revision 5270.
issue584v3.tar.gz (20.1 KB) - added by capedfuzz 3 years ago.
Patch for issues #584, #536, #2031. Made for revision 5368.

Download all attachments as: .zip

Change History (39)

comment:1 Changed 5 years ago by matt (mattab)

  • Type changed from Bug to New feature

comment:2 Changed 5 years ago by matt (mattab)

  • Cc goal tracking goals added

comment:4 Changed 3 years ago by matt (mattab)

  • Milestone changed from Feature requests to 1.x - Piwik 1.x

comment:5 Changed 3 years ago by matt (mattab)

  • Milestone changed from 1.x - Piwik 1.x to 1.3 - Piwik 1.3

comment:6 Changed 3 years ago by matt (mattab)

Note: the field is zero before 1.2-rc1 update, and 1 after by default. Maybe we can assume 1 visit for all previous 1.2 logs.

comment:7 Changed 3 years ago by capedfuzz (diosmosis)

I'd like to work on this ticket, but I've got a couple of questions:

0) The new report will look like the example given in the ticket description, correct?

1) Should there be an ‘overall’ report (that uses conversion data for every goal) in addition to a ‘per-goal’ report (that uses data for one goal)?

comment:8 Changed 3 years ago by matt (mattab)

Great ticket to pick up!

Here is breakdown used by GA:

Visits to Purchase (from start of last campaign) 
1 visits 	2,027.00 	
2 visits 	252.00 	
3 visits 	38.00 	
4 visits 	19.00 	
5 visits 	11.00 	
6 visits 	10.00 	
7 visits 	2.00 	
8 visits 	1.00 	
9-14 visits 	5.00 	
15-25 visits 	4.00 	
26-50 visits 	8.00 	
51-100 visits   1.00

I suggest we use the same breakdown as it makes sense and is consistent.

1) This report would be for Ecommerce, any goal, and I agree that overall weighted report is a good idea too (simply sum each goal worth of data and not issue a new query)

2) The report would appear as a new report below the http://piwik.org/docs/tracking-goals-web-analytics/#toc-goal-detailed-report existing "View by referer"]

3) I would also suggest you maybe consider the similar ticket #2031 and also #583 and also #536 after this one, or at the same time, as the code will be very similar and probably in the same plugin / archiving function.

comment:9 Changed 3 years ago by capedfuzz (diosmosis)

Thanks for the info, and I'll take your suggestion and work on the other tickets as well. I'll update again when I have a working prototype.

Changed 3 years ago by capedfuzz (diosmosis)

Patch for this issue. Made for revision 5193.

comment:10 Changed 3 years ago by capedfuzz (diosmosis)

Some notes on my patch:

  • Right now the visit ranges the report uses (ie, 26-50 visits, 51-100 visits, etc.) are determined by an array of constants. This could be made configurable (I can submit a new patch if you'd like me to do that before you apply this one).
  • There was an issue w/ ViewDataTable/DataTable. ViewDataTable lets you queue filters but doesn’t queue these filters with the DataTable itself. It just stores them and then runs them directly on the DataTable instance. This becomes a problem when exporting DataTables to, say, CSV since some filters are necessary to make the data readable. My patch gets around this by checking whether the module being executed is the API and queuing a filter directly on the DataTable instead of the view, but this could also be fixed by making ViewDataTable apply the DataTable's filters right after sorting. I didn’t know if this would break existing behavior, though, so I went the other way.
  • I disabled the show goals icon in the resulting view since the metrics displayed don’t make sense when applied to ‘people who converted on the Nth visit’.
  • I disabled the more metrics icon too. This info would be useful but would require either doing an extra query or adding more of log_visit’s columns to log_conversion. I figured I should ask before making either of these changes.
  • I added the following English i18n entries:

General_Visit_Singular
General_Visit_Plural
Goals_ColumnVisitsUntilConv
Goals_ColumnConversionCount
Goals_ConversionTime
Goals_VisitsUntil

I didn't add them for any other language, though if there's a way for me to do it I'd be willing to.

Let me know if anything should be changed/added.

comment:11 follow-up: Changed 3 years ago by matt (mattab)

an array of constants

that is fine, have you used the one suggested in each ticket? I put there the GA ones, or a custom after thinking about it so hopefully it should work well for all users and no need for configuration settings.

There was an issue w/ ViewDataTable/DataTable. ViewDataTable lets you queue filters but doesn’t queue these filters with the DataTable itself.

Your solution here sounds like maybe we can do it another way. Are you trying to do something new to Piwik? Maybe see the existing code for similar reports, which work fine, and your code can follow these?

  • I disabled the show goals icon in the resulting view since the metrics displayed don’t make sense when applied to ‘people who converted on the Nth visit’.
  • I disabled the more metrics icon too. This info would be useful but would require either doing an extra query or adding more of log_visit’s columns to log_conversion. I figured I should ask before making either of these changes.

Sounds good

I didn't add them for any other language, though if there's a way for me to do it I'd be willing to.

Translators do the translation later ;) http://piwik.org/translations/

Let me know if anything should be changed/added.

Please apply above feedback and then post a patch to this ticket and mention all the other tickets fixed in the same patch. Then I can look at it in detail, test it, see the changes which is all called a code review. Then you can do a few more changes or we can commit when it's ready and working. You can see the process in: http://piwik.org/participate/development-process/#toc-how-to-submit-a-patch

Looking forward to seeing your work :)

comment:12 in reply to: ↑ 11 ; follow-up: Changed 3 years ago by capedfuzz (diosmosis)

Replying to matt:

an array of constants

that is fine, have you used the one suggested in each ticket? I put there the GA ones, or a custom after thinking about it so hopefully it should work well for all users and no need for configuration settings.

I used the one you suggested. (BTW, this patch is just for the visits to conversion report. I'll fix the other issues & add them to one ticket before I attach another patch.)

There was an issue w/ ViewDataTable/DataTable. ViewDataTable lets you queue filters but doesn’t queue these filters with the DataTable itself.

Your solution here sounds like maybe we can do it another way. Are you trying to do something new to Piwik? Maybe see the existing code for similar reports, which work fine, and your code can follow these?

I might be trying to do something new. The reports I looked at had variable data in the 'label' column. When sorting by this column piwik will sort lexicographically, which makes sense for these reports. Sorting this new report in this way, however, makes it look odd (ie, with 15-25 visits right after 1 visits).

To get around this, the label column in the new report holds an index and a filter is used to make sure the label is transformed into a string AFTER the table is sorted.

Is there another report that has a special sort of ordering? If so, I could take a look at that.

[snip]
Please apply above feedback and then post a patch to this ticket and mention all the other tickets fixed in the same patch. Then I can look at it in detail, test it, see the changes which is all called a code review. Then you can do a few more changes or we can commit when it's ready and working. You can see the process in: http://piwik.org/participate/development-process/#toc-how-to-submit-a-patch

Looking forward to seeing your work :)

I'll upload another patch as soon as I am able.

comment:13 in reply to: ↑ 12 Changed 3 years ago by matt (mattab)

There was an issue w/ ViewDataTable/DataTable. ViewDataTable lets you queue filters but doesn’t queue these filters with the DataTable itself.

Your solution here sounds like maybe we can do it another way. Are you trying to do something new to Piwik? Maybe see the existing code for similar reports, which work fine, and your code can follow these?

I might be trying to do something new. The reports I looked at had variable data in the 'label' column. When sorting by this column piwik will sort lexicographically, which makes sense for these reports. Sorting this new report in this way, however, makes it look odd (ie, with 15-25 visits right after 1 visits).

You can look at examples VisitorInterest.getNumberOfVisitsPerVisitDuration
The third parameter to Sort filter is

		$dataTable->queueFilter('Sort', array('label', 'asc', true));

which stores the datatable as it contains an integer LABEL

Then the API function calls

$dataTable->queueFilter('ColumnCallbackReplace', array('label', 'Piwik_getDurationLabel'));

This filter is executed AFTER the sort and then replaces the numeric Labels with the human readable range.

So the table is sorted and then you apply the beauty filter to your datatable.

I'll upload another patch as soon as I am able.

Excellent!

comment:14 Changed 3 years ago by capedfuzz (diosmosis)

One more quick question, what's the policy on refactoring in a patch? If I find opportunities to reduce some code redundancy (in this case the beauty filter code used in VisitorInterest), would it be alright to make the changes in this patch?

comment:15 Changed 3 years ago by matt (mattab)

If I find opportunities to reduce some code redundancy (in this case the beauty filter code used in VisitorInterest), would it be alright to make the changes in this patch?

Definitely :) refactoring is very welcome and in fact highly recommended. We would point out duplicate code during the review and generally propose to refactor :) so even better to do it from the start.

If you have any question let us know

Changed 3 years ago by capedfuzz (diosmosis)

Patch for issues #584, #536, #2031. Made for revision 5256.

comment:16 Changed 3 years ago by capedfuzz (diosmosis)

I've uploaded another patch, this one addresses tickets #584, #536, #2031.

I've also uploaded a file (in the same archive as the patch) that describes the changes I made in each file. Hopefully, that will make it easier to review.

I missed the other related issue, but I'll be working on that while I incorporate the changes from the review (if any).

comment:17 follow-up: Changed 3 years ago by matt (mattab)

Great patch!

Most of it is on the right track and looking great, here is complete code review:

  • Refactor

+ check for the special 'eCommerce' goal id
+ $realGoalId = $idGoal == Piwik_Archive::LABEL_ECOMMERCE_ORDER ? 0 : $idGoal;
same as existing code in line 360: $idGoal = ($idGoal == Piwik_Archive::LABEL_ECOMMERCE_ORDER) [....]
can be refactored in a protected function

  • + if we're doing the goal overview, we have to get data tables for every goal

Here the aggregation of all goals is done at runtime, the api will fetch N times the report and sum it. In PIwik generally we pre-aggregate all data as much as possible. Here I would suggest to aggregate the sum of all goals into the 'overview' in the archive helpers for these reports. It would make code easier also and remove the IF ELSE from the function getGoalSpecificDataTable so it would be simple.

  • tag cloud should be enabled in the reports
  • public function getGoals( $idSite, $columns ); - I agree that this could be a good idea, but we cant do it for just one api function and not others (websites, users, reports etc). It would have to be a generic thing. Please revert.
  • generally the variable names containing "metric" refer to what we call "record" in the archive table, since they are often name of blobs rather than metrics
  • the function createRangeDataTable() is not necessary, since you can instead put the "label" string building in the function buildReduceByRangeSelect(). This is what is done in the similar report in VisitorInterest.php function getTablePageGap
  • buildRangeDataTableFromRow() looks like it sums rows together. Take a look at other archiving examples that sum data, there are functions in the datatable and archive classes to sum
  • > Modified this template so the goal ID would be sent with AJAX requests that require it. In this case these reports would be the 'visits to conversion' and 'days to conversion' reports.

About use_goalid - is it used only to define the idGoal that the report should be displayed for ? in this case you can use idGoal which is already used in other reports (and in fact will be set already when querying your report and looking at specific goal). Why is idGoal not always passed when the report itself is queried for an idGoal ?

  • + protected function getDataTable($name, $idSite, $period, $date, $segment, $index = Piwik_Archive::INDEX_NB_VISITS)

$index renamed to $column

  • $totalVisitors = $dataTable->getRowFromId(0)->getColumn(Piwik_Archive::INDEX_NB_UNIQ_VISITORS);

+ $dataTable->queueFilter('ColumnCallbackAddColumnPercentage', array(
+ 'visitors_percent', 'nb_uniq_visitors', $totalVisitors));
Why row 0 contains the sum of all data?

  • getNumberOfVisitorsByVisitNum renamed in getNumberOfVisitorsByVisit (idem for VisitorInterest_VisitorsByVisitNum
  • +		// sum the visitorsByVisits datatable w/o assigning a 'summed' label. The visitor_count_visits
    +		// column of log_visits will be unique per visitor and across every time period, so
    +		// the sum of visitor counts in this table is still a valid visitor count.
    
    While your solution is more "correct" in Piwik we must tradeoff performance with absolute correctness over a large date range. In all existing reports, appart from the "Unique Visitors" metric, we do sum visits across dates. In this case, we want to do the same, for consistency (we don't want to only process these 2 reports for all periods), but also for performance (takes cpu time and additional code).

So, here I vote for revert of the changes & new logic in the VisitorInterest.php archiving techniques. (remove uniq visitors and put visits instead, and just sum the reports over several days for week/month/range archives).

Let me know your thoughts and if you have any question! looking forward to next patch :)

comment:18 in reply to: ↑ 17 ; follow-up: Changed 3 years ago by capedfuzz (diosmosis)

Thanks for taking the time to review my patch (especially so promptly :)).

Replying to matt:

About use_goalid - is it used only to define the idGoal that the report should be displayed for ? in this case you can use idGoal which is already used in other reports (and in fact will be set already when querying your report and looking at specific goal). Why is idGoal not always passed when the report itself is queried for an idGoal ?

I checked again, and it is passed, but the parameter is named 'filter_only_display_idgoal'. Must've missed it :)

There is something else, though. None of the other reports seem to actively use the goal id. They all seem to select one blob which contains all the data, then only display some of it (which explains the 'filter_only_display_idgoal' parameter). My patch, on the other hand, will create a record for every goal. Is the former approach desired rather than mine?

Also, this isn't related to this issue, but I just noticed there might be some inaccuracy in the goal-specific reports. I created a goal where conversions are visits whose page title contains 'piwik' then generated some visits. The first row in the keywords report looked like this:

Keyword		Visits		second_goal conversions second_goal conv. rate
piwik bingbot	2		4			200%

From the visitor log there were only two conversions with the keyword 'piwik bingbot'.

Why row 0 contains the sum of all data?

Since I select the count of visitors, row 0 (the row w/ visitors who visited at least once) is the super set of all the other rows, ie, visitors who visited at least twice are also visitors who visited at least once.

If I change it to select visits, then I'll have to sum every row, though...

While your solution is more "correct" in Piwik we must tradeoff performance with absolute correctness over a large date range. In all existing reports, appart from the "Unique Visitors" metric, we do sum visits across dates. In this case, we want to do the same, for consistency (we don't want to only process these 2 reports for all periods), but also for performance (takes cpu time and additional code).

So, here I vote for revert of the changes & new logic in the VisitorInterest.php archiving techniques. (remove uniq visitors and put visits instead, and just sum the reports over several days for week/month/range archives).

To clarify, do you mean to avoid summing anything and instead select a count(*) & group by visitor_count_visits? Or something else? My patch currently sums over the visitor_count_visits column when archiving for one day and adds data tables together when archiving periods.

Also, would it be a good idea to keep the refactoring I did (I merged two similar functions)?

Finally, maybe I'm not looking at this correctly, but I think the 'Visits that were the visitor's nth visit' statistic is the same thing as the '# of visitors who visit at least N times'. For example, if you've got data like:

idvisitor	visitor_count_visits
A		1
A		2
A		3
B		1
B		2
C		1

then the visits that were the visitor's 1st visit (3) is the same as the # of visitors who visited at least 1 time (3). If I am looking at it correctly, is it still better to display visits as a metric or is it better to display visitors?

comment:19 in reply to: ↑ 18 ; follow-up: Changed 3 years ago by matt (mattab)

There is something else, though. None of the other reports seem to actively use the goal id. They all seem to select one blob which contains all the data, then only display some of it (which explains the 'filter_only_display_idgoal' parameter). My patch, on the other hand, will create a record for every goal. Is the former approach desired rather than mine?

In your case, one different blob per idgoal is a much better solution indeed. but I think it's better to reuse existing filter_only_display_idgoal parameter since it does the same thing. if you're keen, you can rename it globally to "filter_idgoal" for example?

Also, this isn't related to this issue, but I just noticed there might be some inaccuracy in the goal-specific reports. I created a goal where conversions are visits whose page title contains 'piwik' then generated some visits. The first row in the keywords report looked like this:

Keyword		Visits		second_goal conversions second_goal conv. rate
piwik bingbot	2		4			200%

From the visitor log there were only two conversions with the keyword 'piwik bingbot'.

Why row 0 contains the sum of all data?

I never heard of this bug, but if you are sure there is one, please create a new ticket with the screenshot of the bug or data dump from the log_visit table. Definitely something to investigate!

Since I select the count of visitors, row 0 (the row w/ visitors who visited at least once) is the super set of all the other rows, ie, visitors who visited at least twice are also visitors who visited at least once.

In http://dev.piwik.org/trac/ticket/536#comment:7 the Visitor who visited 1 time, but not "at least once" - are you talking about this report in #536 ?


So, here I vote for revert of the changes & new logic in the VisitorInterest.php archiving techniques. (remove uniq visitors and put visits instead, and just sum the reports over several days for week/month/range archives).

To clarify, do you mean to avoid summing anything and instead select a count(*) & group by visitor_count_visits?

Yes group by visitor_count_visits is definitely the way to go, and select count(distinct idvisitor)
and call this metric "nb_visits" instead of nb_uniq_visitors (since they can be summed and are in fact more visits than visitors)

Or something else? My patch currently sums over the visitor_count_visits column when archiving for one day and adds data tables together when archiving periods.

Also, would it be a good idea to keep the refactoring I did (I merged two similar functions)?

Yes the one building the SQL for ranges? definitely

Finally, maybe I'm not looking at this correctly, but I think the 'Visits that were the visitor's nth visit' statistic is the same thing as the '# of visitors who visit at least N times'. For example, if you've got data like:

idvisitor	visitor_count_visits
A		1
A		2
A		3
B		1
B		2
C		1

then the visits that were the visitor's 1st visit (3) is the same as the # of visitors who visited at least 1 time (3). If I am looking at it correctly, is it still better to display visits as a metric or is it better to display visitors?

What is the report " # of visitors who visited at least N times"? I think the report "Visits that were the visitor nth visit" is enough isnt it?

Cheers!

comment:20 in reply to: ↑ 19 ; follow-up: Changed 3 years ago by capedfuzz (diosmosis)

Replying to matt:

There is something else, though. None of the other reports seem to actively use the goal id. They all seem to select one blob which contains all the data, then only display some of it (which explains the 'filter_only_display_idgoal' parameter). My patch, on the other hand, will create a record for every goal. Is the former approach desired rather than mine?

In your case, one different blob per idgoal is a much better solution indeed. but I think it's better to reuse existing filter_only_display_idgoal parameter since it does the same thing. if you're keen, you can rename it globally to "filter_idgoal" for example?

I don't mind renaming it, though it might be a good idea to just rename it to idGoal? The fact that it is used to filter some larger data would be an implementation detail.

I never heard of this bug, but if you are sure there is one, please create a new ticket with the screenshot of the bug or data dump from the log_visit table. Definitely something to investigate!

I'll take a look at it after I finish this patch. Maybe I can submit another patch with the new issue :).

Since I select the count of visitors, row 0 (the row w/ visitors who visited at least once) is the super set of all the other rows, ie, visitors who visited at least twice are also visitors who visited at least once.

In http://dev.piwik.org/trac/ticket/536#comment:7 the Visitor who visited 1 time, but not "at least once" - are you talking about this report in #536 ?

Yes, see below for my reasoning.

What is the report " # of visitors who visited at least N times"? I think the report "Visits that were the visitor nth visit" is enough isnt it?

The '# of visitors who visited at least N times' is essentially the same report as the one specified in #536, it just seemed to me that 'visits that were the visitor's nth visit' wouldn't be as clear to a piwik user. So I was just describing the data in the report in #536 a different way.

I may have been overthinking the issue, though. I could change the report documentation to do some more explanation.

Or, if it sounds like a good idea, I could change the report to show the '# of visitors who visited exactly N times'. I think I'd just have create and modify the 'at least N times' report.

comment:21 in reply to: ↑ 20 ; follow-up: Changed 3 years ago by matt (mattab)

Replying to capedfuzz:

Replying to matt:

There is something else, though. None of the other reports seem to actively use the goal id. They all seem to select one blob which contains all the data, then only display some of it (which explains the 'filter_only_display_idgoal' parameter). My patch, on the other hand, will create a record for every goal. Is the former approach desired rather than mine?

In your case, one different blob per idgoal is a much better solution indeed. but I think it's better to reuse existing filter_only_display_idgoal parameter since it does the same thing. if you're keen, you can rename it globally to "filter_idgoal" for example?

I don't mind renaming it, though it might be a good idea to just rename it to idGoal? The fact that it is used to filter some larger data would be an implementation detail.

OK I agree. After doing the change, please check that menus still work (click on goal 1, goal 2, other menu, come back to goal 3, etc.) - I think I remember sometimes seeing several idGoal in the URL etc. but it might have been fixed.

OK to rename to "idGoal" :)

Since I select the count of visitors, row 0 (the row w/ visitors who visited at least once) is the super set of all the other rows, ie, visitors who visited at least twice are also visitors who visited at least once.

In http://dev.piwik.org/trac/ticket/536#comment:7 the Visitor who visited 1 time, but not "at least once" - are you talking about this report in #536 ?

Yes, see below for my reasoning.

What is the report " # of visitors who visited at least N times"? I think the report "Visits that were the visitor nth visit" is enough isnt it?

The '# of visitors who visited at least N times' is essentially the same report as the one specified in #536, it just seemed to me that 'visits that were the visitor's nth visit' wouldn't be as clear to a piwik user. So I was just describing the data in the report in #536 a different way.

I agree it is slightly confusing, but I think it's best to keep one report for this, as it is simple to process and use existing code.

I may have been overthinking the issue, though. I could change the report documentation to do some more explanation.

Or, if it sounds like a good idea, I could change the report to show the '# of visitors who visited exactly N times'. I think I'd just have create and modify the 'at least N times' report.

You're not overthinking in fact I think it's a useful report. In this case there is performance overhead and new core archiving logic, because it adds complexity (if you confirm) then I think it's best not to include it. This would be a great candidate for a plugin though!

Cheers!

comment:22 in reply to: ↑ 21 Changed 3 years ago by capedfuzz (diosmosis)

Replying to matt:

I may have been overthinking the issue, though. I could change the report documentation to do some more explanation.

Or, if it sounds like a good idea, I could change the report to show the '# of visitors who visited exactly N times'. I think I'd just have create and modify the 'at least N times' report.

You're not overthinking in fact I think it's a useful report. In this case there is performance overhead and new core archiving logic, because it adds complexity (if you confirm) then I think it's best not to include it. This would be a great candidate for a plugin though!

I don't think it would add complexity, but I haven't written it yet. I think what I'll do is make modifications to my patch while displaying the report in #536, then I'll try and replace that report and see how much of a change it is. It may be easier to see the difference when the code is written :).

comment:23 Changed 3 years ago by capedfuzz (diosmosis)

Got another question, I've rewritten the VisitorInterest.php archiving code (for the new report) to GROUP BY visitor_count_visits, but my code is getting complex & hard to read (since the summarizing logic has to be in PHP). If I were to modify the archiving code to sum over ranges but get the data for every report in one SELECT (similar to the way my patch gets the data for the new goal reports), would that deal with the performance issue?

In other words, if I do a sum, but get the data for the VisitorInterest_timeGap, VisitorInterest_pageGap, VisitorInterest_visitorsByVisits reports with one SELECT instead of three, will there still be a performance issue?

comment:24 Changed 3 years ago by matt (mattab)

In other words, if I do a sum, but get the data for the VisitorInterest_timeGap, VisitorInterest_pageGap, VisitorInterest_visitorsByVisits reports with one SELECT instead of three, will there still be a performance issue?

It sounds like a good idea to get all of them at once if that's possible indeed!

My main feedback was to not add much archiving code since all existing logic is already there to sum reports over ranges etc. so the code can be reused. I think that the new plugin should be quite small amount of code, appart from the new refactored methods you introduce (that will be refactored from the existing code).

Let me know if that makes sense!

Changed 3 years ago by capedfuzz (diosmosis)

New patch for issues #584, #536, #2031. Made for revision 5270.

comment:25 Changed 3 years ago by capedfuzz (diosmosis)

I uploaded a new patch with the changes specified. Some notes on this patch:

0) The 'visitors who visited at least N times' report was reverted back to the 'visits who were the visitor's Nth visit' report. If I can find some time I'll try and see how much of a difference there is when modifying it to 'visitors who visited exactly N times'.

1) All three of the VisitorInterest reports use one SELECT.

2) I've refactored a lot of my old code, so this patch should be cleaner.

Looking forward to your review!

comment:26 Changed 3 years ago by matt (mattab)

capedfuzz, great work thank you!

First of all sorry for the delay in answering we wanted to release 1.6 which is now done. We will def commmit your patch for the next release :)

Feedback

  • in en.php there are a few strings that are duplicated, eg. 'Goals_ColumnVisitsUntilConv' == 'Goals_VisitsUntil', 'Goals_ConversionTime' == 'Goals_ColumnConversions', etc.

even if they are used in different places, in general we reuse the translations. This is less work for all translators and also is the right thing to do :)

  • can you confirm you run piwik/tests/all_tests.php and tests pass?
  • in the test cases (good on you to include them in the patch!), I see that the reports are always for "At least 1 visit" or "0 days". Maybe you could modify the test cases so that at least one of each of the new reports show data for "At least 2 visits" and "1 day" for example (let me know if you need help for this)
  • I haven't tested the code yet, just looked at it.
  • Please submit patch against trunk, I will test it and then commit it.

again thank you for this very nice work and looking forward to see it in trunk :-)

comment:27 follow-ups: Changed 3 years ago by capedfuzz (diosmosis)

Almost done w/ the next patch, but I've noticed a problem w/ the php tracking client. The client will not pass the _idvc query string variable which is used to set the visitor_count_visits column. This throws off the integration test results. Is this something the php tracking client should support? If so, would it be acceptable to simply add a function to the tracking client that lets you specify the specific value to use, or is there possibly a way to store this for each visitor?

comment:28 in reply to: ↑ 27 Changed 3 years ago by capedfuzz (diosmosis)

Actually, ignore my last comment, I found a solution.

comment:29 in reply to: ↑ 27 Changed 3 years ago by matt (mattab)

Replying to capedfuzz:

Almost done w/ the next patch, but I've noticed a problem w/ the php tracking client. The client will not pass the _idvc query string variable which is used to set the visitor_count_visits column. This throws off the integration test results. Is this something the php tracking client should support?

PHP should probably not support this directly (unless we support ALL cookies of piwik.js which is off topic), but you can use ->setDebugStringAppend('&_idvc=X') as a way to force the value.

Changed 3 years ago by capedfuzz (diosmosis)

Patch for issues #584, #536, #2031. Made for revision 5368.

comment:30 Changed 3 years ago by capedfuzz (diosmosis)

I uploaded a new patch with the changes specified. There are some new integration tests, all of which pass. Some notes about this patch:

  • I noticed that I didn't add the appropriate report metadata for the new goals reports, so I did in this report.
  • I made some changes to tests/integration/Main.test.php that you may want to review.
  • There was a regression in the test that created 'test_csvExport_xp1_inner1_trans-de_ _CustomVariables.getCustomVariables_day.csv'. I did some digging, and the difference seems to be because of the extra Piwik_DataTable instances that get created for the new reports. I'm pretty sure this is the only difference, so I included it in the archive. This is a binary file, so its not part of the patch.

The patch was made w/ revision 5368.

comment:31 Changed 2 years ago by matt (mattab)

  • Resolution set to fixed
  • Status changed from new to closed

(In [5378]) Fixes #584, #536, #2031 - Kuddos to Benaka akka capedfuzz for this great patch!!! I did a few minor modifications in wording and metadata output

  • Add Report "Visits by visit number" under Visitors > Engagement
  • Add Report for all Goals (including ecommerce): "Visit until Conversion": number of visits until the conversion occured
  • Add Report for all Goals (including ecommerce): "Days until Conversion": days since the first visit

Notes

  • These new reports are also in the Metadata API so should be displayed in Piwik Mobile, and can be exported in the Scheduled reports.
  • filter_only_idgoal now renamed as idGoal for consistency
  • refactored the "Beautify labels" for ranges in generic filters
  • refactored archiving code to process multiple reports in one generic SQL query

comment:32 Changed 2 years ago by matt (mattab)

(In [5381]) Refs #584

  • Remove completely filter_only_display_idgoal from codebase, renamed to idGoal. Appears not to break anything

comment:33 Changed 2 years ago by capedfuzz (diosmosis)

(In [5428]) Refs #583, #584, #2031

Fixed bug with ranges that, for some new range reports, caused certain visits to not be counted. This fix will cause a cosmetic change in the two older reports in the VisitorInterest plugin (the 'visits per duration' & 'visits per actions' reports).

comment:34 Changed 2 years ago by capedfuzz (diosmosis)

(In [5509]) Refs #584, #2031. Fixed small bug that caused the 'days to conversion' and 'visits to conversion' reports not to show for the Goals Overview.

comment:35 Changed 16 months ago by matt (mattab)

  • Milestone changed from 1.8.x Piwik 1.8.x to 1.9.x - Piwik 1.9.x

Milestone 1.8.x Piwik 1.8.x deleted

Note: See TracTickets for help on using tickets.