Edge case: each page is a new visit #1916

Closed
mattab opened this issue Dec 23, 2010 · 13 comments
Labels: Bug, Critical

@mattab
Member

mattab commented Dec 23, 2010

When the cookie is somehow read-only, old timestamps are read and a new visit is generated on every pageview for these buggy requests. This might be caused by an Adblock-type extension that blocks writes to the cookie but still passes it with the request.


<?php

$host = "piwik-domain.com";

$request = "GET /piwik.php?idsite=2&rec=1&url=http%3A%2F%2Fwww.domain.de%2F&res=1280x1024&h=7&m=57&s=51&cookie=1&urlref=http%3A%2F%2Fwww.domain.de%2F&rand=0.6439636907182041&pdf=1&qt=1&realp=0&wma=1&dir=0&fla=1&java=1&gears=0&ag=1&action_name=Some%20Action HTTP/1.1
Host: $host
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.2.13) Gecko/20101203 Firefox/3.6.13 (.NET CLR 3.5.30729)
Connection: close
Referer: http://www.refdomain.de/somepage
Cookie: piwik_visitor=[INSERT COOKIE DATA]

";

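// Open a plain TCP connection to the tracker (10-second timeout).
$fsock_fp = fsockopen($host, 80, $errno, $errstr, 10);
if ($fsock_fp === false) {
    die("Connection failed: $errstr ($errno)");
}
fwrite($fsock_fp, $request);

// Print the request, then stream back the raw response.
echo '<pre>';
echo $request;
while (!feof($fsock_fp))
{
    echo fgets($fsock_fp, 128);
}
echo '</pre>';

fclose($fsock_fp);

?>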
@mattab
Member Author

mattab commented Dec 23, 2010

Maybe a solution would be to consolidate the visits at the beginning of archiving: delete all visits from the same visitor that fall within the same 30-minute range.
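
The consolidation idea can be sketched roughly as follows. This is only an illustration, not Piwik code: `consolidateVisits()` and its `$window` parameter are hypothetical names, and it assumes a visit should be dropped when it starts within 30 minutes of the same visitor's previous recorded visit.

```php
<?php
/**
 * Hypothetical helper (not part of Piwik): given the start times (Unix
 * timestamps) of one visitor's recorded visits, keep only those that begin
 * more than $window seconds after the previously recorded visit.
 */
function consolidateVisits(array $visitStarts, int $window = 1800): array
{
    sort($visitStarts);
    $kept = [];
    $previous = null;
    foreach ($visitStarts as $start) {
        // Keep the very first visit, or one that follows a long enough gap.
        if ($previous === null || $start - $previous > $window) {
            $kept[] = $start;
        }
        $previous = $start;
    }
    return $kept;
}

// Three bogus "visits" within minutes of each other, then two more later.
$starts = [1000, 1060, 1500, 9000, 9100];
echo implode(', ', consolidateVisits($starts)), "\n"; // prints "1000, 9000"
```

In a real archiving step the page views attached to the dropped visits would also have to be reassigned to the kept visit; the sketch ignores that.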

@robocoder
Contributor

We should be able to fix this in #409.

@robocoder
Contributor

It's possible this is caused by bots (e.g., web scrapers). On the initial request, the bot saves cookies to its cookie jar, and on subsequent requests, sends the cookies without updating the cookie jar.

Another possibility is that the Tracker has gotten slower, and that this is a duplicate of #1108, experiencing the race condition where:

  • user browses page A, sending cookie X1
  • before server can respond with updated cookie X2, user browses page B, resending cookie X1

We can mitigate this by calling $this->end() before Piwik_Common::runScheduledTasks().

When we implement #409, we'll only be sending idcookie, so $this->end() can be called even sooner, e.g., as soon as we've confirmed it's a returning visitor. (This will also improve perceived tracker responsiveness.)
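
The effect of a stale cookie, whether from a bot or from the race above, can be illustrated with a small simulation. It is a simplified model, not the Tracker's actual logic: `isNewVisit()`, `countVisits()` and `VISIT_TIMEOUT` are hypothetical names, and the cookie is assumed to carry only the timestamp of the last action.

```php
<?php
// Hypothetical model of cookie-only visit detection; not Piwik's actual code.
const VISIT_TIMEOUT = 1800; // 30 minutes, in seconds

/** Decide from the cookie timestamp alone whether a request starts a new visit. */
function isNewVisit(int $cookieLastAction, int $now): bool
{
    return ($now - $cookieLastAction) > VISIT_TIMEOUT;
}

/** Each request is [time of request, last-action timestamp sent in the cookie]. */
function countVisits(array $requests): int
{
    $visits = 0;
    foreach ($requests as [$now, $cookieLastAction]) {
        if (isNewVisit($cookieLastAction, $now)) {
            $visits++;
        }
    }
    return $visits;
}

$old = 0; // cookie X1, last updated long ago

// Healthy client: the cookie is refreshed after every response.
$healthy = [[10000, $old], [10005, 10000], [10010, 10005]];

// Stale client: later pages are requested with X1 because X2 never arrived
// in time (or was never stored).
$stale = [[10000, $old], [10005, $old], [10010, $old]];

echo countVisits($healthy), "\n"; // prints 1
echo countVisits($stale), "\n";   // prints 3
```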

@anonymous-matomo-user

Do you still have problems reproducing this issue?
I am willing to give you SSH access to my server so you can analyse this live on an affected machine.

@mattab
Member Author

mattab commented Jan 5, 2011

awesome, I can replicate so it's OK. stay tuned..

vipsoft, I'm going to force the tracker to check the cookie value on each request. This adds overhead compared to the current algorithm, but that's the price to pay for accuracy when bad data is coming in.

Then we'll be pretty close to having a 1st party cookie only, since the code will be based on the unique ID.

@mattab
Member Author

mattab commented Jan 5, 2011

This could also be triggered in the following use case:

  • go to homepage,
  • before Piwik loads (and with a more than 30min old piwik cookie)...
  • ... middle click and open many other pages

Each Piwik request will record its page view with the old cookie until the new one is set in the browser's cookie jar.

@mattab
Member Author

mattab commented Jan 5, 2011

(In [3634]) Fixes #1916
Now always checking in the DB if we saw the visitor earlier. The cookie also becomes much smaller.
Renamed the setting enable_detect_unique_visitor_using_settings to trust_visitors_cookies, since the logic is now different; it should only be enabled on an intranet where the IP is the same for all users.
This will also help get the 1st party cookie implemented. Refs #409
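
As a rough sketch of the server-side approach described here (checking stored state on every request instead of trusting the cookie's timestamps): the `VisitStore` class below is hypothetical, and its in-memory array merely stands in for the database lookup; it is not the code from the changeset.

```php
<?php
/**
 * Hypothetical sketch: the server remembers each visitor's last action and
 * decides on its own whether a request opens a new visit. The client only
 * needs to send a stable visitor ID.
 */
final class VisitStore
{
    private const VISIT_TIMEOUT = 1800; // 30 minutes, in seconds

    /** @var array<string,int> visitor ID => timestamp of last action */
    private array $lastAction = [];
    private int $visits = 0;

    /** Record a page view; returns true when it opened a new visit. */
    public function track(string $visitorId, int $now): bool
    {
        $last = $this->lastAction[$visitorId] ?? null;
        $isNew = $last === null || ($now - $last) > self::VISIT_TIMEOUT;
        if ($isNew) {
            $this->visits++;
        }
        $this->lastAction[$visitorId] = $now;
        return $isNew;
    }

    public function visitCount(): int
    {
        return $this->visits;
    }
}

// The same three rapid requests that produced three visits with a stale
// cookie now count as one, because the server-side timestamp is current.
$store = new VisitStore();
foreach ([10000, 10005, 10010] as $t) {
    $store->track('abc123', $t);
}
echo $store->visitCount(), "\n"; // prints 1
```

The trade-off is the one mentioned earlier in the thread: one extra lookup per tracking request in exchange for not depending on the client updating its cookie.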

@anonymous-matomo-user

Can you provide a patch or will you release a new update soon?

@mattab
Member Author

mattab commented Jan 5, 2011

Please try the new beta at: http://builds.piwik.org/piwik-1.1.2b1.zip

let me know if it fixes the issue completely :)

@anonymous-matomo-user

Thanks matt.

I installed the version, let's see what happens. I will report back later today on whether it worked.

FYI: I got a JS Alert when I first opened the page :)

There is no/bad markup for form tag

Dunno if this has something to do with Piwik. However it just appeared once, now it's gone even on page reload.

@anonymous-matomo-user

Matt: Seems to work like a charm with 1.1.2b1! Great work, thanks for your fast help.

I guess the wrong counts cannot be undone in the DB, right? So my daily data (which doesn't really matter) but also my weekly and monthly data are no longer usable for analysis!?

Or might there be a way to re-parse the data of the last day?

@anonymous-matomo-user

Replying to vipsoft:

Another possibility is that the Tracker has gotten slower, and that this is a duplicate of #1108, experiencing the race condition where:

  • user browses page A, sending cookie X1
  • before server can respond with updated cookie X2, user browses page B, resending cookie X1

What happened to us yesterday and today supports this hypothesis:
After upgrading to 1.1, the issue (visit miscount) appeared. Maybe the tracker code got slower, because our Piwik server load did increase.
After upgrading to 1.1.2b1, the issue disappeared.
The issue caused the most severe spikes on sites with the most returning visitors, and on sites with a high number of actions or a high action frequency (tracked AJAX requests, for example).

I second awesome's question: is there a way to rebuild visits and repair yesterday's stats? (We can code something and contribute it if you give us some hints.)

@mattab
Member Author

mattab commented Jan 5, 2011

I haven't tested this (WARNING), but a query like this might work:

DELETE v FROM piwik_log_visit v
JOIN (
    SELECT visitor_idcookie, MIN(idvisit) AS first_idvisit
    FROM piwik_log_visit
    WHERE visit_server_date = $THE_DATE
    GROUP BY visitor_idcookie
    HAVING COUNT(*) > 1
) f ON f.visitor_idcookie = v.visitor_idcookie
WHERE v.visit_server_date = $THE_DATE
AND v.idvisit <> f.first_idvisit

This deletes every visit beyond each visitor's first visit on $THE_DATE (assuming the lowest idvisit is the first visit) and therefore keeps only one visit per visitor on that day.

Please test on a test dataset before applying it to your real one (or run it on a copy of the table).

@mattab mattab added this to the Piwik 1.2 milestone Jul 8, 2014
@mattab mattab self-assigned this Jul 8, 2014
This issue was closed.

3 participants