The <noscript> image call doesn't currently record any visit, but it could #653
Comments
To me this is a major issue, as non-javascript users still give us valuable information. I would love to see this implemented before 1.0. Suggestion 1: Couldn't the http:BL by Project Honeypot be used to filter out any bots? They offer an API to identify search engines, spammers and other bots by IP address. Piwik could work like this:
This way traffic for the blacklist server would be kept low. I still think every Piwik installation would need its own API key, though. Suggestion 2: Piwik should include its own tiny honeypot. The <noscript> tag should include a link that is invisible to the user and that has rel=nofollow.
Only malicious crawlers will follow this link, so Piwik can exclude their IPs from tracking. Known, well-behaved search bots can still be identified by User-Agent. This way, most bots will probably get identified.
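A minimal sketch of how suggestion 2 could fit together, written in Python for illustration only (Piwik itself is PHP). The `BotFilter` class, its method names, and the signature list are all hypothetical and not part of Piwik: IPs that request the hidden link are remembered, and well-behaved crawlers are recognised by a User-Agent substring.

```python
from dataclasses import dataclass, field

# Hypothetical, deliberately short list of User-Agent substrings for
# well-behaved crawlers. A real list would be much longer.
KNOWN_BOT_SIGNATURES = ("googlebot", "bingbot", "slurp")


@dataclass
class BotFilter:
    # IPs that have requested the hidden honeypot link.
    trapped_ips: set = field(default_factory=set)

    def record_honeypot_hit(self, ip: str) -> None:
        """Remember an IP that followed the invisible rel=nofollow link."""
        self.trapped_ips.add(ip)

    def classify(self, ip: str, user_agent: str) -> str:
        """Return 'search_bot', 'bad_bot' or 'visitor' for one request."""
        ua = user_agent.lower()
        if any(sig in ua for sig in KNOWN_BOT_SIGNATURES):
            return "search_bot"
        if ip in self.trapped_ips:
            return "bad_bot"
        return "visitor"
```

Only requests classified as `visitor` would then be counted as visits; the other two classes could be dropped or logged separately.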
Replying to matt:
Can I add my vote for this as well? We're missing out on visits from many mobile phone users and disabled people using screen readers, for example, because they don't have javascript. And there are legitimate reasons for disabling javascript in a normal browser as well. I agree we need to separate out the bots somehow for the statistics, but really that's a separate issue. I'd like the option of counting all visitors, even if that includes bots.
The code
will be visible to blind persons using screen-reader software. It would be better to code this as
which will also hide it from the screen readers. Hope this helps,
Charles: that's not our tracking code. Piwik's tracking code doesn't contain an anchor link (honeypot or otherwise).
re: comment:3 - The idea behind the noscript tag is to track Javascript-disabled visitors. We'll provide a hook here so third-party plugins can implement suggestion 2.
In order to report search engine bot activity, we could reuse some of the GPL code from http://www.crawltrack.net/, which is a PHP bot tracker tool. The logic could sit in a Piwik plugin. There could be a new sub tab that would report activity for each bot seen during the selected date range. Bots would be identified by user agent and/or IP; see e.g. the list at crawltrack: http://www.crawltrack.net/crawlerlist.php Additional features could include:
So I think it would also be interesting to track robots, e.g. for big sites. But this should be tracked in a separate table with a special plugin, like Live Bots ;-) In my tool http://www.spider-trap.de/en_index.html I ban a lot of bad bots. Maybe Piwik could notify the webmaster when a bot is crawling.
The Tracking API has been released, which can help track visitors without Javascript, or even track visits from mobile apps, desktop apps and more.
Currently the Piwik tracking code has a noscript which could be used to record visits from people without Javascript enabled.
There is some work required to
- filter out search engine bots
- filter out spam bots
- filter out all other types of bots
Of course this could also be used to log bots and show them in a specific Piwik report “Bot activity”.
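As a rough illustration of what a "Bot activity" report could compute, the sketch below tallies hits per bot within a date range. It is written in Python for brevity; the `bot_activity` function, the `(day, user_agent)` input format, and the `REPORTED_BOTS` table are assumptions made up for this example, not existing Piwik code.

```python
from collections import Counter
from datetime import date

# Hypothetical mapping from a User-Agent substring to a display name.
REPORTED_BOTS = {"googlebot": "Googlebot", "bingbot": "Bingbot"}


def bot_activity(hits, start: date, end: date) -> Counter:
    """Count hits per known bot for start <= day <= end.

    `hits` is an iterable of (day, user_agent) pairs.
    """
    counts = Counter()
    for day, user_agent in hits:
        if not (start <= day <= end):
            continue
        ua = user_agent.lower()
        for signature, name in REPORTED_BOTS.items():
            if signature in ua:
                counts[name] += 1
                break  # attribute each hit to at most one bot
    return counts
```

Hits whose user agent matches no signature are simply left out of the report here; a real plugin would also need the IP-based identification discussed in the comments.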
The initial design decision was to not record any visitor without Javascript as it is a lot of work to ensure that the data coming from Javascript-disabled devices is accurate and not bot initiated.
To record a visit without JS, you must call
```
piwik.php?idsite=$ID_SITE
&rec=1
&action_name=$ACTION_NAME
```
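For illustration, here is one way the request above could be assembled and wrapped in a `<noscript>` image tag. This is a Python sketch under assumptions: the helper names and the base URL are hypothetical, and only the `idsite`, `rec` and `action_name` parameters come from this issue.

```python
from html import escape
from urllib.parse import urlencode


def build_tracking_url(base_url: str, id_site: int, action_name: str) -> str:
    """Build the piwik.php URL with the three parameters listed above."""
    params = {"idsite": id_site, "rec": 1, "action_name": action_name}
    return f"{base_url.rstrip('/')}/piwik.php?{urlencode(params)}"


def build_noscript_tag(base_url: str, id_site: int, action_name: str) -> str:
    """Wrap the tracking URL in a <noscript> image tag (hypothetical markup)."""
    url = build_tracking_url(base_url, id_site, action_name)
    # '&' must be written as '&amp;' inside an HTML attribute value.
    return f'<noscript><img src="{escape(url)}" style="border:0" alt="" /></noscript>'


print(build_noscript_tag("https://example.org/piwik/", 1, "Home page"))
```

Note that `urlencode` takes care of escaping the page title, so spaces and special characters in `action_name` do not break the query string.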
See also PUSH API without Javascript #134
Keywords: bots noscript