#3194 closed Bug (fixed)

Charset or parsing error in Visitors View

Reported by: banym Owned by:
Priority: normal Milestone: 1.8.3 - Piwik 1.8.3
Component: Core Keywords: charset, visitors log
Cc: Sensitive: no

Description

After updating to 1.8, some characters are displayed incorrectly in the visitors log.

The name of my blog is "Banym's Blog".

I am not sure whether the problem was introduced while updating the database or whether it only affects how the data is displayed.

I am now using 1.8.2 and the problem still exists. 1.7 and below were fine.

Regards,

Dominik

Attachments (1)

screenshot-visitors-log.png (40.8 KB) - added by banym 23 months ago.
Screenshot of the visitors log

Change History (13)

Changed 23 months ago by banym

Screenshot of the visitors log

comment:1 Changed 22 months ago by matt (mattab)

  • Milestone set to 1.8.2 - Piwik 1.8.2

comment:2 Changed 21 months ago by capedfuzz (diosmosis)

  • Resolution set to fixed
  • Status changed from new to closed

(In [6518]) Fixes #3194, make sure smarty escape modifier doesn't double encode escaped text.

comment:3 follow-up: Changed 21 months ago by matt (mattab)

  • Resolution fixed deleted
  • Status changed from closed to reopened

Please revert: it is dangerous to change the escape mechanism used in Piwik. I'm pretty sure this would lead to XSS.

In this case there is probably another cause for the bug, namely that the data was tracked incorrectly in the first place. Maybe it is related to this: "In DataTable/Renderer.php, formatValueXml calls html_entity_decode/htmlspecialchars with ENT_COMPAT instead of ENT_QUOTES. Is this intentional?"

comment:4 in reply to: ↑ 3 Changed 21 months ago by capedfuzz (diosmosis)

Replying to matt:

Please revert: it is dangerous to change the escape mechanism used in Piwik. I'm pretty sure this would lead to XSS.

In this case there is probably another cause for the bug, namely that the data was tracked incorrectly in the first place. Maybe it is related to this: "In DataTable/Renderer.php, formatValueXml calls html_entity_decode/htmlspecialchars with ENT_COMPAT instead of ENT_QUOTES. Is this intentional?"

This specific bug occurs because the action name has &#039; in it and the Smarty escape modifier encodes it as &amp;#039;. The action name has &#039; because getRequestVar sanitizes the action name when tracking.

I suppose, instead of modifying the escape modifier, I could decode the action name in the Live plugin, but other than decoding before escaping, I can't think of a way to solve this issue...
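
The double encoding described above can be reproduced outside Piwik. The following is a minimal sketch in Python, not Piwik code: `html.escape` stands in for PHP's `htmlspecialchars` with `ENT_QUOTES`, and the variable names are illustrative assumptions. Note that Python writes the apostrophe entity as `&#x27;` where PHP writes `&#039;`; the mechanism is the same.

```python
import html

title = "Banym's Blog"

# Step 1 (assumed to mirror the tracker): the value is sanitized on input,
# so the apostrophe is stored as an HTML entity.
stored = html.escape(title, quote=True)

# Step 2 (assumed to mirror the template's escape modifier): the already
# encoded value is escaped again, so the entity's "&" becomes "&amp;".
rendered = html.escape(stored, quote=True)

# Step 3: a browser decodes entities exactly once when displaying text,
# so the user sees the entity itself instead of the apostrophe.
displayed = html.unescape(rendered)

print(stored)     # Banym&#x27;s Blog
print(rendered)   # Banym&amp;#x27;s Blog
print(displayed)  # Banym&#x27;s Blog
```

Because the browser only undoes one of the two encoding passes, the visible text is the stored, entity-encoded form rather than the original title.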

comment:5 Changed 21 months ago by capedfuzz (diosmosis)

(In [6523]) Refs #3194, reverted [6518] smarty escape modifier change.

comment:6 follow-up: Changed 21 months ago by matt (mattab)

I suppose, instead of modifying the escape modifier, I could decode the action name in the Live plugin, but other than decoding before escaping, I can't think of a way to solve this issue...

Did you replicate the original issue? I'm wondering if this is due to a tracker bug, or maybe just a browser bug. Or do you have an example that fails in all browsers?

Actually, banym, how do the page names display in the Actions > Pages and Page Titles reports? Do the reports show the names with the HTML entities?

Thanks for further information!

comment:7 in reply to: ↑ 6 Changed 21 months ago by capedfuzz (diosmosis)

Replying to matt:

I suppose, instead of modifying the escape modifier, I could decode the action name in the Live plugin, but other than decoding before escaping, I can't think of a way to solve this issue...

Did you replicate the original issue? I'm wondering if this is due to a tracker bug, or maybe just a browser bug. Or do you have an example that fails in all browsers?

I modified the VisitorGenerator plugin's access log, adding an "'" character to an entry's title. It showed up in the visitor log as "&#039;". The HTML returned contained the text "&amp;#039;", so it is not a browser issue.

The action my test created had a name that looked like this: "incredible title''!". So the bug is either in the tracker, which stores action names in their sanitized state, or in the admin frontend.

Since decoding first is used in Piwik_Common::sanitizeInputValue, I assumed using it in the smarty escape modifier wouldn't be an issue.

comment:8 follow-up: Changed 21 months ago by matt (mattab)

Maybe the bug is in the tracker, which should HTML-decode before encoding? I thought it did that already... Maybe that's a bug?

Can you confirm in your test that this page name is displayed correctly in Actions > Page Titles report?

comment:9 in reply to: ↑ 8 Changed 21 months ago by capedfuzz (diosmosis)

Replying to matt:

Maybe the bug is in the tracker, which should HTML-decode before encoding? I thought it did that already... Maybe that's a bug?

Can you confirm in your test that this page name is displayed correctly in Actions > Page Titles report?

The page name is displayed correctly: "incredible title!"

What a weird bug...

comment:10 Changed 21 months ago by matt (mattab)

Could you add a special case that decodes the page title before encoding, in the Live plugin's templates? What do you think?

comment:11 Changed 21 months ago by vipsoft (robocoder)

Basically, we're double encoding: first in getRequestVar, and then here:

http://dev.piwik.org/trac/changeset/6104/trunk/plugins/Live/templates/visitorLog.tpl

I don't recall the problem fixed by r6104, but could it be changed to:

{$action.pageTitle|unescape|urldecode|escape:'html'|truncate:80:"...":true} 
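
The decode-then-escape idea behind this template change can be sketched as follows. This is an illustration in Python, not the Smarty implementation: `render_title` is a hypothetical helper, `html.unescape` and `html.escape` stand in for the `unescape` and `escape:'html'` modifiers, and the `urldecode` and `truncate` steps are left out.

```python
import html


def render_title(stored: str) -> str:
    """Decode any existing entities once, then escape exactly once.

    Hypothetical helper: it produces the same output whether the stored
    value was already sanitized or not, and the final escape is always
    applied, so markup in the title is never emitted unescaped.
    """
    return html.escape(html.unescape(stored), quote=True)


already_encoded = "Banym&#039;s Blog"  # as sanitized with PHP-style entities
raw = "Banym's Blog"                   # never sanitized
hostile = "&lt;script&gt;alert(1)&lt;/script&gt;"

# Both forms render identically, and a browser decoding the result once
# shows the intended apostrophe.
print(render_title(already_encoded) == render_title(raw))   # True
print(html.unescape(render_title(already_encoded)) == raw)  # True

# Sanitized markup comes back out still escaped.
print(render_title(hostile) == hostile)                     # True
```

The ordering matters: escaping is the last step, which addresses the concern raised earlier in the ticket that weakening the escape mechanism itself could open the door to XSS.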

comment:12 Changed 21 months ago by capedfuzz (diosmosis)

  • Resolution set to fixed
  • Status changed from reopened to closed

(In [6631]) Fixes #3194, committed vipsoft's fix: use unescape before escaping action name in visitor log.
