Opened 6 years ago

Closed 4 years ago

Last modified 4 years ago

#41 closed New feature (fixed)

Plugin to exclude / include only some URL parameters

Reported by: matt
Owned by: matt
Priority: critical
Milestone: Piwik 0.6
Component: Core
Keywords:
Cc:
Sensitive: no

Description (last modified by matt)

This feature in phpMyVisites was very useful.
It is mostly used on websites without readable URLs (e.g. ?module=news&newsid=14&action=view).

The plugin would provide:

  • exclude given parameters from URLs
  • include only given parameters
  • exclude all parameters during statistics logging

This feature would be available:

  • "exclude parameters" would be available in a general list applying to all websites (by default it would exclude PHPSESSID, jsessionid, SESSIONID, etc.)
  • at the website level: each website defines its own parameters to exclude (evaluated on top of the global list)
  • "Include only given parameters" available for each website
  • "Exclude all parameters" available for each website
  • by default, Piwik campaign parameters would be excluded from URLs
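
The layering described in this list (a global list, with each website's own list evaluated on top of it) could be sketched as follows. This is only an illustration under stated assumptions, not the code that was committed for this ticket: the function name tidyurl_effective_exclusions and the plain-array settings are hypothetical, and lowercasing the names is an assumption made so that later matching can be case-insensitive.

```php
<?php
/**
 * Hypothetical helper: merges the global exclusion list with one website's
 * own list. Names are lowercased so that matching can be case-insensitive.
 *
 * @param string[] $globalExcluded parameters excluded for all websites
 * @param string[] $siteExcluded   parameters excluded for one website
 * @return string[] de-duplicated, lowercased parameter names
 */
function tidyurl_effective_exclusions(array $globalExcluded, array $siteExcluded)
{
    $merged = array_merge($globalExcluded, $siteExcluded);
    $merged = array_map('strtolower', $merged);
    // array_unique keeps the original keys, so reindex the result
    return array_values(array_unique($merged));
}

// Illustrative values: the global defaults come from the description,
// the per-website entry is made up for the example.
$global    = array('PHPSESSID', 'jsessionid', 'SESSIONID');
$site      = array('userid');
$effective = tidyurl_effective_exclusions($global, $site);
```

Keeping the merge in one place means the tracker only ever has to consult a single list per website.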

The UI would call the API directly, hence ensuring that all these features are available through the API too.

For example, the URL example.com/page/index.php?userid=8571498752487&module=homepage could become example.com/page/index.php?module=homepage after removing the userid parameter.
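
A minimal sketch of that transformation, assuming the URL carries no #fragment and that parameter names are compared case-insensitively. The function name tidyurl_exclude_parameters is hypothetical and is not part of Piwik's API.

```php
<?php
/**
 * Hypothetical sketch: removes the given query parameters from a URL.
 *
 * @param string   $url      URL to tidy (assumed to have no #fragment)
 * @param string[] $excluded parameter names to drop, compared case-insensitively
 * @return string URL without the excluded parameters
 */
function tidyurl_exclude_parameters($url, array $excluded)
{
    $queryPos = strpos($url, '?');
    if ($queryPos === false) {
        return $url; // no query string, nothing to filter
    }
    $base  = substr($url, 0, $queryPos);
    $query = substr($url, $queryPos + 1);

    $excluded = array_map('strtolower', $excluded);
    $kept = array();
    foreach (explode('&', $query) as $pair) {
        if ($pair === '') {
            continue; // skip empty segments such as a trailing "&"
        }
        // the parameter name is everything before the first "="
        $parts = explode('=', $pair, 2);
        $name  = strtolower(urldecode($parts[0]));
        if (!in_array($name, $excluded, true)) {
            $kept[] = $pair; // keep the pair exactly as it was written
        }
    }
    return count($kept) > 0 ? $base . '?' . implode('&', $kept) : $base;
}

$tidy = tidyurl_exclude_parameters(
    'example.com/page/index.php?userid=8571498752487&module=homepage',
    array('userid')
);
// $tidy is 'example.com/page/index.php?module=homepage'
```

An "include only given parameters" mode could reuse the same loop with the in_array test inverted, and "exclude all parameters" amounts to returning $base.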

The UI for this feature should be designed as part of a "preference page" for each website, since several new per-website preferences need to be added: #41, #42, #43, #56. Ideally, the whole UI would use AJAX, so that it is very quick to go from the list of websites in the admin UI to one website's details page and back to the list.

Outstanding question: should it be in the SiteManager plugin, or a new plugin? Or should it be part of the core (to minimize the overhead of loading plugins at Tracker time)?

Change History (25)

comment:1 Changed 6 years ago by matt (mattab)

  • Description modified (diff)

comment:2 Changed 6 years ago by matt (mattab)

  • Milestone set to Future features

comment:3 Changed 5 years ago by matt (mattab)

  • Description modified (diff)
  • Milestone changed from Features requests - after Piwik 1.0 to Stable release

comment:4 Changed 5 years ago by matt (mattab)

  • Priority changed from major to critical

comment:5 Changed 5 years ago by matt (mattab)

  • Component changed from Plugins to Core

comment:6 in reply to: ↑ description Changed 5 years ago by ahus1

This might be solved for the time being with a small plugin. To discuss: alexander dot schwartz at gmx dot net. This has hardcoded cleaning for sessions.

<?php
/**
 * Piwik - Open source web analytics
 *
 * @link http://piwik.org
 * @license http://www.gnu.org/licenses/gpl-3.0.html Gpl v3 or later
 * @version $Id$
 *
 * @package Piwik_TidyUrl
 */

require_once "Tracker/Action.php";

class Piwik_TidyUrl_Tracker_Action extends Piwik_Tracker_Action {
    /**
     * Returns the action name with session-related parameters stripped.
     */
    public function getActionName() {
        $actionName = parent::getActionName();
        $actionType = parent::getActionType();
        // only tidy actions of type 1 (hardcoded type check)
        if ($actionType == 1) {
            // remove tomcat jsession id
            $actionName = preg_replace("/;jsessionid=[A-Za-z0-9\.]*/", "", $actionName);
            // remove seam conversation id (assuming always at end of url)
            $actionName = preg_replace("/&amp;conversationId=[0-9]*/", "", $actionName);
            $actionName = preg_replace("/\\?conversationId=[0-9]*/", "", $actionName);
        }
        return $actionName;
    }
}

/**
 *
 * @package Piwik_TidyUrl
 */
class Piwik_TidyUrl extends Piwik_Plugin
{
    public function getInformation()
    {
        $info = array(
            'name' => 'TidyUrl',
            'description' => 'TidyUrl',
            'author' => 'ahus1',
            'homepage' => 'http://www.ahus1.de/',
            'version' => '0.1',
            'TrackerPlugin' => true, // this plugin must be loaded during the stats logging
        );

        return $info;
    }

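    public function getListHooksRegistered()
    {
        $hooks = array(
            'Tracker.newAction' => 'logTidyUrl',
        );
        return $hooks;
    }

    /**
     * URL Tidy: swaps the tracker's action object for the tidying subclass.
     */
    public function logTidyUrl($notification)
    {
        // the notification object is bound by reference, so the assignment
        // below is intended to replace the action used by the tracker
        $action =& $notification->getNotificationObject();
        $action = new Piwik_TidyUrl_Tracker_Action();
    }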

}

comment:7 Changed 5 years ago by vipsoft (robocoder)

I still like the idea of possibly doing the tidying in piwik.js (see #519).

comment:8 Changed 5 years ago by vipsoft (robocoder)

  • Milestone changed from 4- Stable release to 2- DigitalVibes
  • Sensitive unset

comment:9 Changed 5 years ago by domtop

comment:10 Changed 5 years ago by koteiko

comment:11 Changed 5 years ago by vipsoft (robocoder)

  • Owner set to vipsoft

comment:12 Changed 5 years ago by matt (mattab)

  • Description modified (diff)

comment:13 Changed 5 years ago by matt (mattab)

  • Description modified (diff)

comment:14 Changed 5 years ago by vipsoft (robocoder)

  • Owner vipsoft deleted

comment:15 Changed 4 years ago by vipsoft (robocoder)

In #1023, the user appears to propagate/persist the campaign parameters in the URL.

In conjunction with #79, we could have an option to filter all campaign parameters.

comment:16 Changed 4 years ago by matt (mattab)

  • Description modified (diff)

comment:17 Changed 4 years ago by ralf.stoltze

While excluding params as a whole is required for session params, I could also see a drill-down behaviour for certain parameters (like it's already done with sites and folders in Actions/Pages).

First, I get a total of all hits, and on click these hits are split based on parameter value.

Example:

/index.php -> 6 pageviews

After drill down:

/index.php?message=logout -> 5 pageviews

/index.php?message=invalid-credentials -> 1 pageview

This is perfectly possible for only one parameter, but gets tricky (at least UI-wise) with more params.
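
The single-parameter drill-down could be sketched like this. It is a hypothetical illustration only: the function name tidyurl_drill_down, the in-memory list of URLs and the shape of the returned array are all assumptions, not how Piwik archives page views.

```php
<?php
/**
 * Hypothetical sketch of the drill-down: groups page views by path, then by
 * the value of one chosen parameter.
 *
 * @param string[] $urls      one entry per page view
 * @param string   $parameter parameter used for the second level
 * @return array path => array('total' => int, 'values' => array(value => int))
 */
function tidyurl_drill_down(array $urls, $parameter)
{
    $report = array();
    foreach ($urls as $url) {
        $parts = explode('?', $url, 2);
        $path  = $parts[0];
        $query = array();
        if (isset($parts[1])) {
            parse_str($parts[1], $query);
        }
        // a missing or array-valued parameter is counted under the empty value
        $value = (isset($query[$parameter]) && is_string($query[$parameter]))
            ? $query[$parameter]
            : '';

        if (!isset($report[$path])) {
            $report[$path] = array('total' => 0, 'values' => array());
        }
        $report[$path]['total']++;
        if (!isset($report[$path]['values'][$value])) {
            $report[$path]['values'][$value] = 0;
        }
        $report[$path]['values'][$value]++;
    }
    return $report;
}

// The example from this comment: 6 page views on /index.php.
$views = array_merge(
    array_fill(0, 5, '/index.php?message=logout'),
    array('/index.php?message=invalid-credentials')
);
$report = tidyurl_drill_down($views, 'message');
```

Here $report['/index.php']['total'] is the top-level count and $report['/index.php']['values'] holds the per-value split shown after the drill-down.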

comment:18 Changed 4 years ago by matt (mattab)

Also, the page providing this feature could have an option "Record all page names as lowercase to avoid duplicated page names with or without capital letters"

(if you have a better wording please suggest)

comment:19 Changed 4 years ago by vipsoft (robocoder)

From #1180, add ability to filter out the anchor/fragment after the hashmark.
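
Stripping the fragment could be as small as the sketch below. The helper name tidyurl_strip_fragment is hypothetical; it simply drops everything from the first "#" onwards.

```php
<?php
/**
 * Hypothetical helper: drops the fragment, i.e. everything from the first "#".
 *
 * @param string $url
 * @return string URL without its fragment
 */
function tidyurl_strip_fragment($url)
{
    $hashPos = strpos($url, '#');
    return $hashPos === false ? $url : substr($url, 0, $hashPos);
}

$clean = tidyurl_strip_fragment('example.com/page/index.php?module=news#comments');
// $clean is 'example.com/page/index.php?module=news'
```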

comment:20 Changed 4 years ago by matt (mattab)

  • Milestone changed from 2 - Piwik 0.7 - DigitalVibes to 1 - Piwik 0.6
  • Owner set to matt

comment:21 Changed 4 years ago by matt (mattab)

  • Resolution set to fixed
  • Status changed from new to closed

(In [2023]) Fixes #41 Adding a URL query parameters exclude setting, per website and global. We also exclude sessionid, phpsessid, etc. by default.
The query parameters are excluded case-insensitively.

comment:22 Changed 4 years ago by matt (mattab)

(In [2024]) Refs #41 The url shouldn't be htmlspecialchared

Also fixing a notice when triggering a goal manually (piwik.trackGoal(goalId)), where the location_ip used to get the country wasn't set for a known visitor

comment:23 Changed 4 years ago by vipsoft (robocoder)

  • Resolution fixed deleted
  • Status changed from closed to reopened

There's a typo in the 0.6 update script. excluded_parameters should be added to the table after excluded_ips is added.

comment:24 Changed 4 years ago by vipsoft (robocoder)

  • Resolution set to fixed
  • Status changed from reopened to closed

(In [2037]) fixes #41 - re-order schema change (dependencies)

comment:25 Changed 4 years ago by vipsoft (robocoder)

(In [2193])
refs #41, refs #1347 - regenerate cache files
