Ticket #558 (new New feature)

Opened 3 years ago

Last modified 2 years ago

ActiveProvider plugin: Enhanced Provider Report uses Public Suffix List for second and third level domains

Reported by: feyp
Owned by:
Priority: normal
Milestone: Third Party Piwik Plugins
Component: New Plugin
Keywords:
Cc:
Sensitive: no

Description (last modified by vipsoft)

ActiveProvider

The ActiveProvider plugin enhances the Provider report by using the Public Suffix List, which is supported across vendors and hosted by the Mozilla Foundation at http://publicsuffix.org/, to more accurately detect second- and third-level domains for a given top-level domain.

The core Provider plugin does not recognize many public second- and third-level domains. For example, a visit from $customer.$provider.co.uk is correctly recognized as $provider, but a visit from $customer.$provider.edu.hk is recognized as edu.hk, not as $provider.
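The difference can be sketched in a few lines of PHP. This is only an illustration, not the plugin's code: the function name findProviderLabel and the five-entry $publicSuffixes array are hypothetical, and the real Public Suffix List is far larger and also has wildcard and exception rules that this sketch ignores.

```php
<?php
// Hypothetical excerpt of the Public Suffix List (assumption: the real list
// is much larger and includes wildcard/exception rules not handled here).
$publicSuffixes = array(
    'uk' => true, 'co.uk' => true,
    'hk' => true, 'edu.hk' => true,
    'com' => true,
);

/**
 * Return the label immediately to the left of the longest matching suffix.
 */
function findProviderLabel($hostname, array $publicSuffixes)
{
    $labels = explode('.', strtolower($hostname));
    $count = count($labels);

    // Try the longest candidate suffix first, always leaving at least
    // one label to the left of it for the provider name.
    for ($i = 1; $i < $count; $i++) {
        $suffix = implode('.', array_slice($labels, $i));
        if (isset($publicSuffixes[$suffix])) {
            return $labels[$i - 1];
        }
    }

    // No known suffix: fall back to the hostname unchanged.
    return $hostname;
}
```

With this lookup, both customer.provider.co.uk and customer.provider.edu.hk resolve to the label "provider", because edu.hk is treated as a suffix in the same way as co.uk.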

Performance

  • added memory footprint approx. 960K
  • execution overhead (including load time) approx. 50ms on an (ancient) Athlon 1.4 GHz processor (test box), without APC

Requirements

  • Piwik 0.5.5 (or above)

To install

  • Extract the attached .zip file in the plugins folder.
  • Activate the plugin from the Settings | Plugins tab.

For support

  • Send me a tweet @vipsoft
  • DO NOT CREATE A NEW TICKET HERE

Attachments

ActiveProvider.zip (18.4 KB) - added by vipsoft 2 years ago.
2010.01.08 update of Public Suffix List
ActiveProvider-1.2.1.zip (19.4 KB) - added by vipsoft 11 months ago.
2011-03-01 update of Public Suffix List

Change History

Changed 3 years ago by vipsoft

  • type changed from Bug to New feature
  • milestone set to Features requests - after Piwik 1.0

There's quite a bit more overhead to using the Public Suffix List, both in terms of administration (e.g., keeping up to date with changes to the list) and performance (compared to the current method in plugins/Provider/Provider.php).

        private function getCleanHostname($hostname)
        {
                $extToExclude = array(
                        'com', 'net', 'org', 'co'
                );

Attached patch uses Toby Inkster's GPL'd code, found at http://tobyinkster.co.uk/blog/2007/07/19/php-domain-class/

Changed 3 years ago by matt

The overhead of using this class and the .dat list is quite big, and we really want to keep execution of piwik.php optimal. It would be disabled by default. However, I think the best option in this case might be to provide a plugin to do this advanced check. vipsoft, maybe you could add a hook in getCleanHostname that would load the hostname from a plugin if the hook is listened to, or default to the normal simple algorithm. Creating the plugin from your existing patch will be easy. Does that make sense?
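A rough sketch of that hook-with-fallback idea follows. Everything here is an assumption made for illustration: the listener array is a stand-in and not Piwik's event API, defaultCleanHostname only reproduces the two behaviours described in this ticket rather than the real code in Provider.php, and exampleSuffixListener is a toy listener.

```php
<?php
// Simplified stand-in for the core algorithm (assumption, not Provider.php):
// it skips one generic second-level label such as "co".
function defaultCleanHostname($hostname)
{
    $extToExclude = array('com', 'net', 'org', 'co');
    $labels = explode('.', strtolower($hostname));
    $n = count($labels);

    if ($n >= 3 && in_array($labels[$n - 2], $extToExclude)) {
        return $labels[$n - 3];
    }
    if ($n >= 2) {
        return $labels[$n - 2] . '.' . $labels[$n - 1];
    }
    return $hostname;
}

// Hypothetical hook: ask each listener first, fall back to the default.
function getCleanHostname($hostname, array $listeners)
{
    foreach ($listeners as $listener) {
        // A listener returns a string when it can resolve the host, else null.
        $result = call_user_func($listener, $hostname);
        if ($result !== null) {
            return $result;
        }
    }
    return defaultCleanHostname($hostname);
}

// Toy listener standing in for a Public Suffix List plugin.
function exampleSuffixListener($hostname)
{
    return $hostname === 'customer.provider.edu.hk' ? 'provider' : null;
}
```

With no listeners registered, the cost is one empty loop and the simple algorithm runs as before; the heavier lookup is paid only when a plugin subscribes.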

Changed 3 years ago by vipsoft

I should have done that the first time. ;)

Changed 3 years ago by vipsoft

  • owner set to vipsoft
  • status changed from new to assigned
  • summary changed from Provider plugin doesn't recognize public second level domains (sld) in all cases to Plugin: Enhanced Provider to recognize public suffix list (second level domains)

Changed 2 years ago by vipsoft

  • owner vipsoft deleted
  • sensitive unset
  • status changed from assigned to new

Changed 2 years ago by vipsoft

(In [1750]) refs #558 - add Provider.getCleanHostname hook

Changed 2 years ago by vipsoft

  • milestone changed from Features requests - after Piwik 1.0 to 1 - Piwik 0.5.5

Changed 2 years ago by vipsoft

  • status changed from new to closed
  • resolution set to fixed

(In [1753]) fixes #558 - plugin to use Public Suffix List to enhance Provider report; the register-domain-libs contains a PHP data structure to represent the contents of effective_tld_names.dat -- this loads and executes much faster than the implementation using Domain.class.php (and can be opcode cached)
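To illustrate why a native PHP structure can be cheaper than parsing the .dat file on every request, here is one possible shape for such data: a nested array keyed from the top-level label inward. This is a hypothetical sketch, not the library's actual layout; $suffixTree holds only a few made-up sample entries, registeredDomain is an illustrative name, and wildcard and exception rules are left out.

```php
<?php
// Hypothetical fragment: suffix rules keyed from the top-level label inward.
// An empty array marks a suffix with no deeper rules.
$suffixTree = array(
    'uk'  => array('co' => array(), 'ac' => array()),
    'hk'  => array('edu' => array(), 'com' => array()),
    'com' => array(),
);

function registeredDomain($hostname, array $tree)
{
    $labels = array_reverse(explode('.', strtolower($hostname)));
    $total = count($labels);
    $node = $tree;
    $depth = 0;

    // Walk from the top-level label toward the host while rules match.
    while ($depth < $total && isset($node[$labels[$depth]])) {
        $node = $node[$labels[$depth]];
        $depth++;
    }

    if ($depth === 0 || $depth >= $total) {
        // Unknown top-level label, or the hostname is itself a suffix.
        return null;
    }

    // Keep the matched suffix plus one more label to its left.
    return implode('.', array_reverse(array_slice($labels, 0, $depth + 1)));
}
```

Because the tree is a plain PHP array literal, it needs no parsing at request time and an opcode cache can keep the compiled form in memory.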

Changed 2 years ago by vipsoft

(In [1776]) refs #558, revert [1753], remove PublicSuffix plugin from core per matt's review

Changed 2 years ago by vipsoft

(In [1777]) refs #558, revert [1753], move these lines back to top per matt's review

Changed 2 years ago by vipsoft

  • status changed from closed to reopened
  • summary changed from Plugin: Enhanced Provider to recognize public suffix list (second level domains) to Plugin: Enhanced Provider Report using Public Suffix List for second and third level domains
  • resolution fixed deleted
  • description modified (diff)
  • milestone changed from 1 - Piwik 0.5.5 to Third Party Piwik Plugins

Changed 2 years ago by vipsoft

  • status changed from reopened to new
  • description modified (diff)
  • summary changed from Plugin: Enhanced Provider Report using Public Suffix List for second and third level domains to ActiveProvider plugin: Enhanced Provider Report uses Public Suffix List for second and third level domains

Changed 2 years ago by vipsoft

  • description modified (diff)

Changed 2 years ago by vipsoft

2010.01.08 update of Public Suffix List

Changed 11 months ago by vipsoft

2011-03-01 update of Public Suffix List
