Opened 23 months ago

Closed 14 months ago

Last modified 14 months ago

#3151 closed New feature (fixed)

PiwikTracker.java sync with PHP tracker and release 1.0

Reported by: matt Owned by:
Priority: normal Milestone: 1.x - Piwik 1.x
Component: Core Keywords:
Cc: klaus, JulienM, miyazaki, mfochler, user08, exactt Sensitive: no

Description (last modified by halfdan)

Follow up from #2172

The goal of this ticket is to improve the JAVA tracker and sync it with the PHP tracker in terms of features and public APIs.

Thank you to all contributors!

Code is available in: Java Tracking Client

Attachments (1)

SimplePiwikTracker.zip (10.9 KB) - added by thivnor 23 months ago.
V1.1 of SimplePiwikTracker.java

Download all attachments as: .zip

Change History (20)

comment:1 Changed 23 months ago by thivnor

OK I finally got some time to work on this. My changes are in good shape I think. I should be able to complete testing this week, so I can attach my updated code to this bug report..

In the meanwhile a few comments, with specific questions at the end.

I’ve fixed the visitor ID logic so it uses cookies like "_pk_id.6.1fff" not "piwik_visitor.” This is the only change which is really NEEDED.

I’ve updated the logic which examines the HttpServletRequest so that it is proxy-server aware. If the user is being a proxy server, which sets “X-Forwarded-For,” we use that information instead of the IP address from the HttpServletRequest. This way we get the IP address of the user not the IP address of the proxy server.

On the version numbering, this version should probably be v1.1 or perhaps 2.0. Versions 1.0.0 through 1.0.5 have been posted under Ticket #2172.

Other changes:

When checking a visitor ID, I’ve checking that we have a 16 character hexadecimal string. This is more strict than the PHP code which just checks the String’s length.

The comments indicate that Java 1.5 is required but the code really requires Java 1.6. (It uses java.net.HttpCookie and String.isEmpty)

I’ve removed the IPiwikTracker interface. I can’t think of any scenario where you would want it.

I’ve cleaned up the API, removing misnamed methods: getDownloadTackURL and getLinkTackURL.

Methods which accept a Goal now take an int not a String. The PHP code says revenue is also an int, but in the database it is a float. So I think we should have:

public final URL getGoalTrackURL(final int goal, final float revenue) { .. }

Implemented methods like doTrackGoal(final int goal, final float revenue)

In the sendRequest method, it would silently do nothing if the supplied URL was null. This does not jive with the PHP implementation and doesn’t make much sense. Now, if the supplied URL is null, you will get a NullPointerException.

I haven’t made this change yet, but I feel like the PiwikException class is a mistake. It takes all sorts of different problems and lumps them into a single PiwikException class. For example a caller can no longer easily differentiate between IOException (dead network connection) from MalformedURLException (probably bad programming). You can presumably use getCause() but that is awkward. I would much prefer to have a method like setApiurl throw MalformedURLException if the supplied URL is invalid or a raw NullPointerException if it was null. Why obfusticate the error?

I would personally return the original exception, for example a MalformedURLException, ParseException or IOException. In some cases there is no original exception but IllegalArgumentException is appropriate in almost all cases. The only case where IllegalArgumentException is not appropriate is when the server does not return a success code. In that case I would add HttpResponseException as an inner class.

Questions:

1) What version should we call this? I would recommend v1.1 or v2.0.

2) Does it make sense to remove PiwikException and return the original exceptions?

3) Why does the PHP code document revenue as an int? A float seems more appropriate.

comment:2 Changed 23 months ago by matt (mattab)

thivnor, you can commit directly your changes rather than attaching patches.

1) 1.1 sounds good, or maybe even 1.8 to sync with piwik ;)

2) Yes
Sounds good to return original exception if it simplifies the code.

3) will fix

comment:3 Changed 23 months ago by matt (mattab)

(In [6411]) Refs #3151 Floats

Changed 23 months ago by thivnor

V1.1 of SimplePiwikTracker.java

comment:4 Changed 23 months ago by thivnor

I don't have an account and I'm not up to speed on SVN. So I've just attached my updated file. I've used .zip because my .java file was rejected ... too many URLs made it look like spam.

I've put change notes in the file itself.

But at a high level I've fixed the bug, supported operation behind a proxy server, made a variety of API changes, and put everything into a single file.

So, matt, if you check this in you can remove all the other .java files.

The "single file" aspect makes it easy to import, it is just one file so if you want to import it into your code base it is easy to do. To do this, I converted two classes to inner classes, this is easy, and I think appropriate. The "ResponseData" class is "what you get after calling something like doTrackGoal." The "EBrowserPlugins" is also pretty 'internal' to the SimplePiwikTracker class. So I don't feel I'm forcing this ... it feels like a single file is a good fit for what the package is going. This also lines up with the PHP implementation which is a single file.

In terms of API changes, the big thing is how exceptions are handled. Rather than a single monolithic "PiwikException" you get specific exceptions. This makes it easy to handle transient problems like IOException differently from programming errors like MalformedURLException.

In many cases the class has methods I don't see the value of. The point of the tracker is to track what the real user is doing. So having methods like setResolution, which replace the user's real screen resolution, makes little sense. I've left them in place but don't see the value. I've included JavaDoc comments to be careful with setTokenAuth / setVisitorId, I'm not sure how they would interact.

Because of the API changes, the unit tests are totally broken. I have not yet been converted to the 'unit testing religion' as it were. I review the existing unit tests, somewhat quickly, and saw a lot of tests which didn't have much value. If there were a handful of tests which tested the overall class I could see taking the time to get the tests in line with the new class. But so many of the tests are just busy work, I don't see the point.

My guess is that piwik.org doesn't use Java-based unit testing in an automated way so this should not be a problem in terms of your build system. If this does create a problem however we could rename SimplePiwikTracker.java to PiwikTracker.java and leave the other code in place. Or, if someone makes a compelling case for unit-testing this class, I could probably the time to get up to speed on JUnit.

In terms of my own testing coverage, I am focused on goal-tracking, so that is the specific thing I've tested. I don't expect problems with the other methods however as I haven't messed with their logic. I have not tested the cookie handling: for example if you use a track to track several things, it should get the cookie from the server, save it, and use it in subsequent requests. I've removed logic related to XDEBUG cookies. I don't know what they are but the logic seemed really fishy and likely to break normal cookie handling. If anyone can explain what is up with XDEBUG cookies I can revisit this.

I have tested behavior behind a proxy server. The code now correctly reports the IP and request URL, even when behind a proxy server. I've specifically tested against Apache httpd. One odd limitation is that the proxy server doesn't send the original request URL, and I don't know how to determine the original port and protocol used for the request. So if you have a secure proxy server my code will misreport it. My code just assumes the proxy server is using http on port 80. The code is well commented and should be easy to change if it is a problem for anyone.

I've used v1.1 not v1.7 as I don't want to imply this is tied to the Piwik version. This code should be fine as long as the REST API stays stable and the cookie format doesn't change.

comment:5 Changed 23 months ago by matt (mattab)

  • Cc klaus JulienM miyazaki mfochler user08 exactt added

Adding everyone from previous tickets.

If you use the Piwik JAVA class, please give your opinion on the new code submitted by thivnor in this ticket. Your feedback is appreciated.

@thivnor all your changes make sense to me.

So having methods like setResolution, which replace the user's real screen resolution, makes little sense.

It is important to leave all API functions, because some devices have access to the resolution (for example in a mobile app, you know the resolution and actually set it if you track mobile apps analytics).

if someone makes a compelling case for unit-testing this class, I could probably the time to get up to speed on JUnit.

Tests are not critical for this class, especially if there are several developers looking out and reviewing changes. But if you build test eventually we could it on our continuous integration server.

If anyone can explain what is up with XDEBUG cookies I can revisit this.

XDEBUG are only related to PHP debugging and must be removed from other clients

Thanks for your changes! Will wait for someone else to confirm and commit

comment:6 Changed 23 months ago by matt (mattab)

  • Description modified (diff)

comment:7 Changed 23 months ago by klaus

Thanks thivnor for the great update! I will test it as soon as I have time.

comment:8 Changed 21 months ago by dnebing

A couple of important things to note:

  1. doTrackDownload() method is misspelled as doTrackDowlonad().
  2. The isTrackingCookie() method makes assumptions that the tracking cookie has the site id embedded in it. The default install of piwik, if third party cookies are enabled, uses only _pk_uid for the cookie. This should probably be noted somewhere.
  3. The addParameter(String rootQuery, String name, boolean selection) method is just wrong. Piwik expects booleans to be integers (0=false, 1=true) but this method is outputting "false"/"true", which piwik will not handle correctly.
  4. The attached code is already out of sync again as it does not include the ecommerce or goal campaign elements.

And just in general, I think that just because the PiwikTracker.php code all fits in a single page does not mean that the java implementation should fit within a single class file. Call me a java purist or whatever, but there is no separation of concerns (separating visitor info from page info, mixing the html connection stuff into the query building and other logic). And using strings in java to concatenate the query, seriously?

I found the code to only be useful as a reference for how it could be done, but certainly not how it should be done.

That said, I thank you all for your contributions anyway...

comment:9 Changed 21 months ago by thivnor

dnebing, thanks for the code review.

A little context may be in order. I don't think this code is widely used. When I started using v1.0.2, I found it very broken. It would record visits, but would not assign them to the correct visitor ID. It was using the wrong cookie to extract the visitor ID, among other problems. The original authors don't seem to be around anymore. So I volunteered to make the code at least basically functional. Or to be more exact: so it would meet my very simple needs.

In the process I did some code tidying. For example the exception handling is much more direct now. But this code is not a priority for me. I don't know much about Piwik, and am not particularly proficient in PHP. So, for example, I am unlikely to work on compatibility with the 'ecommerce or goal campaign elements.' Similarly, on 'separation of concerns,' I'm not going to spend a lot of time making the code perfectly tidy. As they say in the Perl community, 'There's more than one way to do it.'

Context aside, the typo is easily fixed, as is the issue relating to how Piwik expects booleans to be represented ('0' not 'false').

But what most concerns me here is the issue around cookie names. You note that "The default install of piwik, if third party cookies are enabled, uses only _pk_uid for the cookie." As mentioned, I'm not Piwik guru. I have used PiwikTracker.php as the definitive reference, and it does not appear to respect _pk_uid as a cookie name. The relevant function is getVisitorId, which has:

    $idCookieName = 'id.'.$this->idSite.'.';

Can anyone clarify what cookies Piwik will use?

comment:10 Changed 21 months ago by matt (mattab)

@thivnor Thanks again for taking over this class and maintaining it for the Piwik Java dev community, it's very appreciated!!

Regarding the cookie name, we don't currently use the third party in the PHP tracker so the Java is not expected to do it. Maybe we could add this in the PHP tracker, in which case @dnebing please create a ticket for this separate request.

@dnebing please submit a patch & the new zip for ecommerce tracking :) http://piwik.org/participate/development-process/#toc-how-to-submit-a-patch

comment:11 Changed 19 months ago by matt (mattab)

I just realized this new code is not in Piwik trunk in: http://dev.piwik.org/trac/browser/trunk/libs/PiwikTracker/java/src/main/java/org/piwik

should we update SVN with your latest code? can you confirm it's working well? Has anyone else tested it ? thanks!

comment:12 Changed 19 months ago by DoubleMalt

I checked out the trunk version today and just replaced it with 1.1.

I will use it in production soon, and have a high interest that it works, so I'm happy to give it some love.

I will do a raw check of the functionality and then report back.

On the long haul I would also propose to refactor this class. But until then I'm happy to provide updates and bugfixes.

comment:13 Changed 19 months ago by thivnor

In my option the v1.1 I posted is a significantly better than the prior version, simply because it uses the correct tracking cookie. There are clearly numerous potential improvements and fixes. But I do think this is the version to check in. Note that my .zip has a single file, which replaces all the files of the original. So you probably want to delete the other files; they are no longer needed.

I have not integrated the two recommendations from dnebing:

1) Replace "doTrackDowlonad" with "doTrackDownload"

2) The addParameter method which accepts a boolean, on line 916, is apparently not formatting booleans correctly for Piwik. I have not observed this behavior nor tested the fix below. But assuming dnebing is correct about what Piwik accepts ("0" not "false"), this change should do the trick:

CURRENT CODE

private String addParameter(final String rootQuery, final String name, final boolean selection) {
  return this.addParameter(rootQuery, name, String.valueOf(selection), true);
}

RECOMMENDED CHANGE

private String addParameter(final String rootQuery, final String name, final boolean selection) {
  return this.addParameter(rootQuery, name, selection ? "1" : "0", true);
}

comment:14 Changed 18 months ago by trevisan

I'm sorry, I am trying to use this java API but I cannot find the doTrackDowlonad method in any of the files attached. Am I missing something? How do you send the tracking request?

comment:15 Changed 18 months ago by thivnor

Line 268:

public ResponseData doTrackDowlonad(final String downloadurl) throws IOException, HttpResponseException, BadCookieException {

(Sorry "download" is misspelled)

comment:16 Changed 14 months ago by halfdan

I extracted the code + history into its own repository on Github: http://github.com/piwik/piwik-java-tracking

In case you'd like to make any changes please submit a pull request - they are very welcome!

comment:17 Changed 14 months ago by halfdan

  • Description modified (diff)

comment:18 Changed 14 months ago by matt (mattab)

  • Resolution set to fixed
  • Status changed from new to closed

Awesome work on the migration @halfdan!

Let's close this ticket now, and please report any bug you experience, with the Java tracking API at github https://github.com/piwik/piwik-java-tracking

comment:19 Changed 14 months ago by matt (mattab)

and of course if anyone wants write access to the repo on github please make one pull request and ask for it, it would be nice to find official maintainers who are using the API for real projects.

Note: See TracTickets for help on using tickets.