Opened 23 months ago

Closed 20 months ago

Last modified 19 months ago

#3177 closed Task (fixed)

QA: Add benchmarking system to test speed of different parts of Piwik in different situations

Reported by:  capedfuzz          Owned by:
Priority:     normal             Milestone:  1.x - Piwik 1.x
Component:    Core               Keywords:
Cc:                              Sensitive:  no

Description

When optimizing or doing major refactoring, it's necessary to run performance tests to make sure new changes are as fast as or faster than the unchanged code. There should be a system in place to allow this sort of testing.

Some ideas on how benchmarking should be done:

  • Should reuse as much of the testing infrastructure as possible (see the sketch after this list).
  • Since most benchmarks will require a large amount of data and thus will take a long time to run, the benchmarks should not be run with the tests.
  • If possible, the benchmarks should communicate with the browser when the test data has finished being added. (I think this would have to be done via forking & IPC.)
  • Allow the number of sites/visits/goals/etc. to be configured (via the browser) for each benchmark.
  • If possible, find some way to quickly load test data.
  • Integrate some or all benchmarks w/ Jenkins. They shouldn't be run w/ the other tests, but it should be possible to run them through Jenkins and get some performance analytics.
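To make the idea concrete, here is a rough sketch of the kind of timing harness such a system could provide on top of the test infrastructure. All names are illustrative and the setup/body callables are placeholders, not an actual implementation:

    <?php
    // Illustrative sketch only: a tiny timing harness a benchmark could reuse
    // on top of the existing test infrastructure. The setup and body callables
    // are placeholders for real fixture/archiving code.
    function runBenchmark($name, $setUp, $body)
    {
        call_user_func($setUp);              // insert sites/visits; excluded from timing
        $start   = microtime(true);
        call_user_func($body);               // the code being measured, e.g. archiving
        $elapsed = microtime(true) - $start;
        printf("%s: %.2fs\n", $name, $elapsed);
        return $elapsed;
    }

    // Usage (placeholders for the real fixture and archiving calls):
    runBenchmark(
        'one site, 10,000 page views, one day',
        function () { /* add test visits here */ },
        function () { /* trigger archiving here */ }
    );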

Attachments (7)

3177.diff.tar.gz (4.2 KB) - added by capedfuzz 23 months ago.
Patch for this issue.
screenshot.pdf (66.3 KB) - added by capedfuzz 23 months ago.
Screenshot of patch.
screenshot.png (90.5 KB) - added by capedfuzz 23 months ago.
Real screenshot. Ignore the other one.
3177.diff.tar.2.gz (7.3 KB) - added by capedfuzz 22 months ago.
New patch. Improved speed for adding visits.
3290.diff.tar.gz (4.4 KB) - added by capedfuzz 20 months ago.
benchmarking w/ phpunit
3290.diff.tar.2.gz (5.0 KB) - added by capedfuzz 20 months ago.
Another patch for phpunit benchmarking.
generate.coffee (15.1 KB) - added by capedfuzz 20 months ago.
Ugly CoffeeScript script used to generate visits.


Change History (38)

Changed 23 months ago by capedfuzz (diosmosis)

Patch for this issue.

Changed 23 months ago by capedfuzz (diosmosis)

Screenshot of patch.

Changed 23 months ago by capedfuzz (diosmosis)

Real screenshot. Ignore the other one.

comment:1 Changed 23 months ago by capedfuzz (diosmosis)

Added a patch for this issue. It contains a system for benchmarking and the following benchmarks:

  • Testing archiving for one site w/ 10,000 page views in one day.
  • Testing archiving for one site w/ 10,000 page views over an entire year.
  • Testing archiving for 1000 sites w/ 12 page views each in one day.
  • Testing archiving for 1000 sites w/ 12 page views each over an entire year.
  • Testing tracking of 1000 page views.

I didn't test all of them since they can take up to ~40 mins to set up, but here are the results of the ones I did test:

  • Tracking 1000 page views: 242s
  • One website w/ 10,000 page views in one day: ~12.5s
  • 1000 websites w/ 12 page views, all in one day: ~178.6s

Some other notes:

comment:2 Changed 23 months ago by matt (mattab)

  • Milestone set to 1.x - Piwik 1.x

comment:3 follow-up: Changed 22 months ago by matt (mattab)

Really interesting concept -- it has some value as it is, since it allows manually comparing Before and After a particular change. It would be VERY nice to run it on Jenkins as part of the broader plan in: #2000

For example we could create a new build on http://qa.piwik.org:8080/jenkins/ called "Performance testing", that would run the performance tests and report the time spent in each test. Ideally we would reuse some tool that would let us draw graphs to see how the performance is evolving across releases maybe?

Changed 22 months ago by capedfuzz (diosmosis)

New patch. Improved speed for adding visits.

comment:4 in reply to: ↑ 3 Changed 22 months ago by capedfuzz (diosmosis)

I added a new patch for this issue w/ improved speed for adding visits. It essentially bypasses cURL for adding visits. I've added a check for the useLocalTracking query param in the integration tests so the improved speed can be used when running tests. For me, running all_tests.php?useLocalTracking=true sped things up from ~640s to ~470s.
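Roughly, the idea looks like this (a hypothetical illustration only -- the patch's LocalTracker class may expose a different API; the direct Piwik_Tracker call mirrors what piwik.php itself does):

    <?php
    // Hypothetical sketch, not the actual LocalTracker code from the patch.
    // Instead of one cURL round-trip per tracked page view, the request is
    // handed straight to the in-process tracker code.
    function trackPageView($params, $useLocalTracking)
    {
        if ($useLocalTracking) {
            // Bypass HTTP entirely: populate the request environment and
            // invoke the tracker directly, roughly as piwik.php does.
            $_GET = $params;
            $tracker = new Piwik_Tracker();
            $tracker->main();
        } else {
            // Original path: an HTTP request per page view via cURL.
            $ch = curl_init('http://localhost/piwik.php?' . http_build_query($params));
            curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
            curl_exec($ch);
            curl_close($ch);
        }
    }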

Let me know what you think.

Replying to matt:

Really interesting concept -- it has some value as it is, since it allows manually comparing Before and After a particular change. It would be VERY nice to run it on Jenkins as part of the broader plan in: #2000

For example we could create a new build on http://qa.piwik.org:8080/jenkins/ called "Performance testing", that would run the performance tests and report the time spent in each test. Ideally we would reuse some tool that would let us draw graphs to see how the performance is evolving across releases maybe?

Do you think it would be a good idea to store the results in the piwik_log_profiling table? Then the graphs could be made from the data there. Or maybe we could create a Jenkins plugin...

comment:5 Changed 21 months ago by matt (mattab)

Code review

  • Good refactoring & first commit of Performance Regression Testing framework: Well Done!! :)
  • setTestEnv -> setTestingEnvironment()

Do you think it would be a good idea to store the results in the piwik_log_profiling table? Then the graphs could be made from the data there.

For sure, if the table is good enough then please reuse it, that would be great! We could then write the performance report as a Piwik plugin (and even export it as HTML via the report publisher #3118 :)

You can commit it as it is, but we definitely need some kind of reporting in Jenkins or on a webpage to keep an eye on performance :) Not sure what the best or easiest way would be?

comment:6 Changed 21 months ago by capedfuzz (diosmosis)

(In [6533]) Refs #3177, added benchmarking system and set of benchmarks to test Piwik performance.

Notes:

  • Added LocalTracker tracker that can be used w/ tests & benchmarks to speed things up.
  • Modified HTTP header sending code to use intermediate utility function.

comment:7 Changed 21 months ago by capedfuzz (diosmosis)

I committed the benchmarking system; however, there's still more to do. Specifically, the benchmarks are still pretty slow (though manageable), and of course, there needs to be some way to integrate w/ Jenkins. Here are my ideas:

Speeding Things Up

I think instead of running setup each time the benchmark is run, we can run it once in a database named after the test case, then keep the database (ie, don't drop it) so the setup method won't have to be run again.
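Something along these lines (a hypothetical sketch; the database naming and the fixture callable are placeholders, not the actual patch):

    <?php
    // Hypothetical sketch of the "keep the fixture database" idea.
    function ensureBenchmarkDatabase(PDO $pdo, $benchmarkName, $setUpFixture)
    {
        $dbName = 'piwik_benchmark_' . preg_replace('/[^a-zA-Z0-9_]/', '_', $benchmarkName);

        $stmt = $pdo->prepare(
            'SELECT SCHEMA_NAME FROM INFORMATION_SCHEMA.SCHEMATA WHERE SCHEMA_NAME = ?');
        $stmt->execute(array($dbName));

        if ($stmt->fetch() === false) {
            // First run: create the database and load the fixture data once.
            $pdo->exec('CREATE DATABASE `' . $dbName . '`');
            $pdo->exec('USE `' . $dbName . '`');
            call_user_func($setUpFixture);
        } else {
            // Later runs: reuse the data that is already there and skip setup.
            $pdo->exec('USE `' . $dbName . '`');
        }
        return $dbName;
    }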

Integrating With Jenkins

I've looked at some plugins for Jenkins and found two that might be useful for tracking performance:

Both of these process XML to get performance data. The benchmarks could be set up to output this XML when certain query parameters are used, and, if possible, Jenkins could get this data and feed it to the plugin. However, I have no experience setting up/configuring Jenkins, so I don't know what this will entail, or whether it's possible at all.
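For illustration, the output could be JUnit-style XML, which Jenkins tooling commonly consumes (a sketch only; the exact schema the chosen plugin expects would need to be confirmed, and the names/numbers below are placeholders):

    <?php
    // Illustrative only: emit benchmark timings as JUnit-style XML.
    function writeBenchmarkXml(array $results, $outputFile)
    {
        $suite = new SimpleXMLElement('<testsuite name="Piwik benchmarks"/>');
        foreach ($results as $benchmarkName => $seconds) {
            $case = $suite->addChild('testcase');
            $case->addAttribute('name', $benchmarkName);
            $case->addAttribute('time', sprintf('%.3f', $seconds));
        }
        $suite->asXML($outputFile);
    }

    // e.g. writeBenchmarkXml(array('OneSiteTwelveThousandPageViews_Year' => 123.4), 'benchmarks.xml');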

What do you think?

comment:8 Changed 21 months ago by capedfuzz (diosmosis)

(In [6534]) Refs #3177, fixing build.

comment:9 Changed 21 months ago by matt (mattab)

Code review

  • Why is the change in Url.test.php required to add the port? I'd prefer it if the test did not add the port when it is 80 by default, as it might create confusing URLs with ports for users?

I get errors: Warning: require_once(piwik\svn\trunk/trunk/tests/benchmarks/OneSiteTwelveThousandPageViews_Year.benchmark.php) [function.require-once]: failed to open stream: No such file or directory in piwik\svn\trunk\tests\benchmark_runner.php on line 18
Does it work for you?

  • Are the tests enabled by default? Since there is no reporting at the moment, please check that they are disabled and do not slow down the build. Until we have some reports on the time spent in each test, the size of the DB, and graphs to see the evolution across builds, let's leave them disabled - ok?

I think instead of running setup each time the benchmark is run, we can run it once in a database named after the test case, then keep the database (ie, don't drop it) so the setup method won't have to be run again.

I think setup/teardown will take relatively little time compared to running the test itself? I think it's OK as it is (deleting/recreating the tables)

Both of these process XML to get performance data. The benchmarks could be set up to output this XML when certain query parameters are used, and, if possible, Jenkins could get this data and feed it to the plugin. However, I have no experience setting up/configuring Jenkins, so I don't know what this will entail, or whether it's possible at all.


Looks nice - Would this Jenkins plugin store the value for each build automatically?

Could we plot simple graphs like "measure time to run benchmark" for each benchmark + total benchmark time, using this plugin?

comment:10 Changed 21 months ago by capedfuzz (diosmosis)

(In [6535]) Refs #3177, fix require bug when loading single benchmark group.

comment:11 Changed 21 months ago by capedfuzz (diosmosis)

Replying to matt:

Code review

  • Why is the change in Url.test.php required to add the port? I'd prefer it if the test did not add the port when it is 80 by default, as it might create confusing URLs with ports for users?

The Url.test.php changes are needed because adding a query parameter (like useLocalTracking=true) to all_tests.php will cause some checks to fail. The port change was for Jenkins, since it doesn't use port 80.

I get errors: Warning: require_once(piwik\svn\trunk/trunk/tests/benchmarks/OneSiteTwelveThousandPageViews_Year.benchmark.php) [function.require-once]: failed to open stream: No such file or directory in piwik\svn\trunk\tests\benchmark_runner.php on line 18
Does it work for you?

Fixed the bug. BTW, if you load benchmark_runner.php directly, it'll show every benchmark.

  • Are the tests enabled by default? Since there is no reporting at the moment, please check that they are disabled and do not slow down the build. Until we have some reports on the time spent in each test, the size of the DB, and graphs to see the evolution across builds, let's leave them disabled - ok?

Benchmarks will not get loaded when running tests since the files don't end w/ .test.php. If they did, Jenkins would never complete a build :).

I think instead of running setup each time the benchmark is run, we can run it once in a database named after the test case, then keep the database (ie, don't drop it) so the setup method won't have to be run again.

I think setup/teardown will take relatively little time compared to running the test itself? I think it's OK as it is (deleting/recreating the tables)

The tests themselves, for the ones I ran, take ~1-4 mins. The setup takes between 7-11 mins when using useLocalTracking=true. W/o local tracking, it can take > 40 mins.

Both of these process XML to get performance data. The benchmarks could be set up to output this XML when certain query parameters are used, and, if possible, Jenkins could get this data and feed it to the plugin. However, I have no experience setting up/configuring Jenkins, so I don't know what this will entail, or whether it's possible at all.


Looks nice - Would this Jenkins plugin store the value for each build automatically?

Could we plot simple graphs like "measure time to run benchmark" for each benchmark + total benchmark time, using this plugin?

Not sure. I was hoping someone w/ more experience w/ Jenkins would be able to more easily figure that out.

comment:12 follow-up: Changed 21 months ago by vipsoft (robocoder)

From what I understand of what is being said, I don't believe the port change is necessary. Jenkins runs in a Jetty container under port 8080. On the CI server, we run php-cgi through Apache at port 80.

comment:13 in reply to: ↑ 12 ; follow-up: Changed 21 months ago by capedfuzz (diosmosis)

Replying to vipsoft:

From what I understand of what is being said, I don't believe the port change is necessary. Jenkins runs in a Jetty container under port 8080. On the CI server, we run php-cgi through Apache at port 80.

Sorry, I wasn't clear enough, by 'port change' I meant 'change to Url.test.php that deals w/ the port in the current url'. No ports were actually changed :)

comment:14 in reply to: ↑ 13 ; follow-up: Changed 21 months ago by matt (mattab)

Replying to capedfuzz:

Replying to vipsoft:

From what I understand of what is being said, I don't believe the port change is necessary. Jenkins runs in a Jetty container under port 8080. On the CI server, we run php-cgi through Apache at port 80.

Sorry, I wasn't clear enough, by 'port change' I meant 'change to Url.test.php that deals w/ the port in the current url'. No ports were actually changed :)

Maybe the question is: Why is this change now necessary?

@vipsoft, do you have any experience with Jenkins plugins and performance reporting?

comment:15 in reply to: ↑ 14 Changed 21 months ago by capedfuzz (diosmosis)

Replying to matt:

Replying to capedfuzz:

Replying to vipsoft:

From what I understand of what is being said, I don't believe the port change is necessary. Jenkins runs in a Jetty container under port 8080. On the CI server, we run php-cgi through Apache at port 80.

Sorry, I wasn't clear enough, by 'port change' I meant 'change to Url.test.php that deals w/ the port in the current url'. No ports were actually changed :)

Maybe the question is: Why is this change now necessary?

The change to Url.test.php is in [6533]. Originally it tested getCurrentUrl() == getCurrentUrlWithoutQueryString(). If you use a query parameter like useLocalTracking=true, however, the test will fail. I modified it to use parse_url manually, so we can use useLocalTracking=true on all_tests.php (to speed things up).

What I didn't do was make sure the port in the current URL was added to the test's expected URL, which caused the test to fail in Jenkins, since it uses port 8080. So I committed a fix in [6534].
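For reference, the idea is roughly the following (a reconstruction for illustration, not the committed diff):

    <?php
    // Build the expected current URL from parse_url() pieces so that extra
    // query parameters like useLocalTracking=true, and non-standard ports
    // such as Jenkins' 8080, don't make the assertion fail.
    $parsed = parse_url(Piwik_Url::getCurrentUrl());

    $expected = $parsed['scheme'] . '://' . $parsed['host'];
    if (isset($parsed['port'])) {
        $expected .= ':' . $parsed['port'];   // the piece missed in the first commit
    }
    $expected .= $parsed['path'];

    // $expected should now equal Piwik_Url::getCurrentUrlWithoutQueryString(),
    // regardless of the query string or the port the server runs on.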

comment:16 follow-up: Changed 21 months ago by matt (mattab)

Thanks @capedfuzz for the clarification.

Do you think this ticket should be closed?
We could open a new one for the dev of the Jenkins Cross Build plugin.

comment:17 in reply to: ↑ 16 Changed 21 months ago by capedfuzz (diosmosis)

Replying to matt:

Thanks @capedfuzz for the clarification.

Do you think this ticket should be closed?
We could open a new one for the dev of the Jenkins Cross Build plugin.

There's still the issue that it's a bit too slow to run a lot of benchmarks. I can make using persisted DB data an option (ie, not dropping the test data). Also, I still have to try w/ the database dump.

But other than that, the issue looks done.

comment:18 Changed 21 months ago by matt (mattab)

See Follow up ticket: #3280 to write some kind of reports and graphs to get Performance across several builds, and see if and when the software becomes faster/slower.

comment:19 Changed 21 months ago by capedfuzz (diosmosis)

(In [6580]) Refs #3177, fix small bug in benchmark runner.

comment:20 follow-up: Changed 21 months ago by SteveG (sgiehl)

The benchmark tests are using the simpletest library, right? I'm currently working on migrating all tests to PHPUnit. Most core and plugin tests are already migrated, some integration tests are still missing. The tests using simpletest will be removed when I'm done. Is there a way to migrate the performance tests to phpunit or something else?

comment:21 in reply to: ↑ 20 Changed 21 months ago by capedfuzz (diosmosis)

Replying to SteveG:

The benchmark tests are using the simpletest library, right? I'm currently working on migrating all tests to PHPUnit. Most core and plugin tests are already migrated, some integration tests are still missing. The tests using simpletest will be removed when I'm done. Is there a way to migrate the performance tests to phpunit or something else?

For the most part they depend on the Integration base test case class. Simpletest itself isn't really used. The only other requirements are that:

  • loading a benchmark file in the browser should not run the tests, but load benchmark_runner.php
  • benchmark_runner.php should work (should only need to change the requires at the top of the file)

I haven't been keeping up w/ the PHPUnit changes, though. Do you think this would still be possible?

Changed 20 months ago by capedfuzz (diosmosis)

benchmarking w/ phpunit

comment:22 Changed 20 months ago by capedfuzz (diosmosis)

I've uploaded a patch that implements benchmarking w/ the PHPUnit test infrastructure. I've modified visualphpunit to show the total execution time of the executed tests and made it possible to set GLOBAL variables before running a test through the browser. To run a benchmark, you select it and set the PIWIK_BENCHMARK_FIXTURE global variable. Then you run the test and it'll tell you how long it took.
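Roughly, the fixture selection works like this (a hypothetical sketch; the benchmark class and the fixture's interface are illustrative, not the exact patch):

    <?php
    // Hypothetical sketch: a PHPUnit benchmark picking its fixture from the
    // globals set through visualphpunit.
    class ArchivingBenchmark extends PHPUnit_Framework_TestCase
    {
        public static function setUpBeforeClass()
        {
            $fixtureClass = isset($GLOBALS['PIWIK_BENCHMARK_FIXTURE'])
                ? $GLOBALS['PIWIK_BENCHMARK_FIXTURE']
                : 'OneSiteTwelveThousandPageViewsOneDay';

            $fixture = new $fixtureClass();
            $fixture->setUp();   // inserts sites/visits (possibly into a persisted DB)
        }

        public function testArchiving()
        {
            // The benchmark body; PHPUnit/visualphpunit reports the time it takes.
        }
    }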

Some notes on this implementation:

  • I've separated the type of visits from the test type. So to run the ArchiveProcessing benchmark for one site w/ 12000 visits, you'd set PIWIK_BENCHMARK_FIXTURE to 'OneSiteTwelveThousandPageViewsOneDay'. If you want to run it for 12000 sites, over a year, you'd use a different fixture type.
  • I've made it possible to save the test data into a separate database so the setup doesn't have to be run more than once.

It's a bit preliminary, and I don't think it's quite working (running the archiving process w/ 12000 visits takes .02s...). Matt, what do you think of my patch? And Stefan, do you see a problem w/ the way I've implemented it (in terms of using PHPUnit)?

Changed 20 months ago by capedfuzz (diosmosis)

Another patch for phpunit benchmarking.

comment:23 Changed 20 months ago by capedfuzz (diosmosis)

I uploaded a new patch. It's been tested and provides several more fixture types and a benchmark for the Tracker. The SqlDump fixture uses a hardcoded URL, but there's currently no publicly hosted SQL dump to use, so it will have to be changed.

Let me know what you think of it.

comment:24 Changed 20 months ago by capedfuzz (diosmosis)

Another note about my patch:

To run a benchmark, go into visualphpunit and select a benchmark to run.

Then in the 'GLOBALS' section on the left panel, enter two key-value pairs:

'PIWIK_BENCHMARK_FIXTURE' => the name of the fixture to use, such as 'OneSiteTwelveThousandVisitsOneDay'

(optional) 'PIWIK_BENCHMARK_DATABASE' => the name of the database to permanently store data in. If not set, the fixture setup will be run every time, which will slow things down if you run benchmarks more than once.

Then run the test. Be sure to only run one benchmark at a time.

I think after I commit, I'll have to edit the README.

comment:25 Changed 20 months ago by matt (mattab)

Looks good to me, nice work!

I would just suggest removing some fixtures and keeping the 3 most useful ones?

Also as per email, please prepare a comprehensive DB dump for the test case for #3330

Changed 20 months ago by capedfuzz (diosmosis)

Ugly CoffeeScript script used to generate visits.

comment:26 Changed 20 months ago by capedfuzz (diosmosis)

  • Resolution set to fixed
  • Status changed from new to closed

(In [6954]) Fixes #3177, added benchmarking system that uses phpunit and visualphpunit.

comment:27 Changed 20 months ago by capedfuzz (diosmosis)

(In [6955]) Refs #3177, set svn property.

comment:28 Changed 19 months ago by capedfuzz (diosmosis)

(In [7027]) Refs #3290, #3177 added XHProf support to VisualPHPUnit. Can now profile tests & benchmarks through browser.
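For context, profiling at the XHProf extension level looks roughly like this (a minimal sketch assuming the xhprof extension is loaded; the output path is just an example, not how VisualPHPUnit stores its runs):

    <?php
    // Minimal XHProf sketch: profile a benchmark/test body and save the run.
    xhprof_enable(XHPROF_FLAGS_CPU | XHPROF_FLAGS_MEMORY);

    // ... run the benchmark or test body here ...

    $profilerData = xhprof_disable();

    // Persist the run so it can be inspected later, e.g. through the XHProf UI.
    file_put_contents(sys_get_temp_dir() . '/piwik_benchmark.xhprof',
                      serialize($profilerData));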

comment:29 Changed 19 months ago by capedfuzz (diosmosis)

(In [7028]) Refs #3177, #3290 add XHProf to tests/lib and tweak visualphpunit display.

comment:30 Changed 19 months ago by capedfuzz (diosmosis)

(In [7029]) Refs #3177, add README section for benchmarking system.

comment:31 Changed 19 months ago by capedfuzz (diosmosis)

(In [7031]) Refs #3177, remove some unnecessary XHProf files and add a section to the tests README describing how to use XHProf.
