Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

trunk: Data access abstraction #1368

Closed
robocoder opened this issue May 20, 2010 · 26 comments
Closed

trunk: Data access abstraction #1368

robocoder opened this issue May 20, 2010 · 26 comments
Assignees
Labels
Enhancement For new feature suggestions that enhance Matomo's capabilities or add a new report, new API etc.

Comments

@robocoder
Copy link
Contributor

Users have requested Piwik support for non-MySQL databases, e.g., PostgreSQL (#500), MSSQL (#1335), and NoSQL (see mailing list).

Obstacles:

  • MySQL schema defined in core/Piwik.php
  • MySQL'ish SQL is embedded throughout Piwik (e.g., ON DUPLICATE, crc(), and MySQL-specific error codes)
  • inconstent use of wrappers (e.g., Piwik_Query()) vs direct calls, e.g., Zend_Registry::get('db')->query(), $db->insert(), or $db->update()
  • the "only MySQL is supported" policy which stems from (1) tracker performance and (2) ongoing testing/support
  • the NoSQL proposals for Zend_Framework are not Zend_Db-based

Considerations:

  • third-party plugins must be able to use the data access abstraction without us necessarily adding these third-party queries to core
  • while most of the non-MySQL databases have a one-to-one relationship between the adapter and schema, in future, we want to be able to support multiple storage engines for MySQL (e.g., MYISAM, Innodb, spider, Infinidb)
  • MariaDB is supposed to be a drop-in replacement, so this should be transparent to the MySQL adapters
  • drizzle is a stripped down MySQL fork; since it currently uses the MySQL port, perhaps it could be treated as a storage engine? (see Drizzle support #1296)
  • watch out for '_' used in tables prefix, as it matches any character in a LIKE expression

TODO:

  • refactor Piwik.php into core/Db/Schema/*
    • tableInsertBatch() in core/Piwik.php
  • installer support for multiple schemas
  • helper methods for plugins to ALTER existing tables, or CREATE/DROP tables
  • during install, add a warning explaining that only Mysql is supported, others are "experimental", when the user doesn't select mysql.
  • when enabling new plugins, check that they have the drivers for the current DB (mysql, mssql, postgresql)
    • if they don't, user can still enable it for his db, but with high risk that, if it is not implemented for MySQL, it might not work on your db.
  • refactor SQL into query classes/methods
    • in Support for MSSQL #1335, it is suggested that we implement the Repository pattern. Should Criteria correspond to an entity (table) or aggregate (tables involved in a JOIN)? Or should Criteria correspond to a query? (BTW we're not keen on the name "Criteria".)
    • abstract isErrNo
    • tracker queries should also be in core/Db/Query instead of core/Tracker/Db/Query; similarly, for core Piwik plugins
  • discourage direct use of Zend_Registry::get('db')
  • write many, many unit tests and phpdocs; see Setup continuous integration server for Piwik builds #818

Propose changing instances of:

Piwik_Query($sql, $optional_parameter_array);
// or
Zend_Registry::get('db')->query($sql, $optional_parameter_array);

to:

Piwik_Repository::execute('Db_QueryName_XYZ', $optional_parameter_array);

The $sql is moved into Query classes, defined in:

  • core/Db/Query/SchemaABC.php, or
  • core/Db/Query/SchemaABC/QueryName.php
  • core/Db/Query/SchemaABC/QueryName/XYZ.php

(Any of the above is acceptable and handled by the autoloader.)

Piwik_Repository::execute($queryName, $parameters) could be something like:

$schemaClassName = 'Piwik_Db_Query_' . Zend_Registry::get('schema');
$query = $schemaClassName::factory($queryName, $parameters);

$db = Zend_Registry::get('db');
return $db->query($query, $parameters);

The factory method returns an object or string (e.g., schema-specific SQL). Non-MySQL schemas can either:

  • subclass the MySQL MYISAM class and return its own SQL
  • leave it undefined; in which case, a mock object is used (default to MySQL implementation?)

The prefix on the query name tells the factory method how to construct the class name. A query name prefixed by Db_ is treated as a core query. A query name prefixed by the plugin's name, e.g., GeoIP_ refers to a query in the GeoIP plugin's Query folder.

Additional ideas:

@halfdan
Copy link
Member

halfdan commented May 20, 2010

Considering that the 1.0 version jump will increase popularity of Piwik and more developers might start developing plugins, shouldn't we have a stable data access layer by then? (meaning: Shouldn't the target version be 1.0 instead of after 1.0?)

If you guys need help making these changes, I'm more than happy to help.

@robocoder
Copy link
Contributor Author

Post-1.0 means it isn't on the 1.0 roadmap, and that 1.0's release doesn't depend on this feature. Any/all help is welcome since many have asked for this feature, and it currently isn't the highest priority on my plate.

I've glossed over some implementation details, so feel free to ask questions, request feedback on patches, or prod me on some task.

@robocoder
Copy link
Contributor Author

We should also reconsider using Zend for the Tracker.

Pros:

  • eliminates some redundancy, eg core/Tracker/Config and core/Tracker/Db
  • increases cohesiveness of core/Db and lowers the learning curve for new devs
  • eliminate more require_once in favor of the auto loader everywhere
    Cons:
  • performance overhead; should benchmark the throughput; I suspect the overhead is negligible relative to the overall request time or time to execute SQL queries

@mattab
Copy link
Member

mattab commented May 24, 2010

I moved some items from 'additional ideas' to TODO.

Side note: this Data access abstraction would make it easier to build sharding into Piwik, as we could have a Sharding schema, subclassing Mysql, that would modify queries to access the right shard when applicable.

@robocoder
Copy link
Contributor Author

(In [refs #1368 - hard-code the list of supported adapters since it seems unlikely we'll have adapters as plugins; no longer depends on globr() or hack from 1632)

@robocoder
Copy link
Contributor Author

(In [2265]) refs #1368 - Piwik::prefixTable() is deprecated

@robocoder
Copy link
Contributor Author

(In [2266]) refs #1368 - Piwik::prefixTable() is deprecated

@robocoder
Copy link
Contributor Author

(In [2267]) refs #1368 - remove deprecated functions: prefixTable (since 0.2.27), isPhpCliMode (since 0.4.4), and getMysqlVersion (since 0.4.4)

@robocoder
Copy link
Contributor Author

(In [2268]) refs #1368 - initial refactoring of MySQL schema methods from Piwik.php

@robocoder
Copy link
Contributor Author

(In [2273]) refs #1368 - rename Piwik_Db_iAdapter to Piwik_Db_Adapter_Interface for consistency

@robocoder
Copy link
Contributor Author

(In [2274]) refs #1368 - further refactoring of Piwik.php; initial abstraction of MySQL schema

@robocoder
Copy link
Contributor Author

(In [2276]) refs #1368 - move low-level $dbInfos mangling to adapter

@robocoder
Copy link
Contributor Author

(In [2277]) refs #1368

@robocoder
Copy link
Contributor Author

(In [2278]) refs #1335, refs #1368 - check in Jason's Piwik_Db_Adapter_Pdo_Mssql class (excluding the schema and query methods being refactored elsewhere; edited to conform to coding style guidelines; note: a lot of the logic in getConnection() looks like it should be moved upstream to ZF (2.0 CTP1 related?) especially if CTP2 introduces breaking changes to the DSN

@robocoder
Copy link
Contributor Author

(In [2279]) refs #1368 - preprocessing the SQL to be made obsolete by query classes/methods

@robocoder
Copy link
Contributor Author

(In [2281]) refs #1368 - more refactoring and phpdocs

  • The abstraction of the DDL is more or less complete. I still have to implement isAvailable() in core/Db/Schema/Myisam.php.
  • The default schema will be MyISAM. The update script will add: schema=Myisam to the ![database] section
  • The schema object is lazy loaded.

@robocoder
Copy link
Contributor Author

(In [2282]) refs #1368 - implemented isAvailable() and private method hasStorageEngine(); added Schema.loadSchema hook for Sharding plugin; more phpdocs

@robocoder
Copy link
Contributor Author

(In [2739]) refs #1368 - use schema instead of adapter; in strict mode, update() can't be abstract

@robocoder
Copy link
Contributor Author

@mattab
Copy link
Member

mattab commented Feb 17, 2011

Moving as it is more long term project

@mattab
Copy link
Member

mattab commented Apr 28, 2011

This will definitely be worked on to some extent (most likely: tracking code, archiving code for performance optimization) in 1.x, but it will be fully implemented (eg. sites/users/goals SQL will stay as is).

@anonymous-matomo-user
Copy link

Are there any updates?

Is it possible to get piwik at least tracking with mssql?

@robocoder
Copy link
Contributor Author

dk-at-cabag: if there were updates, it would appear here in the trac log. This is a low priority enhancement. In terms of timeframe, I plan to work on this after the plugins repository is online.

re: mssql. You can try to port Jason's code (see #1335) to the latest Piwik release.

@robocoder
Copy link
Contributor Author

(In [4690]) refs #1368 - refactoring the batch insert code

@robocoder
Copy link
Contributor Author

(In [4693]) refs #1368

@robocoder
Copy link
Contributor Author

See ticket #2593

  • NoSQL is out-of-scope for Piwik 1.x, but may be re-opened at a later time (NoSQL support #2592); anyone got a patch? ;)
  • the centralization of the Repository design pattern doesn't fit Piwik's plugin/modular architecture, so it's no longer being considered

@robocoder robocoder added this to the Future releases milestone Jul 8, 2014
This issue was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Enhancement For new feature suggestions that enhance Matomo's capabilities or add a new report, new API etc.
Projects
None yet
Development

No branches or pull requests

4 participants