Google Advanced Search support #1370

Closed
halfdan opened this issue May 22, 2010 · 19 comments
Labels
Enhancement For new feature suggestions that enhance Matomo's capabilities or add a new report, new API etc.
Comments

@halfdan
Member

halfdan commented May 22, 2010

Saw the following URL in my referer list:

google.co.uk

Problem: the search parameter isn't &q=keyword but &as_q. Trying to figure out where this comes from brought me to this page. Google's "advanced search" doesn't use &q=. It is even possible to combine these parameters, for example:

&as_q=test1&as_epq=test2&as_oq=foo+bar+baz
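A minimal sketch of pulling those parameters out of a referrer, written in Python purely for illustration (Piwik itself is PHP). The host and path in `referrer` and the helper name `advanced_params` are assumptions, not anything taken from the real referrer or from Piwik:

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical referrer URL; only the query string mirrors the example above.
referrer = "http://www.google.co.uk/search?as_q=test1&as_epq=test2&as_oq=foo+bar+baz"


def advanced_params(url):
    """Return the as_* query parameters found in url as a dict of strings."""
    query = parse_qs(urlsplit(url).query)
    # parse_qs decodes '+' to a space and returns a list per parameter;
    # keep only the first value of each as_* parameter.
    return {name: values[0] for name, values in query.items() if name.startswith("as_")}


params = advanced_params(referrer)
print(params)
```

Each parameter arrives as its own value, so a tracker that only looks at `q` would see no keyword at all for this referrer.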

My idea to include this would be to allow KeywordParameter to be an array in SearchEngines.php.


I found that "google.kg" (Kyrgyzstan) is not included in the list of search engines.

@anonymous-matomo-user

This is the official list of Google domains:
http://www.google.com/supported_domains
Maybe some more are missing.

@halfdan
Member Author

halfdan commented May 22, 2010

Oh well.. ignore my suggestion. KeywordParameter already allows arrays. We should therefore adjust the setting for all Google entries to:

array(
  'q',
  'as_q',
  'as_oq',
  'as_epq'
);
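As a rough illustration of what a list of keyword parameters could mean in practice, here is a Python sketch (not Piwik's PHP code) that returns the first listed parameter present in the referrer. The function name `extract_keyword` and the first-match rule are assumptions; how Piwik actually walks the array is not shown in this ticket:

```python
from urllib.parse import urlsplit, parse_qs

# Parameter names from the list above, in the order they are tried.
KEYWORD_PARAMETERS = ["q", "as_q", "as_oq", "as_epq"]


def extract_keyword(url, parameter_names=KEYWORD_PARAMETERS):
    """Return the first listed parameter that is present and non-empty, else None."""
    # parse_qs drops parameters with blank values by default.
    query = parse_qs(urlsplit(url).query)
    for name in parameter_names:
        values = query.get(name)
        if values and values[0].strip():
            return values[0].strip()
    return None
```

With a first-match rule, a referrer that carries several as_* parameters at once reports only one of them, so the rest of the search is lost.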

hebbet: Thanks for the link. I just wrote a little script to check whether any other TLDs are missing (see attachment).

While writing the script I noticed the entry for google.fr:

'www.google.fr.' => ..

That looks like a typo to me (www.google.fr is in the list).
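The attached script is not reproduced here. The following is only a sketch of the kind of comparison it might do, in Python rather than PHP. It assumes the supported-domains list has one domain per line, optionally with a leading dot, and that engine entries are keyed by host name; `missing_domains` and the sample data are hypothetical:

```python
def missing_domains(supported_text, known_hosts):
    """Return supported Google domains that have no entry in known_hosts."""
    # Normalise known hosts: lower-case, and drop a stray trailing dot
    # such as the one in 'www.google.fr.'.
    known = {host.lower().rstrip(".") for host in known_hosts}
    missing = []
    for line in supported_text.splitlines():
        domain = line.strip().lower().lstrip(".")
        if not domain:
            continue
        # Assumption: an entry may use the bare domain or its www. form.
        if domain not in known and "www." + domain not in known:
            missing.append(domain)
    return sorted(missing)


# Made-up sample data; the real list would come from the supported_domains URL.
sample_list = ".google.com\n.google.fr\n.google.kg\n"
sample_known = ["www.google.com", "www.google.fr."]
print(missing_domains(sample_list, sample_known))
```

Stripping the trailing dot before comparing keeps a typo like the google.fr one from being reported as a missing domain.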

matt/vipsoft: Let me know if we should include the other parameters for Google in the list and I'll prepare a patch.

@halfdan
Member Author

halfdan commented May 22, 2010

Attachment:
checkGoogleDomains.php

@halfdan
Member Author

halfdan commented May 22, 2010

Attachment:
missing_list.txt

@robocoder
Contributor

The problem is that these advanced query parameters (as_q=ALL_THESE_WORDS, as_epq=EXACT_WORDING, as_oq=OR1+OR2+OR3, and as_eq=UNWANTED) can all appear in the referrer URL.

halfdan: sure, a patch would be great. Don't forget to change Piwik_Common::extractSearchEngineInformationFromUrl() and please add some unit tests.

@halfdan
Member Author

halfdan commented May 23, 2010

vipsoft: How do you think this should be handled? Should I just build a string from all keywords provided? This would lead to false reports e.g. in case of as_eq. Any suggestions?

@mattab
Member

mattab commented May 24, 2010

No problem adding the missing Google URLs; please provide a patch.

@mattab
Member

mattab commented May 24, 2010

Replying to vipsoft:

The problem is that these advanced query parameters (as_q=ALL_THESE_WORDS, as_epq=EXACT_WORDING, as_oq=OR1+OR2+OR3, and as_eq=UNWANTED) can all appear in the referrer URL.

I think that's fine, as long as 'q', if found, has priority over the other variables.

@mattab
Member

mattab commented May 24, 2010

Actually, it would be an issue. To solve this properly, we would need to be able to construct the query string from the list of possible parameters (q, as_q, as_eq, etc.), which is, I believe, undocumented by Google, and probably undesirable considering the low traffic. I vote for won't fix.

@robocoder
Contributor

Replying to matt:

I vote for won't fix..

But it would be nice-to-have. I'll reserve judgement until I've seen a patch.

@halfdan
Member Author

halfdan commented May 24, 2010

matt: SearchEngine patch is attached.

vipsoft: I'll need to think this through. I'm not sure yet how to present the combined data (e.g. as_eq=UNWANTED should not appear as a "keyword", because a keyword usually means the page was found by searching for it, not by excluding it).

I may postpone the patch until after 0.6.2, as I'll focus on other work first. Feel free to move the ticket to 0.8 if necessary.

@halfdan
Member Author

halfdan commented May 24, 2010

Attachment:
SearchEngines.php.patch

@mattab
Member

mattab commented May 24, 2010

as_eq=UNWANTED is equivalent, I believe, to using minus: "keyword -notThisKeyword"

@mattab
Member

mattab commented May 24, 2010

(In [2212]) Refs #1370 adding google URLs thanks halfdan!

@robocoder
Contributor

I'll keep this ticket open. In the meantime, I created #1381 to record the addition of the missing Google URLs/domains, for the upcoming 0.6.2 changelog.

@mattab
Member

mattab commented May 27, 2010

Moving it to a later milestone.

@robocoder
Contributor

At the same time, I propose we add some code to parse out the keywords when the referer is webcache.googleusercontent.com.

@robocoder
Contributor

Opening a separate ticket for webcache.googleusercontent.com. #1692

@robocoder
Contributor

(In [3118]) fixes #1370 - constructs the equivalent q= query from the advanced search parameters
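The changeset itself is not shown in this thread. The following Python sketch (not the PHP that was committed) shows one way an equivalent q= string could be assembled from the parameters discussed above. The quoting, the OR joining, the minus prefix for as_eq and the rule that an explicit q wins are assumptions drawn from the comments here, and `equivalent_query` is a hypothetical name:

```python
from urllib.parse import urlsplit, parse_qs


def equivalent_query(url):
    """Build a q-style search string from a referrer's advanced search parameters."""
    query = parse_qs(urlsplit(url).query)

    def first(name):
        # First value of the parameter, or "" when it is absent.
        return query.get(name, [""])[0].strip()

    # Assumption: an explicit q takes priority over the as_* parameters.
    if first("q"):
        return first("q")

    parts = []
    if first("as_q"):    # all these words
        parts.append(first("as_q"))
    if first("as_epq"):  # exact wording, quoted as a phrase
        parts.append('"' + first("as_epq") + '"')
    if first("as_oq"):   # any of these words
        parts.append(" OR ".join(first("as_oq").split()))
    if first("as_eq"):   # unwanted words, each prefixed with a minus
        parts.extend("-" + word for word in first("as_eq").split())
    return " ".join(parts) or None


# Hypothetical referrer; host and path are illustrative only.
example = "http://www.google.co.uk/search?as_q=test1&as_epq=test2&as_oq=foo+bar+baz&as_eq=spam"
print(equivalent_query(example))
```

Keeping the excluded words in the string with a minus prefix avoids reporting them as if they were keywords the visitor searched for.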

This issue was closed.

4 participants