One reason I’ve been hesitant to run wild with any of the popular SEO toolkits on the market is that, having previously worked for a company that built one, I know it evolved to violate Google’s Terms of Service after Google discontinued their SOAP API for search results. As someone who only takes on customers interested in white hat SEO tactics, and whose background is in web development, I’m skeptical that many of the available toolkits can operate without either 1) explicitly violating various TOSes, usually by scraping (or otherwise illegally acquiring) data, or 2) licensing large amounts of data. I know that some services create their own tracking code to collect data rather than scraping, but I also suspect that several services scrape search engine results pages (SERPs).
To my mind, if I’m focused on building relevance and avoiding penalties for my customers, section 5.3 of Google’s TOS is the passage that matters:
5.3 You agree not to access (or attempt to access) any of the Services by any means other than through the interface that is provided by Google, unless you have been specifically allowed to do so in a separate agreement with Google. You specifically agree not to access (or attempt to access) any of the Services through any automated means (including use of scripts or web crawlers) and shall ensure that you comply with the instructions set out in any robots.txt file present on the Services.
And a statement in their webmaster guidelines confirms my right to be suspicious, especially on behalf of my paying customers:
Don’t use unauthorized computer programs to submit pages, check rankings, etc. Such programs consume computing resources and violate our Terms of Service. Google does not recommend the use of products such as WebPosition Gold™ that send automatic or programmatic queries to Google.
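To make the robots.txt clause in 5.3 concrete: Google’s own robots.txt disallows automated access to /search, so any toolkit that actually honors it can’t fetch SERPs in the first place. Here is a minimal sketch of the check a compliant tool would have to make before fetching a results page, using Python’s standard urllib.robotparser; the user agent string and query URL are hypothetical examples.

# Minimal sketch: check Google's robots.txt before fetching a results page.
# The user agent and query URL below are hypothetical examples.
from urllib.robotparser import RobotFileParser

robots = RobotFileParser("https://www.google.com/robots.txt")
robots.read()

serp_url = "https://www.google.com/search?q=example+query"
if robots.can_fetch("HypotheticalSeoTool/1.0", serp_url):
    print("robots.txt allows this fetch")
else:
    # Google disallows /search for generic crawlers, so a compliant tool stops here.
    print("robots.txt disallows this fetch")

Any tool that skips this check and fetches results pages anyway is, by the plain language of 5.3, operating outside the terms.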
So here’s a question: Is it possible to create a disclaimer or other indicator of certified white hat toolkits? Would it even matter without a more open review of toolkit source code? Should there be an independent SEO certification authority of some kind, and would SEOs even trust it? Or would Google (and other search engines) consent to a standard badge indicating that a data (or other relevant) license exists between the two parties?
To other people in this space: Does it matter at all to you whether services, many of which are probably charging you money, are operating legally? Do you have some mechanism I’m missing for identifying toolkits that are thoroughly white hat?
Relatedly, I wonder what the ramifications are for the companies building black hat toolkits. With the rise of cloud computing (e.g., Amazon EC2) and IPv6 (and, thus, an essentially unlimited, hard-to-trace supply of IP addresses), can Google and the other major search engines even identify SERP scrapers and the like?
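For a rough sense of scale (a back-of-envelope sketch using Python’s ipaddress module and the 2001:db8::/64 documentation prefix, purely for illustration): even the single /64 prefix often delegated to one host or tenant contains 2^64 addresses, which is why per-IP rate limiting alone seems unlikely to catch a determined scraper.

import ipaddress

# A single /64 -- the subnet size commonly handed to one host or tenant -- holds 2**64 addresses.
prefix = ipaddress.ip_network("2001:db8::/64")  # documentation prefix, illustrative only
print(f"{prefix.num_addresses:,}")  # 18,446,744,073,709,551,616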
I’ve heard tell (I can’t find a quotation at the moment) that Matt Cutts is dismissive of scrapers as any sort of threat to Google (presumably in the sense of the bandwidth they consume hitting Google’s servers), but I don’t know whether that means there is any internal effort to track SERP scrapers. I’m personally more interested in whether Google could end up imposing a cascading penalty if it determined that SEOs were using a toolkit in flagrant violation of the TOS, such that the sites being tracked by those SEOs wound up suffering.
A bigger social question: Does the ease with which TOSes and copyrights can be violated online make us all likelier to be criminals? Or are such violations justified as online civil disobedience?