
I don’t like Google anymore

Posted on Jan 13th, 2012 by

Google horror

I have blogged before about how I don’t like Google’s growing monopoly on the web, and how they increasingly block non-Chrome browsers when you use their web applications.

It looks like Google is now moving into more unsound web practices, using their web crawling power and abilities for unfair, unethical and illegal purposes.

Google pays bloggers for Chrome links

Earlier this month, news broke that Google paid bloggers to write about Chrome, their own browser. Embedding paid advertisements and, in some cases, disguising them as genuine blog content is something I have always found highly unethical.
OK, in some cases during Google’s campaign the paid posts were clearly marked as such, but they still embedded links to Chrome’s download site. “Paid” links are something Google themselves have always battled against, saying they cheat search engines. Now they are doing it themselves, to market their own products.

Verdict: unethical and unfair competition.

Google crawls for data for their own profit

Today, I read this interesting and in-depth story about how Google crawled data from Mocality, a small Kenyan company publishing an online Kenya business directory.

In their article, the Mocality folks detail how Google crawls their business directory with the sole purpose of contacting the listed companies and convincing them to convert to a Google business directory. The article features recordings of telephone calls from Google employees to the listed companies, which also reveal other unclear and illegal business practices.
Even worse, after the story broke, Google deliberately switched the crawler’s IP address and continued crawling Mocality’s data. This shows that it was not ignorance, but clear ill intent.

Verdict: unethical, unfair competition, commercial spying.

My own thing with Google

Web crawlers scanning your web content are supposed to obey the rules web admins set in “robots.txt”. This file defines what crawlers can scan, what they should not, and at what rate they can scan.
I have always had a problem with the fact that some settings for the Google crawler, like its crawl rate, can only be set in Google Webmaster Tools, while the settings in “robots.txt” are ignored. Even worse, every three months, Google resets the crawl rate. If you want to spare your server from excessive Google crawling, you have to manually set the crawl rate for each of your sites, again. Every three months.

Today, I discovered that Google’s crawlers also ignore other, more basic rules:

On “Humanitarian News”, one of my sites, the “robots.txt” explicitly forbids crawlers from accessing the search facility:

Disallow: /search/
Disallow: /opensearch/

The reason is that with the rate at which Google crawls, and the number of search combinations possible, my server performance goes down each time Google spams my site with excessive crawls. Even worse, this “opensearch” string calls the Solr search, which runs on Java and offers advanced RSS features for “live” users. Needless to say, this sucks up server resources, certainly if you shoot off opensearch requests at the rate Google’s crawler does. This is the reason why I block crawlers from accessing the search in the first place.
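
For reference, here is a minimal Python sketch of how a rule-abiding crawler would read those two rules, using the standard library's urllib.robotparser. The "User-agent: *" line and the sample paths are assumptions added for illustration; only the two Disallow lines come from the file quoted above.

```python
from urllib.robotparser import RobotFileParser

# The two Disallow rules quoted above. The "User-agent: *" line is an
# assumption: rules must belong to a user-agent group to be parsed.
ROBOTS_TXT = """\
User-agent: *
Disallow: /search/
Disallow: /opensearch/
"""


def build_parser(robots_txt):
    """Parse robots.txt content from a string instead of fetching it."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def is_allowed(parser, user_agent, path):
    """Return True if the rules permit user_agent to fetch path."""
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    parser = build_parser(ROBOTS_TXT)
    # Hypothetical paths, chosen only to exercise each rule.
    for path in ("/search/somalia", "/opensearch/somalia", "/about"):
        print(path, is_allowed(parser, "Googlebot", path))
```

Anything under /search/ or /opensearch/ should come back as not allowed, while the rest of the site stays open to crawling.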

This morning, while checking my log, I found a string of searches which caught my attention:

search 13 Jan 2012 – 12:02 somalia (Search). Anonymous results
search 13 Jan 2012 – 12:02 somalia (Search). Anonymous results
search 13 Jan 2012 – 12:02 somalia (Search). Anonymous results
search 13 Jan 2012 – 12:02 somalia (Search). Anonymous results
search 13 Jan 2012 – 12:02 somalia (Search). Anonymous results
etc etc etc..
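
Spotting this kind of repetition by eye gets tedious on a long log. A rough Python sketch of how it could be automated is shown here, assuming the log has been exported as plain text with one entry per line; the threshold of 3 is an arbitrary, hypothetical value.

```python
from collections import Counter

# The five identical entries from the log excerpt above.
LOG_LINES = [
    "search 13 Jan 2012 – 12:02 somalia (Search). Anonymous results",
] * 5


def count_repeats(lines):
    """Count how often each identical, non-empty log line occurs."""
    return Counter(line.strip() for line in lines if line.strip())


def repeated_entries(lines, threshold=3):
    """Return (line, count) pairs seen at least `threshold` times."""
    counts = count_repeats(lines)
    return [(line, n) for line, n in counts.most_common() if n >= threshold]


if __name__ == "__main__":
    for line, n in repeated_entries(LOG_LINES):
        print(n, "x", line)
```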

Looking into it further, the search strings were of the format:

http://humanitariannews.org/search/apachesolr_search/somalia?page=1442

And who generated those searches? They all came from IP 66.249.72.105,
which is (source):

OrgName: Google Inc.
OrgId: GOGL
Address: 1600 Amphitheatre Parkway
City: Mountain View
StateProv: CA
PostalCode: 94043
Country: US
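
The whois record shows who owns the address block. A complementary check, which Google itself describes for confirming its crawler, is a reverse DNS lookup on the IP followed by a forward lookup on the returned host name. Below is a minimal Python sketch of that idea. The suffix list and the function names are assumptions for illustration, and the lookups are passed in as parameters so the logic can be tried without network access.

```python
import socket

# Assumed list of domains a genuine Google crawler host name ends with.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def is_google_hostname(hostname):
    """True if the host name sits under one of the assumed Google domains."""
    return hostname.lower().rstrip(".").endswith(GOOGLE_SUFFIXES)


def is_verified_googlebot(ip, reverse_lookup=None, forward_lookup=None):
    """Reverse-resolve ip, check the domain, then confirm it maps back to ip."""
    if reverse_lookup is None:
        reverse_lookup = lambda addr: socket.gethostbyaddr(addr)[0]
    if forward_lookup is None:
        forward_lookup = lambda host: socket.gethostbyname_ex(host)[2]
    try:
        hostname = reverse_lookup(ip)
        if not is_google_hostname(hostname):
            return False
        # The forward lookup must list the original address again.
        return ip in forward_lookup(hostname)
    except OSError:
        # A failed DNS lookup means the address could not be verified.
        return False


if __name__ == "__main__":
    print(is_verified_googlebot("66.249.72.105"))
```

The forward lookup matters because anyone who controls the reverse DNS for their own address range can make it return any name they like.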

So…

  1. My robots.txt clearly blocks /search access.
  2. Google ignores that rule, and as such ignores a basic rule to protect certain web content from crawling.
  3. Going over my log, I find repeated Google crawls of the search.
  4. On top of that, there is not much relevant data to be found in the search that cannot be found in a normal crawl of my site. And certainly not on “page 1,442” of a search (see the example above).

The problem is that other than blocking their crawler’s IP address, you can’t do much about it. And if you block their crawler’s IP address, as a web manager, you might just as well throw yourself under a train: the Google search results for your site would disappear. (Do I smell a monopoly here?)

Verdict: monopoly, unethical business practice, bad example for anyone else on the web.

Is it time for an #OccupyGoogle ? I think it is.
PS: If anyone has bright ideas on how to tweak the robots.txt to effectively disallow crawler access, I’d like to hear about it.
Angry face picture courtesy SearchEnginePeople




10 Comments to “I don’t like Google anymore”

  1. Ferb says:

    Finally, Check this out http://www.google.com/support/forum/p/Webmasters/thread?tid=0da8e0b9fd682420&hl=en

    You don’t have to be so serious about the problems. Besides, immediately switching your browser to IE9 for Windows or Firefox for Mac would be well advised.

    • Peter says:

      Hi Ferb,

      All what is recommended in that post, I did, and more… And that is exactly my point: Google does NOT follow the robots.txt rules :-(

      Peter

    • Dee Jamacek says:

      I don’t care for Google either, especially when they print something you did not say. I looked my name up and they printed a false statement about the Zimmerman/Trayvon case which I never said. I guess they figure
      they can print whatever the heck they want. I am not thrilled about them owning youtube either, they are
      trying to take over the whole world.

  2. John says:

    I am not a huge Google fan myself. I have tried Chrome but actually find it childlike in its design. Much of Google looks like a child sat down and drew out plans with crayons. I guess my problems with Chrome started when Google decided to build Flash into its browser. I prefer to decide myself what version I run. Flash has had a history of bad versions. Between IE and Firefox I would always say Firefox is by far the better browser. Even IE10 is not going to impress many. I think if you like Chrome, and especially if you use Google’s other products, then I guess Chrome is your browser. I prefer to avoid ecosystems, especially ones like Google that profit mainly from ad revenue.

  3. Julian says:

    I don’t like Google either. I agree with you about their web monopoly, and one thing in particular that I hate is that they bought YouTube. They completely ruined it and made it so much more complicated than it has to be. They made it so you must have a Google account to have a YouTube account. It is also very difficult to change the fact that it shows your name when you comment.

    • Dee Jamacek says:

      They are trying to take over everything. It’s also annoying when you want to download a product and you
      get Install Google Chrome/Toolbar. I skip over it. I have complained about Google a lot. I see they have
      posted email addresses on search, one has my last name. Well, it’s not my email address but since I
      comment online about different things, this poor person is probably getting email intended for me.
      Google does not have a privacy policy, that’s a laugh.

  4. danny Perez says:

    Not only that, Google only shows what gives them money. Try searching for anything in Google and then in bing.com and
    you’ll see the difference: in Google, whoever paid always comes first, even if it is not relevant. I think they got corrupted by money; they became greedy.
    So I did what everybody should do if they don’t like Google: I DON’T USE IT ANYMORE!! SWITCH TO BING OR YAHOO or whatever.

  5. Dee Jamacek says:

    I have complained about GOOGLE a lot. GOOGLE has posted email addresses on search, one with my last name. Hey there, GOOGLE, it’s NOT my email address! It is very annoying when you want to download
    Flash Player and they have Install GOOGLE chrome/toolbar. I skip that, I skip over it every time there
    is a download with Install GOOGLE chrome/toolbar. Like I said, they are trying to take over everything.
    It’s bad enough they took over youtube.

  6. Andy says:

    I was wondering if I was the only one. Glad to see that I’m not alone.

  7. Ronda says:

    Google is not a nice company… Smart, but not user friendly. Do not lose your password: you cannot get help, whether you are disabled, a senior or a regular Joe. You get stuck in a stupid loop of dumb comments and questions. What year and date did you open the account? What other email addresses do you send to, and what folder names did you set up? They send the reset to the wrong backup info and refuse to correct it. Give them another email account to send the info to and they refuse to send it. For the reset code they insist on a phone that is not in service. You cannot get a live person. If you call by phone they try to charge you, and if you ask to close the account a rude rep says goodbye, thanks for calling, click??? Being mean to seniors and the disabled says a lot right there. They have become the scary internet provider. They scan your messages, and downloading Chrome invades your system and spies on all the setup info and messages that you store. Letters you type on their sites cannot be deleted or saved to your docs; they choose to keep them.

    This is the same company buying war robots the only exciting thing is self driving cars. Who knows may trap you inside? Even the head of Company odd mystery of some tattoo woman may have attacked him. Hope not. Either or its creepy.
