<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Blog Tips &#187; Advanced Stuff</title>
	<atom:link href="http://www.blogtips.org/category/advanced/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.blogtips.org</link>
	<description>Blogging and Social Media for Nonprofit</description>
	<lastBuildDate>Tue, 31 Jan 2012 15:23:38 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.2.1</generator>
		<item>
		<title>I don&#8217;t like Google anymore</title>
		<link>http://www.blogtips.org/i-dont-like-google/</link>
		<comments>http://www.blogtips.org/i-dont-like-google/#comments</comments>
		<pubDate>Fri, 13 Jan 2012 12:31:26 +0000</pubDate>
		<dc:creator>Peter</dc:creator>
				<category><![CDATA[Advanced Stuff]]></category>
		<category><![CDATA[Easy Stuff]]></category>
		<category><![CDATA[FYI Stuff]]></category>
		<category><![CDATA[Google Webmaster]]></category>
		<category><![CDATA[search engines]]></category>
		<category><![CDATA[SEO]]></category>
		<category><![CDATA[Web statistics]]></category>

		<guid isPermaLink="false">http://www.blogtips.org/?p=2222</guid>
		<description><![CDATA[I have blogged before about how I don&#8217;t like Google&#8217;s growing monopoly on the web, and how they increasingly block non-Chrome browsers when you use their own web applications. It looks like Google is now moving into more unsound web practices, using their web crawling power and abilities for unfair, unethical and illegal purposes. Google pays [...]]]></description>
			<content:encoded><![CDATA[<p></p><p><img class="aligncenter" title="Google horror" src="http://theroadtothehorizon.net/photo/angry%20google.jpg" alt="Google horror" width="373" height="430" /></p>
<p>I have blogged before about how I don&#8217;t like <a title="Google web monopoly" href="/google-web-monopoly/">Google&#8217;s growing monopoly on the web</a>, and how they <a href="/chrome-soon-only-browser-for-google-tools/">increasingly block non-Chrome browsers</a> when you use their own web applications.</p>
<p>It looks like Google is now moving into more unsound web practices, using their web crawling power and abilities for unfair, unethical and illegal purposes.</p>
<p><span id="more-2222"></span></p>
<h3>Google pays bloggers for Chrome links</h3>
<p>Earlier this month, news broke that <a href="http://searchengineland.com/googles-jaw-dropping-sponsored-post-campaign-for-chrome-106348" target="_blank">Google paid bloggers to write about Chrome</a>, their own browser. Embedding paid advertisements and, in some cases, disguising them as genuine blog content has always been something I find highly unethical.<br />
OK, in some cases during Google&#8217;s campaign, the paid posts were clearly marked as such, but they still embedded links to Chrome&#8217;s download site. &#8220;Paid&#8221; links are something Google themselves have always battled against, saying they cheat search engines. Now they are doing it themselves, to market their own products.</p>
<p><strong>Verdict:</strong> unethical and unfair competition.</p>
<h3>Google crawls for data for their own profit</h3>
<p>Today, I read <a href="http://blog.mocality.co.ke/2012/01/13/google-what-were-you-thinking/" target="_blank">this interesting and in-depth story</a> about how Google crawled data from <a href="http://www.mocality.co.ke/" target="_blank">Mocality</a>, a small Kenyan company that publishes an online Kenyan business directory.</p>
<p>In <a href="http://blog.mocality.co.ke/2012/01/13/google-what-were-you-thinking/" target="_blank">their article</a>, the Mocality folks detail how Google crawls their business directory with the sole purpose of contacting the listed companies and convincing them to move to a Google business directory. The article features recordings of telephone calls from Google employees to the listed companies, which also reveal other unclear and illegal business practices.<br />
Even worse, after the story broke, Google deliberately switched the crawler&#8217;s IP address and continued crawling Mocality&#8217;s data. This rules out ignorance and shows clear ill intent.</p>
<p><strong>Verdict:</strong> unethical, unfair competition, commercial spying.</p>
<h3>My own thing with Google</h3>
<p>Web crawlers scanning your web content are supposed to obey the rules web admins set in &#8220;robots.txt&#8221;. This file defines what crawlers can scan, what they should not, and at what rate they can scan.<br />
I have always had a problem with the fact that, for the Google crawler, some settings, like the crawl rate, can only be set in the Google Webmaster tools, which ignore the settings in &#8220;robots.txt&#8221;. Even worse, every three months, Google resets the crawl rate. If you want to spare your server from excessive Google crawling, you have to manually reset the crawl rate for each of your sites, again. Every three months.</p>
<p>Today, I discovered that Google&#8217;s crawlers also ignore other, more basic rules:</p>
<p>On &#8220;<a href="http://humanitariannews.org" target="_blank">Humanitarian News</a>&#8221;, one of my sites, the &#8220;<a href="http://humanitariannews.org/robots.txt" target="_blank">robots.txt</a>&#8221; explicitly disallows crawlers from accessing the search facility:</p>
<blockquote><p>Disallow: /search/<br />
Disallow: /opensearch/</p></blockquote>
<p>The reason is that, at the rate Google crawls and with the number of search combinations possible, my server performance goes down each time Google spams my site with excessive crawls. Even worse, this &#8220;opensearch&#8221; string calls the SOLR search, which invokes a Java-based script offering advanced RSS features for &#8220;live&#8221; users. Needless to say this sucks up server resources, certainly if you shoot off opensearch requests at the rate Google&#8217;s crawler does. This is the reason why I block crawlers from accessing the search in the first place.</p>
<p>This morning, while checking my log, I found a string of searches which caught my attention:</p>
<blockquote><p>search 13 Jan 2012 &#8211; 12:02 somalia (Search). Anonymous results<br />
search 13 Jan 2012 &#8211; 12:02 somalia (Search). Anonymous results<br />
search 13 Jan 2012 &#8211; 12:02 somalia (Search). Anonymous results<br />
search 13 Jan 2012 &#8211; 12:02 somalia (Search). Anonymous results<br />
search 13 Jan 2012 &#8211; 12:02 somalia (Search). Anonymous results<br />
etc etc etc..</p></blockquote>
<p>Looking into it further, the search strings were of the format:</p>
<blockquote><p>http://humanitariannews.org/search/apachesolr_search/somalia?page=1442</p></blockquote>
<p>And who generated those searches? They all come from IP 66.249.72.105<br />
Which is (<a href="http://whois.domaintools.com/66.249.72.105" target="_blank">source</a>)&#8230;:</p>
<blockquote><p>OrgName: Google Inc.<br />
OrgId: GOGL<br />
Address: 1600 Amphitheatre Parkway<br />
City: Mountain View<br />
StateProv: CA<br />
PostalCode: 94043<br />
Country: US</p></blockquote>
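<p>A tally like this does not need special tools. As a rough sketch in Python, assuming an access log in the common Apache format: the sample lines below are invented for illustration (only the IP address and the URL pattern come from the log above), and the function name is hypothetical.</p>

```python
from collections import Counter

# Hypothetical access-log lines, invented for illustration. Only the crawler
# IP and the search URL pattern are taken from the post; 192.0.2.10 is a
# documentation-only address standing in for a normal visitor.
SAMPLE_LOG = [
    '66.249.72.105 - - [13/Jan/2012:12:02:01 +0000] "GET /search/apachesolr_search/somalia?page=1440 HTTP/1.1" 200 5120',
    '66.249.72.105 - - [13/Jan/2012:12:02:02 +0000] "GET /search/apachesolr_search/somalia?page=1441 HTTP/1.1" 200 5120',
    '66.249.72.105 - - [13/Jan/2012:12:02:03 +0000] "GET /search/apachesolr_search/somalia?page=1442 HTTP/1.1" 200 5120',
    '192.0.2.10 - - [13/Jan/2012:12:02:04 +0000] "GET / HTTP/1.1" 200 9000',
    '192.0.2.10 - - [13/Jan/2012:12:02:05 +0000] "GET /search/apachesolr_search/kenya HTTP/1.1" 200 4100',
]


def search_hits_per_ip(lines, prefix="/search/"):
    """Count requests per client IP whose path starts with prefix."""
    counts = Counter()
    for line in lines:
        fields = line.split()
        # In this log format, field 0 is the client IP and field 6 the path.
        if len(fields) > 6 and fields[6].startswith(prefix):
            counts[fields[0]] += 1
    return counts


# Print the IPs hitting the search facility, busiest first.
for ip, hits in search_hits_per_ip(SAMPLE_LOG).most_common():
    print(ip, hits)
```

<p>On a real log, the IP at the top of that list is the one worth looking up in whois.</p>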
<p>So&#8230;</p>
<ol>
<li>My robots.txt clearly blocks /search access.</li>
<li>Google ignores that rule, and as such ignores a basic rule to protect certain web content from crawling.</li>
<li>Going over my log, I find repeated Google crawls of the search.</li>
<li>On top of that, there is not much relevant data to be found in the search which cannot be found in a normal crawl of my site. And certainly not on &#8220;page 1,442&#8221; of a search, as in the example above.</li>
</ol>
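<p>Point 1 is easy to verify offline. The following is a minimal sketch, not a definitive test, using Python&#8217;s standard <code>urllib.robotparser</code> module. It assumes the two Disallow lines sit under a catch-all <code>User-agent: *</code> group, which the excerpt above does not show; the helper name <code>is_allowed</code> is made up for this example.</p>

```python
from urllib.robotparser import RobotFileParser

# Assumption: the two Disallow rules quoted in the post belong to a catch-all
# "User-agent: *" group; the excerpt does not show the user-agent line.
ROBOTS_TXT = """\
User-agent: *
Disallow: /search/
Disallow: /opensearch/
"""


def is_allowed(url, user_agent="Googlebot"):
    """Return True if the rules above permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


crawled = "http://humanitariannews.org/search/apachesolr_search/somalia?page=1442"
print(is_allowed(crawled))                          # the crawled search URL
print(is_allowed("http://humanitariannews.org/"))  # the front page
```

<p>If the parser reports the search URL as disallowed while the log still shows it being fetched, that at least rules out a typo in the rules themselves.</p>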
<p>The problem is that, other than blocking their crawler&#8217;s IP address, you can&#8217;t do much about it. And if you block their crawler&#8217;s IP address, as a web manager, you can just as well throw yourself under a train: the Google search results for your site would disappear. (Do I smell a monopoly here?)</p>
<p><strong>Verdict:</strong> monopoly, unethical business practice, bad example for anyone else on the web.</p>
<p>Is it time for an #OccupyGoogle? I think it is.<br />
PS: If anyone has bright ideas on how to tweak the robots.txt to disallow crawler access, I&#8217;d like to hear about it.<br />
Angry face picture courtesy <a href="http://www.searchenginepeople.com" target="_blank">SearchEnginePeople</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.blogtips.org/i-dont-like-google/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>How to secure WordPress timthumb.php</title>
		<link>http://www.blogtips.org/how-to-secure-wordpress-timthumb-php/</link>
		<comments>http://www.blogtips.org/how-to-secure-wordpress-timthumb-php/#comments</comments>
		<pubDate>Fri, 16 Sep 2011 09:12:48 +0000</pubDate>
		<dc:creator>Peter</dc:creator>
				<category><![CDATA[Advanced Stuff]]></category>
		<category><![CDATA[Geeky Stuff]]></category>
		<category><![CDATA[How to... Stuff]]></category>
		<category><![CDATA[hackers]]></category>
		<category><![CDATA[security]]></category>
		<category><![CDATA[WordPress]]></category>

		<guid isPermaLink="false">http://www.blogtips.org/?p=2129</guid>
		<description><![CDATA[If you have a self-hosted WordPress blog (WordPress.org), take urgent measures to secure your site from a recently discovered vulnerability. Many WordPress themes and plug-ins use a script called &#8220;timthumb&#8221; (timthumb.php). This is the most common code used to create thumbnails from pictures. At the end of July, a vulnerability surfaced showing external users could dump malicious code [...]]]></description>
			<content:encoded><![CDATA[<p></p><p><img class="aligncenter" title="matrix" src="http://theroadtothehorizon.net/photo/matrix.jpg" alt="matrix" width="400" height="304" /></p>
<p>If you have a self-hosted WordPress blog (WordPress.org), take urgent measures to secure your site from a recently discovered vulnerability.</p>
<p>Many WordPress themes and plug-ins use a script called &#8220;timthumb&#8221; (timthumb.php). This is the most common code used to create thumbnails from pictures.</p>
<p>At the end of July, a vulnerability surfaced showing that external users could dump malicious code onto your site. Typically, a short piece of .php code is uploaded via a timthumb backdoor. This hacking code then creates a wider backdoor to gain pretty much full access to your site.</p>
<p><span id="more-2129"></span>It looks like the hackers were on holiday too, and are only gearing up their activity right now. Many sites were hacked in the last couple of days. As many sites use timthumb.php, we can foresee a major hacking spree in the next weeks.</p>
<p>So it is high time to secure your self-hosted WordPress site now.</p>
<h3>How to check if you have been hacked via timthumb.php?</h3>
<p>There is no single signature to this hack, unlike <a title="GoDaddy hacked" href="/godaddy-sites-hacked-again/">the shared hosting hack</a> last year, but here are some common things that seem to happen:</p>
<ul>
<li>First&#8230; Check if you actually use &#8220;timthumb.php&#8221; on your site. It does not come with the default WordPress installation, so check if any of your uploaded themes contain the file &#8220;timthumb.php&#8221;.<br />
Do a site wide search (with SSH or SFTP). Some popular plug-ins that use timthumb.php are &#8220;WordPress Popular Posts&#8221; and &#8220;WP Mobile Detector&#8221;.<br />
Many themes use timthumb.php, or a variation of it. E.g. the widely used &#8220;Thesis&#8221; theme uses it as &#8220;thumb.php&#8221;.</li>
<li>If you find timthumb.php in your plugin or themes directory, give your site a thorough check using the steps below.</li>
<li>The hackers often upload .php files to the timthumb upload directory &#8220;/cache&#8221; (a subdirectory of the one where the timthumb script is stored). You should check that directory, and delete any non-picture files (.html, .php, &#8230;).</li>
<li>Often hackers upload .php files to several other subdirectories within your WordPress installation. I have seen them in the &#8220;/upload&#8221; and &#8220;/supercache&#8221; directories (and their subdirectories) as well as in the directories for plugins and themes. Delete them.</li>
<li>Recently, the hackers got bolder and entire subdirectories were uploaded. First a .zip file would be uploaded, then it would be unzipped, and an entire sub-site would be installed in one of the WordPress directories. I have seen zipfiles called halifaxsecurity.zip, hal.zip, studentloanupdate.zip, student.zip. Malicious subdirectories I detected on other sites were called /halifaxsecurity, /hal and /studentloanupdate. Delete those, if you find them.</li>
<li>People also report direct hacks in .php files and style sheets, adding malicious code (similar to last year&#8217;s hacks).</li>
<li>Check your .htaccess files.</li>
<li>..</li>
</ul>
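<p>The first checks in the list above can be scripted. This is a minimal sketch in Python, assuming you have a local copy of (or shell access to) the WordPress directory; the function name <code>scan</code>, the set of script names and the list of picture extensions are assumptions for this example, so adapt them to your own themes. It only reports files and deletes nothing.</p>

```python
import os

# Assumed names: the post mentions timthumb.php and the "thumb.php" rename
# used by one theme; other renames would need to be added here.
SCRIPT_NAMES = {"timthumb.php", "thumb.php"}
# Assumed list of extensions that count as pictures.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".gif"}


def scan(root):
    """Return (scripts, suspects) found under root.

    scripts:  paths of thumbnail scripts
    suspects: non-picture files in a "cache" directory next to such a script
    """
    scripts, suspects = [], []
    seen_caches = set()
    for folder, _subdirs, files in os.walk(root):
        for name in sorted(files):
            if name.lower() not in SCRIPT_NAMES:
                continue
            scripts.append(os.path.join(folder, name))
            cache = os.path.join(folder, "cache")
            if cache in seen_caches or not os.path.isdir(cache):
                continue
            seen_caches.add(cache)
            for cached in sorted(os.listdir(cache)):
                path = os.path.join(cache, cached)
                extension = os.path.splitext(cached)[1].lower()
                if os.path.isfile(path) and extension not in IMAGE_EXTENSIONS:
                    suspects.append(path)
    return scripts, suspects
```

<p>Calling <code>scan("/path/to/wordpress")</code> (a placeholder path) returns both lists, so you can review the suspects by hand before deleting anything.</p>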
<p>Also check <a href="http://blog.sucuri.net/" target="_blank">Sucuri&#8217;s blog</a> for more hack signatures and scripts, and <a href="http://markmaunder.com/2011/08/01/zero-day-vulnerability-in-many-wordpress-themes/" target="_blank">Mark Maunder&#8217;s blogpost</a> for a full description of the timthumb vulnerability.<br />
You can find (non-exhaustive) lists of themes and plugins using timthumb.php on <a href="http://www.big-webmaster.com/themes-scanned-timthumb-vulerability/" target="_blank">Big Webmaster</a> and <a href="http://blog.sucuri.net/2011/08/attacks-against-timthumb-php-in-the-wild-list-of-themes-and-plugins-being-scanned.html" target="_blank">Sucuri&#8217;s blog</a>.</p>
<p>Deleting those malicious files is not sufficient, as it still leaves the backdoor open for future hacks, <strong>so you need to secure your timthumb.php code NOW</strong>! Read on:</p>
<h3>How to secure timthumb.php against hacks?</h3>
<ol>
<li>Locate all instances of timthumb.php (or any renames of it) on your site.</li>
<li><a href="http://timthumb.googlecode.com/svn/trunk/timthumb.php" target="_blank">Download the newest timthumb.php code</a> (Check also <a href="http://code.google.com/p/timthumb/" target="_blank">the plug-in&#8217;s home page</a>)</li>
<li>Replace the old timthumb.php with your downloaded code.</li>
<li>While the new code is already secure, I strongly suggest limiting access from external sites.<br />
Replace the line:<br />
<code>define ('ALLOW_EXTERNAL', TRUE);</code><br />
with:<br />
<code>define ('ALLOW_EXTERNAL', FALSE);</code></li>
</ol>
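<p>To confirm that step 4 stuck on every copy of the script, the setting can be read back from the file contents. A minimal sketch in Python: the function name <code>allows_external</code> is hypothetical, and the pattern only covers the <code>define()</code> form shown above, so treat a result of <code>None</code> as &#8220;check by hand&#8221;.</p>

```python
import re

# Matches define('ALLOW_EXTERNAL', TRUE) or FALSE, tolerating extra spaces,
# either quote style and any letter case.
PATTERN = re.compile(
    r"""define\s*\(\s*['"]ALLOW_EXTERNAL['"]\s*,\s*(true|false)\s*\)""",
    re.IGNORECASE,
)


def allows_external(php_source):
    """Return True or False for the ALLOW_EXTERNAL setting, None if absent."""
    match = PATTERN.search(php_source)
    if match is None:
        return None
    return match.group(1).lower() == "true"


print(allows_external("define ('ALLOW_EXTERNAL', TRUE);"))   # old setting
print(allows_external("define ('ALLOW_EXTERNAL', FALSE);"))  # secured setting
```

<p>Feed it the text of each timthumb.php you located in step 1; any copy that still returns <code>True</code> needs another look.</p>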
<p>Good luck!</p>
<p>Picture courtesy <a href="http://www.tgdaily.com" target="_blank">TGDaily</a><br />
With thanks to <a href="http://ictkm.cgiar.org" target="_blank">Michael Marus</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.blogtips.org/how-to-secure-wordpress-timthumb-php/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Yahoo Pipes upgrades to V2</title>
		<link>http://www.blogtips.org/yahoo-pipes-upgrades-to-v2/</link>
		<comments>http://www.blogtips.org/yahoo-pipes-upgrades-to-v2/#comments</comments>
		<pubDate>Sun, 17 Jul 2011 12:00:21 +0000</pubDate>
		<dc:creator>Peter</dc:creator>
				<category><![CDATA[Advanced Stuff]]></category>
		<category><![CDATA[RSS]]></category>
		<category><![CDATA[Yahoo Pipes]]></category>

		<guid isPermaLink="false">http://www.blogtips.org/?p=2075</guid>
		<description><![CDATA[Yahoo Pipes is a powerful and free tool, which allows you to combine and manipulate RSS feeds. I use it extensively for many of my news aggregation blogs. In the past, Yahoo Pipes has been suffering from regular downtime, and just over one year ago, Yahoo decided it was time to overhaul the Pipes&#8217; core [...]]]></description>
			<content:encoded><![CDATA[<p></p><p><img class="aligncenter" title="leaky pipe" src="http://theroadtothehorizon.net/photo/leaky%20pipe.jpg" alt="leaky pipe" width="430" height="326" /></p>
<p><a href="http://pipes.yahoo.com/pipes/" target="_blank">Yahoo Pipes</a> is a powerful and free tool, <a title="how to use Yahoo Pipes - Simple tutorial for Yahoo Pipes" href="/how-to-combine-rss-feeds-with-yahoo-pipes/" target="_blank">which allows you to combine and manipulate RSS feeds</a>. I use it extensively <a href="http://www.theroadtothehorizon.org/2009/09/my-blogs.html">for many of my news aggregation blogs</a>.</p>
<p>In the past, Yahoo Pipes has been suffering <a title="Yahoo Pipes problems" href="/yahoo-pipes-more-down-than-up/">from regular downtime</a>, and just over one year ago, <a href="http://blog.pipes.yahoo.net/2010/06/09/yahoo-pipes-v2-engine/" target="_blank">Yahoo decided it was time to overhaul the Pipes&#8217; core engine</a> entirely: <em>&#8220;Pipes Version 2&#8221;</em>, or V2, was born.</p>
<p>Well, it was a difficult birth it seems.</p>
<p><span id="more-2075"></span>Since June last year, Pipes V2 has been available in beta to end users. The initial testing revealed so many bugs that hardly anyone bothered to report problems anymore on the <a href="http://tech.groups.yahoo.com/group/pipes-engine2users/" target="_blank">V2-discussion forum</a>.</p>
<p>A couple of months ago, Yahoo seemed to put new steam into the V2 migration. Several software engineers started to actively engage on the support forum. But still, bugs were plentiful. Many answers were the same: <em>&#8220;Known issue. Working on it&#8221;</em>. I tried my share, but did not manage to migrate any of my pipes to V2. So hardly anyone migrated. And those who did mostly switched back to the V1 engine immediately.</p>
<p>That was until the end of June, when <a href="http://blog.pipes.yahoo.net/2011/06/10/pipes-v2-engine-timeline/" target="_blank">Yahoo announced</a> that the &#8220;switch back&#8221; would no longer be available, and that all pipes would be converted to V2 as of early August.</p>
<p>Panic!</p>
<p>Everyone is now scrambling to convert their pipes to V2. Luckily, Yahoo has beefed up their presence on the discussion forum, and a number of active and user-oriented software engineers are braving the wave of complaints.</p>
<p>It was not until last week that a really stable V2 version was released (that was one day before a 20-hour-long Pipes meltdown, but that is just a detail).</p>
<p>A few nights ago, I tested and migrated all my 30-odd pipes one by one, and found that with one or two exceptions, <strong>all pipes migrated just fine</strong>. The few errors I found have been fixed by the Pipes team in the meantime, or I was able to work around them.</p>
<p>So, my advice: <strong>if you are using Yahoo Pipes, better test and convert those Pipes NOW</strong>, before you are left with no choice. Deadline: August 1, remember?</p>
<p>PS: If you are interested in this powerful RSS manipulation tool, check out my <a title="Yahoo Pipes Tutorial" href="/how-to-combine-rss-feeds-with-yahoo-pipes/" target="_blank">simple Yahoo Pipes Tutorial</a>.</p>
<p>&nbsp;</p>
<p>Picture courtesy <a href="http://www.faqs.org/photo-dict/" target="_blank">Photo Dictionary</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.blogtips.org/yahoo-pipes-upgrades-to-v2/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Shared hosting: Comparing GoDaddy with DreamHost</title>
		<link>http://www.blogtips.org/shared-hosting-comparing-godaddy-dreamhost/</link>
		<comments>http://www.blogtips.org/shared-hosting-comparing-godaddy-dreamhost/#comments</comments>
		<pubDate>Wed, 06 Apr 2011 10:37:08 +0000</pubDate>
		<dc:creator>Peter</dc:creator>
				<category><![CDATA[Advanced Stuff]]></category>
		<category><![CDATA[FYI Stuff]]></category>
		<category><![CDATA[hosting]]></category>
		<category><![CDATA[tips]]></category>

		<guid isPermaLink="false">http://www.blogtips.org/?p=1918</guid>
		<description><![CDATA[After my debacle with GoDaddy, I moved most of my high volume blogs to a HostGator Virtual Private Server. I still had some test blogs on GoDaddy&#8217;s shared servers, but as the speed and uptime of GoDaddy went from bad to worse, I recently moved several of these blogs to a DreamHost shared server. Here [...]]]></description>
			<content:encoded><![CDATA[<p></p><div class="wp-caption aligncenter" style="width: 400px">
	<img title="Dreamhost Godaddy" src="http://theroadtothehorizon.net/photo/godaddy-dreamhost.jpg" alt="Dreamhost Godaddy" width="400" height="296" />
	<p class="wp-caption-text">GoDaddy versus DreamHost: and the winner is...</p>
</div>
<p>After <a href="/shared-hosting-pay-peanuts-get-monkeys/">my debacle with GoDaddy</a>, I <a href="/the-difference-between-shared-hosting-and-dedicated-hosting/">moved most of my high volume blogs</a> to a HostGator Virtual Private Server. I still had some test blogs on GoDaddy&#8217;s shared servers, but as the speed and uptime of GoDaddy went from bad to worse, I recently moved several of these blogs to <a href="http://www.dreamhost.com/hosting.html" target="_blank">a DreamHost shared server</a>.</p>
<p>Here are my impressions of DreamHost versus GoDaddy shared servers:</p>
<p><span id="more-1918"></span></p>
<h3>DreamHost versus GoDaddy shared hosting speed</h3>
<p>After two years of fighting GoDaddy&#8217;s slow speed (I mean REAL slow), it was a relief to see how fast the DreamHost shared hosting really was.</p>
<p>Take <a href="http://lab.petercasier.be/ciatcapacity/">this test site</a>, which I recently made for a client. While this WordPress blog is not cached (it is used for an interactive re-theming of an existing site), it is about as fast as one can get. It outperforms any heavily cached site I have on GoDaddy.</p>
<p>Kudos, DreamHost! &#8211; I am impressed&#8230;</p>
<h3>DreamHost versus GoDaddy shared hosting prices</h3>
<p>As I have several blogs and domains running on the same shared hosting account, I had GoDaddy Deluxe Hosting, which is comparable to the standard DreamHost Web Hosting features (see the next section).<br />
GoDaddy advertises Deluxe Hosting at $7.99 per month; DreamHost costs $8.95 per month. What GoDaddy does not tell you, though, is that when you renew hosting, the price goes up drastically. GoDaddy Deluxe Hosting renewal costs $150.96 per year. DreamHost Web Hosting costs $119.40 per year&#8230;</p>
<p>By the way, GoDaddy Economy Hosting, which can only host a SINGLE domain and offers FAR fewer features/storage/databases than their Deluxe Hosting, costs $101.80/year for a renewal.</p>
<p>I also have an issue with GoDaddy&#8217;s misleading advertising: nowhere on their website (I checked with their sales department) can you find the cost of renewals. Nor do they mention that the cost of a hosting renewal is different from the initial fee. Their sales person told me bluntly: &#8220;Yeah, but you need to understand, this is how we attract new customers.&#8221; To which I answered: &#8220;&#8230;And how you lose existing customers.&#8221;<br />
Nothing short of cheating, if you ask me.</p>
<p>So once you reach renewal, DreamHost is clearly cheaper than GoDaddy.</p>
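<p>To make the comparison concrete, here is the arithmetic behind it as a small Python sketch. It uses only the prices quoted above; the variable names are made up for this example, and <code>Decimal</code> is used so the cents stay exact.</p>

```python
from decimal import Decimal

# Prices quoted in the post, in US dollars.
godaddy_advertised_month = Decimal("7.99")
godaddy_renewal_year = Decimal("150.96")
dreamhost_advertised_month = Decimal("8.95")
dreamhost_renewal_year = Decimal("119.40")

# Convert the yearly renewal prices to a monthly figure for comparison.
godaddy_renewal_month = godaddy_renewal_year / 12
dreamhost_renewal_month = dreamhost_renewal_year / 12

# What the cheaper host saves per year once both accounts are on renewal.
yearly_saving = godaddy_renewal_year - dreamhost_renewal_year

print("GoDaddy renewal per month:  ", godaddy_renewal_month)    # 12.58
print("DreamHost renewal per month:", dreamhost_renewal_month)  # 9.95
print("Saving per year:            ", yearly_saving)            # 31.56
```

<p>In other words, the advertised $7.99 turns into $12.58 per month on renewal, which is the jump GoDaddy does not show up front.</p>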
<h3>GoDaddy versus DreamHost shared hosting capacity features</h3>
<p>GoDaddy Deluxe Hosting offers:</p>
<ul>
<li>150 GB storage</li>
<li>unlimited websites (domain hosting)</li>
<li>unlimited bandwidth</li>
<li>500 Email Accounts</li>
<li>25 MySQL Databases (of max 1 GB each)</li>
<li>1 GB max database size to be included in backups</li>
</ul>
<p>DreamHost Web Hosting offers:</p>
<ul>
<li>unlimited storage</li>
<li>unlimited websites (domain hosting)</li>
<li>unlimited bandwidth</li>
<li>unlimited Email Accounts</li>
<li>unlimited MySQL Databases (with unlimited storage each)</li>
<li>4 GB max database size to be included in backups</li>
</ul>
<p>GoDaddy&#8217;s claim of &#8220;unlimited&#8221; website hosting is bollocks, as they only support up to 25 MySQL databases. This means you can only host up to 25 blogs (assuming all blogs use MySQL databases) on one Deluxe Hosting account. That, to me, smells like false advertising once more!</p>
<p>So once again, DreamHost beats GoDaddy.</p>
<h3>GoDaddy versus DreamHost support</h3>
<p>GoDaddy only offers Email and telephone support. I wrote before about <a title="GoDaddy troubles" href="/shared-hosting-pay-peanuts-get-monkeys/">my issues with GoDaddy support</a>: Email support is almost a no-go. Most of the time I get a pre-cooked boilerplate Email answer, which looks like they did not really read my question or issue. With each exchange, a different support person answers, who clearly has not gone through the history of that particular support ticket.</p>
<p>GoDaddy&#8217;s telephone support is slightly better than their Email support, but it really depends on the person you get on the line. I would say 90% of the time, the person &#8220;had to check&#8221; with a second line of support. And most of the time, the answer was &#8220;Yep, apparently there is a problem on the shared server your blog runs on, but they are working on it&#8221;. They have no way of notifying you when the issue is resolved. Some issues were never solved.</p>
<p>DreamHost has Email and telephone support, but also features online chat support. This is great if you have a quick question, and has saved me a lot of time on many occasions. I can&#8217;t believe GoDaddy still does not offer this service.</p>
<p>For all my interactions with DreamHost, I&#8217;d rate their level of support <strong>much</strong> higher than GoDaddy&#8217;s.</p>
<h3>GoDaddy versus DreamHost shared hosting features</h3>
<p>I was very surprised to find the DreamHost Web Hosting features to be more similar to a VPS (Virtual Private Server) hosting, than a shared hosting:</p>
<ul>
<li>On DreamHost, you can define an unlimited number of user accounts, each with their own log-in, disk space, domain, etc&#8230;: This allows you to subdivide your account for each of the blogs you host. It is also an additional security feature.<br />
None of that on GoDaddy: one account, one login on Deluxe Hosting</li>
<li>On DreamHost you have full-featured SSH (terminal) access: most Linux commands can be used on DreamHost. Even the &#8220;top&#8221; command will show you the processes that run on your account, once again resembling the features of VPS hosting.<br />
On GoDaddy, many Linux commands are blocked.</li>
<li>DreamHost has a more integrated control panel, giving you access to all account, domain, email and DNS functions.<br />
In GoDaddy, many of these functions are split off into different submenus and sub-websites, which are often slow and difficult to use.</li>
<li>DreamHost allows 50 GB of backup storage, which is configured as a separate user. This can be used to store an off-site backup of your laptop, or backups from another server.<br />
GoDaddy does not allow their Deluxe Hosting server to be used as backup server.</li>
<li>Both DreamHost and GoDaddy provide &#8220;One-Click&#8221; installs for the most popular open source CMS (Content Management Systems) and blogs. I should say that I did not like the WordPress One-Click install of DreamHost: it was not a standard installation. It also includes 30-odd themes, which is just overhead if you want to keep your site lean and mean, and easy to back up.</li>
<li>Talking about backups: GoDaddy has a MySQL backup function (even though it is hidden in some obscure sub-sub-sub menu&#8217;s icon). DreamHost has a one-click backup function, for your &#8220;whole account in one&#8221;. You can only take one backup per month. I&#8217;d like to see a more robust and easier backup feature, though: an automated daily/weekly/monthly backup would be nice&#8230;</li>
<li>There are several additional features I like on DreamHost Webhosting: They support Email discussion and Email broadcast lists, one-click site imports from another host&#8217;s CPanel, and software development version control, to name a few&#8230;</li>
</ul>
<h3>GoDaddy versus DreamHost shared hosting, which is best?</h3>
<p>DreamHost, without any doubt. Better price, far better performance, better support and more features.</p>
<p>And no&#8230; I am not sponsored by DreamHost!</p>
<p>&nbsp;</p>
]]></content:encoded>
			<wfw:commentRss>http://www.blogtips.org/shared-hosting-comparing-godaddy-dreamhost/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>How to make search engines discover more posts on your blog</title>
		<link>http://www.blogtips.org/how-to-make-google-discover-more-posts-on-your-blog/</link>
		<comments>http://www.blogtips.org/how-to-make-google-discover-more-posts-on-your-blog/#comments</comments>
		<pubDate>Mon, 28 Mar 2011 16:44:09 +0000</pubDate>
		<dc:creator>Peter</dc:creator>
				<category><![CDATA[Advanced Stuff]]></category>
		<category><![CDATA[How to... Stuff]]></category>
		<category><![CDATA[Google Webmaster]]></category>
		<category><![CDATA[search engines]]></category>
		<category><![CDATA[SEO]]></category>

		<guid isPermaLink="false">http://www.blogtips.org/?p=1896</guid>
		<description><![CDATA[As I explained in my tutorial &#8220;How to analyse your visitors&#8217; traffic figures&#8221;, the search engine traffic is important to have people discover your blog. So we have to do everything we can to make search engines discover our blog&#8217;s content, through SEO or Search Engine Optimization. While there is much written about SEO, any [...]]]></description>
			<content:encoded><![CDATA[<p></p><div class="wp-caption aligncenter" style="width: 400px">
	<img title="How to increase the Google crawl rate" src="http://theroadtothehorizon.net/photo/increase%20pages%20crawled%20per%20day.jpg" alt="How to increase the Google crawl rate" width="400" height="170" />
	<p class="wp-caption-text">Changing the Google crawl rate will have an immediate effect</p>
</div>
<p>As I explained in my tutorial <a title="How to analyse your blog traffic statistics" href="http://www.blogtips.org/blogging-tips-tutorial-analyse-visitors-statistics/">&#8220;How to analyse your visitors&#8217; traffic figures&#8221;</a>, search engine traffic is important for helping people discover your blog. So we have to do everything we can to make search engines discover our blog&#8217;s content, through <a title="SEO - an important tool for any serious blogger" href="http://www.blogtips.org/tag/search-engines/">SEO or Search Engine Optimization</a>.</p>
<p>While there is much written about SEO, any honest expert will confess that there are many mysteries surrounding this subject. That&#8217;s why in BlogTips, I only share those techniques I have seen work myself: in previous posts, I have shown how changing from one blog platform to another, and generating a site map, <a title="Different blog platforms and their influence on search engines" href="/tumblr-and-wordpress-a-crawlers-difference/">can dramatically increase the number of posts a crawler &#8220;discovers&#8221;</a>. And as crawlers like fast sites, I showed <a title="A crawler's difference between different blog platforms" href="/the-difference-between-shared-hosting-and-dedicated-hosting/">the difference in crawlers&#8217; access speed when switching from shared to dedicated servers</a>.</p>
<p>But&#8230; these techniques still let crawlers decide how deep they want to dig into your site. If you have a website, or a blog, with a <strong>lot of content</strong>, much of it might remain undiscovered. And if crawlers don&#8217;t check for your content, people won&#8217;t be able to find it through the search engines.</p>
<p><span id="more-1896"></span>By default, Google automatically determines the <strong>crawl rate</strong> for your site. The &#8220;crawl rate&#8221; defines how many pages from your site, or posts from your blog, they&#8217;ll check on &#8220;each visit&#8221;. If the crawl rate is too low, a lot of content might go undiscovered and un-indexed.</p>
<p>So how can we force the Google crawler (or gently ask it) to crawl more frequently?</p>
<h3>1. How many pages does Google crawl on your blog?</h3>
<p>As explained in <a title="things to do after creating a blog" href="/5-things-to-do-after-creating-a-new-blog/">Five things to do after creating a new blog</a>, it is good practice to submit your site to <a href="http://www.google.com/webmasters/" target="_blank">Google Webmasters</a>.<br />
In the Webmasters &#8220;Diagnostics Menu&#8221;, check the &#8220;Crawl stats&#8221;, to find out how many pages Google crawls per day. If this is too low for the number of blogposts you have, you will need to increase the crawl rate.</p>
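<p>To get a feel for what &#8220;too low&#8221; means, a rough back-of-the-envelope check helps. The sketch below is only an illustration in Python: the function name is made up, and so is the figure of 500 pages per day. Fill in your own number from the &#8220;Crawl stats&#8221;.</p>

```python
import math

def days_for_full_crawl(total_posts, pages_per_day):
    """Days a crawler needs to visit every post once at a steady daily rate."""
    return math.ceil(total_posts / pages_per_day)

# Hypothetical numbers: 100,000 posts, crawled at 500 pages per day.
print(days_for_full_crawl(100_000, 500))  # 200
```

<p>If the answer runs into months, as it does here, a large part of your archive is probably not being looked at very often.</p>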
<h3>2. Using Google Webmasters tools to increase the crawl rate</h3>
<p>For any site, even if you don&#8217;t <a title="selfhost your blog or not?" href="/selfhosting-your-blog-or-not/">selfhost your blog</a>, you can adjust the Google crawl rate using the same Google Webmasters tool: Go to &#8220;Site Configuration&#8221; &#8211; &#8220;Settings&#8221; and click on &#8220;Set custom crawl rate&#8221;:</p>
<p><img class="aligncenter" title="Adjusting the Google crawl rate with Google Webmasters tools" src="http://theroadtothehorizon.net/photo/adjusting%20the%20google%20crawl%20rate.jpg" alt="Adjusting the Google crawl rate with Google Webmasters tools" width="400" height="204" /></p>
<p>Adjust the slider to increase the crawl rate, save the setting and you are done.<br />
This is not a permanent setting. It is only valid for 90 days. Also, if you have a robots.txt in your selfhosted blog&#8217;s root (see point 3), then the most restrictive setting will apply.</p>
<p>Beware, this ONLY changes the Google crawl rate, and not the crawl rate for any other search engine.</p>
<h3>3. Using robots.txt to increase the crawl rate</h3>
<p>This solution is only valid if you <a href="/selfhosting-your-blog-or-not/">selfhost your blog</a>, running it on your own server.<br />
&#8220;robots.txt&#8221; is a simple text file with &#8220;instructions&#8221; for crawlers. It can restrict the access (for all crawlers or only for specific crawlers) to parts of your website, to protect non-public parts, scripts, etc. But it can also define the rate at which your site is crawled. SearchEngineLand has <a href="http://searchengineland.com/a-deeper-look-at-robotstxt-17573">a simple tutorial</a> if you want to learn more about robots.txt.</p>
<p>In short, a robots.txt that sets the crawl delay (say to 60 seconds) for all crawlers needs nothing more than this (&#8220;#&#8221; lines are comments):</p>
<blockquote><p><code>#<br />
# Change the crawlrate for any crawler<br />
#<br />
User-agent: *<br />
Crawl-delay: 60</code></p></blockquote>
<p>Save it in a text file (&#8220;robots.txt&#8221; &#8211; all lower case) and put it on the root directory of your blog, and you&#8217;re done!</p>
<p>Attention! Many crawlers accept the robots.txt &#8220;crawl-delay&#8221; setting, but Google is <strong>not</strong> one of them. The Google crawl rate can <strong>only</strong> be adjusted using Google&#8217;s webmaster tools (as described in the previous section).</p>
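<p>If you want to double-check what your robots.txt actually declares, Python&#8217;s standard library can parse it. This is a minimal sketch: the helper name <code>crawl_delay_for</code> and the &#8220;bingbot&#8221; user agent are just examples, and it only tells you what the file says, not whether a given crawler respects it.</p>

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
#
# Change the crawlrate for any crawler
#
User-agent: *
Crawl-delay: 60
"""

def crawl_delay_for(robots_txt, user_agent):
    """Return the Crawl-delay (in seconds) that applies to user_agent, or None."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.crawl_delay(user_agent)

print(crawl_delay_for(ROBOTS_TXT, "bingbot"))  # 60
```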
<h3>And does it work?</h3>
<p>You bet! <a href="http://aidnews.org">AidNews</a>, one of my humanitarian news aggregators, has over 100,000 posts. While I have a lot of traffic on this site, the number of visitors coming in via Google was way too low. I changed the crawl delay with a robots.txt from 300 to 60, and see how, the same day, the crawler responded on the graph at the top of this post.<br />
Now, the only thing I can do is wait until the search engine traffic picks up.</p>
<p>Happy crawling!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.blogtips.org/how-to-make-google-discover-more-posts-on-your-blog/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>BlogTips Tutorial: How to select your blog platform</title>
		<link>http://www.blogtips.org/blogging-tips-tutorial-how-to-select-your-blog-platform/</link>
		<comments>http://www.blogtips.org/blogging-tips-tutorial-how-to-select-your-blog-platform/#comments</comments>
		<pubDate>Fri, 25 Mar 2011 14:33:34 +0000</pubDate>
		<dc:creator>Peter</dc:creator>
				<category><![CDATA[Advanced Stuff]]></category>
		<category><![CDATA[Easy Stuff]]></category>
		<category><![CDATA[How to... Stuff]]></category>
		<category><![CDATA[Selecting a blog platform]]></category>
		<category><![CDATA[The BlogTips Tutorials]]></category>
		<category><![CDATA[Blogger]]></category>
		<category><![CDATA[hosting]]></category>
		<category><![CDATA[start a blog]]></category>
		<category><![CDATA[tips]]></category>
		<category><![CDATA[tools]]></category>
		<category><![CDATA[Tumblr]]></category>
		<category><![CDATA[Typepad]]></category>
		<category><![CDATA[WordPress]]></category>

		<guid isPermaLink="false">http://www.blogtips.org/?p=1883</guid>
		<description><![CDATA[Alright! You recognized your organisation can benefit from a blog and you know what you will blog about. But&#8230; which &#8220;software&#8221; should you use to blog? From the many different blogplatforms on the market, which suits your needs? In this tutorial, I explain the questions you need to ask yourself, the basic choices you will [...]]]></description>
			<content:encoded><![CDATA[<p></p><p><img class="aligncenter" title="picking potato seedlings" src="http://theroadtothehorizon.net/photo/potato%20seedlings.jpg" alt="picking potato seedlings" width="400" height="266" /></p>
<p>Alright! You recognized <a href="/does-your-non-profit-organisation-need-a-blog/">your organisation can benefit from a blog</a> and you know <a href="/start-blogging-but-what-will-you-blog-about/">what you will blog about</a>. But&#8230; which &#8220;software&#8221; should you use to blog? From the many different blogplatforms on the market, which suits your needs?</p>
<p>In this tutorial, I explain the questions you need to ask yourself and the basic choices you will have to make, and give you practical tips to choose the blogging tool best suited to your needs:</p>
<p><span id="more-1883"></span></p>
<ol>
<li><a title="Why is it critical to choose the right blog platform?" href="/selecting-a-blog-platform-a-critical-choice/">Why is it important to choose the right blog platform?</a> What are the different blog platforms on the market? If you make the wrong choice, can you still switch?</li>
<li><a title="Will you selfhost your blog or not?" href="/selecting-a-blog-platform-selfhost-your-blog-or-not/">Should you selfhost your blog or not?</a> Do you want to use the free blog platforms that host the blog for you, or do you want to run the blog on your own server? Is this linked to having your own domain? What are the advantages and pitfalls of this choice?</li>
<li><a title="What functionality do the different blogplatforms offer?" href="/selecting-a-blog-platform-functionality/">What functionality do the different blog platforms offer?</a> Looking deeper at the most popular blog software and blogging platforms, are there special niches they are made for? What functionality do you want?</li>
<li><a title="Which blog platform is the easiest to use" href="/selecting-a-blog-platform-ease-of-use-and-support/">Which blog platform is the easiest to use and offers the best support?</a> Or: what is the difference between heaven and hell?</li>
<li><a title="Which is the most flexible blogging software?" href="/selecting-a-blog-platform-layout-design-and-navigation/">Which blog platforms are the most flexible to configure your layout, design and navigation?</a> Appeal, graphic layout and usability are sometimes hampered by limitations imposed by the blogging software. Which blogging software is the most &#8220;feature rich&#8221;?</li>
<li><a title="Which blogging software is the most flexible?" href="/selecting-a-blog-platform-customizability/">Which blog platform is the easiest to customize?</a> If a blog software does not offer you what you want and how you want it, what can you customize? Which is the most flexible blog software?</li>
<li><a title="Which is the best blogging software for you?" href="/selecting-a-blog-platform-the-bottom-line/">Summarizing</a>: So which blogging software is your final choice? A summary of the pros and cons of the most popular blog packages.</li>
</ol>
<p>So&#8230; WordPress, Typepad, Blogger or Tumblr, what is your choice? <img src='http://www.blogtips.org/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
<p>&nbsp;</p>
<blockquote><p>Writing this series, I got significant help from Dave Barnhart, who filled in the blanks on Typepad and Movable Type.<br />
Dave is a social media strategy consultant, founder of Business Blogging Pros, and a gourmet chef. He and his firm have been helping companies use social media since 2005.<br />
He blogs at <a href="http://BusinessBloggingPros.typepad.com" target="_blank">Business Blogging Pros</a> and <a href="http://www.FumblingFoodie.com" target="_blank">Fumbling Foodie</a>. Check out <a href="http://businessbloggingpros.typepad.com/business_blogging_pros/friends/" target="_blank">some of the blogs</a> he has created.</p></blockquote>
]]></content:encoded>
			<wfw:commentRss>http://www.blogtips.org/blogging-tips-tutorial-how-to-select-your-blog-platform/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>BlogTips Tutorial: How to analyse your blog&#8217;s visitors statistics</title>
		<link>http://www.blogtips.org/blogging-tips-tutorial-analyse-visitors-statistics/</link>
		<comments>http://www.blogtips.org/blogging-tips-tutorial-analyse-visitors-statistics/#comments</comments>
		<pubDate>Fri, 25 Mar 2011 13:25:22 +0000</pubDate>
		<dc:creator>Peter</dc:creator>
				<category><![CDATA[Advanced Stuff]]></category>
		<category><![CDATA[How to... Stuff]]></category>
		<category><![CDATA[The BlogTips Tutorials]]></category>
		<category><![CDATA[Understanding the Traffic on Your Blog]]></category>
		<category><![CDATA[Google Analytics]]></category>
		<category><![CDATA[tips]]></category>
		<category><![CDATA[tools]]></category>
		<category><![CDATA[visitor traffic]]></category>
		<category><![CDATA[Web statistics]]></category>

		<guid isPermaLink="false">http://www.blogtips.org/?p=1875</guid>
		<description><![CDATA[A serious blog is geared towards its audience. As a serious blogger, it is important you understand your audience, your readers. Simple and free tools like Google Analytics, will give you heaps of figures, but how do you make sense out of all that? What do these figures mean and what can I do with [...]]]></description>
			<content:encoded><![CDATA[<p></p><p><img class="aligncenter" title="Cairo traffic jam" src="http://theroadtothehorizon.net/photo/cairo%20traffic%20jam.jpg" alt="Cairo traffic jam" width="400" height="300" /></p>
<p>A serious blog is geared towards its audience. As a serious blogger, it is important you understand your audience, your readers.<br />
Simple and free tools like <a href="http://www.google.com/analytics/" target="_blank">Google Analytics</a> will give you heaps of figures, but how do you make sense out of all that? What do these figures mean and what can you do with them?</p>
<p><span id="more-1875"></span>In this tutorial I use a practical case study to help you analyse your traffic and answer some basic questions:</p>
<ul>
<li><a title="Analysing the figures in blog statistics" href="/understanding-the-traffic-on-your-blog-part-1/">Analysing the traffic quantity</a>: Where does my visitors&#8217; traffic come from? How much traffic do I get from search engines and social bookmarking sites? Should I make an effort to increase that traffic? Why?</li>
<li><a title="How to find quality visitors for your blog" href="/understanding-the-traffic-on-your-blog-part-2/">Analysing the traffic quality:</a> What is the difference between new visitors and returning visitors? Why should I turn an occasional visitor into a returning visitor? How much time do visitors spend on my blog, and why is that important?</li>
<li><a href="/understanding-the-traffic-on-your-blog-conclusions/">Concluding &#8211; How to get more quality traffic to my blog</a>: What is the most effective way to find new and returning visitors for my blog? Should I spend more time on social bookmarking, on search engine optimization or on discussion forums to get new visitors?</li>
</ul>
<p>&nbsp;</p>
<p>Picture courtesy <a href="http://www.flickr.com/photos/tronics/" target="_blank">Walid Hassanein</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.blogtips.org/blogging-tips-tutorial-analyse-visitors-statistics/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Should you upgrade to WordPress 3.1?</title>
		<link>http://www.blogtips.org/should-you-upgrade-to-wordpress-3-1/</link>
		<comments>http://www.blogtips.org/should-you-upgrade-to-wordpress-3-1/#comments</comments>
		<pubDate>Sat, 19 Mar 2011 17:21:00 +0000</pubDate>
		<dc:creator>Peter</dc:creator>
				<category><![CDATA[Advanced Stuff]]></category>
		<category><![CDATA[WordPress]]></category>

		<guid isPermaLink="false">http://www.blogtips.org/?p=1833</guid>
		<description><![CDATA[Recently WordPress 3.1 came out. While the (major) upgrade from WordPress 2.9 to 3.0 was in general seen as a success, 3.1 seems to be different. There are a number of known problems with the WordPress 3.1 upgrade. Most of them have to do with incompatibilities with certain plugins and themes. There are also issues [...]]]></description>
			<content:encoded><![CDATA[<p></p><div id="attachment_1835" class="wp-caption aligncenter" style="width: 400px">
	<img class="size-full wp-image-1835" title="wordpress 31" src="http://www.blogtips.org/wp-content/uploads/2011/03/wordpress-31.jpg" alt="" width="400" height="101" />
	<p class="wp-caption-text">&quot;Pending upgrades&quot;... have the effect of a red cloth on a bull.</p>
</div>
<p>Recently WordPress 3.1 came out. While the (major) upgrade from WordPress 2.9 to 3.0 was in general seen as a success, 3.1 seems to be different.</p>
<p>There are <a href="http://wordpress.org/support/topic/troubleshooting-wordpress-31-master-list">a number of known problems</a> with the WordPress 3.1 upgrade. Most of them have to do with incompatibilities with certain plugins and themes. There are also issues with custom post types and with plugins using jQuery functions (picture sliders for instance).</p>
<p>I upgraded two of my fifteen WordPress sites (including BlogTips) successfully, but got stuck with <a href="http://theweirdbit.org/">my third site</a>. After upgrading, my admin screens would work well, but the site itself returned a blank screen.</p>
<p>Panic.</p>
<p><span id="more-1833"></span>That particular WordPress site used about ten different plugins, which I disabled one by one, but in vain. I started to back up the upgraded site and to retrieve a backup copy of my old site, looking forward to a loooong day of downgrading (upgrading is a flip of a switch, downgrading is a pain).</p>
<p>In the process of doing so, I remembered one of the golden rules of debugging: &#8220;switch back to the default Twenty-Ten theme&#8221;! When I checked, it seemed that the upgrade had already switched me to the Twenty-Ten theme. I re-enabled my normal theme, and voila, the site was back up.</p>
<p>I re-enabled all plugins one by one, restored the cache files (which had all been deleted when I disabled Supercache), and the site was back up.</p>
<p>I am yet to analyse exactly what went wrong, but I suspect a conflict between WordPress 3.1 and one of the plugins. Which? That is the one million dollar question.</p>
<p>My advice is to wait it out for a few more weeks. Let the plug-in developers debug 3.1 incompatibilities first.</p>
<p>Two tips on upgrading (be it the WordPress core, themes or plugins):</p>
<ol>
<li>If you have an important site (which site is not important, hey?), then keep a mirror test site, which is a working copy of your production site. Test any upgrade first on your mirror test site. If the upgrade works, and continues to work after a couple of days, then consider upgrading your production site.</li>
<li>Before you upgrade, make sure you have a backup of both your files and your SQL database. If all goes haywire, downgrading will only be possible using your backups. Otherwise, as they say in French, you&#8217;ll be stuffed.</li>
</ol>
<p>Wishing you luck. I am waiting it out.</p>
<p><span style="color: #ff00ff;">Update April 19, 2011</span>:<br />
WordPress released version 3.1.1 which seems to cure most problems of 3.1&#8230; I have upgraded most of my sites to 3.1.1, and everything seems stable now. My advice now: go for it!</p>
<p>&nbsp;</p>
]]></content:encoded>
			<wfw:commentRss>http://www.blogtips.org/should-you-upgrade-to-wordpress-3-1/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>How I moved 350,000 blogposts from Tumblr to WordPress</title>
		<link>http://www.blogtips.org/how-i-moved-350000-blogposts-from-tumblr-to-wordpress/</link>
		<comments>http://www.blogtips.org/how-i-moved-350000-blogposts-from-tumblr-to-wordpress/#comments</comments>
		<pubDate>Mon, 31 Jan 2011 20:31:41 +0000</pubDate>
		<dc:creator>Peter</dc:creator>
				<category><![CDATA[Advanced Stuff]]></category>
		<category><![CDATA[Geeky Stuff]]></category>
		<category><![CDATA[How to... Stuff]]></category>
		<category><![CDATA[hosting]]></category>
		<category><![CDATA[mobile blogging]]></category>
		<category><![CDATA[tips]]></category>
		<category><![CDATA[tools]]></category>
		<category><![CDATA[Tumblr]]></category>
		<category><![CDATA[WordPress]]></category>

		<guid isPermaLink="false">http://www.blogtips.org/?p=1763</guid>
		<description><![CDATA[I had seven blogs on Tumblr which aggregate news. Using a technique I described earlier, they take RSS feeds from over 1,000 carefully selected websites and blogs, filter them, clean them up, and feed them into the different Tumblr blogs. I used the unique feature built into Tumblr to convert RSS feeds into posts. All [...]]]></description>
			<content:encoded><![CDATA[<p></p><div class="wp-caption aligncenter" style="width: 430px">
	<img title="Kenyan farmers" src="http://theroadtothehorizon.net/photo/preparing%20the%20fields.jpg" alt="Kenyan farmers" width="430" height="285" />
	<p class="wp-caption-text">Moving blogs: labour intensive, but fun!</p>
</div>
<p>I had seven blogs on Tumblr which aggregate news. Using a technique <a href="/rss-reversed-from-feed-to-blog/">I described earlier</a>, they take RSS feeds from over 1,000 carefully selected websites and blogs, filter them, clean them up, and feed them into the different Tumblr blogs. I used the unique feature built into Tumblr to convert RSS feeds into posts. All automatically. Pretty neat. Until it stopped working&#8230;</p>
<p>Two months ago <a href="/tumblr-problems/">Tumblr&#8217;s autoimport feature started to hiccup</a>. Tumblr support said <em>&#8220;We are aware and working on it&#8221;</em>, but could not give me any estimate when it would be fixed.</p>
<p>A month later, still nothing. So what to do? I rely on this Tumblr feature, for my blogs. In two years, those blogs collected 350,000 news articles. Quite a resource library which I did not want to give up.</p>
<p>So I decided to use the Christmas holidays to migrate these blogs from Tumblr onto WordPress, on my HostGator VPS server. It was an interesting process, involving many different techniques and debugging efforts:</p>
<p><span id="more-1763"></span></p>
<h4>1. How to export a Tumblr blog into WordPress</h4>
<div class="wp-caption aligncenter" style="width: 388px">
	<img title="From Tumblr to WordPress" src="http://theroadtothehorizon.net/photo/Tumblr%20export%20routine.jpg" alt="From Tumblr to WordPress" width="388" height="400" />
	<p class="wp-caption-text">Ben Ward&#39;s Tumblr2WordPress utility</p>
</div>
<p>I pretty much described the process and the technique to export a Tumblr blog <a href="/how-to-import-a-tumblr-blog-into-wordpress/">in an earlier post</a>: <a href="http://benapps.net/" target="_blank">Using Tumblr2WordPress</a>, a neat PHP program by Ben Ward. It uses the Tumblr API to export blogposts, and to create an .XML file, which I could import into WordPress. An API that Tumblr disables every US afternoon and evening, by the way.</p>
<p>While Ben&#8217;s program works well to export smaller Tumblr blogs, I had biiiiig blogs, so I had to adapt the PHP code. As the code is public domain and available on <a href="http://github.com/benward/tumblr2wordpress">Github</a>, I downloaded it and installed it on my server. I changed the PHP parameters to allocate a massive chunk of memory, and allow the export routine to run longer than a standard PHP program. Tip: change the parameters only for that routine, not for your whole server!</p>
<p><span style="color: #ff00ff;">Update March 1, 2010:</span><br />
Ben&#8217;s source code is still available, but the executable program is no longer available at the link I provided. You can still run Tumblr to WordPress routines based on the same engine from <a href="http://tumblr2wp.com/" target="_blank">Tumblr2WP</a> or <a href="http://haochen.me/tumblr/" target="_blank">Tumble2WordPress</a> (with thanks to Parneix and Aaron for the updates).</p>
<p>I also patched Ben&#8217;s original code to work around a smaller problem I discovered on &#8220;published dates&#8221; and &#8220;categories&#8221;. Pretty easy, even for a PHP novice like me, as the code is well documented.</p>
<p>As WordPress can only import .XML files smaller than 8 MB, I split up the first exported blog manually into smaller chunks. That took me two hours for the smallest of the seven blogs. So I decided once again to delve into the code, and wrote a small patch that allowed me to export 5,000 blog posts at a time. Each export file was now smaller than 8 MB.</p>
<p>Cool. Exported all blogs, and there I sat on 70 files, about half a gigabyte worth of .XML files.</p>
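<p>The idea behind that patch is plain batching. My patch was in PHP, inside Ben&#8217;s code; the snippet below is not that patch, just a Python illustration of the same idea, with a made-up function name and plain numbers standing in for blog posts.</p>

```python
def batches(posts, batch_size=5000):
    """Yield consecutive slices of at most batch_size posts."""
    for start in range(0, len(posts), batch_size):
        yield posts[start:start + batch_size]

# Stand-in data: plain numbers instead of real blog posts.
all_posts = list(range(12_000))
sizes = [len(batch) for batch in batches(all_posts)]
print(sizes)  # [5000, 5000, 2000]
```

<p>The arithmetic also checks out: 350,000 posts in batches of 5,000 comes to 70 export files.</p>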
<h4>2. Importing 350,000 posts into seven new WordPress blogs</h4>
<div class="wp-caption aligncenter" style="width: 400px">
	<img title="The WordPress Import screen" src="http://theroadtothehorizon.net/photo/wordpress%20import%20routine.jpg" alt="The WordPress Import screen" width="400" height="193" />
	<p class="wp-caption-text">The WordPress Import routine rocks!</p>
</div>
<p>For the seven blogs, I created seven new accounts on my HostGator VPS server. Some of my Tumblr blogs were using custom domains, so I changed the DNS, pointing to my HostGator VPS server, and created seven new WordPress blogs. While I was at it, I registered two new domains. <a href="http://changethru.info" target="_blank">ChangeThru.Info</a> became <a href="http://aidresources.org" target="_blank">AidResources.org</a> and <a href="http://youandusand.me/" target="_blank">Youandusand.me</a> became <a href="http://newsongreen.org" target="_blank">NewsOnGreen.org</a>. Wanted to do that a long time ago, so now was the right time. I kept the old domains live on Tumblr, so I did not lose any traffic.</p>
<p>Installing a new WordPress blog was easy to do with the &#8220;Fantastico&#8221; program in the server&#8217;s cPanel. I chose <a href="http://wordpress.org/extend/themes/magazine-basic" target="_blank">a neat and simple magazine template</a> and added the usual plugins I always use for caching, automated blog backup, etc.</p>
<p>Then, one by one, I imported the 500 Mbyte of export .XML files. Worked flawlessly, but took about two days. Not a single error, not a single problem. I should say: WordPress impressed me once more.</p>
<p>Done. Well at least with importing the old posts. How to feed in new posts using my myriad of RSS feeds?</p>
<h4>3. Implementing an &#8220;RSS to blogpost&#8221; routine in WordPress</h4>
<div class="wp-caption aligncenter" style="width: 400px">
	<img title="Feedwordpress" src="http://theroadtothehorizon.net/photo/feedwordpress.jpg" alt="Feedwordpress" width="400" height="305" />
	<p class="wp-caption-text">FeedWordPress, you rock!</p>
</div>
<p><a href="http://feedwordpress.radgeek.com/" target="_blank">FeedWordPress</a> made my day. This neat plugin imports RSS feeds and converts them into WordPress blogposts. And it does so very well. The plugin is well designed, easy to use, and has a lot of options to customize the import process.<br />
It also has add-ons that allow you to limit the size of the imported post, or to add text to the title or the body of the imports. Realllllly neat!</p>
<p>I configured the different feeds <a href="/how-to-combine-rss-feeds-with-yahoo-pipes/">I process via Yahoo Pipes</a>, and ran a CRON job to import the blog posts. About two hours work per blog, and the whole cake went into the oven and started cooking: FeedWordPress neatly imported the posts.</p>
<p>You would think I was done. The real work had not even started.</p>
<h4>4. The need for speed</h4>
<p>From beginning to end, including the customization of the template etc., the export from Tumblr and import into WordPress took me about a week. Fine-tuning the blogs and the server took another two weeks.</p>
<p>As &#8220;Good is Fine, Perfect is Best&#8221;, I saw some formatting deficiencies I could not live with, which needed modifications to the feeds and template. And even worse, much much much worse, my poor server was brought to its knees by the extra load. The new blogs demanded so much CPU time from Apache and the MySQL server that everything slowed down to a snail&#8217;s pace.</p>
<p>Time to get geeky!</p>
<h4>5. Tuning the XML sitemaps</h4>
<div class="wp-caption aligncenter" style="width: 363px">
	<img title="XML Sitemap for WordPress" src="http://theroadtothehorizon.net/photo/XML%20sitemap.jpg" alt="XML Sitemap for WordPress" width="363" height="400" />
	<p class="wp-caption-text">XML Sitemap: Tune it!</p>
</div>
<p>It only takes one dumb sysadmin to make the fastest server go slow, I realized while monitoring the CPU load with the Linux &#8220;top&#8221; command. I saw the &#8220;load average&#8221; peak way above &#8220;10&#8221;, meaning there were at least 10 processes queueing up for CPU time.<br />
So I looked at several WordPress plugins that might cause the problem.</p>
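<p>How bad a given load average is depends on how many CPU cores the server has: the load average counts processes that are running or waiting to run, so anything well above the number of cores means a queue. A small Python sketch of that rule of thumb follows. The function names are made up, and the 4-core server is a hypothetical example.</p>

```python
import os

def load_per_core(load_average, cpu_count):
    """Load average divided by the number of CPU cores."""
    return load_average / cpu_count

def current_load_per_core():
    """Read the 1-minute load average from the OS (Unix-like systems only)."""
    one_minute, _five, _fifteen = os.getloadavg()
    return load_per_core(one_minute, os.cpu_count() or 1)

# Hypothetical example: a load average of 10 on a 4-core server.
print(load_per_core(10.0, 4))  # 2.5
```

<p>A value above 1 means there is more work waiting than the cores can handle at that moment.</p>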
<p>The first problem I saw was <a href="http://wordpress.org/extend/plugins/google-sitemap-generator/" target="_blank">the XML sitemap generator</a>. There was one option, which I had overlooked: <em>&#8220;Rebuild sitemap if you change the content of your blog&#8221;</em>. Might be fine on smaller blogs, but the automatic feedimporters were putting up new blogposts at a rate of 100 per hour. So the server was pretty much doing nothing else but generating sitemaps.</p>
<p>I disabled the feature, and scheduled a CRON job to regenerate a sitemap once a day, at night-time. The server load went down significantly.</p>
<p>Oh, by the way, if you have a huge blog, limit the number of posts to include in the sitemap to 10,000. That keeps you well below the 50,000 URLs the sitemap protocol allows in a single sitemap file!</p>
<h4>6. Tuning WP Supercache</h4>
<div class="wp-caption aligncenter" style="width: 400px">
	<img title="WP SuperCache" src="http://theroadtothehorizon.net/photo/supercache%20screenshot.jpg" alt="WP SuperCache" width="400" height="302" />
	<p class="wp-caption-text">WP SuperCache: Tune it!</p>
</div>
<p><a href="/a-blogger-2010-wrapup/">As I mentioned before</a>: any blogger using <a href="/selecting-a-blog-platform-selfhost-your-blog-or-not/">a selfhosted blog</a> without a cache, should appear before court for &#8220;blog neglect&#8221;. So I use caching extensively, with <a href="http://wordpress.org/extend/plugins/wp-super-cache/" target="_blank">WP Supercache</a>, my preferred plugin.</p>
<p>But it only takes a stupid blog administrator to stop even the best plugin from working properly. Supercache needed tuning:</p>
<ul>
<li>Unchecked the option <em>&#8220;Clear all cache files when a post or page is published&#8221;</em> (I published 100 posts per hour, so the cache was always invalidated)</li>
<li>Pre-loaded the last 10% of blogposts, but put <em>&#8220;Refresh preloaded cache&#8221;</em> to &#8220;0&#8221; (as once a post is imported from its RSS feed, I don&#8217;t update it anymore, so it can remain in cache forever). This means I only had to pre-load a massive amount of blogposts once, and it was done.</li>
<li>For the same reason, I put the <em>&#8220;expiry time&#8221;</em> to &#8220;0&#8221;, as once cached after a preload cycle, I want the page to remain in cache. It generates a LOT of cached files, but I have plenty of disk space on my server.<br />
If you put <em>&#8220;expiry time&#8221;</em> to a value &gt; 30 minutes, garbage collection is done every 10 minutes, which generates a lot of load on your server.</li>
<li>As now I had caches with an eternal life time, I needed to ensure the homepage, feeds, archives were NOT cached, otherwise visitors never got an updated overview of the latest posts.<br />
As I discovered that AFTER I preloaded the posts, and had put the caching to &#8220;eternity&#8221;, I had to manually delete the cached files for the home page, searches and the running month&#8217;s archives.</li>
<li>To further reduce the load on the PHP server, I chose the option to use <em>&#8220;mod_rewrite to serve cache files&#8221;</em>.</li>
<li>And by the way, if you don&#8217;t cache the home page, the <em>&#8220;Cache Tester&#8221; </em>will give an error &#8211; as it tests caching on&#8230; the home page. So ignore that error, and just look at the source of any random page, to see if, at the bottom of the source, you have a date/time for the cache generation, which is in the past.</li>
</ul>
<h4>7. Trashing &#8220;Most Popular Posts&#8221;</h4>
<p>As described in <a href="/how-one-small-plug-can-slow-blog/" target="_self">this post</a>, one plugin meant to show &#8220;the most read posts&#8221; also logged every single access to the SQL database, and effectively slowed down my server. Had to trash it.</p>
<h4>8. Tuning FeedWordPress</h4>
<p>I spent quite a bit of time to tune FeedWordPress, to balance how often feeds were to be imported with the success rate of each import cycle. I combine 1,000+ feeds <a href="/how-to-combine-rss-feeds-with-yahoo-pipes/" target="_self">into about 20 Yahoo Pipes feeds</a>. These are large and complex feeds, which take a lot of time to fetch. Many times the import of a feed would time out.</p>
<p>At first I worked around that problem, by refreshing all feed imports every 10 minutes. But once again, that put a lot of pressure on the server. As you can deal with any problem either by working around it, or by addressing the cause of it, it was time to look for the source of the problem. In the process of doing so, distinguish well between what is &#8220;a cause&#8221; and what is &#8220;a symptom&#8221;. Often we try to solve the latter, while we should address the former. Think about that. That is deeeeep! <img src='http://www.blogtips.org/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> </p>
<p>So the symptom I saw was the feeds timing out. As I know the Yahoo Pipes&#8217; feeds often take very long to refresh, even interactively, I <a href="http://www.piepalace.ca/blog/2010/11/feedwordpress-broke-my-heart.html" target="_blank">had to patch FeedWordPress</a> with a timeout of 60 seconds to deal with the Yahoo Pipes&#8217; lack of speed. Cool. But that caused the dreaded SQL error &#8220;MySQL server has gone away&#8221; to appear more frequently in my CRON log files. Beh.</p>
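<p>For the curious, this is what &#8220;give a slow feed 60 seconds&#8221; looks like in code. It is a Python illustration of the idea only, not the FeedWordPress patch (which is PHP). The function name is made up, and a <code>data:</code> URL stands in for a real feed address so the example runs offline.</p>

```python
import socket
from urllib.error import URLError
from urllib.request import urlopen

def fetch_feed(url, timeout_seconds=60):
    """Return the raw feed bytes, or None on a timeout or other URL error."""
    try:
        with urlopen(url, timeout=timeout_seconds) as response:
            return response.read()
    except (URLError, socket.timeout):
        return None

# Offline demonstration with a data: URL standing in for a real feed address.
print(fetch_feed("data:text/plain,hello"))  # b'hello'
```

<p>Keep in mind that whatever waits for the slow feed holds everything else open for just as long, database connection included.</p>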
<p>To make a long story short, the solution was to also change the MySQL parameter &#8220;wait_timeout&#8221; from my server&#8217;s default of &#8220;30&#8221; seconds to &#8220;240&#8221;. I changed that in <em>/etc/my.cnf</em> and restarted the SQL server. Problem solved.</p>
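<p>A quick way to check which value a config file really sets is to read it back. The sketch below uses Python&#8217;s standard configparser on a sample text; the helper name is made up, and it assumes a simple my.cnf with the setting in the <code>[mysqld]</code> section. A real file using <code>!include</code> lines would need more care.</p>

```python
import configparser

SAMPLE_MY_CNF = """\
[mysqld]
wait_timeout = 240
"""

def read_wait_timeout(my_cnf_text):
    """Return wait_timeout from the [mysqld] section, or None if it is not set."""
    parser = configparser.ConfigParser(allow_no_value=True, strict=False)
    parser.read_string(my_cnf_text)
    if not parser.has_option("mysqld", "wait_timeout"):
        return None
    return parser.getint("mysqld", "wait_timeout")

print(read_wait_timeout(SAMPLE_MY_CNF))  # 240
```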
<p>As this solved the timeout when reading RSS feeds, I could also decrease the frequency of the FeedWordPress CRON jobs. And that once again made my server very happy. I like happy servers&#8230;!</p>
<h4>9. Server tuning</h4>
<div class="wp-caption aligncenter" style="width: 400px">
	<img title="Server crash cartoon" src="http://theroadtothehorizon.net/photo/server%20crash.jpg" alt="Server crash cartoon" width="400" height="234" />
	<p class="wp-caption-text">Tuning a server: not for the faint of heart...</p>
</div>
<p>Depending on what exactly you do on your blogs, for large and heavy traffic blogs like the seven I had just migrated, the SQL server might need tuning. This is not for the faint-of-heart, and requires patience and caution. With one wrong setting, you can cause more damage than good.</p>
<p>The first indication that my SQL parameters might need tuning was simply the fact that SQL took up so much CPU time on my server. phpMyAdmin, a tool available on just about every self-hosted server, has a neat feature called <em>&#8220;Status&#8221;</em>, which gave me an overview of the parameters which might need changing. But I was not sure. So I installed <a href="http://blog.mysqltuner.com/" target="_blank">mySQLTuner</a>: using SSH, I logged into my server as root. With just three commands, I got a better overview of the parameters I needed to change:</p>
<blockquote>
<pre><code>wget mysqltuner.pl
chmod 775 mysqltuner.pl
perl mysqltuner.pl</code></pre>
</blockquote>
<p>I changed one parameter at a time, and waited for 12-24 hours to see its effect.</p>
<p>It seems the two most important parameters to tune are <em>&#8220;key_buffer_size&#8221;</em> and <em>&#8220;table_cache&#8221;</em>. It was advised to tune these first before touching the others. Which I did. <em>&#8220;key_buffer_size&#8221;</em> was ok on its default value of 48M, but <em>&#8220;table_cache&#8221;</em> needed 4,096 instead of the default 1,024.</p>
<p>The rest of the parameters I changed over time:</p>
<blockquote>
<pre><code>tmp_table_size: from 32M to 64M</code>
<code>max_heap_table_size: from 32M to 64M</code>
<code>sort_buffer_size: from 1M to 4M</code></pre>
</blockquote>
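<p>Put together, the values mentioned above would sit in the <code>[mysqld]</code> section of <em>/etc/my.cnf</em> roughly as follows. This is a sketch assembled from the numbers in this post, not a copy of the actual file, and values that suit one server can hurt another:</p>
<blockquote>
<pre><code>[mysqld]
key_buffer_size     = 48M
# Newer MySQL versions call this setting table_open_cache
table_cache         = 4096
tmp_table_size      = 64M
max_heap_table_size = 64M
sort_buffer_size    = 4M</code></pre>
</blockquote>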
<p>While I am still observing the server and tuning bits and pieces, it looks by now that, ladies and gentlemen, we have a happy server, purring like a happy cat. Sure enough, I still get peak loads with a &#8220;load average&#8221; of 3-4, but most of the time it stays at &#8220;1&#8221; or below. And that is good. I like happy servers.</p>
<h4>10. Template tuning</h4>
<p>For the seven blogs, I use a neat little magazine template, called <a href="http://wordpress.org/extend/themes/magazine-basic" target="_blank">Magazine-Basic</a> by <a href="http://themes.bavotasan.com/" target="_blank">Bavotasan</a>. It comes for free, and has some basic functions that I like.</p>
<p>I customized the CSS and some of the functions. I sinned heavily by patching the original template, rather than using a child theme, but that is just because I made so many changes. On top of that, for my main blogs, &#8220;speed&#8221; is important: I did not want every page refresh to read several CSS files. So patching it was. I will live with the fact that I cannot upgrade the template automatically later on, but hey, I learned that upgrading templates is always a pain, and often causes more problems than it is worth. So I avoid theme upgrades like the plague. My opinion, punto.</p>
<p>One of the main challenges I faced was that the template, by default, puts the first image it finds in the post as a thumbnail on the home and archive pages. And often that image was junk (a &#8220;Retweet&#8221; button, or a Feedburner &#8220;Email this&#8221; image). So I had to tweak all my Yahoo Pipes feeds to delete those images. Took me a week. Some feeds were too complex to delete all images, so for some of the blogs, I just disabled the thumbnails in the template. I mean, I got to sleep too, so I can&#8217;t keep on tuning all feeds to catch every combination of dummy images.</p>
<p>Maybe that will be work for next Xmas!</p>
<h4>11. Installing the mobile theme</h4>
<p>And to finish it all in beauty, I <a href="/how-to-enable-mobile-theme-on-wordpress-blog/">installed a plugin to enable a mobile theme to be displayed</a> when visitors access my blog using a mobile phone.</p>
<h4>12. And here is the end result of about four weeks of work:</h4>
<div class="wp-caption aligncenter" style="width: 400px">
	<img title="From Tumblr to WordPress" src="http://theroadtothehorizon.net/photo/from%20tumblr%20to%20wordpress.jpg" alt="From Tumblr to WordPress" width="400" height="263" />
	<p class="wp-caption-text">From Tumblr to WordPress. With Love</p>
</div>
<p>Check them out on your mobile. Browse through the posts to see if you like the download speed (remember the homepage and archives are not cached, and thus a bit slower!):  <a title="a super fast collector of the latest news from countries in the aid spotlight" href="http://aidnews.org" target="_blank">AidNews</a>, <a title="aggregates the latest articles from dozens of reference sites" href="http://aidresources.org" target="_blank">AidResources</a>, <a title="summarizes the latest environmental news" href="http://newsongreen.org" target="_blank">News On Green</a>, <a title="the one and only aidworker blog collector" href="http://aidblogs.org" target="_blank">AidBlogs</a>, <a title="aggregates the posts from over 800 nonprofit blogs" href="http://nonprofitblogs.info" target="_blank">The NonProfit Blogs</a>, <a title="because life is too serious" href="http://theweirdbit.org" target="_blank">The Weird Bit</a> and <a title="updates about blogging and social media" href="http://bloggingtoday.org" target="_blank">Blogging Today</a>.<br />
You like?</p>
<p>Cartoon courtesy <a href="http://www.marklowe.com/art/graphicsv1.html" target="_blank">Mark Lowe</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.blogtips.org/how-i-moved-350000-blogposts-from-tumblr-to-wordpress/feed/</wfw:commentRss>
		<slash:comments>8</slash:comments>
		</item>
		<item>
		<title>How to combine RSS feeds with Yahoo Pipes</title>
		<link>http://www.blogtips.org/how-to-combine-rss-feeds-with-yahoo-pipes/</link>
		<comments>http://www.blogtips.org/how-to-combine-rss-feeds-with-yahoo-pipes/#comments</comments>
		<pubDate>Sun, 16 Jan 2011 18:11:13 +0000</pubDate>
		<dc:creator>Peter</dc:creator>
				<category><![CDATA[Advanced Stuff]]></category>
		<category><![CDATA[Geeky Stuff]]></category>
		<category><![CDATA[How to... Stuff]]></category>
		<category><![CDATA[RSS]]></category>
		<category><![CDATA[tools]]></category>
		<category><![CDATA[Yahoo Pipes]]></category>

		<guid isPermaLink="false">http://www.blogtips.org/?p=1724</guid>
		<description><![CDATA[Many bloggers use RSS feeds to check on the latest posts from different websites and blogs. Bloggers often use widgets to display RSS feeds from related blogs, or their Twitter and Delicious updates. Or they might simply pull RSS feeds into an RSS reader like Google Reader. Comes a time where it might be useful [...]]]></description>
			<content:encoded><![CDATA[<p></p><p><img class="aligncenter" title="Pipes" src="http://theroadtothehorizon.net/photo/pipes.jpg" alt="Pipes" width="400" height="286" /></p>
<p>Many bloggers use <a href="/what-is-rss-and-what-can-you-do-with-it/" target="_self">RSS feeds</a> to check on the latest posts from different websites and blogs. Bloggers often use widgets to display RSS feeds from related blogs, or their Twitter and Delicious updates. Or they might simply pull RSS feeds into an RSS reader like <a href="http://www.google.com/reader" target="_blank">Google Reader</a>.</p>
<p>There comes a time when it might be useful to combine different RSS feeds into one feed. Imagine you have a blog about fundraising, and you have bookmarked 10 different fundraising sites of interest to you. You would like to display an RSS widget on your blog with the latest updates from those sites. You would not want to have 10 widgets, each for one RSS feed, would you?</p>
<p>The easiest is to combine different feeds into one. And <a href="http://pipes.yahoo.com/pipes/" target="_blank">Yahoo Pipes</a> is still the best tool to do so.</p>
<p><span id="more-1724"></span>Yahoo Pipes is a versatile tool you can use to combine different content and mash it up to create new content. Here is a step-by-step tutorial for the simplest thing you can do with Yahoo Pipes: combining different feeds.</p>
<h4>Step 1: Getting started with Yahoo Pipes</h4>
<ol>
<li><a href="https://edit.yahoo.com/registration" target="_blank">Register</a> for a Yahoo account</li>
<li>Go to <a href="http://pipes.yahoo.com/pipes/" target="_blank">Yahoo Pipes</a> and log in with your Yahoo account</li>
</ol>
<h4>Step 2: Create a Pipe</h4>
<p>A &#8220;Pipe&#8221; is a workflow you create to take RSS (or other) input, mix it up, and create a new output.</p>
<p>On the home page of Yahoo Pipes, click on &#8220;Create a Pipe&#8221;. This will give you a worksheet to construct a pipe, using a nice graphical user interface.<br />
In the left column, you will see a collapsible menu with the main working modules, and the main part of the screen is your worksheet:</p>
<p><img class="aligncenter" title="Yahoo Pipes main screen" src="http://theroadtothehorizon.net/photo/yahoo%20pipes%20main%20working%20screen.jpg" alt="Yahoo Pipes main screen" width="400" height="195" /></p>
<h4>Step 3. Create a workflow for your pipe</h4>
<ol>
<li>The first thing we have to do, is to tell Yahoo Pipes where to get the input RSS feeds from, the feeds we want to combine. From the left module list, drag the module &#8220;Fetch Feed&#8221; (from the &#8220;Sources&#8221; list) into your work sheet on the right:<br />
<img class="aligncenter" title="Dragging a Yahoo Pipes Fetch Feed" src="http://theroadtothehorizon.net/photo/yahoo%20pipes%20fetch%20feed.jpg" alt="Dragging a Yahoo Pipes Fetch Feed" width="200" height="111" /></li>
<li>Note that once you dragged the &#8220;Fetch Feed&#8221; module, Yahoo Pipes  automatically also put a second module &#8220;Pipe Output&#8221; in your worksheet.  Leave it there for the time being.</li>
<li>The &#8220;Fetch Feed&#8221; module will open up, and you can fill in the different feeds you want to combine.<br />
As an example, let&#8217;s put two feeds in here, one from <a href="http://feeds2.feedburner.com/blogtips/rss" target="_blank">BlogTips</a> and one from <a href="http://feeds.feedburner.com/BloggingTodayrss" target="_blank">Blogging Today</a>. In this case, both are Feedburner feeds, but it does not matter. You can take Atom feeds, RSS 1.0 or RSS 2.0. Yahoo Pipes will process them properly.</li>
<li>After filling in the first feed URL, click the &#8220;+&#8221; icon and fill in the second feed.<br />
<img class="aligncenter" title="Fetching feeds with Yahoo Pipes" src="http://theroadtothehorizon.net/photo/fetching%20feeds%20with%20Yahoo%20Pipes.jpg" alt="Fetching feeds with Yahoo Pipes" width="200" height="89" /></li>
<li>At any time during the process of creating a pipe, you can click on a module, and as it gets highlighted in orange, the debugging window at the bottom of the sheet will change when you hit &#8220;refresh&#8221;.<br />
<img class="aligncenter" title="Yahoo Pipes debugging screen" src="http://theroadtothehorizon.net/photo/pipe%20debugging%20screen.jpg" alt="Yahoo Pipes debugging screen" width="400" height="252" />The debugging module basically shows the output of that module. You can click the triangle icons to see the different internal components.<br />
This is very useful if you create complex pipes, but for now, we will leave it aside.</li>
<li>The next step is to connect the &#8220;Fetch Feed&#8221; module to the &#8220;Pipe Output&#8221; module: Click on the blue dot (the connection dot) at the bottom of the &#8220;Fetch Feed&#8221; module, and drag it to the blue dot at the top of the &#8220;Pipe Output&#8221; module:<br />
<img class="aligncenter" title="connecting two modules in Yahoo Pipes" src="http://theroadtothehorizon.net/photo/connecting%20two%20modules%20with%20Yahoo%20Pipes.jpg" alt="connecting two modules in Yahoo Pipes" width="200" height="115" /></li>
<li>And in principle you are done now.</li>
</ol>
<h4>Step 4. How to combine many many many RSS feeds</h4>
<p>While each &#8220;Fetch Feed&#8221; module will let you enter about 20 feeds, you might need to combine more feeds than that. Easy!</p>
<ol>
<li>Drag another &#8220;Fetch Feed&#8221; module from the left &#8220;Sources&#8221; list into your worksheet</li>
<li>Fill in your feeds.</li>
<li>If you still have the first &#8220;Fetch Feed&#8221; module already connected to your &#8220;Pipe Output&#8221; module, disconnect it: Just hover over the connecting blue line, and scissors will appear. Click on the scissors, and the connection line will disappear.<br />
<img class="aligncenter" title="disconnecting modules in Yahoo Pipes" src="http://theroadtothehorizon.net/photo/disconnect%20two%20modules%20with%20Yahoo%20Pipes.jpg" alt="disconnecting modules in Yahoo Pipes" width="200" height="63" /></li>
<li>Now you need to combine both &#8220;Fetch Feed&#8221; modules into one. Drag the &#8220;Union&#8221; module from the &#8220;Operators&#8221; module list on the left, into your worksheet.<br />
<img class="aligncenter" title="union module in Yahoo Pipes" src="http://theroadtothehorizon.net/photo/union%20module%20in%20Yahoo%20Pipes.jpg" alt="union module in Yahoo Pipes" width="200" height="85" /></li>
<li>Now connect the two &#8220;Fetch Feed&#8221; modules to the input connectors of the &#8220;Union&#8221; module (at the top) and connect the output connector of the &#8220;Union&#8221; module to the &#8220;Pipe Output&#8221; module.<br />
<img class="aligncenter" title="connecting the union module in Yahoo Pipes" src="http://theroadtothehorizon.net/photo/connecting%20the%20union%20module.jpg" alt="connecting the union module in Yahoo Pipes" width="200" height="101" />Remember, you only need to do this, if you have a lot of input feeds. Each &#8220;Fetch Feed&#8221; module will easily cater for 20 input feeds.</li>
<li>If need be, you can add more &#8220;Fetch Feed&#8221; modules and connect them all to the &#8220;Union&#8221; module. I suggest limiting each pipe to 60-80 input feeds, otherwise the pipe becomes too slow.</li>
</ol>
<h4>Step 5. Save your pipe</h4>
<p>Congratulations, you have just constructed a pipe which combines your RSS feeds. Now save the Pipe.</p>
<ol>
<li>Click on the tab &#8220;Untitled&#8221; at the top left of your screen, and give the pipe a name, and click &#8220;OK&#8221;<br />
<img class="aligncenter" title="give a name to your Yahoo Pipe" src="http://theroadtothehorizon.net/photo/give%20a%20name%20to%20your%20pipe.jpg" alt="give a name to your Yahoo Pipe" width="200" height="43" /></li>
<li>Now click on the &#8220;Save&#8221; button in the right corner</li>
<li>If you have many input feeds in your pipe, saving might take a while.<br />
There <a href="/yahoo-pipes-more-down-than-up/" target="_self">have been times where Yahoo Pipes had performance problems</a> and saving just hung, or gave obscure error messages, but it seems those times are now past. (touch wood)</li>
</ol>
<h4>Step 6. Create an RSS output from your pipe</h4>
<p>While you have now successfully created a Pipes workflow, you still don&#8217;t have the combined RSS output of your pipe yet, do you? Two simple steps to get there:</p>
<ol>
<li>After you saved your pipe, click on &#8220;Run Pipe&#8221;</li>
<li>A new window will open, and a screen will appear showing &#8220;Running Pipe&#8221;<br />
<img class="aligncenter" title="running yahoo pipes" src="http://theroadtothehorizon.net/photo/running%20pipe.jpg" alt="running yahoo pipes" width="200" height="103" /></li>
<li>Depending on how many feeds you combine, it might take anywhere from 2 to 60 seconds before the items in your feed appear.<br />
Do check the bottom of your feed to see if there were any errors. You might have entered a wrong feed, or entered a home page URL instead of a feed, a common mistake!</li>
<li>Your pipe, combining your different RSS feeds, can be used with different tools (add it to My Yahoo!, to your Google Home page, to your Google Reader, get a badge to put the output on your blog, etc&#8230;). If you just want the link to the RSS output, click on &#8220;Get as RSS&#8221;<br />
<img class="aligncenter" title="Get output from Yahoo Pipes" src="http://theroadtothehorizon.net/photo/pipes%20output%20options.jpg" alt="Get output from Yahoo Pipes" width="400" height="100" /></li>
<li>And voila, there is the RSS output. Copy the URL of the RSS output and you can reuse it anywhere you want.<br />
<img class="aligncenter" title="RSS output from Yahoo Pipes" src="http://theroadtothehorizon.net/photo/rss%20output%20from%20yahoo%20pipes.jpg" alt="RSS output from Yahoo Pipes" width="400" height="346" /></li>
</ol>
<h4>Step 7. Refining your Yahoo Pipes output</h4>
<p>Now we&#8217;re done. Well&#8230; except for one thing. Well, except for two things. Well, except for three things:</p>
<ol>
<li> As it stands right now, the RSS output does not have the feed items sorted by date, which is no good. We will need to sort the RSS feed, and publish it with the most recent RSS items first.</li>
<li>If you combine many different feeds, the RSS output might contain many items, and that could clog up wherever you want to reuse the feed. We will need to limit the number of items in the Yahoo Pipes&#8217; RSS feed.</li>
<li>And again, if you use many different RSS feeds as input, one input blog might have re-used input from the other, so we will want to take out duplicate entries.</li>
</ol>
<p>So roll up your sleeves once more for the &#8220;final act&#8221;:</p>
<ol>
<li>Disconnect the &#8220;Union&#8221; and the &#8220;Pipe Output&#8221; module</li>
<li>From the &#8220;Operators&#8221; menu on the left, drag the modules &#8220;Sort&#8221;, &#8220;Unique&#8221;, &#8220;Truncate&#8221; onto your worksheet.</li>
<li>Connect them in this order, and fill in the different fields:<br />
<img class="aligncenter" title="refining a yahoo pipe" src="http://theroadtothehorizon.net/photo/refining%20yahoo%20pipe.jpg" alt="refining a yahoo pipe" width="400" height="339" /></li>
<li>In the &#8220;Sort&#8221; module, select &#8220;item.pubDate&#8221; and &#8220;descending&#8221; order</li>
<li>In the &#8220;Unique&#8221; module, select &#8220;item.title&#8221;</li>
<li>In the &#8220;Truncate&#8221; module, fill in the maximum number of items in your output field. We used 40 in our example.</li>
<li>Connect the different modules: from &#8220;Union&#8221; (or from &#8220;Fetch Feed&#8221; if you don&#8217;t use the &#8220;Union&#8221; module) to &#8220;Sort&#8221;, then &#8220;Unique&#8221;, then &#8220;Truncate&#8221;, and finally &#8220;Pipe Output&#8221;.</li>
<li>Save the pipe, and you are done. Yahoo Pipes will automatically change your output RSS according to the modifications you saved from your worksheet.</li>
</ol>
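<p>If it helps to see the logic outside the graphical editor, here is a rough Python sketch of what the &#8220;Sort&#8221;, &#8220;Unique&#8221; and &#8220;Truncate&#8221; chain does. It is not how Yahoo Pipes is implemented: the <code>refine</code> function, the dictionary layout and the sample items are made up for illustration, and keeping the first copy of a duplicate title is an assumption.</p>

```python
from datetime import datetime


def refine(items, limit=40):
    """Sort newest first, drop repeated titles, keep at most `limit` items."""
    # "Sort" step: item.pubDate in descending order.
    newest_first = sorted(items, key=lambda item: item["pubDate"], reverse=True)

    # "Unique" step: keep the first item seen for each title (assumption).
    seen_titles = set()
    unique_items = []
    for item in newest_first:
        if item["title"] in seen_titles:
            continue  # an older copy of a title we already kept
        seen_titles.add(item["title"])
        unique_items.append(item)

    # "Truncate" step: cap the number of items in the output.
    return unique_items[:limit]


# Hypothetical sample items standing in for the combined feeds.
items = [
    {"title": "Post A", "pubDate": datetime(2011, 1, 14)},
    {"title": "Post B", "pubDate": datetime(2011, 1, 16)},
    {"title": "Post A", "pubDate": datetime(2011, 1, 15)},
]

for item in refine(items, limit=2):
    print(item["pubDate"].date(), item["title"])
```

<p>Sorting before de-duplicating means that, under the assumption above, the copy of a repeated title that survives is the most recent one.</p>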
<p>So now you are really done. Unless you want to&#8230;</p>
<h4>Step 8. Do more sophisticated stuff with Yahoo Pipes. (optional)</h4>
<p>Yahoo Pipes allows you to do many many many different things. You can take input and mash it up to create virtually new content. You can check out the pipes others have created by clicking &#8220;Browse&#8221; on Pipes&#8217; home page. People come up with the weirdest and most sophisticated ideas, using translators, automatically integrating Flickr pictures in a pipe, etc&#8230;</p>
<p>However, the most commonly needed option, is to change or filter out some of the content from the input pipes. For that, the &#8220;Regex&#8221; module from the &#8220;Operators&#8221; menu is the most used workhorse.</p>
<p>There are few &#8220;programming syntaxes&#8221; which are as geeky as &#8220;Regex&#8221;, so beware, this is not for the &#8220;faint of heart&#8221;. But just one example: Let&#8217;s say you want to delete all images from the input feeds, and generate a &#8220;text only&#8221; output feed. And as we started this post saying you have different blogs and sites about &#8220;fundraising&#8221;, let&#8217;s change the output marking any occurrence of the word &#8220;fundraising&#8221; in bold:</p>
<ol>
<li>Disconnect the &#8220;Truncate&#8221; module from the &#8220;Pipe Output&#8221; module.</li>
<li>Drag the &#8220;Regex&#8221; module from the &#8220;Operators&#8221; menu on the left, into your worksheet</li>
<li>In the &#8220;In&#8221; field, select which part of the RSS feed you want to work with. In most feeds, the actual content of every RSS item is stored in the &#8220;item.description&#8221; field, so select that field<br />
Some RSS feeds store their content in other fields, so you might want to use the debugger view of Pipes to find out where the content is stored.</li>
<li>To replace all images, we need to take out the &#8220;img&#8221; tag from the content. So we want to replace anything from the beginning of the img tag until the end, with &#8220;nothing&#8221;. The Regex formula for that is:<code>(&lt;img.*?&gt;)</code><br />
&#8230;which means: <em>take any string beginning with &#8220;&lt;img&#8221;, followed by anything (the &#8220;.*?&#8221; part) and ending with &#8220;&gt;&#8221;</em>. We need to replace that with &#8220;nothing&#8221; so leave the &#8220;with&#8221; field empty.<br />
And we want to replace all the occurrences of this in the input, so we click the &#8220;g&#8221; option in the regex module.</li>
<li>We will now look for all occurrences of the word &#8220;fundraising&#8221; (including those starting with a capital &#8220;F&#8221;), and put the &#8220;bold&#8221; tags around it. Pfft, easy!</li>
<li>Our regex module will then look like this:<br />
<img class="aligncenter" title="Yahoo Pipes Regex module" src="http://theroadtothehorizon.net/photo/pipes%20regex%20module.jpg" alt="Yahoo Pipes Regex module" width="400" height="185" /></li>
<li>Don&#8217;t forget to connect the &#8220;Truncate&#8221;, &#8220;Regex&#8221; and &#8220;Pipe Output&#8221; modules.</li>
<li>Save your pipe, and once again, you are done.</li>
</ol>
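<p>To try the two regex rules outside Yahoo Pipes, a small Python sketch like the one below can help. The image pattern is the one given in the steps above. The &#8220;fundraising&#8221; pattern and the bold replacement are my assumption of what such a rule could look like, and the <code>clean_description</code> function and sample text are hypothetical.</p>

```python
import re

# Pattern from the steps above: "<img", then as little as possible, then ">".
IMG_TAG = re.compile(r"(<img.*?>)")

# Assumed pattern for the second rule: matches "fundraising" and "Fundraising".
FUNDRAISING = re.compile(r"([Ff]undraising)")


def clean_description(html):
    """Strip every img tag, then wrap each 'fundraising' in bold tags."""
    # An empty replacement deletes the match; sub() replaces every
    # occurrence, which corresponds to ticking the "g" option.
    without_images = IMG_TAG.sub("", html)
    # \1 re-inserts the matched word, so the original capitalisation is kept.
    return FUNDRAISING.sub(r"<b>\1</b>", without_images)


# Hypothetical item description, as it might appear in item.description.
sample = 'Fundraising tips <img src="a.gif" alt="x" /> for fundraising teams'
print(clean_description(sample))
```

<p>The question mark in <code>.*?</code> matters: without it the match would run from the first &#8220;&lt;img&#8221; to the last &#8220;&gt;&#8221; on the line and swallow the text in between.</p>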
<p>As said, you can go as complex as you want. If you get into trouble, call on your peers for support in the <a href="http://discuss.pipes.yahoo.com/" target="_blank">Yahoo Pipes Discussion Forum</a>. People will be ready to help you.</p>
<h4>9. What do I use Yahoo Pipes for?</h4>
<p>I use Yahoo Pipes for many different things.</p>
<p>I have different pipes where I take feeds from dozens of news sites and combine them into a single RSS output, which I read using Google Reader on my laptop, iPad and iPhone.</p>
<p>I also use Pipes to channel different feeds onto several Twitter and Facebook accounts, <a href="/the-diagram-of-a-blog-network/">using a series of different techniques</a>.</p>
<p>But I mostly use Yahoo Pipes to combine, process and mashup feeds from over 1,000 sites and blogs about aid, development, humanitarian issues and the environment. I republish summaries, as I describe <a href="/rss-reversed-from-feed-to-blog/">in an earlier post</a>.<br />
Check out the end result in <a title="my mega news collector" href="http://www.humanitariannews.org" target="_blank">Humanitarian News</a>, <a title="a super fast collector of the latest news from countries in the aid spotlight" href="http://aidnews.org" target="_blank">AidNews</a>, <a title="aggregates the latest articles from dozens of reference sites" href="http://aidresources.org" target="_blank">AidResources</a>, <a title="summarizes the latest environmental news" href="http://newsongreen.org" target="_blank">News On Green</a>, <a title="the one and only aidworker blog collector" href="http://aidblogs.org" target="_blank">AidBlogs</a> and <a title="aggregates the posts from over 800 nonprofit blogs" href="http://nonprofitblogs.info" target="_blank">The NonProfit Blogs</a>.<br />
In <a title="aggregates the posts from blogs about blogging" href="http://bloggingtoday.org" target="_blank">Blogging Today</a> I also aggregate the latest updates from &#8220;Blogs about Blogging&#8221;, and <a title="aggregates weird news" href="http://theweirdbit.org" target="_blank">The Weird Bit</a> keeps a smile on my face&#8230;</p>
<p>Oh, and as a closing tip: if you want to go over &#8220;How to create a Yahoo Pipe&#8221; in a video, check this out:</p>
<p><object classid="clsid:d27cdb6e-ae6d-11cf-96b8-444553540000" width="400" height="251" codebase="http://download.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=6,0,40,0"><param name="allowFullScreen" value="true" /><param name="AllowScriptAccess" value="always" /><param name="bgcolor" value="#000000" /><param name="flashVars" value="id=13878389&amp;vid=5260536&amp;lang=en-us&amp;intl=us&amp;thumbUrl=http%3A//l.yimg.com/a/p/i/bcst/videosearch/9326/87078068.jpeg&amp;embed=1" /><param name="src" value="http://d.yimg.com/static.video.yahoo.com/yep/YV_YEP.swf?ver=2.2.46" /><param name="flashvars" value="id=13878389&amp;vid=5260536&amp;lang=en-us&amp;intl=us&amp;thumbUrl=http%3A//l.yimg.com/a/p/i/bcst/videosearch/9326/87078068.jpeg&amp;embed=1" /><param name="allowfullscreen" value="true" /><embed type="application/x-shockwave-flash" width="400" height="251" src="http://d.yimg.com/static.video.yahoo.com/yep/YV_YEP.swf?ver=2.2.46" flashvars="id=13878389&amp;vid=5260536&amp;lang=en-us&amp;intl=us&amp;thumbUrl=http%3A//l.yimg.com/a/p/i/bcst/videosearch/9326/87078068.jpeg&amp;embed=1" bgcolor="#000000" allowscriptaccess="always" allowfullscreen="true"></embed></object><br />
Have fun with Pipes!</p>
<p>Picture courtesy <a href="http://www.rajtilak.net" target="_blank">Rajtilak</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.blogtips.org/how-to-combine-rss-feeds-with-yahoo-pipes/feed/</wfw:commentRss>
		<slash:comments>23</slash:comments>
		</item>
	</channel>
</rss>

