<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Blog Tips &#187; Geeky Stuff</title>
	<atom:link href="http://www.blogtips.org/category/geeky/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.blogtips.org</link>
	<description>Blogging and Social Media for Nonprofit</description>
	<lastBuildDate>Tue, 31 Jan 2012 15:23:38 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.2.1</generator>
		<item>
		<title>How to secure WordPress timthumb.php</title>
		<link>http://www.blogtips.org/how-to-secure-wordpress-timthumb-php/</link>
		<comments>http://www.blogtips.org/how-to-secure-wordpress-timthumb-php/#comments</comments>
		<pubDate>Fri, 16 Sep 2011 09:12:48 +0000</pubDate>
		<dc:creator>Peter</dc:creator>
				<category><![CDATA[Advanced Stuff]]></category>
		<category><![CDATA[Geeky Stuff]]></category>
		<category><![CDATA[How to... Stuff]]></category>
		<category><![CDATA[hackers]]></category>
		<category><![CDATA[security]]></category>
		<category><![CDATA[WordPress]]></category>

		<guid isPermaLink="false">http://www.blogtips.org/?p=2129</guid>
		<description><![CDATA[If you have a self-hosted WordPress blog (WordPress.org), take urgent measures to secure your site against a recently discovered vulnerability. Many WordPress themes and plug-ins use a script called &#8220;timthumb&#8221; (timthumb.php). This is the most common code used to create thumbnails from pictures. At the end of July, a vulnerability surfaced that lets external users upload malicious code [...]]]></description>
			<content:encoded><![CDATA[<p></p><p><img class="aligncenter" title="matrix" src="http://theroadtothehorizon.net/photo/matrix.jpg" alt="matrix" width="400" height="304" /></p>
<p>If you have a self-hosted WordPress blog (WordPress.org), take urgent measures to secure your site against a recently discovered vulnerability.</p>
<p>Many WordPress themes and plug-ins use a script called &#8220;timthumb&#8221; (timthumb.php). This is the most common code used to create thumbnails from pictures.</p>
<p>At the end of July, a vulnerability surfaced that lets external users upload malicious code onto your site. Typically, a short piece of PHP code is uploaded through the timthumb backdoor. That code then opens a wider backdoor, giving the attacker pretty much full access to your site.</p>
<p><span id="more-2129"></span>It looks like the hackers were on holiday too, and are only now gearing up their activity. Many sites were hacked in the last couple of days. As so many sites use timthumb.php, we can expect a major hacking spree in the coming weeks.</p>
<p>So it is high time to secure your self-hosted WordPress site.</p>
<h3>How to check if you have been hacked via timthumb.php?</h3>
<p>Unlike <a title="GoDaddy hacked" href="/godaddy-sites-hacked-again/">the shared hosting hack</a> last year, this hack has no single signature, but here are some common things that seem to happen:</p>
<ul>
<li>First&#8230; Check if you actually use &#8220;timthumb.php&#8221; on your site. It does not come with the default WordPress installation, so check if any of your installed themes or plug-ins contain the file &#8220;timthumb.php&#8221;.<br />
Do a site-wide search (with SSH or SFTP). Some popular plug-ins that use timthumb.php are &#8220;WordPress Popular Posts&#8221; and &#8220;WP Mobile Detector&#8221;.<br />
Many themes use timthumb.php, or a variation of it. E.g. the widely used &#8220;Thesis&#8221; theme ships it as &#8220;thumb.php&#8221;.</li>
<li>If you find timthumb.php in your plugins or themes directory, give your site a thorough check, starting with the points below.</li>
<li>The hackers often upload .php files into the timthumb cache directory &#8220;/cache&#8221; (a subdirectory of the one where the timthumb script is stored). Check that directory, and delete any non-picture files (.html, .php, &#8230;).</li>
<li>Hackers often upload .php files to several other subdirectories within your WordPress installation too. I have seen them in the &#8220;/upload&#8221; and &#8220;/supercache&#8221; directories (and their subdirectories) as well as in the directories for plugins and themes. Delete them.</li>
<li>Recently, the hackers got bolder and uploaded entire subdirectories. First a .zip file is uploaded, then it is unzipped, installing an entire sub-site in one of the WordPress directories. I have seen zip files called halifaxsecurity.zip, hal.zip, studentloanupdate.zip and student.zip. Malicious subdirectories I detected on other sites were called /halifaxsecurity, /hal and /studentloanupdate. Delete those if you find them.</li>
<li>People also report direct hacks of .php files and style sheets, adding malicious code (similar to last year&#8217;s hacks).</li>
<li>Check your .htaccess files for changes you did not make.</li>
<li>..</li>
</ul>
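<p>A minimal sketch of the site-wide search, assuming shell access over SSH. The function names are illustrative, and the &#8220;uploads&#8221; directory name is my assumption for a standard WordPress layout. Both helpers only list candidates for manual review; they delete nothing.</p>

```shell
# Sketch only: both helpers print paths for manual review; nothing is deleted.

# List every copy of the script under a directory, including the
# "thumb.php" rename used by some themes.
find_timthumb() {
    find "$1" -type f \( -name 'timthumb.php' -o -name 'thumb.php' \)
}

# List .php files sitting in cache or uploads directories, the places
# where uploaded backdoors tend to land. Some plugins may keep legitimate
# .php files there, so inspect each hit before removing it.
find_suspect_php() {
    find "$1" -type f -name '*.php' \( -path '*/cache/*' -o -path '*/uploads/*' \)
}
```

<p>Run them against your WordPress root, for example <code>find_timthumb /path/to/wordpress</code> (a placeholder path).</p>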
<p>Also check <a href="http://blog.sucuri.net/" target="_blank">Sucuri&#8217;s blog</a> for more hack signatures and scripts, and <a href="http://markmaunder.com/2011/08/01/zero-day-vulnerability-in-many-wordpress-themes/" target="_blank">Mark Maunder&#8217;s blogpost</a> for a full description of the timthumb vulnerability.<br />
You can find (non-exhaustive) lists of themes and plugins that use timthumb.php on <a href="http://www.big-webmaster.com/themes-scanned-timthumb-vulerability/" target="_blank">Big Webmaster</a> and <a href="http://blog.sucuri.net/2011/08/attacks-against-timthumb-php-in-the-wild-list-of-themes-and-plugins-being-scanned.html" target="_blank">Sucuri&#8217;s blog</a>.</p>
<p>Deleting those malicious files is not sufficient, as it leaves the backdoor open for future hacks, <strong>so you need to secure your timthumb.php code NOW</strong>! Read on:</p>
<h3>How to secure timthumb.php against hacks?</h3>
<ol>
<li>Locate all instances of timthumb.php (or any renames of it) on your site.</li>
<li><a href="http://timthumb.googlecode.com/svn/trunk/timthumb.php" target="_blank">Download the newest timthumb.php code</a> (also check <a href="http://code.google.com/p/timthumb/" target="_blank">the script&#8217;s home page</a>).</li>
<li>Replace the old timthumb.php with your downloaded code.</li>
<li>While the new code is already secure, I strongly suggest that you also stop the script from fetching images from external sites.<br />
Replace the line:<br />
<code>define ('ALLOW_EXTERNAL', TRUE);</code><br />
with:<br />
<code>define ('ALLOW_EXTERNAL', FALSE);</code></li>
</ol>
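<p>To double-check that no copy was missed, a small test like the following can flag files that still allow external sites. It is a sketch under the assumption that the setting is written as a plain <code>define</code> call like the one shown in the steps; the function name is illustrative, and the match is textual, not a PHP parser.</p>

```shell
# Succeeds (exit status 0) if the given file still defines ALLOW_EXTERNAL as TRUE.
# The pattern tolerates extra whitespace, either quote style and any letter case.
allows_external() {
    grep -Eiq "define[[:space:]]*\([[:space:]]*['\"]ALLOW_EXTERNAL['\"][[:space:]]*,[[:space:]]*true[[:space:]]*\)" "$1"
}
```

<p>Call it once for each timthumb.php (or renamed copy) you located; a zero exit status means that file still needs the edit.</p>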
<p>Good luck!</p>
<p>Picture courtesy <a href="http://www.tgdaily.com" target="_blank">TGDaily</a><br />
With thanks to <a href="http://ictkm.cgiar.org" target="_blank">Michael Marus</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.blogtips.org/how-to-secure-wordpress-timthumb-php/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>When things break &#8211; Yet another look inside the workshop for a self-hosted blog</title>
		<link>http://www.blogtips.org/when-things-break-yet-another-look-inside-the-workshop-for-a-self-hosted-blog/</link>
		<comments>http://www.blogtips.org/when-things-break-yet-another-look-inside-the-workshop-for-a-self-hosted-blog/#comments</comments>
		<pubDate>Thu, 30 Jun 2011 15:39:44 +0000</pubDate>
		<dc:creator>Peter</dc:creator>
				<category><![CDATA[Geeky Stuff]]></category>
		<category><![CDATA[hosting]]></category>
		<category><![CDATA[speed]]></category>
		<category><![CDATA[WordPress]]></category>

		<guid isPermaLink="false">http://www.blogtips.org/?p=2061</guid>
		<description><![CDATA[You can classify blogs into categories using many different criteria. From a blog-administrator&#8217;s point of view, the main classification is whether a blog is self-hosted on your own server (such as WordPress.org) or hosted by the blogging service itself (such as Blogger, Tumblr, Posterous, WordPress.com,&#8230;) Blogs hosted by the blogging service make you dependent on their [...]]]></description>
			<content:encoded><![CDATA[<p></p><p><img class="aligncenter" title="Do-it-yourself tools" src="http://theroadtothehorizon.net/photo/do-it-yourself%20tools.jpg" alt="Do-it-yourself tools" width="300" height="300" /></p>
<p>You can classify blogs into categories using many different criteria. From a blog administrator&#8217;s point of view, the main classification is whether <a href="/selecting-a-blog-platform-selfhost-your-blog-or-not/">a blog is self-hosted on your own server</a> (such as <a href="http://wordpress.org">WordPress.org</a>) or hosted by the blogging service itself (such as <a href="http://blogger.com">Blogger</a>, <a href="http://tumblr.com">Tumblr</a>, <a href="http://posterous.com">Posterous</a>, <a href="http://wordpress.com">WordPress.com</a>, &#8230;).</p>
<p><a href="/selecting-a-blog-platform-selfhost-your-blog-or-not/">Blogs hosted by the blogging service</a> make you dependent on their uptime and functionality. Self-hosted blogs <a href="/the-difference-between-shared-hosting-and-dedicated-hosting/">allow you to take the tiller into your own hands</a>, giving you much more freedom to sculpt the blog as you see fit. But self-hosting also dumps <a href="/selfhosting-or-not-hackers/">all the work</a> of backing up, upgrading and maintaining your blog into your lap. &#8220;With freedom also comes responsibility&#8221;: did your parents not tell you that when you were allowed to go to a party for the first time? <img src='http://www.blogtips.org/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> </p>
<p>I have posted many times before about the work involved <a href="/how-one-small-plug-can-slow-blog/">to keep self-hosted blogs up-to-date</a> and <a href="/my-blogging-life-in-one-picture/">up-and-running</a>.</p>
<p>I just wrapped up yet another adventure with one of my self-hosted blogs. Let me share it with you, as this is yet <a href="/selfhosting-your-blog-or-not/">another reminder</a> to all those of you thinking about self-hosting your blog.</p>
<p><span id="more-2061"></span></p>
<h4>When things go wrong&#8230;</h4>
<p><a href="http://newsongreen.org" target="_blank">News On Green</a> is my main aggregator for environmental news. It is hosted on my Hostgator VPS server, like most of my other blogs, and has about 130,000 blogposts. I monitor this server&#8217;s performance continuously, and three days ago I saw its resource usage go haywire. The amount of CPU time consumed went through the roof, free memory got exhausted and everything slowed down to the point where I could not even log into the server anymore.</p>
<p>I had seen <a href="/my-server-performance-went-down/">instances like that before</a>, when the caching broke on some of the blogs. As my sites get a lot of traffic (currently around 70,000 visits per month, excluding all the RSS feed polls, search engine crawlers, &#8230;), proper caching is critical. If caching breaks on even one of the sites, the server has to go to its MySQL database for every single page visit, and the server goes belly up under the CPU load.</p>
<p>This is apparently what happened on <a href="http://newsongreen.org" target="_blank">News On Green</a>, where I use the <a href="http://wordpress.org/extend/plugins/wp-super-cache/">WP Supercache</a> plugin for caching. For each individual page or post on your blog, Supercache creates a subdirectory in the /cache/supercache directory. That&#8217;s where it stores a &#8220;pre-cooked&#8221; .html file, so it does not have to run MySQL queries for each page view. Static HTML files are served much faster than pages built from MySQL queries.</p>
<h4>Debugging, once more&#8230;</h4>
<p>To help with debugging, Supercache also puts a comment line at the bottom of the HTML page for each post. It contains some statistics or error messages. This time, I saw the error &#8220;can not create /cache/supercache/subdirectory/xxx.tmp&#8221;.</p>
<p>The error.log came up with all sorts of errors, including:</p>
<blockquote><p>PHP Notice: Undefined index: HTTP_ACCEPT in xxx/wp-content/plugins/wp-super-cache/wp-cache-phase1.php on line 415</p></blockquote>
<p>So Supercache was no longer caching the pages; that much was clear. But the question was why.</p>
<p>I checked the /supercache directory for the blog, and found that I had 31,998 cached pages, thus 31,998 directories: one subdirectory for each cached page or post. When I see numbers anywhere close to a magic power-of-two figure, I get suspicious: 32, 64, 16,000, 32,000&#8230; And as 31,998 pages was close to one of those magic figures (32,000), I suspected a system or account limitation on my server.</p>
<p>I called up my hosting company, <a href="http://hostgator.com">Hostgator</a>, via an online chat channel. The technician was really patient. Whereas other hosting services would have answered &#8220;This is a WordPress problem&#8221;, their support took the trouble of checking several system parameters. But we could not find anything limiting the number of files or subdirectories per directory&#8230;</p>
<p>I was still convinced I had hit some hard threshold. As an experiment, I deleted some 100 cached pages, and lo and behold, the cache started working again. Eureka, workaround found! But what was the cause of the problem?</p>
<p>While online, I searched the internet for &#8220;maximum Linux subdirectories&#8221;, and found <a href="http://www.linuxquestions.org/questions/linux-general-1/maximum-number-of-files-per-folder-877599/">this post</a>. It turns out to be a hard limit of the ext2 and ext3 filesystems, which many Linux hosts use (ext4 allows more). A directory can carry at most 32,000 links, two of which are always in use, so the maximum number of subdirectories in one single directory is&#8230; 31,998.</p>
<p>Bingo! That was exactly the number of subdirectories I had in my cache directory.</p>
<h4>The solution&#8230;</h4>
<p>So what was the solution? I deleted the oldest 5,000 cached pages. As if by miracle, the crippled server rose again like a phoenix.</p>
<p>The permanent solution will be, of course, to automate this check: a CRON job should automatically trim the oldest cached pages once I come close to the limit of 32,000. Or 31,998 to be exact. Work to do in the coming days.</p>
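<p>Such a trim job could look roughly like the sketch below. It is not the script I ended up running: the function name, the path in the example and the margin of 30,000 are all illustrative, and it assumes GNU find, as found on most Linux hosts. It uses each subdirectory&#8217;s modification time as a stand-in for the age of the cached page.</p>

```shell
# Sketch: keep at most $2 subdirectories in $1 by deleting the oldest ones.
# Relies on GNU find (-mindepth, -maxdepth, -printf).
# Example (hypothetical path): trim_oldest_dirs /path/to/cache/supercache/example.org 30000
trim_oldest_dirs() {
    dir=$1
    keep=$2
    count=$(find "$dir" -mindepth 1 -maxdepth 1 -type d | wc -l)
    excess=$((count - keep))
    [ "$excess" -gt 0 ] || return 0
    # Print "mtime path" for each subdirectory, sort oldest first,
    # and remove only as many as exceed the limit.
    find "$dir" -mindepth 1 -maxdepth 1 -type d -printf '%T@ %p\n' |
        sort -n | head -n "$excess" | cut -d' ' -f2- |
        while IFS= read -r old; do
            rm -rf -- "$old"
        done
}
```

<p>Called from a nightly CRON entry, this keeps the directory count safely under the limit without emptying the whole cache.</p>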
<p>Now how is that for a nerdy story, hey?</p>
<p>&nbsp;<br/>Picture courtesy <a href="http://www.kaboodle.com/" target="_blank">Kaboodle</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.blogtips.org/when-things-break-yet-another-look-inside-the-workshop-for-a-self-hosted-blog/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Help! My server performance went down!</title>
		<link>http://www.blogtips.org/my-server-performance-went-down/</link>
		<comments>http://www.blogtips.org/my-server-performance-went-down/#comments</comments>
		<pubDate>Fri, 01 Apr 2011 11:54:32 +0000</pubDate>
		<dc:creator>Peter</dc:creator>
				<category><![CDATA[Geeky Stuff]]></category>
		<category><![CDATA[Google Webmaster]]></category>
		<category><![CDATA[hosting]]></category>
		<category><![CDATA[search engines]]></category>
		<category><![CDATA[SEO]]></category>
		<category><![CDATA[speed]]></category>

		<guid isPermaLink="false">http://www.blogtips.org/?p=1908</guid>
		<description><![CDATA[A few months ago, I migrated 7 blogs with 350,000 blogposts from Tumblr to WordPress on my VPS server. Together with some other blogs and websites, that server happily processed about 50,000 visitors per month, with a steady increase of about 10% per month. All sites and the SQLserver were properly tuned, and the server [...]]]></description>
			<content:encoded><![CDATA[<p></p><p><img class="aligncenter" title="hour glass" src="http://theroadtothehorizon.net/photo/hourglass.jpg" alt="hour glass" width="240" height="240" /></p>
<p>A few months ago, <a title="How to move from Tumblr to WordPress" href="/how-i-moved-350000-blogposts-from-tumblr-to-wordpress/">I migrated 7 blogs with 350,000 blogposts from Tumblr to WordPress</a> on my VPS server.</p>
<p>Together with some other blogs and websites, that server happily processed about 50,000 visitors per month, with a steady increase of about 10% per month. All sites and the MySQL server <a title="How a plugin can slow down your blog" href="/how-one-small-plug-can-slow-blog/">were properly tuned</a>, and the server purred happily like a cat next to a stove.</p>
<p>Until this week. Help!</p>
<p><span id="more-1908"></span>All of a sudden, I could see an excessive number of PHP requests coming in, and the server&#8217;s load average went from 1-2 to 10-20. That means that at any moment, up to 20 processes were running or waiting for CPU time&#8230; Everything slowed down. What now?</p>
<p>It turned out I had two problems:</p>
<h3>The cache broke on one of the blogs</h3>
<p><a title="My Aid News aggregator" href="http://aidnews.org">AidNews</a>, one of my main aggregator blogs, turned out to generate an excessive PHP load. That wasn&#8217;t normal: most content was cached, so there should have been no need to call PHP&#8230;?!</p>
<p>When I loaded some blogposts and looked at the page source, I could see a debugging message stating that SuperCache could not cache the page. So one of my busiest sites no longer cached its pages, making each page load go to the SQL database via a PHP call.</p>
<p>I changed several cache settings, but in vain. What worked on the other blogs no longer worked on this blog. The cache went RIP. In the end, I had to delete the plug-in, erase all cache directories and reinstall the plug-in.</p>
<p>Problem cured. At least one of the problems. The high server load continued, even without the excessive PHP load from that single blog.</p>
<h3>Excessive crawler traffic</h3>
<p>After installing the new blogs, I made a good effort to tune the SEO (Search Engine Optimization). Nightly sitemaps are submitted to several search engines, and the crawlers were happily crawling my pages.</p>
<p>But all too happily, it seemed. When I checked the crawler stats for each of my sites in <a href="http://www.google.com/webmasters/">Google Webmaster</a>, I found the Google crawler visited <strong>5,000 to 10,000 posts per day on each of the sites</strong>. That makes close to <strong>100,000 page visits per day</strong>, only for Google.<br />
I mean, I like a good crawler rate, but that was excessive. Most posts on the aggregation blogs I have on the server never change once they are published, so there is no reason for the crawlers to keep revisiting them. Using the same method <a href="/how-to-make-google-discover-more-posts-on-your-blog/">I described earlier</a> to increase the crawler rate, I forced the rate down.</p>
<p>Voila: a &#8220;robots.txt&#8221; with two lines of code went into each blog&#8217;s root directory, and within 12 hours, the server started purring again&#8230; Like a cat next to a stove.</p>
<p><span style="color: #ff00ff;">Update Oct 7 2011:</span><br />
Over the past 14 days, I have been monitoring my server performance more closely, as there were sudden mysterious spikes in activity. Last night was one of those occasions. I took a snapshot of the server, and together with the folks over at <a href="http://hostgator.com" target="_blank">Hostgator</a> discovered that crawlers were having a feast on my websites. At one single moment, crawlers were accessing over 7,000 pages simultaneously on one site. 7,000! Few servers will sustain that&#8230;<br />
Again, I love crawlers, but I like a fast website more. So I have now tweaked the robots.txt to allow only the few popular crawlers. 95% of the search traffic comes from Google anyway, so why would I care about dozens of unknown crawlers?</p>
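<p>An allow-list of that kind can be generated with a few lines of shell. This is a sketch, not my exact file: the function name is illustrative and the crawler names you pass in are your own choice. Keep in mind that robots.txt is only a request, which well-behaved crawlers honour and others may ignore.</p>

```shell
# Print a robots.txt that admits only the crawlers named as arguments
# and turns every other crawler away.
# Example: robots_allowlist Googlebot bingbot   (redirect the output to robots.txt)
robots_allowlist() {
    for bot in "$@"; do
        # An empty Disallow line means nothing is off limits for this crawler.
        printf 'User-agent: %s\nDisallow:\n\n' "$bot"
    done
    # Everyone not named above is asked to stay out of the whole site.
    printf 'User-agent: *\nDisallow: /\n'
}
```

<p>The output goes into the root directory of each blog, next to the WordPress index file.</p>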
<h3>Lessons learned?</h3>
<p>Well, <a title="Should you selfhost a blog or not?" href="/selecting-a-blog-platform-selfhost-your-blog-or-not/">if you self-host a blog</a> then it&#8217;s not &#8220;just a matter of configuring the server and the blogs&#8221;. You need to keep a close eye on things. From time to time something goes haywire, and unless you act fast, everything spins out of control. One more reason to think really hard about whether you want to switch from a hosted blog platform to a shared or dedicated server&#8230; You need to be ready to spend more time on technical stuff than on blogging.</p>
<p>Happy blogging!<br />
Picture courtesy <a href="http://www.flickr.com/photos/bogenfreund/" target="_blank">Bogenfreund</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.blogtips.org/my-server-performance-went-down/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>How I moved 350,000 blogposts from Tumblr to WordPress</title>
		<link>http://www.blogtips.org/how-i-moved-350000-blogposts-from-tumblr-to-wordpress/</link>
		<comments>http://www.blogtips.org/how-i-moved-350000-blogposts-from-tumblr-to-wordpress/#comments</comments>
		<pubDate>Mon, 31 Jan 2011 20:31:41 +0000</pubDate>
		<dc:creator>Peter</dc:creator>
				<category><![CDATA[Advanced Stuff]]></category>
		<category><![CDATA[Geeky Stuff]]></category>
		<category><![CDATA[How to... Stuff]]></category>
		<category><![CDATA[hosting]]></category>
		<category><![CDATA[mobile blogging]]></category>
		<category><![CDATA[tips]]></category>
		<category><![CDATA[tools]]></category>
		<category><![CDATA[Tumblr]]></category>
		<category><![CDATA[WordPress]]></category>

		<guid isPermaLink="false">http://www.blogtips.org/?p=1763</guid>
		<description><![CDATA[I had seven blogs on Tumblr which aggregate news. Using a technique I described earlier, they take RSS feeds from over 1,000 carefully selected websites and blogs, filter them, clean them up, and feed them into the different Tumblr blogs. I used the unique feature built into Tumblr to convert RSS feeds into posts. All [...]]]></description>
			<content:encoded><![CDATA[<p></p><div class="wp-caption aligncenter" style="width: 430px">
	<img title="Kenyan farmers" src="http://theroadtothehorizon.net/photo/preparing%20the%20fields.jpg" alt="Kenyan farmers" width="430" height="285" />
	<p class="wp-caption-text">Moving blogs: labour intensive, but fun!</p>
</div>
<p>I had seven blogs on Tumblr which aggregate news. Using a technique <a href="/rss-reversed-from-feed-to-blog/">I described earlier</a>, they take RSS feeds from over 1,000 carefully selected websites and blogs, filter them, clean them up, and feed them into the different Tumblr blogs. I used the unique feature built into Tumblr to convert RSS feeds into posts. All automatically. Pretty neat. Until it stopped working&#8230;</p>
<p>Two months ago <a href="/tumblr-problems/">Tumblr&#8217;s autoimport feature started to hiccup</a>. Tumblr support said <em>&#8220;We are aware and working on it&#8221;</em>, but could not give me any estimate when it would be fixed.</p>
<p>A month later, still nothing. So what to do? I rely on this Tumblr feature for my blogs. In two years, those blogs collected 350,000 news articles. Quite a resource library, which I did not want to give up.</p>
<p>So I decided to use the Christmas holidays to migrate these blogs from Tumblr onto WordPress, on my HostGator VPS server. It was an interesting process, involving many different techniques and debugging efforts:</p>
<p><span id="more-1763"></span></p>
<h4>1. How to export a Tumblr blog into WordPress</h4>
<div class="wp-caption aligncenter" style="width: 388px">
	<img title="From Tumblr to WordPress" src="http://theroadtothehorizon.net/photo/Tumblr%20export%20routine.jpg" alt="From Tumblr to WordPress" width="388" height="400" />
	<p class="wp-caption-text">Ben Ward&#39;s Tumblr2WordPress utility</p>
</div>
<p>I pretty much described the process and the technique to export a Tumblr blog <a href="/how-to-import-a-tumblr-blog-into-wordpress/">in an earlier post</a>: <a href="http://benapps.net/" target="_blank">Using Tumblr2WordPress</a>, a neat PHP program by Ben Ward. It uses the Tumblr API to export blogposts, and to create an .XML file, which I could import into WordPress. An API that Tumblr disables every US afternoon and evening, by the way.</p>
<p>While Ben&#8217;s program works well to export smaller Tumblr blogs, I had biiiiig blogs, so I had to adapt the PHP code. As the code is public domain and available on <a href="http://github.com/benward/tumblr2wordpress">Github</a>, I downloaded it and installed it on my server. I changed the PHP parameters to allocate a massive chunk of memory, and allow the export routine to run longer than a standard PHP program. Tip: change the parameters only for that routine, not for your whole server!</p>
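<p>One way to raise the limits for a single routine only is to pass them on the command line. The sketch below assumes the export script can be started from the PHP command-line binary, which may not match how you run it; the function name, the <code>PHP_BIN</code> override and the script name in the example are hypothetical.</p>

```shell
# Run one PHP script with a raised memory limit and no execution time limit,
# leaving the server-wide php.ini untouched.
# PHP_BIN is a hypothetical override, handy for dry runs.
# Example: run_with_limits 1024M export.php
run_with_limits() {
    mem=$1
    shift
    "${PHP_BIN:-php}" -d "memory_limit=$mem" -d max_execution_time=0 "$@"
}
```

<p>The <code>-d</code> switch sets an ini value for that one run, so every other site on the server keeps its normal limits.</p>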
<p><span style="color: #ff00ff;">Update March 1, 2010:</span><br />
Ben&#8217;s source code is still available, but the executable program is no longer available on the link I provided. You can still run Tumblr-to-WordPress routines based on the same engine from <a href="http://tumblr2wp.com/" target="_blank">Tumblr2WP</a> or <a href="http://haochen.me/tumblr/" target="_blank">Tumble2WordPress</a> (with thanks to Parneix and Aaron for the updates).</p>
<p>I also patched Ben&#8217;s original code to work around a smaller problem I discovered on &#8220;published dates&#8221; and &#8220;categories&#8221;. Pretty easy, even for a PHP novice like me, as the code is well documented.</p>
<p>As WordPress can only import .XML files smaller than 8 MB, I split up the first exported blog manually into smaller chunks. That took me two hours for the smallest of the seven blogs. So I decided once again to delve into the code, and wrote a small patch that allowed me to export 5,000 blog posts at a time. Each export file was now smaller than 8 MB.</p>
<p>Cool. Exported all blogs, and there I sat on 70 files, about half a gigabyte worth of .XML files.</p>
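<p>The arithmetic behind the 70 files is simple: 350,000 posts at 5,000 posts per file. As a sketch (the function names are illustrative and this is not the patch itself), the chunk boundaries can be computed like this:</p>

```shell
# Number of export files needed for $1 posts at $2 posts per file (rounded up).
chunk_count() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

# Print one "start count" pair per export file.
chunk_ranges() {
    total=$1
    size=$2
    start=0
    while [ "$start" -lt "$total" ]; do
        left=$((total - start))
        # The last chunk may hold fewer posts than the others.
        if [ "$left" -lt "$size" ]; then n=$left; else n=$size; fi
        echo "$start $n"
        start=$((start + size))
    done
}
```

<p>For instance, <code>chunk_count 350000 5000</code> prints 70.</p>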
<h4>2. Importing 350,000 posts into seven new WordPress blogs</h4>
<div class="wp-caption aligncenter" style="width: 400px">
	<img title="The WordPress Import screen" src="http://theroadtothehorizon.net/photo/wordpress%20import%20routine.jpg" alt="The WordPress Import screen" width="400" height="193" />
	<p class="wp-caption-text">The WordPress Import routine rocks!</p>
</div>
<p>For the seven blogs, I created seven new accounts on my HostGator VPS server. Some of my Tumblr blogs were using custom domains, so I changed the DNS, pointing to my HostGator VPS server, and created seven new WordPress blogs. While I was at it, I registered two new domains. <a href="http://changethru.info" target="_blank">ChangeThru.Info</a> became <a href="http://aidresources.org" target="_blank">AidResources.org</a> and <a href="http://youandusand.me/" target="_blank">Youandusand.me</a> became <a href="http://newsongreen.org" target="_blank">NewsOnGreen.org</a>. Wanted to do that a long time ago, so now was the right time. I kept the old domains live on Tumblr, so I did not lose any traffic.</p>
<p>Installing a new WordPress blog was easy to do with the &#8220;Fantastico&#8221; program in the server&#8217;s cPanel. I chose <a href="http://wordpress.org/extend/themes/magazine-basic" target="_blank">a neat and simple magazine template</a> and added the usual plugins I always use for caching, automated blog backup, etc&#8230;</p>
<p>Then, one by one, I imported the 500 MB of exported .XML files. It worked flawlessly, but took about two days. Not a single error, not a single problem. I should say: WordPress impressed me once more.</p>
<p>Done. Well at least with importing the old posts. How to feed in new posts using my myriad of RSS feeds?</p>
<h4>3. Implementing an &#8220;RSS to blogpost&#8221; routine in WordPress</h4>
<div class="wp-caption aligncenter" style="width: 400px">
	<img title="Feedwordpress" src="http://theroadtothehorizon.net/photo/feedwordpress.jpg" alt="Feedwordpress" width="400" height="305" />
	<p class="wp-caption-text">FeedWordPress, you rock!</p>
</div>
<p><a href="http://feedwordpress.radgeek.com/" target="_blank">FeedWordPress</a> made my day. This neat plugin imports RSS feeds and converts them into WordPress blogposts. And it does so very well. The plugin is well designed, easy to use, and has a lot of options to customize the import process.<br />
It also has add-ons that allow you to limit the size of the imported post, or add text to the title or the body of the imports. Realllllly neat!</p>
<p>I configured the different feeds <a href="/how-to-combine-rss-feeds-with-yahoo-pipes/">I process via Yahoo Pipes</a>, and ran a CRON job to import the blog posts. About two hours work per blog, and the whole cake went into the oven and started cooking: FeedWordPress neatly imported the posts.</p>
<p>You would think I was done. The real work had not even started.</p>
<h4>4. The need for speed</h4>
<p>From beginning to end, including the customization of the template etc., the export from Tumblr and import into WordPress took me about a week. Fine-tuning the blogs and the server took another two weeks.</p>
<p>As &#8220;Good is Fine, Perfect is Best&#8221;, I saw some formatting deficiencies I could not live with, which needed modifications to the feeds and the template. And even worse, much much much worse, my poor server was brought to its knees by the extra load. The new blogs demanded so much CPU time from Apache and the MySQL server that everything slowed down to a snail&#8217;s pace.</p>
<p>Time to get geeky!</p>
<h4>5. Tuning the XML sitemaps</h4>
<div class="wp-caption aligncenter" style="width: 363px">
	<img title="XML Sitemap for WordPress" src="http://theroadtothehorizon.net/photo/XML%20sitemap.jpg" alt="XML Sitemap for WordPress" width="363" height="400" />
	<p class="wp-caption-text">XML Sitemap: Tune it!</p>
</div>
<p>It only takes one dumb sysadmin to make the fastest server go slow, I realized while monitoring the CPU load with the Linux &#8220;top&#8221; command. I saw the &#8220;load average&#8221; peak way above &#8220;10&#8221;, meaning that on average more than 10 processes were running or queueing up for CPU time.<br />
So I looked at several WordPress plugins that might cause the problem.</p>
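<p>If you would rather be warned than stare at the screen, the same figure can be checked from a script. A minimal sketch, assuming a Linux host where the load averages are readable from <code>/proc/loadavg</code>; the function name and the threshold are illustrative.</p>

```shell
# Succeeds if the 1-minute load average (the first field of a
# /proc/loadavg-style line) is above the given threshold.
# Example: load_above "$(cat /proc/loadavg)" 10
load_above() {
    # Adding 0 forces a numeric comparison in awk.
    echo "$1" | awk -v max="$2" '{ exit !($1 + 0 > max + 0) }'
}
```

<p>Passing the line in as an argument keeps the check easy to try out with sample values before you hook it up to the real file.</p>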
<p>The first problem I saw was <a href="http://wordpress.org/extend/plugins/google-sitemap-generator/" target="_blank">the XML sitemap generator</a>. There was one option which I had overlooked: <em>&#8220;Rebuild sitemap if you change the content of your blog&#8221;</em>. That might be fine on smaller blogs, but the automatic feed importers were putting up new blogposts at a rate of 100 per hour. So the server was pretty much doing nothing else but generating sitemaps.</p>
<p>I disabled the feature, and scheduled a CRON job to regenerate a sitemap once a day, at night-time. The server load went down significantly.</p>
<p>Oh, by the way, if you have a huge blog, limit the number of posts to include in the sitemap to 10,000. That stays well within the maximum of 50,000 URLs per sitemap file.</p>
<h4>6. Tuning WP Supercache</h4>
<div class="wp-caption aligncenter" style="width: 400px">
	<img title="WP SuperCache" src="http://theroadtothehorizon.net/photo/supercache%20screenshot.jpg" alt="WP SuperCache" width="400" height="302" />
	<p class="wp-caption-text">WP SuperCache: Tune it!</p>
</div>
<p><a href="/a-blogger-2010-wrapup/">As I mentioned before</a>: any blogger using <a href="/selecting-a-blog-platform-selfhost-your-blog-or-not/">a self-hosted blog</a> without a cache should appear in court for &#8220;blog neglect&#8221;. So I use caching extensively, with <a href="http://wordpress.org/extend/plugins/wp-super-cache/" target="_blank">WP Supercache</a>, my preferred plugin.</p>
<p>But it only takes one stupid blog administrator to make even the best plugin stop working properly. Supercache needed tuning:</p>
<ul>
<li>Unchecked the option <em>&#8220;Clear all cache files when a post or page is published&#8221;</em> (I published 100 posts per hour, so the cache was always invalidated)</li>
<li>Pre-loaded the last 10% of blogposts, but set <em>&#8220;Refresh preloaded cache&#8221;</em> to &#8220;0&#8221; (once a post is imported from its RSS feed, I don&#8217;t update it anymore, so it can remain in cache forever). This means I only had to pre-load a massive amount of blogposts once, and it was done.</li>
<li>For the same reason, I set the <em>&#8220;expiry time&#8221;</em> to &#8220;0&#8221;: once a page is cached after a preload cycle, I want it to remain in cache. This generates a LOT of cached files, but I have plenty of disk space on my server.<br />
If you set the <em>&#8220;expiry time&#8221;</em> to a value &gt; 30 minutes, garbage collection is done every 10 minutes, which generates a lot of load on your server.</li>
<li>As now I had caches with an eternal life time, I needed to ensure the homepage, feeds, archives were NOT cached, otherwise visitors never got an updated overview of the latest posts.<br />
As I discovered that AFTER I preloaded the posts, and had put the caching to &#8220;eternity&#8221;, I had to manually delete the cached files for the home page, searches and the running month&#8217;s archives.</li>
<li>To further reduce the load on the PHP server, I chose the option to use <em>&#8220;mod_rewrite to serve cache files&#8221;</em>.</li>
<li>And by the way, if you don&#8217;t cache the home page, the <em>&#8220;Cache Tester&#8221; </em>will give an error &#8211; as it tests caching on&#8230; the home page. So ignore that error, and just look at the source of any random page, to see if, at the bottom of the source, you have a date/time for the cache generation, which is in the past.</li>
</ul>
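<p>The last check in the list above (looking for a cache timestamp at the bottom of the page source) can be sketched in a few lines of Python. Note that the exact wording of the comment the plugin appends is an assumption on my side, as are the function names; compare it with what you actually see in your own page source, and keep in mind the timestamp may be in the server&#8217;s time zone.</p>

```python
import re
from datetime import datetime

# Assumed format of the comment WP Super Cache appends to a cached page.
CACHE_STAMP = re.compile(
    r"Cached page generated by WP-Super-Cache on "
    r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})"
)

def cache_timestamp(page_source):
    """Return the cache generation time found in the page source, or None."""
    match = CACHE_STAMP.search(page_source)
    if match is None:
        return None
    return datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")

def served_from_cache(page_source, now):
    """True if the page carries a cache timestamp that lies in the past."""
    stamp = cache_timestamp(page_source)
    return stamp is not None and stamp < now

# Hypothetical page source, for illustration only.
sample = (
    "<html><body>Hello</body></html>\n"
    "<!-- Cached page generated by WP-Super-Cache on 2011-01-30 08:15:00 -->"
)
print(served_from_cache(sample, datetime(2011, 1, 30, 9, 0, 0)))
```

<p>A page without such a comment, or with a timestamp equal to &#8220;now&#8221;, was most likely generated on the fly rather than served from the cache.</p>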
<h4>7. Trashing &#8220;Most Popular Posts&#8221;</h4>
<p>As described in <a href="/how-one-small-plug-can-slow-blog/" target="_self">this post</a>, one plugin meant to show &#8220;the most read posts&#8221; also logged every single access to the SQL database, and effectively slowed down my server. Had to trash it.</p>
<h4>8. Tuning FeedWordPress</h4>
<p>I spent quite a bit of time tuning FeedWordPress, balancing how often feeds were imported against the success rate of each import cycle. I combine 1,000+ feeds <a href="/how-to-combine-rss-feeds-with-yahoo-pipes/" target="_self">into about 20 Yahoo Pipes feeds</a>. These are large and complex feeds, which take a lot of time to fetch. Many times the import of a feed would time out.</p>
<p>At first I worked around that problem, by refreshing all feed imports every 10 minutes. But once again, that put a lot of pressure on the server. As you can deal with any problem either by working around it, or by addressing the cause of it, it was time to look for the source of the problem. In the process of doing so, distinguish well between what is &#8220;a cause&#8221; and what is &#8220;a symptom&#8221;. Often we try to solve the latter, while we should address the former. Think about that. That is deeeeep! <img src='http://www.blogtips.org/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> </p>
<p>So the symptom I saw was the feeds timing out. As I know the Yahoo Pipes feeds often take very long to refresh, even interactively, I <a href="http://www.piepalace.ca/blog/2010/11/feedwordpress-broke-my-heart.html" target="_blank">had to patch FeedWordPress</a> with a timeout of 60 seconds to deal with Yahoo Pipes&#8217; lack of speed. Cool. But that caused the dreaded SQL error &#8220;MySQL server has gone away&#8221; to appear more frequently in my CRON log files. Beh.</p>
<p>To make a long story short, the solution was to also raise the MySQL parameter &#8220;wait_timeout&#8221; from the &#8220;30&#8221; seconds set on my server to &#8220;240&#8221;. I changed that in <em>/etc/my.cnf</em> and restarted the SQL server. Problem solved.</p>
<p>As this solved the timeout when reading RSS feeds, I could also decrease the frequency of the FeedWordPress CRON jobs. And that once again made my server very happy. I like happy servers&#8230;!</p>
<h4>9. Server tuning</h4>
<div class="wp-caption aligncenter" style="width: 400px">
	<img title="Server crash cartoon" src="http://theroadtothehorizon.net/photo/server%20crash.jpg" alt="Server crash cartoon" width="400" height="234" />
	<p class="wp-caption-text">Tuning a server: not for the faint of heart...</p>
</div>
<p>Depending on what exactly you do on your blogs, for large and heavy traffic blogs like the seven I had just migrated, the SQL server might need tuning. This is not for the faint-of-heart, and requires patience and caution. With one wrong setting, you can do more harm than good.</p>
<p>The first indication that my SQL parameters might need tuning was simply the fact that SQL took up so much CPU time on my server. phpMyAdmin, a tool available on just about every selfhosted server, has a neat feature called <em>&#8220;Status&#8221;</em>, which gave me an overview of the parameters which might need changing. But I was not sure. So I installed <a href="http://blog.mysqltuner.com/" target="_blank">mySQLTuner</a>: using SSH, I logged into my server as root. With just three commands, I got a better overview of the parameters I needed to change:</p>
<blockquote>
<pre><code>wget http://mysqltuner.pl/ -O mysqltuner.pl
chmod 775 mysqltuner.pl
perl mysqltuner.pl</code></pre>
</blockquote>
<p>I changed one parameter at a time, and waited for 12-24 hours to see its effect.</p>
<p>It seems the two most important parameters to tune are <em>&#8220;key_buffer_size&#8221;</em> and <em>&#8220;table_cache&#8221;</em>. It was advised to tune these first before touching the others. Which I did. <em>&#8220;key_buffer_size&#8221;</em> was ok on its default value of 48M, but <em>&#8220;table_cache&#8221;</em> needed 4,096 instead of the default 1,024.</p>
<p>The rest of the parameters I changed over time:</p>
<blockquote>
<pre><code>tmp_table_size: from 32M to 64M</code>
<code>max_heap_table_size: from 32M to 64M</code>
<code>sort_buffer_size: from 1M to 4M</code></pre>
</blockquote>
<p>While I am still observing the server and tuning bits and pieces, it looks by now that, ladies and gentlemen, we have a happy server, purring like a happy cat. Sure enough, I still get peak loads with a &#8220;load average&#8221; of 3-4, but most of the time it stays at &#8220;1&#8221; or below. And that is good. I like happy servers.</p>
<h4>10. Template tuning</h4>
<p>For the seven blogs, I use a neat little magazine template, called <a href="http://wordpress.org/extend/themes/magazine-basic" target="_blank">Magazine-Basic</a> by <a href="http://themes.bavotasan.com/" target="_blank">Bavotasan</a>. It comes for free, and has some basic functions that I like.</p>
<p>I customized the CSS and some of the functions. I sinned heavily by patching the original template rather than using a child theme, but that is just because I made so many changes. On top of that, &#8220;speed&#8221; is important for my main blogs, and I did not want every page refresh to read several CSS files. So patching it was. I will live with the fact that I cannot upgrade the template automatically later on, but hey, I learned that upgrading templates is always a pain, and often causes more problems than it is worth. So I avoid theme upgrades like the plague. My opinion, punto.</p>
<p>One of the main challenges I faced was that the template, by default, uses the first image it finds in the post as a thumbnail on the home and archive pages. And often that image was junk (a &#8220;Retweet&#8221; button, or a Feedburner &#8220;Email this&#8221; image). So I had to tweak all my Yahoo Pipes feeds to delete those images. Took me a week. Some feeds were too complex to delete all images, so for some of the blogs, I just disabled the thumbnails in the template. I mean, I have to sleep too, so I can&#8217;t keep on tuning all feeds to catch every combination of dummy images.</p>
<p>Maybe that will be work for next Xmas!</p>
<h4>11. Installing the mobile theme</h4>
<p>And to finish it all in beauty, I <a href="/how-to-enable-mobile-theme-on-wordpress-blog/">installed a plugin to enable a mobile theme to be displayed</a> when visitors access my blog using a mobile phone.</p>
<h4>12. And here is the end result of about four weeks of work:</h4>
<div class="wp-caption aligncenter" style="width: 400px">
	<img title="From Tumblr to WordPress" src="http://theroadtothehorizon.net/photo/from%20tumblr%20to%20wordpress.jpg" alt="From Tumblr to WordPress" width="400" height="263" />
	<p class="wp-caption-text">From Tumblr to WordPress. With Love</p>
</div>
<p>Check them out on your mobile. Browse through the posts to see if you like the download speed (remember the homepage and archives are not cached, and thus a bit slower!):  <a title="a super fast collector of the latest news from countries in the aid spotlight" href="http://aidnews.org" target="_blank">AidNews</a>, <a title="aggregates the latest articles from dozens of reference sites" href="http://aidresources.org" target="_blank">AidResources</a>, <a title="summarizes the latest environmental news" href="http://newsongreen.org" target="_blank">News On Green</a>, <a title="the one and only aidworker blog collector" href="http://aidblogs.org" target="_blank">AidBlogs</a>, <a title="aggregates the posts from over 800 nonprofit blogs" href="http://nonprofitblogs.info" target="_blank">The NonProfit Blogs</a>, <a title="because life is too serious" href="http://theweirdbit.org" target="_blank">The Weird Bit</a> and <a title="updates about blogging and social media" href="http://bloggingtoday.org" target="_blank">Blogging Today</a>.<br />
You like?</p>
<p>Cartoon courtesy <a href="http://www.marklowe.com/art/graphicsv1.html" target="_blank">Mark Lowe</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.blogtips.org/how-i-moved-350000-blogposts-from-tumblr-to-wordpress/feed/</wfw:commentRss>
		<slash:comments>8</slash:comments>
		</item>
		<item>
		<title>How one small plug-in can slow down your site. The Drupal version</title>
		<link>http://www.blogtips.org/how-one-small-plug-in-can-slow-down-your-blog-the-drupal-version/</link>
		<comments>http://www.blogtips.org/how-one-small-plug-in-can-slow-down-your-blog-the-drupal-version/#comments</comments>
		<pubDate>Wed, 26 Jan 2011 07:54:30 +0000</pubDate>
		<dc:creator>Peter</dc:creator>
				<category><![CDATA[Geeky Stuff]]></category>
		<category><![CDATA[How to... Stuff]]></category>
		<category><![CDATA[Drupal]]></category>
		<category><![CDATA[hosting]]></category>
		<category><![CDATA[speed]]></category>
		<category><![CDATA[tips]]></category>
		<category><![CDATA[tools]]></category>

		<guid isPermaLink="false">http://www.blogtips.org/?p=1737</guid>
		<description><![CDATA[Remember my rumbling about how one plug-in can make your WordPress blog slow? Well, here is a similar story about my Drupal news aggregation site. Before I begin, let me re-state that Drupal by itself, is fast. I mean real fast. That&#8217;s why I gave Drupal two thumbs up in my 2010 blogger&#8217;s wrapup. But [...]]]></description>
			<content:encoded><![CDATA[<p></p><div class="wp-caption aligncenter" style="width: 400px">
	<img title="Drupal snail" src="http://theroadtothehorizon.net/photo/drupalsnail.jpg" alt="Drupal snail" width="400" height="230" />
	<p class="wp-caption-text">Mea Culpa! Drupal is fast, but I made it slow...</p>
</div>
<p>Remember my rumbling about <a href="/how-one-small-plug-can-slow-blog/" target="_self">how one plug-in can make your WordPress blog slow</a>? Well, here is a similar story about <a href="http://humanitariannews.org" target="_blank">my Drupal news aggregation site</a>.</p>
<p>Before I begin, let me re-state that <a href="http://drupal.org" target="_blank">Drupal</a> by itself, is fast. I mean real fast. That&#8217;s why I gave Drupal two thumbs up <a href="/a-blogger-2010-wrapup/" target="_self">in my 2010 blogger&#8217;s wrapup</a>. But it only takes half a dumb system administrator (or web admin) to make any fast system slow. And I succeeded very well in that.</p>
<p>As I was <a href="../my-blogging-life-in-one-picture/" target="_self">debugging my server&#8217;s CPU load</a> for my WordPress blogs, I saw that my Drupal site, hosted on the same  machine, was putting a lot of pressure on my SQL and Apache servers.</p>
<p>Well, the site as such was not slow. As I used extensive caching, a visitor did not see anything. But boy, the site really loaded my server!</p>
<p>I did not understand why. I mean, sure, I have a lot of traffic, but as with most news sites, 90% of all traffic goes to the same most recent posts. And that was caught by the cache, so that traffic should not even touch PHP or SQL. The proof was that, for a visitor, the site was very fast, taking only one or two seconds to load.</p>
<p>So why did the site put that much load on the server? I found out by pure luck.</p>
<p><span id="more-1737"></span>I was checking in <a href="http://www.google.com/webmasters/" target="_blank">Google Webmasters</a> how the sitemaps for my different blogs were doing, and noticed my Drupal site&#8217;s sitemap never got submitted to Google. So I looked at the module settings. Something I should have done much earlier.</p>
<p>Apart from the module <a href="http://drupal.org/node/482550" target="_blank">having apparent problems with high volume sites</a>, I had also misconfigured it: I had set the module to regenerate a sitemap of the latest 2,000 posts at every CRON (the internal scheduler) run.<br />
On my site, CRON is set to run every 10 minutes, as I use CRON to trigger the import of over 1,000 RSS feeds which need very regular refreshing.</p>
<p>So in short, stupid me, I was forcing Drupal to scan through 2,000 posts every 10 minutes and generate a new sitemap.xml file. You can imagine how many SQL queries that generated unnecessarily.</p>
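<p>To put a number on that, here is a back-of-the-envelope calculation in Python. It assumes, for the sake of the sketch, that every sitemap rebuild reads each of the 2,000 posts once; the function name is made up for the illustration.</p>

```python
MINUTES_PER_DAY = 24 * 60

def posts_scanned_per_day(cron_interval_minutes, posts_per_sitemap):
    """Posts read from the database per day just to rebuild the sitemap."""
    runs_per_day = MINUTES_PER_DAY // cron_interval_minutes
    return runs_per_day * posts_per_sitemap

print(posts_scanned_per_day(10, 2000))    # CRON every 10 minutes: 288000
print(posts_scanned_per_day(1440, 2000))  # once a day: 2000
```

<p>Under that assumption, rebuilding at every CRON run costs 144 times more database reads per day than one nightly rebuild.</p>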
<p>I disabled the module while I monitored the CPU load interactively. Almost on the spot, I could see the load on the Apache and SQL server going down. And the server has been purring happily ever since.</p>
<p>Since that moment, I am sitting in the corner of my room, with ashes on my head, crying &#8220;Mea Culpa, Mea Culpa, Mea Maxima Culpa&#8221;* (* &#8220;It is my fault, my fault, my freaking fault&#8221;), and hope the Server Gods won&#8217;t look badly upon me on my Digital Judgement day.</p>
<h4>Bottomline:</h4>
<p>For selfhosted Content Management Systems like Drupal or WordPress alike: check each of your plugins and their settings. It takes only one plugin or one wrongly configured plugin to either slow down your site, or slow down your server.</p>
<p><em>Unless you want to be a stupid web admin like myself, of course.</em></p>
]]></content:encoded>
			<wfw:commentRss>http://www.blogtips.org/how-one-small-plug-in-can-slow-down-your-blog-the-drupal-version/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>How to combine RSS feeds with Yahoo Pipes</title>
		<link>http://www.blogtips.org/how-to-combine-rss-feeds-with-yahoo-pipes/</link>
		<comments>http://www.blogtips.org/how-to-combine-rss-feeds-with-yahoo-pipes/#comments</comments>
		<pubDate>Sun, 16 Jan 2011 18:11:13 +0000</pubDate>
		<dc:creator>Peter</dc:creator>
				<category><![CDATA[Advanced Stuff]]></category>
		<category><![CDATA[Geeky Stuff]]></category>
		<category><![CDATA[How to... Stuff]]></category>
		<category><![CDATA[RSS]]></category>
		<category><![CDATA[tools]]></category>
		<category><![CDATA[Yahoo Pipes]]></category>

		<guid isPermaLink="false">http://www.blogtips.org/?p=1724</guid>
		<description><![CDATA[Many bloggers use RSS feeds to check on the latest posts from different websites and blogs. Bloggers often use widgets to display RSS feeds from related blogs, or their Twitter and Delicious updates. Or they might simply pull RSS feeds into an RSS reader like Google Reader. Comes a time where it might be useful [...]]]></description>
			<content:encoded><![CDATA[<p></p><p><img class="aligncenter" title="Pipes" src="http://theroadtothehorizon.net/photo/pipes.jpg" alt="Pipes" width="400" height="286" /></p>
<p>Many bloggers use <a href="/what-is-rss-and-what-can-you-do-with-it/" target="_self">RSS feeds</a> to check on the latest posts from different websites and blogs. Bloggers often use widgets to display RSS feeds from related blogs, or their Twitter and Delicious updates. Or they might simply pull RSS feeds into an RSS reader like <a href="http://www.google.com/reader" target="_blank">Google Reader</a>.</p>
<p>There comes a time when it might be useful to combine different RSS feeds into one feed. Imagine you have a blog about fundraising, and you have bookmarked 10 different fundraising sites of interest to you. You would like to display an RSS widget on your blog with the latest updates from those sites. You would not want to have 10 widgets, each for one RSS feed, would you?</p>
<p>The easiest solution is to combine the different feeds into one. And <a href="http://pipes.yahoo.com/pipes/" target="_blank">Yahoo Pipes</a> is still the best tool to do so.</p>
<p><span id="more-1724"></span>Yahoo Pipes is a versatile tool you can use to combine different content and mash it up to create new content. Here is a step by step tutorial for the simplest thing you can do with Yahoo Pipes: combine different feeds.</p>
<h4>Step 1: Getting started with Yahoo Pipes</h4>
<ol>
<li><a href="https://edit.yahoo.com/registration" target="_blank">Register</a> for a Yahoo account</li>
<li>Go to <a href="http://pipes.yahoo.com/pipes/" target="_blank">Yahoo Pipes</a> and log in with your Yahoo account</li>
</ol>
<h4>Step 2: Create a Pipe</h4>
<p>A &#8220;Pipe&#8221; is a workflow you create to take RSS (or other) input, mix it up, and create a new output.</p>
<p>On the home page of Yahoo Pipes, click on &#8220;Create a Pipe&#8221;. This will give you a worksheet to construct a pipe, using a nice graphical user interface.<br />
In the left column, you will see a collapsible menu with the main working modules, and the main part of the screen is your worksheet:</p>
<p><img class="aligncenter" title="Yahoo Pipes main screen" src="http://theroadtothehorizon.net/photo/yahoo%20pipes%20main%20working%20screen.jpg" alt="Yahoo Pipes main screen" width="400" height="195" /></p>
<h4>Step 3. Create a workflow for your pipe</h4>
<ol>
<li>The first thing we have to do, is to tell Yahoo Pipes where to get the input RSS feeds from, the feeds we want to combine. From the left module list, drag the module &#8220;Fetch Feed&#8221; (from the &#8220;Sources&#8221; list) into your work sheet on the right:<br />
<img class="aligncenter" title="Dragging a Yahoo Pipes Fetch Feed" src="http://theroadtothehorizon.net/photo/yahoo%20pipes%20fetch%20feed.jpg" alt="Dragging a Yahoo Pipes Fetch Feed" width="200" height="111" /></li>
<li>Note that once you dragged the &#8220;Fetch Feed&#8221; module, Yahoo Pipes  automatically also put a second module &#8220;Pipe Output&#8221; in your worksheet.  Leave it there for the time being.</li>
<li>The &#8220;Fetch Feed&#8221; module will open up, and you can fill in the different feeds you want to combine.<br />
As an example, let&#8217;s put two feeds in here, one from <a href="http://feeds2.feedburner.com/blogtips/rss" target="_blank">BlogTips</a> and one from <a href="http://feeds.feedburner.com/BloggingTodayrss" target="_blank">Blogging Today</a>. In this case, both are Feedburner feeds, but it does not matter. You can take Atom feeds, RSS 1.0 or RSS 2.0. Yahoo Pipes will process them properly.</li>
<li>After filling in the first feed URL, click the &#8220;+&#8221; icon and fill in the second feed.<br />
<img class="aligncenter" title="Fetching feeds with Yahoo Pipes" src="http://theroadtothehorizon.net/photo/fetching%20feeds%20with%20Yahoo%20Pipes.jpg" alt="Fetching feeds with Yahoo Pipes" width="200" height="89" /></li>
<li>At any time during the process of creating a pipe, you can click on a module, and as it gets highlighted in orange, the debugging window at the bottom of the sheet will change when you hit &#8220;refresh&#8221;.<br />
<img class="aligncenter" title="Yahoo Pipes debugging screen" src="http://theroadtothehorizon.net/photo/pipe%20debugging%20screen.jpg" alt="Yahoo Pipes debugging screen" width="400" height="252" />The debugging module basically shows the output of that module. You can click the triangle icons to see the different internal components.<br />
This is very useful if you create complex pipes, but for now, we will leave it aside.</li>
<li>The next step is to connect the &#8220;Fetch Feed&#8221; module to the &#8220;Pipe Output&#8221; module: Click on the blue dot (the connection dot) at the bottom of the &#8220;Fetch Feed&#8221; module, and drag it to the blue dot at the top of the &#8220;Pipe Output&#8221; module:<br />
<img class="aligncenter" title="connecting two modules in Yahoo Pipes" src="http://theroadtothehorizon.net/photo/connecting%20two%20modules%20with%20Yahoo%20Pipes.jpg" alt="connecting two modules in Yahoo Pipes" width="200" height="115" /></li>
<li>And in principle you are done now.</li>
</ol>
<h4>Step 4. How to combine many many many RSS feeds</h4>
<p>While each &#8220;Fetch Feed&#8221; module will let you enter about 20 feeds, you might need to combine more feeds than that. Easy!</p>
<ol>
<li>Drag another &#8220;Fetch Feed&#8221; module from the left &#8220;Sources&#8221; list into your worksheet</li>
<li>Fill in your feeds.</li>
<li>If you still have the first &#8220;Fetch Feed&#8221; module already connected to your &#8220;Pipe Output&#8221; module, disconnect it: Just hover over the connecting blue line, and scissors will appear. Click on the scissors, and the connection line will disappear.<br />
<img class="aligncenter" title="disconnecting modules in Yahoo Pipes" src="http://theroadtothehorizon.net/photo/disconnect%20two%20modules%20with%20Yahoo%20Pipes.jpg" alt="disconnecting modules in Yahoo Pipes" width="200" height="63" /></li>
<li>Now you need to combine both &#8220;Fetch Feed&#8221; modules into one. Drag the &#8220;Union&#8221; module from the &#8220;Operators&#8221; module list on the left, into your worksheet.<br />
<img class="aligncenter" title="union module in Yahoo Pipes" src="http://theroadtothehorizon.net/photo/union%20module%20in%20Yahoo%20Pipes.jpg" alt="union module in Yahoo Pipes" width="200" height="85" /></li>
<li>Now connect the two &#8220;Fetch Feed&#8221; modules to the input connectors of the &#8220;Union&#8221; module (at the top) and connect the output connector of the &#8220;Union&#8221; module to the &#8220;Pipe Output&#8221; module.<br />
<img class="aligncenter" title="connecting the union module in Yahoo Pipes" src="http://theroadtothehorizon.net/photo/connecting%20the%20union%20module.jpg" alt="connecting the union module in Yahoo Pipes" width="200" height="101" />Remember, you only need to do this, if you have a lot of input feeds. Each &#8220;Fetch Feed&#8221; module will easily cater for 20 input feeds.</li>
<li>If need be, you can add more &#8220;Fetch Feed&#8221; modules and connect them all to the &#8220;Union&#8221; module. I suggest limiting each pipe to 60-80 input feeds, otherwise the pipe becomes too slow.</li>
</ol>
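<p>If it helps to think of it in code: conceptually, the &#8220;Union&#8221; module just glues the item lists of several &#8220;Fetch Feed&#8221; modules together. The Python sketch below is only an analogy, with made-up items and a hypothetical <code>union</code> function; it is not how Yahoo Pipes is implemented.</p>

```python
from itertools import chain

def union(*fetched_feeds):
    """Merge the item lists of several fetch modules into one list, order kept."""
    return list(chain.from_iterable(fetched_feeds))

# Hypothetical items, as two "Fetch Feed" modules might return them.
fetch_feed_1 = [{"title": "Ten blogging tips"}, {"title": "Fundraising basics"}]
fetch_feed_2 = [{"title": "Donor newsletters"}]

combined = union(fetch_feed_1, fetch_feed_2)
print(len(combined))  # 3
```

<p>Note that the combined list is neither sorted nor free of duplicates at this point.</p>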
<h4>Step 5. Save your pipe</h4>
<p>Congratulations, you have just constructed a pipe which combines your RSS feeds. Now save the Pipe.</p>
<ol>
<li>Click on the tab &#8220;Untitled&#8221; at the top left of your screen, and give the pipe a name, and click &#8220;OK&#8221;<br />
<img class="aligncenter" title="give a name to your Yahoo Pipe" src="http://theroadtothehorizon.net/photo/give%20a%20name%20to%20your%20pipe.jpg" alt="give a name to your Yahoo Pipe" width="200" height="43" /></li>
<li>Now click on the &#8220;Save&#8221; button in the right corner</li>
<li>If you have many input feeds in your pipe, saving might take a while.<br />
There <a href="/yahoo-pipes-more-down-than-up/" target="_self">have been times where Yahoo Pipes had performance problems</a> and saving just hung, or gave obscure error messages, but it seems those times are now past. (touch wood)</li>
</ol>
<h4>Step 6. Create an RSS output from your pipe</h4>
<p>While you have now successfully created a Pipes workflow, you still don&#8217;t have the combined RSS output of your pipe yet, do you? Two simple steps to get there:</p>
<ol>
<li>After you saved your pipe, click on &#8220;Run Pipe&#8221;</li>
<li>A new window will open, and a screen will appear showing &#8220;Running Pipe&#8221;<br />
<img class="aligncenter" title="running yahoo pipes" src="http://theroadtothehorizon.net/photo/running%20pipe.jpg" alt="running yahoo pipes" width="200" height="103" /></li>
<li>Depending on how many feeds you combine, it might take anywhere from 2 to 60 seconds before the items in your feed appear.<br />
Do check the bottom of your feed to see if there were any errors. You might have entered a wrong feed, or entered a home page URL instead of a feed, a common mistake!</li>
<li>Your pipe, combining your different RSS feeds, can be used with different tools (add it to My Yahoo!, to your Google Home page, to your Google Reader, get a badge to put the output on your blog, etc&#8230;). If you just want the link to the RSS output, click on &#8220;Get as RSS&#8221;<br />
<img class="aligncenter" title="Get output from Yahoo Pipes" src="http://theroadtothehorizon.net/photo/pipes%20output%20options.jpg" alt="Get output from Yahoo Pipes" width="400" height="100" /></li>
<li>And voila, there is the RSS output. Copy the URL of the RSS output and you can reuse it anywhere you want.<br />
<img class="aligncenter" title="RSS output from Yahoo Pipes" src="http://theroadtothehorizon.net/photo/rss%20output%20from%20yahoo%20pipes.jpg" alt="RSS output from Yahoo Pipes" width="400" height="346" /></li>
</ol>
<h4>Step 7. Refining your Yahoo Pipes output</h4>
<p>Now we&#8217;re done. Well&#8230; except for one thing. Well, except for two things. Well, except for three things:</p>
<ol>
<li>As it stands right now, the RSS output does not have the feed items sorted by date, which is no good. We will need to sort the RSS feed, and publish it with the most recent RSS items first.</li>
<li>If you combine many different feeds, the RSS output might contain many items, and that could clog up wherever you want to reuse the feed. We will need to limit the number of items in the Yahoo Pipes RSS feed.</li>
<li>And again, if you use many different RSS feeds as input, one input blog might have re-used input from the other, so we will want to take out duplicate entries.</li>
</ol>
<p>So roll up your sleeves once more for the &#8220;final act&#8221;:</p>
<ol>
<li>Disconnect the &#8220;Union&#8221; and the &#8220;Pipe Output&#8221; module</li>
<li>From the &#8220;Operators&#8221; menu on the left, drag the modules &#8220;Sort&#8221;, &#8220;Unique&#8221;, &#8220;Truncate&#8221; onto your worksheet.</li>
<li>Connect them in this order, and fill in the different fields:<br />
<img class="aligncenter" title="refining a yahoo pipe" src="http://theroadtothehorizon.net/photo/refining%20yahoo%20pipe.jpg" alt="refining a yahoo pipe" width="400" height="339" /></li>
<li>In the &#8220;Sort&#8221; module, select &#8220;item.pubDate&#8221; and &#8220;descending&#8221; order</li>
<li>In the &#8220;Unique&#8221; module, select &#8220;item.title&#8221;</li>
<li>In the &#8220;Truncate&#8221; module, fill in the maximum number of items in your output field. We used 40 in our example.</li>
<li>Connect the different modules: from &#8220;Union&#8221; (or from &#8220;Fetch Feed&#8221; if you don&#8217;t use the &#8220;Union&#8221; module) to &#8220;Sort&#8221;, then to &#8220;Unique&#8221;, to &#8220;Truncate&#8221; and finally to &#8220;Pipe Output&#8221;.</li>
<li>Save the pipe, and you are done. Yahoo Pipes will automatically change your output RSS according to the modifications you saved from your worksheet.</li>
</ol>
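<p>For those who prefer reading code over dragging boxes, the &#8220;Sort&#8221;, &#8220;Unique&#8221; and &#8220;Truncate&#8221; chain above roughly corresponds to the Python sketch below. The items and the function name are made up, and I assume here that &#8220;Unique&#8221; keeps the first item it meets for each title; check the debugger output of your own pipe to see what it really keeps.</p>

```python
from datetime import datetime

def sort_unique_truncate(items, limit):
    """Sort by pubDate (newest first), drop repeated titles, keep `limit` items."""
    newest_first = sorted(items, key=lambda item: item["pubDate"], reverse=True)
    seen_titles = set()
    unique_items = []
    for item in newest_first:
        if item["title"] not in seen_titles:
            seen_titles.add(item["title"])
            unique_items.append(item)
    return unique_items[:limit]

# Hypothetical feed items, for illustration only.
items = [
    {"title": "Post A", "pubDate": datetime(2011, 1, 14)},
    {"title": "Post B", "pubDate": datetime(2011, 1, 16)},
    {"title": "Post A", "pubDate": datetime(2011, 1, 15)},
    {"title": "Post C", "pubDate": datetime(2011, 1, 13)},
]
result = sort_unique_truncate(items, 2)
print([item["title"] for item in result])  # ['Post B', 'Post A']
```

<p>The order matters: sorting first means the duplicate that survives is the most recent one, and truncating last means the limit applies to the cleaned-up list.</p>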
<p>So now you are really done. Unless you want to&#8230;</p>
<h4>Step 8. Do more sophisticated stuff with Yahoo Pipes. (optional)</h4>
<p>Yahoo Pipes allows you to do many many many different things. You can take input and mash it up to create virtually new content. You can check out the pipes others have created by clicking &#8220;Browse&#8221; on the Pipes home page. People come up with the weirdest and most sophisticated ideas, using translators, automatically integrating Flickr pictures in a pipe, etc&#8230;</p>
<p>However, the most commonly needed option, is to change or filter out some of the content from the input pipes. For that, the &#8220;Regex&#8221; module from the &#8220;Operators&#8221; menu is the most used workhorse.</p>
<p>There are few &#8220;programming syntaxes&#8221; which are as geeky as &#8220;Regex&#8221;, so beware, this is not for the &#8220;faint of heart&#8221;. But just one example: Let&#8217;s say you want to delete all images from the input feeds, and generate a &#8220;text only&#8221; output feed. And as we started this post saying you have different blogs and sites about &#8220;fundraising&#8221;, let&#8217;s change the output marking any occurrence of the word &#8220;fundraising&#8221; in bold:</p>
<ol>
<li>Disconnect the &#8220;Truncate&#8221; module from the &#8220;Pipe Output&#8221; module.</li>
<li>Drag the &#8220;Regex&#8221; module from the &#8220;Operators&#8221; menu on the left, into your worksheet</li>
<li>In the &#8220;In&#8221; field, select which part of the RSS feed you want to work with. In most feeds, the actual content of every RSS item is stored in the &#8220;item.description&#8221; field, so select that field.<br />
Some RSS feeds store their content in other fields, so you might want to use the debugger view of Pipes to find out where the content is stored.</li>
<li>To remove all images, we need to take the &#8220;img&#8221; tag out of the content. So we want to replace anything from the beginning of the img tag until its end with &#8220;nothing&#8221;. The Regex formula for that is: <code>(&lt;img.*?&gt;)</code><br />
&#8230;which means: <em>take any string beginning with &#8220;&lt;img&#8221;, followed by anything (the &#8220;.*?&#8221; part) and ending with &#8220;&gt;&#8221;</em>. We need to replace that with &#8220;nothing&#8221;, so leave the &#8220;with&#8221; field empty.<br />
And we want to replace all the occurrences of this in the input, so we click the &#8220;g&#8221; option in the regex module.</li>
<li>We will now look for all occurrences of the word &#8220;fundraising&#8221; (including those starting with a capital &#8220;F&#8221;), and put the &#8220;bold&#8221; tags around it. Pfft, easy!</li>
<li>Our regex module will then look like this:<br />
<img class="aligncenter" title="Yahoo Pipes Regex module" src="http://theroadtothehorizon.net/photo/pipes%20regex%20module.jpg" alt="Yahoo Pipes Regex module" width="400" height="185" /></li>
<li>Don&#8217;t forget to connect the &#8220;Truncate&#8221;, &#8220;Regex&#8221; and &#8220;Pipe Output&#8221; modules.</li>
<li>Save your pipe, and once again, you are done.</li>
</ol>
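<p>If you want to try the two replacements outside of Yahoo Pipes first, here is a small Python sketch of the same idea. It uses Python&#8217;s own <code>re</code> module, so the replacement syntax may differ from what the Pipes &#8220;Regex&#8221; module expects, and the sample text and function names are invented for the example.</p>

```python
import re

def strip_images(html):
    """Remove every <img ...> tag (non-greedy match, applied to all matches)."""
    return re.sub(r"<img.*?>", "", html)

def bold_keyword(html, keyword="fundraising"):
    """Wrap each occurrence of the keyword in <b> tags, whatever its case."""
    pattern = re.compile("(" + re.escape(keyword) + ")", re.IGNORECASE)
    return pattern.sub(r"<b>\1</b>", html)

# Hypothetical item description, for illustration only.
description = '<img src="retweet.gif" />Fundraising tips: start fundraising early.'
print(bold_keyword(strip_images(description)))
# <b>Fundraising</b> tips: start <b>fundraising</b> early.
```

<p>Stripping the images first is deliberate in this sketch: otherwise a keyword inside an image&#8217;s &#8220;alt&#8221; or &#8220;title&#8221; attribute would get bold tags too, and the tag would no longer match cleanly.</p>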
<p>As said, you can go as complex as you want. If you get into trouble, call on your peers for support in the <a href="http://discuss.pipes.yahoo.com/" target="_blank">Yahoo Pipes Discussion Forum</a>. People will be ready to help you.</p>
<h4>9. What do I use Yahoo Pipes for?</h4>
<p>I use Yahoo Pipes for many different things.</p>
<p>I have different pipes where I take feeds from dozens of news sites and combine them into a single RSS output, which I read using Google Reader on my laptop, iPad and iPhone.</p>
<p>I also use Pipes to channel different feeds onto several Twitter and Facebook accounts, <a href="/the-diagram-of-a-blog-network/">using a series of different techniques</a>.</p>
<p>But I mostly use Yahoo Pipes to combine, process and mashup feeds from over 1,000 sites and blogs about aid, development, humanitarian issues and the environment. I republish summaries, as I describe <a href="/rss-reversed-from-feed-to-blog/">in an earlier post</a>.<br />
Check out the end result in <a title="my mega news collector" href="http://www.humanitariannews.org" target="_blank">Humanitarian News</a>, <a title="a super fast collector of the latest news from countries in the aid spotlight" href="http://aidnews.org" target="_blank">AidNews</a>, <a title="aggregates the latest articles from dozens of reference sites" href="http://aidresources.org" target="_blank">AidResources</a>, <a title="summarizes the latest environmental news" href="http://newsongreen.org" target="_blank">News On Green</a>, <a title="the one and only aidworker blog collector" href="http://aidblogs.org" target="_blank">AidBlogs</a> and <a title="aggregates the posts from over 800 nonprofit blogs" href="http://nonprofitblogs.info" target="_blank">The NonProfit Blogs</a>.<br />
In <a title="aggregates the posts from blogs about blogging" href="http://bloggingtoday.org" target="_blank">Blogging Today</a> I also aggregate the latest updates from &#8220;Blogs about Blogging&#8221;, and <a title="aggregates weird news" href="http://theweirdbit.org" target="_blank">The Weird News</a> keeps a smile on my face&#8230;</p>
<p>Oh, and as a closing tip: if you want to go over &#8220;How to create a Yahoo Pipe&#8221; in a video, check this out:</p>
<p><object classid="clsid:d27cdb6e-ae6d-11cf-96b8-444553540000" width="400" height="251" codebase="http://download.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=6,0,40,0"><param name="allowFullScreen" value="true" /><param name="AllowScriptAccess" value="always" /><param name="bgcolor" value="#000000" /><param name="flashVars" value="id=13878389&amp;vid=5260536&amp;lang=en-us&amp;intl=us&amp;thumbUrl=http%3A//l.yimg.com/a/p/i/bcst/videosearch/9326/87078068.jpeg&amp;embed=1" /><param name="src" value="http://d.yimg.com/static.video.yahoo.com/yep/YV_YEP.swf?ver=2.2.46" /><param name="flashvars" value="id=13878389&amp;vid=5260536&amp;lang=en-us&amp;intl=us&amp;thumbUrl=http%3A//l.yimg.com/a/p/i/bcst/videosearch/9326/87078068.jpeg&amp;embed=1" /><param name="allowfullscreen" value="true" /><embed type="application/x-shockwave-flash" width="400" height="251" src="http://d.yimg.com/static.video.yahoo.com/yep/YV_YEP.swf?ver=2.2.46" flashvars="id=13878389&amp;vid=5260536&amp;lang=en-us&amp;intl=us&amp;thumbUrl=http%3A//l.yimg.com/a/p/i/bcst/videosearch/9326/87078068.jpeg&amp;embed=1" bgcolor="#000000" allowscriptaccess="always" allowfullscreen="true"></embed></object><br />
Have fun with Pipes!</p>
<p>Picture courtesy <a href="http://www.rajtilak.net" target="_blank">Rajtilak</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.blogtips.org/how-to-combine-rss-feeds-with-yahoo-pipes/feed/</wfw:commentRss>
		<slash:comments>23</slash:comments>
		</item>
		<item>
		<title>How one small plug-in can slow down your blog</title>
		<link>http://www.blogtips.org/how-one-small-plug-can-slow-blog/</link>
		<comments>http://www.blogtips.org/how-one-small-plug-can-slow-blog/#comments</comments>
		<pubDate>Tue, 11 Jan 2011 00:04:53 +0000</pubDate>
		<dc:creator>Peter</dc:creator>
				<category><![CDATA[Geeky Stuff]]></category>
		<category><![CDATA[hosting]]></category>
		<category><![CDATA[speed]]></category>
		<category><![CDATA[tools]]></category>
		<category><![CDATA[WordPress]]></category>

		<guid isPermaLink="false">http://www.blogtips.org/?p=1690</guid>
		<description><![CDATA[I recently migrated seven high-volume Tumblr blogs onto WordPress on my private server. Even though I use aggressive caching, I still saw a lot of CPU load, caused by SQL access...

How was that possible? I had every post preloaded in cache? 

The venom sat in a small plug-in....]]></description>
			<content:encoded><![CDATA[<p></p><p><img class="aligncenter" title="question mark" src="http://theroadtothehorizon.net/photo/question-mark.jpg" alt="question mark" width="190" height="300" /></p>
<p>Well I&#8217;ll be darned. I am never too old to learn.</p>
<p>Over the past two weeks, I <a href="/how-to-import-a-tumblr-blog-into-wordpress/">migrated seven blogs</a> from <a href="http://tumblr.com">Tumblr</a> to <a href="http://wordpress.org">WordPress</a>, onto my VPS (Virtual Private Server) on <a href="http://hostgator.com">Hostgator</a>. Concerned about the performance of the server, I scrupulously monitored the CPU, memory and traffic load, as I redirected the domains and as the traffic started to flow in.</p>
<p>There is quite a bit of traffic on those seven blogs, so I use aggressive caching on WordPress, using <a href="http://wordpress.org/extend/plugins/wp-super-cache/" target="_blank">WP Super Cache</a>.</p>
<p><span id="more-1690"></span>WP Super Cache has the option to pre-load a set number of posts into a cache: at pre-set intervals, it reads the posts from the SQL database and converts them to plain HTML files. This way, any visitor gets the pre-cooked HTML file. This creates quite a few files, but takes load off the server, as it does not have to look up each page on the SQL server. That is important for me, as space and bandwidth are not a problem. CPU power is.</p>
<p>That being said, I was astonished to see the CPU load of the server grow to the point where everything slowed down. The &#8220;Load Average&#8221; on the server showed an average of six to ten processes waiting to be served. Monitoring the load, I could still see a lot of SQL access happening.</p>
<p>How was that possible? I cached every single page, so no SQL access was needed. I checked and double-checked everything. I could not see what caused these SQL queries. I had everything cached?!?!</p>
<p>Until I found the culprit. I had a plugin <a href="http://wordpress.org/extend/plugins/wordpress-popular-posts/">&#8220;WordPress Popular Posts&#8221;</a>, which checked which posts were being accessed, so I could put a &#8220;Most Popular Posts&#8221; widget in my side column. Thought that would be a cool idea and installed it on all seven migrated blogs.</p>
<p>A cool idea it is, as I use it here on BlogTips. But it is not such a cool idea for a high-volume site with limited resources. My poor server was sweating like hell, as apparently all visits were being logged into the database.</p>
<p>I disabled the plugin, cleared the caches, rebuilt them, and voilà.. Almost on the spot the CPU load went down from six to ten processes in the waiting queue to less than two. As anything less than four queued processes per CPU (I have one CPU) is acceptable, my server is a happy camper, I am a happy camper, and my visitors are served with faster websites.</p>
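<p>For the nerds: the &#8220;less than four queued processes per CPU&#8221; rule of thumb is easy to script. Below is a minimal Python sketch, not a monitoring tool. The function names and the default threshold of 4 are my own assumptions based on the rule above, and <code>os.getloadavg()</code> only works on Unix-like servers.</p>

```python
import os

def load_per_cpu(load_average, cpu_count):
    """Return the average number of queued processes per CPU."""
    return load_average / max(cpu_count, 1)

def is_acceptable(load_average, cpu_count, threshold=4.0):
    """Rule of thumb from this post: fewer than four queued processes per CPU."""
    return load_per_cpu(load_average, cpu_count) < threshold

if __name__ == "__main__":
    # The three values are the 1, 5 and 15 minute load averages (Unix only).
    one_min, five_min, fifteen_min = os.getloadavg()
    cpus = os.cpu_count() or 1
    print(five_min, cpus, is_acceptable(five_min, cpus))
```

<p>With one CPU, a load average of eight fails the check and a load average of two passes it, which matches what I saw before and after disabling the plugin.</p>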
<p>Lessons learned: blog performance is sometimes a matter of plain logical thinking. And often the solution is not in &#8220;blaming it on the server&#8221;, but looking at the blog in front of you. The &#8220;adder under the stone&#8221; might be right in front of you, with the venom in juuuust a small plug-in.</p>
<p>And I don&#8217;t mean to blame this &#8220;Popular Post&#8221; plugin. It does what it is supposed to do. But it does not work for high volume sites.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.blogtips.org/how-one-small-plug-can-slow-blog/feed/</wfw:commentRss>
		<slash:comments>24</slash:comments>
		</item>
		<item>
		<title>How to import a Tumblr blog into WordPress</title>
		<link>http://www.blogtips.org/how-to-import-a-tumblr-blog-into-wordpress/</link>
		<comments>http://www.blogtips.org/how-to-import-a-tumblr-blog-into-wordpress/#comments</comments>
		<pubDate>Wed, 29 Dec 2010 17:00:17 +0000</pubDate>
		<dc:creator>Peter</dc:creator>
				<category><![CDATA[Advanced Stuff]]></category>
		<category><![CDATA[Geeky Stuff]]></category>
		<category><![CDATA[How to... Stuff]]></category>
		<category><![CDATA[tips]]></category>
		<category><![CDATA[tools]]></category>
		<category><![CDATA[Tumblr]]></category>
		<category><![CDATA[WordPress]]></category>

		<guid isPermaLink="false">http://www.blogtips.org/?p=1565</guid>
		<description><![CDATA[I feel like standing in front of a rail crossing, when the red lights just won&#8217;t go off. Is it worth driving to the next rail crossing, just a minute further down the road? Are the lights defective, or is the crossing closed for a true reason? The longer I wait, the less it will [...]]]></description>
			<content:encoded><![CDATA[<p></p><p><img class="aligncenter" title="Tumblr to WordPress" src="http://theroadtothehorizon.net/photo/tumblrwordpress.jpg" alt="Tumblr to WordPress" width="200" height="200" /></p>
<p>I feel like standing in front of a rail crossing, when the red lights just won&#8217;t go off. Is it worth driving to the next rail crossing, just a minute further down the road? Are the lights defective, or is the crossing closed for a true reason? The longer I wait, the less it will be worth driving off.</p>
<p>As a metaphor, that is how I feel waiting for <a href="/tumblr-problems/" target="_self">Tumblr to get its act together after months of outages</a>. I have a dozen blogs on <a href="http://tumblr.com" target="_blank">Tumblr</a>, many of them aggregators, creating blogposts from imported RSS feeds. I know <a href="http://wordpress.org" target="_blank">WordPress.org</a> (the self-hosted version) has a cute aggregation plug-in that does the same job, so should I move to WordPress, or wait for Tumblr to solve its problems?</p>
<p>Each of my Tumblr blogs has thousands of posts, so migrating them won&#8217;t be a small feat. Plus I will have to install a new blog on my server, redo the theme-ing, install the plug-ins etc.. before the blog goes live. Maybe Tumblr will solve its problems tomorrow? Or the day after, or the next? Or maybe I will discover new problems with the WordPress aggregator tool that will keep me busy for days too&#8230;?</p>
<p>Well, ladies and gentlemen, after standing in front of the rail crossing for weeks, I decided to move one blog as a trial. <a href="http://theweirdbit.tumblr.com/" target="_blank">The original blog</a> (I left its remains on Tumblr) contained 20,000 blogposts ( ! ), and I did not want to lose those. So one of the main challenges to move my Tumblr blog to WordPress was to migrate all blogposts. All 20,000 of them&#8230;</p>
<p>Here is how I did it:</p>
<p><span id="more-1565"></span></p>
<h4>There are three steps in migrating your Tumblr blogposts to WordPress:</h4>
<p>Step 1: Export your Tumblr posts<br />
Step 2: Process your exported Tumblr posts (optional)<br />
Step 3: Import the export file into WordPress</p>
<h4>Step 1: How to export your Tumblr blogposts</h4>
<p>Tumblr does not feature an &#8220;export function&#8221;. <a href="http://tumblring.net/how-to-import-and-display-your-tumblr-posts-on-wordpress-self-hosted/" target="_blank">I found a list of possibilities</a>, but none really suited me, until I stumbled upon  <a href="http://benapps.net/" target="_blank">Tumblr2WordPress</a> (by <a href="http://benward.me/" target="_blank">Ben Ward</a>). And Ben saved my day. </p>
<p>Just run <a href="http://benapps.net/" target="_blank">Tumblr2WordPress</a>, enter your blog&#8217;s Tumblr subdomain (don&#8217;t use your custom domain), select if you want to export to WordPress.com (the WordPress hosted blogs) or WordPress.org (for <a href="/selecting-a-blog-platform-selfhost-your-blog-or-not/">selfhosted blogs</a>) and &#8230; click export. The exported posts will be downloaded onto your PC, as an .XML file&#8230;<br />
That should do it for most Tumblr blogs.</p>
<p>(<span style="color: #ff00ff;">Update March 1, 2011:</span> Ben&#8217;s source code is still available, but the executable program is no longer available on this link. You can still run similar code from <a href="http://tumblr2wp.com/" target="_blank">Tumblr2WP</a> or <a href="http://haochen.me/tumblr/" target="_blank">Tumble2WordPress</a> &#8211; With thanks to Aaron and Parneix for the updates)</p>
<p>If you get an error &#8220;Tumblr API Request Failed&#8221;, this means that -once again- Tumblr is failing (as it has done frequently in the past months), and the API request to export the posts gives an error. Try it out manually with a URL like: <code>http://yourtumblrblog.tumblr.com/api/read?start=0&amp;num=50</code> &#8211; If you get an error, the only thing you can do is &#8220;try again later&#8221;.<br />
At the time of writing, it seems Tumblr is blocking API calls during the US daytime (afternoons and evenings mostly)&#8230;</p>
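<p>To test more than the first page by hand, you can generate the whole series of request URLs. This is a minimal Python sketch that only builds the URLs and fetches nothing. The function name <code>page_urls</code>, the post count and the page size of 50 are assumptions for illustration, and I make no promise that the endpoint answers on any given day.</p>

```python
from urllib.parse import urlencode

def page_urls(subdomain, total_posts, page_size=50):
    """Build one /api/read URL per page of posts, without fetching anything."""
    base = "http://{}.tumblr.com/api/read".format(subdomain)
    return [
        base + "?" + urlencode({"start": start, "num": page_size})
        for start in range(0, total_posts, page_size)
    ]

# Hypothetical blog with 120 posts: three pages, starting at 0, 50 and 100.
for url in page_urls("yourtumblrblog", 120):
    print(url)
```

<p>Paste any of the printed URLs into your browser: if one page fails while the others work, the problem sits on the Tumblr side, not with the export tool.</p>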
<p>If you are a freak, like me, and have 20,000 posts in your blog, Ben&#8217;s routine might give a time-out. I had to <a href="http://github.com/benward/tumblr2wordpress">download the PHP source code</a> and install it on one of my servers, so I could dramatically increase the system resources. For the nerds amongst you, I put the PHP code in a subdirectory, and added a php.ini file to it, with the following parameters:</p>
<blockquote><p><code>upload_max_filesize = 20M<br />
post_max_size = 30M<br />
memory_limit = 400M<br />
max_execution_time = 600</code></p></blockquote>
<p>&#8230; but again, 99% of you might not have 20,000 blogposts, so Ben&#8217;s hosted routine will do just fine.</p>
<h4>Step 2: Processing your exported Tumblr posts</h4>
<p>In normal circumstances, you can skip this step, but if you are a purist, like me, you might want to clean up the .XML file a bit to avoid some issues when importing the file in step 3.</p>
<p>You can edit the .XML file with a normal ASCII editor (WordPad does just fine for me, the simple Windows XP user). Each post is stored between <code>&lt;item&gt;... &lt;/item&gt;</code> tags.</p>
<p>For each post, you will need to clean up two things with a simple search and replace:</p>
<p><span style="text-decoration: underline;">One: Clean up the category tags</span><br />
In some cases, the WordPress importer will create a single category for each imported post. Import 100 posts, and you will get 100 junk categories. While those are easy to clean up after importing the posts, it is better to avoid the problem than to cure it.<br />
The only thing you need to do, is to delete the two category tag lines, for each post:</p>
<blockquote><p><code>&lt;category&gt;&lt;![CDATA[link]]&gt;&lt;/category&gt;<br />
&lt;category domain="category" nicename="link"&gt;&lt;![CDATA[link]]&gt;&lt;/category&gt;</code></p></blockquote>
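<p>With thousands of posts, a search and replace by hand gets old quickly. Here is a minimal Python sketch that drops both line variants in one go. It assumes each category tag sits on its own line and that the category value is &#8220;link&#8221;, as in the example above; the function name <code>drop_link_categories</code> is made up, and you would adapt the pattern if your export uses other values.</p>

```python
import re

# One whole <category> line whose CDATA value is "link", with or without attributes.
CATEGORY_LINE = re.compile(
    r"^[ \t]*<category[^>]*><!\[CDATA\[link\]\]></category>[ \t]*\r?\n?",
    re.MULTILINE,
)

def drop_link_categories(xml_text):
    """Remove the junk category lines and leave everything else untouched."""
    return CATEGORY_LINE.sub("", xml_text)
```

<p>Run it on a copy of the export file first, and compare the number of <code>&lt;item&gt;</code> tags before and after to make sure no post went missing.</p>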
<p><span style="text-decoration: underline;">Two: clean up the date warnings.</span><br />
Under some circumstances, you will get a date warning in the .XML file:</p>
<blockquote><p><code>&lt;wp:post_date&gt;&lt;br /&gt;<br />
&lt;b&gt;Warning&lt;/b&gt;:  date() [&lt;a href='function.date'&gt;function.date&lt;/a&gt;]: It is not safe (blabla)<br />
2010-12-24 12:00:58&lt;/wp:post_date&gt;</code></p></blockquote>
<p>Just search for that string, and replace it with the date you find for each post, for example:</p>
<blockquote><p><code>&lt;wp:post_date&gt;2010-12-24 12:00:58&lt;/wp:post_date&gt;</code></p></blockquote>
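<p>Since the date differs for each post, a plain search and replace means one edit per post. A regex can keep the date and throw away the warning in front of it. This is a minimal Python sketch under the assumption that the real date is always the last thing before the closing tag, as in the example above; the function name <code>clean_post_dates</code> is hypothetical.</p>

```python
import re

# A post_date element whose content ends in "YYYY-MM-DD HH:MM:SS",
# possibly preceded by warning text. The lookahead stops the match
# from running past the element's own closing tag.
POST_DATE = re.compile(
    r"<wp:post_date>(?:(?!</wp:post_date>).)*?"
    r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\s*</wp:post_date>",
    re.DOTALL,
)

def clean_post_dates(xml_text):
    """Keep only the date inside every wp:post_date element."""
    return POST_DATE.sub(r"<wp:post_date>\1</wp:post_date>", xml_text)
```

<p>Elements that are already clean pass through unchanged, so it is safe to run this over the whole file.</p>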
<p><span style="text-decoration: underline;">Three: Split up the file in smaller chunks</span><br />
Oh, and yes, there is a third thing, before I forget: WordPress cannot import files larger than 8 MB. So if your .XML export file is larger than 8 MB, split it into several smaller files.</p>
<p>Beware: Each file should contain the header section, which starts with</p>
<blockquote><p><code>&lt;?xml version="1.0" encoding="UTF-8" ?&gt;</code></p></blockquote>
<p>and ends with:</p>
<blockquote><p><code>&lt;wp:category&gt;<br />
&lt;wp:category_nicename&gt;uncategorized&lt;/wp:category_nicename&gt;<br />
&lt;wp:category_parent&gt;&lt;/wp:category_parent&gt;<br />
&lt;wp:cat_name&gt;&lt;![CDATA[Uncategorized]]&gt;&lt;/wp:cat_name&gt;<br />
&lt;/wp:category&gt;</code></p></blockquote>
<p>and each file should end with:</p>
<blockquote><p><code>&lt;/channel&gt;<br />
&lt;/rss&gt;</code></p></blockquote>
<p>In other words: create a series of small files which together contain all posts (each between <code>&lt;item&gt;... &lt;/item&gt;</code> ), and paste the file header and the end tags into every one of them.</p>
<p>Hey, and forget all of this if your .XML file is smaller than 8 MB!</p>
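<p>If your file is larger and you do not fancy cutting and pasting by hand, the splitting can be scripted. The Python sketch below is a minimal version of the idea, not a tested migration tool. It assumes the posts are plain <code>&lt;item&gt;</code> elements sitting one after the other, treats everything before the first post as the header and everything after the last post as the end tags; <code>split_export</code> and <code>max_bytes</code> are names I made up.</p>

```python
import re

# One post, plus any whitespace that follows it.
ITEM = re.compile(r"<item>.*?</item>\s*", re.DOTALL)

def split_export(xml_text, max_bytes=8 * 1024 * 1024):
    """Split an export into complete files, each under max_bytes where possible."""
    matches = list(ITEM.finditer(xml_text))
    if not matches:
        return [xml_text]
    header = xml_text[: matches[0].start()]   # everything before the first post
    footer = xml_text[matches[-1].end():]     # the closing channel and rss tags
    overhead = len(header.encode("utf-8")) + len(footer.encode("utf-8"))

    chunks, current, size = [], [], overhead
    for match in matches:
        item = match.group(0)
        item_size = len(item.encode("utf-8"))
        # Start a new file when the next post would push this one over the limit.
        if current and size + item_size > max_bytes:
            chunks.append(header + "".join(current) + footer)
            current, size = [], overhead
        current.append(item)
        size += item_size
    chunks.append(header + "".join(current) + footer)
    return chunks
```

<p>Each returned string is a complete file with header and end tags, ready to be saved and imported on its own. A single post bigger than the limit still ends up in a file of its own, so check the file sizes before importing.</p>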
<h4>Step 3: Import your .XML file into WordPress</h4>
<p>Now for the fun (and easy) part: Import your .XML file with the WordPress Importer utility ( Dashboard &gt; Tools &gt; Import ).</p>
<p>The importer gives you a series of input formats. Select &#8220;WordPress&#8221;. And if you don&#8217;t have that plugin, you will have to install it first, with one click (don&#8217;t you just love WordPress? In Drupal, that would cost you four hours of work.. <img src='http://www.blogtips.org/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' />  )&#8230;</p>
<p>Now you are ready to import your .XML file:</p>
<div class="wp-caption aligncenter" style="width: 400px">
	<img title="WordPress importer" src="http://theroadtothehorizon.net/photo/wordpress%20import.jpg" alt="WordPress importer" width="400" height="135" />
	<p class="wp-caption-text">Ready for the WordPress magic?</p>
</div>
<p>WordPress will give you the option to define the &#8220;blog user&#8221; name you want as the author for the imported posts, chew on things for a while, and in the end, list the names of all the posts it has imported.</p>
<p>And ready you are&#8230;</p>
<p>In my case, I converted <a href="http://theweirdbit.tumblr.com/">this Tumblr blog</a> into <a href="http://theweirdbit.org/">this WordPress blog</a> in a day, including the import of 20,000 blogposts, the theme-ing and all other WordPress goodies involved in starting a new selfhosted blog&#8230;</p>
<p>Pretty neat, no?</p>
<p><span style="color: #ff00ff;">Update Jan 18, 2011:</span><br />
Important remark (as per comment from Parneix below):<br />
Unlike WordPress to WordPress import where media (video, pictures) which are in the input blog&#8217;s media library are also imported into the new blog&#8217;s media library, images in posts exported from Tumblr and imported to WordPress will be &#8220;hotlinked&#8221;. This means they won&#8217;t be copied into the new WordPress blog&#8217;s media library, and will actually continue to link to the Tumblr media library.<br />
As long as you keep your old Tumblr blog alive (don&#8217;t delete the Tumblr blog nor any of the posts), that should not be a problem, even though it is a bit of a drag&#8230;<br />
The workaround is to automatically import hotlinked images to your local media library using <a href="http://wordpress.org/extend/plugins/add-linked-images-to-gallery-v01/" target="_blank">this plugin</a> for instance.</p>
<p>Thanks Parneix for the rectification!</p>
<p><span style="color: #ff00ff;">Update March 1, 2011:</span><br />
Ben&#8217;s source code is still available, but the executable program is no longer available on the link I provided in this post. You can still run similar code from <a href="http://tumblr2wp.com/" target="_blank">Tumblr2WP</a> or <a href="http://haochen.me/tumblr/" target="_blank">Tumble2WordPress</a>.<br />
- With thanks to Parneix and Aaron for the updates</p>
]]></content:encoded>
			<wfw:commentRss>http://www.blogtips.org/how-to-import-a-tumblr-blog-into-wordpress/feed/</wfw:commentRss>
		<slash:comments>67</slash:comments>
		</item>
		<item>
		<title>PHP 5.3.2, WordPress and The Mystery of The Disappearing Permalinks Settings Page&#8230;</title>
		<link>http://www.blogtips.org/php-532-permalinks-and-wordpress/</link>
		<comments>http://www.blogtips.org/php-532-permalinks-and-wordpress/#comments</comments>
		<pubDate>Sat, 18 Dec 2010 07:38:46 +0000</pubDate>
		<dc:creator>Peter</dc:creator>
				<category><![CDATA[Advanced Stuff]]></category>
		<category><![CDATA[Geeky Stuff]]></category>
		<category><![CDATA[How to... Stuff]]></category>

		<guid isPermaLink="false">http://www.blogtips.org/?p=1534</guid>
		<description><![CDATA[Remember I once wrote about the choice of selfhosting your blog or not? No? Well I did. And if you choose to selfhost a blog, some of the issues you need to be aware of are (drum roll): the work involved to maintain a selfhosted blog, protecting your blog from hackers, and dealing with (g)hosting [...]]]></description>
			<content:encoded><![CDATA[<p></p><div class="wp-caption aligncenter" style="width: 430px">
	<img title="French Bulldog" src="http://theroadtothehorizon.net/photo/french%20bulldog%203.jpg" alt="French Bulldog" width="430" height="286" />
	<p class="wp-caption-text">Mystery, Hitchcock, PHP: Sometimes &quot;ugly&quot; can be &quot;beautiful&quot; too</p>
</div>
<p>Remember I once wrote about <a href="/selecting-a-blog-platform-selfhost-your-blog-or-not/">the choice of selfhosting your blog or not</a>? No? Well I did. And if you choose to selfhost a blog, some of the issues you need to be aware of are (drum roll):</p>
<ul>
<li><a href="/selfhosting-your-blog-or-not/">the work involved to maintain a selfhosted blog</a>,</li>
<li><a href="/selfhosting-or-not-hackers/">protecting your blog from hackers</a>, and</li>
<li><a href="/shared-hosting-pay-peanuts-get-monkeys/">dealing with (g)hosting issues</a></li>
</ul>
<p>Well, here is another example of how a problem with self-hosted blogs can send you on a debugging trail as vicious, addictive and poisoning as a true mystery story, in the true art of a Hitchcock or Sherlock Holmes thriller:</p>
<p><span id="more-1534"></span>I use permalinks in all my WordPress blogs to convert the ugly and SEO-unfriendly default WordPress URLs:</p>
<blockquote><p><code>http://www.blogtips.org/?p=123</code></p></blockquote>
<p>into something nice and smiling, like:</p>
<blockquote><p><code>http://www.blogtips.org/how-to-evaluate-a-blog-introduction</code></p></blockquote>
<p>But since I moved BlogTips from <a href="http://www.blogtips.org/shared-hosting-pay-peanuts-get-monkeys/">its old GoDaddy (or &#8220;SlowDaddy&#8221;) to its new Hostgator host</a>, I noticed that the permalinks settings page returned empty.</p>
<div class="wp-caption aligncenter" style="width: 288px">
	<img title="permalinks" src="http://theroadtothehorizon.net/photo/permalinks.jpg" alt="permalinks" width="288" height="57" />
	<p class="wp-caption-text">The Mystery of the Disappearing Permalinks</p>
</div>
<p>This was the start of my latest online mystery adventure.</p>
<p>As I picked up my pipe and magnifying glass, I realized: &#8220;I must exclude all variables (as a good IT debugger should do)&#8221;. Thus, I installed a new plain vanilla WordPress test site, without any plug-ins. And sure enough, the permalinks settings page was blank.</p>
<p>To my mind did not come the thought &#8220;Man, That Sucks&#8221;, but &#8220;Good, now I know it had nothing to do with any weird plug-in..&#8221;. So what caused it then, The Damned Permalinks Page to Disappear?</p>
<p>Googling <em>&#8220;WordPress Permalinks page blank&#8221;</em>, I found <a href="http://wordpress.org/support/topic/blank-permalink-admin-page-after-changing-permalink-type?replies=52">this post on the WordPress forum</a>, which seems to point to a PHP problem:</p>
<blockquote><p>Had this same problem, took us forever to get our issue resolved. We did this: Rebuilt apache (using easyapache within cPanel) with version 5.2.x instead of 5.3.x</p></blockquote>
<p>So I narrowed my search, and googled <em>&#8220;wordpress permalinks page blank, PHP 5.3&#8221;</em> &#8230;to find <a href="http://wordpress.org/support/topic/php-warning-php-startup-unable-to-load-dynamic-library?replies=10">this post on the forum</a>:</p>
<blockquote><p>I have the same problem here (only the pdo_sqlite.so error), apparently it is a php5.3 bug.</p></blockquote>
<p>So maybe I should have a look what PHP version I am running on my server&#8230;</p>
<p>Never done that before? What? Saying you never created a script to check your PHP settings is like saying.. &#8220;I never had sex under the shower&#8221;. It is easy, and fun: Create an info.php file in the root directory of your blog containing:</p>
<blockquote><p><code>&lt;?php<br />
phpinfo();<br />
?&gt;</code></p></blockquote>
<p>.. and execute it as: <code>http://yourblog.com/info.php</code>.</p>
<p>You&#8217;ll get a (whooooopy!) long list of all PHP settings, including the version (at the top. No, not &#8220;on top&#8221;, &#8220;at the top&#8221;). Mine said:</p>
<p><img class="aligncenter" title="PHP version" src="http://theroadtothehorizon.net/photo/php%20version.jpg" alt="PHP version" width="430" height="139" /></p>
<p>Yikes, PHP 5.3.2? Well, in the previous forum post, they talked about a &#8220;pdo_sqlite.so error&#8221;, so I wondered if I could find any error in the PHP error log (significantly called &#8220;error_log&#8221;) on my server, located in the directory <code>/var/log/httpd/</code>.<br />
And indeed, I found the error: <code>/usr/local/lib/php/extensions/no-debug-non-zts-20090626/pdo_sqlite.so: undefined symbol: sqlite3_libversion in Unknown</code></p>
<p>Cool. I found the problem. Now how to solve it?</p>
<p>Googling: <em>&#8220;undefined symbol: sqlite3_libversion&#8221;</em>, I found <a href="http://forums.cpanel.net/f5/undefined-symbol-sqlite3_libversion-148993.html#post632357">this post on the cPanel forum</a>. Gee, these people seem to know what they are talking about, using terms like: &#8220;EasyApache&#8221;, &#8220;build profiles&#8221; and &#8220;mismatched shared extension&#8221;. One post said:</p>
<blockquote><p>The only solution I am aware of at the present time is to either disable the PDO SQLite3 extension, by editing the system PHP configuration file &#8220;php.ini&#8221; (at &#8220;/usr/local/lib/php.ini&#8221;), or downgrade to PHP 5.2</p></blockquote>
<p>I checked with my hosting support, and they recommended downgrading to PHP 5.2&#8230; Now, Watson, downgrading software is never as easy as upgrading. I did not want to introduce new unknown problems, while solving known problems. Like digging a new hole to fill an old hole, right?</p>
<p>So I had a look at the system php.ini file. On my server, it is located at: <code>/usr/local/lib/php.ini</code></p>
<p>At the bottom, I found something like this:</p>
<blockquote><p><code>extension=pdo_sqlite.so<br />
extension=sqlite.so </code></p></blockquote>
<p>I commented it out:</p>
<blockquote><p><code>;;extension=pdo_sqlite.so<br />
;;extension=sqlite.so </code></p></blockquote>
<p>And saved the file. Reloaded the permalink settings page and voila:</p>
<p><img class="aligncenter" title="permalinks page" src="http://theroadtothehorizon.net/photo/permalinks%20page.jpg" alt="permalinks page" width="430" height="221" /></p>
<p>It worked&#8230;. The end of the Sherlock Holmes Permalinks murder story: <em><strong>If your permalinks page returns blank, and you use PHP 5.3.2, just comment out the &#8220;sqlite.so&#8221; settings in your system php.ini file.</strong></em> Mystery solved.</p>
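<p>If you have several servers to fix, that edit can be scripted too. Here is a minimal Python sketch that comments out only the two sqlite extension lines and leaves the rest of the file alone. The function name <code>comment_out_sqlite</code> is made up, it puts a single semicolon in front of each line (one is enough for php.ini, I just happened to type two), and you should work on a backup copy of php.ini, never the live file.</p>

```python
import re

# An active "extension=" line that loads pdo_sqlite.so or sqlite.so.
SQLITE_EXTENSION = re.compile(r"^\s*extension\s*=\s*(?:pdo_)?sqlite\.so\s*$")

def comment_out_sqlite(ini_text):
    """Prefix the active sqlite extension lines with ';' and keep everything else."""
    out = []
    for line in ini_text.splitlines(keepends=True):
        if SQLITE_EXTENSION.match(line):
            line = ";" + line
        out.append(line)
    return "".join(out)
```

<p>Lines that are already commented out do not match the pattern, so running it twice does no harm.</p>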
<p>- &#8230;.?<br />
- What?<br />
- &#8230;.<br />
- &#8220;Sometimes ugly can be beautiful too&#8221;?<br />
- &#8230;<br />
- Oh, right, what I meant with the caption from the title page &#8220;Sometimes ugly can be beautiful too&#8221;?<br />
- &#8230;!<br />
- Right, sorry, forgot about that.</p>
<p>What I meant was: I had a server problem and solved that problem, after a long search. But by solving that problem, I also resolved another problem: For ages the thumbnails did not appear anymore on my blog&#8217;s archive page. Never been able to find that one. By solving the permalinks problem, it seemed the thumbnails re-appeared.</p>
<p>And that is the beauty of problems with self-hosting: by solving one problem, you might incidentally also solve another.</p>
<p>- &#8230;!<br />
- You are welcome!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.blogtips.org/php-532-permalinks-and-wordpress/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>GoDaddy sites hacked again</title>
		<link>http://www.blogtips.org/godaddy-sites-hacked-again/</link>
		<comments>http://www.blogtips.org/godaddy-sites-hacked-again/#comments</comments>
		<pubDate>Sat, 18 Sep 2010 00:10:58 +0000</pubDate>
		<dc:creator>Peter</dc:creator>
				<category><![CDATA[Geeky Stuff]]></category>
		<category><![CDATA[How to... Stuff]]></category>
		<category><![CDATA[hackers]]></category>
		<category><![CDATA[hosting]]></category>
		<category><![CDATA[security]]></category>
		<category><![CDATA[tips]]></category>
		<category><![CDATA[tools]]></category>

		<guid isPermaLink="false">http://www.blogtips.org/?p=1373</guid>
		<description><![CDATA[After the massive hacks injecting malware into shared hosted sites from several providers back in April and May, it seems they are back at work. Many sites hosted by GoDaddy are being hacked at the moment I am writing this post. Two of mine were affected an hour ago. Update: Hit again this morning (Sept [...]]]></description>
			<content:encoded><![CDATA[<p></p><p><img class="aligncenter" title="hacker" src="http://theroadtothehorizon.net/photo/kaos_hacker03.jpg" alt="hacker" width="400" height="300" /></p>
<p>After the massive hacks injecting malware into shared hosted sites from several providers back in April and May, it seems the hackers are back at work.</p>
<p>Many sites hosted by GoDaddy <a href="http://blog.sucuri.net/2010/09/godaddy-sites-hacked-myblindstudioinfoonline-com-and-hilary-kneber.html" target="_blank">are being hacked at the moment I am writing this post</a>. Two of mine were affected an hour ago.</p>
<p><span style="color: #ff6600;"><strong>Update:</strong> Hit again this morning (Sept 21).<br />
Here is a record of the September virus spree, as I saw on my sites (all CET &#8211; Central European Time):</span></p>
<ul>
<li><span style="color: #ff6600;">Friday Sept 17, 2010 &#8211; 23:30 CET</span></li>
<li><span style="color: #ff6600;">Tuesday Sept 21, 2010 &#8211; 08:30 CET</span></li>
</ul>
<p><span id="more-1373"></span>The scenario is <a href="http://www.blogtips.org/how-to-cure-your-godaddy-wordpress-hacked-blog/">the same as a few months ago</a>: Malware is injected into the .php files on the hosted sites, and the visitors of a site are redirected to a third website which injects a virus into the visitor&#8217;s computer.</p>
<p>At this moment, it seems other hosting providers were/are attacked too, so monitor your blogs. <a href="http://www.blogtips.org/how-to-check-if-your-blog-is-infected-with-malware/">Check regularly whether they are infected</a> during the next days. If you get infected, run the script <a href="http://www.blogtips.org/godaddy-hacked-again-another-way-to-cure/">from this post</a>, and your site will be cured in a minute.<br />
You can also use the same script to verify if your site was infected. If you get the message</p>
<blockquote><p><code>0 Infected Files in ./</code></p></blockquote>
<p>&#8230; then your site is clean. If you get a list of infected files, click &#8220;Fix Files&#8221;, and within a few seconds, your site will be cleaned up. If you use a cache-plugin, don&#8217;t forget to clear your cache!</p>
<p>Note that if your site was infected, and you loaded the site yourself, your computer might be infected too. Many antivirus programmes (McAfee, Norton,..) will NOT catch the infection. Download the free malware scanner from <a href="http://www.malwarebytes.org/" target="_blank">MalWareBytes</a> to verify and cure the infection.</p>
<p>Best of luck to you.</p>
<p>Picture courtesy <a href="http://www.thenewnewinternet.com" target="_blank">thenewnewinternet</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.blogtips.org/godaddy-sites-hacked-again/feed/</wfw:commentRss>
		<slash:comments>33</slash:comments>
		</item>
	</channel>
</rss>

