Thursday, December 08, 2005


I am currently involved in a project to update the useless-knowledge website. Who knows what will really happen to the site, but one thing is clear - in a major update, many people may want to leave the old articles behind.

This is very disturbing to me - these articles represent a lot of work by a lot of people: over 10,000 articles, written by who knows how many people.

So I have begun by getting a list of all the articles into my database, and now comes the slow process of grabbing the content of each one and storing it there too. This way, if and when a conversion happens, the data will already be in a database and can be applied to the newer format.

I am currently on the 143rd article going into my database. This is going to take some time, because I have found no completely automated way to place the articles in the database. I could grab the entire page, but that isn't particularly useful.

I am just grabbing the article content and whatever HTML is associated with the article.
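
As a rough illustration of the grab-and-store step, here is a minimal Python sketch. It assumes each page wraps the article body in a pair of recognizable marker comments, which may well not hold for the real site (hence the lack of a fully automated route). The marker strings, the table layout, and the function names are all made up for the example.

```python
import sqlite3

# Hypothetical markers; the real pages may have nothing this tidy.
START_MARKER = "<!-- article start -->"
END_MARKER = "<!-- article end -->"


def extract_article_html(page_html):
    """Return the HTML between the two markers, or None if either is missing."""
    start = page_html.find(START_MARKER)
    if start == -1:
        return None
    start += len(START_MARKER)
    end = page_html.find(END_MARKER, start)
    if end == -1:
        return None
    return page_html[start:end].strip()


def open_archive(path=":memory:"):
    """Open (or create) the archive database with an assumed two-column table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS articles ("
        " url TEXT PRIMARY KEY,"
        " content_html TEXT NOT NULL)"
    )
    return conn


def save_article(conn, url, page_html):
    """Store the article body; return False if the page needs manual handling."""
    content = extract_article_html(page_html)
    if content is None:
        return False
    conn.execute(
        "INSERT OR REPLACE INTO articles (url, content_html) VALUES (?, ?)",
        (url, content),
    )
    conn.commit()
    return True
```

Pages where `save_article` returns `False` are the ones that would still have to be handled by hand. Keeping the article's own HTML intact, rather than flattening it to plain text, leaves the formatting available for whatever the newer format turns out to be.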

We'll see how long I stick with this and how much time is involved.

