Using screen scraping to expose legacy web pages in RSS

As part of my (almost) daily drive to and from one of my clients I pass through the sub-sea Oslofjord tunnel (Oslofjordtunnelen). Now what has driving got to do with screen scraping and RSS, you say?

Hang on, I’m getting there.

Below is a map extract that shows part of my route. The topmost pin is where I start out, the bottom-most pin is the Oslofjord tunnel. The pin in the middle is where, more often than not, a sign shows up stating that the tunnel is closed for maintenance. You can imagine my frustration when I’m forced to drive all the way back north to get around the Oslofjord!

Map picture

To avoid that pain I set out to find a feed with traffic status updates and ended up at this page published by the Norwegian Public Roads Administration (NPRA). The page has regularly updated traffic information (all in Norwegian, mind you) but, to my frustration, only as static web pages. No feed in sight!

This, at last, is where screen scraping comes into play. You could write your own scraper in any modern runtime these days. But being a good/lazy developer, I know there are already quite good services out there that make it a breeze to set up feeds with data scraped off web pages. And lo and behold, I now burn a feed with the latest traffic updates!
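For the curious, here is roughly what a hand-rolled scraper could look like. This is a minimal Python sketch and not what I actually used: the sample HTML, the `li class="message"` markup, the function names and the feed title and link are all made up for illustration, and the real NPRA page is certainly structured differently.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Hypothetical markup standing in for a downloaded page;
# the real NPRA page will look different.
SAMPLE_HTML = """
<ul>
  <li class="message">Oslofjordtunnelen: closed for maintenance</li>
  <li class="message">E6: queue northbound</li>
  <li class="other">Not a traffic message</li>
</ul>
"""


class MessageScraper(HTMLParser):
    """Collects the text of every <li class="message"> element."""

    def __init__(self):
        super().__init__()
        self.messages = []
        self._inside = False

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) tuples
        if tag == "li" and ("class", "message") in attrs:
            self._inside = True
            self.messages.append("")

    def handle_endtag(self, tag):
        if tag == "li":
            self._inside = False

    def handle_data(self, data):
        if self._inside:
            self.messages[-1] += data


def scrape_messages(html):
    """Returns the scraped message texts, stripped of whitespace."""
    scraper = MessageScraper()
    scraper.feed(html)
    return [m.strip() for m in scraper.messages]


def build_rss(title, link, messages):
    """Wraps the scraped messages in a bare-bones RSS 2.0 document."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = title
    for message in messages:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = message
    return ET.tostring(rss, encoding="unicode")


if __name__ == "__main__":
    messages = scrape_messages(SAMPLE_HTML)
    print(build_rss("Traffic updates", "http://example.com/traffic", messages))
```

In real life you would fetch the page on a schedule and serve the resulting XML somewhere a feed reader can reach it, which is exactly the plumbing the hosted services take care of.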

So… basking in the glory of my genius for a couple of days I thought it a good idea to write up this blog post for the greater good of mankind. To make the story a bit shorter, what I discovered while rummaging around the NPRA site is that they indeed have great support for RSS!

Feeling a bit stupid I will now go and redirect my FeedBurner setup…and please let me know if there is a moral to this story.

Good night.


Going to Microsoft PDC 2009!

Only one year after the very successful PDC 08, I find myself (and my company) going to Los Angeles once more. PDC 09 looks promising, with sessions covering the .NET Framework 4, Visual Studio and Team System 2010, Windows Azure, DirectX 11, Silverlight 3, and much more.


Moving Peter’s Pattern

The time has come for the trusty Eternia server to let go of my blog. I’m giving it this new home (presumably faster and more stable) with, as you probably think too, a more befitting name.

Now, for those of us who didn’t realize the value of services like FeedBurner when setting up a new blog, I will highlight one advantage that I wish I had known when I started out blogging: exposing my feed via FeedBurner saves my faithful subscribers from having to update their subscription whenever the blog has to move.

The feed address is hereby http://feeds.feedburner.com/PetersPattern. I can promise you that this link will not change. That is, unless Google goes out of business 🙂


Team Build and drop location share permissions

I’ve recently banged my head against a simple, yet annoying problem with Team Build and the way build result files are published to the so-called drop location. In my case, this location is a share on our file server. The build service executes under a TFSBUILD account, which I have given the Co-Owner permission level on the share.

Everything builds and the results are published, yet all builds are reported as only partially successful. Digging around the somewhat unmanageable BuildLog.txt file, I see that test results from MSTest are published via a call to http://tfs:8080/Build/v1.0/PublishTestResultsBuildService2.asmx, which subsequently fails with this error:

  The results directory “\\fileserver\tfsbuilds\BuildFolder.1\TestResults” could not be created for publishing.

The solution is quite simple. Since the actual publishing of test results is done by the PublishTestResultsBuildService2 service, which executes under the TFSSERVICE account, that account also needs Co-Owner permissions on the drop location share.

Obvious? I think not 🙂


ASP.NET Profile performance or lack thereof

In a customer solution I’ve been working on, we use ASP.NET Membership and Profile to store information about users. We use profile data extensively in various reports and listings throughout the solution.

Putting the solution under some regular user activity, though, revealed really poor performance when producing reports. Some of these are large reports, mind you, so I had to go digging to figure out what was going on. I always check SQL Server activity first, looking for waiting processes and locks. And yes, there it was: an exclusive lock on aspnet_Users. Why? We’re only doing reads in these reports!

Digging further into which stored procedures touch the aspnet_Users table, I discovered that aspnet_Profile_GetProfileProperties actually updates the aspnet_Users.LastActivityDate column every time it is called. The stored procedure takes no parameter to control this behavior, so a quick solution to the problem was to remove the update from the procedure.

Of course, after figuring out what the problem was, I suspected that others had figured this out too. Go here for a view of the stored procedure before and after surgery.
