These are all quotes I have heard while watching our local Kansas City news. Let's face it: weather teams have the daunting task of predicting the future, and they are crucified when they get it wrong. Don't feel too bad for them, though; they are typically paid pretty well for being wrong a lot. It is a very tough job, and they do get it right more often than not. Weather teams in Kansas City seem to have an even tougher job given the setup of Kansas City weather.
The Plan
Measure the accuracy of the local weather stations' forecasts to see who is the most accurate. I wanted the process to be hands-off and completely automated, with no manual data entry. I was hoping the local weather stations had some kind of parsable forecast feed on their websites that I could use as the data source.
I then decided to compare the forecasts with the observed temperature at Kansas City International Airport (KCI). Somewhat arbitrarily, I picked 12:00PM as the time to compare the temps.
The Method
I started by reviewing the Hourly Forecast on each local weather website. Three of the four weather stations in town had the data in a text format I could scrape from their site. The fourth weather website had the hourly forecast in an image format. I am not skilled enough to figure out how to work with an image, so I just threw Fox4 out of the running.
I used Python to do the heavy lifting. Here are some sample scripts for each site.
KMBC9
KCTV5
KSHB41
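The general pattern behind all three scrapers can be sketched roughly as follows. This is a minimal illustration, not the actual script for any station: the HTML snippet, the `hour` and `temp` class names, and the `forecast_for` helper are all assumptions made up for the example, and a real page would need its own selectors.

```python
import re
from html.parser import HTMLParser

# Hypothetical markup. The real station pages almost certainly look different.
SAMPLE_HTML = """
<table id="hourly">
  <tr><td class="hour">11:00 AM</td><td class="temp">41&deg;</td></tr>
  <tr><td class="hour">12:00 PM</td><td class="temp">44&deg;</td></tr>
  <tr><td class="hour">1:00 PM</td><td class="temp">46&deg;</td></tr>
</table>
"""


class HourlyForecastParser(HTMLParser):
    """Collects {hour label: temperature} pairs from the assumed table layout."""

    def __init__(self):
        super().__init__()
        self._field = None   # "hour" or "temp" while inside a matching <td>
        self._buf = []       # text fragments of the current cell
        self._hour = None    # most recent hour label seen
        self.temps = {}

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            cls = dict(attrs).get("class")
            if cls in ("hour", "temp"):
                self._field = cls
                self._buf = []

    def handle_data(self, data):
        if self._field:
            self._buf.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._field:
            text = "".join(self._buf).strip()
            if self._field == "hour":
                self._hour = text
            else:
                # Keep only the number, dropping the degree sign.
                match = re.search(r"-?\d+", text)
                if match and self._hour is not None:
                    self.temps[self._hour] = int(match.group())
            self._field = None


def forecast_for(html, hour="12:00 PM"):
    """Return the forecast temperature for the given hour label, or None."""
    parser = HourlyForecastParser()
    parser.feed(html)
    parser.close()
    return parser.temps.get(hour)


print(forecast_for(SAMPLE_HTML))
```

In practice the HTML would come from fetching the station's hourly forecast page rather than from a hard-coded string, and returning `None` for a missing hour gives the caller a way to notice when a site redesign has broken the scraper.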
I then used Weather Underground to get the temp at 12:00PM at KCI. The Weather Underground temperature is collected at 11:53AM, which is close enough for my purposes. I could not find anywhere on the weather teams' websites which location their hourly forecasts are for. Again, this is not an exact calculation of accuracy, just a general ballpark. If your 12:00PM hourly forecast is off by 8 degrees at KCI, odds are the forecast was incorrect at other spots in the metro.
Wunderground
You can tell by the scripts above that this system is pretty fragile. Any small change in website design, CSS, object names, etc. will throw this off. I am hoping the thing at least works for a few weeks so I can collect some data.
The Python script executes at 11:58AM: late enough that the 11:53AM Wunderground temperature reading is complete, and early enough that the weather websites have not yet removed the 12:00PM hourly forecast from their pages.
Then at 12:15PM a job runs to measure the difference between the forecasted temperature and the observed temperature. The results are stored in a table.
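A minimal sketch of what that 12:15PM job could look like is below. It assumes accuracy is scored as the absolute difference in degrees, and it uses SQLite with a made-up `results` table; the table layout, column names, and function names are all hypothetical, since the post does not say which database or schema is used.

```python
import sqlite3


def open_results_db(path=":memory:"):
    """Open the database and create the (hypothetical) results table if needed."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS results (
            day         TEXT    NOT NULL,
            station     TEXT    NOT NULL,
            forecast_f  INTEGER NOT NULL,
            observed_f  INTEGER NOT NULL,
            abs_error_f INTEGER NOT NULL,
            PRIMARY KEY (day, station)
        )
        """
    )
    return conn


def record_result(conn, day, station, forecast_f, observed_f):
    """Store one station's noon forecast next to the observed temperature."""
    # Assumption: "accuracy" means how many degrees the forecast missed by.
    error = abs(forecast_f - observed_f)
    conn.execute(
        "INSERT INTO results (day, station, forecast_f, observed_f, abs_error_f) "
        "VALUES (?, ?, ?, ?, ?)",
        (day, station, forecast_f, observed_f, error),
    )
    conn.commit()
    return error


# Illustrative values only, not real measurements.
conn = open_results_db()
print(record_result(conn, "2000-01-01", "Station A", 44, 47))
```

The primary key on `(day, station)` means a job that accidentally runs twice on the same day fails loudly instead of double-counting a forecast.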
Results
I then front-ended the data with a little website to display the results. Keep in mind that I only have 3 days of data so far, so this is a very limited sample size. Again, I hope the system can work for a few weeks before it completely blows up.
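One plausible way to turn the stored rows into a "who is most accurate" ranking is the mean absolute error per station, with the lowest value winning. The sketch below assumes that scoring; the `rank_stations` function, the station names, and the temperatures are all placeholders for illustration, not actual results.

```python
from collections import defaultdict
from statistics import mean


def rank_stations(rows):
    """Rank stations by mean absolute error (degrees), best first.

    rows is an iterable of (station, forecast_f, observed_f) tuples.
    """
    errors = defaultdict(list)
    for station, forecast_f, observed_f in rows:
        errors[station].append(abs(forecast_f - observed_f))
    return sorted(
        ((station, mean(errs)) for station, errs in errors.items()),
        key=lambda pair: pair[1],
    )


# Placeholder rows, not real forecasts or observations.
sample_rows = [
    ("Station A", 44, 47),
    ("Station B", 40, 47),
    ("Station A", 50, 50),
    ("Station B", 52, 50),
]

for station, mae in rank_stations(sample_rows):
    print(f"{station}: off by {mae:.1f} degrees on average")
```

With only a few days of data, a single bad miss dominates the average, so a ranking like this says little until the sample grows.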