Introduction
It is no secret that I am very fond of Munin, and since I recently bought a car I wanted to know the cheapest gasoline price. I chose to write a Munin plugin to find the optimal time of day to buy gasoline. Luckily, the Danish organization FDM (United Danish Automobile Owners) maintains a user/operator-updated list of gasoline prices from all the gas stations in Copenhagen (or any other area of Denmark). These prices are the actual prices at the gas stations and are thus more accurate than the list prices maintained by the operators themselves. The list can be seen here:
http://www.fdmbenzinpriser.dk/searchprices/2/K%C3%B8benhavn
Update
I have updated the article with a plot of the mean prices per vendor to see whether some vendors are consistently more or less expensive than others.
Munin plugin
Munin is the perfect platform to sample these prices and show the time trends, so I wrote a simple Python-based plugin that scrapes the page for prices and calculates the minimum, maximum and mean prices.
#!/usr/bin/env python
import sys
import urllib2
from BeautifulSoup import BeautifulSoup

# Get the website that will be scraped into a variable
websiteurl = "http://www.fdmbenzinpriser.dk/searchprices/2/K%C3%B8benhavn"
soup = BeautifulSoup(urllib2.urlopen(websiteurl).read())

# Make a list of all the prices
prices = []
for price in str(soup.findAll('td', attrs={'class' : 'tablebodyprice'})).split('class="octanelink">')[1:]:
    if '*' in price.split('</a></td>')[0]:
        prices.append(float(price.split('</a></td>')[0].split('*')[0].strip()))
    else:
        prices.append(float(price.split('</a></td>')[0]))

# Find the minimum, maximum and mean prices
minval = min(prices)
maxval = max(prices)
mean = sum(prices, 0.0)/len(prices)

# Define the details of the graph
if len(sys.argv) > 1 and sys.argv[1] == "config":
    print "graph_title Gasoline price for Blyfri 95 in Copenhagen based on "+str(len(prices))+" prices."
    print "graph_args --base 1000 --rigid -l "+str(int(minval-2))+" -u "+str(int(maxval+2))
    print "graph_vlabel DKK/L"
    print "graph_category other"
    print "graph_info Gasoline prices for Blyfri 95 in the Copenhagen area. Prices scraped from: "+websiteurl
    print "minimum.label Minimum value"
    print "maximum.label Maximum value"
    print "mean.label Mean of all "+str(len(prices))+" prices."
    sys.exit()

# Print the values when munin calls the script.
print "minimum.value "+str(minval)
print "maximum.value "+str(maxval)
print "mean.value "+str(mean)
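To use it, the script can be placed in /etc/munin/plugins on the Munin node (typically as a symlink to wherever you keep it) and tested by hand with munin-run, both with and without the config argument, before restarting munin-node.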
New plugin – Mean gasoline price for Blyfri 95 per vendor in Copenhagen
In response to Ras' request, I have made a tweaked plugin that plots the mean gasoline price per vendor for the vendors in the Copenhagen area:
#!/usr/bin/env python
import sys
import urllib2
from BeautifulSoup import BeautifulSoup

# Get the website that will be scraped into a variable
websiteurl = "http://www.fdmbenzinpriser.dk/searchprices/2/K%C3%B8benhavn"
soup = BeautifulSoup(urllib2.urlopen(websiteurl).read())

vendorprice = {}
vendornumber = {}

# Make a list of all the vendors
vendors = []
for vendor in str(soup.findAll('td', attrs={'class' : 'tablebodyprice'})).split('<a href="/')[1:]:
    vendorstripped = vendor.split('/')[0]
    if vendorstripped not in vendors:
        vendors.append(vendorstripped)

rawsoup = soup.findAll('td', attrs={'class' : 'tablebodyprice'})

# Calculate the mean price and the number of observations per vendor
for vendor in vendors:
    prices = []
    number = 0
    for element in rawsoup:
        if vendor in str(element):
            elementraw = str(element).split('class="octanelink">')[1].split('</a></td>')[0]
            if '*' in elementraw:
                prices.append(float(elementraw.split('*')[0].strip()))
            else:
                prices.append(float(elementraw))
            number = number+1
    mean = sum(prices, 0.0)/len(prices)
    vendorprice[vendor] = mean
    vendornumber[vendor] = number

# Find the minimum and maximum for the plotting
allprices = []
for vendor in vendors:
    allprices.append(float(vendorprice[vendor]))
minval = min(allprices)
maxval = max(allprices)

# Define the details of the graph
if len(sys.argv) > 1 and sys.argv[1] == "config":
    print "graph_title Mean gasoline price for Blyfri 95 per vendor in Copenhagen"
    print "graph_args --base 1000 --rigid -l "+str(int(minval-2))+" -u "+str(int(maxval+2))
    print "graph_vlabel DKK/L"
    print "graph_category other"
    print "graph_info Gasoline prices for Blyfri 95 in the Copenhagen area. Prices scraped from: "+websiteurl
    for vendor in vendors:
        print vendor+".label mean "+vendor+" price based on "+str(vendornumber[vendor])+" values"
    sys.exit()

# Print the values when munin calls the script.
for vendor in vendors:
    print vendor+".value "+str(vendorprice[vendor])
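One thing to keep in mind (my own addition, not part of the plugin above): Munin field names may only contain letters, digits and underscores, so vendor names taken from the URLs, such as uno-x, should really be sanitised before they are used in the .label and .value lines. A minimal helper could look like this:

import re

def fieldname(vendor):
    # Replace every character Munin does not accept in a field name
    # with an underscore, e.g. 'uno-x' becomes 'uno_x'.
    return re.sub(r'[^A-Za-z0-9_]', '_', vendor)

# The output lines in the plugin would then read e.g.:
#   print fieldname(vendor)+".label mean "+vendor+" price based on "+str(vendornumber[vendor])+" values"
#   print fieldname(vendor)+".value "+str(vendorprice[vendor])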
Conclusion
Looking at the data from the first week of running the Munin plugin, it is clear that a sharp increase in prices occurs daily at 10 am, followed by a monotonic decrease until 10 am the next morning. The best time of day to buy gasoline in Copenhagen is hence just before 10 am. Once I have gathered data for a few months I might be able to find out whether there is a 'best' day of the week to buy gasoline, but I am unable to tell so far.
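Once enough data has accumulated, the weekday question could be answered directly from the RRD files that Munin maintains for the plugin. The following is a minimal sketch, assuming the rrdtool Python bindings are installed; the RRD path is hypothetical and depends on your node/group names and what the plugin file is called:

#!/usr/bin/env python
# Rough sketch (not part of the plugin): read the mean-price samples back
# out of Munin's RRD file and group them by weekday.
import datetime
import rrdtool

# Hypothetical path - adjust to your own munin group, host and plugin name.
rrdfile = "/var/lib/munin/localdomain/localhost-gasprice-mean-g.rrd"

# Fetch roughly three months of averaged samples from the RRD file.
(start, end, step), names, rows = rrdtool.fetch(rrdfile, "AVERAGE", "--start", "-90d")

byweekday = {}
timestamp = start
for row in rows:
    value = row[0]
    if value is not None:
        weekday = datetime.datetime.fromtimestamp(timestamp).strftime("%A")
        byweekday.setdefault(weekday, []).append(value)
    timestamp += step

# Print the mean price per weekday.
for weekday, values in byweekday.items():
    print weekday, sum(values, 0.0)/len(values)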
The new plugin showing the mean per vendor is not that useful in its current state. Many of the small vendors are only represented by a single number, and the plot is too cluttered to actually see any trends. This could, however, easily be fixed if I decided to only look at a few of the vendors and typed the list of vendors myself, as sketched below:
vendors = ['ok', 'shell', 'jet', 'q8', 'uno-x']
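A minimal sketch of how such a hand-picked list could be combined with the scraping code above: keep only the wanted vendors that actually appear on the page, so the mean calculation never divides by zero on a day where one of the listed vendors has no prices:

# Hand-picked list of the large vendors (the URL names used on the FDM page).
wanted = ['ok', 'shell', 'jet', 'q8', 'uno-x']
# Keep only the wanted vendors that were actually found on the page,
# so that sum(prices, 0.0)/len(prices) never divides by zero.
vendors = [vendor for vendor in vendors if vendor in wanted]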
Comments
Absolutely brilliant. It'd be interesting to see how much the ranking of the gas stations changes, i.e. is it always the same ones that have high or low prices?
Ras: That’s a good point – If I get time next weekend I will make a version that does just that. 🙂 Data mining for the win!
Great work Thomas.
I’ll be using it in Aalborg.
Cool Stuff.
Scraping data from the Web always makes me feel good 🙂
Wonderful!