Monday, December 28, 2015

Twitter Sentiment with Python Part 2

Earlier I blogged about grabbing some tweets from Twitter and running them through a text sentiment analysis module using Python. I recently revisited the project and added a new feature.

Here is the full script. Below I will add some notes.

Some Notes

So I set up the script to run every 15 minutes against the keyword Kansas City to track the sentiment of this fine Midwestern town in the Great USA. I turned the script loose and sorta forgot about it for a month or so.

Kansas City is not the most polarizing of search terms on Twitter. Out of the 5000+ executions of the script, the sentiment went negative a total of 88 times (1.76%). The top 10 negative sentiment scores were recorded at the times in the table below.

Negative Score    Date/Time
0.999995749       12/6/15 15:38
0.999951771       12/20/15 13:52
0.999894862       12/16/15 15:53
0.999850365       11/29/15 14:52
0.999812706       12/25/15 8:52
0.999426109       12/20/15 12:22
0.999421925       12/17/15 22:22
0.998945902       12/6/15 14:23
0.993400587       11/18/15 16:52
0.992757985       12/13/15 15:53

  • Odd that I did not see any negative sentiment hits on Thanksgiving.  Black Friday did not even have a hit.
  • 12/25/2015 @ 8:52AM: apparently some folks did not like their Christmas gifts in Kansas City.
  • 12/20/2015 from 12:07 - 21:07. This time span had the most negative sentiment hits of the entire data set. A quick search on twitter for that day did not yield any clues. I also searched the news outlets and did not see anything obvious.
After reviewing the collected data I noticed a problem with my process. I am parsing the tweets to score the sentiment; however, I cannot go back and review the tweets to see what caused the sentiment.

Therefore I launched version 2 of the script (the script listed above) and added a new table to my database. I am now capturing the actual tweets as well as storing the sentiment score. I also changed the capture interval to 30 minutes.
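The storage half of v2 boils down to a few lines. This is a sketch rather than the production script: the table layout, column names, and sample tweets are my own stand-ins, and the tweet fetching and sentiment scoring are stubbed out.

```python
import sqlite3

# Hypothetical table layout -- the real script stores the raw tweets
# alongside the negative sentiment score for each capture window.
SCHEMA = """CREATE TABLE IF NOT EXISTS tweets (
    captured_at TEXT,
    tweet       TEXT,
    negative    REAL
)"""

def store_batch(conn, captured_at, scored_tweets):
    """scored_tweets: list of (tweet_text, negative_score) pairs."""
    conn.executemany(
        "INSERT INTO tweets VALUES (?, ?, ?)",
        [(captured_at, text, score) for text, score in scored_tweets])
    conn.commit()

conn = sqlite3.connect(":memory:")  # the real script writes to a file on disk
conn.execute(SCHEMA)
store_batch(conn, "2015-12-28 15:30",
            [("KC is great", 0.02), ("ugh KC traffic", 0.91)])
```

Every 30 minutes the capture job scores the batch and stores it, so later I can pull the raw tweets for any window that went red.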


During the initial script build I also wanted a way to visualize the current sentiment with a sort of heat-map color system driven by the negative score. I remember searching a while for this code, but now I cannot remember where I found it. Basically it sets the background color of the web page based on the value of the negative score, which comes from the sentiment analyzer.

If the negative score is close to 0 the page will be green. If the negative score is close to 1 the page will be red. If the negative score is .5 the page will be blue. Everything in between will be some shade of these 3 main colors. Here is a sample of a high negative sentiment. The numbers are off here because I hijacked the code with a negative sentiment of 1 to show the red color:

All of this is driven by the code below.
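Since I cannot find where the original snippet came from, here is a reconstruction of the idea rather than the exact code: linearly blend green to blue over scores 0 to .5, then blue to red over .5 to 1, and hand the result to the page as a background color.

```python
def score_to_color(neg):
    """Map a negative sentiment score (0..1) to a hex background color."""
    neg = max(0.0, min(1.0, neg))  # clamp out-of-range scores
    if neg <= 0.5:
        t = neg / 0.5               # green -> blue over 0..0.5
        r, g, b = 0, int(255 * (1 - t)), int(255 * t)
    else:
        t = (neg - 0.5) / 0.5       # blue -> red over 0.5..1
        r, g, b = int(255 * t), 0, int(255 * (1 - t))
    return "#{0:02x}{1:02x}{2:02x}".format(r, g, b)

print(score_to_color(0.0), score_to_color(0.5), score_to_color(1.0))
# -> #00ff00 #0000ff #ff0000
```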

So there you have it. I have cranked up the script again to collect data every 30 minutes. I predict in about a month or so I will remember that the script is running and will blog the findings. One of my next steps is to incorporate the values into the D3.js calendar chart. However, I may have to pick a more polarizing topic, otherwise the whole year will be green.

Thursday, December 24, 2015

Fusion Table API Insert Data Python

A buddy and I have been hacking at the Google Fusion Tables API for a couple days now trying to figure it out. We finally had a breakthrough. He sent me his sample code with the authentication piece and the SELECT statement, and I started trying an INSERT. After about 10 rounds of fail I finally got something to work and wanted to post it. We struggled to find a simple sample of how to do this. I am sure there is a better way, but at least it's working.

Python Modules Needed

You will need to install some Python modules if you have not authenticated to the Google APIs before. Here is a list of modules I installed. 

  • requests
  • urllib3
  • apiclient
  • gspread
  • PyOpenSSL
  • oauth2client
  • google-api-python-client (used easy_install)
Not all of these are needed for the script. I am working from a new laptop so I had to install them fresh. All of these were installed using pip except the Google API Python client. For some reason pip did not install that one, so we had to use easy_install.


We used the same method detailed in this earlier post using the gspread module. You will want to create the JSON file with your authentication key in it so you can authenticate to the Fusion Table API.

Make sure you grant your client email address access to edit the Fusion Table. You can do this using the Fusion Table Share feature; add the email from the JSON file you downloaded when you built your key.


I am using the USGS Earthquake JSON feed I blogged about earlier today to import earthquake data into a Fusion Table. Here is the script.

Basically we just set up a loop to parse the earthquake JSON data, and on each pass we execute an INSERT statement against Fusion. Again, there is probably a better way than the line-by-line method detailed here, but at least we can import data into Fusion. This is not going to be practical for thousands of rows, but it is fine for our purpose here.
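The INSERT half of each loop pass is just string building. Here is a sketch of that piece; the table id and column names below are made up, so match them to your own Fusion table, and hand the resulting string to the Fusion Tables SQL endpoint the same way the SELECT sample does.

```python
def fusion_insert_sql(table_id, mag, place, lat, lon):
    """Build a one-row INSERT for the Fusion Tables SQL endpoint.

    Column names (Magnitude, Place, ...) are illustrative -- use whatever
    you named the columns in your own Fusion table.
    """
    safe_place = place.replace("'", "''")  # escape single quotes for SQL
    return ("INSERT INTO {0} (Magnitude, Place, Latitude, Longitude) "
            "VALUES ({1}, '{2}', {3}, {4})").format(
                table_id, mag, safe_place, lat, lon)

sql = fusion_insert_sql("1AbC_fakeTableId", 4.5,
                        "10km SSW of Anza, CA", 33.5, -116.7)
```

One of these gets built and executed per earthquake, one HTTP round trip per row, which is exactly why this approach will not scale to thousands of rows.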

Earthquake Data JSON feed

Previously on the blog I consumed earthquake data from a CSV produced by the USGS website. Recently I revisited the USGS website and worked on consuming their JSON feed.

You can find information on their feed here:

I am as giddy as an 11-year-old girl at a One Direction concert when I see a website has a JSON feed. Here is the Python script I used to consume the data and create the text file to import.
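The parsing step reduces to walking the GeoJSON features. A minimal sketch, run here against one inlined sample feature with made-up values instead of the live feed (grab the real thing from the summary URL on the USGS page with urllib or requests):

```python
import json
from datetime import datetime, timezone

# One trimmed-down feature with sample values -- the live USGS summary
# feed has the same shape with hundreds of features.
SAMPLE = json.loads("""{"features": [{
  "properties": {"mag": 2.6, "place": "21km N of Somewhere, CA",
                 "time": 1450900000000},
  "geometry": {"coordinates": [-118.9, 34.6, 17.4]}}]}""")

def quake_rows(feed):
    """Yield one importable text line per earthquake feature."""
    for f in feed["features"]:
        props = f["properties"]
        lon, lat, depth = f["geometry"]["coordinates"]  # GeoJSON order
        # USGS reports time as milliseconds since the epoch, UTC
        when = datetime.fromtimestamp(props["time"] / 1000.0, tz=timezone.utc)
        yield '{0},{1},"{2}",{3},{4}'.format(
            when.strftime("%m/%d/%Y %H:%M"), props["mag"],
            props["place"], lat, lon)

for row in quake_rows(SAMPLE):
    print(row)
```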

From there I was able to import the data into Google Fusion Tables. I tried using the Fusion Table API to knock this out automagically, but did not have much luck. I plan to keep hacking at the Fusion Table API to see what I can come up with. In the meantime, here is a map of all the earthquakes that occurred on 12/23/2015.

Here is a link to the full screen map:

Wednesday, December 23, 2015

Link Dump 12/23/2015

Trying to start a semi-frequent blog series of hand-curated links about stuff I find. More for me than you.

Monday, November 2, 2015

Candy Tracker 2016

Round 2 of the Candy Tracker. I was able to recruit a few more candy loggers this year as we embarked on another Halloween candy haul. The SweetSpots map was hot with candy-logging action. Let's see how the night unfolded.

First Stop TLC

TLC kicks the night off from 3PM-5PM, a great way to get a candy score before dinner. It is the earliest event around our town, so of course it makes sense to hit it first and hit it hard. My boy Luke threw such a fit getting his Thor costume on (refused to take a nap) that we didn't realize he wasn't wearing shoes until we got to TLC. Oh well, we were not going to let a couple of sore 4-year-old feet slow us down on our first score (it is a short walk, don't hotline me).

You can guarantee you are going to get some "Gimmicky Garbage" here at the TLC Trunk or Treat. We were not let down. A small box of tissues, a first aid kit (pretty cool on any other night), and some new frisbees (to add to the 2 from last year). They also had a ring toss and a bean bag game where you got to pick out some trinkets from the bucket. The Kleenex box rated right up there with the Twizzlers at a 3/10. The only other red score here was the dreaded Almond Joy. Coop actually made an attempt to eat an Almond Joy; he knew the error of his ways about .2 seconds in.

Second Stop the Square

Second stop of the night was the Liberty Square. We had to book it back home to get Luke some shoes for this haul. This one involved a little more walking. This event runs from 5PM-7PM. Again, another great way to score a ton of candy in a small area. As an added bonus, there is also a great Trunk or Treat just 1 block from the square, as you can see on the SweetSpots map.

The Liberty Square is what we call a pure candy score. Businesses can't afford to let the kids down here, so they hand out candy. We had 1 red pin here, but it was actually a combo score of Gimmicky Garbage and candy. The Gimmicky Garbage was a stress ball, which was involved later in the night in a brother-on-brother battle royale where said stress balls were taken away.

The Trunk or Treat off the square is a decent spot for candy. It has a great candy to distance walked ratio. However as you can see there are a couple folks who haven't read the rules of proper candy dispersion. Some poor lady was trying to give out Colgate toothpaste. I was so proud of my little Thor when he just flat refused to take the toothpaste and walked past her.

Last Stop Grandma's

We made a final stop, after hitting Chipotle for $3 burritos, at Grandma's neighborhood. This is more of a tradition than a good strategy for maximizing our candy score. Grandma's street does have a lot of houses that participate, so it's not a terrible score. Usually by this time I am pretty much Halloweened out. That, and we were wanting to catch Game 4 of the World Series to cheer on our Royals.

I logged a few pieces and we called it a night.

Friends of thejoestory

As I mentioned, I had a few more people log candy this year. My Gladstone friends were logging like crazy. Look at all those pins.

A quick review of the data and I did not find any green pins. Man, what gives? So I started looking more closely.

PB Cups a 7? Milky Way a 6? Twix a 6? Kit Kat a 6? Tough candy crowd out at Gladstone. They are not totally crazy, because Swedish Fish and Nerds were properly rated at a 2 and 3. Team Gladstone did not rate a single piece of candy higher than a 7. Wonder what Gladstone folks think is an 8, 9, or 10 piece of candy.

We had a few other pieces logged throughout the metro. 


All in all, not a bad candy haul this year. We had Thor and a Jedi, so we doubled up on candy. We are sending a large bag up to school with Coop to give to the troops. Thanks to all who logged candy. Here are the counts.

  • 126 total pieces logged. 
  • Heath bar gets an unfair advantage cause it is one of my favorites. 
  • Snickers and Kit Kat were among the most popular. 
  • Healthy at a 4 was the toothpaste...seems a little high but I was rating on utility.

Avg Rating  Count  Candy
8           1      Heath Bar
7.33        3      Hershey Bar
7           3      Crunch Bar
6.5         2      Laffy Taffy
6.45        11     Kit Kat
6.38        8      Tootsie Roll
6.25        4      Reeses PB Cups
6           2      Milk Duds
6           3      Milky Way
5.83        6      3 Musketeers
5           1      Sweet Tarts
3.5         2      Almond Joy
3.5         2      Tootsie Roll Pop
3.29        7      Gimmicky Garbage
3           1      Jolly Rancher
2           1      Swedish Fish

Sunday, October 25, 2015

The Legend of Chad Bisher


step: an act or movement of putting one leg in front of the other in walking or running

Seems simple enough. Straightforward definition. But my buddy Chad Bisher challenged this definition and taught us all to reach for our dreams and follow our hearts this week.

It all started with a friendly Fitbit Work Week Step count challenge thrown down by none other than thejoestory. I invited a few Fitbit friends to step up to the challenge in what I am now labeling "THE Hot Stepper October Work Week Challenge of 2015, Murderer".

I knew something was amiss early Tuesday of the competition. Chad Bisher had a big lead (like a whole day's worth of steps) by Tuesday morning. I quickly remembered what gave Chad Bisher the early-week edge.

Chad Bisher owns a desk cycle. For the uninformed, the desk cycle is a small bicycle apparatus you can fit under your desk that allows you to pedal while you are sitting at your desk. Chad Bisher was racking up thousands and thousands of steps on the desk cycle.

Taken aback by this brazen display of competitive advantage, I decided to email some folks for an impartial opinion. The problem, in my opinion, was that a step counted on a desk cycle is not truly a step. You are not supporting your full body weight and you are not moving from your current position.

Emails started flooding in, and quickly two teams formed.


Some crazy folks were actually in support of Chad Bisher.

"hahahahahaha" was one response which I counted as a support for Bisher.

"So have you ever tried so called deskcycle? What's preventing you from doing the same thing?"

"If seated cycling was allowed it should have been clear to all competitors"

"Would you count steps any differently if he was riding a stationary bike...if you would then you count the deskcycle steps too"

Some supporters went as far as printing up some tshirts for the FREE BISHER CAUSE.


The other side of the argument, the more sane people, had this to say.

"I am sure he is a nice guy and all but it also seems like he is a pumpkin eater (cheater cheater)"

"In my world there are many kinds of steps, but they all requires two components... moving ones lower extremity forward (even 0.01 inches), AND transferring standing body weight to that extremity. So I would say that he is not taking a step by pedaling a bike in a non weight-bearing position, as one cannot sit and walk simultaneously. Id have a hard time coming up with a formula to translate pedaling a bike to walking, too many variables. Walking and pedaling use a few of the same leg muscles, but it would be like comparing casually throwing a ball 100 times to doing 100 push-ups. Similar arm and shoulder muscles used, but too many variables to compare. If it was a competition of STEPS, I would say he knowingly racked up thousands of "steps" on his pedometer without leaving his seat, and should either subtract the approximate number of revolutions he made on he bike from his score, or be disqualified from he competition."

"While it seems that the majority of the competitors are all within "steps" reach of one another, Chad is walking away by 20k plus. I feel that this is similar to the Barry Bonds doping scandal with home runs. Rumor has it that steps are being calculated with the Fitbit attached somewhere other than the proper wrist location. I feel there needs to be an investigation launched by federal authorities into this scandal. "

"I don't have a Fitbit so I don't feel like I should get a say. Since you asked though, it's cheating for sure. Punish him!"

"I think the CBDBCC should be set to whatever value makes his total steps lower than yours. Is that impartial enough for you?"

I agree that Chad has an advantage that does not permit a level playing field.

Adjustments Needed

So after the crowd weighed in...where do we stand? I think we have to do a step adjustment. I proposed a very straight forward calculation for adjustment.

  • Legs account for 16.88% of average male body weight
  • Add a multiplier for the unknown resistance setting of the bike
  • Multiply Chad Bisher's steps by .328
  • Add 10000 steps back to his total to account for non-bike steps
From here you arrive at the adjusted step count for Chad Bisher. Then we see where he stands on the leaderboard. If he is still ahead of me, then we subtract off just enough steps so the calculation still looks legit while allowing me to maintain the step total victory. Seems simple enough.
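As a sanity check, the proposed adjustment in code (the .328 multiplier folds the leg-weight figure together with my fudge factor for the unknown resistance setting):

```python
def adjusted_steps(raw_steps, leg_factor=0.328, non_bike_steps=10000):
    """Proposed desk cycle adjustment for a week of Fitbit steps."""
    return int(round(raw_steps * leg_factor)) + non_bike_steps

print(adjusted_steps(115375))  # Chad Bisher's raw weekly total -> 47843
```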

Battle to the End

So for all intents and purposes, Chad Bisher was out of the competition. It ended up being a battle between me and the Doc. Friday I ran to The Scout and back, which is around 4.75 miles. Doc hit the gym after work and straight owned it.

I was down by about 5000 steps when I got home. The Royals started Game 6 of the ALCS at 7:00PM. By 8:10PM I convinced myself that I must defeat the Doc. So I embarked on another 4 mile run.

I get back from the run and realize I am still 1200 steps behind. The Doc is on the move, the Royals game is a nail biter, and we only have about 3 more hours to go on this competition.

I would sync, up 300 steps; the Doc would sync, down 412 steps. We were both pacing and walking at this point in a fight to the finish. The ALCS game had a 45-minute rain delay, and the Doc made the mistake of taking a breather during it. I never stopped walking from 9:00PM - 12:00AM and finished up a couple thousand steps on the Doc.

I earned the 35K step badge. It took 8.5 miles of running and 3 hours of constant walking but somehow I managed to edge past the Doc.


The step contest ended 10/23/2015 @ 11:59:59PM. Here are the final results: Adjusted and non-adjusted.


Hot Stepper       Steps
Chad Bisher       115375
Wild Man David    64380
The Professor     52821
Brown Bear        47335
Major H           35160

I ended up multiplying Chad Bisher's steps by .582, which, to me, factors in resistance settings and other steps taken throughout the week. Remember, this is me coming up with an arbitrary metric to ensure my victory.

Hot Stepper       Steps
Chad Bisher       67148
Wild Man David    64380
The Professor     52821
Brown Bear        47335
Major H           35160

Tuesday, September 22, 2015

CentOS Sox playing MP3 from Command Line

I simply wanted to play an MP3 from the command line as project 2 of the Amazon Dash Button hack. This took me way longer than I wanted, so I thought I would put the instructions here. SoX is a utility that was installed with my CentOS out of the box; however, it would not play MP3s by default.

I tried installing LAME using the standard repos, but that is apparently not going to happen. I finally had some luck with RPM Fusion.

  1. Install RPM Fusion
  2. Remove SoX
  3. Install the necessary libraries and such for LAME
  4. Make a local directory for SoX
  5. Get SoX using wget from SourceForge
  6. Extract SoX
  7. Run configure
  8. You should see lame in the list from the configure output
  9. Run make
  10. Run make install

From here, run sox from the command line and MP3 should be in the list of supported file types.
To play a file, simply run sox and pass it a file name.


Well, SoX didn't really work for me. I was able to have some luck with VLC though. In order to install VLC you need to add the desktop repos.

Execute this command to install the desktop repos:

Then from there you can run VLC like so:
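Since the end goal is triggering the sound from the Dash button's Python script anyway, here is a sketch of shelling out to VLC from Python. cvlc is VLC's console binary and --play-and-exit makes it quit when the file finishes; the filename is just an example.

```python
import subprocess

def vlc_command(path):
    """Build the command-line VLC invocation for one MP3."""
    return ["cvlc", "--play-and-exit", path]

def play_mp3(path):
    """Play an MP3 with command-line VLC; returns when playback ends."""
    return subprocess.call(vlc_command(path))

# play_mp3("alert.mp3")  # uncomment on a box with VLC installed
```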

Friday, September 4, 2015

Getting to know your Neighbors...Parcel Viewer

I stumbled on a cool tool on the KC Open Data page today: the Parcel Viewer, which shows the owner of various parcels of land. When you get to the spot on the map you want, you can click Download CSV, and what do you know, Lat and Lon are right there in the feed.

The data is all public but I blacked it out anyway.

So of course from there you can plug the data into Google Fusion Tables and it will create a map for you.

Bam, now you know your neighbor... or at least their name. Of course, remember this data is only as good as the Parcel Viewer. Case in point: we had some friends move recently (2 weeks ago) and they are still listed.

Monday, August 31, 2015

Dash Button Project 1: Anger Study

So I recently purchased an Amazon Dash button and I am using it as a basic Internet-connected button to track and do stuff (check this article out). So what are we going to count? I thought a fun first project would be an informal anger study. Starting on 8/23 and ending today, 8/31, we tracked the anger level of the house.

The basic idea was simple: if you are mad, you have to push the angry button (the Gatorade Dash Button). The button records the date and time it was pushed. We could then review the results to gauge when our family gets angry.

Now remember, we have a four-year-old in the house, so that can skew the numbers a bit: both in increased angry moments and in angry-button overuse (pushing the button when not really angry). But looking at the numbers, it appears we have a decent data set to look at.

8/23/2015 8:41     Anger Button Pushed
8/23/2015 12:34    Anger Button Pushed
8/23/2015 12:36    Anger Button Pushed
8/23/2015 12:52    Anger Button Pushed
8/23/2015 13:52    Anger Button Pushed
8/23/2015 16:26    Anger Button Pushed
8/23/2015 16:59    Anger Button Pushed
8/23/2015 17:53    Anger Button Pushed
8/23/2015 19:56    Anger Button Pushed
8/24/2015 11:50    Anger Button Pushed
8/24/2015 11:52    Anger Button Pushed
8/24/2015 14:46    Anger Button Pushed
8/24/2015 16:26    Anger Button Pushed
8/24/2015 16:26    Anger Button Pushed
8/24/2015 16:28    Anger Button Pushed
8/24/2015 17:34    Anger Button Pushed
8/25/2015 7:23     Anger Button Pushed
8/25/2015 19:37    Anger Button Pushed
8/25/2015 20:16    Anger Button Pushed
8/26/2015 8:18     Anger Button Pushed
8/26/2015 18:15    Anger Button Pushed
8/27/2015 18:08    Anger Button Pushed
8/27/2015 20:41    Anger Button Pushed
8/30/2015 13:10    Anger Button Pushed
8/30/2015 17:58    Anger Button Pushed
8/31/2015 16:45    Anger Button Pushed
8/31/2015 16:48    Anger Button Pushed
8/31/2015 20:48    Anger Button Pushed

So above is the raw data. A few findings.

  • We get mad around meal times
    • 5 instances around lunch (11:00-13:00ish)
    • 7 instances around dinner time (17:00ish)
  • 8/27 - had an anger instance around bed time @ 20:41
  • 8/23 - someone got a little mad before church @ 8:41
  • 8/23 - recorded the most instances of anger. May have been due to the fact that the anger button was new that day, or maybe because it's Sunday and we spend more time with each other on Sundays.
  • 8/25 - got mad early that day @ 7:23AM
  • At times the person that was angry got more angry when someone said "go push the mad button"
  • At other times people forgot that they were mad when told to push the mad button. Sometimes pushing the button would defuse the situation.
This of course does not tell you who got angry, but it does show the instances of anger among the 4 of us living in this house.
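The meal-time pattern falls out of a quick tally by hour. A sketch over a handful of the timestamps above (Python's Counter does the work; run it over the full table to reproduce the findings):

```python
from collections import Counter
from datetime import datetime

# A subset of the logged button pushes from the table above
presses = ["8/23/2015 8:41", "8/23/2015 12:34", "8/23/2015 12:36",
           "8/24/2015 11:50", "8/23/2015 17:53", "8/24/2015 17:34"]

# Bucket each push by the hour it happened
by_hour = Counter(datetime.strptime(p, "%m/%d/%Y %H:%M").hour
                  for p in presses)
for hour, count in sorted(by_hour.items()):
    print("{0:02d}:00  {1}".format(hour, count))
```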

Not sure what we will track next but this has been a fun first Amazon Dash Button project.

I also noticed some other ARP probes while trying to capture the Amazon button push. Thought that was pretty interesting. Wonder what these devices are? I'll see if I can track em down.

Our ARP sniffer gets no love...just a lowly netbook sitting on top of another computer.

Running your name

A while back I thought it would be cool to run my name. So I did. Blog post over... wait, no. Back in August 2012 I ran my first name around my then-current job at JHA. I went cursive mode and it turned out pretty good IMO.

Fast forward to Friday of last week. I got a hankering to run my name again. This time I wanted to run my first name and last name (technically a shortened version of my first name). Since I am working downtown now, there are tons of options for routes, and I was quickly able to chart a course. I stepped through the instructions using Route Planner, 53 instructions in all, and decided to go for the run last Friday.

Route Planner estimated about 5.2 miles. I am the lucky owner of two, count em, two cell phones right now. My personal phone is on Sprint and my work phone is on Verizon. I usually track all my runs using RunKeeper on my Sprint phone.

First Attempt

Sprint GPS struggled a bit with the tall buildings downtown. The instructions were on point but the GPS hosed up the t in Story. I know for a fact I ran that portion of the run correctly. I almost hosed up at the e in Joe since you have to cut down an alley. I about went the wrong way but quickly corrected.

Not 100% satisfied with the way the name turned out, I decided to have another go today. Today I used my work phone on Verizon. Verizon too struggled with the t, but it appeared to keep better track of the run overall. I opened up the activity on RunKeeper and modified the t to more accurately portray the run. Other than that, the map is unedited.

So there we have it. I have now successfully run my name. I have left my digital mark on downtown. I may run it a few more times after vacation and snap a few pics of the route. Here are some more shots of the second run. I used GPS visualizer to build these maps.

Saturday, August 22, 2015

Amazon Dash Button Hack

I stumbled on Ted Benson's Amazon Dash Button hack article earlier this week and thought, man I have to try this out. I ordered the Amazon Dash Gatorade button to give it a go.

At first I wanted to try and get this to work on Windows. I am sure it is doable but I hit a snag in one of the Python scripts. Instead of hacking my way through the scripts I decided to punt and try the Linux route.

I had just recently installed CentOS 7 on an old netbook we had laying around.

So I set out to use it as the ARP sniffer. Ted lays out the high level instructions in his article and it was pretty easy to follow along.

I was able to get this working with Python 2.7. Installing Scapy was pretty easy.

I also had to install the requests module using a similar set of commands. This was a new build so you may already have requests installed. From there I grabbed Ted's sample code and tweaked it.

The count = 10 will set your script to stop after sniffing 10 packets. If you set that to 0 it will sniff until you kill it. Here is a YouTube video of the hack in action:
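Stripped of the scapy plumbing, the detection logic in my tweak of Ted's code amounts to matching the button's MAC on ARP probes. The MAC below is made up; use whatever address your sniffer turns up.

```python
BUTTON_MAC = "74:75:48:aa:bb:cc"  # hypothetical -- every Dash button differs

def is_button_press(arp_op, src_mac):
    """ARP op 1 is a who-has probe, which the button broadcasts on wake."""
    return arp_op == 1 and src_mac.lower() == BUTTON_MAC

def handler(arp_op, src_mac):
    if is_button_press(arp_op, src_mac):
        print("Dash button pushed!")  # the real script logs a timestamp

# In the real script this logic lives in the prn callback of
# scapy's sniff(prn=..., filter="arp", store=0, count=10)
```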

So now you have a low cost Do button that you can use for all sorts of different projects. I will try and keep you posted on some of the stuff I end up doing.

Wednesday, August 5, 2015

Linux Performance Overview

Quick post to throw down some Linux performance overview commands for safekeeping.

Disk, process, memory, and CPU stats
dstat -rpma

How much memory
dmidecode -t 17 | grep "Size.*MB" | awk '{s+=$2} END {print s / 1024}'

CPU info
cat /proc/cpuinfo

Core count
cat /proc/cpuinfo | grep 'core id'


Wednesday, July 29, 2015

Mario Kart Hot Laps

The boys set up a K'NEX Mario Kart track while I was at work earlier this week. So naturally I did what any cool father would do. "Boys, which Kart can finish the track the quickest? We should run some hot laps, time each Kart, keep track of it, and then I'll blog about it." And here we are.

The Course

The course is a modified oval. A long straight stretch helps build up speed. Turns 1 and 2 are standard turns. The back straightaway has a ramp and an S-turn as you move into Turn 3. Another quick S-turn as you enter Turn 4 and then you hit the finish.

The Karts

We have 3 working Karts and one busted Mario Kart. All Karts have fresh batteries. Using the standard eye test, the Donkey Kong Kart appeared to be the fastest. Mario seemed to be the slowest, and Bowser was somewhere in the middle.

Hot Lap Heats

1 - Lap

We started by trying to time one lap. All Karts finish a lap pretty quickly, so we quickly changed to a 5-Lap run.

5 - Lap

Each Kart finished three 5-Lap runs.

5-Laps    Heat 1    Heat 2    Heat 3

  • Again, DK seemed to be the fastest. However, the Kart would build up so much speed on the first straightaway that it would wreck out in Turn 1. If it made it past Turn 1, it would get too much air on the ramp out of Turn 2 and would crash out by the tree.
  • The Mario Kart was slow so it had a clean run around most of the track. However it would not get much air on the ramp so it kept crashing into the wall where it would have to re-accelerate costing time.
  • The Bowser Kart set the fastest lap in Heat 2. It had a clean run and was faster than the Mario Kart.

10 - Lap

Each Kart completed 3 10-Lap Runs:

10-Laps    Heat 1    Heat 2    Heat 3
  • We made a slight adjustment to the track by lining up the ramp a little better to smooth out the landing. DK set the course record in Heat 2 by pulling off a clean run.
  • Mario was the most consistent across all three heats.
  • Bowser crashed out in one heat, and then had wheel trouble in the other two heats. If not for the wheel trouble Bowser would have probably set the course record in Heat 1.


Donkey Kong



Wrap Up

So there you have it. Bowser set the 5-Lap record. DK set the 10-Lap record. I challenged the boys to come up with some new configs, so we'll see what they come up with. I'd like to calculate the scale speed, but we already tore the first track down. I will try to remember to measure the next track.

Sunday, July 19, 2015

Linux on USB Stick

Found a laptop laying around at home and wanted to throw Linux on it. The laptop does not have an optical drive, so I needed a Linux-on-USB solution. Here is what I did.

  1. Download the Linux distro you want to throw down.
  2. Download Win32 Disk Imager here:
  3. Find a USB drive that has ample space.
  4. Launch Win32 Disk Imager.
  5. Browse to the ISO you downloaded in step 1.
  6. Choose the USB device you wish to lay the image down on.
  7. Click the Write button.
  8. Wait; writing the image took a long time for me.
  9. Then slap that USB stick in your laptop and boot that mug up.

Simple as that. I went with CentOS 7 and it worked perfectly.

Wednesday, July 15, 2015

Meeting Tracker First Pass

I dislike meetings. I make that pretty clear to anyone that wants to listen. So anytime an opportunity arises to expose meetings for what they really are (99% waste), I jump at the challenge to qualify/quantify how terrible meetings can be. In my 14-year professional career I have sat in numerous meetings. I can remember 2 meetings total that actually brought value, and one of them was a 3-day working meeting to solve a very specific problem. I'd probably call that a focused collaboration session, but it was billed as a meeting.

So where do we go from here? Well, in the past I have created meeting waste calculators and a meeting buzzword tracker. A friend and I were chatting the other day; he knew I had created a calculator and wanted the link, so I sent it over. After some normal I-hate-meetings banter, I mentioned I needed to work on my meeting buzzword tracker. Like most of my projects, I cobbled something together quickly without thinking about flexibility/scalability and such.

The meeting buzzword tracker was written for a single, very specific dataset I wanted to collect: how many times was the word project mentioned in our standard weekly team meeting? So I sent the link to my buddy, and then he asked the question which pretty much sparked the rest of this blog post: "How do I adjust the meeting tracker for the guy who fell asleep?"

Couple things. 1) Notice how he called the website by the wrong name, meeting tracker instead of meeting buzzword tracker... ding ding ding... 2) How do I track the various things that happen, in this case a count of all the people that fell asleep?

My response to this question was "perhaps we have a few counters you can tick".

He responded: "We need a scoring system for meetings, buzzwords, people nodding off, snores, nose picks, etc." BOOM! And now we are off and running.

After some more back and forth I decided to email some folks in the lame meeting world to get some more feedback. Several responses later and we now have the Meeting Tracker 1.0 system in development.

Meeting Info

First we collect some standard meeting info: Meeting Name, Date/Time, Duration in minutes, and the Buzzword. The buzzword is a word you want to track the frequency of during your meeting. As mentioned earlier I used to track the word project. We once had a meeting where project was mentioned 32 times in about a 25 minute span. It was glorious.


Next we have a standard set of counters. These are the various counters you want to track. Here is our current list.

  • Sleepers - Count of the people falling asleep or nodding off during the meeting
  • Pickers - Count of people picking their nose in the meeting
  • Boomers - Count of people over 50 in the meeting, typically slowing the meeting down
  • Offliners - Count of people asking questions that are off topic and extending the meeting
  • Surfers - Count of people surfing the internet, checking email on their devices instead of listening. These are the people that ask questions about stuff answered 5 minutes ago
  • Outsiders - Count of the number of auditors or consultants that do not work for the company


We added a series of conditions to track during the meeting. These are yes/no true/false type conditions about the meeting. Here is our current list.

  • Start Late - Meeting starts after scheduled time
  • Finish Late - Meeting goes longer than scheduled time
  • Just the Two of us - meeting could have been a face to face conversation instead of a 10 person meeting
  • Echo - Meeting was covered in a memo or email before the meeting
  • No Lunch - meeting was scheduled during typical lunch hour but no food is provided
  • Double booked - you or the meeting venue itself are double booked
  • Meetingception - another meeting was scheduled as a result of this meeting or the whole purpose of this meeting was to setup another meeting


Then we came up with a set of kickers. These bonus items remove points from our overall Meeting Score. We will talk more about the scoring system below.

  • No talk - You make it the entire meeting without saying a word
  • Food - There is food at the meeting
  • No Action - You have no action items as a result of this meeting
  • Early finish - The meeting finished before scheduled duration
  • Cancelled - the meeting was cancelled. 

Scoring System

Now that we had a series of metrics, I started assigning values to them. For the counter section we assign a point value to each counter and then just multiply the count by that value to get the metric for that meeting. For conditions I assigned a value to each condition based on my own personal biases about how badly that condition affects the overall meeting. Finally, kickers add points back for various items, so kickers carry negative values.

After the math we are left with a Meeting Score representing how terrible the meeting is. The higher the number, the worse the meeting. Since whole numbers are not really cool, I decided to create a variable to multiply the final score by. I call this variable the Inherent Terribleness of Meetings in General (ITMG) coefficient. Right now the ITMG coefficient is 1.73123.
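The math itself is simple enough to sketch in a few lines of Python. Every point value below is a made-up placeholder (the real values live in the web app); only the ITMG coefficient is the real number:

```python
# Hypothetical point values -- placeholders, not the real Meeting Tracker values.
COUNTER_VALUES = {"Sleepers": 5, "Pickers": 10, "Boomers": 3,
                  "Offliners": 8, "Surfers": 4, "Outsiders": 6}
CONDITION_VALUES = {"Start Late": 10, "Finish Late": 15, "Just the Two of us": 25,
                    "Echo": 20, "No Lunch": 15, "Double booked": 10,
                    "Meetingception": 30}
KICKER_VALUES = {"No talk": -10, "Food": -15, "No Action": -20,
                 "Early finish": -15, "Cancelled": -50}
ITMG = 1.73123  # Inherent Terribleness of Meetings in General

def meeting_score(counters, conditions, kickers):
    """counters: {name: count}; conditions/kickers: lists of names that applied."""
    score = sum(COUNTER_VALUES[name] * count for name, count in counters.items())
    score += sum(CONDITION_VALUES[name] for name in conditions)
    score += sum(KICKER_VALUES[name] for name in kickers)  # kickers are negative
    return round(score * ITMG, 2)
```

So a meeting with two Sleepers, one Boomer, a late start, and food would net out at `(5*2 + 3 + 10 - 15) * 1.73123` under these placeholder values.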

Below is a snapshot of the current values along with 3 sample meetings.

So for our sample meetings, Meeting 3 is the worst at 328.93. 

We probably need to tweak things a bit, and I need to track some meetings with the web app to see how it works on mobile devices. I am thinking I also need to create a PDF sheet that you can print and take to the meeting. That way you can just track the data in the meeting and then record it in the system at a later date. You don't want to have to add yourself to the Offliners count.


So I am all ears on suggestions and feedback. Can you think of some more counters, conditions, or kickers for the meetings? Have any suggested value tweaks for the metrics above? Have a meeting war story to share? Send them my way.

I think meeting duration should play a role somehow: the longer the meeting, the worse the meeting. I may add that into the scoring system somewhere. Do you have any other ideas?

Monday, July 13, 2015

KC Traffic Data: Speed Sensors

Recently I started capturing data from the Speed Sensors of the KC Scout system here in the KC Metro, detailed in this post. I am collecting tons of data from the Speed Sensors daily (the archive table already has over 9 million rows), and I started to wonder what I should do with it all.


Why not build a heat map for day of week and hour of day? Having no good answer to that question I threw together a quick website that allows you to select a Speed Sensor and then generate a Heat Map for that speed sensor.

Since I was in a hurry I did not get fancy with the selector:

Here you get a big ol list of Speed Sensors with cryptic names and such. One future feature I want to implement is to plot all the sensors on a Google Map and let you select a Sensor from the map to generate the HEAT MAP.

After you Select a sensor and click Generate you land here:

This will give you a map of where the sensor is located and the average speed for each day of the week and each hour of the day in a sweet looking heat map. I am using the sweet D3.js library to build the heat map.

Like most of my stuff this is pretty bush league. I am actually building a Tab Separated Values file on the fly and using that to source the heat map. Therefore it's pretty slow, and I cannot send the link out to folks because it will not handle multiple requests very well.
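Building that TSV is really just a group-and-average pass over the sensor readings. A rough Python sketch of the idea (the `(day, hour, speed)` tuple shape and the column names are assumptions, not the real script):

```python
import csv
from collections import defaultdict

def build_heatmap_tsv(readings, path):
    """readings: iterable of (day_of_week, hour_of_day, speed_mph) tuples.
    Writes one day<TAB>hour<TAB>value row per day/hour cell, where value
    is the average speed for that cell."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for day, hour, speed in readings:
        sums[(day, hour)] += speed
        counts[(day, hour)] += 1
    with open(path, "w", newline="") as f:
        w = csv.writer(f, delimiter="\t")
        w.writerow(["day", "hour", "value"])
        for (day, hour), total in sorted(sums.items()):
            w.writerow([day, hour, round(total / counts[(day, hour)], 1)])
```

The day/hour/value layout matches what the common D3 heat map examples expect to consume.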

I need to spend some time figuring out how to make the D3.js heat map hit the database directly, but as is typical with my stuff, I just crank it out and figure out the details later.

End of Post update 

So while composing this blog I decided to stop being lazy and work on the Maps idea I mentioned above. Here is a screenshot of the map:

Now you can select a sensor from the map and then link to the heat map. Pretty boss. Anyway, I think the next step is to figure out how to dynamically build the heat map using D3 hitting the database directly. Hopefully I can figure that out eventually.

Tuesday, June 2, 2015

Work Trails Revisited

A long, long time ago I created a static website called Work Trails. Basically I wanted to document the various trails I created while walking on my lunch break at work. Recently I decided to pick that project back up and make it a little more usable. I created a database and a website front end and started charting out some courses around work.

Tracker App

I am using the RunKeeper app to track each run. GPS is a little spotty around my place of employment but it works pretty well. I have tried GPS on Verizon and plan to try it on Sprint to compare who is better. My money is on Verizon but we'll see.

One cool thing about RunKeeper is that they have a website where you can see all your routes and such. I used their website to build my maps and to track distance, time, calories burned etc.

Database and Website

I created a really simple database to store my info. One table is for the details about the trails, the other table is for recording each time I complete the trail. I created a few stored procedures to handle adding and displaying the data.

I then wrote a web site to display the data.

Here is the main page that shows the details of all the trails.

Here are the overall summary stats for all the trails I have completed so far.

Here is the trail detail page where you can see information about each trail: when the last run was completed, a summary for the trail, etc.

Next steps

I am in the process of mapping out about 5 or 10 more trails, then I will start repeating trails and keeping track of the stats. I am also looking to add a photo gallery for each walk to show the various sights along each trail.

Wednesday, May 27, 2015

gspread Python Module and More Push-Ups

Yesterday I posted about my new push-up strategy here at work. The new PowerShell script is working great. I am really starting to hate that purple Blink(1) beacon...but it's good for me. My arms and chest are thrashed from all the push-ups.

Anyway, as mentioned yesterday, I needed a way to track my push-up progress (or lack thereof). Having already posted about the awesomeness of the gspread Python module, I figured now would be a good time to revisit the module and find a good use for it.

First things First I'm the Realest

First we need to authenticate. You can go the route of just storing your email account and password in your Python script...but what fun is that when you can go OAuth. Browse on over to the Google Developers Console and make sure you have the Google Drive API enabled.

After you have enabled the API, go to the Credentials section under APIs & auth. Click Create new Client ID under OAuth, choose the Service Account option, and make sure JSON Key is selected. Once complete you should see a service account on the right side of your screen with a Client ID, Email address, and Certificate Fingerprints.

Click the Generate new JSON key button below the Service Account and you will download a JSON file that has your private key and such. Rename this key file to something easy (I just called it json.key), and drop this file in the folder where your Python script is located.

Open up the json.key file you downloaded and take note of the client_email address. You will need that later.

Client Time

Now it's time to get the OAuth client. I had to go through a few rounds of pip installs to get my client working. Here are the modules I had to install for sure:

  • pip install oauth2client
  • pip install PyOpenSSL

You may need a few other libraries here. Just google your errors when you start trying to authenticate. Ima bet there is a StackOverflow article on it to help you out.

Authenticate don't Playa Hate

Reference the gspread API documentation for some help but its pretty simple once you get the pieces above in place.

That is about it. This will crack open that sweet json.key file you downloaded earlier from Google. This file contains all the API goodies you need to make the authentication happen, Cap'n.
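A minimal sketch of that authentication flow, assuming the 2015-era gspread + oauth2client APIs and the json.key filename from earlier (`load_client_email` and `open_worksheet` are my hypothetical helper names, not part of either library):

```python
import json

def load_client_email(key_path="json.key"):
    """Pull the service-account email (the one you share the sheet with)
    out of the downloaded JSON key file."""
    with open(key_path) as f:
        return json.load(f)["client_email"]

def open_worksheet(sheet_name, key_path="json.key"):
    """Authorize against the Drive API and open the first worksheet."""
    # gspread + oauth2client were the libraries of the day; both assumed installed
    import gspread
    from oauth2client.client import SignedJwtAssertionCredentials
    with open(key_path) as f:
        key = json.load(f)
    scope = ["https://spreadsheets.google.com/feeds"]
    creds = SignedJwtAssertionCredentials(
        key["client_email"], key["private_key"].encode(), scope)
    return gspread.authorize(creds).open(sheet_name).sheet1
```

From there the worksheet object is what you read from and write to.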

Sharing is Caring

Grab that client_email you noted from the earlier step; you will now want to share the spreadsheet you will be modifying with this email. It will be a big ol hairy address with gserviceaccount.com on the end.

Make sure this email account can edit your spreadsheet.

All Together Now

So, putting it all together: I wanted a quick script that takes a parameter for the number of push-ups, grabs the current time, and writes those values to a Google Spreadsheet.

I encountered a small issue with the gspread module. It's more of a Drive API issue than a gspread issue. When you use the append_row function, it appends a row to the bottom of the worksheet. By default the bottom row is row 1000, so append_row will put your data at row 1001 regardless of where your data actually ends. As a quick workaround, I use the col_values function to get the number of rows holding data, then use the resize function to shrink the worksheet to that size. Then you run append_row and you are good to go.
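The workaround boils down to a few lines. Here is a sketch (the worksheet handle comes from the authentication step; `append_row_snug` is my name for it, not gspread's):

```python
def append_row_snug(worksheet, row):
    """Shrink the sheet to its real data height, then append.

    Works around gspread's append_row targeting row 1001 on a
    default 1000-row worksheet."""
    used_rows = len(worksheet.col_values(1))  # rows that actually hold data
    worksheet.resize(rows=used_rows)
    worksheet.append_row(row)
```

With the sheet resized, the appended row lands directly under the last real row instead of way down at 1001.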

So now I am tracking push-ups in a more automatic fashion.

Example call:

python.exe c:\Python27\Projects\PushUpTracker\ 15

This call records 15 push-ups in my Google Spreadsheet along with the current time.

Tuesday, May 26, 2015

Blink, PowerShell, and Push-Ups

I have been at my new gig for a little over a month now. Some fellas at the new gig like to do push-ups throughout the day in an attempt to stay active during a primarily sedentary job. Being a data guy and a fella who likes to try and stay active, this activity falls right in my wheelhouse.

The first few weeks I did pretty well remembering to stop and do push-ups throughout the day, but soon I kept forgetting. I would get to the end of the day with only 20 or 40 push-ups. I recently brought the Blink(1) to work to keep an eye on traffic, news, and stuff, so I said, "Self! Why not write a quick script to remind you to do push-ups every hour?"

Blink(1) makes this so simple. All you have to do is write a HEX color to a text file periodically. You then set up a rule in the Blink control software to monitor the file. When the modified date on the file changes, the Blink(1) will show whatever color or pattern you send it.

I wrote a quick PowerShell script to write to the file.

So we loop 8 times and sleep for 3600 seconds (1 hour) each time, giving us an all-day push-up reminder script. I set up the Blink control software to scan the file every 5 minutes.
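The actual script is PowerShell, but the same loop sketched in Python shows the idea (the filename and the purple hex value are assumptions):

```python
import time

def pushup_reminder(path="blink.txt", color="#800080", hours=8, interval=3600):
    """Touch the watched file once an hour so the Blink(1) fires.

    The Blink control software watches `path` for modified-date changes
    and plays whatever color/pattern is configured for that rule."""
    for _ in range(hours):
        with open(path, "w") as f:
            f.write(color)
        time.sleep(interval)
```

Rewriting the file each hour bumps the modified date, which is all the Blink control rule cares about.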

I also setup a Google Spreadsheet so I could track my push-up progress. I will make a chart with my progress (if it shows improvement ha) over the next few weeks.

Thursday, April 30, 2015

KC Traffic Stats Version 2

Back in October I wrote a little process to grab data from our Scout system in an attempt to provide a customized view of current traffic on the route from work to home. That first attempt scraped the data off the site, and it worked about 2.4% of the time. I think my main problem was the way I was trying to hijack user state to keep the page alive.

Fast forward a few months and a new job later, and I wanted a process to grab the speeds from the speed sensors on my route from work to home. The new gig has different hours, so I am fighting the traffic a little more.

I used Fiddler2 to investigate how the data was being populated on the KC Scout homepage. I have learned a little more about Python since the last attempt. Using Fiddler I quickly discovered the data was being passed as JSON. Perfect: if I could figure out the user state problem, I could easily parse the JSON feed.

Some quick research led me to the requests module for Python. BOOM game changer. Again using Fiddler I was able to come up with a HTTP header to request the JSON feed and I was off to the races.

Here is my new Python script:

I am using the requests module to handle the HTTP header. Then I grab the JSON feed and parse it out. I still need to keep an eye on the JSON data being returned because I am parsing the date out to convert it to a SQL DATETIME.
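A rough sketch of the request-and-convert pattern (the header values, the timestamp format, and the `fetch_speeds` helper are all assumptions for illustration, not the real KC Scout feed details):

```python
from datetime import datetime

# Assumed header fields -- the real ones come from replaying the
# Fiddler capture against the feed.
HEADERS = {"User-Agent": "Mozilla/5.0", "Accept": "application/json"}

def to_sql_datetime(raw):
    """Convert the feed's timestamp (assumed 'MM/DD/YYYY HH:MM:SS AM/PM'
    here) into the SQL Server DATETIME literal format."""
    dt = datetime.strptime(raw, "%m/%d/%Y %I:%M:%S %p")
    return dt.strftime("%Y-%m-%d %H:%M:%S")

def fetch_speeds(url):
    """Grab and decode the JSON feed of sensor readings."""
    import requests  # third-party; assumed installed
    resp = requests.get(url, headers=HEADERS, timeout=30)
    resp.raise_for_status()
    return resp.json()  # list of readings to parse and insert into SQL Server
```

If the feed ever changes its date format, `strptime` will throw, which is exactly the thing to keep an eye on.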

The first few days of testing have been a success, and a lot more successful than the previous scrape system. I am loading the data into a SQL Server database every 5 minutes, about 500 data points per run. My dataset is growing pretty quickly, so I set up a process to move data older than 7 days to an archive table.

I put a quick web front end on the data and hooked up a Blink(1) process to give me a visual indication of traffic before I roll out for the day.

The web front end will show which sensors are slower than average and color code them based on how much slower they are currently running. You can also click a link to see a map based on the latitude and longitude of the sensor you choose. Google Maps will provide current traffic metrics for the area you selected.

The Blink notification system will go red if traffic is below a certain threshold on a couple of the speed sensors on my route home. I write the speed out to a txt file and read it using the Blink control application.
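The threshold check itself is tiny. A sketch, assuming the color decision lives in the script rather than in the Blink control rules (the 45 mph cutoff, the hex colors, and the filename are all made up):

```python
RED = "#FF0000"
GREEN = "#00FF00"
THRESHOLD_MPH = 45  # assumed cutoff for "traffic is bad"

def update_blink_file(speeds, path="traffic.txt"):
    """speeds: current mph readings for the sensors on the route home.
    Writes red if any sensor dips below the threshold, green otherwise."""
    color = RED if any(s < THRESHOLD_MPH for s in speeds) else GREEN
    with open(path, "w") as f:
        f.write(color)
    return color
```

The Blink control application then just reads the file on its scan interval and lights up accordingly.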

Blink(1) Notification