Posts Tagged ‘statistics’

The Big Two – They’ll Be Back

December 10th, 2010

After this year’s Ohio State-Michigan game (37-7), Ohio State coach Jim Tressel made a statement that I never expected to hear from a Buckeye coach. “Michigan will be back. You don’t have to worry about that.” While some Big Ten fans (and even Buckeye fans) complain that recent lopsided runs have diluted the rivalry, I’m in no particular hurry to see Michigan return to the top of the conference standings.

But Tressel’s comment got me thinking about a fact I’d read once years before. A decent chunk of the wins that make Michigan’s football program the winningest in the country came in the first half of the 20th Century. If college football has changed since the days of Woody and Bo, it has definitely changed since the time of Fielding Yost.

College programs and conference strength are cyclical, so I don’t doubt that Tressel’s statement was true. But I wondered how Ohio State and Michigan have fared over the life of their programs.

A chart showing the winning percentages of Ohio State and Michigan.

Michigan is clearly dominant early in its history, and it went through several down periods in the 1930s and 1960s before its current troubles. Meanwhile, Ohio State seems to vary much more. Its record is more spiky. When the Buckeyes are good, they’re good. And when they’re down, they’re down. But the only time they’ve fallen as low as Michigan in the past 100 years was during the 1940s.

This left me with one further question. How do the teams look when compared to the entire Big Ten conference?

I grabbed the season winning percentages for the 11 teams currently in the Big Ten since they began football. I wasn’t trying to capture the time since the school joined the conference. I wanted to look at how dominant the programs were overall – even if they didn’t line up against each other every year. Finally, I added a 10-year moving average for Ohio State and Michigan.
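
For anyone who wants to reproduce the smoothing, here is a minimal sketch of a 10-year moving average in Python. It assumes a trailing window (each point averages that season and the nine before it); the post doesn't say whether the original chart used a trailing or centered window, and the sample percentages below are made up for illustration.

```python
def moving_average(values, window=10):
    """Trailing moving average; None until a full window is available."""
    averages = []
    for i in range(len(values)):
        if i + 1 < window:
            # Not enough seasons yet to fill the window.
            averages.append(None)
        else:
            chunk = values[i + 1 - window : i + 1]
            averages.append(sum(chunk) / window)
    return averages


# Hypothetical season winning percentages, not real team records.
sample = [0.5, 0.75, 1.0, 0.25]

# A 2-season window keeps the example short; use window=10 for the chart.
smoothed = moving_average(sample, window=2)
print(smoothed)
```

The early seasons come back as None because there aren't enough prior seasons to average, so the smoothed line starts later than the raw one.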

While other conference teams have surpassed one (or both teams) for a year or two at different times, the moving average is clearly well above the normal season for the bulk of the Big Ten. It’s normal for the single best team in the Big Ten in any given year to keep pace with the 10-year average for the better of these two teams. I was really surprised to see just how dominant the two programs are. One team or the other is always at the top – if not both teams.

A chart showing the winning percentages of all Big Ten teams.

The other big thing I learned from my two weeks of number crunching? Tressel’s comment was a bit off base. Michigan’s 10-year average still includes two 10-3 seasons (2002 and 2003) and an 11-2 season (2006). Its moving average has only recently begun to drop, and it is only slightly below where the Buckeyes’ 10-year average sat in the early 1990s, a period that included the Earle Bruce-John Cooper transition years.

Categories: Analysis, Tangents

Summer Heat? Time to Dial Down the Energy Usage

August 13th, 2009

During a road trip this past Friday, I grabbed a magazine from the unread pile to catch up on some reading. Turns out my Outside subscription has expired. (I’ll get a renewal/resubscribe in after getting my next check – or I’ll go back to reading it online.) And it turns out I was way behind on my reading. The issue I grabbed was September 2008, which was great – it had an article that coincided nicely with the Hot, Flat, and Crowded book that I had read.

The article is the written exchange of two of the magazine’s editors in a competition to track their energy use. I had downloaded my electric bill about a month earlier, and I decided to join in the comparison a year late (and without the technological gadget). Without the special software, I would only be able to estimate my daily use. That’s still good enough for me to begin to understand how much power my wife and I use compared to other households.

First, I checked out PPL’s website, which lets consumers access special tools to understand their electricity use. I downloaded the account history and looked at the kilowatt-hours used. There were big increases in the winter months and valleys in the summer. Our rented half-double has no insulation (we’ve bugged the landlord about it to no avail – and without much opportunity to look for other options). That means the heating unit uses plenty of electricity trying to keep the old house warm during the cold winters. We topped out at 693 kilowatt-hours in February 2008 – before we began dialing the thermostat way down during the day. Our best full month was July 2007 when we used 336 kilowatt-hours.

An easy spreadsheet formula gave me the number of days in each billing period. From there, it was easy to track the average kilowatt-hours per day. The Outside article (if you didn’t follow the link) says the average American household uses 30.25 kWh per day. The most we used was 23.1 kWh per day in February 2008, and our best month was 10.5 kWh per day in July 2007. The two competing editors fell between about 8 and 18 kWh per day. I have some work to do to stay in their league. My median was 14.86 kWh, and the mean was 15.28 kWh. There aren’t big fluctuations in our energy use except for a few key months when it really spikes.
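
The same calculation can be sketched in a few lines of Python instead of a spreadsheet. The kWh totals are the two from this post; the billing-period dates are hypothetical, picked only so the periods come out to 30 and 32 days, which is what the daily figures above imply.

```python
from datetime import date


def daily_average(kwh, period_start, period_end):
    """Average kWh per day over one billing period."""
    days = (period_end - period_start).days
    return kwh / days


# Totals from the post; the dates are assumed for illustration.
feb_2008 = daily_average(693, date(2008, 1, 20), date(2008, 2, 19))
jul_2007 = daily_average(336, date(2007, 6, 25), date(2007, 7, 27))

NATIONAL_AVERAGE = 30.25  # kWh per day, as quoted from the Outside article

for label, value in [("February 2008", feb_2008), ("July 2007", jul_2007)]:
    share = value / NATIONAL_AVERAGE * 100
    print(f"{label}: {value:.1f} kWh/day ({share:.0f}% of the national average)")
```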

Average daily electricity use

I wondered what the trend was from year to year. I took a few minutes to reconfigure the chart to map out the monthly use over the course of 2007, 2008, and 2009 – and I checked the mean for each of the 12 months. So far in 2009, we’ve been below the monthly average every month except for January.

Average Daily kWh by Month

I’ve actually followed the average pretty closely for most of this year. August heat and air conditioners have driven up our electricity use in the past, but that hasn’t happened this year with a cooler summer. Looks like the windows are staying open this year, and I’m looking for ideas on how to winterize.

Categories: Analysis, Tangents

Tracking Climate Change In My Own Backyard

May 30th, 2009

I’ve been reading Hot, Flat, and Crowded by Tom Friedman. He argues that demographics and globalization risk making climate change more dramatic than earlier projected. He also expands climate change to be more than Global Warming. He terms it “Global Weirding” and writes that the impact varies from place to place. Some areas have higher temperatures while others have colder weather. Certain months are impacted more than others. Some places get more rain while others report drier conditions.

But global warming is how everyone thinks of climate change so Friedman writes about a series of interviews where his subjects talk about noticing warmer weather. Western ranchers talk about less snow remaining on mountain tops. Another person speaks about the number of record highs and lows set across the country each week. That left me wondering whether I could find any change in weather in my area simply by looking at record highs and lows and when they were set.

I checked the National Weather Service’s repository of record highs and lows for the Scranton/Wilkes-Barre area – my current home. I used the tables from 1955 to present because they’re pulled from a consistent place (the airport) rather than the general area. I typed the date, record high (“maximum high”) and corresponding year, and record low (“minimum low”) and corresponding year into an Excel spreadsheet. It’s unfortunate that the records only cover 54 years, but they’re taken from a consistent location, which was more important to me than whether they covered 100 years’ worth of temperatures.

Because I wasn’t counting leap day, I had 365 days. The time period covered 54 years. Simple math says that if there are 365 record highs and 365 record lows, I should be able to expect about 7 record highs and 7 record lows each year.

If this covered two years – 1955 and 1956 – I’d expect half of the highs to be from 1955 and the other half to be from 1956. If it covered five years – 1955 to 1959, I’d expect 20 percent of the highs (73) to come from each year. Because I have 54 years, I expected 1.85 percent of the highs to have occurred in any one year. In a 365-day year, that’s 6.75 days. There were 365 lows as well – one for each day of the year. Odds say that another 6.75 lows – rounded to 7 – would set records each year.

I realize that some years just happen to be warmer or cooler than others, and so I wanted a way to lump years together. I decided to do it by decade. There were five years in the 1950s, nine years in the 2000s (the chart doesn’t cover 2009 temperatures), and 10 years each for the 1960s, 1970s, 1980s, and 1990s. So odds say that I should have 33 or 34 records from the 1950s, 60 or 61 records from the 2000s, and 67 or 68 records for each of the other decades. If my numbers were around there, we’d be setting roughly equal numbers of record highs and record lows each year – and you wouldn’t be able to track the weather getting warmer or colder.
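
The expected counts are just each decade's share of the 54 years applied to the 365 records. A minimal Python sketch of that arithmetic, using only the year counts given above (the variable names are mine):

```python
TOTAL_RECORDS = 365  # one record per calendar day, leap day excluded
TOTAL_YEARS = 54     # 1955 through 2008

# Years of data available in each decade.
years_per_decade = {
    "1950s": 5,
    "1960s": 10,
    "1970s": 10,
    "1980s": 10,
    "1990s": 10,
    "2000s": 9,
}

# If records were spread evenly, every year would get the same share.
records_per_year = TOTAL_RECORDS / TOTAL_YEARS

expected = {
    decade: round(years * records_per_year, 1)
    for decade, years in years_per_decade.items()
}

for decade, count in expected.items():
    print(f"{decade}: {count} expected records")
```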

I didn’t get those results.

Decade | Projected Number of Records | Number of Record Highs | Number of Record Lows
1950s | 33.8 | 20 | 56
1960s | 67.6 | 55 | 76
1970s | 67.6 | 51 | 76
1980s | 67.6 | 63 | 58
1990s | 67.6 | 96 | 54
2000s | 60.8 | 80 | 45
Total | 365 | 365 | 365

As you can see, there were a lot more record high temperatures set more recently than record lows. In the 1990s and the 2000s, there were 77.8 percent more record highs set than record lows set. We were still setting record low temperatures, but we were setting new high temperatures much more often. While the 1990s represented 18.5 percent of the years in the study, 26 percent of the high temperatures occurred in that decade. The 2000s represented 16.7 percent of the years, and 21.9 percent of the high temperatures. The 1950s are 9.3 percent of the years in the study, and 5.5 percent of the high temperatures. That same decade has 15.3 percent of the record lows for the period.
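
The percentages in that paragraph can be checked directly from the table. A short Python sketch, with the counts copied from the table (the helper name `share` is mine):

```python
highs = {"1950s": 20, "1960s": 55, "1970s": 51, "1980s": 63, "1990s": 96, "2000s": 80}
lows = {"1950s": 56, "1960s": 76, "1970s": 76, "1980s": 58, "1990s": 54, "2000s": 45}


def share(counts, decade):
    """Percent of all records in `counts` that fall in `decade`."""
    return counts[decade] / sum(counts.values()) * 100


recent = ("1990s", "2000s")
recent_highs = sum(highs[d] for d in recent)
recent_lows = sum(lows[d] for d in recent)

# How many more record highs than record lows were set since 1990, in percent.
excess_pct = (recent_highs / recent_lows - 1) * 100

print(f"Recent highs vs. lows: {recent_highs} vs. {recent_lows} ({excess_pct:.1f}% more highs)")
print(f"1990s share of highs: {share(highs, '1990s'):.1f}%")
print(f"2000s share of highs: {share(highs, '2000s'):.1f}%")
print(f"1950s share of highs: {share(highs, '1950s'):.1f}%")
print(f"1950s share of lows:  {share(lows, '1950s'):.1f}%")
```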

We’re setting both new highs and new lows in each decade. But the highs are coming more frequently in recent decades. How drastic is the change? It’s difficult to see because the 1950s and 2000s don’t have the same number of years included as the other decades. To have a better view of the trend, I divided the 54 years into nine groups of six years each: 1955-1960, 1961-1966, 1967-1972, 1973-1978, 1979-1984, 1985-1990, 1991-1996, 1997-2002, 2003-2008.

The odds say you should have a roughly equal number of record highs and record lows set in each time period – just more than 40.5 each. (Each year should account for 1.85 percent of the records, so six years account for 11.1 percent, and 11.1 percent of the 365 days in a year is 40.5.) The final numbers didn’t match the odds. Remember, the number of records for both highs and lows should be right around 40.
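
Here is a rough Python sketch of the six-year grouping, in case you want to bucket your own list of record years. The constants come from the post; the function name `group_for` is hypothetical.

```python
FIRST_YEAR = 1955
LAST_YEAR = 2008
SPAN = 6  # years per group

# (start, end) for each group: (1955, 1960), (1961, 1966), ...
groups = [
    (start, start + SPAN - 1)
    for start in range(FIRST_YEAR, LAST_YEAR + 1, SPAN)
]


def group_for(year):
    """Return the (start, end) group that a record year falls into."""
    return groups[(year - FIRST_YEAR) // SPAN]


# With records spread evenly, each group gets an equal slice of the 365 days.
expected_per_group = 365 / len(groups)

print(len(groups), "groups, about", round(expected_per_group, 2), "records each")
print(group_for(1996))
```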

Number of Records Set

I was really surprised to see this result. I’ll take some time to look into individual months to see if any part of the year is more affected than another. But it turned out to be pretty easy to chart the fact that it’s getting warmer in Northeastern Pennsylvania. We’re setting many more record highs than record lows.

Categories: Analysis, Tangents

Measuring the Impact

May 21st, 2009

About six weeks ago, I wrote about how a monthly e-newsletter was key to driving traffic to a blog and website where I worked. A few days later I was reminded of the Pareto Principle – also known as the 80/20 rule. The monthly e-mail doesn’t drive that much traffic, but I’m a sucker for a quick analysis and measuring the ROI is always a great thing to do. That led me to try to compare a few numbers to quantify how big an impact the e-newsletter gives.

Quick disclaimer. My six months of numbers are a little dated – the last quarter of 2008 and first quarter of 2009.

The e-newsletter, web page, and a blog received the majority of views during each month so I just looked at those sources. I left the e-newsletter numbers out as well because I wanted to understand whether the e-newsletter really increased the number of web page and blog views. So I focused on those two numbers. And I looked at the six-day period from when the e-newsletter was sent. Over the course of the typical 30-day month, those six days are 20 percent.

Month | Source | Month Views | 6-day Views | 6-day Percentage
March | web | 1,476 | 505 | 34.2%
February | web | 1,472 | 435 | 29.6%
January | web | 2,172 | 654 | 30.1%
December | web | 1,569 | 382 | 24.3%
November | web | 1,737 | 494 | 28.4%
October | web | 2,160 | 648 | 30.0%

A copy of the e-newsletter was kept on the website and many articles were posted on the site as well. While each open and click could be listed as a page view, I only measured hits on the index page. The newsletter offered the chance to go to my organization’s “home page” (a link to the index page), and a number of people did so. In the six days after an e-mail (20 percent of a month) we always had more than 20 percent of our monthly hits – as high as 34 percent in the final month that I tracked. Overall, the main web page generated 29.5 percent of its hits in the 20 percent of the month after an e-newsletter.

This trend was even more obvious in the blog hits. We launched the blog on WordPress.com in September, added a link to our web page late in that month, and began to promote the blog in the e-newsletter in October. The concept of visiting the blog was new to stakeholders throughout this period, and the monthly e-mail provided a great reminder and driver to the blog.

Month | Source | Month Views | 6-day Views | 6-day Percentage
March | blog | 2,525 | 1,376 | 54.5%
February | blog | 1,785 | 743 | 41.6%
January | blog | 1,684 | 618 | 36.7%
December | blog | 1,891 | 1,082 | 57.2%
November | blog | 2,363 | 1,411 | 59.7%
October | blog | 1,271 | 491 | 38.6%

During half of the months studied, the blog received more than half of its page views in the six days immediately after the e-mail. While this isn’t an 80/20 split, overall the blog received 49.7 percent of its traffic in the 20 percent of the time following a monthly e-mail.
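
The overall figures for the web page and the blog come from pooling the six months rather than averaging the monthly percentages. A small Python sketch of that calculation, with the view counts copied from the two tables (the function and variable names are mine):

```python
# (month views, views in the six days after the e-newsletter)
web = {
    "October": (2160, 648),
    "November": (1737, 494),
    "December": (1569, 382),
    "January": (2172, 654),
    "February": (1472, 435),
    "March": (1476, 505),
}
blog = {
    "October": (1271, 491),
    "November": (2363, 1411),
    "December": (1891, 1082),
    "January": (1684, 618),
    "February": (1785, 743),
    "March": (2525, 1376),
}


def six_day_share(views):
    """Percent of all views that landed in the six days after the e-mail."""
    month_total = sum(month for month, _ in views.values())
    six_day_total = sum(six_day for _, six_day in views.values())
    return six_day_total / month_total * 100


# Six days out of a typical 30-day month: what an even spread would give.
baseline = 100 * 6 / 30

web_share = six_day_share(web)
blog_share = six_day_share(blog)

# Months where the blog got most of its views right after the e-mail.
blog_majority_months = [
    name for name, (month, six_day) in blog.items() if six_day / month > 0.5
]

print(f"Baseline: {baseline:.0f}%  Web: {web_share:.1f}%  Blog: {blog_share:.1f}%")
print("Blog months over 50%:", ", ".join(blog_majority_months))
```

Anything above the 20 percent baseline is traffic arriving faster than an even spread would predict, which is the lift the e-newsletter appears to provide.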

Content was likely one of the main reasons the blog fared better than the web page in the days after the e-mail. But the takeaway is the same. When planning communications, include something regular to provide your audience with a gentle reminder that you’re there. E-mail is deleted too easily and too regularly – especially when you lean too heavily on it. But e-mail is low-cost and unobtrusive enough that it can give your readers a push to get more information about you.