April 26, 2011
The five variables considered were:
- “percent of people in each city that put their green beliefs into action”
- “percentage of people who are willing to admit to no concern or consciousness of environmental issues”
- “percent of people who make a conscious effort to recycle”
- “Average trips taken on public transport each weekday”
- “percent of homes that use solar energy for heating”
These variables were said to be “weighted equally”, and that should set off alarm bells: how do you equally weight variables with different units? You could use ranks, but I bet that isn’t what they did.
Of the top 25 cities, Vegas was 21st in consciousness, 17th= best (i.e. lowest) in unconsciousness, last in recycling, and had no public transport data. But they were first in solar power! Meanwhile, San Francisco was first in consciousness, 4th= best in unconsciousness, first in recycling, 2nd in public transit per capita, and 11th in solar. So how did Vegas beat SF overall?
Well, Vegas has a lot of solar power: 0.43% of households, while second-placed Albuquerque had 0.2% and the median for the top 25 was 0.06%. Whatever standardisation they did (perhaps changing to z-scores) left Vegas a huge outlier for solar. So though Vegas sucked in all the other categories, solar alone pushed them to second overall.
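To get a feel for how big that outlier could be, here is a rough sketch in Python. I don't have the full dataset, so only the top two values (0.43 and 0.20) and the median (0.06) are real; every other number in `solar` is invented to fill out a plausible top 25, and z-scoring is only my guess at what the ranking's authors did.

```python
from statistics import mean, median, pstdev

# Hypothetical solar-heating percentages for 25 cities, highest first.
# Only 0.43 (Vegas), 0.20 (Albuquerque) and the median of 0.06 come from
# the post; all the remaining values are made up for illustration.
solar = [0.43, 0.20, 0.12, 0.11, 0.10, 0.09, 0.09, 0.08, 0.08, 0.07,
         0.07, 0.06, 0.06, 0.06, 0.05, 0.05, 0.05, 0.04, 0.04, 0.04,
         0.03, 0.03, 0.03, 0.02, 0.02]
assert len(solar) == 25 and median(solar) == 0.06

# Standardise: subtract the mean, divide by the (population) standard deviation.
mu, sigma = mean(solar), pstdev(solar)
z = [(x - mu) / sigma for x in solar]

print(f"Vegas z-score:       {z[0]:.2f}")
print(f"Albuquerque z-score: {z[1]:.2f}")
```

With these made-up fillers, Vegas lands roughly four standard deviations above the mean, while Albuquerque is well under two. If the other variables are spread more evenly, no city can gain or lose nearly that much on any of them.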
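- Changing to z-scores might not be sensible if your data aren’t normal: one extreme value can swamp everything else.
- Ranks can be good: first place is worth the same no matter how far ahead it is.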
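Here is a toy comparison of the two ways of weighting "equally", sketched in Python. Everything in it is hypothetical: the cities `Outlier` and `Steady`, the 23 filler cities, and all the numbers except the 0.43 and 0.06 borrowed from the solar figures. The helper names `z_scores` and `ranks` are mine, not anything from the original ranking.

```python
from statistics import mean, pstdev

# Hypothetical data for 25 cities; higher is better on both variables.
# "Outlier" is last in recycling but far ahead in solar; "Steady" is first
# in recycling and a modest second in solar. The fillers are invented.
names = ["Outlier", "Steady"] + [f"City{i}" for i in range(23)]
recycling = [10, 34] + [11 + i for i in range(23)]
solar = [0.43, 0.06] + [0.020 + 0.001 * i for i in range(23)]

def z_scores(values):
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]

def ranks(values):
    # Competition ranking: 1 = best, tied values share the better rank.
    return [1 + sum(other > v for other in values) for v in values]

# Equal weighting, two ways: sum of z-scores (higher is better)
# versus sum of ranks (lower is better).
z_total = [a + b for a, b in zip(z_scores(recycling), z_scores(solar))]
rank_total = [a + b for a, b in zip(ranks(recycling), ranks(solar))]

by_z = sorted(names, key=lambda n: -z_total[names.index(n)])
by_rank = sorted(names, key=lambda n: rank_total[names.index(n)])

print("Top 3 by z-score sum:", by_z[:3])
print("Top 3 by rank sum:   ", by_rank[:3])
print("Outlier's place by rank sum:", by_rank.index("Outlier") + 1)
```

Under the z-score sum, `Outlier` comes out on top on the strength of solar alone. Under the rank sum, a first and a last place roughly cancel, so it drops to the middle of the pack and `Steady` wins.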