Where, oh Where?

For nearly half of my adult life, I have dreamed of leaving Omaha for greener pastures. The plan has always been to finish my time with the Army, get my degree, and peace out. Now, as we enter 2019, I am approaching the 10-year anniversary of my honorable discharge from the Army and receiving my Bachelor of Science in Computer Science and Mathematics. And I’m still here.

The reasons are varied and not uncommon, but the predominant one has been money, specifically debt. This year I have a very real chance to get out of debt, so it is probably time to address the question of where I would move if I could. And of course, it’s a simple question without a simple answer.

Conventional wisdom says to go where the jobs are, and in tech, most of those are on the West Coast. However, that’s also where the rent is three to five times what I’m paying in Omaha. And then there are the considerations of public transportation, weather, and the dating and music scenes.

So rather than listen to the various voices telling me where I should and shouldn’t move, I’ve decided to take a somewhat empirical approach. Let’s establish some metrics, gather the data, and compare all my options on an equal scale.

Metrics and Data Collection

First, I needed to establish some metrics. Which cities would I be comparing? I’m mostly interested in densely populated cities, not small towns. So I simply started with Wikipedia’s List of United States cities by population. This is a list of 311 cities containing at least 100,000 people according to 2017 estimates based on the 2010 Census.

Caveat: When people speak of cities, they often mean the surrounding suburbs too, even if those suburbs are technically their own cities. An example would be Miami, which, by population, is actually smaller than Omaha. But when people talk of Miami, they usually mean to include neighboring Hialeah, Miami Gardens, Miami Beach, etc. In this analysis, we are dealing with the technical definition of the cities, as Wikipedia has listed them.

Top 25 Cities by 2017 Population

| Rank | Location | Population 2017 |
|---|---|---|
| 1 | New York, New York | 8622698 |
| 2 | Chicago, Illinois | 2716450 |
| 3 | Philadelphia, Pennsylvania | 1580863 |
| 4 | Houston, Texas | 2312717 |
| 5 | Louisville, Kentucky | 621349 |
| 6 | Detroit, Michigan | 673104 |
| 7 | Baltimore, Maryland | 611648 |
| 8 | Jacksonville, Florida | 892062 |
| 9 | Phoenix, Arizona | 1626078 |
| 10 | Los Angeles, California | 3999759 |
| 11 | Buffalo, New York | 258612 |
| 12 | New Orleans, Louisiana | 393292 |
| 13 | Gainesville, Florida | 132249 |
| 14 | Dallas, Texas | 1341075 |
| 15 | Augusta, Georgia | 197166 |
| 16 | Syracuse, New York | 143396 |
| 17 | Milwaukee, Wisconsin | 595351 |
| 18 | Nashville, Tennessee | 667560 |
| 19 | Miami, Florida | 463347 |
| 20 | San Antonio, Texas | 1511946 |
| 21 | Pittsburgh, Pennsylvania | 302407 |
| 22 | Cleveland, Ohio | 385525 |
| 23 | Rochester, New York | 208046 |
| 24 | McAllen, Texas | 142696 |
| 25 | Winston-Salem, North Carolina | 244605 |

Metrics

I then broke down the desirable traits of a city into categories and further into quantifiable data points.

  • Climate
    • Difference from 78 degrees by month
    • Precipitation
    • Air Quality
  • Population
    • Population density
    • Population growth in last 10 years
    • Median age
    • % married
    • % male
    • Crime rate
  • Livability
    • Cost of living
    • Walk score
    • Transit score
    • Bike score
    • % of favorite bands that have played there recently

The result is a score that I just now made up called the CPL Score (for Climate, Population, and Livability). The CPL Score is a holistic look at a city given the metrics that are important to me. Someone else may have a different idea of what makes a city desirable. For instance, I am looking for a very low median age, high population density, a climate as close as possible to 78 degrees for much of the year, and a place that brings in a lot of heavy metal shows.

Collecting Data

I started off gathering climate data from http://usclimatedata.com. This ended up being a bust and a huge waste of time and effort because not all of the cities were in their database. I ended up scraping most of my data from https://www.areavibes.com. They also provide their own Livability score, but I didn’t use it because my criteria are a little different.

For mobility, I used https://www.walkscore.com.

Concerts were a little less scientific. I simply went to http://loudwire.com and looked at all the concerts I would have totally gone to in the last year had they come to Omaha. I supplemented with shows from smaller bands listed on https://www.bandsintown.com. I didn’t need that big a sample here because it really was kind of a random sample. I ended up with 570 show listings. Good enough for my purposes. More than likely, I wasn’t going to find a Behemoth show in Davenport, Iowa, no matter how much data I gathered.
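The Concerts figure for each city boils down to counting listings per city. Here is a rough sketch of that tally in Python (not the R used for the actual analysis), assuming the listings have already been gathered as (band, city) pairs. The band names and pairs below are made-up placeholders, not the 570 real listings.

```python
from collections import Counter

# Hypothetical (band, city) listings; placeholders, not real tour dates.
listings = [
    ("Band A", "Chicago, Illinois"),
    ("Band B", "Chicago, Illinois"),
    ("Band A", "New York, New York"),
]


def concerts_per_city(listings):
    """Count how many listed shows took place in each city."""
    return Counter(city for _band, city in listings)


counts = concerts_per_city(listings)
print(counts["Chicago, Illinois"])  # 2
print(counts["Davenport, Iowa"])    # 0, a Counter returns 0 for unseen keys
```

A Counter is handy here because cities with no shows simply come back as 0 instead of raising an error.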

Cleaning and Normalizing

The collected data are not perfect. Several cities didn’t match the AreaVibes URL and didn’t get collected, so I had to enter those manually. I also meant to collect the average temperature for each month, but I mistakenly got the average LOW temperature for each month instead. This would have been fine, but during manual entry, I accidentally put in the average, so there’s a bit of a mish-mash. The manual entries are for very small cities that I most likely wouldn’t have considered anyway.

Once the data were loaded and all the NAs were gone (either filled in with actual data or replaced with 0), I standardized and normalized the columns. This allows me to compare the metrics on an equal footing. The one exception is Cost of Living: it is pretty important, so I multiplied its normalized value by 5.
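To make that step concrete, here is a minimal Python sketch (again, not the actual R code). It assumes that "standardized" means a z-score, i.e. subtracting the column mean and dividing by the standard deviation; that is my reading, and the real script may scale differently. The helper names are invented for illustration. The Cost of Living and Transit values are the first four rows of the Top 25 table, with one Transit value blanked out on purpose to show the fill-with-0 step.

```python
from statistics import mean, pstdev

COST_OF_LIVING_WEIGHT = 5  # the multiplier mentioned above


def fill_missing(values, default=0):
    """Replace missing entries (None) with a default, 0 as in the post."""
    return [default if v is None else v for v in values]


def standardize(values):
    """Z-score a column: subtract the mean, divide by the std deviation.

    Assumption: this is what "standardized" means here.
    """
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant column carries no information; score it as all zeros.
        return [0.0] * len(values)
    return [(v - mu) / sigma for v in values]


# New York, Chicago, Philadelphia, Houston (from the Top 25 table).
cost_of_living = [166, 113, 104, 91]
transit = [85, 65, None, 37]  # None is a hypothetical gap for illustration

cost_z = [COST_OF_LIVING_WEIGHT * z for z in standardize(cost_of_living)]
transit_z = standardize(fill_missing(transit))

print([round(z, 2) for z in cost_z])
print([round(z, 2) for z in transit_z])
```

After standardizing, every column has mean 0 and spread 1, so multiplying Cost of Living by 5 gives it five times the pull of any other single metric.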

Finally, the objective is to get the “golf score”, or smallest score. So for features that I preferred to be larger, such as population density, I took the negative. For temperature, I took the absolute value of the temperature minus 78 degrees (the closer to 78 the better).
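The two adjustments can be sketched in a few lines of Python (not the original R). The function names are mine, and the monthly temperatures are made-up numbers; the density values are New York, Chicago, and Houston from the Top 25 table.

```python
IDEAL_TEMP = 78  # degrees, the target temperature named above


def monthly_gaps(monthly_temps, ideal=IDEAL_TEMP):
    """Absolute distance of each month's temperature from the ideal."""
    return [abs(t - ideal) for t in monthly_temps]


def flip_sign(values):
    """Negate a larger-is-better metric so that smaller becomes better."""
    return [-v for v in values]


temps = [58, 78, 93]  # hypothetical monthly temperatures
print(monthly_gaps(temps))  # [20, 0, 15]

densities = [28317, 11900, 3613]  # New York, Chicago, Houston
print(flip_sign(densities))  # [-28317, -11900, -3613]
```

Either way, a smaller number now always means a better city, which is what lets the metrics be added together.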

The final CPL score was just the sum of each row’s values. Smallest score wins.
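As a rough Python illustration of that last step (not the original R), assume each city already has a row of adjusted values where smaller is better. The city names, metric names, and numbers below are invented for the example.

```python
def cpl_score(row):
    """Sum every adjusted metric in a city's row."""
    return sum(row.values())


def rank_cities(table):
    """Return city names ordered from smallest (best) score to largest."""
    return sorted(table, key=lambda city: cpl_score(table[city]))


# Hypothetical adjusted values, not real data.
table = {
    "City A": {"density": -1.5, "cost_of_living": 2.0, "temp_gap": 0.5},
    "City B": {"density": -0.5, "cost_of_living": -1.0, "temp_gap": -0.25},
    "City C": {"density": 0.5, "cost_of_living": 0.25, "temp_gap": 1.0},
}

ranking = rank_cities(table)
print(ranking)  # ['City B', 'City A', 'City C']
```

City B wins here with a total of -1.75, ahead of City A at 1.0 and City C at 1.75.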

Top 25 Cities by CPL Score

| Rank | CPL Score | Location | Population 2017 | Area (mi^2) | Density | Walk | Bike | Transit | Age | M/F | Married | Families | Cost of Living | Crime | Concerts | Weather (LOW) | Precip. | Air |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | -22.424276 | New York, New York | 8622698 | 301.5 | 28317 | 89 | 68 | 85 | 35.9 | 0.9 | 0.46 | 0.44 | 166 | 1987 | 8 | 47.66667 | 4.0583333 | 53 |
| 2 | -16.463894 | Chicago, Illinois | 2716450 | 227.3 | 11900 | 78 | 72 | 65 | 33.9 | 0.9 | 0.41 | 0.45 | 113 | 4363 | 14 | 44.58333 | 3.1750000 | 60 |
| 3 | -15.557255 | Philadelphia, Pennsylvania | 1580863 | 134.2 | 11683 | 79 | 66 | 67 | 33.9 | 0.9 | 0.38 | 0.42 | 104 | 4011 | 9 | 47.16667 | 4.1250000 | 57 |
| 4 | -12.870909 | Houston, Texas | 2312717 | 637.5 | 3613 | 49 | 48 | 37 | 32.7 | 1.0 | 0.48 | 0.49 | 91 | 5224 | 11 | 62.58333 | 5.3166667 | 52 |
| 5 | -12.286090 | Louisville, Kentucky | 621349 | 263.5 | 2339 | 96 | 65 | 74 | 37.3 | 0.9 | 0.50 | 0.43 | 91 | 4769 | 6 | 48.41667 | 4.5833333 | 55 |
| 6 | -10.781799 | Detroit, Michigan | 673104 | 138.8 | 4847 | 55 | 52 | 38 | 34.8 | 0.9 | 0.32 | 0.45 | 80 | 6597 | 6 | 42.58333 | 2.7916667 | 52 |
| 7 | -10.634362 | Baltimore, Maryland | 611648 | 80.9 | 7598 | 69 | 52 | 57 | 34.7 | 0.9 | 0.36 | 0.41 | 100 | 6955 | 9 | 53.08333 | 3.8416667 | 46 |
| 8 | -10.244231 | Jacksonville, Florida | 892062 | 747.4 | 1178 | 27 | 40 | 22 | 35.7 | 0.9 | 0.50 | 0.42 | 93 | 4158 | 5 | 63.41667 | 4.1833333 | 38 |
| 9 | -10.166915 | Phoenix, Arizona | 1626078 | 517.6 | 3120 | 41 | 52 | 36 | 33.3 | 1.0 | 0.46 | 0.50 | 95 | 4432 | 8 | 63.50000 | 0.5833333 | 74 |
| 10 | -9.990661 | Los Angeles, California | 3999759 | 468.7 | 8484 | 67 | 55 | 51 | 35.0 | 1.0 | 0.44 | 0.45 | 146 | 3297 | 16 | 56.66667 | 1.1416667 | 74 |
| 11 | -9.895247 | Buffalo, New York | 258612 | 40.4 | 6359 | 68 | 59 | 49 | 32.9 | 0.9 | 0.36 | 0.49 | 85 | 4855 | 1 | 39.91667 | 3.6750000 | 41 |
| 12 | -9.756202 | New Orleans, Louisiana | 393292 | 169.4 | 2311 | 58 | 64 | 44 | 35.5 | 0.9 | 0.38 | 0.42 | 99 | 5365 | 9 | 64.58333 | 5.6833333 | 45 |
| 13 | -9.735823 | Gainesville, Florida | 132249 | 62.3 | 2112 | 34 | 66 | 0 | 25.7 | 0.9 | 0.29 | 0.38 | 95 | 4316 | 0 | 57.66667 | 4.1833333 | 33 |
| 14 | -9.557525 | Dallas, Texas | 1341075 | 340.9 | 3866 | 46 | 46 | 40 | 32.5 | 1.0 | 0.46 | 0.49 | 95 | 3960 | 12 | 58.50000 | 3.5583333 | 54 |
| 15 | -9.499844 | Augusta, Georgia | 197166 | 302.5 | 652 | 68 | 47 | 0 | 33.5 | 0.9 | 0.43 | 0.40 | 87 | 3217 | 0 | 56.50000 | 3.6333333 | 42 |
| 16 | -9.350472 | Syracuse, New York | 143396 | 25.0 | 5735 | 61 | 46 | 44 | 30.6 | 0.9 | 0.34 | 0.46 | 87 | 4072 | 0 | 38.66667 | 3.7666667 | 32 |
| 17 | -9.185876 | Milwaukee, Wisconsin | 595351 | 96.2 | 6186 | 62 | 54 | 49 | 31.0 | 0.9 | 0.36 | 0.52 | 89 | 5389 | 2 | 39.83333 | 3.0583333 | 45 |
| 18 | -9.035180 | Nashville, Tennessee | 667560 | 475.9 | 1388 | 91 | 61 | 0 | 34.0 | 0.9 | 0.46 | 0.45 | 98 | 4956 | 4 | 49.08333 | 4.3916667 | 45 |
| 19 | -9.010497 | Miami, Florida | 463347 | 36.0 | 12599 | 79 | 63 | 57 | 39.7 | 1.0 | 0.41 | 0.39 | 109 | 4735 | 1 | 69.75000 | 5.7750000 | 41 |
| 20 | -8.972250 | San Antonio, Texas | 1511946 | 461.0 | 3238 | 38 | 42 | 36 | 33.1 | 1.0 | 0.49 | 0.47 | 86 | 5552 | 0 | 60.33333 | 2.4000000 | 45 |
| 21 | -8.846880 | Pittsburgh, Pennsylvania | 302407 | 55.4 | 5481 | 62 | 51 | 54 | 32.9 | 1.0 | 0.39 | 0.39 | 91 | 3771 | 5 | 43.66667 | 3.6750000 | 56 |
| 22 | -8.804279 | Cleveland, Ohio | 385525 | 77.7 | 4965 | 60 | 50 | 47 | 35.8 | 0.9 | 0.34 | 0.47 | 88 | 6473 | 12 | 46.66667 | 4.8833333 | 55 |
| 23 | -8.742531 | Rochester, New York | 208046 | 35.8 | 5835 | 65 | 53 | 43 | 31.3 | 1.0 | 0.31 | 0.50 | 88 | 4701 | 7 | 39.41667 | 3.1333333 | 36 |
| 24 | -8.601354 | McAllen, Texas | 142696 | 58.4 | 2435 | 77 | 61 | 0 | 32.4 | 1.0 | 0.58 | 0.50 | 77 | 2924 | 0 | 66.66667 | 1.8333333 | 36 |
| 25 | -8.435472 | Winston-Salem, North Carolina | 244605 | 132.5 | 1828 | 83 | 74 | 25 | 35.0 | 0.9 | 0.48 | 0.48 | 95 | 4179 | 0 | 59.66667 | 3.8666667 | 41 |

Conclusion

It’s no surprise to see New York and Chicago near the top. This list is a good starting point, but of course I’m not just going to blindly select the first one and move there. This approach simply allows me to narrow my search while also looking beyond the usual big 3 of New York/Chicago/LA. Some of these won’t even make it past the initial phase. I mean, I know for a fact I’m not moving to Columbia, Missouri, even if it is a college town.

You can find the code (Warning: Really sloppy R code) here.