DNC Transit Effect? (Part 3)

Project link: https://github.com/eric-mc2/DNCTransit

Recap

In a previous post I proposed a statistical model to estimate whether the 2024 Chicago DNC affected public transit usage across Chicago.

The learning objective was to try to construct a believable model from a real-life event, or to convince myself that the model is too flawed to be trustworthy.

I gathered ridership data for train, bike, and rideshares, plus the dates and locations of the DNC. Then I estimated two regression models: a fixed-effects model and a difference-in-difference model. The results indicate that the DNC caused ridership to increase near the convention centers. However, mobility across the rest of Chicago also declined during that week.

Do we believe this design? I’ll list several issues that I thought of.

Problems with this model

(Key: 😑 = dealbreaker; 🙁 = bad but maybe fixable; 🫀 = not a dealbreaker)

Control Mis-Specification - 😑

  • The control group should be “like” the treatment group, but I use the entire city as a comparison, so it’s not apples-to-apples. A better control group might be a handful of the major event venues and cultural institutions like Soldier Field, Wrigley Field, and the Art Institute, or matched pairs found based on covariates.

Selection Bias - 😑

  • the convention centers were explicitly chosen for their ability to accommodate visitors (via transit)
    • this is mostly a problem for external validity (over-estimating the result if repeated elsewhere), which is not the point here

Substitutability of Transit - 🙁

  • Travelers have different options for transit. Their choice depends on location, time of day, distance, cost, and the reliability of other options. I don’t like that I’m modeling these modes independently right now, but unifying them would be a rabbit hole.

Spillover into control - 🙁

  • DNC visitors might stay in Chicago longer than the official convention dates, which would attenuate any effect
    • we can mitigate this by using placebo dates or by an event study design
  • DNC visitors may visit other parts of the city, attenuating the effect
    • we can mitigate this by maybe searching for spikes in other areas (but introduces a multiple testing issue)

Attenuation biases the estimate toward zero, and the effects were statistically significant anyway, so I am not worried about these attenuation biases.

Spillover into treatment - 🙁

  • drawing buffers around the event centers weakens our identification strategy (marginally nearby transit might be spuriously related to the DNC itself)

This concern can be mitigated (for the tract-level model) with a robustness check: varying whether the buffer must contain 100% of a tract’s land area, 75%, 50%, etc.
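A minimal sketch of how that robustness check could be organized, assuming the share of each tract’s land area inside the buffer has already been computed. The tract names and overlap shares below are made-up placeholders, and `treated_tracts` is a hypothetical helper, not something from the project repo.

```python
# Hypothetical share of each tract's land area inside the DNC buffer.
# These values are invented for illustration only.
overlap_share = {
    "tract_a": 1.00,
    "tract_b": 0.80,
    "tract_c": 0.55,
    "tract_d": 0.10,
}


def treated_tracts(shares, min_share):
    """Tracts counted as 'near DNC' when at least min_share of their area is in the buffer."""
    return {tract for tract, share in shares.items() if share >= min_share}


# One treatment assignment per overlap threshold; the model would be
# re-estimated once for each assignment and the estimates compared.
thresholds = (1.00, 0.75, 0.50)
assignments = {t: treated_tracts(overlap_share, t) for t in thresholds}

for t, tracts in assignments.items():
    print(f"min overlap {t:.0%}: {sorted(tracts)}")
```

If the estimated effect is stable as the threshold loosens, the marginal tracts are probably not driving the result.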

Confounding Variables - 🙁

  • the security perimeter around the event centers may actually suppress ridership

It will be hard to disentangle the security effect (-) from the DNC effect (+). One way might be to test the sensitivity of the model to varying buffer sizes (smaller than the perimeter, equal to the perimeter, larger than the perimeter).

Gravity and Catchment Models - 🫀

  • Theoretically I’m interested in true origins and destinations, not the “first/last stop” of transit, which is the data I have. For rideshares, the two are probably the same. But for trains and bikes we can assume people need to walk the last mile, which may mean crossing census tract boundaries. This is why I haven’t aggregated station-level ridership to tract level. I’d want to model some cross-tract spillover, e.g. via a Gaussian kernel.
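One way such a Gaussian spillover could look, as a rough sketch: spread a station’s rides over nearby tracts with weights that decay with distance. The bandwidth `sigma_m`, the distances, and both function names are assumptions made for illustration; nothing here is taken from the project.

```python
import math


def gaussian_weights(distances_m, sigma_m=400.0):
    """Normalized Gaussian kernel weights for a list of distances in meters.

    sigma_m is a hypothetical walking-distance bandwidth, not a fitted value.
    """
    raw = [math.exp(-(d ** 2) / (2 * sigma_m ** 2)) for d in distances_m]
    total = sum(raw)
    return [w / total for w in raw]


def spread_rides(rides, distances_m, sigma_m=400.0):
    """Allocate one station's rides across tracts in proportion to the kernel weights."""
    return [rides * w for w in gaussian_weights(distances_m, sigma_m)]


# Made-up example: a station with 100 rides and three tract centroids
# at 0 m, 400 m and 800 m from the station.
example_distances = [0.0, 400.0, 800.0]
example_shares = spread_rides(100.0, example_distances)
```

Closer tracts receive more of the station’s rides, and the allocation always sums back to the station total, so no ridership is created or lost.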

Fixing the Selection Problem

Matched Pairs

One way to mitigate selection bias is to choose control units that are “like” treatment units at baseline. I can model

$$ P(\text{near convention} | X) $$

and then find other units with high probabilities that were in fact not near the DNC. Unfortunately, I don’t have a strong theoretical definition of “likeness”, nor the data to measure it. Ideally I’d like to operationalize “ability to handle large crowds”. I didn’t see maximum fire code capacity in Chicago’s building footprint dataset. But I can measure attendance at crowded events.
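As a rough sketch of the second step only (picking controls once probabilities exist): rank the units that were not near the DNC by their estimated probability and keep the top few. The scores and unit names are invented placeholders, `candidate_controls` is a hypothetical helper, and fitting the probability model itself is not shown.

```python
# Hypothetical fitted values of P(near convention | X); invented numbers.
scores = {
    "unit_a": 0.91,  # actually near the DNC
    "unit_b": 0.84,
    "unit_c": 0.77,
    "unit_d": 0.35,
    "unit_e": 0.12,
}
near_dnc = {"unit_a"}


def candidate_controls(scores, near_dnc, top_k=2):
    """Return the top_k untreated units with the highest estimated probability."""
    pool = [(unit, p) for unit, p in scores.items() if unit not in near_dnc]
    pool.sort(key=lambda pair: pair[1], reverse=True)
    return pool[:top_k]


controls = candidate_controls(scores, near_dnc)
```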

Attendance Model

I’ll compare the DNC locations (United Center and McCormick Place) to Chicago’s other major event venues. I chose Wrigley Field, Guaranteed Rate Field, and Soldier Field because per-game sports attendance data is readily available[1]. I also pull in conference event data for McCormick Place[2].

Now the control group is more “similar” in terms of its transit patterns. I’ll include event attendance as a variable in the regression.

One drawback of this method is that I can now only compare “event days”, drastically reducing the sample size. Worse, Soldier Field and Guaranteed Rate Field do not have games during the DNC, reducing the active control group to just the transit options near Wrigley Field.

Fig: Event timeline.

The sample sizes are just too small.

Stadium Model

On second thought, using attendance as a regression variable makes it hard to interpret our treatment effect. Ceteris paribus, I’d be modeling the effect of the DNC in excess of the DNC attendees. That’s not at all what I want.

Why not drop the attendance term, but keep the reduced sample of transit near stadiums on all game and non-game days? Now the treatment and control are much more similar in terms of transit density:

| | Not Near DNC | Near DNC | P-Value |
|---|---|---|---|
| train stations | 12 | 8 | |
| station-days | 732 | 468 | |
| daily rides, mean (SD) | 3098.3 (2692.9) | 2093.5 (1985.5) | <0.001 |
| log(daily rides), mean (SD) | 7.7 (0.9) | 7.0 (1.8) | <0.001 |
| bus_distance, mean (SD) | 230.0 (340.2) | 165.3 (167.7) | 0.580 |
| bike_distance, mean (SD) | 232.1 (164.3) | 288.5 (281.6) | 0.620 |
| sqrt(area), mean (SD) | 1919.2 (1141.3) | 3123.8 (3564.8) | 0.382 |
| lat, mean (SD) | 0.4 (1.2) | -0.3 (0.3) | 0.073 |
| long, mean (SD) | -0.1 (0.9) | -0.8 (1.1) | 0.134 |

| | Not Near DNC | Near DNC | P-Value |
|---|---|---|---|
| bike docks | 75 | 47 | |
| dock-days | 3986 | 2846 | |
| daily rides, mean (SD) | 84.1 (64.9) | 66.7 (55.9) | <0.001 |
| log(daily rides), mean (SD) | 4.0 (1.2) | 3.9 (0.9) | <0.001 |
| train_distance, mean (SD) | 1232.8 (1050.7) | 1056.5 (1240.4) | 0.420 |
| bus_distance, mean (SD) | 140.0 (145.4) | 105.6 (73.5) | 0.087 |
| sqrt(area), mean (SD) | 603.4 (215.9) | 734.0 (711.5) | 0.227 |
| lat, mean (SD) | 0.3 (1.3) | -0.4 (0.4) | <0.001 |
| long, mean (SD) | 0.0 (0.8) | -0.2 (1.2) | 0.179 |

| | Not Near DNC | Near DNC | P-Value |
|---|---|---|---|
| uber tracts | 64 | 40 | |
| tract-days | 3877 | 2428 | |
| daily rides, mean (SD) | 746.7 (2018.7) | 927.5 (1896.5) | <0.001 |
| log(daily rides), mean (SD) | 5.5 (1.5) | 5.7 (1.5) | <0.001 |
| train_distance, mean (SD) | 2010.3 (1149.6) | 2293.7 (1027.7) | 0.195 |
| bus_distance, mean (SD) | 567.4 (324.9) | 503.0 (313.2) | 0.317 |
| bike_distance, mean (SD) | 840.2 (357.8) | 889.5 (504.9) | 0.592 |
| sqrt(area), mean (SD) | 608.4 (254.8) | 784.4 (320.0) | 0.004 |
| lat, mean (SD) | 0.5 (1.2) | -0.3 (0.4) | <0.001 |
| long, mean (SD) | -0.0 (0.7) | -0.5 (1.2) | 0.028 |

I estimate the same difference in difference model as before:

$$ \log(\text{rides}_{it}) \sim \beta_0 + \beta_1 \text{DNC}_t + \beta_2 \text{near DNC}_i + \beta_3 \text{DNC}_t \times \text{near DNC}_i + X_{it} + u_{it} $$

| | DiD (Uber) | DiD (Train) | DiD (Bike) |
|---|---|---|---|
| Near DNC | 0.6760* | 0.1569 | -0.0082 |
| | (0.3960) | (0.4565) | (0.1781) |
| During DNC | -0.1139*** | -0.1447 | -0.0957 |
| | (0.0412) | (0.0980) | (0.0737) |
| Near DNC:During DNC | 0.2746*** | 0.8982 | 0.4492*** |
| | (0.0854) | (0.6198) | (0.1048) |
| log(dist to train) | -0.2957 | | 0.1952*** |
| | (0.1995) | | (0.0526) |
| log(dist to bike) | 0.1554 | 0.5530 | |
| | (0.3104) | (0.3475) | |
| log(dist to bus) | 0.2676* | -0.0356 | -0.0679 |
| | (0.1437) | (0.1728) | (0.1009) |
| R-squared | 0.5182 | 0.4228 | 0.4733 |
| R-squared Adj. | 0.5170 | 0.4159 | 0.4722 |
| N | 6718 | 1280 | 7026 |
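For intuition about what the interaction coefficient $\beta_3$ measures, here is a minimal sketch of the covariate-free 2×2 difference-in-differences on group means. The observations are made up for illustration and are not the project’s data. The regression above additionally adjusts for $X_{it}$, so its estimates will not equal a raw difference of means.

```python
from statistics import mean

# Hypothetical log(daily rides) observations, NOT the project's data.
# Keys are (location group, period).
log_rides = {
    ("near", "before"): [5.0, 5.2, 5.1],
    ("near", "during"): [5.5, 5.6, 5.4],
    ("far", "before"): [4.0, 4.1, 3.9],
    ("far", "during"): [3.9, 4.0, 3.8],
}


def did_estimate(cells):
    """(change near the DNC) minus (change away from the DNC), in log points."""
    m = {key: mean(values) for key, values in cells.items()}
    change_near = m[("near", "during")] - m[("near", "before")]
    change_far = m[("far", "during")] - m[("far", "before")]
    return change_near - change_far


estimate = did_estimate(log_rides)
print(f"2x2 DiD estimate: {estimate:.3f} log points")
```

In this toy data the control group’s change plays the role of $\beta_1$, and subtracting it from the change near the venues is what isolates the $\beta_3$-style effect.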

In the control group we observe a (non-causal) -10.8%, -13.5% (NS), and -9.1% (NS) change in rideshare, train, and bike rides respectively, which agrees directionally with the un-subsetted data. The estimated causal effect of the DNC on rideshare, train, and bike rides near the DNC is a +31.6%, +145.5% (NS), and +56.7% change respectively, an increase in magnitude.
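Because the outcome is log(rides), a coefficient $\beta$ on a dummy variable corresponds to a percent change of $e^{\beta} - 1$. The short sketch below applies that conversion to the coefficients in the table; `log_coef_to_pct` is a helper name chosen here for illustration.

```python
import math


def log_coef_to_pct(beta):
    """Percent change in rides implied by a dummy coefficient in a log-outcome model."""
    return (math.exp(beta) - 1.0) * 100.0


# Coefficients copied from the regression table above.
during_dnc = {"rideshare": -0.1139, "train": -0.1447, "bike": -0.0957}
interaction = {"rideshare": 0.2746, "train": 0.8982, "bike": 0.4492}

for mode in during_dnc:
    control = log_coef_to_pct(during_dnc[mode])
    effect = log_coef_to_pct(interaction[mode])
    print(f"{mode}: control change {control:+.1f}%, DNC effect {effect:+.1f}%")
```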

As a robustness check on this model, I plot the parallel trends and run a placebo test, shifting the simulated treatment period back across eight 4-day windows. 38% of the simulated models returned statistically significant main effects, but the actual DNC effects (x’s) were much larger than the simulated effects (box & whisker).

Fig: Placebo Test.
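A sketch of how the placebo windows could be generated, under stated assumptions: the real treatment window is the convention’s four days (August 19 to 22, 2024), and each placebo is shifted back by a whole number of weeks so that it covers the same weekdays. The weekly step is an assumption made here, not necessarily the spacing used in the project, and `placebo_windows` is a hypothetical helper.

```python
from datetime import date, timedelta

DNC_START = date(2024, 8, 19)  # first day of the 2024 DNC
WINDOW_DAYS = 4                # length of the treatment window
N_PLACEBOS = 8                 # number of simulated windows
STEP_DAYS = 7                  # ASSUMPTION: shift back one week at a time


def placebo_windows(start=DNC_START, n=N_PLACEBOS, length=WINDOW_DAYS, step=STEP_DAYS):
    """Return n (first_day, last_day) windows, each shifted further back in time."""
    windows = []
    for k in range(1, n + 1):
        first = start - timedelta(days=k * step)
        last = first + timedelta(days=length - 1)
        windows.append((first, last))
    return windows


def in_window(day, window):
    """True if day falls inside the inclusive (first_day, last_day) window."""
    first, last = window
    return first <= day <= last


windows = placebo_windows()
for first, last in windows:
    print(f"placebo 'During DNC' = {first} to {last}")
```

Each window would replace the real “During DNC” indicator in an otherwise identical model, and the resulting interaction estimates form the distribution that the actual estimates are compared against.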

Conclusion

I attempt to overcome selection bias with a pseudo-matched-pairs method, comparing transit ridership near large event centers in Chicago. The results agree directionally and lend credence to the original experimental design.

Footnotes


  1. CSVs downloaded from the sports-reference family of websites.

  2. Scraped from events listed on tradefest.io.
