
“without raising fares”

A wee nit to pick from SORTA’s recent “State of Metro” dog and pony show:
I distinctly remember one of the speakers saying something like ‘and all this without raising fares!’, and this to my feeble memory reeked of bullshit, so I found the numbers again and ran them to see if I was remembering correctly. I was indeed.

Here are the facts, as reported by SORTA to the Federal Transit Administration. Over the period where we have data on both fare revenues and ridership (currently 2002 to 2012) SORTA has been steadily getting more money from fare revenues while moving fewer passengers. We are currently at the nadir of this trend, with

  1. More fare revenue than ever
  2. Fewer passenger trips than ever

When SORTA says, by the way, that they are a ‘most efficient’ agency, a title pinned on them by the laughably unscientific UC Economics Center, it is precisely this measure they have in mind. There is hardly a better example of doublespeak to be found. Here’s the trend:

a shitty decade for transit

In order to plot both agencies together, I normalized fares and passenger trips to the same range. The scale is linear.

Now you may rightly note that the standard fare for a zone 1 trip hasn’t changed lately. But that’s not the only kind of fare that can be paid. It might not even be the most common! I don’t know for certain. I haven’t personally paid standard fare in quite a while because my transit use is partly subsidized by UC. So for example, the fare revenue variable in this data almost certainly includes UC’s cash subsidy for my fare as well as the dollar I put in myself. Multiply that by the dozens of private fare subsidies each agency probably negotiates (or drops) each year and you get a more dynamic picture. Fare revenue per trip could also be affected, though probably isn’t, by people using transit cards more or less while paying the same monthly price.

But anyway, I’ll be damned if the total price paid by riders or their agents, on a per-trip basis, doesn’t constitute a better definition of ‘fare’ than SORTA’s standard zone-1 single-segment price. And by that definition, fares have risen from $0.76 in 2002 to $1.78 in 2012 (+134%). For TANK, the change is from $0.72 in 2002 to $1.16 in 2012 (+60%). Adjusting for inflation, the changes are +84% and +26% respectively. So much for SORTA’s unchanging-fares lie.
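A rough sketch of that arithmetic, in Python. The per-trip fares are the SORTA figures quoted above. The function names and the 27% cumulative inflation figure are assumptions made for illustration; swap in a proper CPI ratio before trusting the inflation-adjusted number.

```python
def average_fare(fare_revenue, passenger_trips):
    """Average price paid per trip: total fare revenue divided by total trips."""
    if passenger_trips <= 0:
        raise ValueError("passenger_trips must be positive")
    return fare_revenue / passenger_trips


def percent_change(old, new):
    """Percent change from old to new."""
    return (new / old - 1.0) * 100.0


def real_percent_change(old, new, cumulative_inflation):
    """Percent change after deflating `new` by cumulative inflation (0.27 = 27%)."""
    return percent_change(old, new / (1.0 + cumulative_inflation))


# Per-trip fares for SORTA as quoted in the post.
sorta_2002, sorta_2012 = 0.76, 1.78
nominal = percent_change(sorta_2002, sorta_2012)

# ASSUMPTION: roughly 27% cumulative inflation from 2002 to 2012 (not from the post).
ASSUMED_INFLATION = 0.27
real = real_percent_change(sorta_2002, sorta_2012, ASSUMED_INFLATION)

print(f"nominal: {nominal:+.0f}%  inflation-adjusted: {real:+.0f}%")
```

`average_fare` is the definition of ‘fare’ argued for here: feed it the fare revenue and passenger trips an agency reports for a given year.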


I’ll end with an ineffectual plea to the people at SORTA. Please, understand that when you speak in lies and euphemisms, no matter how nice your breakfast spread, you turn off clever people and retain only the idiots and the cynical. People from all three of these categories vote, to be sure, but I know who I’d rather spend my time with. And I know who could build the better transit system.

Comments: 2
Posted in: Analysis | Data | Politics

Another Stab at the 2014 Ridership Dataset

I’m taking a self-guided course in R this semester — that is, teaching myself, but with deadlines — and since I’ve been playing with transit data for the most part, it seems appropriate to tickle y’all with some of the mildly interesting data visualizations that I’ve so far produced.

I’ll be using the 2014 SORTA spatio-temporal ridership dataset, which I’ve already sliced a couple different ways on this blog. The first was here with a set of animated maps and the second here showing basic peaking in passenger activity through time.

This time, I’m going to take that latter analysis a little further by breaking out passenger activity into lines. Go ahead and take a look at the graphic, which I’ll explain in more detail below.

SORTA 2014 ridership by time and line

Ok. So first, it’s important to understand what we’re measuring here. Our dataset tells us the average number of people getting on a bus (boarding) and the average number getting off (alighting) for each scheduled stop. There are[1] about 162,000 scheduled stops on a weekday. Of those, I was able to identify a precise, scheduled time for all but ~2,000[2]. Of the remaining ~160,000 the dataset tells me that 77,763 have at least 0.1 people boarding or alighting on an average weekday. I used those stops to calculate a weighted density plot over the span of the service day for each route. Added together of course, the individual routes sum to the total ridership for the system[3]. I then sorted the routes by their total ridership and plotted them.
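For the curious, a minimal sketch of a weighted density like the one described above, in plain Python rather than the R used for the charts. Everything here is an assumption for illustration: the records are made up, the kernel is Gaussian, and the half-hour bandwidth is arbitrary. Each curve is scaled so that its area equals the route's ridership, which is why the route curves add up to the system total.

```python
import math


def weighted_density(times, weights, grid, bandwidth=0.5):
    """Gaussian kernel estimate scaled so its integral equals sum(weights)."""
    norm = 1.0 / (bandwidth * math.sqrt(2.0 * math.pi))
    curve = []
    for g in grid:
        total = 0.0
        for t, w in zip(times, weights):
            z = (g - t) / bandwidth
            total += w * norm * math.exp(-0.5 * z * z)
        curve.append(total)
    return curve


# HYPOTHETICAL records: (route, scheduled hour of day, avg boardings + alightings)
records = [
    ("A", 7.5, 12.0), ("A", 8.0, 20.0), ("A", 17.0, 18.0),
    ("B", 8.25, 5.0), ("B", 12.0, 3.0), ("B", 17.5, 6.0),
    ("B", 13.0, 0.0),  # dropped below: under the 0.1 activity threshold
]
MIN_ACTIVITY = 0.1
records = [r for r in records if r[2] >= MIN_ACTIVITY]

STEP = 0.25  # evaluate every 15 minutes from 0:00 to 24:00
grid = [i * STEP for i in range(int(24 / STEP) + 1)]

# Group times and weights by route.
by_route = {}
for route, hour, activity in records:
    times, weights = by_route.setdefault(route, ([], []))
    times.append(hour)
    weights.append(activity)

curves = {r: weighted_density(t, w, grid) for r, (t, w) in by_route.items()}

# The system curve is just the route curves stacked on top of each other.
system = [sum(values) for values in zip(*curves.values())]

# Sort routes by total ridership, busiest first, for plotting.
ranked = sorted(curves, key=lambda r: sum(by_route[r][1]), reverse=True)
```

Stacking `curves` in the order given by `ranked` would give a chart of the same general kind as the one above.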

The first thing that becomes clear, to me at least, is that a minority of SORTA’s lines account for a large majority of actual riders. These lines by the way are precisely the ones featured in the Cincinnati Transit Frequency Map, and I’ve used their color from that map to distinguish them in the chart above. The remaining routes, as I knew even before I had this data, are relatively unimportant.

transit map of Uptown Cincinnati

May 2013 routing

The one grey line mixed in among the colored lines is the m+ (a latecomer to the frequency map), which does actually run all day on weekdays.

Now another interesting question, to me at least, is what this would look like without the pea under the mattress; how large are the rush-hour peaks if we exclude the peak-only lines from the chart? Let’s try it. I’ll also reverse the order, so we can see some of the larger lines with less distortion.

SORTA 2014 ridership by time and line no-express

Well, the rush-hours are still pretty distinct. More distinct than I would have expected. It’s an open question whether this is the result of more service in the rush-hours, or more crowding at the same level of service.

One last way (for now) to slice the data will be to take the total ridership at any given moment, and relativize each line’s total, showing each line’s percent share of the total. To keep it easy to read, I’ll leave the peak-only lines out of this one too.

SORTA all-day lines ridership flow diagram

I found it slightly surprising how straight these lines are. Only toward the end of the day do we see a major wobble in any direction, and that’s essentially the result of a few lines shutting down earlier than the others.
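The relativizing step is simple enough to sketch in a few lines of Python. The line names and ridership values below are hypothetical, and returning `None` when nothing is running is just one way to handle the empty case.

```python
def shares(curves):
    """Convert per-line ridership curves into percent shares at each time step."""
    lines = list(curves)
    steps = len(curves[lines[0]])
    result = {line: [] for line in lines}
    for i in range(steps):
        total = sum(curves[line][i] for line in lines)
        for line in lines:
            # Leave the share undefined (None) when no line has any riders.
            share = 100.0 * curves[line][i] / total if total > 0 else None
            result[line].append(share)
    return result


# HYPOTHETICAL ridership for three lines at three moments in the day
example = {
    "line 1": [30.0, 60.0, 0.0],
    "line 2": [10.0, 20.0, 0.0],
    "line 3": [10.0, 40.0, 0.0],
}
pct = shares(example)
```

When a line shuts down for the night its share drops to zero and the others grow to fill the gap, which is the kind of late-day wobble described above.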

Footnotes:

  1. or were when this data was collected
  2. These ~2,000 stops seem to account for about 1,000 passengers
  3. Minus the missing values for the records which couldn’t be matched.
Posted in: Data

KINDA T-shirts now available

Just in time for Xmas, KINDA t-shirts are now available for purchase at Rock, Paper, Scissors in Over-the-Rhine!

KINDA logo

(I have the TANK T-shirts there too.)

TANK Transit Authority of Northern Kentucky logo

The KINDA shirts are printed on grey-white Gildan tees in small through extra large. The TANK shirts are printed on a slightly lighter Tultex 50/50 cotton/poly tee, available in small through large. The TANK shirts are available in both brown and grey.

Price is $20. If you’re not able to make it to OTR, you can also just email me and have one delivered straight to your door(!) for an extra $5 shipping.

Wear one to your next public meeting! Or while you’re riding your bike past some poor bus stuck in traffic ;-)

Who the hell named these transit agencies anyway?

Comments: 2
Posted in: Silly Bullshit

Tree Map!

Just dicking around with R a bit this afternoon…here is a tree map showing the number of weekday transit riders in each city neighborhood (area) compared to the size of the neighborhood (color).

1. Downtown has many transit riders (because its chart area is proportional to that number)
2. Downtown has many riders for its geographic size (because it’s dark green in the chart)

SORTA 2013 ridership tree map

Data is from SORTA, 2013 and is available on the data page. Any stop outside of the City limits was classified as ‘other’.

Comments: 2
Posted in: Data

Schedule Padding in Time

An excerpt from (the background section of) the first draft of my thesis proposal:

“The bus is so slow! Isn’t rail just better?”
The popular confusion created by the conflation of coaches and railcars with their typical relation to automotive traffic has wasted billions of public dollars[1] (and caused me no end of frustration). Though the superficial wheel may not much matter, the general public is right to sense a distinction in the reliability and speed of transit services that operate in mixed traffic and those that are given priority over such traffic. As the public more and more aggressively demands train-based transit services, these should be read as demands for increased speed and reliability (among several other things) and planners should respond by modifying existing services to meet these demands.

Speed and reliability are a function in large part of how many potential delays a line will encounter along its course. Random delay results from unplanned disruptions such as higher-than-expected passenger loads, traffic, serial red lights, etc. Scheduled delay, also known as schedule ‘padding’, is delay that is built into scheduled transit services that allows them to be tolerant of unscheduled disruptions by acknowledging their average effects in advance. Agencies try to balance scheduled and unscheduled delay to create schedules that are neither too slow nor too often disrupted by random delay. While the public often reacts negatively to random delay events, they’re typically unaware of schedule padding, though both are dependent on basically the same environmental factors.

Since the public, and not transit schedulers, is in control, it becomes important to explain delay and its causes and effects to a lay audience and thereby to direct them toward a fruitful response. Further, since funds for radical infrastructure interventions may be difficult to find in the current political regime, attention should be focused on marginal cases and incremental improvements to surface-running bus lines.

Simply, the question is: where can the smallest new delay-avoidance technique create the biggest improvement in speed and reliability for existing services?


As a first step toward an answer to this question, I’ve created a rough measure of the amount of schedule padding identifiable from just the schedule information itself. It’s not perfect by any means, but I’m going to run with it for a moment and see where it leads me. First I identified all unique trip segments in the transit system. A segment is defined here as the travel between two unique stops, so…

( stop A -> travel -> stop B ) = segment 1
( stop B -> travel -> stop C ) = segment 2
( stop A -> travel -> stop B ) = segment 1 again
( stop B -> travel -> stop A ) = segment 3
for a total of 4 segments, 3 unique in our example.
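The same count, as a short Python sketch. How the four segments group into trips is an assumption made for the sake of the example; in a real GTFS feed each trip's ordered stop list would come from stop_times.txt.

```python
from collections import Counter


def segments(trip_stops):
    """Pair each stop with the next one: [A, B, C] -> [(A, B), (B, C)]."""
    return list(zip(trip_stops, trip_stops[1:]))


# HYPOTHETICAL trips that reproduce the four segments listed above
trips = [
    ["A", "B", "C"],  # segment 1, then segment 2
    ["A", "B"],       # segment 1 again
    ["B", "A"],       # segment 3 (direction matters)
]

all_segments = [seg for trip in trips for seg in segments(trip)]
counts = Counter(all_segments)

total_segments = len(all_segments)
unique_segments = len(counts)
print(total_segments, "segments,", unique_segments, "unique")
```

Note that (A, B) and (B, A) are counted as different segments, matching the definition above.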

For SORTA’s current GTFS schedule, we observe:

           Total Segments   Unique Segments
Weekday           164,621             5,159
Saturday          103,757             3,914
Sunday             69,380             3,550

Each segment has some times associated with it:

  1. A departure from the first stop
  2. An arrival at the second stop
  3. Implicitly, the time scheduled to complete the segment.

Because schedulers expect that the amount of time it will take a bus to get from A to B will be different at different times of day, these otherwise identical segments will have different durations. By finding the deviation from the minimum duration for each segment, we can get a crude measure of the schedule padding built into the system.


           Hours of Padding   Scheduled Vehicle Hours   % Padding
Weekday              414.60                  1,925.22      21.53%
Saturday             174.19                  1,036.57      16.80%
Sunday                97.36                    664.13      14.66%
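A minimal Python sketch of the padding measure described above: for each unique segment, find the shortest scheduled duration, then count everything above that minimum as padding. The durations are hypothetical, and treating scheduled vehicle hours as the plain sum of segment durations is an assumption, not necessarily how the figures in the table were produced.

```python
def padding_summary(segment_runs):
    """Summarize padding for an iterable of (from_stop, to_stop, duration_seconds).

    Returns (padding_hours, scheduled_hours, percent_padding).
    """
    runs = list(segment_runs)

    # Fastest scheduled duration for each unique segment.
    fastest = {}
    for a, b, duration in runs:
        key = (a, b)
        fastest[key] = min(duration, fastest.get(key, duration))

    scheduled = sum(duration for _, _, duration in runs)
    padding = sum(duration - fastest[(a, b)] for a, b, duration in runs)
    percent = 100.0 * padding / scheduled if scheduled else 0.0
    return padding / 3600.0, scheduled / 3600.0, percent


# HYPOTHETICAL scheduled runs: the A -> B segment is slower at busier times of day
runs = [
    ("A", "B", 120),  # early morning, the fastest scheduling
    ("A", "B", 180),  # rush hour
    ("A", "B", 150),  # midday
    ("B", "C", 60),
    ("B", "C", 60),
]
padding_hours, scheduled_hours, percent_padding = padding_summary(runs)
```

By construction the fastest run of every segment contributes zero padding, so this can only ever measure padding relative to the schedule's own best case.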

This method estimates that 21.5% of the weekday schedule is actually scheduled delay, more than 400 hours of it, each weekday. That is, at least relative to the fastest any bus is scheduled to complete a segment. Just where is this scheduled delay anyway? When and where are the schedules most heavily padded? I’ll save a spatial exploration for later, but let’s take a very preliminary peek into the temporal dimension.

The first question we must ask is: when are all of the segments? By taking a central moment as the time, we can plot them in a kernel-smoothed histogram:

SORTA segments per hour

This clearly shows the basic level of service throughout the day and week. It’s not a great measure of that as such, but it does give us a definite sense of the balanced weekday rush-hours and diminished weekend service.
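One way to get a single time for each segment, sketched in Python, assuming the ‘central moment’ is the midpoint between departure and arrival. GTFS writes times as HH:MM:SS strings and lets the hour run past 24 for service that continues after midnight, so the parser can't lean on an ordinary clock-time type. The function names are made up for this sketch.

```python
def gtfs_seconds(hms):
    """Convert a GTFS 'HH:MM:SS' string to seconds; hours may exceed 23."""
    hours, minutes, seconds = (int(part) for part in hms.strip().split(":"))
    return hours * 3600 + minutes * 60 + seconds


def segment_midpoint_hour(departure, arrival):
    """Midpoint of a segment, in fractional hours since the start of the service day."""
    dep = gtfs_seconds(departure)
    arr = gtfs_seconds(arrival)
    if arr < dep:
        raise ValueError("arrival is before departure")
    return (dep + arr) / 2.0 / 3600.0


morning = segment_midpoint_hour("08:00:00", "08:03:00")
after_midnight = segment_midpoint_hour("25:10:00", "25:20:00")
```

A segment that departs at 25:10:00 belongs to the previous day's service, so its midpoint comes out past hour 24 instead of wrapping around to the early morning.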

Then, since most of the segments are padded, we ask: when are the segments without padding? On the same scale, we get:

SORTA unpadded segments per hour

As we might have expected, there is less random delay and thus less need for padding when the streets and buses are less congested: early morning and late at night. It also appears that there are relatively fewer padded segments on the weekends, though the total number of unpadded segments is roughly the same as on weekdays.

Ok, so when is the padding itself and how much of it is there? Note that we’re measuring something different here: hours of padding per hour.
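A crude Python sketch of ‘hours of padding per hour’, using plain hourly bins instead of kernel smoothing. The input values are hypothetical, and folding after-midnight times back into the early-morning bins is a simplification chosen for this sketch.

```python
def padding_hours_by_hour(padded_segments):
    """Sum padding into 24 hourly bins.

    padded_segments: iterable of (midpoint_hour, padding_seconds).
    Returns a list of 24 values, each in hours of padding.
    """
    bins = [0.0] * 24
    for midpoint_hour, padding_seconds in padded_segments:
        # Times past 24:00 are folded back into the early-morning bins.
        bins[int(midpoint_hour) % 24] += padding_seconds / 3600.0
    return bins


# HYPOTHETICAL segments: (midpoint in hours, seconds of padding)
sample = [(7.5, 1800), (7.9, 1800), (17.2, 900), (24.5, 360)]
per_hour = padding_hours_by_hour(sample)
```

Each bin is a quantity of time per unit of time, so a value of 1.0 means a full hour of scheduled delay was spread across all the buses running during that hour.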

SORTA padding in time

Now, this definitely has a different shape than the overall distribution of schedule segments, but it’s a little hard to compare them when they’re so far apart. Let’s combine all of these into one plot. I might have got a little carried away in Inkscape…

SORTA Schedule Padding by Time-of-Day

I’ll just let that speak for itself for now. We’ll get into spatial visualizations of this data next, and eventually real-time comparisons and measures in space-time.

Footnotes:

  1. Citation needed…
Comments: 2
Posted in: Back to Basics | Data

Why the buses just stop

A friend of mine just emailed with a very good question: Why do the buses just stop sometimes, with no one getting on or off, no traffic jam, etc, and no clear reason why they aren’t running?

The Answer:
The buses here operate on a fixed schedule, with a set time for each and every stop. Planners try to adjust these times to accommodate varying traffic conditions like rush hours and other factors. Obviously, they can’t do this perfectly, and sometimes buses tend to run late, i.e. slower than the schedule says they ought to. Just as often, they should tend to run early. This can happen if traffic is particularly light, there aren’t as many passengers as usual, or maybe the bus just hits a string of green lights. When that happens, the driver tries to stay on-schedule by driving more slowly, or sometimes stopping completely. When they do stop completely, they’ll often take the opportunity to read a couple pages of their magazine, send a text, take a bite of lunch, etc.

To someone who doesn’t know what’s going on, it probably looks like the driver is behaving quite capriciously. Many systems make announcements explaining any longer-than-usual waits, but I’ll speculate that SORTA and TANK don’t do this because they’re not seriously trying to attract or retain new passengers. They see themselves as serving a relatively captive and stable audience who has already spent a lot of time learning the system’s quirks. This also explains their complete inattention to providing easy-to-understand overviews of the system in favour of obtuse and overly detailed maps and schedules.

Comments: 2
Posted in: Back to Basics