For the last few months I’ve been scraping SORTA’s real-time GTFS data for vehicle positions. I don’t really have any plans for what to do with all of it, but it’s easy for me to collect and I figure someone else may have a use for it somewhere down the road. This could eventually be a very interesting dataset for looking at e.g. changes in on-time performance, traffic congestion, bunching, etc.
Essentially, for each vehicle in operation at a given moment, I’ve been storing its:
vehicle_id: vehicle ID given by the API
trip_id: trip_id given by the API. I believe this corresponds to the trip_id in the GTFS package for the corresponding period.
report_time: the timestamp field, per vehicle, given by the API. This is stored as Greenwich time, without a time zone, so you need to subtract a few hours to get to local time.
location: I’ve been storing everything in a PostGIS database, and the location datatype in this case is geometry(POINT,4326). Postres dumps this as a hexadecimal string.
The API updates all vehicle locations every 30 seconds and I’ve been requesting updates every 25 seconds and ignoring duplicates, so I should have all of the data on vehicle positions that have been made publicly available. I’ve tried to keep my script running steadily, but there have inevitably been a few interruptions as the postgresql server has been restarted, etc, so there may be some big gaps. Where there is any data, it should be complete; It just may skip out for a day or two. The earliest date I have is 2017-05-16 17:59:36.
Anyway, here is the script I’ve been using along with ancillary files:
In case anyone is interested, I’ve begun storing data from SORTA’s GTFS-realtime feed. I started the script today and I see no reason to stop it unless I start running out of space on my server. I’ve got GTFS trip_id’s, vehicle_id’s, locations and timestamps for every vehicle location update from here until infinity.
Let me know in the comments if you happen to be interested and I’ll find a way to share the data!
I’m excited to announce that I’ll be spending part of this summer working with Daniel Schleith and Brad Thomas on a rather exciting project… Thanks to a grant from People’s Liberty, we’re going to have the opportunity to develop what I believe will be the first application to make use of SORTA’s recently released real-time data.
The goal of the project is to get real-time arrival displays into businesses along major transit lines. These will be privately owned and operated computer displays that ingest real-time data through the interwebs and display localized arrival predictions for nearby stops. We’ll be developing a display/app1, and subsidizing the purchase of tablet computers which can then be mounted behind a bar, in a shop window, near the door of the coffee shop, etc.
I’m sure I’ll have lots more to say here on the topic in the near future, but for now, I leave you with hope only.
The data is structured according to the GTFS real-time specification. I was able to parse it pretty easily in Python by following the instructions on that page. The fields currently included in the feed (many are optional in the specification) are as follows.
The feeds update every 30 seconds, which seems a little slow, but oh well.
Right now, my understanding is that these feeds have been tentatively released as-is for developers only, and that SORTA is not ready yet to make a general public announcement that real-time data is available. Tim Harrington at SORTA, who shared the links with us, has politely asked to see the neat stuff that we’re able to develop with this data. I imagine that the sooner someone sends him a link to a decent, working app, the sooner they’ll give us the go-ahead and the sooner we’ll all be able to use this data in every-day situations.
So who’s gonna make an app? There must be a dozen open-source applications that are already designed to work with GTFS-realtime. We probably just need to plug this feed in and maybe make a few localization tweaks. If you or anyone you know has the skills and/or interest to make an app…then for the love of transit, let’s make this happen ASAP!
A wee nit to pick from SORTA’s recent “State of Metro'” dog and pony show:
I distinctly remember one of the speakers saying something like ‘and all this without raising fares!’, and this to my feeble memory reeked of bullshit, so I found the numbers again and ran them to see if I was remembering correctly. I was indeed.
Here are the facts, as reported by SORTA to the Federal Transit Administration. Over the period where we have data on both fare revenues and ridership (currently 2002 to 2012) SORTA has been steadily getting more money from fare revenues while moving fewer passengers. We are currently at the nadir of this trend, with
More fare revenue than ever
Fewer passenger trips than ever
When SORTA says by the way that they are a ‘most efficient’ agency, a title pinned on them by the laughably unscientific UC Economics Center, it is precisely this measure they have in mind. There is hardly a better example of doublespeak to be found. Here’s the trend:
In order to plot both agencies together, I normalized fares and passenger trips to the same range. The scale is linear.
Now you may rightly note that the standard fare for a zone 1 trip hasn’t changed lately. But that’s not the only kind of fare that can be paid. It might not even be the most common! I don’t know for certain. I haven’t personally paid standard fare in quite a while because my transit use is partly subsidized by UC. So for example, the fare revenue variable in this data almost certainly includes UC’s cash subsidy for my fare as well as the dollar I put in myself. Multiply that by the dozens of private fare subsidies each agency probably negotiates (or drops) each year and you get a more dynamic picture. Fare could also be effected, though probably isn’t, by people using transit cards more or less, while paying the same monthly price.
But anyway, I’ll be damned if f the total price paid by riders or their agents, on a per-trip basis doesn’t constitute a better definition of ‘fare’ than SORTA’s standard zone-1 single-segment price. And by that definition, fares have risen from $0.76 in 2002 to $1.78 in 2012 (+134%). For TANK, the change is from $0.72 in 2002 to $1.16 in 2012 (+60%). Adjusting for inflation, the changes are 84% and 26% respectively. So much for SORTA’s unchanging fares theory lie.
I’ll end with an ineffectual plea to the people at SORTA. Please, understand that when you speak in lies and euphemisms, no matter how nice your breakfast spread, you turn off clever people and retain only the idiots and the cynical. People from all three of these categories vote, to be sure, but I know who I’d rather spend my time with. And I know who could build the better transit system.