For the last few months I’ve been scraping SORTA’s real-time GTFS data for vehicle positions. I don’t really have any plans for what to do with all of it, but it’s easy for me to collect and I figure someone else may have a use for it somewhere down the road. This could eventually be a very interesting dataset for looking at, e.g., changes in on-time performance, traffic congestion, or bunching.
Essentially, for each vehicle in operation at a given moment, I’ve been storing its:
The API updates all vehicle locations every 30 seconds. I’ve been requesting updates every 25 seconds and ignoring duplicates, so I should have all of the vehicle position data that has been made publicly available. I’ve tried to keep my script running steadily, but there have inevitably been a few interruptions as the PostgreSQL server has been restarted, etc., so there may be some big gaps. Where there is any data, it should be complete; it just may skip out for a day or two. The earliest date I have is 2017-05-16 17:59:36.
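To make the polling idea concrete, here is a minimal sketch of the "request a bit faster than the feed updates, then drop duplicates" approach. It is not the actual script; the `Position` fields, the `fetch` and `store` callables, and the choice to treat a record as a duplicate when its timestamp is no newer than the last one seen for that vehicle are all assumptions made for illustration.

```python
import time
from typing import Callable, Dict, Iterable, List, NamedTuple, Optional


class Position(NamedTuple):
    # Hypothetical field names, not the schema used in the real database.
    vehicle_id: str
    timestamp: int  # POSIX seconds as reported by the feed
    lat: float
    lon: float


def new_positions(latest: Dict[str, int], batch: Iterable[Position]) -> List[Position]:
    """Keep only records newer than the last timestamp seen for each vehicle."""
    fresh = []
    for pos in batch:
        if latest.get(pos.vehicle_id, -1) < pos.timestamp:
            latest[pos.vehicle_id] = pos.timestamp
            fresh.append(pos)
    return fresh


def poll(
    fetch: Callable[[], Iterable[Position]],
    store: Callable[[List[Position]], None],
    interval: float = 25.0,            # shorter than the feed's 30-second cycle
    max_polls: Optional[int] = None,   # None means run until interrupted
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Call fetch() repeatedly and pass only unseen records to store()."""
    latest: Dict[str, int] = {}
    polls = 0
    while max_polls is None or polls < max_polls:
        fresh = new_positions(latest, fetch())
        if fresh:
            store(fresh)
        polls += 1
        sleep(interval)
```

Because the request interval is shorter than the update interval, every update should be seen at least once, and the occasional repeat is filtered out before it reaches the database. In real use, `fetch` would wrap the request to the feed and `store` would wrap the database insert.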
Anyway, here is the script I’ve been using along with ancillary files:
and a compressed SQL dump of the PostGIS DB:
The script is still running, so if you find this post in a year, hit me up for some fresher data!
The data at the above link has now been updated as of January 27, 2018.
Hi, I created a new site for NTD data that I thought you might be interested in: http://www.nationaltransitdatabase.org