Don't know much about Baltimore, but concerning problem 2:

The effect of absent drivers and out-of-commission buses (counted statistically, so fractions of days are included) is that the missed run is effectively late by t + tav, where t is the scheduled time between buses and tav is the average lateness (a negative number if early) of the original process, before digitizing to "on-time" or "not-on-time." Assuming buses are not normally scheduled 1 + 5 = 6 minutes apart or less, we may assume that the statistically expected number of buses not on the road at any given time are all not-on-time at their stops.

Therefore (N_nor) x n_delta not-on-time data points should be added,

where n_delta = n_original x ( N_nor / (N - N_nor) ). Here, N_nor is the statistical expectation for the number of buses not on the road at a given time, n_original is the number of data points available in the original study, and N is the total number of buses.
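
A rough sketch of that arithmetic in Python, in case it helps to plug in numbers. The function name and the sample figures are made up for illustration and are not taken from any Baltimore data; the code simply follows the two expressions as written above.

```python
def extra_not_on_time_points(n_original, n_total, n_off_road):
    """Return (n_delta, points_to_add) following the expressions in the text.

    n_original -- number of data points in the original study
    n_total    -- N, the total number of buses
    n_off_road -- N_nor, expected number of buses not on the road (may be fractional)
    """
    if not 0 <= n_off_road < n_total:
        raise ValueError("N_nor must satisfy 0 <= N_nor < N")
    # n_delta = n_original x ( N_nor / (N - N_nor) )
    n_delta = n_original * n_off_road / (n_total - n_off_road)
    # (N_nor) x n_delta not-on-time data points, as stated in the text
    points_to_add = n_off_road * n_delta
    return n_delta, points_to_add


if __name__ == "__main__":
    # Hypothetical numbers: 100 buses, 5 expected off the road, 1900 observations.
    n_delta, points_to_add = extra_not_on_time_points(1900, 100, 5)
    print("n_delta =", n_delta)
    print("not-on-time points to add =", points_to_add)
```

Because N_nor is a statistical expectation, a fractional value (say 4.3 buses) is a perfectly reasonable input.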