A group at the Data Science for Social Good program built applications to visualize data stored in ORCA cards, like the number of transfers at each stop. Photo: DSSG.

For Seattleites, traffic and transportation can be serious obstacles. Seattle’s booming population has led to busier streets and buses, and for many residents that poses a challenge to working, getting to school, or taking care of their kids.

But the solutions to Seattle’s transportation troubles may be closer than we imagined — buried deep in data sets. This summer, two groups at the University of Washington’s Data Science for Social Good (DSSG) program set out to harness this data, and create tools that will help citizens and transport officials tackle transportation problems.

One group examined nine weeks’ worth of ORCA card data, information that is gathered whenever a rider taps an ORCA card when boarding a bus. The data set consisted of 21 million individual data points that include location, route number, and time of boarding.

The group also used data from bus sensors, which measure the number of footsteps at a stop to estimate the number of people boarding a bus.

“We’ve created a suite of applications in an integrated dashboard to shed some light on the data,” said DSSG fellow Victoria Sass. The applications can visualize different subsets of the data, and allow users to look at patterns in ridership, overcrowding, and even the number of people using ORCA versus other payment methods.

“These applications we’ve created offered a lot of insight into what could be done with this data,” Sass said, “but there’s a lot more to be done.”

The group proposed that transit authorities could use these tools to monitor bus use in real time and react quickly to issues like overcrowding. While the group is still fine-tuning the applications and awaiting approval to make the data public, they hope that transit authorities and citizens will be able to use the tools as early as this fall.

Another group in the program tackled a different transportation issue: sidewalks. For citizens with limited mobility, navigating a new area can be difficult, even impossible, without the right data.

“The key here is that we forget sidewalks when we’re creating maps, and it becomes a big problem,” said DSSG fellow Thomas Disley.

Map applications like Google Maps don’t include data points such as crosswalks, sidewalk inclines, and curb cuts (ramps leading onto sidewalks from the street). For those with limited mobility, this data is essential to daily navigation.

Over the course of the program, the group created a pathway to add sidewalk data to OpenStreetMap (OSM), a crowdsourced map database.


The OpenSidewalks project allowed data on pedestrian passages to be added to OpenStreetMap. Photo: DSSG.

After aggregating existing data with data from local municipalities, the group created a platform where users can verify the data before it is published, allowing this complex project to be crowdsourced.

Nick Bolten, a project lead for the group, said they may apply for national grants to continue the work.

The Data Science for Social Good program, hosted by the UW’s eScience Institute, brings together students and scientists from across the globe to harness data science for social needs. Many of the projects worked on during the program continue after the summer ends, said eScience Institute Program Manager Sarah Stone.

This year’s program included two groups working outside of transportation.

One built a machine learning program that uses Amazon reviews to predict whether a product was recalled. The program mines the text of reviews for words like “rotten,” “mold,” and “vomit,” and uses the frequency of certain words to make predictions.
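To give a rough sense of what "using the frequency of certain words" can mean, here is a minimal sketch in Python. It is not the DSSG group's model: it swaps in a plain keyword-rate threshold in place of a trained machine learning program. The function names, the three-word list (taken from the words quoted above), and the threshold value are all assumptions made for illustration.

```python
import re
from collections import Counter

# Hypothetical keyword list: only the three words quoted in the article.
FLAG_WORDS = {"rotten", "mold", "vomit"}


def flag_word_rate(reviews):
    """Return the share of all words across the reviews that are flag words."""
    counts = Counter()
    total_words = 0
    for text in reviews:
        # Lowercase the review and split it into simple word tokens.
        words = re.findall(r"[a-z']+", text.lower())
        total_words += len(words)
        counts.update(w for w in words if w in FLAG_WORDS)
    if total_words == 0:
        return 0.0
    return sum(counts.values()) / total_words


def looks_risky(reviews, threshold=0.05):
    """Flag a product when its flag-word rate meets the threshold.

    The threshold is an arbitrary illustrative value, not a figure
    from the article or from the group's work.
    """
    return flag_word_rate(reviews) >= threshold
```

A real classifier would presumably learn which words matter, and how much, from reviews of products already known to have been recalled, rather than relying on a fixed list and cutoff.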

This group is working to make the program more accurate, with the hope that it could be used to predict product recalls and cut down the time of the current recall process. To do this, the group may bring in additional data from sources like social media.

The final group tackled an age-old problem: the census. Using data from call detail records and open source maps, the team built a model to predict poverty levels in large cities. Currently, poverty levels in many cities are estimated with raw population counts.

Their model currently works twice as well as baseline predictions, and the team plans to continue work on it and eventually present it to the UN’s Global Policy Forum.
