Every day, millions of New Yorkers rely on the MTA subway, a system so large and interconnected that even small delays can ripple across the city. Trains arrive, stop and depart in a complex, ever-changing rhythm.
Capturing this rhythm in real time and translating it into something people can see and understand is a challenge tailor-made for analytics.
During my summer internship with SAS, this challenge inspired me to build a digital twin of the MTA subway system in New York City using SAS tools.
The goal was to use SAS technologies to transform raw, live transit data into a dynamic, interactive dashboard that tracks trains in motion and displays key performance insights second by second.
Defining the subway system
Creating the digital twin began with mapping the physical subway system into a digital model. Each train, station and track is defined as an asset with attributes such as location, status and route. Hierarchies link these assets to reflect how they work together on each subway line.
Static information was obtained from the NYCT-GTFS Python library, which provided basic details such as stop times, station IDs, and route structures. Code developed in SAS® Viya® Workbench and SAS® Studio processed this text-based information into JSON configuration files. SAS AutoMLForIoT then converted these definitions into a working SAS® Event Stream Processing (ESP) project, bringing the digital architecture of the subway to life.
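As a rough illustration of the asset-and-hierarchy idea described above, here is a minimal Python sketch that defines a few assets, nests them under their parent line, and serializes the result to JSON. The `Asset` class, the `build_config` helper, the field names and the sample IDs are all hypothetical; the actual configuration schema expected by SAS AutoMLForIoT is not shown in this article and may look quite different.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import Optional


@dataclass
class Asset:
    """One node of the digital twin (hypothetical structure)."""
    asset_id: str
    asset_type: str                      # e.g. "line", "station", "track", "train"
    attributes: dict = field(default_factory=dict)
    parent_id: Optional[str] = None      # None marks a top-level asset


def build_config(assets):
    """Nest each asset under its parent and return a JSON-serializable dict."""
    by_id = {a.asset_id: {**asdict(a), "children": []} for a in assets}
    roots = []
    for node in by_id.values():
        parent = node["parent_id"]
        if parent is None:
            roots.append(node)
        else:
            by_id[parent]["children"].append(node)
    return {"assets": roots}


# Placeholder assets; the IDs and names are made up for illustration.
assets = [
    Asset("line-1", "line", {"route": "1"}),
    Asset("station-A", "station", {"name": "Example Station A"}, parent_id="line-1"),
    Asset("train-001", "train", {"status": "en_route"}, parent_id="line-1"),
]

config = build_config(assets)
config_json = json.dumps(config, indent=2)
print(config_json)
```

Keeping the hierarchy explicit in the file means a downstream tool can walk from a line to its stations and trains without re-deriving those relationships from the raw GTFS text files.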
Powered by live data
After defining the system, the next step was to connect it to the MTA's live train status feed, which is updated every 30 seconds. A Python script tracked individual trains, detected when their status changed, and determined whether they were en route, stopped, delayed, or had completed their route.
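The status tracking described above can be sketched in a few lines of Python. This is not the script used in the project: the record fields (`train_id`, `at_station`, `delay_seconds`, `trip_finished`) and the five-minute delay threshold are assumptions chosen for illustration, and a real feed would first need to be parsed into records of this shape.

```python
# Assumed threshold: a train is "delayed" once it is 5 minutes behind schedule.
DELAY_THRESHOLD_SECONDS = 300


def classify(update):
    """Map one (hypothetical) feed record to a status label."""
    if update.get("trip_finished"):
        return "completed"
    if update.get("delay_seconds", 0) >= DELAY_THRESHOLD_SECONDS:
        return "delayed"
    if update.get("at_station"):
        return "stopped"
    return "en_route"


def detect_changes(previous, updates):
    """Return (train_id, old_status, new_status) for every train whose
    status differs from the last poll. `previous` is updated in place."""
    changes = []
    for update in updates:
        train_id = update["train_id"]
        new_status = classify(update)
        old_status = previous.get(train_id)
        if old_status != new_status:
            changes.append((train_id, old_status, new_status))
        previous[train_id] = new_status
    return changes


# Two consecutive polls of the feed, roughly 30 seconds apart.
state = {}
first_poll = [{"train_id": "T1", "at_station": True}, {"train_id": "T2"}]
second_poll = [{"train_id": "T1"}, {"train_id": "T2", "delay_seconds": 400}]

first_changes = detect_changes(state, first_poll)
second_changes = detect_changes(state, second_poll)
print(first_changes)
print(second_changes)
```

Emitting only the changes, rather than the full snapshot on every poll, keeps the downstream event stream small: a train that is still en route between two polls produces no new event.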
This stream was processed in real time and output to three CAS tables – one each for trains, stations and tracks. These tables, accessed through SAS Studio, stored the latest status of the subway system and formed the basis for analysis and visualization.
Visualize movement in real time
To make the pulse of the subway visible, SAS® Visual Analytics was used to design an interactive dashboard. The dashboard highlighted system-level KPIs, such as:
- Trains currently in transit
- Average time between stations
- Platform dwell times
- Train delay rate
An interactive subway map provided a live snapshot of train locations and routes, while filters allowed users to focus on individual lines. Behind the scenes, the SAS code added calculated fields—such as delay signals and motion indicators—so users could instantly understand what was happening without sifting through raw data streams.
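To make the KPIs and calculated fields above more concrete, here is a small Python sketch that derives three of them from a snapshot of train records. In the project these fields were computed with SAS code over the CAS tables; the function name `compute_kpis`, the record layout and the sample values below are hypothetical and only illustrate the arithmetic.

```python
from statistics import mean


def compute_kpis(trains):
    """Compute dashboard-style KPIs from a list of train records.

    Each record is a dict with a "status" key and, for trains at a
    platform, an optional "dwell_seconds" key (assumed layout).
    """
    # Trains that have finished their route no longer count toward the KPIs.
    active = [t for t in trains if t["status"] != "completed"]
    in_transit = sum(1 for t in active if t["status"] == "en_route")
    delayed = sum(1 for t in active if t["status"] == "delayed")
    dwell = [t["dwell_seconds"] for t in active if t.get("dwell_seconds") is not None]
    return {
        "trains_in_transit": in_transit,
        "delay_rate": delayed / len(active) if active else 0.0,
        "avg_dwell_seconds": mean(dwell) if dwell else None,
    }


# Made-up snapshot for illustration.
snapshot = [
    {"train_id": "T1", "status": "en_route"},
    {"train_id": "T2", "status": "stopped", "dwell_seconds": 30},
    {"train_id": "T3", "status": "delayed", "dwell_seconds": 50},
    {"train_id": "T4", "status": "completed"},
]

kpis = compute_kpis(snapshot)
print(kpis)
```

Here the delay rate is taken over active trains only, which is one reasonable definition among several; a dashboard could equally report delays per line or per hour.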
Lessons in real-time analytics
This project demonstrated how SAS tools can work together, from data ingestion to visualization, to manage a continuous flow of information. It gave me hands-on experience with data flows, digital twin modeling, event stream processing, and Python integration with SAS for real-time response.
SAS Viya proved crucial in tying it all together. Its components (ESP Studio, CAS, and Visual Analytics) created a seamless environment in which raw data could evolve into a working solution in minutes, not hours.
Looking forward
This project focused on real-time monitoring, but it lays the foundation for further exploration. Possible next steps include using historical data to train predictive models for train delays or expanding the digital twin to cover buses and rail. The flexibility of SAS technologies makes these extensions not only possible, but practical.
As a student, working with SAS on a project of this size was both challenging and rewarding. It showed me how powerful the platform is for solving real-world problems. SAS tools gave me the ability to model a complex public system, connect it to live data, and create insights that update in real time.






