Timely and efficient analysis of big data collected from the various gateways installed in a smart city remains an intractable problem that requires immediate attention. Given the stochastic and massive nature of big data, the existing literature often relies on artificial intelligence techniques based on information theory. As a new approach, this paper presents a knowledge extraction method based on an analysis of Seoul Metro's 'untraceable' ridership big data. Because it carries no identification information, the untraceable ridership data shows only the hourly accumulation of station entries and exits. To reconstruct the missing information in the data set, this study proposes a fluid dynamics model and adopts a heuristic genetic algorithm, grounded in optimization theory, as the problem solver. The model yields the distribution of the elapsed time, at hourly resolution, until a passenger returns to the station they departed from. To validate the model, we acquired subway ridership data containing passenger identification, with permission from Seoul Metro. This paper presents two novel aspects of subway ridership, namely the dependency on departure time and the discrepancy between weekend and weekday traffic. Our analytical approach contributes to solving the problem of extracting hidden knowledge from large collections of data that are missing critical information, e.g., data fragments gathered constantly and autonomously from numerous gateways in smart cities.
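
The reconstruction idea above can be illustrated with a minimal sketch. It is not the paper's fluid dynamics model: as a simplifying assumption, hourly exits at a station are treated as a discrete convolution of that station's hourly entries with an unknown return-time distribution, and a small genetic algorithm searches for the distribution that best reproduces the observed exits. All names (`predict_exits`, `fit_return_time`, `max_lag`) and the synthetic counts are hypothetical and chosen only for illustration.

```python
import random


def predict_exits(entries, p):
    """Assumed toy model: exits[t + tau] receives entries[t] * p[tau]."""
    hours = len(entries)
    exits = [0.0] * hours
    for t, count in enumerate(entries):
        for tau, prob in enumerate(p):
            if t + tau < hours:  # returns past the last hour are dropped
                exits[t + tau] += count * prob
    return exits


def squared_error(entries, observed_exits, p):
    """Sum of squared differences between predicted and observed exits."""
    predicted = predict_exits(entries, p)
    return sum((a - b) ** 2 for a, b in zip(predicted, observed_exits))


def normalize(weights):
    """Scale non-negative weights so they sum to 1."""
    total = sum(weights)
    return [w / total for w in weights]


def fit_return_time(entries, observed_exits, max_lag, pop_size=40,
                    generations=200, mutation_scale=0.05, seed=0):
    """Elitist genetic algorithm over candidate return-time distributions."""
    rng = random.Random(seed)
    # Seed the population with the uniform distribution plus random candidates.
    population = [normalize([1.0] * max_lag)]
    while len(population) < pop_size:
        population.append(
            normalize([rng.random() + 1e-9 for _ in range(max_lag)]))
    for _ in range(generations):
        population.sort(
            key=lambda p: squared_error(entries, observed_exits, p))
        elite = population[: pop_size // 4]  # survivors are kept unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            # Uniform crossover, then Gaussian mutation with a positive floor.
            child = [rng.choice(pair) for pair in zip(a, b)]
            child = [max(1e-9, g + rng.gauss(0.0, mutation_scale))
                     for g in child]
            children.append(normalize(child))
        population = elite + children
    return min(population,
               key=lambda p: squared_error(entries, observed_exits, p))


# Synthetic data (invented for illustration, not Seoul Metro figures).
true_p = [0.0, 0.1, 0.2, 0.4, 0.3]
entries = [120, 300, 80, 40, 20, 10, 0, 0, 0, 0, 0, 0]
observed = predict_exits(entries, true_p)
estimate = fit_return_time(entries, observed, max_lag=len(true_p))
uniform = normalize([1.0] * len(true_p))
```

Because the fittest candidates survive each generation unchanged, the fitting error of `estimate` can never be worse than that of the uniform starting guess. How closely it approaches `true_p` depends on the population size, the mutation scale, and the number of generations.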
               