Recent trends in the Web of Things (WoT) have led to a data explosion. The data lake (DL), a flexible, on-demand architecture for managing heterogeneous data, has become a feasible solution for data management. Metadata modeling for DLs is the key basis for smart analysis and processing. However, the variety in the structures and semantics of industrial WoT data hinders metadata modeling and maintenance. Moreover, the lack of textual descriptions and the semantics hidden in value streams make it hard to construct semantic metadata automatically, and the dynamic nature of WoT requires metadata to evolve in a timely manner. To overcome these challenges, we propose an automated bottom-up metadata generation approach for DLs of WoT applications. Within a data-driven framework, raw data are annotated as linked data, and online clustering based on self-organizing maps is applied to extract data characteristics in real time. To recognize entities, concepts, and relations, a semantics-based approach to entity discovery from short texts is proposed, tailored to the features of WoT data. Numerical analysis is performed to find hidden relations in the raw values. Finally, full-dimensional metadata with rich semantic knowledge are built. Experiments on a real-world dataset are conducted to verify the effectiveness of the methods, and a case study on an energy WoT system is provided to demonstrate the feasibility of the approach.
               