A First Step Towards a Streaming Linked Data Life-Cycle

Tommasini R., Ragab M., Falcetta A., Valle E.D., Sakr S.

2020-11-01

Alongside the ongoing FAIR data management initiative, the problem of handling Streaming Linked Data (SLD) is more relevant than ever. The Web is changing to tame Data Velocity and fulfill the needs of a new generation of Web applications. New protocols (e.g., WebSockets and Server-Sent Events) are emerging to grant continuous and reactive data access. Under the Stream Reasoning initiative, the Semantic Web community has been actively working on query languages, engines, and vocabularies to address the scientific and technical challenges of taming Data Velocity without neglecting Data Variety. Nevertheless, a set of guidelines that showcase how to reuse existing resources to produce and consume streams on the Web is still missing. In this paper, we walk through the life-cycle of streaming linked data. We discuss the challenges of applying FAIR principles when publishing data streams. Moreover, we contextualise the usage of prominent Semantic Web resources, namely TripleWave, R2RML/RML, VoCaLS, and RSP-QL. We apply the guidelines to three representative examples of real-world Web streams: DBpedia Live changes, Wikimedia EventStreams, and the Global Database of Events, Language and Tone (GDELT). Finally, our code is open source and available at https://w3id.org/webstreams.

DOI: link