The Next Level of Automated Data Quality Measurement
Language of the talk title:
English
Original conference title:
The 14th Annual MIT Chief Data Officer and Information Quality (MITCDOIQ) Symposium
Language of the conference title:
English
Original abstract:
Central, automated monitoring of data quality in integrated enterprise information systems is still a major challenge. Enterprise data is usually distributed across several heterogeneous information systems and maintained by different parties (departments). In most cases, local information systems are developed autonomously, so schema development cannot be tracked reliably and complete global domain knowledge is not available. Such a heterogeneous scenario has a number of negative effects; in particular, it is difficult to monitor data quality (e.g., completeness, accuracy, and timeliness) centrally. To overcome these problems, we have developed DQ-MeeRKat, a DQ tool that exploits the power of knowledge graphs to provide a global, homogenized view of data schemas. In this presentation, I will introduce the concept and advantages of "reference data profiles", which are attached as annotations to the knowledge graph and serve as a quasi-gold standard for automatically and continuously verifying the quality of modified data (insert, update, delete). To ensure that changes in the knowledge graph are globally visible and reliably traceable, a blockchain is used to make the knowledge graph tamper-proof. With DQ-MeeRKat, a knowledge graph for the global integrated schema, and the annotated reference data profiles, chief data officers can reach the next level in DQ measurement.
Language of the abstract:
English
Type of talk:
Keynote / invited talk at a conference