
SchemaDrill: Interactive semi-structured schema design

  • William Spoth
  • Ting Xie
  • Oliver Kennedy
  • Ying Yang
  • Beda Hammerschmidt
  • Zhen Hua Liu
  • Dieter Gawlick
  • SUNY Buffalo
  • Oracle

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

8 Scopus citations

Abstract

Ad-hoc data models like JSON make it easy to evolve schemas and to multiplex different data types into a single stream. This flexibility makes JSON great for generating data, but also makes it much harder to query, ingest into a database, and index. In this paper, we explore the first step of JSON data loading: schema design. Specifically, we consider the challenge of designing schemas for existing JSON datasets as an interactive problem. We present SchemaDrill, a roll-up/drill-down style interface for exploring collections of JSON records. SchemaDrill helps users to visualize the collection, identify relevant fragments, and map them down into one or more flat, relational schemas. We describe and evaluate two key components of SchemaDrill: (1) a summary schema representation that significantly reduces the complexity of JSON schemas without a meaningful reduction in information content, and (2) a collection of schema visualizations that help users to qualitatively survey variability amongst different schemas in the collection.
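
To make the problem in the abstract concrete, the following is a minimal sketch of why a JSON collection holds many schemas at once and what mapping records to a flat, relational schema involves. It is not SchemaDrill's algorithm or its summary representation. The helper names (leaves, record_schema, schema_counts, flatten) and the sample records are hypothetical, and arrays are treated as opaque leaf values to keep the example short.

```python
from collections import Counter


def leaves(value, prefix=""):
    """Yield (dotted_path, leaf_value) pairs for a parsed JSON value.

    Assumption for this sketch: only objects are descended into;
    arrays and scalars are both treated as leaf values.
    """
    if isinstance(value, dict):
        for key, child in value.items():
            path = f"{prefix}.{key}" if prefix else key
            yield from leaves(child, path)
    else:
        yield prefix, value


def record_schema(record):
    """A record's schema, taken here to be its set of leaf paths."""
    return frozenset(path for path, _ in leaves(record))


def schema_counts(records):
    """Count how many records share each distinct schema."""
    return Counter(record_schema(r) for r in records)


def flatten(record, columns):
    """Project one record onto a fixed column list; absent paths become None."""
    found = dict(leaves(record))
    return tuple(found.get(column) for column in columns)


# Hypothetical records multiplexing two kinds of data into one stream.
records = [
    {"user": {"id": 1, "name": "a"}, "text": "hi"},
    {"user": {"id": 2, "name": "b"}, "text": "yo"},
    {"user": {"id": 3}, "event": "delete"},
]

counts = schema_counts(records)
columns = sorted(set().union(*counts))  # union of every path seen
rows = [flatten(r, columns) for r in records]
```

Even with three records, the collection contains two distinct schemas, and the single flat table built from the union of all paths is padded with None values. Real collections can contain far more distinct schemas, which is the variability the abstract says the interface helps users survey.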

Original language: English
Title of host publication: Proceedings of the Workshop on Human-In-the-Loop Data Analytics, HILDA 2018
Publisher: Association for Computing Machinery, Inc
ISBN (Electronic): 9781450358279
DOIs
State: Published - Jun 10 2018
Event: 2018 Workshop on Human-In-the-Loop Data Analytics, HILDA 2018 - Houston, United States
Duration: Jun 10 2018 → …

Publication series

Name: Proceedings of the Workshop on Human-In-the-Loop Data Analytics, HILDA 2018

Conference

Conference: 2018 Workshop on Human-In-the-Loop Data Analytics, HILDA 2018
Country/Territory: United States
City: Houston
Period: 06/10/18 → …
