Towards a distributed infrastructure for data-driven discoveries & analysis

  • Mohammed Elshambakey
  • Mohamed Khalefa
  • William J. Tolone
  • Sreyasee Das Bhattacharjee
  • Huikyo Lee
  • Luca Cinquini
  • Shannon Schlueter
  • Isaac Cho
  • Wenwen Dou
  • Daniel J. Crichton
  • University of North Carolina at Charlotte
  • City for Scientific Research and Technology Applications
  • University of Louisville
  • Jet Propulsion Laboratory, California Institute of Technology

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

5 Scopus citations

Abstract

Big data analytics traditionally involves downloading massive datasets to a common server or cluster for processing. The analytic process slows as the required data grows and as network conditions degrade. Data scientists also need explicit access to data locations to download the required data, and such access may not always be granted for security reasons. To simplify and accelerate analytics on distributed big data under these security constraints, we proposed the Virtual Information Fabric Infrastructure (VIFI) for data-driven discoveries. Instead of moving large amounts of data to a common place for processing, VIFI automatically transfers the required analytics programs to the distributed data locations so that relevant data is processed in place. VIFI allows data scientists to conduct and coordinate complex analytics processes on distributed data repositories using containerization technology and open-source workflow design tools. VIFI relieves users of the need for detailed knowledge of distributed data locations, and of the dependencies, installation, and configuration of analytical libraries. In this paper, we present our current and future work to improve the VIFI architecture: previous and additional use cases, a data management layer that simplifies the search for relevant datasets through the addition of metadata, integration with security policies at different institutions through the proposed VIFI security layer, and a user-friendly web interface for carrying out VIFI activities.

Original language: English
Title of host publication: Proceedings - 2017 IEEE International Conference on Big Data, Big Data 2017
Editors: Jian-Yun Nie, Zoran Obradovic, Toyotaro Suzumura, Rumi Ghosh, Raghunath Nambiar, Chonggang Wang, Hui Zang, Ricardo Baeza-Yates, Xiaohua Hu, Jeremy Kepner, Alfredo Cuzzocrea, Jian Tang, Masashi Toyoda
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 4738-4740
Number of pages: 3
ISBN (Electronic): 9781538627143
DOIs
State: Published - Jul 1 2017
Event: 5th IEEE International Conference on Big Data, Big Data 2017 - Boston, United States
Duration: Dec 11 2017 to Dec 14 2017

Publication series

Name: Proceedings - 2017 IEEE International Conference on Big Data, Big Data 2017
Volume: 2018-January

Conference

Conference: 5th IEEE International Conference on Big Data, Big Data 2017
Country/Territory: United States
City: Boston
Period: 12/11/17 to 12/14/17
