Enthusiasm for and investment in Big Data and the Cloud are spurring innovation in a suite of new technologies that seek to transform information into knowledge at reduced cost. But the potential of Big Data and the Cloud is threatened by security, privacy, legal and regulatory constraints that prevent data integration and information sharing.
While the costs to capture, store and exploit data are declining, the costs of mishandling data are rising for every enterprise. Those rising costs threaten to extend the data-poor environments in which we have long operated, forcing continued reliance on inference and limiting data insights.
Technology leaders like Google, Facebook and Target have reshaped their industries using Big Data, but each is facing increased scrutiny over data handling. The result is an atmosphere of concern and trepidation that has deterred many in the Fortune 1000 from embracing Big Data.
The relationship between Big Data security and Big Data innovation is not zero-sum; the two are mutually reinforcing. Traditional data security approaches, which have proven inadequate, deal with disequilibrium by seeking counterbalance. In this case, more security, more privacy, and more constraints lead to limited data access, continued fragmentation of data sets, and missed opportunities.
Solutions that bake in security, privacy, legal and regulatory constraints from the outset, rather than addressing them as an afterthought or around the edges, enable new insights while building trust and transparency. Such a data-centric security model promotes adaptability, re-conceptualizes the relationship among data, users and applications, and reduces administrative burdens and risks. At the same time, it unlocks the potential for innovation and serves as a mechanism for integrating disparate data sets and sharing information more completely.
As an example, an Electronic Health Record (EHR) has hundreds of unique facets (e.g., patient name, weight, age, height, social security number, insurance provider, date of last visit, medications prescribed, allergies, etc.). Should one of these elements require additional controls, the traditional approach would either extend that constraint to the entire data set or create multiple replicated, sanitized data sets. Either choice keeps the other data facets from being leveraged by a more diverse user base and creates both analytic and administrative inefficiencies.
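To make the contrast concrete, here is a minimal sketch of the data-centric alternative, in which each facet of a record carries its own access labels and a reader sees only the facets their authorizations cover. It is plain Python written for illustration, not the API of any particular database; the `Cell` class, the `visible_fields` function, the label names (`clinical`, `identity`, `billing`) and all sample values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    """One facet of a record plus the labels a reader must hold to see it."""
    value: object
    required: frozenset = frozenset()  # empty set means unrestricted


def visible_fields(record, authorizations):
    """Return only the facets whose required labels the reader fully holds."""
    auths = frozenset(authorizations)
    return {
        name: cell.value
        for name, cell in record.items()
        if cell.required <= auths  # subset test: every required label is held
    }


# Hypothetical EHR with made-up values; labels are attached per facet,
# not to the record as a whole.
record = {
    "age": Cell(47),
    "weight_kg": Cell(82),
    "medications": Cell(["lisinopril"], frozenset({"clinical"})),
    "patient_name": Cell("Jane Doe", frozenset({"identity"})),
    "ssn": Cell("000-00-0000", frozenset({"identity", "billing"})),
}

# A researcher holding only the "clinical" label sees age, weight and
# medications, but neither the name nor the social security number.
researcher_view = visible_fields(record, {"clinical"})

# A billing clerk sees the identifying facets but not the medications.
billing_view = visible_fields(record, {"identity", "billing"})
```

In this sketch the sensitive social security number restricts only itself: the same single copy of the record serves both readers, with no replicated, sanitized data sets to maintain.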
Today few companies successfully leverage their own Big Data holdings, and fewer still have been able to integrate disparate data sets to reveal key new insights in the Cloud. A model that supports the integration of data, and the interaction of a diverse community to realize the potential of Big Data as a public good, may provide solutions for combating the spread of disease, crime and war, or for improving government accountability and transparency, in addition to serving commercial interests. A data-centric approach allows security to be an enabler of analytic adaptability. Such an approach can keep pace with, and even lead, industry demands for innovation and data controls. Neutralizing security, privacy and regulatory concerns unlocks the full potential of Big Data as a transparent, participatory and collaborative system that can adapt to dynamic demands over time.
About sqrrl: sqrrl is a Boston-based Big Data startup founded by data scientists from the National Security Agency. The team members are the original developers, committers, and contributors to Apache Accumulo, a secure and highly scalable database. sqrrl’s product, sqrrl analytics, provides a multi-tenant Big Data solution powered by Accumulo that uses cell-level security to let organizations eliminate individual data silos and integrate data that has been kept separate. sqrrl analytics extends the core Accumulo capability so that organizations can easily build powerful applications and find hidden value in their data through both investigative and operational analytics.
This article is part of a guest series highlighting the efforts of hack/reduce (a nonprofit established to create the next generation of Big Data technologies and applications) and the companies participating in hack/reduce’s program. GoGrid is the infrastructure sponsor of hack/reduce. Learn more about GoGrid’s Big Data solutions.