JSON – Data Format to Data Type


In the era of API’s, Cloud Tech and Big-Data, one often encounters the format for data data exchange, which is rather becoming ubiquitous today is JSON. JavaScript Object Notation, popularly known as JSON gained attention due to its ease of use and quick to interpret format.  It’s precursor is XML (eXtensible Markup Language), which paved way for a structured representation of data in a unstructured world with a “predefined schema” that can be used for communication across disparate platforms.


XML was widely embraced by many B2B and EAI applications, but JSON has trumped in its adoption across different applications and usage moving beyond being just a protocol for data exchange.

JSON uses a “Key-Value” pair based notation to describe an object, wherein the Key represents attribute name and Value represents the actual value for the attribute. It is similar to the tags in XML, but the support for more complex data types (e.g. embedded arrays), light weight versus verbose format gives JSON the advantage, which subsequently led to its adoption across diverse areas.

JSON popularity in BI & Analytics space stems in parallel with rise of Big Data viz un-structured data management and analysis. Here are some use cases where JSON has cemented its position as the most efficient data format/type and understanding the

  • ETL tools Recognized the rise of JSON as a data format and have implemented standard parsers to recognize and process the data as part of source definition. Tools provide option to load files directly or connect with web service and fetch data by using GET method
    • JSON files for directly loading from file locations
    • Web Service calls to connect via API
  • NoSQL Databases (Big Data) JSON has emerged as de-facto Internal storage mechanism for any data for the following types of NoSQL databases
    • Document Store – MongoDB
      • Data is stored as document wherein each JSON object is a document
      • Multiple documents will reside in a collection
    • GraphDB – Neo4js
      • Data is stored as JSON object for nodes and relationships
  • RDBMS JSON data is defined as a unique data type and stored within the database. Different database vendors parse and manage them in different manner.
    • SAP HANA
      • Manages JSON document similar to “Document Store” databases.
      • Collection is defined to store and retrieve data, which is managed independently of structured data that reside in schemas
    • Oracle
      • Define a constraint to check and load data into a column as JSON, where the column is a standard varchar of CLOB data type.
  • SQL Parsers The basic property of JSON is its nature to have a schema less design and representation of actual data. Query parsers on other hand need a structure in order to parse and . The Key in Key-Value pair is identified as a field and the entire document is treated as record. It also allows join operations between JSON data and table.

In conclusion, JSON may not be a complex technically, but it this new era of convergence of data types/formats we have better options to store and process data. This is critical during data modeling and architecture design of projects, wherein different requirement across organization may be conflicting and utilizing best of both worlds technologies will provide optimal benefits.