Advanced JSON Data Loading for Autonomous Database

Published in

db-one

4 min readSep 29, 2021

JSON data loading can be achieved in several ways for Autonomous Databases. I can point to Jeff Smith’s blog for other methods such as using the Oracle REST Data Service (or ORDS).

Today, I’ll just demonstrate another simple method that is very useful to quickly load JSON data into SODA collections on Autonomous Databases.

This is a follow-up to my How to access Open Data with Oracle Autonomous Database blog post.

For demo purposes, I need to load JSON data that is hosted online and is publicly accessible. This is really important although the method presented here could be improved to support authentication…

The Goal: download JSON documents and load them into a SODA collection

For this, I’ll leverage the APEX_WEB_SERVICE PL/SQL package which can download files directly.

The next step is to install the following PL/SQL Types and the Function to do the magic:

CREATE OR REPLACE TYPE t_opendata_json_doc AS OBJECT ( json_document CLOB );CREATE OR REPLACE TYPE t_opendata_json_doc_collection IS TABLE OF t_opendata_json_doc;CREATE OR REPLACE FUNCTION getJSON( p_my_url IN varchar2, 
                                    p_json_path IN varchar2 default '$' ) 
    RETURN t_opendata_json_doc_collection PIPELINED AS  TYPE OpenDataCursorType IS REF CURSOR;
  l_clob   clob;
  cur OpenDataCursorType;  invalid_path_exception EXCEPTION;
  PRAGMA EXCEPTION_INIT(invalid_path_exception, -40561);
BEGIN
  l_clob := apex_web_service.make_rest_request(p_url => p_my_url, p_http_method => 'GET' );  begin
    -- uses the provided JSON Path to access JSON data
    open cur for 'select d.json_document from json_table(:l_clob, ''$'' columns (nested PATH ''' || p_json_path || ''' columns(json_document CLOB format json path ''$''))) d' using l_clob;    loop
      fetch cur into l_clob;
      exit when cur%NOTFOUND;
      PIPE ROW(t_opendata_json_doc(l_clob));
    end loop;
    
    close cur;EXCEPTION WHEN invalid_path_exception THEN
  begin
    -- try if this is an array of JSON documents to flatten
    open cur for 'select d.json_document from json_table(:l_clob, ''$'' columns (nested path ''$[*]'' columns (json_document CLOB format json path ''$''))) d' using l_clob;
    
    loop
      fetch cur into l_clob;
      exit when cur%NOTFOUND;
      PIPE ROW(t_opendata_json_doc(l_clob));
    end loop;
    
    close cur;
  
  exception WHEN invalid_path_exception THEN
    -- return the JSON document as is
    open cur for 'select d.json_document from json_table(:l_clob, ''$'' columns json_document CLOB format json path ''$'') d' using l_clob;
    
    loop
      fetch cur into l_clob;
      exit when cur%NOTFOUND;
      PIPE ROW(t_opendata_json_doc(l_clob));
    end loop;    close cur;
  end;
END;RETURN;
END;
/

Querying this Table function

Following are several ways to invoke the function…

One JSON document in a file

select *
  from getJSON( 'https://www.ncdc.noaa.gov/cag/global/time-series/globe/land_ocean/1/8/1880-2021/data.json' );

It will download the file:

{
  "description": { 
    "title": "Global Land and Ocean Temperature Anomalies, August",
    "units": "Degrees Celsius",
    "base_period": "1901-2000",
    "missing":-999
  },
  "data": { 
  "1880": "-0.07",
  "1881": "-0.06",
...
  }
}

and return the document as-is:

One JSON document in a file, accessing a field using JSON path

select * 
  from getJSON( 'https://tools.learningcontainer.com/sample-json-file.json', '$.Pets' );

It will download the file:

{
  "Name": "Test",
  "Mobile": 12345678,
  "Boolean": true,
  "Pets": ["Dog", "cat"],
  "Address": {
    "Permanent address": "USA",
    "current Address": "AU"
  }
}

and return the associated Pets array field:

Several JSON documents inside a JSON array

select * 
  from getJSON( 'https://raw.githubusercontent.com/loiclefevre/jsonloader/main/src/main/resources/Uni_POI_upper.json' );

This will download the following file:

[
   {
      "NAME":"piscina_tor_di_quinto",
      "UNIVERSITY_ID":1,
      "TYPE":"Centro_sportivo",
      "COSTO":{
         "MIN":"2.20",
         "MAX":"7.20"
      },
      "CITY":"Roma",
      "ADDRESS":"Via Fornaci di Tor di Quinto, 64",
      "GEOMETRY":{
         "type":"Point",
         "coordinates":[
            12.4842835,
            41.9365692
         ]
      },
      "OPEN":"start from January 9th.",
      "SIZE":"16m x 33m",
      "PHOTO":{
         "TYPE1":"url",
         "PICTURES1":"https://web.uniroma1.it/sapienzasport/sites/default/files/IMG_20200622_164135.jpg"
      }
   },
   {
      "NAME":"mensa_lollis",
      "UNIVERSITY_ID":1,
      "TYPE":"Mensa",
...
   }
]

And it will return one JSON document per array item:

Loading JSON documents into a SODA collection using SQL

Now remains the step to actually insert these JSON documents into the SODA collection using plain SQL:

> soda create points_of_interest; Successfully created collection: points_of_interest> desc points_of_interest;Name          Null?    Type          
------------- -------- ------------- 
ID            NOT NULL VARCHAR2(255) 
CREATED_ON    NOT NULL TIMESTAMP(6)  
LAST_MODIFIED NOT NULL TIMESTAMP(6)  
VERSION       NOT NULL VARCHAR2(255) 
JSON_DOCUMENT          JSON> INSERT INTO points_of_interest (id, version, json_document)
SELECT SYS_GUID(), SYS_GUID(), json_document
FROM getJSON( 'https://raw.githubusercontent.com/loiclefevre/jsonloader/main/src/main/resources/Uni_POI_upper.json' );14 rows inserted.

Elapsed: 00:00:00.585

And we are done!