skip to main content
Welcome to the Progress DataDirect for JDBC for MongoDB Driver : Mapping Objects to Tables : Normalizing Native Data
  

Try DataDirect Drivers Now

Normalizing Native Data

When you first connect to a MongoDB server, the driver automatically generates a normalized view of your data. You can also use the Schema Tool to normalize your native MongoDB data. When your native data is normalized, each MongoDB collection is decompounded into a set of parent-child tables. Fields containing simple types are mapped as columns in a parent table, while complex types, such as subdocuments and arrays, are mapped as child tables. Child tables share a foreign key relationship with their parent table.
Note: Child tables are only extracted from arrays if all the sampled rows in the column have the native Array data type. See ConfigOptions for information on setting the sample size.
For example, the collection residents contains the array vehicles and subdocuments in the address document (or object). The collection's JSON structure can be rendered as follows:
{"_id": "ajx363",
"name": "Sydney Smith",
"address": {"street": "101 Main Street", "city": "Raleigh", "state": "NC"},
"county": "Wake",
"vehicles: ["car", "boat", "bicycle"]
}

{"_id": "tzn525",
"name": "Cora Welch",
"address": {"street": "191 First Street", "city": "Chapel Hill", "state": "NC"},
"county": "Orange",
"vehicles": ["scooter", "truck", "bicycle"]
}
The collection has been decompounded into separate but related tables. The normalization of the residents collection yields one parent table and two child tables. In the parent table, the _id field is mapped as a primary key column, and fields with other simple types are mapped as relational columns. The parent table adopts the name RESIDENTS and takes the following form:
_ID (PK)
NAME
COUNTY
ajx363
Sydney Smith
Wake
tzn525
Cora Welch
Orange
The information in the vehicles array maps to a RESIDENTS_VEHICLES child table. In this table, a column which refers back to the parent table is mapped as a concatenation of the parent table and the _id field: RESIDENTS_ID. (This column establishes a foreign key relationship to the parent table based on the _id field.) The vehicles array is mapped to the column VEHICLES. Each value in the VEHICLES column corresponds to an element in the vehicles array. In addition, the driver generates a third column to establish a primary key for the child table: VEHICLES_GENERATED_ID. (This automatically generated column enables the driver to normalize nested arrays to any depth. The format of the unique keys are internal to the driver.) This child table would take the following form:
RESIDENTS_ID (FK)
VEHICLES
VEHICLES_GENERATED_ID (PK)
ajx363
car
unique key
ajx363
boat
unique key
ajx363
bicycle
unique key
tzn525
bicycle
unique key
tzn525
scooter
unique key
tzn525
truck
unique key
The information in the address object maps to a RESIDENTS_ADDRESS child table. In this table, a column which refers back to the parent table is mapped as a concatenation of the parent name and the _id field: RESIDENTS_ID. (This column establishes a foreign key relationship to the parent table and functions as a primary key within the child table.) Each subdocument found in the address object is mapped as a column. This table would take the following form:
RESIDENTS_ID (PK & FK)
STREET
CITY
STATE
ajx363
101 Main Street
Raleigh
NC
tzn525
191 First Street
Chapel Hill
NC