Creating an input REST file

The input REST file is a JSON file that specifies one or more REST endpoints in the form of a JSON object. The input REST file may include only endpoints, or it can include endpoints with parameters that define the REST data. When initially connecting to a REST endpoint, Hybrid Data Pipeline uses the input REST file to build a relational model of the REST data. You can create an input REST file with a text editor. After you create the input REST file, you can upload it through the Web UI or with the Drive Files API.
The basic format of the input REST file consists of a list of comma-separated endpoints. The following example shows how endpoints are mapped as tables to support a relational schema.
{
"<table_name1>":"<endpoint1>",
"<table_name2>":"<endpoint2>",
"<table_name3>":"<endpoint3>"
}
Note: The syntax requirements described here also apply when you edit the relational model of your REST data through the Web UI. In the Web UI, the Entity Name field specifies the name of the relational table.
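Because the basic format is a single JSON object, the file can also be generated rather than typed by hand. The following is a minimal sketch, assuming Python's standard json module; the helper name render_input_rest_file and the table names and URLs are placeholders chosen for illustration, not part of Hybrid Data Pipeline.

```python
import json

# Hypothetical table-to-endpoint mapping; names and URLs are placeholders.
endpoints = {
    "countries": "http://example.com/countries/",
    "states": "http://example.com/states/get/{countryCode:USA}/all",
}


def render_input_rest_file(mapping):
    """Serialize a table-to-endpoint mapping as input REST file text."""
    return json.dumps(mapping, indent=2)


text = render_input_rest_file(endpoints)
print(text)
```

Saving text to a file, for example with pathlib.Path("input_rest.json").write_text(text), would produce a file that follows the basic format described above.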
Valid formats for the input REST file are described in detail in the following sections.
* Specifying Endpoints for GET Request with Unparameterized Paths
* Specifying Endpoints for GET Request with Parameterized Paths
* Specifying Endpoints for GET Requests with Query Parameters
* Specifying Endpoints for Requests with Custom HTTP Headers
* Defining a POST Request
* Configuring Paging

Specifying Endpoints for GET Request with Unparameterized Paths

To specify endpoints for unparameterized GET requests, use the following format:
"<table_name>":"<host_name>/<endpoint_path>"
table_name
is the name of the relational table to which the driver maps the endpoint. For example, country.
host_name
(optional) specifies the protocol and host name components of the URL endpoint. For example, http://example.com. You can omit this value by specifying the host name using the ServerName property.
endpoint_path
is the path component of the URL endpoint. For example, countries.
For example, the following demonstrates a GET request that will map to the countries table.
"countries":"http://example.com/countries/"

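To make the host_name and endpoint_path components concrete, the following sketch splits the example endpoint above into those two parts using Python's standard urllib.parse module. The helper name split_endpoint is hypothetical and exists only for this illustration.

```python
from urllib.parse import urlsplit


def split_endpoint(url):
    """Return (host_name, endpoint_path) for an unparameterized endpoint URL."""
    parts = urlsplit(url)
    host_name = f"{parts.scheme}://{parts.netloc}"  # protocol and host name
    endpoint_path = parts.path.strip("/")           # path without outer slashes
    return host_name, endpoint_path


host_name, endpoint_path = split_endpoint("http://example.com/countries/")
print(host_name)      # http://example.com
print(endpoint_path)  # countries
```
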
Specifying Endpoints for GET Request with Parameterized Paths

To specify parameterized GET requests, use the following format:
"<table_name>":"<host_name>/<endpoint_path1>/{<param_name>:<param_value>}[/<endpoint_path2>]"
table_name
is the name of the relational table to which the driver maps the endpoint. For example, states.
host_name
(optional) specifies the protocol and host name components of the URL endpoint. For example, http://example.com. You can omit this value by specifying the host name using the ServerName property.
endpoint_path
is the path component of the URL endpoint. For example, states.
param_name
is the parameter identifier used for filtering the request. For example, countryCode.
param_value
is the parameter value used for filtering the request during sampling. For example, USA.
For example, the following demonstrates a GET request that will map to the states table.
"states":"http://example.com/states/get/{countryCode:USA}/all"

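The {param_name:param_value} segment can be assembled and inspected programmatically. The sketch below is one way to do that in Python; the regular expression and the helper names parameterized_segment and path_parameters are assumptions made for this example and say nothing about how the driver itself parses the entry.

```python
import re

# Matches one {param_name:param_value} path segment.
PARAM_SEGMENT = re.compile(r"\{([^:{}]+):([^{}]*)\}")


def parameterized_segment(name, sample_value):
    """Format one {param_name:param_value} segment."""
    return "{" + name + ":" + sample_value + "}"


def path_parameters(endpoint):
    """Return {param_name: sample_value} for each parameterized segment."""
    return dict(PARAM_SEGMENT.findall(endpoint))


endpoint = (
    "http://example.com/states/get/"
    + parameterized_segment("countryCode", "USA")
    + "/all"
)
print(endpoint)
print(path_parameters(endpoint))
```
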
Specifying Endpoints for GET Requests with Query Parameters

Use the following format to specify endpoints for GET requests with argument parameters. Multiple argument parameters within the same endpoint are separated by an ampersand (&).
"<table_name>":"<host_name>/<endpoint_path>?<parameter>=<value>[&...]"
table_name
is the name of the relational table to which the driver maps the endpoint. For example, timeseries.
host_name
(optional) specifies the protocol and host name components of the URL endpoint. For example, http://example.com. You can omit this value by specifying the host name using the ServerName property.
endpoint_path
is the path component of the URL endpoint. For example, times.
parameter
is the argument parameter component of the parameter=value pair used for filtering the request. For example, interval.
value
is the value component of the parameter=value pair used for filtering the request. For example, 5min.
For example, the following demonstrates a GET request that will map to the timeseries table.
"timeseries":"https://www.example.com/times/query?interval=5min&symbol=USA&function=TIME_SERIES_WEEKLY"

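The query string in an entry like the one above can be built from parameter=value pairs instead of being concatenated by hand. This sketch uses urlencode from Python's standard library; the helper name with_query is hypothetical. Note that urlencode percent-encodes special characters, and whether the input REST file expects encoded values is an assumption this example does not settle.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit


def with_query(base_url, params):
    """Append parameter=value pairs, joined by '&', to an endpoint URL."""
    return base_url + "?" + urlencode(params)


endpoint = with_query(
    "https://www.example.com/times/query",
    {"interval": "5min", "symbol": "USA", "function": "TIME_SERIES_WEEKLY"},
)
print(endpoint)

# Reading the pairs back confirms the parameter order and values.
print(parse_qsl(urlsplit(endpoint).query))
```
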
Specifying Endpoints for Requests with Custom HTTP Headers

Some endpoints use custom HTTP headers to filter the data returned by a GET request. This type of filtering is typically used to create multiple unique reports or tables from the same endpoint. To use custom headers, you must define the request in the input REST file. The entry consists of a path object and a header object. The path object contains the URL endpoint used in requests, while the header object defines the headers and provides the values used to filter the request.
In addition to filtering requests, the header object can be used to specify a value for the Accept header if the default, application/json, is not accepted by the endpoint. This scenario typically occurs when accessing a vendor endpoint that uses a proprietary Accept header.
An entry for a GET request using custom HTTP headers takes the following form:
"<table_name>":{
"#path": "<host_name>/<endpoint_path>",
"#headers":{
"<header1>":"<value1>",
"<header2>":"<value2>",
"<header3>":"<value3>"
}
}
table_name
is the name of the relational table to which the driver maps the endpoint. For example, people.
host_name
(optional) specifies the protocol and host name components of the URL endpoint. For example, http://example.com. You can omit this value by specifying the host name using the ServerName property.
endpoint_path
is the path component of the URL endpoint. For example, people.
header
is the HTTP header component of the header=value pair used for filtering the request. For example, X-Subway-Payment.
When overriding the Accept header, this value is Accept.
value
is the value argument for the HTTP header used for filtering the request or, if overriding the default Accept header, the value of the Accept header for the endpoint. For example, token.
For example, the following demonstrates an entry for a GET request that defines custom HTTP headers.
"people":{
"#path": "http://example.com/people",
"#headers":{
"Accept":"application/calendar+json",
"X-Subway-Payment":"token",
"X-Laundry-Service":"dryclean",
"X-Favorite-Food":"pizza"
}
}
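An entry with custom headers is a nested JSON object, so it can be assembled from a path and a dictionary of headers. The following sketch shows one way to do that in Python; the helper name header_entry is hypothetical, and the header names and values are the placeholder ones from the example above.

```python
import json


def header_entry(path, headers):
    """Build the object for a GET request that sends custom HTTP headers."""
    return {"#path": path, "#headers": dict(headers)}


entry = {
    "people": header_entry(
        "http://example.com/people",
        {
            "Accept": "application/calendar+json",  # overrides the default Accept
            "X-Subway-Payment": "token",
        },
    )
}
print(json.dumps(entry, indent=2))
```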

Defining a POST Request

To use POST requests, you must define the request in the input REST file in JSON format. The definition entry consists of a path and a body. The path contains the URL endpoint used in requests, while the body defines the documents and provides sample values. The driver uses these sample values to determine which data type to use when executing a POST request. An entry for a POST request takes the following form:
"<table_name>": {
"#path": "<host_name>/<endpoint_path>",
"#post": {
"<document1>":"<value1>",
"<document2>":"<value2>"
}
}
table_name
is the name of the relational table to which the driver maps the endpoint. For example, countries2.
host_name
(optional) specifies the protocol and host name components of the URL endpoint. For example, http://example.com. You can omit this value by specifying the host name using the ServerName property.
endpoint_path
is the path component of the URL endpoint. For example, country.
document
is the document name of the document=value pair. For example, START_DATE.
value
is the sample value the driver uses to determine the data type to use when executing a POST to that document. For example, 2018-08-31.
For example, the following demonstrates an entry for a POST request that will map to the countries2 table.
"countries2": {
"#path": "http://example.com/country/",
"#post": {
"start_date":"2018-08-31",
"end_date":"2018-09-01",
"departments":"[engineering,marketing,sales]",
"tags":"[blue,green,red]"
}
}
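A POST entry can be generated and checked for well-formed JSON before upload. The sketch below assumes Python's standard json module; the helper names post_entry and is_valid_json are hypothetical. It also shows that a standard JSON parser rejects a trailing comma after the last pair, which is an easy mistake to make when editing the body by hand.

```python
import json


def post_entry(path, sample_body):
    """Build the object for a POST request with sample body values."""
    return {"#path": path, "#post": dict(sample_body)}


def is_valid_json(text):
    """Return True if text parses as standard JSON."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


entry = {
    "countries2": post_entry(
        "http://example.com/country/",
        {"start_date": "2018-08-31", "end_date": "2018-09-01"},
    )
}
rendered = json.dumps(entry, indent=2)
print(is_valid_json(rendered))                                    # True
print(is_valid_json('{"#post": {"start_date": "2018-08-31",}}'))  # False
```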

Configuring Paging

The driver supports two types of paging: row offset paging and page number paging. To configure paging, specify values for the properties in the following tables that correspond to the type of paging you want to employ. Paging properties can be set for individual GET or POST requests by specifying these options in the body object. If paging properties are not specified, the driver will attempt to retrieve the first page for data sources that require paging.
The following demonstrates configuring row offset paging for an unparameterized GET request:
"<table_name>": {
"#path": "<host_name>/<endpoint_path>",
"#maximumPageSize":1000,
"#firstRowNumber":1,
"#pageSizeParameter":"maxResults",
"#rowOffsetParameter":"startAt"
}
table_name
is the name of the relational table to which the driver maps the endpoint. For example, countries2.
host_name
(optional) specifies the protocol and host name components of the URL endpoint. For example, http://example.com. You can omit this value by specifying the host name using the ServerName property.
endpoint_path
is the path component of the URL endpoint. For example, country.
Table 126. Row Offset Paging Properties

| Property | Description |
| --- | --- |
| #maximumPageSize | The maximum page size in rows. |
| #firstRowNumber | The number of the first row. The default is 0; however, some systems begin numbering rows at 1. |
| #pageSizeParameter | The name of the URI parameter that contains the page size. |
| #rowOffsetParameter | The name of the URI parameter that contains the starting row number for this set of rows. |
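To illustrate how the row offset properties relate to one another, the following sketch derives request URLs for successive pages from a paging entry. It is an illustration only: the arithmetic (each request asks for #maximumPageSize rows and advances the offset by that amount, starting at #firstRowNumber) is an assumption about typical offset paging, not a description of the driver's internals, and the function name row_offset_urls is hypothetical.

```python
from urllib.parse import urlencode

# Paging properties as they would appear in an input REST file entry.
entry = {
    "#path": "http://example.com/country",
    "#maximumPageSize": 1000,
    "#firstRowNumber": 1,
    "#pageSizeParameter": "maxResults",
    "#rowOffsetParameter": "startAt",
}


def row_offset_urls(entry, pages):
    """Yield illustrative request URLs for the first `pages` pages.

    Assumption: the offset starts at #firstRowNumber (default 0) and
    grows by #maximumPageSize on every request.
    """
    size = entry["#maximumPageSize"]
    first = entry.get("#firstRowNumber", 0)
    for page in range(pages):
        query = urlencode({
            entry["#pageSizeParameter"]: size,
            entry["#rowOffsetParameter"]: first + page * size,
        })
        yield entry["#path"] + "?" + query


urls = list(row_offset_urls(entry, 3))
for url in urls:
    print(url)
```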
Table 127. Page Number Paging Properties

| Property | Description |
| --- | --- |
| #maximumPageSize | The maximum page size in rows. |
| #firstPageNumber | The number of the first page. The default is 0; however, some systems begin numbering pages at 1. |
| #pageSizeParameter | The name of the URI parameter that contains the page size. |
| #pageNumberParameter | The name of the URI parameter that contains the page number of the requested page. |
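Page number paging can be sketched in the same spirit. The entry below uses the property names from Table 127; the URI parameter names pageSize and page, the page size of 100, and the function name page_number_urls are placeholders invented for this example. The assumption that the page number simply increases by one per request, starting at #firstPageNumber, is illustrative and not taken from the driver.

```python
from urllib.parse import urlencode

# Hypothetical page number paging entry; parameter names are placeholders.
entry = {
    "#path": "http://example.com/country",
    "#maximumPageSize": 100,
    "#firstPageNumber": 1,
    "#pageSizeParameter": "pageSize",
    "#pageNumberParameter": "page",
}


def page_number_urls(entry, pages):
    """Yield illustrative request URLs for the first `pages` pages.

    Assumption: the page number starts at #firstPageNumber (default 0)
    and increases by one on every request.
    """
    first = entry.get("#firstPageNumber", 0)
    for index in range(pages):
        query = urlencode({
            entry["#pageSizeParameter"]: entry["#maximumPageSize"],
            entry["#pageNumberParameter"]: first + index,
        })
        yield entry["#path"] + "?" + query


urls = list(page_number_urls(entry, 2))
for url in urls:
    print(url)
```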