skip to main content
Using DataDirect Hybrid Data Pipeline : Creating a Data Source definition : Google Analytics parameters
  

Try Now

Google Analytics parameters

Google APIs use OAuth 2.0 protocol for authentication and authorization. OAuth is an open standard for authorization. It provides client applications (in this case Hybrid Data Pipeline) secure access to server resources (such as Google Analytics) on behalf of a resource owner. To connect to Google Analytics through OAuth 2.0 protocol, the administrator needs to set up a project at Google console.
Next, the administrator must create an OAuth application. This involves creating a client ID and client secret pair. The administrator must then register this pair with Hybrid Data Pipeline using the REST API.

Creating an OAuth application (for administrators only)

1. Launch the Google Developer Console and log in using your Google account credentials.
2. Make sure that the Analytics API under Library > Other populer APIs >Analytics API is enabled for this account.
3. Create a new project.
4. In the project, click the Credentials section on the left.
5. Click the OAuth consent screen tab, enter the required details, including the product name, and then click Save.
6. In the Credentials tab, click Create credentials > OAuth client ID.
7. In the Create client ID screen, select Web application in the Application type section and specify the Hybrid Data Pipeline URL--<protocol>:<Your_Hybrid Data Pipeline_Server>:<port-number>/d2c-ui/oauth--in the Authorized redirect URIs setting. Ensure that the domain name is fully qualified. Then, click Create.
Note: In case you are using Hybrid Data Pipeline on multiple nodes behind a load balancer, you will need to specify the Load Balancer URL as the redirect URL. The Load Balancer URL followed by /d2c-ui/oauth will be the redirect URL. If the port number is unspecified, the default port will be 443 for https and 8080 for http.
8. Copy the client ID and secret key to a text editor.
Note: You will need these credentials while registering the OAuth application with Hybrid Data Pipeline.
9. Click Library in the left pane and in the page that appears on the right, click the Analytics API link.
10. In the Analytics page that appears, enable the APIs that you need by using the Enable button.

Registering the OAuth application with Hybrid Data Pipeline (for administrators only)

To register the OAuth application with Hybrid Data Pipeline, you must make a REST call to the Hybrid Data Pipeline Server. You can use any REST client that you are familiar with. Use the following details:
*Method: POST
*URL: <Your_Hybrid_Data_Pipeline_Server>/api/mgmt/oauthapps
*Authorization: Hybrid Data Pipeline Admin credentials
*Sample Request:
{
"name": "<name>",
"dataStore": 54,
"description": "<description>",
"clientId": "<clientId>",
"clientSecret": "<clientSecret>"
}

Creating the Google Analytics data source definition

After the administrator has set up OAuth authentication, add a Google Analytics data store in your Hybrid Data Pipeline user interface.
The following sections describe the parameters available on the General and Mapping tabs of a Google Analytics Data Source dialog.

General tab

General tab of the Google Analytics data source setup dialogGeneral tab of the Google Analytics data source setup dialog
Table 22. General tab connection parameters for Google Analytics
Field
Description
Data Source Name
A unique name for this Data Source definition.
Note: Names can contain only alphanumeric characters and underscores.
Description
A description of this set of connection parameters.
OAuth Profile Name
In some environments, you might want to reuse connections for users who have different access privileges. You can create separate profiles for different users.
Select an OAuth profile name from the drop-down list, or create a new profile by clicking Create New Profile in the dropdown list, entering a profile name, and then clicking the Add icon. Then, click Authorize with Google.
When you select an OAuth Profile Name that has been authorized with Google, the Default View Name and Segment fields get populated.
Default View Name
A view that belongs to your Google Analytics account. Select a view from the drop-down list.
Segment
A segment that belongs to your Google Analytics account. Select a segment from the drop-down list.
Start Date
The start date for fetching Google Analytics data (inclusive). You can enter a specific date in YYYY-MM-DD format, or select a date, using the calendar icon. Alternatively, select a relative value (Today, Yesterday, or N Days Ago, where N is a positive integer). The default is 30 days prior to the current date.
End Date
The end date for fetching Google Analytics data. You can enter a specific date in YYYY-MM-DD format, or select a date, using the calendar icon. Alternatively, select a relative value from the drop-down list (Today, Yesterday, or N Days Ago, where N is a positive integer).
The end date must always be later than the start date, if a start date is specified.

OData tab

The following table describes the controls on the OData tab. For information on using the Configure Schema editor, see Enabling OData and working with Data Source groups. For information on formulating OData requests, see "Formulating queries" under Querying with OData.
Click the thumbnail to view the screen. Required fields are marked with an asterisk.
OData tabOData tab
Table 23. OData tab connection parameters for Google Analytics
Field
Description
OData Version
Enables you to choose from the supported OData versions. OData configuration made with one OData version will not work if you switch to a different OData version. If you want to maintain the data source with different OData versions, you must create different data sources for each of them.
OData Access URI
Specifies the base URI for the OData feed to access your DataDirect Cloud data source, for example, https://service.datadirectcloud.com/api/odata. You can copy the URI and paste it into your application's OData configuration.
The URI contains the case-insensitive name of the data source to connect to, and the query that you want to execute. This URI is the OData Service Root URI for the OData feed. The Service Document for the data source is returned by issuing a GET request to the data source's service root.
The OData Service Document returns the names of the entities exposed by the Data Source OData service. To get details such as the properties of the entities exposed, the data types for those properties and the relationships between entities, the Service Metadata Document can be fetched by adding /$metadata to the service root URI.
Schema Map
Enables OData support. If a schema map is not defined, the OData API cannot be used to access the data store using this Data Source definition. Use the Configure Schema editor to select the tables to expose through OData.
See Using the Configure Schema editor for more information.
Data Source Caching
Specifies whether the connection to the backend data source is cached in a session associated with the data source. Caching the back end connection improves performance when multiple OData queries are submitted to the same data source because the connection does not need to be created on every query.
Caching of the back end connection can get in the way when trying to configure a data source for OData. If a change is made to any of the DataDirect Cloud data source connection parameters, those changes will not be seen because the connection was established using the old data source definition, and was cached. The session that caches the backend connection is discarded if there is no activity to the data source for approximately 5 minutes.
When you configure a data source for OData, it is recommended that the OData session caching be disabled. Once you are satisfied with the OData configuration for the data source, enable the parameter to get the performance improvement provided by caching the connection to the backend data source.
Valid Values:
When set to 1, session caching is enabled. This provides better performance for production.
When set to 0, session caching is disabled. Use this value when you are configuring the data source.
Default: 1
Page Size
Determines the number of entities returned on each page for paging controlled on the server side. On the client side, requests can use the $top and $skip parameters to control paging. In most cases, server side paging works well for large data sets. Client side pagination works best with a smaller data sets where it is not as expensive to fetch subsequent pages.
Valid Values: 0 | n
where n is an integer from 1 to 10000.
When set to 0, the server default of 2000 is used.
Default: 0
Refresh Result
Controls what happens when you fetch the first page of a cached result when using Client Side Paging. Skip must be omitted or set to 0. You can use the cached copy of that first page, or you can re-execute the query to get a new result, discarding the previously cached result. Re-executing the query is useful when the data being fetched may change between two requests for the first page. Using the cached result is useful if you are paging back and forth through results that are not expected to change.
Valid Values:
When set to 0, the OData service caches the first page of results.
When set to 1, the OData service re-executes the query.
Default: 1
Inline Count Mode
Specifies how the connectivity service satisfies requests that include the $count parameter when it is set to true (for OData version 4) or the $inlinecount parameter when it is set to allpages (for OData version 2). These requests require the connectivity service to include the total number of entities that are defined by the OData query request. The count must be included in the first page in server-driven paging and must be included in every page when using client-driven paging.
The optimal setting depends on the data store and the size of results. The OData service can run a separate query using the count(*) aggregate to get the count, before running the query used to generate the entities. In very large results, this approach can often lead to the first page being returned faster. Alternatively, the OData service can fetch the entire result before returning the first page. This approach works well for small results and for data stores that cannot optimize the count(*) aggregate; however, it may have a longer initial response time for the first page if the result is large.
Valid Values:
When set to 1, the connectivity service runs a separate count(*) aggregate query to get the count of entities before executing the query to return results. In very large results, this approach can often lead to the first page being returned faster.
When set to 2, the connectivity service fetches all entities before returning the first page. For small results, this approach is always faster. However, the initial response time for the first page may be longer if the result is large.
Default: 1
Top Mode
Indicates how requests typically use $top and $skip for client side pagination, allowing the service to better anticipate how to process queries.
Valid Values:
Set to 0 when the application generally uses $top to limit the size of the result and rarely attempts to get additional entities by combining $top and $skip.
Set to 1 when the application uses $top as part of client-driven paging and generally combines $top and $skip to page through the result.
Default: 0
OData Read Only
Controls whether write operations can be performed on the OData service. Write operations generate a 405 Method Not Allowed response if this option is enabled.
Existing OData-enabled data sources are read only (write operations are disabled). To enable write operations for an existing OData enabled data source, clear the OData Read Only option on the OData tab. Then, on the Data Sources tab, regenerate the OData model for the data source by clicking on the OData model icon Synch completed successfully.
Valid Values:
true | false
When the check box is selected (set to true), OData access is restricted to read-only mode.
When the check box is not selected (set to false), write operations can be performed on the OData service.
Default: false

Mapping tab

The Mapping tab enables you to create relational tables in Hybrid Data Pipeline and map them to Metrics and Dimensions in your Google Analytics data source.
Mapping tab of the Google Analytics data source setup dialogMapping tab of the Google Analytics data source setup dialog
Table 24. Mapping tab connection parameters for Google Analytics
Field
Description
Map Name
Optional name of the map definition that Hybrid Data Pipeline uses to interpret the schema of the data store. The Hybrid Data Pipeline service automatically creates a name for the map. If you want to name the map yourself, enter a unique name.
Refresh Schema
The Refresh Schema checkbox specifies whether the connectivity service attempts to refresh the schema when an application first connects.
Valid Values:
When the check box is selected (set to true), the connectivity service attempts to refresh the schema.
When the check box is not selected (set to false), the connectivity service does not attempt to refresh the schema.
Default
false
Notes
*You can choose to refresh the schema by clicking the Refresh button: This refreshes the schema immediately. Note that the refresh option is available only while editing the Data Source.
*Use the checkbox to specify whether the connectivity service attempts to refresh the schema when an application first connects. Use the button if you want to refresh the schema immediately, using an already saved configuration.
*If you are making other edits to the settings, you need to click update to save your configuration. The Refresh schema button will only trigger a runtime call on the saved configuration.
Create Mapping
Determines whether the Google Analytics table mapping files are to be (re)created.
Hybrid Data Pipeline automatically maps data store objects and fields to tables and columns the first time that it connects to the data store. The map includes both standard and custom objects and includes any relationships defined between objects.
Table 24. Valid values for Create Map field
Value
Description
Not Exist
Select this option for most normal operations. If a map for a data source does not exist, this option causes one to be created. If a map exists, the service uses that existing map. If a name is not specified in the Map Name field, the name will be a combination of the User Name and Data Source ID.
Force New
Select this option to force creation of a new map. A map is created on connection whether one exists or not. The Hybrid Data Pipeline connectivity service uses a combination of the User Name and Data Source ID to name the map. Map creation is expensive, so you will likely not want to leave this option set to Force New indefinitely.
No
If a map for a data source does not exist, the connectivity service does not create one.
Add Tables
A set of tables to work with your Google Analytics account.
To create configuration tables that use different combinations of Metrics and Dimensions, click the Configure Logical Schema button.
In the Configure Logical Schema screen, click Create Table and enter a name for the table.
In the Dimensions and Metrics screen, select the metrics that you want to add to the table. You can select metrics across multiple dimensions. Each metric gets added as a column in the table.
Finally, click Save & Close.
Show Deprecated Objects
Defines whether Hybrid Data Pipeline shows deprecated objects. Google Analytics marks dimensions and metrics as deprecated as an indication that they plan to remove support for those objects. By default, the Hybrid Data Pipeline connectivity service does not expose these deprecated objects. Set the value to true while you work on rewriting your queries and table definitions to migrate from the deprecated objects. Once the queries and table definitions are fixed, change the setting for the map option back to false.
Valid Values:
true | false
If the check box is selected (true), Hybrid Data Pipeline includes deprecated objects in the relational model.
If the check box is not selected (false), Hybrid Data Pipeline does not include deprecated objects in the relational model.
Default: false
Show Internal Tables
Defines how Hybrid Data Pipeline shows internal tables.
Valid Values:
true | false
If the check box is selected (true), Hybrid Data Pipeline shows the "Data" table.
If the check box is not selected (false), Hybrid Data Pipeline does not show the "Data" table.
Default: false
Subtract Tables
Defines a comma-separated list of tables that should be hidden from the user's view. This feature is useful if you want to define your own tables instead of using some of the tables that are supplied with the data store, or to limit access to certain tables so that the user does not see them. For example, enter adSense,adWords.
subtractTables can be used both for the pseudo-tables in Google Analytics that are derived from the Data system table, and also for the regular management tables such as Goal or Account.

Advanced tab

Click the thumbnail to view the screen.
Advanced tab of the Google Analytics data source setup dialogAdvanced tab of the Google Analytics data source setup dialog
Table 26. Advanced tab connection parameters for Google Analytics
Field
Description
Default Query Options
A semi-colon delimited list of default values for the WHERE clauses within the connection. Specifying mandatory values such as startDate, endDate, and viewId in this parameter makes the queries simpler. For example, the query SELECT * FROM Overview returns only results from the specified period.
Valid Values:
(key=value[;key=value])
Where:
key
is one of the following values:
If set to startDate, specifies the starting date for the query (inclusive). The default is thirty days prior to the current date, expressed as 30daysago.
If set to endDate, the ending date for the query (inclusive). This defaults to yesterday.
The syntax for startDate and endDate values is as follows:
*a date in YYYY-MM-DD format
*the word "today" for the current date
*the word "yesterday" for the prior date
*#daysAgo, where # is some positive integer
If the key is viewId, the value is a comma-separated list of view Ids. There is no default; in order for SELECT * FROM to work for either "Data" or any of the pseudo-tables, this must be set either explicitly in a WHERE clause or via the defaultQueryOptions connection string option.
Default: If no value is specified (the default), the connectivity service uses startDate=30daysAgo;endDate=yesterday.
Extended Options
Specifies a semi-colon delimited list of connection options and their values. Use this configuration option to set the value of undocumented connection options that are provided by Progress DataDirect technical support. You can include any valid connection option in the Extended Options string, for example:
Database=Server1;UndocumentedOption1=value[;UndocumentedOption2=value;]
If the Extended Options string contains option values that are also set in the setup dialog, the values of the options specified in the Extended Options string take precedence.
Valid Values: string
Default: empty string
Initialization String
A semicolon delimited set of commands to be executed on the cloud data store after DataDirect Cloud has established and performed all initialization for the connection. If the execution of a SQL command fails, the connection attempt also fails and DataDirect Cloud returns an error indicating which SQL commands failed.
The default is an empty string.
Max Pooled Statements
The maximum number of prepared statements to cache for this connection. If the value of this property is set to 20, the connectivity service caches the last 20 prepared statements that are created by the application.
Web Service Call Limit
The maximum number of Web service calls allowed to the cloud data store for a single SQL statement or metadata query.
When set to 0, there is no limit on the number of Web service calls on a single connection that can be made when executing a SQL statement.
Web Service Fetch Size
Specifies the number of rows of data the DataDirect Cloud connectivity service attempts to fetch for each call.
Valid Values:
0 | x
If set to 0, the DataDirect Cloud connectivity service attempts to fetch up to a maximum of 10000 rows. This value typically provides the maximum throughput.
If set to x, the DataDirect Cloud connectivity service attempts to fetch up to a maximum of the specified number of rows. Setting the value lower than 10000 can reduce the response time for returning the initial data. Consider using a smaller value for interactive applications only.
Default: 0
Web Service Retry Count
The number of times to retry a timed-out Select request. The Web Service Timeout parameter specifies the period between retries. A value of 0 for the retry count prevents retries. A positive integer sets the number of retries. The default value is 3.
Web Service Timeout
The time, in seconds, to wait before retrying a timed-out Select request. Valid only if the value of Web Service Retry Count is greater than zero. A value of 0 for the timeout waits indefinitely for the response to a Web service request. There is no timeout. A positive integer is considered as a default timeout for any statement created by the connection. The default value is 120.
See the steps for:
Creating a Data Source definition
* Adding Google Analytics tables
* Creating an OAuth profile
* Renaming an OAuth profile
* Deleting an OAuth profile