skip to main content
Using Hybrid Data Pipeline : Creating data sources with the Web UI : Google Analytics parameters
  

Try Now

Google Analytics parameters

Important: Before you can proceed with creating a data source, an administrator must register Hybrid Data Pipeline as a client application with the Google Analytics API and create an OAuth application object using the Hybrid Data Pipeline OAuth applications API. For detailed information, see Integrating Hybrid Data Pipeline with a Google OAuth 2.0 authorization flow to access Google Analytics.
The following tables describe parameters available on the tabs of a Google Analytics Data Source setup dialog:
*General tab
*OData tab
*Mapping tab
*Advanced tab

General tab

General tab of the Google Analytics data source setup dialogGeneral tab of the Google Analytics data source setup dialog
Table 22. General tab connection parameters for Google Analytics
Field
Description
Data Source Name
A unique name for the data source. Data source names can contain only alphanumeric characters, underscores, and dashes.
Description
A description of this set of connection parameters.
OAuth Profile Name
An OAuth profile contains access and refresh tokens generated by Google. These tokens allow Hybrid Data Pipeline to access the Google Analytics API on your behalf.
You must either select an existing profile or create a new one.
For an existing OAuth profile, select the profile from the OAuth Profile Name drop-down list. The Default View Name and Segment fields will automatically populate.
To create a new profile, you must have administrative privileges on the Google Analytics project. To begin, click the Create OAuth Profile Name button, enter a profile name and click Create. A Google authorization pop-up window appears. In the authorization window, enter the required Google credentials and click Allow. Google Analytics supplies Hybrid Data Pipeline with access and refresh tokens. Then, you are returned to the General tab. Click Save to save these changes to the data source.
Default View Name
A view that belongs to your Google Analytics account. Select a view from the drop-down list.
Segment
A segment that belongs to your Google Analytics account. Select a segment from the drop-down list.
Start Date
The start date for fetching Google Analytics data (inclusive). You can enter a specific date in YYYY-MM-DD format, or select a date, using the calendar icon. Alternatively, select a relative value (Today, Yesterday, or N Days Ago, where N is a positive integer). The default is 30 days prior to the current date.
End Date
The end date for fetching Google Analytics data. You can enter a specific date in YYYY-MM-DD format, or select a date, using the calendar icon. Alternatively, select a relative value from the drop-down list (Today, Yesterday, or N Days Ago, where N is a positive integer).
The end date must always be later than the start date, if a start date is specified.

OData tab

The following table describes the controls on the OData tab. For information on using the Configure Schema editor, see Configuring data sources for OData connectivity and working with data source groups. For information on formulating OData requests, see "Formulating queries" under Querying with OData.
Click the thumbnail to view the screen. Required fields are marked with an asterisk.
OData tabOData tab
Table 23. OData tab connection parameters for Google Analytics
Field
Description
OData Version
Enables you to choose from the supported OData versions. OData configuration made with one OData version will not work if you switch to a different OData version. If you want to maintain the data source with different OData versions, you must create different data sources for each of them.
OData Access URI
Specifies the base URI for the OData feed to access your Hybrid Data Pipeline data source, for example, https://service.datadirectcloud.com/api/odata. You can copy the URI and paste it into your application's OData configuration.
The URI contains the case-insensitive name of the data source to connect to, and the query that you want to execute. This URI is the OData Service Root URI for the OData feed. The Service Document for the data source is returned by issuing a GET request to the data source's service root.
The OData Service Document returns the names of the entities exposed by the Data Source OData service. To get details such as the properties of the entities exposed, the data types for those properties and the relationships between entities, the Service Metadata Document can be fetched by adding /$metadata to the service root URI.
Schema Map
Enables OData support. If a schema map is not defined, the OData API cannot be used to access the data store using this data source definition. Use the Configure Schema editor to select the tables/columns to expose through OData.
Data Source Caching
Specifies whether the connection to the backend data source is cached in a session associated with the data source. Caching the back end connection improves performance when multiple OData queries are submitted to the same data source because the connection does not need to be created on every query.
Caching of the back end connection can get in the way when trying to configure a data source for OData. If a change is made to any of the Hybrid Data Pipeline data source connection parameters, those changes will not be seen because the connection was established using the old data source definition, and was cached. The session that caches the backend connection is discarded if there is no activity to the data source for approximately 5 minutes.
When you configure a data source for OData, it is recommended that the OData session caching be disabled. Once you are satisfied with the OData configuration for the data source, enable the parameter to get the performance improvement provided by caching the connection to the backend data source.
Valid Values:
When set to 1, session caching is enabled. This provides better performance for production.
When set to 0, session caching is disabled. Use this value when you are configuring the data source.
Default: 1
Page Size
Determines the number of entities returned on each page for paging controlled on the server side. On the client side, requests can use the $top and $skip parameters to control paging. In most cases, server side paging works well for large data sets. Client side pagination works best with a smaller data sets where it is not as expensive to fetch subsequent pages.
Valid Values: 0 | n
where n is an integer from 1 to 10000.
When set to 0, the server default of 2000 is used.
Default: 0
Refresh Result
Controls what happens when you fetch the first page of a cached result when using Client Side Paging. Skip must be omitted or set to 0. You can use the cached copy of that first page, or you can re-execute the query to get a new result, discarding the previously cached result. Re-executing the query is useful when the data being fetched may change between two requests for the first page. Using the cached result is useful if you are paging back and forth through results that are not expected to change.
Valid Values:
When set to 0, the OData service caches the first page of results.
When set to 1, the OData service re-executes the query.
Default: 1
Inline Count Mode
Specifies how the connectivity service satisfies requests that include the $count parameter when it is set to true (for OData version 4) or the $inlinecount parameter when it is set to allpages (for OData version 2). These requests require the connectivity service to include the total number of entities that are defined by the OData query request. The count must be included in the first page in server-driven paging and must be included in every page when using client-driven paging.
The optimal setting depends on the data store and the size of results. The OData service can run a separate query using the count(*) aggregate to get the count, before running the query used to generate the entities. In very large results, this approach can often lead to the first page being returned faster. Alternatively, the OData service can fetch the entire result before returning the first page. This approach works well for small results and for data stores that cannot optimize the count(*) aggregate; however, it may have a longer initial response time for the first page if the result is large.
Valid Values:
When set to 1, the connectivity service runs a separate count(*) aggregate query to get the count of entities before executing the query to return results. In very large results, this approach can often lead to the first page being returned faster.
When set to 2, the connectivity service fetches all entities before returning the first page. For small results, this approach is always faster. However, the initial response time for the first page may be longer if the result is large.
Default: 1
Top Mode
Indicates how requests typically use $top and $skip for client side pagination, allowing the service to better anticipate how to process queries.
Valid Values:
Set to 0 when the application generally uses $top to limit the size of the result and rarely attempts to get additional entities by combining $top and $skip.
Set to 1 when the application uses $top as part of client-driven paging and generally combines $top and $skip to page through the result.
Default: 0

Mapping tab

The Mapping tab enables you to create relational tables in Hybrid Data Pipeline and map them to Metrics and Dimensions in your Google Analytics data source.
Mapping tab of the Google Analytics data source setup dialogMapping tab of the Google Analytics data source setup dialog
Table 24. Mapping tab connection parameters for Google Analytics
Field
Description
Map Name
Optional name of the map definition that Hybrid Data Pipeline uses to interpret the schema of the data store. The Hybrid Data Pipeline service automatically creates a name for the map. If you want to name the map yourself, enter a unique name.
Refresh Schema
The Refresh Schema option specifies whether the connectivity service attempts to refresh the schema when an application first connects.
Valid Values:
When set to ON, the connectivity service attempts to refresh the schema.
When set to OFF, the connectivity service does not attempt to refresh the schema.
Default
OFF
Notes
*You can choose to refresh the schema by clicking the Refresh button: This refreshes the schema immediately. Note that the refresh option is available only while editing the Data Source.
*Use the option to specify whether the connectivity service attempts to refresh the schema when an application first connects. Use the button if you want to refresh the schema immediately, using an already saved configuration.
*If you are making other edits to the settings, you need to click update to save your configuration. The Refresh schema button will only trigger a runtime call on the saved configuration.
Create Mapping
Determines whether the Google Analytics table mapping files are to be (re)created.
Hybrid Data Pipeline automatically maps data store objects and fields to tables and columns the first time that it connects to the data store. The map includes both standard and custom objects and includes any relationships defined between objects.
Table 24. Valid values for Create Map field
Value
Description
Not Exist
Select this option for most normal operations. If a map for a data source does not exist, this option causes one to be created. If a map exists, the service uses that existing map. If a name is not specified in the Map Name field, the name will be a combination of the User Name and Data Source ID.
Force New
Select this option to force creation of a new map. A map is created on connection whether one exists or not. The Hybrid Data Pipeline connectivity service uses a combination of the User Name and Data Source ID to name the map. Map creation is expensive, so you will likely not want to leave this option set to Force New indefinitely.
No
If a map for a data source does not exist, the connectivity service does not create one.
Add Tables
A set of tables to work with your Google Analytics account.
To create configuration tables that use different combinations of Metrics and Dimensions, click the Configure Logical Schema button.
In the Configure Logical Schema screen, click Create Table and enter a name for the table.
In the Dimensions and Metrics screen, select the metrics that you want to add to the table. You can select metrics across multiple dimensions. Each metric gets added as a column in the table.
Finally, click Save & Close.
Show Deprecated Objects
Defines whether Hybrid Data Pipeline shows deprecated objects. Google Analytics marks dimensions and metrics as deprecated as an indication that they plan to remove support for those objects. By default, the Hybrid Data Pipeline connectivity service does not expose these deprecated objects. Set the value to ON while you work on rewriting your queries and table definitions to migrate from the deprecated objects. Once the queries and table definitions are fixed, change the setting for the map option back to OFF.
Valid Values:
ON | OFF
If set to ON, Hybrid Data Pipeline includes deprecated objects in the relational model.
If set to OFF, Hybrid Data Pipeline does not include deprecated objects in the relational model.
Default: OFF
Show Internal Tables
Defines how Hybrid Data Pipeline shows internal tables.
Valid Values:
ON | OFF
If set to ON, Hybrid Data Pipeline shows the "Data" table.
If set to OFF, Hybrid Data Pipeline does not show the "Data" table.
Default: OFF
Subtract Tables
Defines a comma-separated list of tables that should be hidden from the user's view. This feature is useful if you want to define your own tables instead of using some of the tables that are supplied with the data store, or to limit access to certain tables so that the user does not see them. For example, enter adSense,adWords.
subtractTables can be used both for the pseudo-tables in Google Analytics that are derived from the Data system table, and also for the regular management tables such as Goal or Account.

Advanced tab

Click the thumbnail to view the screen.
Advanced tab of the Google Analytics data source setup dialogAdvanced tab of the Google Analytics data source setup dialog
Table 26. Advanced tab connection parameters for Google Analytics
Field
Description
Default Query Options
A semi-colon delimited list of default values for the WHERE clauses within the connection. Specifying mandatory values such as startDate, endDate, and viewId in this parameter makes the queries simpler. For example, the query SELECT * FROM Overview returns only results from the specified period.
Valid Values:
(key=value[;key=value])
Where:
key
is one of the following values:
If set to startDate, specifies the starting date for the query (inclusive). The default is thirty days prior to the current date, expressed as 30daysago.
If set to endDate, the ending date for the query (inclusive). This defaults to yesterday.
The syntax for startDate and endDate values is as follows:
*a date in YYYY-MM-DD format
*the word "today" for the current date
*the word "yesterday" for the prior date
*#daysAgo, where # is some positive integer
If the key is viewId, the value is a comma-separated list of view Ids. There is no default; in order for SELECT * FROM to work for either "Data" or any of the pseudo-tables, this must be set either explicitly in a WHERE clause or via the defaultQueryOptions connection string option.
Default: If no value is specified (the default), the connectivity service uses startDate=30daysAgo;endDate=yesterday.
Extended Options
Specifies a semi-colon delimited list of connection options and their values. Use this configuration option to set the value of undocumented connection options that are provided by Progress DataDirect technical support. You can include any valid connection option in the Extended Options string, for example:
Database=Server1;UndocumentedOption1=value[;UndocumentedOption2=value;]
If the Extended Options string contains option values that are also set in the setup dialog, the values of the options specified in the Extended Options string take precedence.
Valid Values: string
Default: empty string
Initialization String
A semicolon delimited set of commands to be executed on the cloud data store after Hybrid Data Pipeline has established and performed all initialization for the connection. If the execution of a SQL command fails, the connection attempt also fails and Hybrid Data Pipeline returns an error indicating which SQL commands failed.
The default is an empty string.
Max Pooled Statements
The maximum number of prepared statements to cache for this connection. If the value of this property is set to 20, the connectivity service caches the last 20 prepared statements that are created by the application.
Web Service Call Limit
The maximum number of Web service calls allowed to the cloud data store for a single SQL statement or metadata query.
When set to 0, there is no limit on the number of Web service calls on a single connection that can be made when executing a SQL statement.
Web Service Fetch Size
Specifies the number of rows of data the Hybrid Data Pipeline connectivity service attempts to fetch for each call.
Valid Values:
0 | x
If set to 0, the Hybrid Data Pipeline connectivity service attempts to fetch up to a maximum of 10000 rows. This value typically provides the maximum throughput.
If set to x, the Hybrid Data Pipeline connectivity service attempts to fetch up to a maximum of the specified number of rows. Setting the value lower than 10000 can reduce the response time for returning the initial data. Consider using a smaller value for interactive applications only.
Default: 0
Web Service Retry Count
The number of times to retry a timed-out Select request. The Web Service Timeout parameter specifies the period between retries. A value of 0 for the retry count prevents retries. A positive integer sets the number of retries. The default value is 3.
Web Service Timeout
The time, in seconds, to wait before retrying a timed-out Select request. Valid only if the value of Web Service Retry Count is greater than zero. A value of 0 for the timeout waits indefinitely for the response to a Web service request. There is no timeout. A positive integer is considered as a default timeout for any statement created by the connection. The default value is 120.
Metadata Exposed Schemas
Restricts the metadata exposed by Hybrid Data Pipeline to a single schema. The metadata exposed in the SQL Editor, the Configure Schema Editor, and third party applications will be limited to the specified schema. JDBC, OData, and ODBC metadata calls will also be restricted. In addition, calls made with the Schema API will be limited to the specified schema.
Warning: This functionality should not be regarded as a security measure. While the Metadata Exposed Schemas option restricts the metadata exposed by Hybrid Data Pipeline to a single schema, it does not prevent queries against other schemas on the backend data store. As a matter of best practice, permissions should be set on the backend data store to control the ability of users to query data.
Valid Values
<schema>
Where:
<schema>
is the name of a valid schema on the backend data store.
Default: No schema is specified. Therefore, all schemas are exposed.
See the steps for:
How to create a data source in the Web UI
* Creating an OAuth profile
* Renaming an OAuth profile
* Deleting an OAuth profile
* Refreshing stale access and refresh tokens
* Adding Google Analytics tables
* Using Google Analytics