Progress® DataDirect® Hybrid Data Pipeline User's Guide Release 4.3

Upload: duonghanh

Post on 09-Jan-2017


  • Progress DataDirect Hybrid Data Pipeline User's Guide

    Release 4.3

  • Copyright

    2018 Progress Software Corporation and/or one of its subsidiaries or affiliates. All rights reserved.

    These materials and all Progress software products are copyrighted and all rights are reserved by Progress Software Corporation. The information in these materials is subject to change without notice, and Progress Software Corporation assumes no responsibility for any errors that may appear therein. The references in these materials to specific platforms supported are subject to change.

    Corticon, DataDirect (and design), DataDirect Cloud, DataDirect Connect, DataDirect Connect64, DataDirect XML Converters, DataDirect XQuery, DataRPM, Deliver More Than Expected, Icenium, Kendo UI, NativeScript, OpenEdge, Powered by Progress, Progress, Progress Software Developers Network, Rollbase, SequeLink, Sitefinity (and Design), SpeedScript, Stylus Studio, TeamPulse, Telerik, Telerik (and Design), Test Studio, and WebSpeed are registered trademarks of Progress Software Corporation or one of its affiliates or subsidiaries in the U.S. and/or other countries. Analytics360, AppServer, BusinessEdge, DataDirect Spy, SupportLink, DevCraft, Fiddler, JustAssembly, JustDecompile, JustMock, Kinvey, NativeChat, NativeScript Sidekick, OpenAccess, ProDataSet, Progress Results, Progress Software, ProVision, PSE Pro, Sitefinity, SmartBrowser, SmartComponent, SmartDataBrowser, SmartDataObjects, SmartDataView, SmartDialog, SmartFolder, SmartFrame, SmartObjects, SmartPanel, SmartQuery, SmartViewer, SmartWindow, and WebClient are trademarks or service marks of Progress Software Corporation and/or its subsidiaries or affiliates in the U.S. and other countries. Java is a registered trademark of Oracle and/or its affiliates. Any other marks contained herein may be trademarks of their respective owners.

    Please refer to the Release Notes applicable to the particular Progress product release for any third-party acknowledgements required to be provided in the documentation associated with the Progress product.

    Updated: 2018/03/14

    Progress DataDirect Hybrid Data Pipeline: User's Guide: Version 4.3


  • Table of Contents

    Chapter 1: Welcome to DataDirect Hybrid Data Pipeline
      Product requirements
      Deployment guidelines
      Deployment scenarios
        Single node deployment
        Cluster deployment
        Cloud deployment
        On-premises deployment
      Getting started for administrators
        Login credentials
        Load balancer configuration
        External system database
        Apache Kafka Cluster
        Shared files location
        SSL certificate for the load balancer
        Access ports
        Application and driver configuration
        Browser configuration for the Web UI

    Chapter 2: Using DataDirect Hybrid Data Pipeline
      Logging in to the Web UI
      Creating a Data Source definition
        Parameters for all supported Data Source types
        Amazon Redshift Parameters
        Apache Hadoop Hive Parameters
        DB2 parameters
        Google Analytics parameters
        Defining OAuth2 authentication
        Using Google Analytics
        Greenplum parameters
        Informix parameters
        Microsoft Dynamics CRM parameters
        Microsoft SQL Server parameters
        MySQL Enterprise parameters
        MySQL Community Edition parameters
        Oracle parameters
        Oracle Marketing Cloud (Eloqua) parameters
        Oracle Sales Cloud parameters


        Oracle Service Cloud parameters
        PostgreSQL parameters
        Progress OpenEdge parameters
        Progress Rollbase parameters
        Salesforce (and Related Data Store) connection parameters
        SugarCRM parameters
        Sybase parameters
      Editing, testing, and deleting Data Source definitions
        Editing connection parameters
        Testing a Data Source
        Deleting a Data Source definition
      Exploring the Hybrid Data Pipeline workspace
        Manage Users view (administrators only)
        Product information
        Data Sources View
        SQL Testing View
        User Profile
      Changing your password
      Password Policy
      Firewall and port redirection using iptables
        Disabling firewalld
        Installing iptables
        Creating the iptables configuration file
        Starting the iptables service
      Enabling OData and working with Data Source groups
        Using the Configure Schema editor
        Creating a Data Source Group
        Editing a Data Source Group
        Deleting a Data Source Group
        Setting OData options
      Administrating Hybrid Data Pipeline
        Initial login
        Managing users
        Authentication
        Permissions and user provisioning
        Configuring change password behavior
        Implementing an account lockout policy
        Configuring row limit throttling
        Configuring throttling for OData queries
        Configuring CORS behavior
        FIPS (Federal Information Processing Standard)
        Logging
        Troubleshooting and FAQ for Hybrid Data Pipeline


    Chapter 3: Configuring Hybrid Data Pipeline Driver for ODBC
      Getting started with the ODBC Driver
      Supported features
        Data encryption
        Unicode
        Safe thread handling
        Number of connections and statements supported
        Parameter metadata
        SQL support
      Configuring an ODBC data source on UNIX and Linux systems
        Setting environment variables manually
        Configuring a data source in the system information file
        DSN-less connections
        Example application for UNIX and Linux
      Configuring and testing an ODBC data source on Windows
        Example application for Windows
      Connecting applications to the connectivity service
        Connection strings
        File data sources
        Connection Properties
        Connecting Through a Proxy Server
      Connection properties reference
        ODBC Connection Properties
        Application Using Threads
        Client Time Zone
        Data Source Name
        Data Source Password
        Data Source User
        Default Buffer Size for Long/LOB Columns (in Kb)
        Description
        Enable SSL
        Enable WChar Support
        Host Name In Certificate
        Hybrid Data Pipeline Source
        IANAAppCodePage
        Login Timeout
        Logon Domain
        Max Varchar Size
        Min Long Varchar Size
        Password
        Port Number
        Proxy Host
        Proxy Password


        Proxy Port
        Proxy User
        Query Timeout
        Report Codepage Conversion Errors
        Service
        Transaction Mode
        Trust Store
        Trust Store Password
        User Name
        Validate Server Certificate
        Varchar Threshold
        WSRetryCount
        WSTimeout
      Application considerations
        Verifying the driver version number
        Retrieving data type information
        Supported ODBC API functions
        Scalar functions
      Troubleshooting
        Determining where an issue originates
        Log files created during installation or upgrade
        Error message syntax
        Test loading tools for UNIX and Linux
        ODBC Trace
        Creating a trace log
        Other tools
        ODBC Test
        Using the Driver with Microsoft Access
      Internationalization, localization, and Unicode
        Internationalization and Localization
        Unicode Character Encoding
        Unicode ODBC Drivers
        Driver Manager and Unicode Encoding on UNIX/Linux
        References
      Code page values
      WorkAround options
        WorkArounds and WorkArounds2 options
        Using the WorkAround options

    Chapter 4: Configuring Hybrid Data Pipeline for JDBC
      Getting started with the JDBC driver
        Testing the JDBC connection to a Hybrid Data Pipeline Data Source
        Connecting from an Application to Hybrid Data Pipeline
        Connecting Through a Proxy Server


        Driver and Data Source Classes
      Supported Features
        Data encryption
        Unicode
        Scrollable cursors
        Large objects (LOBs)
        Rowsets
        Auto-generated keys
        Using IP addresses
        SQL support
      Using connection pooling
        How connection pooling works
        Using a DataDirect connection pool
        Connecting to a JDBC Data Source using a connection pool
      Testing your application
        Configuring DataDirect Test
      Troubleshooting
        Installation issues
        Troubleshooting an application by logging
        Logging Levels
        Troubleshooting Connection Pooling
      Connection properties reference
        Connection Properties
        ConvertNull
        DataSourcePassword
        DataSourceUserID
        EnableCancelTimeout
        EncryptionMethod
        HostNameInCertificate
        HybridDataPipelineDataSource
        InsensitiveResultSetBufferSize
        JavaDoubleToString
        LogConfigFile
        LoginTimeout
        Password
        ProxyHost
        ProxyPassword
        ProxyPort
        ProxyUser
        QueryTimeout
        TransactionMode
        TrustStore
        TrustStorePassword
        User
        ValidateServerCertificate


        WSRetryCount
        WSRetryDelay
      JDBC support
        Array
        Blob
        CallableStatement
        Clob
        Connection
        ConnectionEventListener
        ConnectionPoolDataSource
        DatabaseMetaData
        DataSource
        Driver
        ParameterMetaData
        PooledConnection
        PreparedStatement
        Ref
        ResultSet
        ResultSetMetaData
        RowSet
        SavePoint
        Statement
        StatementEventListener
        Struct
      DataDirect connection pooling
        DataDirect Connection Pool Manager interfaces
        Methods for configuring the connection pool
        Enabling Pool Manager tracing
      JDBC extensions
        JDBC Wrapper methods to access JDBC extensions
        ExtConnection interface

    SQL escape sequences .....................................................................................................................632Date, Time, and Timestamp escape sequences .....................................................................633Scalar functions .......................................................................................................................633Outer join escape sequences ..................................................................................................633LIKE escape character sequence for wildcards ......................................................................634Procedure call escape sequences...........................................................................................634

    Chapter 5: Querying with OData version 4
        Getting started with OData version 4
            Configuring an OData schema map
            Setting OData options
            Testing Data Source configurations for Hybrid Data Pipeline
            Requesting service metadata and the service document


        Supported functionality for OData Version 4
            Supported OData operations and data types
            OData support for stored functions
            Data stores supported for OData use with Hybrid Data Pipeline
            OData model warnings
        Understanding and configuring a schema map for OData version 4
            Tour the Configure Schema editor
            Opening the Configure Schema editor
            Configure Schema editor quick reference
            JSON schema map syntax
            Schema map examples
        Structure requests for OData version 4
            Headers
            Service URI and resource path in Hybrid Data Pipeline
            Response formatting for OData version 4
        Formulating queries with OData version 4
            Query options and optimizing response times
            Searching text-based columns
            Fetching records and collections
            Creating, editing, and deleting records
            Batch Requests
            Navigating relationships
        Method Reference for OData version 4
            HTTP GET
            HTTP DELETE or POST and DELETE
            HTTP PATCH or POST and PATCH (update)
            HTTP POST (create)

    Chapter 6: Querying with OData version 2
        Getting started with OData version 2
            Configuring an OData schema map
            Setting OData options
            Testing Data Source configurations for Hybrid Data Pipeline
            Requesting service metadata and the service document
        Supported functionality for OData version 2
            Supported OData operations and data types
            Data stores supported for OData use with Hybrid Data Pipeline
        Understanding and configuring a schema map for OData version 2
            Tour the Configure Schema editor
            Opening the Configure Schema editor
            Configure Schema editor quick reference
            JSON schema map syntax
            Schema map examples

        Structure of requests for OData version 2
            Headers
            Service URI and resource path in Hybrid Data Pipeline
            Response formatting
        Formulating queries with OData version 2
            Query options and optimizing response times
            Searching text-based columns with OData version 2
            Fetching records and collections with OData version 2
            Creating, editing, and deleting records with OData version 2
            Navigating relationships with OData version 2
        Method Reference for OData version 2
            HTTP GET
            HTTP DELETE or POST and DELETE
            HTTP POST and MERGE (update)
            HTTP POST (create)

    Chapter 7: Querying data stores with SQL
        Querying data stores with SQL
        Supported data types
            Entity Data Model (EDM) types for OData Version 4
            Entity Data Model (EDM) types for OData version 2
            Amazon Redshift data types
            Apache Hive data types
            DB2 data types
            Greenplum data types
            Informix data types
            Microsoft Dynamics CRM Online data types
            Microsoft SQL Server data types
            MySQL data types
            Oracle data types
            Oracle Marketing Cloud (Eloqua) data types
            Oracle Sales Cloud data types
            Oracle Service Cloud data types
            PostgreSQL data types
            Progress OpenEdge data types
            Progress Rollbase data types
            Salesforce-type data types
            SugarCRM data types
            Sybase data types

        Supported scalar functions
            Scalar Function Support for Amazon Redshift
            Scalar Function Support for Apache Hive
            Scalar Function Support for DB2
            Scalar Function Support for Google Analytics
            Scalar Function Support for Greenplum
            Scalar Function Support for Informix
            Scalar Function Support for Microsoft Dynamics
            Scalar Function Support for Microsoft SQL Server
            Scalar Function Support for MySQL
            Scalar Function Support for Oracle
            Scalar function support for Oracle Marketing Cloud (Eloqua)
            Scalar Function Support for Oracle Sales Cloud
            Scalar Function Support for Oracle Service Cloud
            Scalar Function Support for PostgreSQL
            Scalar Function Support for Progress OpenEdge
            Scalar Function Support for Progress Rollbase
            Scalar Function Support for Salesforce-based data stores
            Scalar Function Support for SugarCRM
            Scalar Function Support for Sybase

        Using Salesforce reports
            Salesforce data store reports
        Supported SQL and Extensions
            Alter Session (EXT)
            Alter Table for Salesforce
            Create Table for Salesforce
            Delete
            Drop Table
            Explain Plan
            Insert
            Select
            Update
            SQL Expressions
            Subqueries
        Catalog tables
            SYSTEM_SESSIONS catalog table
            SYSTEM_REMOTE_SESSIONS catalog table
        Error messages
            Hybrid Data Pipeline Management API error messages
        Performance tuning
            Oracle Marketing Cloud bulk operations

    Chapter 8: Configuring an On-Premises Connector
        Configuring the On-Premises Connector
            Restarting the On-Premises Connector
            Determining the Connector information
            Defining the proxy server
            Configuring On-Premises Connector memory resources
            Determining the version
            Checking the configuration status
            Configuring failover and balancing requests across multiple On-Premises Connectors
            Configuring the Microsoft Dynamics CRM On-Premises data source for Kerberos
            Troubleshooting the On-Premises Connector

    Chapter 9: Hybrid Data Pipeline Management API
        Using the Progress DataDirect Hybrid Data Pipeline Management APIs
        Hybrid Data Pipeline DataSource API
            Getting started with the Hybrid Data Pipeline DataSource API
            Get data stores
            Get options for a data store
            Create a data source
            Get data sources
            Get data source details
            Update a data source
            Delete a data source
            Get data source permissions
            Update permissions on a data source
            Test a connection to a data source
            Refresh a data source map
            Create or refresh a data source OData model
            Check status of the OData model refresh
            Get members of a data source group
            Add member data sources to a group data source group
            Update members of a group data source
            Delete a member data source from a group data source
            Hybrid Data Pipeline Schema API

        Hybrid Data Pipeline OAuth API
            Getting started with the Hybrid Data Pipeline OAuth API
            OAuthApplications Operations
            OAuthProfiles Operations
        Hybrid Data Pipeline Connector API
            Getting Started with the Hybrid Data Pipeline Connector API
            Get Connectors
            Get Connector Information
            Update Connector Information
            Get Authorized Users
            Add Authorized Users
            Update Authorized Users
            Delete Authorized Users
            Create a Connector Group
            Add On-Premises Connectors to an On-Premises Connector Group
            Get the List of On-Premises Connectors in an On-Premises Connector Group
            Configure Round-Robin Request Balancing for an On-Premises Connector Group
            Replace the List of On-Premises Connectors in an On-Premises Connector Group
            Remove an On-Premises Connector
            Delete a Group
        Hybrid Data Pipeline Management Permissions API
            Getting started with the Management Permissions API
        Hybrid Data Pipeline Administrator's API
            Getting started with the Limits API
            Getting started with the Users API
            Getting started with the Administrator Permissions API
            Getting started with the Roles API
            Getting started with the User Details API
            Getting started with Logging API
            Getting started with Whitelist API
            Getting started with the Authentication API
            Getting started with the System Configurations API
            Getting started with the Password Policy API
        Get Version Information
        Hybrid Data Pipeline Management API Error Messages
            General errors
            Connector API error messages
            OAuth API error messages
            DataSource API error messages
            Administrator API error messages


    Chapter 1: Welcome to DataDirect Hybrid Data Pipeline

    Progress DataDirect Hybrid Data Pipeline is a lightweight software service that provides simple, secure access to your cloud and on-premises data sources for your business intelligence tools and applications. Client applications can use ODBC, JDBC, or OData to access data from over twenty supported relational and non-relational database management systems, such as Apache Hive, DB2, SQL Server, Oracle, and Salesforce (collectively referred to as data stores). Requests from client applications are translated into the format supported by the underlying data store (SQL, NoSQL, Big Data, or cloud), and results are returned in the format accepted by the client. Communications in HTTP and HTTPS are supported.
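As a rough illustration of OData access over HTTP or HTTPS, the following Python sketch builds the URL for a GET request against a data source. The host name, port, data source name, entity set, and the `/api/odata4/` path prefix are all assumptions made for this example and are not taken from this section; consult "Service URI and resource path in Hybrid Data Pipeline" for the layout your installation actually uses.

```python
from urllib.parse import quote, urlencode


def odata_request_url(host, port, data_source, entity_set, top=None, use_https=True):
    """Build the URL for an OData GET request (hypothetical URL layout)."""
    scheme = "https" if use_https else "http"
    # Assumed path prefix; verify it against your own Hybrid Data Pipeline server.
    path = "/api/odata4/{}/{}".format(
        quote(data_source, safe=""), quote(entity_set, safe="")
    )
    query = ""
    if top is not None:
        # $top is a standard OData query option that limits the number of records.
        query = "?" + urlencode({"$top": top}, safe="$")
    return "{}://{}:{}{}{}".format(scheme, host, port, path, query)


# Hypothetical host, port, data source, and entity set names.
print(odata_request_url("hdp.example.com", 8443, "MySalesforce", "Accounts", top=5))
# prints: https://hdp.example.com:8443/api/odata4/MySalesforce/Accounts?$top=5
```

Because the request is plain HTTP(S), a client that can issue a GET request needs no driver installed locally, which is why the sketch only assembles a URL.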

    Key Features

    • Cluster deployment: Run the Hybrid Data Pipeline service on multiple nodes behind a load balancer to support scalability
    • Single node deployment: Run the Hybrid Data Pipeline service on a single server with one or more backup servers to support failover
    • Cloud deployment: Host Hybrid Data Pipeline from the cloud for access to cloud and on-premises data sources without requiring a VPN or other gateway
    • On-premises deployment: Host Hybrid Data Pipeline on-premises for access to cloud and on-premises data sources
    • API support: Use ODBC, JDBC, or OData to access supported cloud and on-premises data sources

    Main Components

    • Hybrid Data Pipeline server: provides access to multiple data sources through a unified interface
    • On-Premises Connector: enables Hybrid Data Pipeline to establish a secure connection from the cloud to an on-premises data source without requiring a VPN or other gateway
    • ODBC driver: enables ODBC applications to communicate with a data source through the Hybrid Data Pipeline server (HTTP and HTTPS connections supported)
    • JDBC driver: enables JDBC applications to communicate with a data source through the Hybrid Data Pipeline server (HTTP and HTTPS connections supported)

    For details, see the following topics:

    • Product requirements
    • Deployment guidelines
    • Deployment scenarios
    • Getting started for administrators

    Product requirements

    Hybrid Data Pipeline consists of four main components: the Hybrid Data Pipeline server, the On-Premises Connector, the JDBC driver, and the ODBC driver.

    • The Hybrid Data Pipeline server must be installed on a 64-bit Linux machine.
    • The On-Premises Connector must be installed on a separate Windows machine.

    Depending on your application environment, you may need to install one or more additional components.

    • For ODBC applications, install and configure the ODBC driver.
    • For JDBC applications, install and configure the JDBC driver.
    • For REST-based data access for mobile apps and desktop applications, no local software is needed.

    Ensure that your environment meets component requirements before proceeding with installation. See the following tables for details on component requirements.

    • Hybrid Data Pipeline server
    • Browser for Hybrid Data Pipeline Web UI
    • On-Premises Connector
    • JDBC driver
    • ODBC driver
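The component choices described above can be expressed as a small lookup. The Python sketch below is only an illustration: the application-type keys (`odbc`, `jdbc`, `rest`) and the function name are hypothetical labels chosen for this example, not identifiers defined by the product.

```python
# Additional client-side component per application type, as described above.
# The dictionary keys are hypothetical labels used only in this sketch.
ADDITIONAL_COMPONENTS = {
    "odbc": "ODBC driver",
    "jdbc": "JDBC driver",
    "rest": None,  # REST-based access needs no local software
}


def components_to_install(application_types):
    """Return the additional components to install, sorted and without duplicates."""
    needed = set()
    for app_type in application_types:
        key = app_type.lower()
        if key not in ADDITIONAL_COMPONENTS:
            raise ValueError("unknown application type: " + app_type)
        component = ADDITIONAL_COMPONENTS[key]
        if component is not None:
            needed.add(component)
    return sorted(needed)


print(components_to_install(["JDBC", "rest", "odbc"]))
# prints: ['JDBC driver', 'ODBC driver']
```

An environment that uses only REST-based access yields an empty list, matching the statement that no local software is needed in that case.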


    Table 1: Hybrid Data Pipeline server

    Platform    64-bit
    Linux       Red Hat Enterprise Linux x64, version 4.0 and higher
                CentOS Linux x64, version 4.0 and higher
                Ubuntu Linux x64, version 16 and higher
                SUSE Linux Enterprise Server, Linux x64, version 10.x, 11, 12, and 13
                Oracle Linux x64, version 4.0 and higher

    Table 2: Browser for Hybrid Data Pipeline Web UI

    Browser              Version
    Chrome               Chrome 53.0 and higher
    Firefox              Firefox 48 and higher
    Internet Explorer    Internet Explorer 11.0 and higher
    Safari               Safari 9.1 and higher

    Table 3: On-Premises Connector

    Platform    32-bit                 64-bit
    Windows     Windows 10             Windows 10
                Windows 8, 8.1         Windows 8, 8.1
                Windows 7              Windows 7
                Windows Server 2008    Windows Server 2012 Service Pack 2
                                       Windows Server 2008

    Table 4: JDBC driver

    Platform    32-bit                 64-bit
    Linux       Java SE 6 or higher    Java SE 6 or higher
    Windows     Java SE 6 or higher    Java SE 6 or higher


  • Table 5: ODBC driver

    Platform                             32-bit                                       64-bit
    AIX                                  AIX 5L, version 5.3 fixpack 5                AIX 5L, version 5.3 fixpack 5
                                         6.1                                          6.1
                                         7.1                                          7.1
    HP-UX PA-RISC                        11i (B.11.23 and B.11.31)                    11i (B.11.23 and B.11.31)
                                         11i (B.11.11)
                                         11
    HP-UX IPF                            11i (B.11.23 and B.11.31)                    11i (B.11.23 and B.11.31)
    Linux                                Red Hat Enterprise Linux 4.x, 5.x, 6.x       Red Hat Enterprise Linux 4.x, 5.x, 6.x
                                         SUSE Linux Enterprise Server 10.x, 11, 12    SUSE Linux Enterprise Server 10.x, 11, 12
                                         Oracle Linux 6.x, 7.x                        Oracle Linux 6.x, 7.x
                                         CentOS Linux 6.8, 7                          CentOS Linux 6.8, 7
    Oracle Solaris on Oracle SPARC       Oracle Solaris 8, 9, 10                      Oracle Solaris 8, 9, 10
    Oracle Solaris x86: Intel            Oracle Solaris 10, 11                        na
    Oracle Solaris x64: Intel and AMD    Oracle Solaris 10, 11                        Oracle Solaris 10
                                                                                      Oracle Solaris 11 Express
    Windows                              Windows 10                                   Windows 10
                                         Windows 8, 8.1                               Windows 8, 8.1
                                         Windows 7                                    Windows 7
                                         Windows Vista                                Windows Vista
                                         Windows XP Service Pack 2 and higher         Windows XP Service Pack 2 and higher
                                         Windows Server 2008                          Windows Server 2012 Service Pack 2
                                                                                      Windows Server 2008


  • Deployment guidelines

    Hybrid Data Pipeline is a highly adaptable software service that can be securely integrated into a variety of network environments. Follow these guidelines to get your Hybrid Data Pipeline environment up and running.

    Determine how to deploy Hybrid Data Pipeline in your network environment. While not all deployment scenarios are covered, Deployment scenarios on page 22 describes four basic deployment scenarios, which should help with an assessment. For example, you may want to deploy the Hybrid Data Pipeline service as a cluster on multiple nodes.

    Determine which components you need to install and configure in addition to the Hybrid Data Pipeline server. For example, you will need to install the ODBC driver to support ODBC applications, and the JDBC driver for JDBC applications. Additionally, you may choose to install the On-Premises Connector for direct, secure access to on-premises data sources.

    Ensure that Product requirements on page 18 are met for each component you are installing. For example, at this time, the Hybrid Data Pipeline server must be installed on a Linux 64-bit machine.

    Collect the information needed for component installation. The information you need depends on your deployment scenario. For example, you must supply a hostname and port for a load balancer when deploying a Hybrid Data Pipeline cluster. Refer to the Progress DataDirect Hybrid Data Pipeline Installation Guide (https://documentation.progress.com/output/DataDirect/hybridpipeinstall/) for details.

    During installation of the Hybrid Data Pipeline server, you must specify passwords for the default administrator and user. The default administrator and user are d2cadmin and d2cuser, respectively. When initially logging in to the Web UI or using the API, you must authenticate as one of these users. See Login credentials on page 29 for details.

    During installation of the Hybrid Data Pipeline server, you must specify the location of shared files used in the installation and operation of the server. The shared location stores properties files, encryption keys, and system information. In addition, the shared files location should be secured on a system separate from the system that stores encrypted data, or encrypts or decrypts data. See Shared files location on page 34 for details.

    Install the components needed for your Hybrid Data Pipeline deployment, beginning with the Hybrid Data Pipeline server. Refer to the Progress DataDirect Hybrid Data Pipeline Installation Guide for details.

    Note: The following four configuration files are generated with the installation of the Hybrid Data Pipeline server: config.properties, OnPremise.properties, ddcloud.pem, and ddcloudTrustStore.jks. These files must be used in the installation of the ODBC driver, the JDBC driver, and the On-Premises Connector.

    For a Hybrid Data Pipeline cluster deployment, configure Hybrid Data Pipeline nodes, the load balancer, client applications, and the drivers. See Cluster deployment on page 23 for more information.

    To query on-premises data sources without requiring a VPN or other gateway, configure the On-Premises Connector. See Configuring the On-Premises Connector on page 849 for details.

    Use the Web UI or Management API to create new users and administrators. See Managing users on page 418 and Using the Progress DataDirect Hybrid Data Pipeline Management APIs on page 859.

    Use the Web UI or Management API to create data source definitions to support queries to the data stores you use (such as Apache Hive, DB2, SQL Server, Oracle, and Salesforce). See Creating a Data Source definition on page 40 and Using the Progress DataDirect Hybrid Data Pipeline Management APIs on page 859.


  • Configure your OData applications to query the data sources you have created. See Enabling OData and working with Data Source groups on page 408 and Getting started with OData version 2 on page 689.

    Configure your Google Analytics OAuth applications to query the data sources you have created. See Google Analytics parameters on page 86.

    Configure the ODBC and JDBC drivers, as well as your ODBC and JDBC applications, to query data sources. For example, you may need to configure the drivers to use timeouts. See Getting started with the ODBC Driver on page 462 and Getting started with the JDBC driver on page 541.

    Deployment scenarios

    Hybrid Data Pipeline is a highly adaptable software service that can be securely integrated into a variety of network environments. The following topics describe how Hybrid Data Pipeline can be deployed as a cluster behind a load balancer or as a single node in a DNS or behind a gateway. Deployments for hosting Hybrid Data Pipeline from the cloud or on-premises are also described.

    Whether deploying Hybrid Data Pipeline on a single node or multiple nodes, in the cloud or on-premises, you will provide key pieces of information such as hostnames and port numbers during the installation process. To ensure a successful deployment, carefully follow the instructions described in the Progress DataDirect Hybrid Data Pipeline Installation Guide for any deployment scenario.

    Single node deployment

    A single node deployment refers to a single, active Hybrid Data Pipeline service running on a node in a DNS or behind a reverse proxy or load balancer. HTTP and HTTPS end-points are exposed on this node. OData, ODBC, and JDBC requests can be made directly to this node, and a firewall can be used to limit the HTTP and HTTPS end-points that are exposed. Additionally, connections with client applications can be secured by appropriately configuring client applications, supplying an SSL certificate for the active node, and, if using ODBC or JDBC applications, appropriately configuring the ODBC and JDBC drivers. The On-Premises Connector may also be used to ensure secure connections from cloud applications to on-premises data sources.

    A single node deployment with failover supports the same functionality. However, a failover deployment would include the installation of one or more Hybrid Data Pipeline backup nodes. Any backup nodes must share the same external system database, encryption key, and SSL certificate with the active node. This information must be provided during the installation of the backup nodes.
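    The requirement that backup nodes share settings with the active node can be checked before installation with a small script. The following is a minimal sketch only: the dictionary keys, the example database string, and the idea of comparing SHA-256 fingerprints are assumptions made for illustration and are not part of the Hybrid Data Pipeline installer.

```python
import hashlib

# Hypothetical names for the three values that the active node and every
# backup node must have in common.
SHARED_KEYS = ("system_database", "encryption_key", "ssl_certificate")


def fingerprint(data: bytes) -> str:
    """Return a SHA-256 fingerprint so secrets need not be compared in clear text."""
    return hashlib.sha256(data).hexdigest()


def mismatched_settings(active: dict, backup: dict) -> list:
    """List the shared settings whose values differ between two nodes."""
    return [key for key in SHARED_KEYS if active.get(key) != backup.get(key)]


# Example values are invented for illustration.
active_node = {
    "system_database": "db.example.com:5432/hdpsystem",
    "encryption_key": fingerprint(b"example-key-bytes"),
    "ssl_certificate": fingerprint(b"example-cert-bytes"),
}
backup_node = dict(active_node, encryption_key=fingerprint(b"a-different-key"))

print(mismatched_settings(active_node, backup_node))
```

    An empty list indicates that the planned backup node matches the active node on all three shared values.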

    Your single node deployment of the Hybrid Data Pipeline service may include the use of the On-Premises Connector, the ODBC driver, and the JDBC driver. During the installation of a Hybrid Data Pipeline server, the following four configuration files are generated.

    config.properties

    ddcloud.pem

    ddcloudTrustStore.jks

    OnPremise.properties

    These files must be used in the installation of any supporting components to ensure a fully operating Hybrid Data Pipeline integration. See the Progress DataDirect Hybrid Data Pipeline Installation Guide for details.
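    Before installing a driver or the On-Premises Connector, it can be useful to confirm that all four generated files have been copied to the target machine. The following is a minimal sketch, assuming the files sit together in one directory; the function name and the directory layout are illustrative assumptions, and only the four file names come from this guide.

```python
from pathlib import Path

# File names as listed in this guide.
REQUIRED_FILES = (
    "config.properties",
    "OnPremise.properties",
    "ddcloud.pem",
    "ddcloudTrustStore.jks",
)


def missing_config_files(directory) -> list:
    """Return the required file names that are not present in `directory`."""
    base = Path(directory)
    return [name for name in REQUIRED_FILES if not (base / name).is_file()]
```

    For example, `missing_config_files("/tmp/hdp-files")` (a hypothetical path) returns an empty list when all four files are present, or the names of the files that still need to be copied.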


  • Cluster deployment

    The Hybrid Data Pipeline service can be deployed on multiple nodes behind a load balancer to support scalability. The Hybrid Data Pipeline cluster supports the following functionality.

    Distribution of OData, ODBC, and JDBC requests across cluster nodes

    SSL communication with a load balancer that supports SSL termination

    Session affinity to bind a client query to a single node for improved performance (must be enabled in the load balancer to support the Web UI and ODBC and JDBC clients)

    HTTP health check to verify that nodes are active and working

    The successful deployment of a Hybrid Data Pipeline cluster involves the following key elements.

    Hybrid Data Pipeline server configuration

    Load balancer configuration

    Application and driver configuration

    Browser configuration for the Web UI

    Hybrid Data Pipeline server configuration

    The configuration of a Hybrid Data Pipeline server takes place during the installation process. The Hybrid Data Pipeline server must be installed consistently across the nodes in a cluster. When you install the first server in the Hybrid Data Pipeline cluster, you are creating a template to be used for subsequent installation of the server on additional nodes. While you will need to refer to the Hybrid Data Pipeline installation guide to ensure a successful deployment, the following list outlines configuration information that must be supplied during the installation process.

    The name of the machine hosting the Hybrid Data Pipeline service

    The public domain name of the load balancing appliance or of the machine hosting the load balancing service

    The location of a file with the CA server certificate used for SSL configuration in the load balancer. Base64 encoded X.509 and DER encoded binary X.509 formats are supported for the CA server certificate. This certificate is used to create the trust store used by ODBC and JDBC drivers.

    The location of shared files used in the installation and operation of the server. This shared location stores properties files, encryption keys, and system information. The shared files location must be accessible to all the nodes in a cluster. In addition, the shared files location should be secured on a system separate from the system that stores encrypted data, or encrypts or decrypts data. The shared files location is called the "key location" in the installer.

    The identification of an external system database to support data source access across the cluster

    Apache Kafka configuration information:

      The name of the Hybrid Data Pipeline cluster for Apache Kafka

      A comma separated list of the host name and port pairs for Apache Kafka nodes

    Ports used for messaging, node-to-node communication, and version information
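    The comma separated list of Apache Kafka host name and port pairs is easy to mistype, so it may help to validate it before running the installer. The following is a minimal sketch, assuming each pair is written as host:port; the function name and the example host names are hypothetical, and the exact syntax accepted by the installer should be confirmed in the installation guide.

```python
def parse_kafka_nodes(value: str) -> list:
    """Split 'host1:port1,host2:port2' into a list of (host, port) tuples."""
    nodes = []
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate stray or trailing commas
        host, separator, port_text = item.rpartition(":")
        if not separator or not host:
            raise ValueError(f"expected host:port, got {item!r}")
        if not port_text.isdigit():
            raise ValueError(f"port is not a number in {item!r}")
        port = int(port_text)
        if not 1 <= port <= 65535:
            raise ValueError(f"port out of range in {item!r}")
        nodes.append((host, port))
    if not nodes:
        raise ValueError("no Kafka nodes were supplied")
    return nodes


# Example host names are invented for illustration.
print(parse_kafka_nodes("kafka1.example.com:9092, kafka2.example.com:9092"))
```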


  • Load balancer configuration

    You will need to consult your load balancer documentation for details on configuring your load balancer. Nevertheless, to ensure that your load balancer works with a Hybrid Data Pipeline cluster, the following requirements must be met.

    For SSL encryption, the load balancer must provide SSL termination. The SSL certificate supplied during the installation of the Hybrid Data Pipeline nodes will be used by the load balancer.

    The load balancer must support session affinity. The load balancer must either be configured to supply its own cookies or to pass the cookies generated by the Hybrid Data Pipeline service back to the client. The Hybrid Data Pipeline service provides a cookie named C2S-SESSION that can be used by the load balancer. For ODBC and JDBC applications, the ODBC and JDBC drivers automatically use cookies for session affinity. OData applications should be configured to echo cookies for optimal performance.

    The load balancer must pass the hostname in the Host header when a request is made to an individual Hybrid Data Pipeline node. For example, if the hostname used to access the cluster is hdp.mycorp.com and the individual nodes behind the load balancer have the hostnames hdpsvr1.mycorp.com, hdpsvr2.mycorp.com, and hdpsvr3.mycorp.com, then the Host header in the request forwarded to the Hybrid Data Pipeline node must be the load balancer hostname hdp.mycorp.com.

    The load balancer must supply the X-Forwarded-Proto header to indicate to the Hybrid Data Pipeline node whether the request was received by the load balancer as an HTTP or HTTPS request.

    The load balancer must supply the X-Forwarded-For header if the client IP address is needed for Hybrid Data Pipeline access logs. If the X-Forwarded-For header is not supplied, the IP address in the access logs will always be the load balancer's IP address.

    The load balancer may also be configured to use the Health Check API to run HTTP health checks against nodes. See Get Configurations on page 1110 and Update Configuration for given ID on page 1115.
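    The header requirements above can be summarized in a short model of what a load balancer forwards to a node. The following is a minimal sketch, not a load balancer configuration: the function names and the example IP addresses are hypothetical, and reading the first entry of X-Forwarded-For as the client address is a common convention assumed here rather than documented product behavior.

```python
def forwarded_headers(cluster_host: str, client_ip: str, client_scheme: str) -> dict:
    """Build the headers a load balancer would add when forwarding to a node."""
    if client_scheme not in ("http", "https"):
        raise ValueError("client_scheme must be 'http' or 'https'")
    return {
        # The Host header keeps the cluster hostname, not the node hostname.
        "Host": cluster_host,
        # Tells the node how the request reached the load balancer.
        "X-Forwarded-Proto": client_scheme,
        # Preserves the client address for the access logs.
        "X-Forwarded-For": client_ip,
    }


def access_log_ip(headers: dict, peer_ip: str) -> str:
    """Pick the address to log: the forwarded client IP, else the direct peer."""
    first_entry = headers.get("X-Forwarded-For", "").split(",")[0].strip()
    return first_entry or peer_ip


headers = forwarded_headers("hdp.mycorp.com", "203.0.113.7", "https")
print(access_log_ip(headers, peer_ip="10.0.0.5"))  # forwarded client address
print(access_log_ip({}, peer_ip="10.0.0.5"))       # load balancer address
```

    The second call illustrates the situation described above: without X-Forwarded-For, only the load balancer address is available for logging.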

    Application and driver configuration

    Client applications must be appropriately configured. In conjunction with ODBC and JDBC applications, ODBC and JDBC drivers will also need to be configured. OData applications will need their own modifications.

    For the most part, configuration of the ODBC and JDBC drivers is handled during the installation of the drivers. If the drivers are installed using the configuration files generated by the Hybrid Data Pipeline server installation, then they will use the public DNS for the load balancer and will use cookies sent by the load balancer. However, you may wish to configure the drivers in other ways. For details, see Getting started with the ODBC Driver on page 462 and Getting started with the JDBC driver on page 541.

    OData applications must be modified to use the public DNS of the load balancer for HTTP or HTTPS requests. Additionally, for optimal performance, OData applications should be configured to echo cookies for session affinity. OData applications must also be configured appropriately for SSL. See Node-to-node communication in OData client cluster environment on page 25 for details on communication between nodes when an OData client cannot be configured to echo cookies.
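    Echoing cookies means that the client stores the cookies it receives and sends them back on later requests, so the load balancer can route those requests to the same node. The following is a minimal sketch of that behavior using only the Python standard library; the class name and the cookie value are hypothetical, and only the cookie name C2S-SESSION comes from this guide.

```python
from http.cookies import SimpleCookie


class CookieEchoingClient:
    """Remembers cookies from responses and replays them on later requests."""

    def __init__(self):
        self._cookies = {}

    def remember(self, set_cookie_headers):
        """Store the cookies found in a response's Set-Cookie header values."""
        for header in set_cookie_headers:
            parsed = SimpleCookie()
            parsed.load(header)
            for name, morsel in parsed.items():
                self._cookies[name] = morsel.value

    def cookie_header(self) -> str:
        """Return the Cookie header value to send with the next request."""
        return "; ".join(f"{name}={value}" for name, value in self._cookies.items())


client = CookieEchoingClient()
# The cookie value below is invented for illustration.
client.remember(["C2S-SESSION=abc123; Path=/; HttpOnly"])
print(client.cookie_header())
```

    Most HTTP client libraries provide this behavior through a session or cookie jar object, so in practice it is usually a matter of enabling it rather than writing it by hand.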

    For information on configuring OAuth applications, see Google Analytics parameters on page 86.

    Browser configuration for the Web UI

    The browser you are using when working with the Hybrid Data Pipeline Web UI must be configured to echo cookies for session affinity. This is the default for most browsers.


  • Node-to-node communication in OData client cluster environment

    To achieve session affinity and optimize OData query performance in a Hybrid Data Pipeline cluster, the load balancer and OData clients should be configured to handle cookies. The load balancer should supply its own cookies or pass the cookies generated by the Hybrid Data Pipeline service back to the OData client. In turn, the OData client should echo cookies to allow the load balancer to direct query requests to the node that initially received the query.

    However, it is not always possible to configure an OData client to echo cookies. In such cases, Hybrid Data Pipeline uses an internal mechanism called the distributed file persistence manager to achieve session affinity. When a query is executed that requires file persistence, execution results are stored temporarily on the node that initially received the query. The manager associates the query with the node and the execution results stored there. If a request from the same query is routed to a different node in the cluster, the manager obtains the persisted execution results from the original node. The query results are then returned to the client by the node that received the request.
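    The flow described above can be pictured with a small in-memory model. This is a simplified sketch for illustration only and not the product's implementation: the class names, method names, and query identifier are hypothetical, and a dictionary lookup stands in for the node-to-node HTTP request.

```python
class Node:
    """A cluster node that holds the results it persisted locally."""

    def __init__(self, name: str):
        self.name = name
        self.persisted = {}  # query id -> execution results stored on this node


class PersistenceManager:
    """Tracks which node first received each query."""

    def __init__(self):
        self._origin = {}  # query id -> Node that persisted the results

    def execute(self, node: Node, query_id: str, results: list) -> None:
        """Persist results on the receiving node and record the association."""
        node.persisted[query_id] = results
        self._origin[query_id] = node

    def fetch(self, receiving_node: Node, query_id: str):
        """Serve a follow-up request that may have landed on another node."""
        origin = self._origin[query_id]
        results = origin.persisted[query_id]  # stand-in for a node-to-node call
        return receiving_node.name, results


node1, node2 = Node("node1"), Node("node2")
manager = PersistenceManager()
manager.execute(node1, "query-1", ["row-a", "row-b"])
print(manager.fetch(node2, "query-1"))
```

    In the example, the follow-up request arrives at node2, the results are read from node1, and node2 returns them to the client.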

    The distributed file persistence manager requires node-to-node communication using the HTTP protocol to achieve session affinity. The internal API port specified during Hybrid Data Pipeline server installation is the port used for this node-to-node communication. Data remains secure in the following respects. First, the internal API port (8190 by default) is not exposed externally to the public facing network. Each node registers itself using this port, and communications are restricted. Second, a UUID is generated during the node registration process. This UUID is passed in as an HTTP header to confirm the validity of node-to-node communications. Third, the service stores persisted files on only a temporary basis.
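    The UUID check can be illustrated with a short sketch. The header name, the function names, and the use of a constant-time comparison are assumptions made for this example; the guide does not name the actual header or describe how the service compares the values.

```python
import hmac
import uuid

# Hypothetical header name; the real header is not named in this guide.
REGISTRATION_HEADER = "X-Node-Registration-Id"


def register_node() -> str:
    """Generate the UUID that a node receives when it registers."""
    return str(uuid.uuid4())


def is_valid_internal_request(headers: dict, registered_id: str) -> bool:
    """Accept a node-to-node request only if it carries the registered UUID."""
    supplied = headers.get(REGISTRATION_HEADER, "")
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(supplied.encode(), registered_id.encode())


node_id = register_node()
print(is_valid_internal_request({REGISTRATION_HEADER: node_id}, node_id))
print(is_valid_internal_request({}, node_id))
```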

    Cloud deployment

    This scenario describes a deployment where on-premises data sources are exposed for secure access by cloud-based applications. For this deployment, a Hybrid Data Pipeline server is installed in the cloud, and the On-Premises Connector is used to perform secure connections through the firewall to an on-premises Microsoft SQL Server database. The cloud-based application is located in a separate cloud but connects with Hybrid Data Pipeline through an API such as OData, ODBC, or JDBC.

    This deployment could be suitable for an independent software vendor who wants to embed Hybrid Data Pipeline services in the cloud to give customers direct access to data sources.


  • For a more detailed discussion of this scenario, watch a video (http://documentation.progress.com/output/video/DataDirect/DataPipelineScenarios.html).

    On-premises deployment

    This scenario describes a deployment where the Hybrid Data Pipeline server is installed behind a firewall with on-premises data sources while a number of applications reside in the cloud. With the Hybrid Data Pipeline server behind a firewall, you don't need to maintain a cloud-based service, and you can use SSL to secure your data.

    This deployment scenario could be suitable when using cloud-based OData applications, for example, creating real-time connectivity between Salesforce and an on-premises database.


  • For a more detailed discussion of this scenario, watch a video (http://documentation.progress.com/output/video/DataDirect/DataPipelineScenarios.html).

    Getting started for administrators

    After evaluating Hybrid Data Pipeline, the next step for an administrator is to deploy the service into a production environment. The production environment requires a high level of availability and security to prevent service outages and protect data. For those reasons, we recommend installing a cluster deployment along with several key elements that improve both the failover recovery and security of the service. This section guides administrators through the recommended set-up of Hybrid Data Pipeline for the production environment.

    Configuration of the Hybrid Data Pipeline server for a cluster deployment takes place during the installation process. The Hybrid Data Pipeline server must be installed consistently across the nodes in a cluster. When you install the first server in the Hybrid Data Pipeline cluster, you are creating a template to be used for subsequent installation of the server on additional nodes. Refer to the Hybrid Data Pipeline installation guide for detailed installation instructions.

    Before installing the Hybrid Data Pipeline server, the following elements should be addressed to ensure a successful deployment into a production environment.


  • Login credentials
    You must specify passwords for the default administrator and user during installation of the Hybrid Data Pipeline server. The default administrator and user are d2cadmin and d2cuser, respectively. When initially logging in to the Web UI or using the API, you must authenticate as one of these users. See Login credentials on page 29 for details.

    Load balancer configuration
    In the production environment, we recommend that the Hybrid Data Pipeline service be deployed on multiple nodes behind a load balancer to support scalability and improved availability.

    External system database
    An external database must be used to store system information in the cluster environment. This provides improved security while supporting data source access from across the cluster. Best practices dictate that the system database is replicated, or mirrored, to promote the continuous availability of the system.

    Apache Kafka cluster
    We recommend employing an Apache Kafka cluster as the message queue. This removes the single point of failure that is inherent to using the default TCP message queue.

    Shared files location
    You must determine the location of shared files used in the installation and operation of the server. The shared location stores properties files, encryption keys, and system information. The shared files location must be accessible to all the nodes in a cluster. In addition, the shared files location should be secured on a system separate from the system that stores encrypted data, or encrypts or decrypts data. The shared files location is called the "key location" in the installer.

    Access ports
    The access ports used for Hybrid Data Pipeline should be enabled for incoming traffic and unallocated for other purposes. For a complete list of these ports and defaults for a cluster deployment, see Access ports on page 37.

    SSL certificate for the load balancer
    Traffic coming into a cluster deployment is routed through the load balancer. As such, we recommend that the load balancer provide the SSL termination for a cluster deployment to ensure that data is encrypted as it is transmitted.

    Application and driver configuration
    Applications and drivers must be properly configured for a successful cluster deployment.

    Browser configuration for the Web UI
    Your browser must be properly configured to work with the Web UI.
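    For the access ports element above, one quick way to confirm that a port is unallocated on a node is to try binding to it. The following is a minimal sketch using the Python standard library; the function name is hypothetical, and 8190 appears only because it is the default internal API port mentioned earlier in this chapter. The sketch checks local availability only and does not verify that a firewall allows incoming traffic.

```python
import socket


def port_is_free(port: int, host: str = "0.0.0.0") -> bool:
    """Return True if nothing on this machine is already bound to host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            return False  # already in use, or binding is not permitted
    return True


# 8190 is the default internal API port named in this chapter; substitute
# the ports chosen for your own deployment.
for candidate in (8190,):
    state = "free" if port_is_free(candidate) else "in use or unavailable"
    print(candidate, state)
```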

    See the following topics for detailed information.

    Login credentials on page 29

    Load balancer configuration on page 29

    External system database on page 33

    Apache Kafka Cluster on page 33

    Shared files location on page 34

    SSL certificate for the load balancer on page 36

    Access ports on page 37

    Progress