Add a Hive database connection
You can add a connection to a Hive database using ThoughtSpot DataFlow.
Follow these steps:
- 
Select Connections in the top navigation bar. 
- 
In the Connections interface, select Add connection in the upper-right corner. 
- 
In the Create Connection interface, select the Connection type. 
- 
After you select the Hive Connection type, the rest of the connection properties appear. See Connection properties for details, defaults, and examples.
- Connection name
- 
Name your connection. 
- Connection type
- 
Choose the Hive connection type. 
- HiveServer2 HA configured
- 
Select this option if you use HiveServer2 High Availability (HA).
- HiveServer2 zookeeper namespace
- 
Specify the ZooKeeper namespace. The default value is hiveserver2. Applies only when using HiveServer2 HA.
- Host
- 
Specify the hostname or IP address of the Hadoop system. Applies only when not using HiveServer2 HA.
- Port
- 
Specify the port. Applies only when not using HiveServer2 HA.
- Hive security authentication
- 
Specify the security protocol used to connect to the instance. Select the authentication type that matches your security setup, and provide the required details.
- User
- 
Specify the user to connect to Hive. This user must have data access privileges. 
- Password
- 
Specify the password. 
- Trust store
- 
Specify the trust store name for authentication. For SSL and Kerberos authentication only. 
- Trust store password
- 
Specify the password for the trust store. For SSL and Kerberos authentication only. 
- Hive transport mode
- 
Specify the network protocol used for communication between Hive nodes. Applies only to the Hive process engine.
- HTTP path
- 
Specify the HTTP path. For HTTP transport mode only.
- Hadoop distribution
- 
Provide the Hadoop distribution of the connection. 
- Distribution version
- 
Provide the version of the Hadoop distribution. 
- Hadoop conf path
- 
By default, the system picks up the Hadoop configuration files from HDFS. To override this, specify an alternate location. Applies only when using configuration settings that differ from the global Hadoop instance settings.
- DFS HA configured
- 
Specify whether you use High Availability (HA) for DFS.
- DFS name service
- 
Specify the logical name of the HDFS nameservice. 
- DFS name node IDs
- 
Specify a comma-separated list of NameNode IDs. The system uses this property to determine all NameNodes in the cluster. The XML property name is dfs.ha.namenodes.[nameservice], where [nameservice] is the value of dfs.nameservices.
- RPC address for namenode1
- 
Specify the fully qualified RPC address of the first listed NameNode. The XML property name is dfs.namenode.rpc-address.[nameservice].[NameNode ID 1], where [nameservice] is the value of dfs.nameservices. For DFS HA and Hadoop Extract only.
- RPC address for namenode2
- 
Specify the fully qualified RPC address of the second listed NameNode. The XML property name is dfs.namenode.rpc-address.[nameservice].[NameNode ID 2], where [nameservice] is the value of dfs.nameservices.
- DFS host
- 
Specify the DFS hostname or the IP address. 
- DFS port
- 
Specify the associated DFS port. 
- Default DFS location
- 
Specify the default source/target location.
- Temp DFS location
- 
Specify the location for creating the temporary directory.
- DFS security authentication
- 
Select the type of security that is enabled.
- Hadoop RPC protection
- 
Specify the quality of protection for Hadoop RPC. Hadoop cluster administrators control this using the configuration parameter hadoop.rpc.protection.
- Hive principal
- 
Specify the principal used to authenticate Hive services.
- User principal
- 
Specify the user principal associated with the keytab. Keytab authentication requires both a keytab file, generated by the Kerberos administrator, and the user principal associated with that keytab, which is configured when Kerberos is enabled.
- User keytab
- 
Specify the keytab file. Keytab authentication requires both a keytab file, generated by the Kerberos administrator, and the user principal associated with that keytab, which is configured when Kerberos is enabled.
- KDC host
- 
Specify the KDC hostname. The KDC (Kerberos Key Distribution Center) is a service that runs on a domain controller server role. This value is configured in the Kerberos configuration file, /etc/krb5.conf.
- Default realm
- 
Specify the default Kerberos realm. A Kerberos realm is the domain over which a Kerberos authentication server has the authority to authenticate a user, host, or service. This value is configured in the Kerberos configuration file, /etc/krb5.conf.
- Queue name
- 
Specify the queue name, as it appears in the comma-separated list in yarn.scheduler.capacity.root.queues. For Hadoop Extract only.
- YARN web UI port
- 
Specify the port of the web UI for the YARN ResourceManager. The default is 8088. For Hadoop Extract only.
- Zookeeper quorum host
- 
Specify the value of hadoop.registry.zk.quorum from yarn-site.xml. Applies only when using HiveServer2 HA.
- Yarn timeline webapp host
- 
Specify the IP address of the YARN timeline service web application.
- Yarn timeline webapp port
- 
Specify the port of the YARN timeline service web application.
- Yarn timeline webapp version
- 
Specify the version of the YARN timeline service web application.
- JDBC options
- 
Specify the options associated with the JDBC URL. 
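The DFS HA properties above correspond to standard Hadoop property names in hdfs-site.xml. As a rough illustration, the following Python sketch derives those property names from a nameservice and its NameNodes. The function name `dfs_ha_properties`, the nameservice `mycluster`, the NameNode IDs, the hostnames, and port 8020 are all hypothetical values chosen for the example; they do not come from DataFlow.

```python
def dfs_ha_properties(nameservice, namenodes):
    """Return the Hadoop property names and values for an HA nameservice.

    nameservice: logical name of the HDFS nameservice (DFS name service).
    namenodes:   dict mapping each NameNode ID to its "host:port" RPC address.
    """
    props = {
        # Logical nameservice name.
        "dfs.nameservices": nameservice,
        # Comma-separated NameNode IDs (DFS name node IDs).
        f"dfs.ha.namenodes.{nameservice}": ",".join(namenodes),
    }
    # One RPC address property per NameNode ID.
    for nn_id, rpc_address in namenodes.items():
        props[f"dfs.namenode.rpc-address.{nameservice}.{nn_id}"] = rpc_address
    return props


# Hypothetical cluster values, for illustration only.
example = dfs_ha_properties(
    "mycluster",
    {"nn1": "nn1.example.com:8020", "nn2": "nn2.example.com:8020"},
)
for name, value in example.items():
    print(f"{name} = {value}")
```

Comparing this output against the cluster's hdfs-site.xml is one way to check the values before entering them in the connection form.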
 
- 
Select Create connection.
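To see how several of the connection properties relate to one another, it can help to look at the standard HiveServer2 JDBC URL format. The Python sketch below assembles such a URL for the HA and non-HA cases. It is an illustration only: how DataFlow builds its connection internally is not described here, and the function name `build_hive_jdbc_url`, the hostnames, and the ports are hypothetical.

```python
def build_hive_jdbc_url(host=None, port=None, *, ha=False, zk_quorum=None,
                        zk_namespace="hiveserver2", transport_mode=None,
                        http_path=None, principal=None, database="default"):
    """Assemble a HiveServer2 JDBC URL from connection-form style values."""
    params = []
    if ha:
        # HA: connect through ZooKeeper service discovery instead of host/port.
        if not zk_quorum:
            raise ValueError("zk_quorum is required when ha=True")
        authority = zk_quorum
        params.append("serviceDiscoveryMode=zooKeeper")
        params.append(f"zooKeeperNamespace={zk_namespace}")
    else:
        # Non-HA: connect directly to one HiveServer2 host and port.
        if not host or not port:
            raise ValueError("host and port are required when ha=False")
        authority = f"{host}:{port}"
    if transport_mode == "http":
        # The HTTP path is meaningful only in HTTP transport mode.
        params.append("transportMode=http")
        if http_path:
            params.append(f"httpPath={http_path}")
    if principal:
        # Hive service principal, used with Kerberos authentication.
        params.append(f"principal={principal}")
    return ";".join([f"jdbc:hive2://{authority}/{database}"] + params)


# Hypothetical values, for illustration only.
print(build_hive_jdbc_url("hive.example.com", 10000))
print(build_hive_jdbc_url(ha=True, zk_quorum="zk1.example.com:2181,zk2.example.com:2181"))
```

The sketch mirrors the form's rules: Host and Port apply only without HiveServer2 HA, while the ZooKeeper quorum and namespace apply only with it.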