tsload connector reference
The tsload connector APIs enable you to load data into ThoughtSpot.
The tsload connector supports the following APIs:
Login
Use this API to authenticate and sign in a user. Login establishes a session with the ThoughtSpot ETL HTTP server. The authentication requires a username and password.
Request Parameters
- username
-
ThoughtSpot username
- Data type
-
string
- password
-
ThoughtSpot password
- Data type
-
string
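As an illustration, a login request might be prepared as follows. This is a minimal sketch under stated assumptions, not the documented request shape: this section lists the parameters but not the endpoint or the body encoding, so the JSON body, the caller-supplied `login_path`, the function name `build_login_request`, and the host name are all hypothetical. Port 8442 is borrowed from the StartLoad example further down.

```python
import json
import urllib.request


def build_login_request(base_url, login_path, username, password):
    """Prepare (but do not send) a login request.

    login_path is supplied by the caller because this reference does not
    state the login endpoint; no default is assumed. Sending the two
    parameters as a JSON body is also an assumption.
    """
    body = json.dumps({"username": username, "password": password}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + login_path,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical values for illustration only.
req = build_login_request(
    "https://ts-cluster.example:8442", "/path/to/login", "alice", "secret"
)
print(req.get_method(), req.full_url)
```

The request object is only constructed here; passing it to `urllib.request.urlopen` would actually send it.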
StartLoad
After login, use this API to start the data load operation. The API call to use is "/ts_dataservice/v1/public/loads". If the load is initiated successfully, the cycle ID and the load balancer IP are returned. After this completes, use Load to start the actual data load.
Request Parameters
Target
Specification of the target. This `DB/Schema/Table` must exist on the destination ThoughtSpot system.
database
-
Database in ThoughtSpot
- Data type
-
string
schema
-
(optional) Schema in ThoughtSpot
- Data type
-
string
- Default value
-
falcon_default_schema
table
-
Table in ThoughtSpot
- Data type
-
string
Format
Format specifiers for parsing the input data.
type
-
(optional) Input format; either CSV or DELIMITED.
- Data type
-
string
- Default
-
CSV
field_separator
-
(optional) Field separator character in source data.
- Data type
-
string
- Default
-
"," (comma)
trailing_field_separator
-
(optional) True if input data has a trailing field separator, false otherwise.
- Data type
-
boolean
- Default
-
false
enclosing_character
-
(optional) The enclosing character in CSV source format. This option applies only to CSV format.
- Data type
-
string
- Default
-
"\" (backslash)
escape_character
-
(optional) Escape character in source data. This applies only to delimited data format. This option is ignored for other data sources.
- Data type
-
string
- Default
-
"" (null)
null_value
-
(optional) String that represents null values in source data.
- Data type
-
string
- Default
-
"" (null)
has_header_row
-
(optional) True if input data file has a header row, false otherwise.
- Data type
-
boolean
- Default
-
false
flexible
-
(optional) Whether input data file exactly matches target schema.
The flexible option is not available for columnar formats such as Parquet and ORC.
- Data type
-
boolean
- Default value
-
false
date_time
converted_to_epoch
-
(optional) Whether date or datetime fields are already converted to epoch in source CSV. This option is ignored for other source types.
- Data type
-
boolean
- Default
-
true
date_time_format
-
(optional) Format string for datetime values. The system accepts datetime format specifications supported by the strptime datetime library.
- Data type
-
string
- Default
-
"%Y%m%d %H:%M:%S" (yearmonthday hour:minute:second)
- Example
-
December 30, 2001 1:15:12 is 20011230 01:15:12
date_format
-
(optional) Format string for date values. The system accepts date format specifications supported by the strptime datetime library.
- Data type
-
string
- Default
-
"%Y%m%d" (yearmonthday)
- Example
-
December 30, 2001 is 20011230
time_format
-
(optional) Format string for time values. Default is hour:minute:second. The system accepts time format specifications supported by the strptime datetime library.
- Data type
-
string
- Default
-
"%H:%M:%S" (hour:minute:second)
- Example
-
1:15:12 is 01:15:12
skip_second_fraction
-
(optional) When true, skip the fractional part of seconds (milliseconds, microseconds, or nanoseconds) from either datetime or time values if that level of granularity is present in the source data. This option is ignored for other source types.
Skipping fractional components from input data can have a negative impact when upserting data: time or datetime values that differ only in their fractional seconds become identical, so one row can incorrectly replace another valid row.
- Data type
-
boolean
- Default
-
false
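The default format strings above follow strptime conventions, so they can be checked locally before a load. The sketch below uses Python's `datetime.strptime`, which accepts these directives, and applies it to the example values given above. It illustrates the format strings only and makes no claim about how ThoughtSpot parses input internally.

```python
from datetime import datetime

# Format strings copied from the defaults listed above.
DATE_TIME_FORMAT = "%Y%m%d %H:%M:%S"
DATE_FORMAT = "%Y%m%d"
TIME_FORMAT = "%H:%M:%S"

# Example values copied from the parameter descriptions above.
parsed_datetime = datetime.strptime("20011230 01:15:12", DATE_TIME_FORMAT)
parsed_date = datetime.strptime("20011230", DATE_FORMAT).date()
parsed_time = datetime.strptime("01:15:12", TIME_FORMAT).time()

print(parsed_datetime.isoformat())  # 2001-12-30T01:15:12
print(parsed_date.isoformat())      # 2001-12-30
print(parsed_time.isoformat())      # 01:15:12
```

A value that does not match its format string raises `ValueError`, which makes this a cheap pre-flight check on a sample of the source data.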
boolean
use_bit_values
-
(optional) If true, the source CSV uses one bit for boolean values. If false, boolean values are interpreted using the flag boolean_representation. This option is valid for CSV only, and ignored for other types.
- False is represented as 0x0
- True is represented as 0x1
- Data type
-
boolean
- Default
-
false
true_format
-
(optional) Represents True for boolean values in input.
- Data type
-
string
- Default
-
T
false_format
-
(optional) Represents False for boolean values in input.
- Data type
-
string
- Default
-
F
load_options
empty_target
-
(optional) If true, current rows in the target table or file are dropped before loading new data. If false, new rows are appended to the current rows in the target table or file.
- Data type
-
boolean
- Default
-
false
max_ignored_rows
-
(optional) Maximum number of rows that can be ignored for successful load. If number of ignored rows exceeds this limit, the load is aborted.
- Data type
-
integer
- Default
-
0
advanced_options
max_reported_parsing_errors
-
(optional) Maximum number of parsing errors to report back along with the status.
- Data type
-
integer
- Default
-
100
Example: using parameters
{
  "target": {
    "database": "<DB_NAME>",
    "schema": "falcon_default_schema",
    "table": "<TABLE_NAME>"
  },
  "format": {
    "type": "CSV",
    "field_separator": ",",
    "trailing_field_separator": false,
    "enclosing_character": "\"",
    "escape_character": "",
    "null_value": "(null)",
    "date_time": {
      "converted_to_epoch": false,
      "date_time_format": "%Y%m%d %H:%M:%S",
      "date_format": "%Y%m%d",
      "time_format": "%H:%M:%S",
      "skip_second_fraction": false
    },
    "boolean": {
      "use_bit_values": false,
      "true_format": "T",
      "false_format": "F"
    },
    "has_header_row": false,
    "flexible": false
  },
  "load_options": {
    "empty_target": false,
    "max_ignored_rows": 0
  },
  "advanced_options": {
    "max_reported_parsing_errors": 100
  }
}
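A parameter body like the one above can also be assembled programmatically. The following is a minimal sketch, assuming the nested layout of the parameter example; note that the curl request in this section uses flat keys such as `target_database` instead, and this reference does not establish which layout the server expects. The function name `build_start_load_payload` is hypothetical, and the defaults are the ones listed in the parameter descriptions.

```python
import json


def build_start_load_payload(database, table, schema="falcon_default_schema",
                             **format_overrides):
    """Assemble a StartLoad body using defaults listed in this reference."""
    fmt = {
        "type": "CSV",
        "field_separator": ",",
        "trailing_field_separator": False,
        "has_header_row": False,
        "flexible": False,
    }
    # Reject misspelled option names early instead of sending them along.
    unknown = set(format_overrides) - set(fmt)
    if unknown:
        raise ValueError(f"unknown format options: {sorted(unknown)}")
    fmt.update(format_overrides)
    return {
        "target": {"database": database, "schema": schema, "table": table},
        "format": fmt,
        "load_options": {"empty_target": False, "max_ignored_rows": 0},
        "advanced_options": {"max_reported_parsing_errors": 100},
    }


# Hypothetical database and table names for illustration only.
payload = build_start_load_payload("sales_db", "orders", has_header_row=True)
print(json.dumps(payload, indent=2))
```

Serializing with `json.dumps` guarantees quoted keys and correct comma placement, which are easy to get wrong when the body is written by hand.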
Request
curl -i -X POST -b 'JSESSIONID=<GUID-XYZ>' -d '{"target_database": "<DB1>", "target_schema": "<SCHEMA1>", "target_table": "<TABLE1>", "field_separator": ",", "empty_target": false}' https://<TS_CLUSTER>:8442/ts_dataservice/v1/public/loads
Response
Status: 202 Accepted
Content-Type: text/plain
Content-Length: xx
{
"node_address": {
"host": "host",
"port": port
},
"cycle_id": "cycle_id"
}
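Later calls need both the returned node address and the cycle ID, so it can help to pull them out of the response in one place. This is a small sketch, assuming the response body is JSON shaped as shown above and that `port` is numeric; the names `LoadHandle` and `parse_start_load_response` and the sample values are hypothetical.

```python
import json
from typing import NamedTuple


class LoadHandle(NamedTuple):
    """Values from a StartLoad response that later calls depend on."""
    host: str
    port: int
    cycle_id: str


def parse_start_load_response(body: str) -> LoadHandle:
    """Extract node address and cycle ID from a StartLoad response body."""
    data = json.loads(body)
    node = data["node_address"]
    return LoadHandle(
        host=node["host"],
        port=int(node["port"]),
        cycle_id=data["cycle_id"],
    )


# Hypothetical sample body for illustration only.
sample = '{"node_address": {"host": "10.0.0.5", "port": 8442}, "cycle_id": "abc-123"}'
handle = parse_start_load_response(sample)
print(handle.host, handle.port, handle.cycle_id)
```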
Example failure responses
Status: 401 UNAUTHORIZED
Unable to verify user. Please login again.
Status: 403 FORBIDDEN
User does not have required privileges. Please contact your administrator.
Status: 500 INTERNAL SERVER ERROR
error code = INTERNAL, message = Couldn't resolve the authentication service.
Load
Use this API to load your data.
You can load data in multiple chunks for the same cycle ID. All data is uploaded to the ThoughtSpot cluster, but it is not committed until you issue a commit load.
Request
POST /ts_dataservice/v1/public/loads/<cycle_id>
Cookie: <token>
Content-Type: multipart/form-data; boundary=bndry
--bndry
Content-Disposition: form-data; name="file"; filename="sample.csv"
<CSV Data>
--bndry--
Only multipart/form-data is supported.
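Because the Load request must be multipart/form-data, the body has to be framed with boundary lines as in the request above. The sketch below builds such a body by hand with the standard library. The function name `build_multipart_body` is hypothetical; the field name `file`, the file name `sample.csv`, and the boundary `bndry` come from the request example, and the CRLF line endings follow the usual multipart framing.

```python
import uuid


def build_multipart_body(csv_bytes, filename="sample.csv", boundary=None):
    """Return (content_type, body) for a single-file multipart/form-data upload."""
    # Use a random boundary unless the caller supplies one.
    boundary = boundary or uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return f"multipart/form-data; boundary={boundary}", head + csv_bytes + tail


# Hypothetical two-row CSV chunk for illustration only.
content_type, body = build_multipart_body(b"id,name\n1,alpha\n", boundary="bndry")
print(content_type)
print(body.decode("utf-8"))
```

The returned `content_type` goes into the request's Content-Type header, and each chunk for the same cycle ID can be framed the same way.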
Response
Status: 202 Accepted
Content-Type: text/plain
Content-Length: xx
Connection: Close
Upload Complete.
Example failure responses
Status: 401 UNAUTHORIZED
Unable to verify user. Please login again.
Status: 403 FORBIDDEN
User does not have required privileges. Please contact your administrator.
Status: 400 BAD REQUEST
Unable to find table in Falcon. Cannot load data.
Status: 400 BAD REQUEST
Cycle_id=[cycle_id] does not exist.
Status: 400 BAD REQUEST
Cannot not connect to falcon_manager.
Status: 500 INTERNAL SERVER ERROR
error code = INTERNAL, message = Couldn't resolve the authentication service.
CommitLoad
AbortLoad
Use this API to stop loading data.
Example failure responses
Status: 401 UNAUTHORIZED
Unable to verify user. Please login again.
Status: 403 FORBIDDEN
User does not have required privileges. Please contact your administrator.
Status: 500 INTERNAL SERVER ERROR
error code = INTERNAL, message = Couldn't resolve the authentication service.
Status of load
Use this API to get the current status of a load.
Example failure responses
Status: 401 UNAUTHORIZED
Unable to verify user. Please login again.
Status: 403 FORBIDDEN
User does not have required privileges. Please contact your administrator.
Status: 500 INTERNAL SERVER ERROR
error code = INTERNAL, message = Couldn't resolve the authentication service.
Bad records
Use this API to view the bad records file data.
Request
GET /ts_dataservice/v1/public/loads/<cycle_id>/bad_records_file
Cookie: <token>
Content-range: xxx-xxxx
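For illustration, the bad records request above could be prepared like this. It is a sketch under assumptions: the function name `build_bad_records_request`, the host name, and the sample cookie value are hypothetical, the cookie format mirrors the `JSESSIONID` cookie in the StartLoad curl example, and the Content-range header is left out because its value format is not specified here.

```python
import urllib.request
from urllib.parse import quote


def build_bad_records_request(base_url, cycle_id, session_cookie):
    """Prepare (but do not send) a GET for a load's bad records file."""
    url = (
        f"{base_url.rstrip('/')}/ts_dataservice/v1/public/loads/"
        f"{quote(cycle_id, safe='')}/bad_records_file"
    )
    return urllib.request.Request(
        url, headers={"Cookie": session_cookie}, method="GET"
    )


# Hypothetical values for illustration only.
req = build_bad_records_request(
    "https://ts-cluster.example:8442", "abc-123", "JSESSIONID=example-session"
)
print(req.get_method(), req.full_url)
```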
Example failure responses
Status: 401 UNAUTHORIZED
Unable to verify user. Please login again.
Status: 403 FORBIDDEN
User does not have required privileges. Please contact your administrator.
Status: 500 INTERNAL SERVER ERROR
Node does not exist: /tmp/cycle_id.bad_record
Status: 500 INTERNAL SERVER ERROR
error code = INTERNAL, message = Couldn't resolve the authentication service.