Search data error reference

This reference identifies the error codes that can appear when you use Search data in ThoughtSpot, including the cause and suggested resolution.

Search data error codes


11026

Description

Nesting group-aggregate formulas is not supported.

Details

This error is displayed as a query generation error when a query plan detects that the underlying formula is defined with nested group functions. An example of nested group function is the following where:

  • the inner group aggregate is a level of detail calculation for sales at the country level.

  • the outer group aggregate creates a moving sum of the country sales that ignores any date filters. This ensures the moving calculation is correct when date filters are applied.

Sales by Country ( worksheet formula) = sum(amount), {country}, query_filters())

3 Month Average of Sales by Country= group_aggregate(moving_average(group_aggregate(Sales by Country,2,0,date),query_groups(), query_filters()-{date})

Cause

ThoughtSpot does not support nested group functions. Refer to the formula reference for list of all group functions.

Resolution

A data modelling solution is required to calculate the inner group function. In this example this could be one of the following constructs to pre-calculate the sales by country value:

  • a table or column in the underlying database;

  • a ThoughtSpot Search based View;

  • a ThoughtSpot SQL based View.


11027

Description

Query must include at least one base table column.

Details

This error is displayed when the search data query has not selected a token that references a column from the underlying database.

Cause

With formulas, it is possible to create ThoughtSpot columns that do not reference the underlying database. The following is a simple example formula. If a search is defined with this single token, then the error will be displayed.

Current Date = today()

A second example could involve a parameter, where the user has created a formula to calculate the number of days between the date specified by the parameter and the current date. As this formula does not reference a column in the database, this would return an error if this formula was the only token in the search.

Days Difference = diff_days(parameterDate, today())

Resolution

Update the search to include a column that references the underlying database.


11028

Description

Data samples are not supported for models/Worksheets with chasm traps. Use Search to get sample data.

Details

When a user selects the Data Samples tab from a Worksheet or model, ThoughtSpot attempts to display 50 rows of data. If a chasm trap query plan is detected, then a warning message is displayed to the user.

Cause

Schemas that may have multiple fact tables are joined through shared dimensional tables. This is known as a chasm trap. ThoughtSpot supports chasm traps for analysis. However, these are not supported for Worksheet data samples. Refer to the Chasm trap article for further details.

Resolution

Leverage Search Data to review columns and their values.


11029

Description

The table data could not be retrieved. Table data is not supported for Worksheets with aggregate column filters.

Details

Worksheets support the ability to define filters based upon aggregate formulas. This article provides an example for how aggregate filters can be used to protect PII information.

Cause

Worksheet data samples do not support the complex query planning required to return results with aggregate filters.

Resolution

Leverage Search Data to review columns and their values.


11032

Description

ThoughtSpot does not currently support complex usages of median, rank, rank_percentile, percentile, and aggregate SQL passthrough functions.

Details

This error is displayed as a query generation error when the usage of median, rank, rank_percentile, and aggregate SQL passthrough functions are used in conjunction with an underlying query plan of a fan or chasm trap. Refer to this community article for further details regarding these complex query patterns.

The following is a simple example of a fan trap. Assume the data model has a central fact table for Sales. This is joined as a many-to-one to a region dimension table and another many-to-one join on the customer table.

[Sales] by [Region] [Count Customers]

The [Count Customers] token results in a fan trap. This ensures that there is no overcounting of the results.

Cause

These functions do not currently support advanced formula planning. This is required to ensure that the results that are returned are correct.

Resolution

For certain situations it may be possible to leverage the group_aggregate function to force an independent query plan for these functions. The fan trap example is extended to include a function to return the median value. This requires a formula to be defined:

fxMedian = median(sales)

The resulting search token is the following, which results in an error:

[Sales] by [Region] [Count Customers] [fxMedian]

The formula can be updated with the group_aggregate definition. Note that after the query_groups() section there is an additional definition of ‘+{}’. This is required so that query generation does not attempt to simplify the query plan.

fxMedian = group_aggregate(median(sales),query_groups()+{},query_filters())

11035

Description

Unsupported query: unique count and a rank-related function coming from the same fact table.

Details

last_value is a ThoughtSpot formula function introduced in 9.12.0.cl . This function is designed to provide analysts the capabilities to create formulas for semi-additive measures. An example could be inventory balances.

For example: "How much stock does Acme Corp maintain for each Region?"

A secondary metric could be Average Stock on Hand. A common definition for this is: balance of stock divided by the unique count of locations.

In ThoughtSpot this formula could be defined as:

last_value(sum(balance),query_groups(),{transaction date}) / unique count(location id)

An alternative example could be a user asking the question as separate measures. For example: "Show me the balance and unique count of locations for each region".

In this scenario the search would return two measure columns: Balance & Unique Count Locations.

Cause

In the example defined, the last value aggregate column is ‘balance’ and the unique count column is ‘location id’. Error 11035 is returned when the unique count column, ‘location id’, is derived from either the same fact table as balance or as a dimensional table that is joined to the fact table.

Resolution

Unique count can be combined with last_value across a chasm trap query. Therefor, pending resolution for this limitation, an analyst can leverage group_aggregate to force a chasm trap query.

For example: The group_aggregate definition ensures a chasm trap is generated and the inclusion of query_groups()+{} ensures that this query plan is not optimized.

group_aggregate(unique count(location id), query_groups()+{}, query_filters())

The complete formula would be:

last_value(sum(balance),query_groups(),{transaction date}) / group_aggregate(unique count(location id), query_groups()+{}, query_filters()). ---

11036

Description

Unsupported query: nested aggregate function, for example, fx = max(avg(c1)).

Details

This error is displayed in the formula editor when an aggregate formula is defined directly within another aggregate formula, for example, average(sum(sales)). Note that this error may be displayed when an aggregate is wrapped around a column, such as average(total sales). In this scenario, the worksheet column would be defined with an aggregate formula, such as total sales = sum(sales).

Cause

can’t use an aggregate function within an aggregate function.

Resolution

To resolve this, group functions should be used to force the query generation to create a CTE query plan. That is, assume the business requirement is to return the average daily sales for each product. the formula would be defined as the following:

average daily sales = average(group_aggregate(sum(sales),query_groups()+{date(transactionDate)}, query_filters()))

Where the following logic is applied

group_aggregate(sum(sales),query_groups()+{date(transactionDate)}, query_filters()) → calculate the daily sum of sales.

average() → calculate the average of the daily sales for the level of detail of the search.

The resulting query would be: [Product] [Average Daily Sales].


11037

Description

Formulas that mix aggregates and non-aggregate expressions are not supported.

Details

This error is displayed in the formula editor when a formula incorrectly includes a combination of an aggregate expression and a non-aggregate expression (non-group-by).

The following is an example of a formula that results in an error. Where

  • region='apac' is the non-aggregate

  • sum(sales) is an aggregate

fx = if(region='apac' then sum(sales) else 0.

Cause

Historically these formulas could be written in ThoughtSpot. However, these often resulted in SQL generation queries. In the example above, if the region column is not included in the search, then a group by sql query would be returned. This is because the:

  • aggregate function is calculated across all rows of data : sum(sales).

  • non-aggregate is a calculation at the row level : region='apac'.

Resolution

The formula must be re-written with the following as examples:

Option 1: wrap the non-aggregate column in an aggregate function.

fx = if(max(region)='apac' then sum(sales) else 0

Option 2: re-write the function as an aggregate if formula.

fx = sum_if(region='apac', sales)


22001

Description

Multiple query paths detected. The selected path is the first table based upon alphabetical sorting of the table names.

Details
Simple example

The following image represents a chasm trap data model. Refer to this article for further details on chasm traps and attribution.

Scenario 1: A user has selected a column (or columns) from the group dim table and the datetime dim table. This results in two possible root tables and resulting paths to join these columns:

Option 1: Join through the membership fact table.

Option 2: Join through the group tag fact table.

These root tables are considered equal as both paths have the same number of children nodes. That is, both paths join to group dim and datetime dim. In this scenario, ThoughtSpot will select the path based upon the alphabetical sorting of the root tables. As G is sorted before M, the group tag fact table is selected.

A chasm trap example where the membership fact table is joined to entity dim

Scenario 2: A user has selected a column (or columns) from the group dim table, datetime dim table, and the entity dim table. In this scenario, the path via membership fact results in three children nodes: entity dim, group dim, and datetime dim. Whereas group tag fact has only two children: group dim and datetime dim. Therefore, the resulting query joins through the membership fact table.

Extended example

The following image represents an extended chasm trap data model. Note that F indicates a fact table and D indicates a dimensional table. Joins are many-to-one from the fact tables to the dimensional tables.

A complicated chasm trap which shows fact table one joined to dim tables zero

Scenario Query 1: D0 D1 D2 D3 D4 D5

This query will NOT display a warning message. F1 provides coverage for all selected columns.

Scenario Query 2: D0 D1 D2 D3 D4 D5 D6 D7 D8 D9

This query will NOT display a warning message where the following rules have been applied:

  • F1 is the first maximal root. That is, this fact table has the most related dimensional tables that meet the columns selected.

    • In this case, F1: D0 D1, D2, D3, D4, D5 (six children nodes).

    • Note that these columns are not removed from the context for future evaluation.

  • F2 and F3 have a tie as both have three related dimensional tables, when those columns from the first maximal root are removed. A tie breaker is required to determine which fact table to join D7 through. That is, this is available for both F2 and F3.

    • F2: D6, D7, D8 (three children nodes)

    • F3: D7, D8, D9 (three children nodes)

  • F2 and F3 have an equal number of related dimensional tables. Therefore, the original context is considered as the first tie breaker.

    • F2: D3, D4, D6, D7, D8 (five children nodes)

    • F3: D3, D7, D8, D9 (four children nodes)

    • F2 is considered as the second maximal root and F3 the third.

    • Therefore D7 is joined through F2 rather than F3.

Scenario Query 3: D0 D1 D2 D3 D5 D6 D7 D8 D9

This query will display a warning message where the following rules have been applied. Specifically, the warning is in relation to which table D7 and D8 are joined through.

  • F1 is the first maximal root. That is, this fact table has the most related dimensional tables that meet the columns selected.

    • In this case, F1: D0, D1, D2, D3, D5 (five children nodes)

    • Note that these columns are not removed from the context for future evaluation.

  • F2 and F3 have an equal number of related dimensional tables.

    • F2: D6, D7, D8 (three children nodes)

    • F3: D7, D8, D9 (three children nodes)

    • The original context does not break the tie between F2 and F3. This is because both are joined to D3 and neither are joined to D5.

    • Therefore, alphabetical sorting of the table name is required to break the tie. F2 is considered as the second root as this is sorted first. Therefore, D7 and D8 are joined through F2 rather than F3.

Resolution

Consider the following options where the desired behaviour is to join via the membership fact:

Option 1

Rename the tables so that membership fact sorts first. That is, add a prefix to membership fact such as a_membership_fact.

Option 2

For TABLE queries, a column can be selected from the entity dim table. This can be marked as hidden. Hidden columns are included in the query plan. Note that for CHART queries, not visualised columns are not included in the query plan.


Was this page helpful?