December 11, 2024

How to Use ThoughtSpot For Data Engineer USER

By Lynda Chao

A data engineer’s primary role in ThoughtSpot is to establish data connections for their business and end users to utilize. They are responsible for the design, build, and maintenance of the data infrastructure that powers the analytics platform. 

In this blog, we will cover the essentials around how to connect to popular data connections in ThoughtSpot, data modeling, and setting up your business users for success. 

Data Connections Supported in ThoughtSpot

ThoughtSpot is compatible with over 20 data connections:

Amazon Athena

Amazon Aurora MySQL

Amazon Aurora PostgreSQL

Amazon RDS MySQL

Amazon RDS PostgreSQL

Amazon Redshift

Azure Synapse

Databricks

Denodo

Dremio

Generic JDBC

Google AlloyDB for PostgreSQL

Google BigQuery

Google Cloud SQL for MySQL

Google Cloud SQL for PostgreSQL

Google Cloud SQL for SQL Server

MySQL

Oracle

PostgreSQL

Presto

SingleStore

Snowflake

SQL Server

Starburst

Trino

Click here to view the documentation on how to connect to all of these data sources. In this blog, we will focus on how to connect to two types of common data connections:

  1. Snowflake AI Data Cloud

  2. CSV

Connecting to Data in ThoughtSpot

How to Connect to Snowflake in ThoughtSpot

  1. Ensure default roles in Snowflake and usage have been granted on the database, warehouse, schema, and tables that will be added into ThoughtSpot as a data connection.

  2. Select DATA in the top navigation bar and click on Connections.

If you are in the new user interface, then click on the 3×3 square icon in the top right corner and click on Developer and then Connections in the left Data workspace pane. The new user interface is still in early access and it will eventually be rolled out to the general audience.

  1. Create a name for your connection and then select Continue.

  1. Select your Authentication type and enter the connection details. There are 6 authentication types:

  1. Optional – Configure proxy settings and additional key-value pairs for your Snowflake Connection by selecting the Advanced Config.

  2. Select Continue to select the tables connected to load into ThoughtSpot.

  1. Select Create Connection after table selections.

  2. Established connections appear on the Data > Connections page.

How to Connect to CSV and Excel Files in ThoughtSpot

Only users who belong to a group that has the privilege Have administration privileges or Can upload user data can upload their own data from the browser. Contact your ThoughtSpot support to ensure this feature is enabled.

  1. Select DATA in the top navigation bar and click on Utilities.

If you are in the new user interface, then click on the 3×3 square icon in the top right corner and click on Developer and then Utilities in the left Data workspace pane.  

  1. Click on Upload CSV, upload your file, and answer the two questions at the bottom of the file upload window.

  1. Click Next to review column names and reconfigure any column names as needed.

  1. Click Next to review column types and reconfigure any types as needed.

  1. Click Upload 

  2. Uploaded files appear on the Data > Connections page.

Data Modeling Loaded Tables in ThoughtSpot

There are two methods of data modeling loaded tables in ThoughtSpot: one for making a few small changes and another for making many changes. Both methods will change the model.

Making Small Changes

If you need to make small changes, such as column names or column types, then change the table’s data model in the ThoughtSpot web interface: 

  1. Select the table you uploaded and want to edit; in this example, I am clicking on the Payments table. 

  1. Here in the system table, you can rename columns, add descriptions, change the column type, etc. 

  1. Click Save Changes to ensure your updates are saved.

Making Large Changes

To make large changes, download the system-wide data model. This is typically an Excel file. This modeling file contains a list of all your Worksheets, tables, and columns. 

  1. Go to Data in the top navigation bar, click Settings, and then Business Data Model.

If you are in the new user interface, then click on the 3×3 square icon in the top right corner and click on Developer and then Utilities in the left Data workspace pane. Then click on Import/Export Modeling TSV.

  1. Click Download to download the model file

  2. Change the model in Excel, vi/vim, or your text editing tool of choice. 

  3. Do not modify any value in columns that contain DoNotModify in the field under the column heading. 

  4. Upload changes back into the system, where ThoughtSpot automatically re-indexes your file.

Important Note

Once you finalize your data models, worksheets, and views, inherit the models from the base tables they reference. However, changes made to the data models afterward won’t update existing worksheets. To avoid manual updates, ensure your data models meet all requirements before creating worksheets. 

How to Create Joins in ThoughtSpot

Table Joins

Table joins are joins created at the system table level (uploaded data). Table-level joins allow users to make a single search and retrieve data across multiple tables in your database. 

  1. Select the table in your Connections area on which you want to create a join.

  2. Click on Joins and then + Add Join.

  1. Create your join

    1. Table 1  is the origin table

    2. Table 2 is the other table you are joining to, typically a dimension table

    3. Make sure the fields you are joining are the same data types. If they are not the same data type, you will need to update the data source in Snowflake, Excel, or whatever backend it is being loaded from.

  1. Click on Save. 

  2. You should now see any join you created under Joins.

Worksheet Joins

Worksheets are logical views of data built from system tables loaded into ThoughtSpot. Worksheet joins are created at the Worksheet level, where the worksheet joins to a table or view to create a relationship that allows users to search for additional views and tables together. The process of creating joins looks similar to creating table joins. 

You can see in the worksheet that was created at the system table level:

You can click on the drop-down menu Joins within this Worksheet and select Joins from this worksheet to view and create joins at the worksheet level instead:

You must have either the Can administer ThoughtSpot privilege, or the Can manage data privilege to create a join relationship. If you aren’t an administrator, you also need edit permissions on the table, view, or worksheet. When creating a join between the columns in two data sources, the columns being linked must have the same data type and the same meaning.

How to Create Worksheets in ThoughtSpot

ThoughtSpot worksheets are logical views of data to model complex datasets. The purpose of worksheets is to simplify access to data so that end-users can search for answers and build Liveboards. They are built from table joins, make it easier to rename fields, select fields necessary to the business, hide fields, and change default aggregations. 

  1. In the Data workspace pane, select + Create new and click on Worksheet.

  1. Click on Choose Sources or the + sign after Sources in the left-hand data pane.

  1. Select the system table data sources to build the worksheet and hit Close.

  1. Double-click on the field names in the lefthand column to add the columns to the view.

  1. Edit the title of the worksheet and click Save.

  2. The worksheet is now visible on the Home page of the Data Workspace page.

Reasons to Use a Worksheet

  • Group multiple related tables together in a logical way

  • Pre-join multiple tables together

  • Simplify access to data for end-users

  • Give users access to only a part of the underlying data

  • Include the derived column using the formula

  • Rename columns to make the data easier to search

  • Build in specific filters or aggregations

Sharing Objects for End-User Enablement in ThoughtSpot

All objects a developer or user creates must be shared with others in order to be used. Sharing tables and worksheets enables end-users to search for Answers and develop Liveboards. Depending on the permissions, users can also edit joins, or you can lock that feature to prevent end-users from modifying the data models. 

  1. Select the objects you want to share and click on Share.

  1. Select a user, user group and edit user permissions for the shared objects. 

  1. An email notification will go out to users or groups you select to Share objects with.

Closing

Overall, loading data and connecting to data in ThoughtSpot is straightforward – just make sure you have all the necessary permissions and access to your database before you get started. ThoughtSpot also makes it easy for data engineers to create simple data models in its interface, but if you require more complex modeling, you can download the data model to edit it and reupload the model back into ThoughtSpot. Most important of all, be sure to share the data objects you loaded and created so end-users can use them to carry out their self-service analytics! 

If you need help with setting up ThoughtSpot, creating data models, or improving your analytics setup, we are here to support you. Contact us to get expert guidance and make the most of your data.

Data Coach is our premium analytics training program with one-on-one coaching from renowned experts.

Accelerate and automate your data projects with the phData Toolkit