A data engineer’s primary role in ThoughtSpot is to establish data connections for their business and end users to utilize. They are responsible for the design, build, and maintenance of the data infrastructure that powers the analytics platform.
In this blog, we will cover the essentials around how to connect to popular data connections in ThoughtSpot, data modeling, and setting up your business users for success.
Data Connections Supported in ThoughtSpot
ThoughtSpot is compatible with over 20 data connections:
Amazon Athena
Amazon Aurora MySQL
Amazon Aurora PostgreSQL
Amazon RDS MySQL
Amazon RDS PostgreSQL
Amazon Redshift
Azure Synapse
Databricks
Denodo
Dremio
Generic JDBC
Google AlloyDB for PostgreSQL
Google BigQuery
Google Cloud SQL for MySQL
Google Cloud SQL for PostgreSQL
Google Cloud SQL for SQL Server
MySQL
Oracle
PostgreSQL
Presto
SingleStore
Snowflake
SQL Server
Starburst
Trino
Click here to view the documentation on how to connect to all of these data sources. In this blog, we will focus on how to connect to two types of common data connections:
Connecting to Data in ThoughtSpot
How to Connect to Snowflake in ThoughtSpot
Ensure default roles in Snowflake and usage have been granted on the database, warehouse, schema, and tables that will be added into ThoughtSpot as a data connection.
Select DATA in the top navigation bar and click on Connections.
If you are in the new user interface, then click on the 3×3 square icon in the top right corner and click on Developer and then Connections in the left Data workspace pane. The new user interface is still in early access and it will eventually be rolled out to the general audience.
Create a name for your connection and then select Continue.
Select your Authentication type and enter the connection details. There are 6 authentication types:
Service Account
Key Pair
OAuth
Ensure OAuth for Snowflake is configured first
External OAuth
Ensure Azure AD OAuth or Okta OAuth is configured first.
For AZURE AD OAuth – When the Microsoft sign-in screen appears, sign in to your account using the email and password of your Microsoft account associated with Azure.
OAuth with PKCE
External OAuth with PCKE
Refer to the Snowflake connection reference for more information on each of the specific attributes you must enter for your connection.
Optional – Configure proxy settings and additional key-value pairs for your Snowflake Connection by selecting the Advanced Config.
Select Continue to select the tables connected to load into ThoughtSpot.
Select Create Connection after table selections.
Established connections appear on the Data > Connections page.
How to Connect to CSV and Excel Files in ThoughtSpot
Only users who belong to a group that has the privilege Have administration privileges or Can upload user data can upload their own data from the browser. Contact your ThoughtSpot support to ensure this feature is enabled.
Select DATA in the top navigation bar and click on Utilities.
If you are in the new user interface, then click on the 3×3 square icon in the top right corner and click on Developer and then Utilities in the left Data workspace pane.
Click on Upload CSV, upload your file, and answer the two questions at the bottom of the file upload window.
Click Next to review column names and reconfigure any column names as needed.
Click Next to review column types and reconfigure any types as needed.
Click Upload
Uploaded files appear on the Data > Connections page.
Data Modeling Loaded Tables in ThoughtSpot
There are two methods of data modeling loaded tables in ThoughtSpot: one for making a few small changes and another for making many changes. Both methods will change the model.
Making Small Changes
If you need to make small changes, such as column names or column types, then change the table’s data model in the ThoughtSpot web interface:
Select the table you uploaded and want to edit; in this example, I am clicking on the Payments table.
Here in the system table, you can rename columns, add descriptions, change the column type, etc.
Click Save Changes to ensure your updates are saved.
Making Large Changes
To make large changes, download the system-wide data model. This is typically an Excel file. This modeling file contains a list of all your Worksheets, tables, and columns.
Go to Data in the top navigation bar, click Settings, and then Business Data Model.
If you are in the new user interface, then click on the 3×3 square icon in the top right corner and click on Developer and then Utilities in the left Data workspace pane. Then click on Import/Export Modeling TSV.
Click Download to download the model file
Change the model in Excel, vi/vim, or your text editing tool of choice.
Do not modify any value in columns that contain DoNotModify in the field under the column heading.
Upload changes back into the system, where ThoughtSpot automatically re-indexes your file.
Important Note
Once you finalize your data models, worksheets, and views, inherit the models from the base tables they reference. However, changes made to the data models afterward won’t update existing worksheets. To avoid manual updates, ensure your data models meet all requirements before creating worksheets.
How to Create Joins in ThoughtSpot
Table Joins
Table joins are joins created at the system table level (uploaded data). Table-level joins allow users to make a single search and retrieve data across multiple tables in your database.
Select the table in your Connections area on which you want to create a join.
Click on Joins and then + Add Join.
Create your join
Table 1 is the origin table
Table 2 is the other table you are joining to, typically a dimension table
Make sure the fields you are joining are the same data types. If they are not the same data type, you will need to update the data source in Snowflake, Excel, or whatever backend it is being loaded from.
Click on Save.
You should now see any join you created under Joins.
Worksheet Joins
Worksheets are logical views of data built from system tables loaded into ThoughtSpot. Worksheet joins are created at the Worksheet level, where the worksheet joins to a table or view to create a relationship that allows users to search for additional views and tables together. The process of creating joins looks similar to creating table joins.
You can see in the worksheet that was created at the system table level:
You can click on the drop-down menu Joins within this Worksheet and select Joins from this worksheet to view and create joins at the worksheet level instead:
You must have either the Can administer ThoughtSpot privilege, or the Can manage data privilege to create a join relationship. If you aren’t an administrator, you also need edit permissions on the table, view, or worksheet. When creating a join between the columns in two data sources, the columns being linked must have the same data type and the same meaning.
How to Create Worksheets in ThoughtSpot
ThoughtSpot worksheets are logical views of data to model complex datasets. The purpose of worksheets is to simplify access to data so that end-users can search for answers and build Liveboards. They are built from table joins, make it easier to rename fields, select fields necessary to the business, hide fields, and change default aggregations.
In the Data workspace pane, select + Create new and click on Worksheet.
Click on Choose Sources or the + sign after Sources in the left-hand data pane.
Select the system table data sources to build the worksheet and hit Close.
Double-click on the field names in the lefthand column to add the columns to the view.
Edit the title of the worksheet and click Save.
The worksheet is now visible on the Home page of the Data Workspace page.
Reasons to Use a Worksheet
Group multiple related tables together in a logical way
Pre-join multiple tables together
Simplify access to data for end-users
Give users access to only a part of the underlying data
Include the derived column using the formula
Rename columns to make the data easier to search
Build in specific filters or aggregations
Sharing Objects for End-User Enablement in ThoughtSpot
All objects a developer or user creates must be shared with others in order to be used. Sharing tables and worksheets enables end-users to search for Answers and develop Liveboards. Depending on the permissions, users can also edit joins, or you can lock that feature to prevent end-users from modifying the data models.
Select the objects you want to share and click on Share.
Select a user, user group and edit user permissions for the shared objects.
An email notification will go out to users or groups you select to Share objects with.
Closing
Overall, loading data and connecting to data in ThoughtSpot is straightforward – just make sure you have all the necessary permissions and access to your database before you get started. ThoughtSpot also makes it easy for data engineers to create simple data models in its interface, but if you require more complex modeling, you can download the data model to edit it and reupload the model back into ThoughtSpot. Most important of all, be sure to share the data objects you loaded and created so end-users can use them to carry out their self-service analytics!
If you need help with setting up ThoughtSpot, creating data models, or improving your analytics setup, we are here to support you. Contact us to get expert guidance and make the most of your data.