Data governance is a very important and complex part of a company’s overall data strategy. While data governance is a heavily overloaded term with a lot of definitions, for the context of this blog post, we’ll define it as the following:
Data governance describes how enterprises put in place a collection of processes and practices that effectively manage data, data assets, and metadata within the platform. This includes controls for data availability, usability, integrity, and security.
When starting to implement a data warehouse such as the Snowflake Data Cloud, there are many different data governance controls that need to be put into place to operate effectively. You’re likely going to need to answer questions such as:
- Who has access to this data or resource?
- How did they get access to it?
- Who has ACCOUNTADMIN (unrestricted) access?
- How do I provision resources or access to resources securely and consistently?
phData has built out a number of tools to both provide functionality and visibility to address data governance principles. Over the last couple of years, you may have heard of Tram and Traverse. Earlier this month, we announced that we’ve combined these two (along with SQLMorph) into our new phData Toolkit.
Let’s take a look at what it can do for you.
Why is Auditing Your Snowflake Account Important?
One of the most common tasks for any database administrator (DBA) is to audit who has access to sensitive information. For businesses with a small amount of databases, schemas, tables, and views this may be a simple task. A DBA could go resource by resource and query who has access to the data and how the privilege was granted.
This doesn’t work at scale.
Another common task is ensuring that personally identifiable information (PII) and other sensitive information is not able to be accessed by unauthorized users. Imagine you’ve integrated your enterprise resource planning (ERP) data, that contains customer and transactional information, into your Snowflake account.
This data is likely full of PII and also contains all the finances of your organization. You would want to restrict this data down to just the users and processes that need this to perform their job function.
How Do I Use the phData Toolkit to Audit my Snowflake Account?
Within the phData Toolkit, there’s a few different tools that help you view and manage your Snowflake account. We’re going to take you through a couple sample questions that you may have to answer during an audit and show you how the phData Toolkit works.
To get started, when you visit the phData Toolkit homepage, you’ll see in the left navigation that there’s a number of different tools available such as:
As you can probably guess from the names, we’re going to be spending a lot of time within Privilege Audit for answering these audit questions.
Let’s start with a common audit question.
Which of my Snowflake Users Have ACCOUNTADMIN?
This is a very basic question but is frequently the starting point for any Snowflake audit. This is because the ACCOUNTADMIN role has admin privileges across your entire Snowflake account and should be restricted to a few administrators strictly for account level restricted actions.
PRO TIP – Users with the ACCOUNTADMIN role can view and operate on all objects in the account, can view and manage Snowflake billing and credit data, and can stop any running SQL statements.
Access to this role (and any other role) can be shown within the Privilege Audit app in the phData Toolkit! In order to view this, open the Privilege Audit app, sign into your phData account (or create one), and then you’ll need to import your Snowflake account into the app. This is done so the tool can read your Snowflake objects and visualize them.
PRO TIP – If you need help connecting Privilege Audit to your Snowflake account, check out this writeup.
Once you’ve hooked up your Snowflake account, inside of Privilege Audit, you’ll be able to search for ACCOUNTADMIN in the top search bar and click into the role. Then click on the “graph” visualization option on the top of the page, click “Ownership Privileges”, and click to show Users.
We’ve already hooked up a test Snowflake account, and this is what it looks like for us:
You can also view this information as a table with linear privilege paths if you prefer.
In the above screenshot, the right column represents the privilege path from ACCOUNTADMIN to objects that are assigned or inherit the ACCOUNTADMIN role. This allows you to quickly see how privileges to the object were assigned and what you may need to change as a result of your audit. Expanded it looks like the following:
So that was an easy example. Let’s take a look at a harder example.
Who Has Access to my Snowflake Table and How Did They Get Access?
In our previous example with our ERP system, we mentioned that there would be a couple tables that have sensitive information within them. We need to ensure that only users who are required to have access to this information are able to access it. We also need to know how users who shouldn’t have access are able to access the data.
In the event that somebody does have improper access to restricted information, you’ll need to determine the best path towards resolution. This will likely be situation dependent but most commonly will be removing a role that was either granted to another role or to a user directly.
In the following example, we’ve taken the same demo Snowflake account and put a couple filters in place. We want to see who has access to the Tram database, the tables within that database, the roles that have access to the objects in that database, and what users are assigned to those roles (or role grants).
You may be squinting to read what any of this graph shows. There’s a lot of information being shown here that would be very complicated to represent within a SQL query output, and that’s where the Privilege Audit tool really comes in handy. You have the ability to visualize your environment, zoom in (so you can actually read it), and connect the dots on how users are accessing your data.
Note, in Privilege Audit, you could also view this same information more easily in a table.
You’re also able to export the table as a csv file for further analysis outside of the toolkit.
Available Out-of-the-box Audit Reports
Within our Privilege Audit tool in the phData Toolkit, we also provide several pre-canned reports that you can run on your Snowflake account. This includes:
- Object counts by type
- The amount of time travel used on each database
- Each securable object and its owner
- Users that are disabled and cannot login to your Snowflake account
- Objects owned by ACCOUNTADMIN
- Role ownership outside standard Snowflake system roles
- Users granted ACCOUNTADMIN (shortcut for our above guide)
- Users granted each standard Snowflake system role
In order to run one of these pre-canned reports, go to Privilege Audit and click Reports in the top navigation bar. For our test account, clicking the object counts by type report yields the following:
We will be continuing to add to this pre-canned report list as we generate more best practices and get feedback from customers.
How Do I Automate Resource and Access Provisioning within Snowflake?
While regularly auditing your data governance controls is a key part of any data strategy, it’s a good idea to put in place a system to ensure resource management and access are reviewed. It’s also important to have repeatable patterns that teams can follow to create these resources.
If your organization is not using Infrastructure as Code (IaC) for your Snowflake, AWS, or Azure resources, we highly recommend it. At its core, IaC is the practice of defining and deploying your resources via configuration files rather than interactive user interfaces or command line interfaces. This allows for repeatable, predictable, and auditable changes to your environment along with redeploying your infrastructure in a new environment with ease.
Within the phData Toolkit, we have a tool called Project Administration. This tool (previously known as Tram), allows you to define models and members. A model might be a user workspace that comes with a schema and a set of tables for every member in the model. As you need to add members, they’re programmatically given what has been established in the model. This allows you to create templates for the pieces of access that a given group of users might have within your system and ensures that the same access is given accordingly.
This tool also gives you the ability to view changes to your environment within a pull request to your configuration repository or integrate with ticketing solutions such as Jira or ServiceNow.
Putting It All Together
The phData Toolkit gives you the ability to perform many data governance functions against your Snowflake instance. We give you the ability to run either custom or pre-canned reports against your Snowflake account, programmatically create and administer your Snowflake account via infrastructure-as-code, and allow you to easily audit your Snowflake instance.
As your Snowflake usage continues to grow, the types of questions you’ll have to answer will get more granular and complicated to answer. Let the phData Toolkit make it easier for you!