May 8, 2024

What Is Coalesce? Everything You Need to Know

By Justin Delisi

As the first visual transformation solution for building data pipelines quickly and efficiently at scale within the Snowflake Data Cloud, Coalesce has begun to make a name for itself in the data warehouse space. 

This guide will explain everything you need to know about Coalesce, including:

  • What Coalesce is exactly, and what it is used for.

  • Why you should consider using Coalesce and how it compares to other transformation tools.

  • Who on your team should use Coalesce, and what use cases it can solve for your organization.

What Is Coalesce?

Coalesce is a data transformation platform specifically designed for Snowflake. It is a hybrid development environment that combines code-first and GUI capabilities, allowing users to build complex transformations visually or write code directly. With Coalesce, users can extend and scale their projects using customizable templates for frequently used transformations and automatically generate standardized, best-practice SQL. 

What Is Coalesce Used For?

Data Pipelines

Creating data pipelines manually can be an error-prone and time-consuming process. Massive code bases written by different developers in different ways can be cumbersome to maintain, and once a pipeline is built, its documentation is difficult to keep current. Coalesce alleviates these problems by providing:

  • A visual interface where objects can be created quickly from templates without writing code, increasing productivity.

  • Full support for custom code alongside the visual interface, so you are not limited in how you build pipelines.

  • Object templates that can be customized so that your developers all create the code the way you want it.

  • Auto-generated documentation so that anyone can easily understand what the pipeline is doing and how they can use the data for insights.

Data Transformations

Coalesce is built completely on top of Snowflake, so it can use every feature Snowflake has to offer. Any transformation your developers can write in Snowflake SQL can be built in Coalesce, either in code or with the click of a button. With its visual interface and templated structure, transformations can be standardized, allowing your engineers to develop at speed.

Why Use Coalesce?

As a fully cloud-based transformation solution, Coalesce helps you get the most out of your Snowflake investment. The ability to use all of Snowflake’s features while designing and deploying data warehousing workloads with ease at enterprise scale makes Coalesce a game changer. All of this while your data stays within Snowflake the entire time, utilizing all the security and performance that Snowflake offers.

How Does Coalesce Compare to Other Data Transformation Tools?

User Interface

Unlike some tools that are entirely code-based or purely visual, Coalesce offers a hybrid solution. All objects are created using nodes, which give non-technical users the ability to point and click to build data pipelines but also allow technical users the ability to write as much code as needed for complex transformations.

The simple interface also makes it easier for data analysts and engineers to collaborate on building data pipelines instead of only catering to a deeply technical group of users.

Templating of Objects

Coalesce streamlines data pipeline creation by using templating for data transformations, which translates to faster development and more consistent pipelines. Coalesce comes with pre-built node templates that capture the logic for common objects such as tables, views, and stages. Users can utilize these built-in templates or create their own custom nodes using Jinja.

Jinja is a templating language to write code with Python-like syntax to create advanced templates with variables and logic. With this, you can dynamically generate data transformation steps based on your specific data needs and business rules.
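To make the idea of a templated object concrete, here is a minimal sketch in Python. It uses the standard library's string.Template as a stand-in for Jinja and does not reproduce Coalesce's actual node template format. The function name render_create_table, the STG_ prefix rule, and the table and column names are all hypothetical.

```python
from string import Template

# Hypothetical node template: a stand-in for a Jinja template,
# not Coalesce's real node definition format.
CREATE_TABLE_TEMPLATE = Template(
    "CREATE OR REPLACE TABLE $table_name (\n$column_lines\n);"
)


def render_create_table(name, columns, prefix="STG_"):
    """Render a CREATE TABLE statement, enforcing an assumed naming convention."""
    # Business rule baked into the template: upper-case names with a fixed prefix.
    table_name = name.upper()
    if not table_name.startswith(prefix):
        table_name = prefix + table_name
    # One indented line per (column name, data type) pair.
    column_lines = ",\n".join(
        f"    {col.upper()} {dtype.upper()}" for col, dtype in columns
    )
    return CREATE_TABLE_TEMPLATE.substitute(
        table_name=table_name, column_lines=column_lines
    )


if __name__ == "__main__":
    print(render_create_table("customers", [("id", "number"), ("email", "varchar")]))
```

Because the naming rule lives in the template rather than in each developer's head, every object rendered from it comes out the same way, which is the point of templating in the first place.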

Automated Documentation

Writing documentation is rarely a developer’s favorite task. It is time-consuming and tedious, and it keeps developers from higher-value work.

Coalesce automatically generates documentation as developers work, which frees them from this burden and ensures that essential documentation is written for every pipeline created. Documentation is also taken from within the transformation work, meaning you don’t need to create a transformation in one area and then move to another to add comments about the transformation for the auto-documentation to pick up.

Built-In CI/CD

Coalesce comes with built-in CI/CD features that allow deployments and refreshes through the Coalesce command line interface (CLI), its API, or the web UI. It even comes with (and strongly recommends) Git integration to manage changes within your pipelines.

Column-Aware Architecture

Uniquely, Coalesce is built on a column-aware architecture, meaning it approaches data at the lowest granularity possible: the column. This powers column-level lineage, letting developers see where each column comes from and how it is derived, so they can assess the impact of an upstream change before making it. Coalesce even offers a feature to propagate changes downstream for ease of development.

With just a few clicks of the mouse, you can add a column to, or remove it from, as many (or as few) downstream nodes as you want without ever writing code.
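Column-level lineage can be pictured as a graph in which each column points to the columns derived from it. The sketch below is a rough illustration of that idea in Python, not a description of how Coalesce stores lineage internally. The LINEAGE mapping, the node and column names, and the downstream_columns function are all hypothetical.

```python
from collections import deque

# Hypothetical lineage: each (node, column) maps to the downstream columns derived from it.
LINEAGE = {
    ("SRC_ORDERS", "AMOUNT"): [("STG_ORDERS", "AMOUNT")],
    ("STG_ORDERS", "AMOUNT"): [
        ("FCT_SALES", "TOTAL_AMOUNT"),
        ("FCT_REFUNDS", "AMOUNT"),
    ],
    ("FCT_SALES", "TOTAL_AMOUNT"): [("RPT_REVENUE", "REVENUE")],
}


def downstream_columns(start, lineage):
    """Return every column reachable downstream of `start`, in breadth-first order."""
    seen = set()
    order = []
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for child in lineage.get(current, []):
            if child not in seen:
                seen.add(child)
                order.append(child)
                queue.append(child)
    return order


if __name__ == "__main__":
    # List everything that a change to SRC_ORDERS.AMOUNT could affect.
    for node, column in downstream_columns(("SRC_ORDERS", "AMOUNT"), LINEAGE):
        print(f"{node}.{column}")
```

Walking the graph from a single source column yields the set of downstream columns a change would touch, which is the information needed both for impact analysis and for propagating a column change.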

Who Uses Coalesce?

Because Coalesce has such an intuitive interface, it can be used by a much broader audience than other transformation solutions. Here’s how Coalesce can be used by different users within your organization:

Data Engineers

For all the reasons listed in this blog, Coalesce can be a fantastic transformation tool. Even though it is easy to use, Data Engineers should be its primary users. Engineers can use their knowledge of data and Snowflake to create more efficient pipelines with Coalesce templates while relying on the UI and auto-generated docs to take care of simpler tasks.

Data Analysts

While Data Analysts (DA) may not have the knowledge and experience of Data Engineers (DE) when it comes to building data pipelines, with Coalesce they would be empowered to create their own pipelines using established templates without involving a DE. This benefits everyone involved as the DEs spend less time creating cookie-cutter pipelines, and the DAs/business users don’t need to wait for a DE to be available to get their data into the hands of the business. 

Additionally, a tool like Coalesce in the hands of DAs can help eliminate data transformations within their Business Intelligence (BI) tool, alleviating problems with consistency, transparency, and data governance.

What Are the Top Use Cases for Coalesce?

Coalesce is the ideal tool for anyone looking to create data pipelines and transform their data within Snowflake quickly and efficiently. These are the best use cases where Coalesce really stands out:

Standardization and Reusability of Objects

Every organization has its own business rules surrounding data. Whether it’s naming conventions, data types, or object constraints, all of these can be enforced within templated objects in Coalesce, ensuring all objects in your data warehouse are standardized. By the same token, many objects within a data warehouse are built in very similar ways. Because Coalesce’s templated objects are reusable, such objects can be created more quickly than ever before.

Usage of Everything Snowflake Has to Offer

Because Coalesce was built exclusively for Snowflake, it can use any feature Snowflake offers in an easy, reusable way. Along with all the basic features, Coalesce can work with:

  • Streams and Tasks: Utilizing Streams and Tasks for Change Data Capture (CDC) in Snowflake can be a complex process. With Coalesce, developers can create User-Defined Nodes (UDN) to take the complexity out of using these features.

  • Dynamic Tables: Instead of creating a target table for your transformations and inserting data into it, Dynamic Tables allow users to declaratively define the result of their transformations using simple SQL statements. With Coalesce, a UDN can be created to use this powerful new feature.

  • Cortex Functions: Cortex functions allow users to create solutions for many machine learning use cases, such as time-series forecasting and anomaly detection, and (you guessed it) can be created as UDNs within Coalesce.
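As a rough illustration of the kind of statement a Dynamic Table node would need to emit, the Python sketch below assembles a Snowflake CREATE DYNAMIC TABLE statement as text. It is not Coalesce's UDN format, and it does no quoting or validation of its inputs. The function dynamic_table_sql and the table, warehouse, and column names are hypothetical, and the statement should be checked against the current Snowflake documentation before use.

```python
def dynamic_table_sql(name, query, warehouse, target_lag="1 hour"):
    """Build a Snowflake CREATE DYNAMIC TABLE statement as text.

    The caller declares the desired result (`query`) and how fresh it must be
    (`target_lag`); Snowflake handles the refreshes.
    """
    return (
        f"CREATE OR REPLACE DYNAMIC TABLE {name}\n"
        f"  TARGET_LAG = '{target_lag}'\n"
        f"  WAREHOUSE = {warehouse}\n"
        f"  AS\n"
        f"{query.strip()};"
    )


if __name__ == "__main__":
    # Hypothetical object names, for illustration only.
    print(
        dynamic_table_sql(
            name="FCT_DAILY_SALES",
            query="SELECT ORDER_DATE, SUM(AMOUNT) AS TOTAL FROM STG_ORDERS GROUP BY ORDER_DATE",
            warehouse="TRANSFORM_WH",
            target_lag="15 minutes",
        )
    )
```

Note that there is no INSERT or MERGE step anywhere: the query alone defines the table's contents, which is what "declaratively define the result" means in practice.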

How Much Does Coalesce Cost?

One of the best parts of Coalesce is that since it exclusively runs in Snowflake, all transformations are performed within Snowflake. This means that every node you run directly runs a query in Snowflake, and you incur the same cost as if you ran that query yourself. However, ensure you know how costs work in Snowflake and what warehouse size you use before you start firing off Coalesce pipelines.
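To get a feel for how warehouse size drives cost, here is a back-of-the-envelope sketch in Python. It assumes standard Snowflake warehouses, where credits per hour double with each size, billing is per second, and a 60-second minimum applies each time a warehouse resumes. It also assumes the warehouse resumes for this run alone. The price per credit depends on your edition, region, and contract, so the value used in the example is a placeholder, and the function names are hypothetical.

```python
# Credits consumed per hour by standard warehouse size (doubles with each size).
CREDITS_PER_HOUR = {"XSMALL": 1, "SMALL": 2, "MEDIUM": 4, "LARGE": 8, "XLARGE": 16}

# Minimum billed time each time a warehouse resumes.
MINIMUM_BILLED_SECONDS = 60


def estimate_credits(size, running_seconds):
    """Estimate credits for one warehouse run of `running_seconds`."""
    billed_seconds = max(running_seconds, MINIMUM_BILLED_SECONDS)
    return CREDITS_PER_HOUR[size.upper()] * billed_seconds / 3600


def estimate_cost(size, running_seconds, price_per_credit):
    """Convert the credit estimate into currency at an assumed price per credit."""
    return estimate_credits(size, running_seconds) * price_per_credit


if __name__ == "__main__":
    # A 15-minute pipeline run on a MEDIUM warehouse, at a placeholder price of 3.00 per credit.
    credits = estimate_credits("MEDIUM", 15 * 60)
    print(f"credits: {credits:.2f}")
    print(f"cost: {estimate_cost('MEDIUM', 15 * 60, price_per_credit=3.00):.2f}")
```

The same 15-minute run on a LARGE warehouse would use twice the credits unless the larger warehouse finishes in half the time, which is why it pays to check warehouse size before scheduling pipelines.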

The only true cost associated with Coalesce is licensing, which can vary based on the number of users, workspaces, projects, etc. Coalesce doesn’t currently publish its pricing, but it does offer a free plan.

Closing

Coalesce’s unique blend of visual development, code editing, and built-in automation enables users of all technical backgrounds to design, build, and manage data pipelines efficiently.

We hope this guide has given you a basic understanding of Coalesce, its core functionalities, competitive advantages, and use cases. Whether you’re a data engineer seeking to optimize pipeline development or a data analyst looking to collaborate on data transformations, Coalesce offers a valuable set of tools to streamline your workflow and unlock the true potential of your data in Snowflake.

Unlock Your Data’s Potential

If you’re interested in maximizing the impact of your data with Coalesce, phData can help! From implementation to mentorship, our experts can help accelerate your success story with Coalesce.

FAQs

Coalesce offers a free trial account where you can try out its full functionality. For ease of installation, find it on Snowflake Partner Connect.

After deploying your pipeline to production, your data never leaves Snowflake, so it is as secure as your Snowflake account (which is very secure). While working on pipelines, Coalesce never stores your data at rest, and data in motion is always encrypted.
