As the technology landscape changes, businesses need to take advantage of the efficiencies that new platforms offer. The advancement companies are currently racing to adopt is the cloud, along with elastic compute platforms that provide on-demand capacity for critical business processes. In practice, this means applications can scale as business use cases require, without running into end-user or internal capacity issues.
Cloudera is continually innovating, and its latest offering is no different. Introduced in September 2019 at the Strata Data Conference, Cloudera Data Platform (CDP) was built to empower businesses to standardize and scale platform operations wherever they choose to deploy their data products. This lets customers unlock everything packaged in Cloudera’s platform, with on-demand availability and adaptive scaling, wherever they want to run their applications: on-prem, in a private cloud, or in a public cloud.
At phData, we specialize in guiding businesses through complex decisions by laying out the options and helping the customer weigh which ones are in their best interest. This includes analyzing platforms, architectures, and use cases to ensure that the right tool is being used for each job. We’ve been focused on Cloudera since we started the company, building deep expertise with the platform to make informed decisions. And we view CDP as the next step in our customers’ journey — and as an opportunity to create a competitive advantage by getting more out of their data.
Moving Beyond Technology Constraints
Our customers operate across a wide range of verticals, but they all have use cases and visions for a centralized source of information that supports business-wide analysis. Most need to move faster, to deploy more applications reliably, and to better meet the demands of customers and business users. This boils down to a requirement for a reliable, efficient way to predict and support capacity needs over an extended period of time. Even the most organized businesses struggle with this, and internal IT organizations end up competing with cloud providers on the ability to rapidly deploy and scale.
Building the Church for Easter Sunday
To better grasp this point, consider how businesses might traditionally plan and build clusters in their data center, procuring and manually installing servers to meet their needs. The required capacity for these clusters is typically estimated to ensure the most intensive workload can be handled under the desired SLA. In practice, this means the platform will be able to handle its largest workload; but when that workload is not running, there is a lot of unused capacity sitting idle.
In the charts below, which visualize available capacity vs. the required capacity to support an application over time, you can see the inefficiencies inherent to the traditional, “static capacity planning” approach described above:
Our customers are all looking for guidance on how to move away from static capacity planning and toward adaptive capacity, so they can get the most out of their technology investments. Making that move changes how each organization works: teams need to become more agile in the DevOps lifecycle and identify which use cases or workloads are best suited to each of the available options.
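To put rough numbers on the contrast between static and adaptive capacity, here is a minimal sketch in Python. Everything in it is an illustrative assumption rather than anything specific to CDP: the hourly demand figures are made up, the function names are hypothetical, and the adaptive model optimistically assumes capacity can follow demand instantly with a fixed 20% headroom.

```python
def static_capacity(demand):
    """Static planning: provision for the peak workload in every period."""
    peak = max(demand)
    return [peak] * len(demand)


def adaptive_capacity(demand, headroom_pct=20):
    """Adaptive planning: provision each period's demand plus headroom.

    Capacity is rounded up to whole nodes using integer arithmetic.
    Assumes (optimistically) that scaling tracks demand with no lag.
    """
    return [(d * (100 + headroom_pct) + 99) // 100 for d in demand]


def idle_fraction(demand, capacity):
    """Share of provisioned node-hours that go unused."""
    provisioned = sum(capacity)
    used = sum(min(d, c) for d, c in zip(demand, capacity))
    return (provisioned - used) / provisioned


# Hypothetical demand, in nodes, for eight consecutive periods.
# One period carries the "Easter Sunday" workload.
demand = [10, 10, 12, 15, 40, 100, 35, 12]

static = static_capacity(demand)
adaptive = adaptive_capacity(demand)

print(f"static idle share:   {idle_fraction(demand, static):.1%}")
print(f"adaptive idle share: {idle_fraction(demand, adaptive):.1%}")
```

With these made-up numbers, the statically sized cluster leaves roughly 71% of its provisioned node-hours idle, while the adaptive model leaves about 17%. Real workloads, scaling delays, and pricing will change the figures, but the shape of the comparison is the point.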
Benefits of Adaptive Scaling
A key benefit of moving toward adaptive capacity planning is that it allows you to eliminate large, upfront investments in hardware and data center capacity, and to instead rely upon on-demand, cloud-based infrastructure that can scale with the needs of the platform. This reduces large capital expenses and minimizes the likelihood of paying for capacity you don’t need, or of lacking sufficient capacity to capitalize on demand as shown in the following diagram:
Another benefit of adaptive scaling is that it lets us choose which workloads to migrate and deploy them in the same manner regardless of where the resources are located. This simplifies the DevOps process by ensuring that environments provide a consistent experience.
Why the Cloudera Data Platform?
When deciding which tools or platforms to run, a business has a never-ending list of variables to consider. Typically, we see a lot of excitement around new technology options or specific cloud offerings. These options are great; however, more often than not, they are point solutions that require extensive integration with other cloud services to meet enterprise needs.
Most customers need to avoid being locked into a specific cloud vendor and ecosystem, so the ability to run workloads in multiple clouds, as well as on-prem, is critical. The motivation may be cost, security, or simply the need for ramp-up time to get accustomed to running workloads in the cloud.
Because the Cloudera platform already lets customers choose specific deployment options (and currently supports over 30 big data engines), it can meet these needs out of the box. Choosing this platform also allows you to satisfy enterprise requirements around governance, security, and compliance, instead of building the custom solutions that the cloud providers’ point solutions would require. CDP also comes in a variety of deployment options, unlocking the potential to move to adaptive capacity planning as each customer matures with the technology.
CDP Continuing Blog Series
This multi-part blog series is dedicated to exploring how CDP can benefit your business and enable seamless deployment across your data products. We’ll be updating this series as Cloudera releases new products and experiences. So be on the lookout for future CDP content; there are a lot of exciting offerings to look forward to. Reach out to the phData team at sales@phdata.io to see how we can help you with your data needs.