Cloudera CDH 6 Support is Ending, Now What?
If you haven’t already started making plans for the CDH 6 End of Support Life (EOSL) coming up in March 2022, now is the time
What To Do With Unsupported CDH 6
Let’s say you have thought through your long-term CDH 6 migration plans, and have found that you will not be able to complete a full
Apache Kudu Integration Testing in Scala/SBT Applications
Introduction to Kudu Integration Testing: Beginning with the 1.9.0 release, Apache Kudu published new testing utilities that include Java libraries for starting and stopping a
How to Use the Kudu Quickstart on Windows
This blog post was written by Donald Sawyer and Frank Rischner. Introduction to Apache Kudu: Apache Kudu is a distributed, highly available, columnar storage manager
CDP Data Warehouse Experience: The Hadoop Paradigm Shift
Cloudera Data Platform (CDP) represents a major step forward toward combining the value-added distributions of Hadoop from both Cloudera (CDH) and Hortonworks (HDP) into a
Introducing the Cloudera Data Platform: Unlock Adaptive Scaling
As the technology landscape changes, it’s important for businesses to take advantage of the increased efficiencies this competitive landscape offers. The most recent advancement companies
Implementing Metadata as Part of Data Management
Data centralization without careful metadata implementation is like stocking a warehouse without sorting and labeling all the boxes. Yes, you may have everything you need
Archway: Self-Service Data Engineering on Cloudera CDH
Data engineering in a production environment is complex. Engineers and data scientists need to be onboarded onto a platform where they can share data and
How to Tame Apache Impala Users with Admission Control
Introduction: A common problem encountered with Apache Impala is resource management. Everyone wants to use as many resources (i.e., memory) as they can to try
How to Query a Kudu Table Using Impala JDBC in Cloudera Data Science Workbench
Kudu is an excellent storage choice for many data science use cases that involve streaming, predictive modeling, and time series analysis. However, in industries like