Since the release of Cloudera Data Engineering (CDE) more than a year ago, our number one goal has been operationalizing Spark pipelines at scale with first-class tooling designed to streamline automation and observability. In working with thousands of customers deploying Spark applications, we saw significant challenges in managing Spark as well as in automating, delivering, and optimizing secure data pipelines.
“Water, water, everywhere, nor any drop to drink.” The famous line from Samuel Taylor Coleridge’s epic poem “The Rime of the Ancient Mariner” applies neatly to today’s data problem. Enterprises are deluged with data, but they often have no way to leverage it. According to most experts, only a small percentage of data is ever put to use, and the rest stays in the dark, hence the term “dark data.”
Out with the old; in with the new! If you haven’t already checked out the new Snowflake® interface (aka Snowsight®), make it your New Year’s resolution. Set yourself up for success in 2022 by spending a few minutes getting to know the new features and experiences that are in public preview—available when you click the Snowsight button at the top of your console’s menu bar.
With interest in big data and the cloud growing at around the same time, it wasn’t long before big data was being deployed in the cloud. Big data comes with some challenges when deployed in traditional, on-premises settings. There’s significant operational complexity, and, worst of all, scaling deployments to keep up with the continued exponential growth of data is difficult, time-consuming, and costly.
If you have ever seen the 1976 movie ‘All the President’s Men,’ you may remember the phrase “follow the money.” The idea is that political corruption can be exposed simply by tracing financial transfers between parties. In testing, I like to tweak this phrase slightly and say, “follow the revenue.” What does this mean? Plainly, we should focus most of our testing effort where it will yield the greatest return.
While writing a comparison of Kubernetes and Koyeb, we tried to determine how much operating a Kubernetes cluster really costs. That section of the comparison took us hours to write and ended up being so long that we decided to turn it into a dedicated post. Full disclosure: at Koyeb, we're building a serverless platform and we have a purpose-built orchestration engine.
We are simplifying code signing on Bitrise. There are now two ways to automate code signing on Bitrise: using the Xcode Build/Archive Steps (with the iOS Auto Provision Steps merged into them), or skipping those Steps and using 'Manage iOS Code Signing' instead. In both cases, we've reduced the number of things that could go wrong. Let's see what has changed!
Caching is a common technique for making your applications faster. It lets you avoid slow operations by reusing previous results. In this article, Ayo Asaiah walks us through the different options for caching in Node.js applications.
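To make the idea of "reusing previous results" concrete, here is a minimal sketch of an in-memory cache in TypeScript. It is not taken from the article itself; the names `memoize`, `slowSquare`, and `cachedSquare` are hypothetical and chosen only for illustration. It assumes the wrapped function is pure and takes a single argument that works as a `Map` key.

```typescript
// Hypothetical helper (not from the article): wraps a function so that
// repeated calls with the same argument reuse the stored result.
function memoize<A, R>(fn: (arg: A) => R): (arg: A) => R {
  const cache = new Map<A, R>();
  return (arg: A): R => {
    if (cache.has(arg)) {
      // Cache hit: skip the slow operation entirely.
      return cache.get(arg) as R;
    }
    // Cache miss: do the work once and remember the result.
    const result = fn(arg);
    cache.set(arg, result);
    return result;
  };
}

// Stand-in for a slow operation; the counter records how often it really runs.
let calls = 0;
function slowSquare(n: number): number {
  calls += 1;
  return n * n;
}

const cachedSquare = memoize(slowSquare);
cachedSquare(12); // computed by slowSquare
cachedSquare(12); // served from the cache, slowSquare is not called again
```

A real application would usually also need an eviction or expiry policy, which this sketch leaves out.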
In the BigQuery Spotlight series, we talked about monitoring. This post focuses on using Audit Logs for deep-dive monitoring. BigQuery Audit Logs are a collection of logs from Google Cloud that give insight into operations related to your use of BigQuery. A wealth of information is available to you in the Audit Logs. Cloud Logging captures events that can show “who” performed “what” activity and “how” the system behaved.