Analytics

Best Practices for Building a Cloud Data Lake You Can Trust

Using Talend and Amazon Web Services (AWS), financial institutions are building cloud data lakes to consolidate customer data across hundreds of sources. By validating the quality of that data and correlating data sets with automated processes, you can deliver trusted reporting that meets regulatory requirements and uncover insights for new business.

BigQuery and surrogate keys: a practical approach

When working with tables in data warehouse environments, you will often need to generate surrogate keys. A surrogate key is a system-generated identifier that uniquely identifies a record within a table. Why use surrogate keys? Quite simply: unlike natural keys, they carry no business meaning, so they persist unchanged over time even when the business data changes, and they allow for an unlimited range of values.
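To make the idea concrete, here is a minimal sketch in plain Python rather than BigQuery SQL. It is not the article's method: the function `assign_surrogate_keys`, the `sk` column, and the `customer_code` natural key are all hypothetical names chosen for illustration. It only shows the core property that a surrogate key is generated by the system and stays stable across loads, whatever happens to the business attributes.

```python
from itertools import count


def assign_surrogate_keys(rows, natural_key, key_map=None):
    """Attach a system-generated integer key `sk` to each row.

    `key_map` (hypothetical) remembers natural key -> surrogate key from
    earlier loads, so a record keeps the same surrogate key over time.
    """
    key_map = {} if key_map is None else key_map
    # Continue numbering after the highest key handed out so far.
    next_id = count(max(key_map.values(), default=0) + 1)
    keyed = []
    for row in rows:
        nk = row[natural_key]
        if nk not in key_map:
            key_map[nk] = next(next_id)
        keyed.append({"sk": key_map[nk], **row})
    return keyed, key_map


# First load: two distinct customers, one of them appearing twice.
first_load = [
    {"customer_code": "A-17", "name": "Ann"},
    {"customer_code": "B-02", "name": "Bob"},
    {"customer_code": "A-17", "name": "Ann Smith"},  # renamed, same record
]
keyed_rows, mapping = assign_surrogate_keys(first_load, "customer_code")

# Later load: a new customer gets the next free key; old keys are untouched.
later_rows, mapping = assign_surrogate_keys(
    [{"customer_code": "C-99", "name": "Cy"}], "customer_code", mapping
)
```

The mapping is kept separate from the rows so that it can be persisted between loads; in a warehouse this role would typically be played by a key or dimension table.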

Are you missing out by leaving important data standing on the outside?

Every now and then you can't beat a bit of Meat Loaf: the singer, not the food, which I've not had the pleasure of tasting. I recently found myself recalling the cult classic "Standing on the outside", not because of any failed breakup, but because I was thinking about the abundance of external data that can be used in combination with your internal data. Unfortunately, many are still leaving this data standing on the outside.