Thanks for reading. Here you will find a huge range of information in text, audio and video on topics such as Data Science, Data Engineering, Machine Learning Engineering, DataOps and much more. The show notes for “Data Science in Production” are also collated here.

Posts by Ust Oldfield

Partnering with dbt Labs: enhancing our analytical engineering capabilities

We are thrilled to announce that we have partnered with dbt Labs, the company behind dbt, the premier analytics engineering product. dbt is fast becoming the new standard for the data transformation workflows that power and populate our curated layers.

Ust Oldfield | May 7, 2024

How to Achieve Perfect Sprint Planning

In the world of Agile and Scrum, chasing perfection is a futile endeavour. Instead, Sprint Planning should focus on achieving 'good enough,' a practical and attainable standard. It's about transparent communication, meaningful discussions, and setting clear expectations to deliver value, with the flexibility to iterate and adapt for continuous improvement.

DevOps | Ust Oldfield | October 19, 2023 | Scrum, DevOps

Bringing Fabric to your Data Lakehouse

With the advent of Fabric, many organisations with existing lakehouse implementations in Azure are wondering what changes Fabric will herald for them. Do they continue with their existing lakehouse implementation and design, or do they migrate entirely to Fabric?

Ust Oldfield | June 27, 2023 | Fabric

Testing Data

This blog post will cover the topic of testing data within dbt, focusing on the easiest aspect first: verifying the reliability of the query applied to the data. The author will discuss the importance of defining what is being tested and how this can impact the validity of the data. The other testing questions, more aligned with integration and regression testing, will be saved for another time.

Ust Oldfield | February 9, 2023 | dbt, sql, testing, databricks, data, analytics

Seeding Files With dbt

Ust Oldfield | December 15, 2022

Getting Started with dbt

Ust Oldfield | December 5, 2022

Introduction to Analytics Engineering

Ust Oldfield | November 30, 2022

Scheduling Databricks Cluster Uptime

Ust Oldfield | July 28, 2022

Sustainability Reporting Accelerator

Ust Oldfield | July 27, 2022

Why Data Quality is Important

Data is among the most valuable assets for any organisation. Without data, the ability to make informed decisions is diminished. So it stands to reason that Data Quality is incredibly important to any organisation. If data doesn’t meet the expectations of accuracy, validity, completeness, and consistency that an organisation sets for it, then the consequences for the organisation could be severe. Conversely, if data does meet those expectations, then it is a real asset that can be used to drive value across an organisation.

Ust Oldfield | March 15, 2022

Using Auto Loader on Azure Databricks with AWS S3

Advancing Spark | Ust Oldfield | October 22, 2021 | databricks, autoloader, S3, Azure, AWS, Engineering, data engineering, authentication

Databricks Labs: Data Generator

Databricks recently released the public preview of a Data Generator for use within Databricks to generate synthetic data.

Ust Oldfield | August 9, 2021

Data Mesh Deep Dive

In a previous post, we laid down the foundational principles of a Data Mesh, and touched on some of the problems we have with the current analytical architectures. In this post, I will go deeper into the underlying principles of Data Mesh, particularly why we need an architecture paradigm like Data Mesh.

Ust Oldfield | August 5, 2021

What is Data Mesh?

To properly describe what Data Mesh is, we need to establish which analytical generation we are currently in, mostly so that we can describe what it is not.

Ust Oldfield | August 5, 2021
