The UCOVI Blog

Welcome to UCOVI's repository of data discussions and interviews.


Latest Post - Frustrated software developers and the devops-ification of data analytics


Previous Articles



Data beyond AI: Microsoft Fabric vs Data Contracts (Ned Stratton: 30th January 2024)

No-code data part II (Ned Stratton: 22nd November 2023)

White Paper: The free-role data analyst (Ned Stratton: 4th September 2023)

Do data analysts need to read books? (Ned Stratton: 10th May 2023)

No code data tools: the complexity placebo (Ned Stratton: 17th March 2023)

The 2023 data job market with Jeremy Wyatt (Ned Stratton: 24th January 2023)

Making up the Numbers - When Data Analysts Go Rogue (Ned Stratton: 2nd December 2022)

Data in Politics Part 2 - Votesource (Ned Stratton: 12th September 2022)

Data in Politics Part 1 - MERLIN (Ned Stratton: 2nd September 2022)

Interview: Adrian Mitchell - Founder, Brijj.io (Ned Stratton: 28th June 2022)

The Joy of Clunky Data Analogies (Ned Stratton: 14th April 2022)

Event Review - SQLBits 2022, London (Ned Stratton: 17th March 2022)

Interview: Susan Walsh - The Classification Guru (Ned Stratton: 21st February 2022)

Upskilling as a data analyst - acquiring knowledge deep, broad and current (Ned Stratton: 31st January 2022)

Beyond SIC codes – web scraping and text mining at the heart of modern industry classification: An interview with Agent and Field's Matt Childs (Ned Stratton: 8th December 2021)

Debate: Should Data Analytics teams sit within Sales/marketing or IT? (Ned Stratton: 26th October 2021)

Event Review: Big Data LDN 2021 (Ned Stratton: 27th September 2021)

The Swiss Army Knife of Data - IT tricks for data analysts (Ned Stratton: 9th September 2021)

UK Google Trends - Politics, Porn and Pandemic (Ned Stratton: 15th October 2020)

How the UK broadcast media have misreported the data on COVID-19 (Ned Stratton: 7th October 2020)

The Power BI End Game: Part 3 – Cornering the BI market (Ned Stratton: 21st September 2020)

The Power BI End Game: Part 2 – Beyond SSAS/SSIS/SSRS (Ned Stratton: 28th August 2020)

The Power BI End Game: Part 1 – From Data Analyst to Insight Explorer (Ned Stratton: 14th August 2020)

Excel VBA in the modern business - the case for and against (Ned Stratton: 13th July 2020)

An epic fail with Python Text Analysis (Ned Stratton: 20th June 2020)

Track and Trace and The Political Spectrum of Data - Liberators vs Protectors (Ned Stratton: 12th June 2020)

Defining the role of a Data Analyst (Slawomir Laskowski: 31st May 2020)

The 7 Most Common Mistakes Made in Data Analysis (Slawomir Laskowski: 17th May 2020)

COVID-19 Mortality Rates - refining media claims with basic statistics (Ned Stratton: 10th May 2020)


Ned Stratton: 13th June 2024

I want to talk about a data analytics project I've worked on, giving away as little as possible about the timeframe and subject matter (I don't want to piss people off), but as much as possible about its technical specifications (so that you grasp the magnificence of its engineering). Here goes.

Within the past six years, I've been on a data analytics project that involved a no-code data blending tool. It also involved SQL databases. Three of them: dev, test, and live. All changes to the dev database were version controlled using git. Several experienced IT technicians across different teams were consulted on how to set up the SQL permission groups so that business users – who owned the source data – could not cause accidental deletions or changes in the SQL database, even though they likely didn't know SQL.

And what, crucially, was the source data? It was an Excel spreadsheet with about 20 columns and 300 rows. (That's one SQL database for every 100 rows). And of course, the business users controlling the spreadsheet – themselves unencumbered by devops or change control – could do what they liked with it. They were quite relaxed about breaking changes such as renaming important columns or getting creative with date formats. The solution to this – I think you get the picture by now – was more technical complexity in the blending tool to anticipate it.
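For a flavour of what "anticipating it" looks like once written down, here is a minimal Python sketch of that kind of defensive logic. It is not the project's actual blending-tool configuration. The column aliases, the date formats, and the function names are all hypothetical, invented purely for illustration.

```python
from datetime import date, datetime

# Hypothetical aliases: every header variant seen so far, mapped to one canonical name.
COLUMN_ALIASES = {
    "invoice date": "invoice_date",
    "date of invoice": "invoice_date",
    "amount": "amount",
    "amount (gbp)": "amount",
}

# Hypothetical date formats, tried in order until one parses.
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%d.%m.%Y")


def canonical_column(name: str) -> str:
    """Map a header as typed in the spreadsheet to its canonical name."""
    # Ignore case, leading/trailing spaces, and repeated inner spaces.
    key = " ".join(name.strip().lower().split())
    if key not in COLUMN_ALIASES:
        raise ValueError(f"Unrecognised column: {name!r}")
    return COLUMN_ALIASES[key]


def parse_date(text: str) -> date:
    """Try each known format in turn; fail loudly if none matches."""
    cleaned = text.strip()
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(cleaned, fmt).date()
        except ValueError:
            continue
    raise ValueError(f"Unrecognised date format: {text!r}")


def normalise_row(row: dict[str, str]) -> dict[str, object]:
    """Rename columns to canonical names and parse any *_date values."""
    out: dict[str, object] = {}
    for name, value in row.items():
        column = canonical_column(name)
        out[column] = parse_date(value) if column.endswith("_date") else value.strip()
    return out
```

Each new header spelling or date style the business users come up with means another entry in one of those two tables, which is how the complexity accumulates.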

'Devops, devops everywhere and not a drop to drink'

Now, I'm not against devops in data solutions per se. Things need to be tested properly before release into production to prevent embarrassing errors that cause loss of trust. Version control and change tracking are invaluable means both to get to the root of errors and to enable concurrent work by multiple people on the same project.

However, at some point one has to take a bird's eye view of the change-controlled devops mega city with test, dev, git, and a data center in Texas that's been built on the foundations of Debbie-from-accounts's 2023 expenses tracker and think, "isn't this a bit ridiculous?"

One also has to take stock of how the devops imperative weighs on the priorities of data teams.

The grimly accepted reality that 80% of data analytics is finding, cleaning, and organising the data is just about tolerable on the proviso that the other 20% is for actual insight exploration. But if it turns out that 19 of those 20 percentage points go on configuring devops pipelines, agonising over commit messages, and merging pull requests, then it really is 5:28 on a Friday afternoon before anyone considers the afterthought of maybe converting all of this data into something interesting for the business.

Frustrated software developers

Where did this all come from? Well, there is the possibility that data has become so technically advanced and important to the modern business that it requires the same practices that software development teams follow. I don't discount it, but if I thought it told the whole story I wouldn't be writing this blog.

Data folk are sensitive souls who seek love, attention and respect for the work they do and the things they know. The search starts within the business-focussed teams (product/sales/marketing) that they often report into, and ends unsuccessfully. They are not true business stakeholders (despite the business knowledge they acquire in their work). Furthermore, everything they do is requested, prioritised, and validated by these true business stakeholders, who often value politeness and confirmatory answers from their data analysts more than they value nuance or pushback. This inevitably causes feelings of resentment and powerlessness – the customers and waiters effect from my free-roles piece last year.

They are also beset by scope creep, last minute changes, unclear requests, being made to grovel and wait an age to get things installed, and most annoying of all, something breaking all the time that it's their job to keep operational even though they themselves didn't build it.

It is at this point that the appeal of devops and the illusory sense of order it could bring kicks in. "If everything is version controlled and tested on two servers before rollout, not only will breaking changes not happen, but also the business will become so impressed by the technical organisation and architecture of our BI that it will drive the cultural sea-change around data needed to stop time-consuming requests that prevent us from building advanced, maintainable, and lasting things with data." Or so it goes.

As this magical, great-new-world-over-the-choppy-sea, "this is how big companies do it" thinking takes hold, something else takes hold as well that was seeded in the one-sided business-stakeholder relationships. Devops and the desire to operationalise and automate everything become the goal, ahead of the delivery of new insights.

If you're simply bored by building the same sales dashboards all the time and demoralised by the "it doesn't match these numbers" feedback loop, then really leaning into the nitty-gritty of configuring devops pipelines, security administration, and coded automations for manual processes can provide the desired channel for your creativity and curiosity, as well as the sense of there being a finished end-product to your work. In essence, you're more like a software developer, and you have the git repo to prove it.

Except, you're not really a software developer. You're supporting and investigating rather than producing from a design. The next "release" is when a random stakeholder asks for a tweak, or after some upstream change to a column in the raw data you have no control over. You're like a zebra in a safari park that's become jealous of the giraffes because they can pick the high-hanging fruit with their longer necks. All of your efforts to paint your neck yellow with brown spots and time spent on stretching exercises to elongate it will never get you to the point where you, as a zebra, can pick high fruit as effectively as the giraffes, or have the giraffes accept you and grant you the status of giraffe rather than zebra.

There's no stat from a Gartner report about increased spend on Redgate by data teams backing this up; I'm merely stating what I've observed about the data profession from the conversation on LinkedIn, what dominates the agenda at data conferences, and my own experience of working in data teams over 7-8 years and how my job role has changed.

Fundamentally, the over-devops-ification of data is an expensive substitute for having the backbone to tell well-paid people to prove the need for a new report and think through its content properly before committing to produce it on a scheduled basis. It makes for frustrated developers rather than productive data teams that are first-class citizens within their businesses. For the latter to emerge, the thrill of insight discovery needs to come back into fashion.

