Future-Proof The Value Of Your Data Science Capability | by Robert Dowd | Jan, 2024


By integrating data-engineering aptitude

Robert Dowd
Towards Data Science

This article covers a commonly overlooked requirement for building and future-proofing highly valuable data science capabilities.

  • Why integrating data capabilities matters
  • Accomplishing integration within an organization
  • Overlapping priorities & skillsets for success

For every data scientist, it is vital to stay on top of technology trends and tools as the industry evolves. With the recent boom in artificial intelligence, there is a great deal of focus on emerging technology like ChatGPT, a large-scale LLM-powered data product, and GitHub Copilot, which assists programmers with on-the-fly code suggestions, among many others.

However, a data scientist’s ability to employ these new technologies and skills is heavily affected by a phrase we all know and love: “Garbage in, garbage out”. The idea is that solid data pipelines are a cornerstone of good data science. While many understand this to be true, the reality is that data-centric organizations often don’t supplement their data science teams with dedicated, or sometimes any, data engineering support.

Unfortunately, siloing data science teams from their engineering counterparts frequently causes headaches all around, such as:

  1. Data Scientists must wade through an ocean of data spread across the organization’s infrastructure and are often unequipped to properly engineer access to the resources they need, resulting in a lot of time spent hacking together “quick” fixes.
  2. Data Engineers receive “hand-offs” of models or code with very few requirements and little of the context that is vital for deploying efficiently to production environments and maintaining them with quality support.
  3. Impressive (and sometimes expensive to build) data products never make it out the door to the hands of customers!
Data “Science” without the proper foundations (image generated by FlowGPT + DALL-E 3)

So as a crucial component in the success-map for enabling a highly-valuable data science…