Bluecoders

At what stage should your data be handled by a data team?

Christophe Hébert, May 2, 2022

Yesterday, companies could employ teams of statisticians to comb through data manually. Today, the volume of data streams and the variety of data far exceed what manual analysis can handle.

That is how Data Science was born, opening up a wave of jobs specialized in data processing.

The Data Scientist was billed as the "job of the future". Well, the future is now! The role is evolving so fast that it is splitting into several more specialized roles, because data is now everywhere and essential to keeping companies running.

The Data Scientist was long seen as a Swiss army knife, able to take on every role: ingesting and distributing data across the infrastructure, building polished visualizations and models, running predictive analytics, and crunching the numbers. Can a single person really cover all of that? Yes, but only up to a certain scale.

That is why the role is now being broken up. The somewhat catch-all Data Scientist role is giving way to more precise functions that are more relevant and, above all, more achievable, for greater impact. We now also talk about Data Engineers, Data Analysts, Machine Learning Engineers, and Data Architects.

But is there enough work for all of these different roles?

Oh yes! Data flows can be so massive that they are measured in terabytes or even petabytes, which is plenty to keep a regiment of Data experts busy. At that volume, these experts rely on powerful technologies to put the data to use and help companies gain a competitive edge.

But back to the new role of the Data Scientist!

Their main mission revolves around modeling, that is, doing predictive analytics from a given dataset. They also look for homogeneous groups within the data, tuning mathematical algorithms that measure how close each item is to the others according to defined criteria, in order to deliver broader, deeper analyses. In other words, data clustering.
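To make the idea of clustering concrete, here is a minimal sketch of k-means, one common clustering algorithm, written in plain Python. The article does not name a specific method, so the choice of k-means is an assumption, as are the `k_means` helper, the sample `customers` data, and the decision to seed the centroids with the first k points.

```python
from math import dist  # Euclidean distance between two points (Python 3.8+)


def k_means(points, k, iterations=20):
    """Group points into k clusters by proximity (illustrative k-means)."""
    # Assumption: seed the centroids with the first k points so the
    # result is deterministic. Real implementations seed more carefully.
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: dist(p, centroids[c])) for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = tuple(
                    sum(axis) / len(members) for axis in zip(*members)
                )
    return labels, centroids


# Hypothetical customers: (orders per month, average basket in euros)
customers = [(1, 20), (2, 25), (1, 22), (10, 200), (12, 210), (11, 190)]
labels, centroids = k_means(customers, k=2)
print(labels)  # [0, 0, 0, 1, 1, 1]
```

The "defined criteria" here are the two columns, and closeness is plain Euclidean distance. In practice the criteria usually need to be put on a comparable scale first, otherwise the column with the largest numbers dominates the grouping.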

Although it is hard today to find two identical definitions of these roles, one point often comes through: unlike the Data Analyst or the Data Engineer, the Data Scientist tends to be the one with business sense. They are able to identify the key indicators needed to answer a business question rooted in the company and its industry.

The Data Engineer comes in upstream. They work with raw data, which is sometimes invalid or erroneous. Their job is to track down those issues and make the data reliable so that the Data Scientist can plug in their algorithms. Poorly prepared data directly affects the results, which end up incorrect or disappointing. On the data job market, the number of open Data Engineer roles across all sectors is much higher than the number of open Data Scientist roles.
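As an illustration of what "making the data reliable" can look like, here is a small sketch that separates valid records from invalid ones before anything reaches a model. The field names (`user_id`, `amount`, `date`), the validation rules, and the `clean_records` helper are all hypothetical and chosen only for the example.

```python
from datetime import datetime


def clean_records(raw_rows):
    """Split raw rows into valid records and rejected rows with a reason."""
    valid, rejected = [], []
    for row in raw_rows:
        try:
            # Parse every field into its expected type (hypothetical schema).
            record = {
                "user_id": int(row["user_id"]),
                "amount": float(row["amount"]),
                "date": datetime.strptime(row["date"], "%Y-%m-%d").date(),
            }
        except (KeyError, TypeError, ValueError) as err:
            # Missing field, wrong type, or unparseable value.
            rejected.append((row, f"{type(err).__name__}: {err}"))
            continue
        # Assumed business rule for this example: amounts cannot be negative.
        if record["amount"] < 0:
            rejected.append((row, "negative amount"))
            continue
        valid.append(record)
    return valid, rejected


raw = [
    {"user_id": "42", "amount": "19.90", "date": "2022-05-02"},
    {"user_id": "abc", "amount": "5.00", "date": "2022-05-02"},  # bad id
    {"user_id": "7", "amount": "-3", "date": "2022-05-03"},  # negative amount
    {"user_id": "8", "amount": "12", "date": "03/05/2022"},  # wrong date format
]
valid, rejected = clean_records(raw)
print(len(valid), len(rejected))  # 1 3
```

Keeping the rejected rows together with a reason, rather than silently dropping them, makes it easier to trace issues back to the source that produced them.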

______________________________

Now that we know who's who and who does what, how do you know at what stage your data needs to be looked after by experts?

"When there's data to process." Thanks, Captain Obvious! 💡

More seriously, you need to think about it from the moment the company is founded! Thinking about it early lets you anticipate and reflect on:

What are we collecting?

How do we collect it, and therefore which data profile should we hire?

How do we analyze that data?

How do we improve the product with it?

So which profile should you hire first for your early-stage startup needs?

With all these complementary roles, which one should you turn to first when your data volume isn't large enough to justify a whole team dedicated to Data?

And no, not every company can afford the luxury of a big army of Data Scientists and Data Analysts. Sometimes a few basics are enough, depending on the volume of data to handle.

Our advice, if you're a startup, is to start by hiring a Data Engineer. Why?

Because the first part of your strategy is going to be: "How do we capture this data, ingest it, and format it?"

Early on, you need someone who likes to touch every part of the stack and who has a feel for each Data discipline. In short, the classic Swiss army knife: a generalist profile.

Why not a Data Scientist?

If you do hire that kind of profile, make sure they also wear a Data Analyst hat so that, from day one, they can identify the levers of progress for the company by analyzing the data. They need to know how to process data on distributed infrastructure, build models that scale, and ship them to production.

The goal early on is to lay the groundwork. So you may also need to bring in a Data Architect upstream to define the platform.
