Modern Data Stack: a modern approach to data

6 min read
Created: August 4, 2020
Updated: April 22, 2024

Companies of all sizes already understand the power of data and recognize its importance to the business, but many do not know how to overcome the challenges of organizing, integrating and managing the information they collect.

This is where the Modern Data Stack (MDS), or modern data approach, comes in: a concept that has arrived to revolutionize and modernize companies’ data infrastructure.

Organizations that want to grow and remain competitive need to invest in a robust data infrastructure, capable of managing large volumes of information. This can be done with the Modern Data Stack.

In this post, you will find a clear, concise explanation of this approach, which we use here at Indicium.

Modern Data Stack

The new combination of best practices and tools for creating data infrastructures is known as the Modern Data Stack (MDS).

One of its most striking features is the combination of several open-source tools to respond to the demands of a complex data infrastructure, in a highly efficient way.

What does this mean in practice?

With a Modern Data Stack, it is possible to combine tools that perform different functions, such as integrating, storing or visualizing data, to create a modern, adaptable data structure whose parts are more independent of one another.

For example, consider a company that has drastically increased its customer base and needs to expand its data storage solution.

If it uses the modern data approach, it will have two options:

  1. adapt its current solution to the new demands.
  2. replace it with another tool that meets its needs, without having to completely redesign its data infrastructure.

In other words, with MDS, organizations have more flexibility to make specific adjustments and reinvent their structure without having to completely transform it. The result?

Lower costs, more scalability and more autonomy.
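To make the "replace one tool without redesigning everything" idea concrete, here is a minimal Python sketch. It assumes the rest of the pipeline depends only on a small storage interface; the names `Storage`, `LocalStorage`, `CloudStorage` and `load_customers` are hypothetical and do not come from any real tool.

```python
from typing import Protocol


class Storage(Protocol):
    """Anything the pipeline can write rows to (hypothetical interface)."""

    def write(self, table: str, rows: list[dict]) -> int: ...


class LocalStorage:
    """Stand-in for the company's current storage solution."""

    def __init__(self) -> None:
        self.tables: dict[str, list[dict]] = {}

    def write(self, table: str, rows: list[dict]) -> int:
        self.tables.setdefault(table, []).extend(rows)
        return len(rows)


class CloudStorage:
    """Stand-in for a replacement tool; here it only counts rows per table."""

    def __init__(self) -> None:
        self.row_counts: dict[str, int] = {}

    def write(self, table: str, rows: list[dict]) -> int:
        self.row_counts[table] = self.row_counts.get(table, 0) + len(rows)
        return len(rows)


def load_customers(storage: Storage, customers: list[dict]) -> int:
    """The loading step depends only on the Storage interface."""
    return storage.write("customers", customers)


customers = [{"id": 1, "name": "Ana"}, {"id": 2, "name": "Bruno"}]

# Swapping the storage layer does not change the loading code.
loaded_locally = load_customers(LocalStorage(), customers)
loaded_in_cloud = load_customers(CloudStorage(), customers)
```

In a real stack, the "interface" is usually something like SQL or a connector standard rather than a Python class, but the design choice is the same: each layer talks to its neighbors through a narrow contract.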

Today, thanks to new technologies and accessible tools, it is much easier to adopt the modern approach to data.

However, to be successful in implementing these practices, you need to understand how all the pieces fit together.

Building a Modern Data Stack

An efficient data structure combines several services into a data stack, that is, a set of data tools layered on top of one another.

Overall, a data stack has three fundamental functions:

  1. collect and integrate data into a data warehouse (a “home” for the data).
  2. clean the data and transform it into information.
  3. add value to decision making through intuitive visualizations, such as BI dashboards.

All of these functions are processes in a data pipeline (a flow through which data enters, is processed and is transformed).
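The three functions can be sketched as a toy pipeline in plain Python. This is only an illustration under simplifying assumptions: the "warehouse" is a list in memory, the "dashboard" is a text bar chart, and the source names and figures are invented for the example.

```python
def collect(sources):
    """Step 1: gather rows from every source into one 'warehouse' list."""
    warehouse = []
    for source_name, rows in sources.items():
        for row in rows:
            warehouse.append({"source": source_name, **row})
    return warehouse


def transform(warehouse):
    """Step 2: clean customer names and aggregate revenue per customer."""
    revenue = {}
    for row in warehouse:
        customer = row["customer"].strip().lower()
        revenue[customer] = revenue.get(customer, 0.0) + float(row["amount"])
    return revenue


def visualize(revenue):
    """Step 3: render a text bar chart, largest revenue first."""
    lines = []
    for customer, total in sorted(revenue.items(), key=lambda item: -item[1]):
        bar = "#" * int(total // 10)
        lines.append(f"{customer:<6} {bar} {total:.0f}")
    return lines


# Hypothetical raw data from two isolated sources, with inconsistent formats.
sources = {
    "store": [{"customer": " Ana ", "amount": "120"}, {"customer": "bruno", "amount": "50"}],
    "web": [{"customer": "ana", "amount": 80}],
}

revenue = transform(collect(sources))
chart = visualize(revenue)
print("\n".join(chart))
```

Each function stands for one stage of the pipeline; in a real data stack, each stage is handled by a dedicated tool rather than a function.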

The tools used for each of these processes form the data stack. And, although the architecture of a pipeline varies from company to company, all data pipelines incorporate these processes.

A modern approach to data (MDS)

To make the modern data approach (MDS) clearer, below we present, process by process, the main tools available on the market, which have been used successfully in thousands of data projects of all sizes in Brazil and abroad.

1) Data collection and integration

Making data available from multiple isolated sources for analysis is one of the main challenges of data projects. To overcome this, you need to invest in data collection and integration.

Tools like Fivetran and Stitch Data are among the leaders in cloud data integration. They allow you to move data from hundreds of sources, such as ERPs, CRMs, databases and REST APIs, directly to a data warehouse (in the cloud or on-premises). They can also be combined.

Therefore, there is no need for large investments in software licenses or implementation hours.

Additionally, companies looking to collect more accurate data online and offline can also use Segment or Snowplow to get a complete view of their customers.
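As a rough illustration of what an integration tool does for you, the sketch below pulls records from two differently formatted sources and lands them in one raw table. It is not how Fivetran, Stitch Data, Segment or Snowplow work internally; the sample data, the function names and the use of an in-memory SQLite database as a stand-in warehouse are all assumptions made for the example.

```python
import csv
import io
import json
import sqlite3

# Hypothetical exports from two isolated systems.
CRM_CSV = "id,name\n1,Ana\n2,Bruno\n"
API_JSON = '[{"id": 3, "name": "Carla"}]'


def extract_csv(text):
    """Read rows from a CSV export (for example, from a CRM)."""
    reader = csv.DictReader(io.StringIO(text))
    return [(int(row["id"]), row["name"]) for row in reader]


def extract_json(text):
    """Read rows from a JSON payload (for example, from a REST API)."""
    return [(item["id"], item["name"]) for item in json.loads(text)]


def load(conn, source, rows):
    """Append rows to one raw table, tagging each row with its source."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS raw_customers (id INTEGER, name TEXT, source TEXT)"
    )
    conn.executemany(
        "INSERT INTO raw_customers VALUES (?, ?, ?)",
        [(row_id, name, source) for row_id, name in rows],
    )


warehouse = sqlite3.connect(":memory:")  # stand-in for a real data warehouse
load(warehouse, "crm", extract_csv(CRM_CSV))
load(warehouse, "api", extract_json(API_JSON))

# Count how many rows each source contributed.
rows_per_source = dict(
    warehouse.execute("SELECT source, count(*) FROM raw_customers GROUP BY source")
)
```

Managed integration tools replace hand-written extractors like these with maintained connectors, which is where the savings in implementation hours come from.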

2) Data warehousing

Another fundamental step of the modern data approach is the transformation of raw data into modeled data, which occurs within a data warehouse (DW).

Centralizing data transformations in the DW brings large efficiency gains to the project, especially through the ELT (extract, load, transform) approach, in which raw data is loaded first and transformed inside the warehouse. This increases flexibility in the pipeline and gives business analysts the autonomy to define business rules in the DW, which can speed up a project by months.

In data warehousing, the two main Modern Data Stack tools used for data transformation are dbt and Dataform.
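The idea of transforming data inside the warehouse can be sketched with plain SQL. The example below is an assumption-laden illustration, not dbt or Dataform code: it uses an in-memory SQLite database as a stand-in for a cloud DW, and the table name, view name and business rule are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for a cloud data warehouse

# Extract and load: raw data lands in the warehouse untouched.
conn.execute(
    "CREATE TABLE raw_orders (order_id INTEGER, status TEXT, amount_cents INTEGER)"
)
conn.executemany(
    "INSERT INTO raw_orders VALUES (?, ?, ?)",
    [(1, "PAID", 12000), (2, "paid ", 8000), (3, "cancelled", 5000)],
)

# Transform: the business rule ("only paid orders count as revenue") lives
# in SQL inside the warehouse, so an analyst can change it without touching
# the extraction code.
conn.execute(
    """
    CREATE VIEW orders_modeled AS
    SELECT order_id,
           lower(trim(status)) AS status,
           amount_cents / 100.0 AS amount
    FROM raw_orders
    WHERE lower(trim(status)) = 'paid'
    """
)

modeled = conn.execute(
    "SELECT order_id, status, amount FROM orders_modeled ORDER BY order_id"
).fetchall()
```

Because the raw table is kept as loaded, the rule can be revised later and the modeled view rebuilt without extracting the data again.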

Another recent and essential innovation in this approach is the cloud DW, such as Amazon Redshift and Google BigQuery, whose scalable architecture allows you to quickly store and query huge volumes of data.

3) Business intelligence (BI)

Analytical intelligence is a priority in the Modern Data Stack.

With a modern data infrastructure in place, you can use different business intelligence tools to visualize, analyze and generate insights from data.

There are several robust alternatives for this, both open-source, such as Metabase, and SaaS platforms, such as Microsoft Power BI, Looker and Tableau, among others.

Important: in the modern approach, BI is not an end in itself. What matters is that it quickly generates value for the company.

4) Machine learning

Machine learning, artificial intelligence and modeling are advanced analytics techniques applied to more complex analyses within the data stack.

To this end, in addition to the various libraries available in R and Python, tools such as MLflow and Kedro help run predictive and prescriptive models and streamline the development process, reducing the time between modeling and putting the data to use, the Achilles heel of any advanced analytics project.

5) Deployment

Tools such as Docker and Kubernetes are widely used for deployment, in conjunction with orchestrators such as Airflow and Prefect.

What sets these technologies apart is that all the “Lego pieces” fit together, so that data flows smoothly throughout the data structure.
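At its core, an orchestrator runs tasks in an order that respects their dependencies. The sketch below shows that idea with Python's standard `graphlib` module. It does not use the Airflow or Prefect APIs, and the task names are hypothetical.

```python
from graphlib import TopologicalSorter

# Each task maps to the set of tasks that must finish before it starts.
dependencies = {
    "extract_crm": set(),
    "extract_erp": set(),
    "load_warehouse": {"extract_crm", "extract_erp"},
    "transform_models": {"load_warehouse"},
    "refresh_dashboard": {"transform_models"},
    "train_model": {"transform_models"},
}

executed = []


def run(task):
    """Placeholder for real work, such as starting a container."""
    executed.append(task)


# static_order() yields every task only after all of its dependencies.
for task in TopologicalSorter(dependencies).static_order():
    run(task)

print(executed)
```

Real orchestrators add scheduling, retries, logging and parallel execution on top of this ordering, but the dependency graph is the piece that keeps data flowing through the stack in the right sequence.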

Modern Data Stack for everyone

The Modern Data Stack (MDS) is the link between raw data and business intelligence, that is, it is an integrated system of applications that collects, combines, analyzes and realizes the value of data for companies.

Adopting the MDS is essential for modern companies that want to succeed in the data era.

Fortunately, data stack components are now much cheaper and simpler to configure and operate than they used to be. Thus, companies of all sizes can use them to gain a competitive advantage and develop analytical maturity.

Do you want to implement the Modern Data Stack in your company?

We have a highly qualified team to help you.

Get in touch today.


Daniel Avancini

Chief Data Officer

Isabela Blasi

CBDO and co-founder at Indicium
