Data Pipelines – The Solution to integrating Data Silos


Small businesses are spoilt for choice when it comes to off-the-shelf software applications these days, and every one of them is a potential data silo. At a high level, the options include Xero, QuickBooks and SageOne for accounting, and HubSpot, Freshworks and Salesforce for CRM. Most businesses also have some sort of line-of-business application; for example, a Short Term Insurance Broker might use Grail, Cardinal or Nimbus as policy administration software. Added to that, most businesses have a website or web store, depending on whether the site only advertises the business or also makes online sales.


The bane of manually updating data silos

These systems are designed to be used in isolation, meaning each comes with its own database. In practice, your data is dispersed among different systems, and a lot of manual effort is needed to keep these isolated systems in sync. I have regularly seen businesses give up on that effort and instead rely on one of these systems as the “master”, while the others are kept up to date only to the point where they serve a narrow business function. For example, let’s go back to the Short Term Insurance Broker. The broker might become totally reliant on their policy management software for the business’s day-to-day running. At the same time, their accounting system might have just enough client information to do the billing. So, what happens if a client changes some of their details? The broker would have to update that information in each of the critical systems by hand, which is a complete time-waster!

While manually syncing one or two systems might still be doable when the business is relatively small, it quickly becomes unsustainable once the business and, with it, the transaction volume grows. It also means that smart analysis of data becomes impossible.


What is smart analysis of data?

Big corporates have long valued the benefits of centralising their data, running simple queries on the data to provide them with valuable insights into the business, even giving them a competitive advantage.

Let’s look at another simple example to illustrate my point. Let’s say company XYZ manufactures widgets. They have an online store to sell the widgets, which has its own database. XYZ uses Pastel as their accounting system, and orders from the web store are sent via email to the accounting department, which checks the stock availability of the widgets and raises an invoice to the client. XYZ’s web store does not sync with Pastel’s database, so all product info displayed on the web store is only as up to date as the manual efforts of staff keep it. Did I mention that XYZ sells 15 000 different widgets?

Here we are already looking at two data silos: one database held by the web store and one held by Pastel. The process of keeping these two systems in sync already places an enormous administrative burden on staff. It also leads to mistakes, and product info on the web store often becomes out of date. Besides keeping the web store up to date, every sale from the web store requires a sequence of manual checks: Are the widgets bought by the client in stock? Are the client’s details correct in Pastel? And so on. Even though XYZ is trying to sync the systems, this is not really syncing of data but rather a one-way update. In our example, the client’s details will be updated in Pastel, but what if the client updates their details via phone or email? Is there a manual process in place to make sure the update happens in both Pastel and the web store? Should XYZ require CRM software in future, how will it be kept up to date, and will the difficulty of updating it outweigh the benefits of the CRM?

Although we have captured the sale of widgets on Pastel, and we can pull a report on how many widgets were sold (and hopefully to which clients), some valuable insights and the opportunity for smart data analysis are lost.  Some examples of unanswered questions:

  1. Did we get this sale due to a marketing campaign or some promotion we are running?
  2. Did the client look at some different widgets before deciding on the one to buy?
  3. Is this a recurring client or a new client, and if it is a new client, how do we add them to our efforts to turn new clients into recurring clients?
  4. At the end of the month, how many sales did we make via our different sales channels, i.e. the web store, telephone, email, walk-in?
  5. How many widgets that clients want are currently out of stock?
  6. How much stock do we need to keep for regular sales from the web store?

The list can go on.
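To make this concrete, here is a minimal sketch in Python of how questions 3 and 4 might be answered once sales from every channel land in one central table. The table name, columns and sample rows are assumptions made up for illustration; they do not come from Pastel or any real web store.

```python
import sqlite3

# Hypothetical central table: one row per sale, whichever silo it came from.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (client TEXT, channel TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales (client, channel, amount) VALUES (?, ?, ?)",
    [
        ("Acme", "web store", 120.0),
        ("Acme", "telephone", 80.0),
        ("Bolt", "web store", 50.0),
        ("Cogs", "walk-in", 30.0),
    ],
)


def sales_by_channel(conn):
    """Question 4: number of sales and total value per sales channel."""
    rows = conn.execute(
        "SELECT channel, COUNT(*), SUM(amount) FROM sales GROUP BY channel"
    )
    return {channel: (count, total) for channel, count, total in rows}


def recurring_clients(conn):
    """Question 3: clients who have bought more than once."""
    rows = conn.execute(
        "SELECT client FROM sales GROUP BY client "
        "HAVING COUNT(*) > 1 ORDER BY client"
    )
    return [client for (client,) in rows]


channel_report = sales_by_channel(conn)
repeat_buyers = recurring_clients(conn)
```

Neither query is sophisticated. The hard part is getting the data from every silo into one place so that queries like these become possible at all.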


The Solution to multiple Data Silos? Data Pipelines

The solution is quite simple. We build data pipelines to connect these silos. If Pastel and the web store in the above example were connected via a data pipeline, it would immediately remove the administrative burden on staff and free them up to focus on what is really important: customer service! Efforts could instead be focused on client engagement and delivering the product to the client. Since both systems are in sync, either of them can provide reporting, which is especially important if one of the systems does not provide sufficient reporting on its own. This will be the case more often than not, since each of these systems has a completely different focus.
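As a rough illustration of what such a pipeline does at its core, the sketch below keeps two record stores in step using a simple “last write wins” rule: whichever system holds the most recently updated copy of a record overwrites the other. Everything here is an assumption for the sake of the example. The two dictionaries merely stand in for the accounting system and the web store, and the `updated_at` field and client numbers are hypothetical. A real pipeline would read and write through each product’s own API or database and would need a more careful rule for conflicting edits.

```python
from datetime import datetime


def sync_last_write_wins(silo_a, silo_b):
    """Bring two record stores in line, keeping the newest copy of each record.

    Each store maps a shared key (for example a client number) to a dict
    containing an 'updated_at' datetime. Returns the keys that were changed.
    """
    changed = []
    for key in sorted(set(silo_a) | set(silo_b)):
        rec_a, rec_b = silo_a.get(key), silo_b.get(key)
        if rec_a is None:
            silo_a[key] = dict(rec_b)   # record exists only in B: copy to A
        elif rec_b is None:
            silo_b[key] = dict(rec_a)   # record exists only in A: copy to B
        elif rec_a["updated_at"] > rec_b["updated_at"]:
            silo_b[key] = dict(rec_a)   # A is newer: overwrite B
        elif rec_b["updated_at"] > rec_a["updated_at"]:
            silo_a[key] = dict(rec_b)   # B is newer: overwrite A
        else:
            continue                    # same timestamp: nothing to do
        changed.append(key)
    return changed


# Stand-ins for the two silos (hypothetical data).
accounting = {
    "C001": {"email": "old@example.com", "updated_at": datetime(2024, 1, 5)},
    "C002": {"email": "bolt@example.com", "updated_at": datetime(2024, 1, 9)},
}
web_store = {
    "C001": {"email": "new@example.com", "updated_at": datetime(2024, 2, 1)},
    "C003": {"email": "cogs@example.com", "updated_at": datetime(2024, 2, 3)},
}

first_run = sync_last_write_wins(accounting, web_store)
second_run = sync_last_write_wins(accounting, web_store)
```

After the first run both stores hold the same three clients, and the client who changed their email on the web store has the new address in the accounting data too. The second run finds nothing left to change, which is the behaviour you want from a pipeline that runs on a schedule.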


How can this be implemented?

Although the solution is simple, the implementation can be a lot more complex. Therefore, it is extremely important to understand upfront what end result the business requires. What will be valuable to the business, add to the bottom line and produce the data in a usable form? Having data in a database means nothing on its own. Providing the relevant departments with regular reports from that data, or keeping these disjointed software applications up to date, makes a lot more sense. To get this right, the end use should not be considered at the end of the process, but right from the start.


The Tools – The Infrastructure

Cloud service providers like Microsoft Azure and Amazon Web Services (AWS) offer many tools and infrastructure options tailored towards solving this exact problem. However, it is essential to have a partner that can guide you on what is possible and what choices to make. The options can be overwhelming, and without a specialist’s recommendations the wrong choices could lead to inefficient and very costly implementations.


IT in a Box specialises in providing data integration services. We first assess what the business needs and make recommendations based on what is possible. From this point, we work backwards and build integration services between existing systems to provide the required outcome.

Reach out for a no-obligation consultation to discuss your current integration issue.

Email or call 021 300 1495 for more information.

Keep up to date with our valuable advice and specials via our social channels