Developing a Data Room

Every business generates a lot of data and uses various tools for each dedicated area of its operations.

Each tool offers either API access or CSV export. The real problem lies in presenting that data and crafting answers to specific questions from it.

There are major players in the data visualization niche: Microsoft Power BI, Tableau, and Looker.

There are also multiple ETL providers that can connect to some or most systems and fetch data from them.

However, there is one issue none of those systems address: data ownership.

An ETL or dashboard visualization tool will transport your data and present it while you still use the same app. But what about backups? And what about data ownership when you need to migrate to another app or platform and keep data continuity in the form of Answers to your specific Questions?

In this blog post I want to zoom in on how Connector can help remove vendor lock-in and bring data ownership, while ultimately allowing you to deploy a Dashboard with Answers.

As a quick reminder of what Connector does: it fetches data from various sources, either using APIs or parsing XLSX files, CSV files, and Google Spreadsheets.

In this blog post I want to explore our journey of bringing together data from multiple sources, enriching it with supporting metadata maintained on top of it as a spreadsheet, and then presenting it to decision makers and stakeholders in a way that is nice and easy to consume.
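To make the "enrich with metadata" step concrete, here is a minimal sketch in Python. It is not how Connector works internally; the sample data, the column names, and the helper functions `read_rows` and `enrich` are all assumptions made up for illustration. The idea is simply a left join: every exported row is kept, and fields from the metadata spreadsheet are added where the key matches.

```python
import csv
import io

# Hypothetical sample inputs: a CSV export from a tool and a metadata sheet
# (for example a Google Spreadsheet downloaded as CSV). Column names are assumptions.
EXPORT_CSV = """invoice_id,customer,amount
1001,Acme,250.00
1002,Globex,120.50
1003,Initech,75.25
"""

METADATA_CSV = """customer,segment,account_owner
Acme,Enterprise,Dana
Globex,SMB,Lee
"""


def read_rows(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


def enrich(rows, metadata, key):
    """Left-join metadata onto rows by `key`; unmatched rows get empty fields."""
    lookup = {m[key]: m for m in metadata}
    extra_fields = [f for f in (metadata[0].keys() if metadata else []) if f != key]
    enriched = []
    for row in rows:
        match = lookup.get(row[key], {})
        merged = dict(row)
        for field in extra_fields:
            merged[field] = match.get(field, "")
        enriched.append(merged)
    return enriched


rows = enrich(read_rows(EXPORT_CSV), read_rows(METADATA_CSV), key="customer")
for row in rows:
    print(row)
```

Rows without a match in the metadata sheet (Initech here) are kept with empty metadata fields, so gaps in the spreadsheet stay visible instead of silently dropping data.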

Components we will need to ensure success

Good-looking and functional Tables - with sorting, quick search, and the ability to expand rows and fetch more data about a specific line item
Charts - simple, clean and flat, as well as animated and more complex
Widgets - to assemble a dashboard for quick and easy consumption of the most important insights

Tables

I went to CodePen for inspiration and found a ton of cool-looking CSS for tables.

For table functionality we use DataTables: it allows filtering, sorting, quick search, expanding a row, and quick inline editing.

Charts

We are picking charting libraries that are open source rather than web-based third-party SaaS.

You might ask: why bother? Let me remind you that the main goal of a SaaS is to be sold to a bigger SaaS, and that goal is pushed further by VCs and investors who want their 10x or 100x return. So products are acquired either to become part of something bigger or to be killed.

Example: Chartio was acquired by Atlassian and then sunset.

From our perspective, an open source library will at least keep functioning even if it is abandoned in the future.

There are a ton of JS charting libraries on GitHub, so we will pick one that works nicely without React, with just plain JS or jQuery.

For deeper and wilder chart visualizations we always have Apache ECharts.

Widgets

For inspiration I went to Radix UI, but it is heavily React-based.

Then I asked a friend (he is, by the way, a great front-end developer specializing in dashboards and complex, advanced React implementations), and he suggested taking a look at Tremor. They did a great job building a catalog of blocks and elements.

To finish up and fill any remaining gaps, we picked two themes on WrapBootstrap to have nice scaffolding and a base for putting features into.

Where data comes from:

What is challenging in this type of project is that it is supposed to consolidate multiple data sources:

QuickBooks
Data from industry-vertical apps
Fetching data
Own data organized in the form of Google Spreadsheets (metadata)
CSV files exported from other places
PDF documents

Let's start from the end - PDF documents:

These are tricky because they can come as a scan of a printed document or as a PDF with actual text in it. Such documents are sent via email, live in e-signing platforms if not downloaded, or end up collecting dust inside Box.com or Dropbox folders, hard to discover and not searchable.

There is an initial temptation to let AI parse those documents (and it is capable of it), but the main thing is that the work needs to be split into steps and become a process.

There are some tools out there

Screenshot of an AI tool that allows chatting with documents

There are also open source tools published on GitHub that will scan and make sense of the files stored on your computer: https://github.com/iyaja/llama-fs

The approach I'm proposing is to tag the documents first, then organize them into "pre-filtered views", and then take it a step further by building abstracts.
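As a rough illustration of that tag, view, abstract sequence, here is a small Python sketch. It assumes the text has already been extracted from each PDF (by OCR for scans, or direct text extraction otherwise), which is not shown. The keyword rules in `TAG_RULES`, the file names, and the functions `tag_document`, `build_views`, and `make_abstract` are hypothetical; in a real process the tagging and abstract steps could be handed to an AI model, one step at a time.

```python
from collections import defaultdict

# Hypothetical keyword rules: tag -> phrases that suggest it. Assumptions only.
TAG_RULES = {
    "invoice": ["invoice", "amount due"],
    "contract": ["agreement", "signature"],
    "insurance": ["policy", "coverage"],
}


def tag_document(text):
    """Return the sorted tags whose keywords appear in the document text."""
    lowered = text.lower()
    return sorted(
        tag for tag, words in TAG_RULES.items()
        if any(word in lowered for word in words)
    )


def build_views(documents):
    """Group document names by tag; each tag acts as one pre-filtered view."""
    views = defaultdict(list)
    for name, text in documents.items():
        for tag in tag_document(text) or ["untagged"]:
            views[tag].append(name)
    return dict(views)


def make_abstract(text, max_words=12):
    """Placeholder abstract: the first few words of the text."""
    words = text.split()
    suffix = "..." if len(words) > max_words else ""
    return " ".join(words[:max_words]) + suffix


# Hypothetical documents with already-extracted text.
documents = {
    "a.pdf": "Invoice 1001. Amount due: 250.00 USD by the end of the month.",
    "b.pdf": "Service agreement between two parties. Signature required on page 3.",
    "c.pdf": "Meeting notes from Tuesday.",
}

views = build_views(documents)
for tag, names in views.items():
    print(tag, names)
```

Keeping the three steps as separate functions means each one can be replaced independently, for example swapping the keyword rules for a model-based tagger without touching the views.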

Fetching Data

I don't want to go deep into the ETL topic. There are hundreds of solutions out there, as well as what I call "soft ETL", where a connector outputs data from a tool into Google Spreadsheets. One example is Coefficient.

Those tools do not solve the issue: they basically move data from closed platforms into the "Google Drive world" of files, where it is easily lost and hard to manage, not to mention how hard it is to build a bigger picture out of it and, most importantly, to refresh the underlying data on a meaningful cycle, weekly or monthly.

That's why it is important to have your own database: ultimately, a place where apps, people, and connectors send data, where that data is normalized, and from which it can be used either via a custom interface or through third-party visualization and internal-tools platforms like Retool.
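A minimal sketch of what "own database" could look like, using Python's built-in `sqlite3` module. The table layout, the source names, and the raw field names (`Id`, `CustomerName`, `TotalAmt`, and so on) are assumptions for illustration only; they are not the actual QuickBooks API schema or Connector's storage format. The point is that every source is mapped onto the same columns, and repeated fetches refresh a record instead of duplicating it.

```python
import sqlite3

# Hypothetical normalized schema: every source lands in one table.
SCHEMA = """
CREATE TABLE IF NOT EXISTS records (
    source      TEXT NOT NULL,   -- e.g. 'quickbooks', 'csv_export'
    external_id TEXT NOT NULL,   -- the record's id in the source system
    name        TEXT NOT NULL,
    amount      REAL,
    fetched_at  TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
    PRIMARY KEY (source, external_id)
)
"""


def normalize(source, raw):
    """Map a source-specific dict onto the shared columns (field names assumed)."""
    if source == "quickbooks":
        return (source, str(raw["Id"]), raw["CustomerName"], float(raw["TotalAmt"]))
    if source == "csv_export":
        return (source, raw["invoice_id"], raw["customer"], float(raw["amount"]))
    raise ValueError(f"unknown source: {source}")


def upsert(conn, source, raw):
    """Insert or refresh one record so repeated fetches do not duplicate rows."""
    conn.execute(
        "INSERT OR REPLACE INTO records (source, external_id, name, amount) "
        "VALUES (?, ?, ?, ?)",
        normalize(source, raw),
    )


conn = sqlite3.connect(":memory:")  # in-memory database, just for the sketch
conn.execute(SCHEMA)
upsert(conn, "quickbooks", {"Id": 7, "CustomerName": "Acme", "TotalAmt": "250.00"})
upsert(conn, "csv_export", {"invoice_id": "1002", "customer": "Globex", "amount": "120.50"})
# A later refresh of the same record replaces the earlier row.
upsert(conn, "csv_export", {"invoice_id": "1002", "customer": "Globex", "amount": "130.00"})
conn.commit()

total = conn.execute("SELECT COUNT(*), SUM(amount) FROM records").fetchone()
print(total)
```

Because the primary key is the pair (source, external_id), a weekly or monthly refresh can rerun the same fetch safely, and a dashboard or a tool like Retool only ever has to query one table.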