Developing a Data Room
Every business generates a lot of data and uses various tools for each dedicated area of its operations.
Each tool offers either API access or CSV export. The real problem lies in presenting that data and crafting answers to specific questions from it.
There are major players in the data visualization niche: Microsoft Power BI, Tableau, and Looker.
There are also multiple ETL providers that can connect to some or most systems and fetch data from them.
However, there is one issue none of those systems address: data ownership.
An ETL or dashboard visualization tool will transport your data and present it while you still use the same app, but what about backups? And what about data ownership when you need to migrate to another app or platform and keep data continuity in the form of Answers to your specific Questions?
In this blog post I want to zoom in on how Connector can help remove vendor lock-in and bring data ownership, while ultimately allowing deployment of a Dashboard with Answers.
As a quick reminder of what Connector does: it fetches data from various sources, either through APIs or by parsing XLSX files, CSV files, and Google Spreadsheets.
I also want to explore our journey of bringing together data from multiple sources, enriching it with supporting metadata maintained on top of it as a spreadsheet, and then presenting it to decision makers and stakeholders in a way that is nice and easy to consume.
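To make the enrichment step concrete, here is a minimal sketch of joining fetched records with a metadata spreadsheet on a shared key. It is an illustration under assumptions, not how Connector does it: the `enrich` helper, the `customer_id` key, and the sample rows are all hypothetical.

```python
def enrich(records, metadata, key):
    """Left-join source records with spreadsheet metadata on a shared key."""
    # Index metadata rows by key for constant-time lookup.
    lookup = {row[key]: row for row in metadata}
    enriched = []
    for record in records:
        extra = lookup.get(record[key], {})
        # Source fields win on conflict, so metadata never overwrites fetched values.
        enriched.append({**extra, **record})
    return enriched


# Hypothetical rows fetched from an accounting tool.
invoices = [
    {"customer_id": "C1", "amount": 1200.0},
    {"customer_id": "C2", "amount": 300.0},
]
# Hypothetical rows from the metadata spreadsheet.
metadata = [
    {"customer_id": "C1", "segment": "Enterprise", "owner": "Ann"},
]

rows = enrich(invoices, metadata, "customer_id")
```

Records without a matching metadata row pass through unchanged, so a gap in the spreadsheet never drops source data.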
The components we will need to ensure success:
- Good-looking and functional tables: with sorting, quick search, and the ability to expand rows and fetch more data about a specific line item
- Charts: simple, clean, and flat, as well as animated and more complex ones
- Widgets to assemble a dashboard for quick and easy consumption of the most important insights
Tables
I went to CodePen for inspiration and found a ton of cool-looking CSS for tables.
Table functionality: DataTables. It allows filtering, sorting, quick search, expanding a row, and quick inline editing.
Charts
We pick charting libraries that are open source rather than web-based third-party SaaS.
You might ask: why bother? Let me remind you that the main goal of every SaaS is to be sold to a bigger SaaS, and that goal is pushed further by VCs and investors who want their 10x or 100x return. So products are either acquired to become part of a bigger thing, or acquired to be killed.
From our perspective, open source will at least keep functioning, even if it is abandoned in the future.
There are a ton of JS charting libraries on GitHub, so we would pick one that works nicely without React, with just plain JS or jQuery.
For deeper and wilder chart visualizations we always have Apache ECharts.
Widgets
For inspiration I went to Radix UI, but it is heavily tied to React.
Then I asked a friend (he is, by the way, a great front-end developer specializing in dashboards and complex, advanced React implementations). He suggested taking a look at Tremor; they did a great job building a glossary of blocks and elements.
To finish up and fill any remaining gaps, we picked two themes on WrapBootstrap to get nice scaffolding and a base to put features into.
Where data comes from:
What is challenging in this type of project is that it is supposed to consolidate multiple data sources:
- QuickBooks
- Data from industry-vertical apps
- Fetching data
- Own data organized in the form of Google Spreadsheets (metadata)
- CSV files exported from other places
- PDF documents
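Consolidating sources like the ones listed above usually starts with mapping each source's field names onto one common shape. The sketch below shows that idea only; the field names in `FIELD_MAPS` and the `normalize` helper are hypothetical and are not taken from any real export.

```python
# Hypothetical per-source mappings: source field name -> common field name.
FIELD_MAPS = {
    "quickbooks": {"TxnDate": "date", "TotalAmt": "amount", "Id": "id"},
    "csv_export": {"Date": "date", "Total": "amount", "Ref": "id"},
}


def normalize(source, row):
    """Rename a source-specific row into the common record shape."""
    mapping = FIELD_MAPS[source]
    record = {common: row[original] for original, common in mapping.items()}
    # CSV values arrive as strings, so coerce amounts to a number.
    record["amount"] = float(record["amount"])
    # Keep track of where the record came from.
    record["source"] = source
    return record


csv_record = normalize("csv_export", {"Date": "2024-06-03", "Total": "300", "Ref": "row-7"})
qb_record = normalize("quickbooks", {"TxnDate": "2024-06-01", "TotalAmt": 1200.5, "Id": "INV-1"})
```

Keeping the mappings as plain data means adding a new source is a matter of adding one dictionary entry rather than new code.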
Let's start from the end - PDF documents:
Those are the tricky ones, because they can come as a scan of a printed document as well as a PDF with actual text in it. These documents are sent via email, live in eSigning platforms if nobody downloads them, or end up collecting dust inside Box.com or Dropbox folders, hard to discover and not searchable.
There is an initial temptation to assume that AI can simply parse those documents (and it is capable of it). The main thing is that the work needs to be split into steps and become a process.
There are some tools out there. For example, there are open source tools published on GitHub that will scrape and make sense of the files stored on your computer: https://github.com/iyaja/llama-fs
The approach I'm proposing is to tag the documents first, then organize them into "pre-filtered views", and then take it a step further by building abstracts.
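The three steps (tag, pre-filtered views, abstracts) can be sketched as a small pipeline. This is a rough illustration under assumptions: it presumes the text has already been extracted from each PDF (OCR and text extraction are out of scope here), the keyword rules in `TAG_RULES` are hypothetical, and `naive_abstract` is a stand-in for whatever summarization step is actually used.

```python
from collections import defaultdict

# Hypothetical tagging rules: tag name -> keywords that trigger it.
TAG_RULES = {
    "invoice": ["invoice", "amount due"],
    "contract": ["agreement", "signature"],
}


def tag_document(text):
    """Step 1: assign every tag whose keywords appear in the text."""
    lowered = text.lower()
    return sorted(
        tag for tag, keywords in TAG_RULES.items()
        if any(keyword in lowered for keyword in keywords)
    )


def build_views(documents):
    """Step 2: group document names by tag to get pre-filtered views."""
    views = defaultdict(list)
    for name, text in documents.items():
        for tag in tag_document(text):
            views[tag].append(name)
    return dict(views)


def naive_abstract(text, max_chars=80):
    """Step 3 placeholder: use the first sentence as the abstract."""
    first_sentence = text.strip().split(".")[0]
    return first_sentence[:max_chars]


# Hypothetical documents, keyed by file name, with already-extracted text.
documents = {
    "scan_001.pdf": "Invoice 1042. Amount due within 30 days.",
    "nda.pdf": "Mutual non-disclosure agreement. Signature required.",
}

views = build_views(documents)
abstracts = {name: naive_abstract(text) for name, text in documents.items()}
```

Splitting the work this way lets each step be checked and rerun independently, which is the point of turning parsing into a process.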
Fetching Data
I don't want to go deep into the ETL topic. There are hundreds of solutions out there, as well as what I call "soft ETL", where a connector outputs data from a tool into Google Spreadsheets. One example is Coefficient.
Those tools do not solve the issue. They basically move data from closed platforms into the "Google Drive world" of files, where it is easily lost and hard to manage, not to mention hard to build a bigger picture from and, most importantly, hard to refresh on a meaningful cycle such as weekly or monthly.
That's why it's important to have your own database: ultimately, a place where apps, people, and connectors send data, where that data is normalized, and from which it can be used either via a custom interface or through third-party visualization and internal-tools platforms like Retool.
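As a minimal sketch of what "own database" can mean in practice, the example below uses Python's built-in `sqlite3` module as a stand-in for whatever database is actually chosen. The `records` table, its columns, and the `open_store` and `upsert` helpers are assumptions made for illustration.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open the database and make sure the normalized table exists."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS records (
               source TEXT NOT NULL,
               external_id TEXT NOT NULL,
               record_date TEXT NOT NULL,
               amount REAL NOT NULL,
               PRIMARY KEY (source, external_id)
           )"""
    )
    return conn


def upsert(conn, source, rows):
    """Insert rows from one source, replacing any that were sent before."""
    conn.executemany(
        "INSERT OR REPLACE INTO records (source, external_id, record_date, amount) "
        "VALUES (?, ?, ?, ?)",
        [(source, row["id"], row["date"], float(row["amount"])) for row in rows],
    )
    conn.commit()


conn = open_store()
upsert(conn, "quickbooks", [{"id": "INV-1", "date": "2024-06-01", "amount": "1200.50"}])
upsert(conn, "csv_export", [{"id": "row-7", "date": "2024-06-03", "amount": 300}])
# Re-sending the same record on the next refresh replaces it instead of duplicating it.
upsert(conn, "quickbooks", [{"id": "INV-1", "date": "2024-06-01", "amount": "1250.00"}])

totals = dict(conn.execute("SELECT source, SUM(amount) FROM records GROUP BY source"))
row_count = conn.execute("SELECT COUNT(*) FROM records").fetchone()[0]
```

The composite primary key on source and external ID is what makes a weekly or monthly refresh safe to repeat: the same record sent twice updates in place.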