DATA : FINDING THE NEEDLE IN THE HAYSTACK
It must have happened to you. It certainly happens to me every once in a while. I need to find something urgently and even have no clue where to start searching. Typically a spare part of one or another device. A few days ago it was the coffee maker . I remembered buying a couple of seals a long time ago and storing them in secure place where they surely wouldn’t get lost… I obviously completely forgot about the spares and where I stored them. And suddenly, me dying for a nice cup of coffee, the device started leaking! I needed the seal and I needed it NOW!
I remembered having bought the spares. However, I could have stored them in or outside the house. Or did I accidentally throw them away when I did the last spring cleaning? It looked like searching a needle in a haystack. Or worse, it looked like searching in multiple haystacks with possibly no needle to find….I just didn’t know.
And funnily enough, very often, when I start searching for one thing, 99 times out of 100 I find another that I did not even miss….but that could and would address other (less urgent) needs. But guess what happens when I need that other thing....indeed!
You can imagine the attractiveness, at such crisis moment, of having a well organized warehouse where I could go and find every little piece I need, when I need it, right away…However, how much time and effort would it take to set up this warehouse and to properly store every single item in it, given the fact that house, garage, summer cottage and any other property after a few decades, contain quite some stuff… The warehouse idea may sound really appealing, the idea of having to get everything properly organized however, sounds horrific, and not really a big help for my urgent need to find the spare that I need NOW.
This little story could be a metaphor for the challenge that banks and large institutions in general are facing. Whereas my needle in the haystack was a simple spare part, their needles are single data items , hidden in tens if not hundreds or even thousands of haystacks that can be anywhere in the world, in their own systems and data centers or somewhere in “The Cloud”. Every second of the day there is an urgent need to find something : a client request, a mobile app, a regulatory report, a management report….
Every request triggers a massive search for a different needle in a different haystack or for multiple needles in multiple haystacks or for masses of needles and masses of copies, possibly in slightly different formats so that the needles may look differently…It sounds like an impossible thing to do….and that is exactly what we are being told by those institutions when it comes to regulatory compliance and digitalization.
Returning to my own search for the seal, how nice would it be if someone came along who told me to relax, to sit back and to wait for something “magic” to happen. That “magic” would be some sort of a robot that , in a minimum lapse of time (say a few minutes) finds my desperately needed seal. The robot would be “told” to go and visit my house, my garage and any of my properties. It would be ordered to gather all information about literally everything there is to find in each of the places. This information would be presented to me in a format that enables me to search for whateverI may need, based on the information that I have to identify it, fairly easy for a coffee maker seal. This robot would not only instantly find the one particular seal, the needle in the haystack that I am looking for, but also every other spare that is likely to conform with my identification, call it my search criteria. Sounds like a fairy tale, doesn’t it?
And what if a similar robot-like thing would become available to a bank? Imagine a software that just needs to be told where to go and find all data sources used by the bank. Then it connects to every single source it is given access to.
Once the software has access to the data, just like my imaginary robot, it starts collecting and building all knowledge about the data sources and about the data in those sources. On top op of that it compares all data items in all the sources, all the needles in all the haystacks, and identifies the relationships. At the end of the exercise, all the knowledge is made readily available for anybody (man or machine) searching for any needle in any haystack. Sounds like a fairy tale, doesn’t it?
Imagine the impact of such a piece of technology :
GDPR becomes a piece of cake, as any hidden data gets surfaced.Data inconsistencies become apparent and can be quickly solved.A data warehouse becomes a “nice-to-have” , rather than an “urgent” need.Unnecessary copies of data and possibly whole systems can be removed, without unexpected and unwanted consequences.AI and ML algorithms are fed with clean and relevant data onlyYou become the master of your own data!
You may think that finding such a software is as unlikely as finding a needle in a haystack that may not even exist….
Well, I still invite you to start looking for it, finding this one might be much easier than finding the seal for my coffee maker.