I recently accepted Microsoft's offer and challenge to take up data science. There is too much data out there and too few people who know how to extract knowledge from it.
This was not empty rhetoric, as Microsoft went a step further and made available free online training that allows one to qualify as a data scientist.
I am currently on my fourth module, and so far, it is clear that there is a need for more ICT professionals to take up the study of data science, as it will soon be more important than an MBA.
As part of the exercises, Microsoft gives you access to huge data sets hosted on their Azure cloud platform. This allows you to appreciate the fact that you do not need huge in-house servers to take advantage of the big data sets.
The more I study, the more handicapped I feel as I cannot find relevant local data sets on which I can test my newly acquired skills. Data in Kenya has yet to be liberated, not only by the public sector but also private sector custodians. I call them custodians because the collection of the data was paid for by the public sector and there is a need to make it available to its rightful owners.
The Access to Information Act, 2016, which gives effect to Article 35 of the Constitution of Kenya, declares that the provision of information is a fundamental freedom enjoyable by all citizens, yet I am unable to get access to information such as the census data or even detailed KCSE and KCPE results.
You must be wondering why I put Safaricom in the title and not the government. Well, it is because Safaricom has more data about me than any other entity, so to me, they are the ones denying me the larger portion of my fundamental freedom.
Safaricom's empty rhetoric about its dedication to encouraging innovation is the other reason why I felt it prudent to target them instead of others who make no such assertions.
Safaricom knows how much I spend, where I spend it, and when I spend it. They know all my close contacts as well as casual contacts. They know where I have been, who I visited, and for how long. With the vibration sensor on my phone, they will soon be able to know what I did, when, where and with whom.
All that is my personal data. Throw in the new surveillance cameras, and they can now verify that it was me carrying out all those activities. It is annoying that we cannot make such basic data available. Imagine how much more difficult it will be to implement an electronic medical records (EMR) policy.
I am not looking to have them gagged or made to sign nondisclosure agreements, but only to have them release the data to me in a format from which I can extract knowledge, to make the new life they have provided me with more palatable.
They should also be able to anonymize the rest of the data and make it available to me so that I can extract even greater insights into what is going on around me. I should be able to find out whether I am being stalked or even when my activities have become routine so that I can change them to retain my sanity.
That is only on a personal scale. Imagine making the data available to allow the analysis of vehicular and foot traffic patterns that can assist the police who are tasked with controlling the flow within the cities. It would also be able to help us provide business opportunities using the data that Safaricom already uses to optimize locations of mPesa agents.
This is public data, and the first thing we need to do is include telecommunications transactions under the banking and insurance acts as they relate to retention of data. Safaricom is currently dumbing-down data vital to developing artificial intelligence, which is a critical part of the innovation to which they are paying lip service.
Safaricom, it is my constitutional and fundamental right to have access to ALL the data that you hold on me. This includes everything from which base transceiver station (BTS) I interacted with, when, and for how long, to the mPesa agent I use most frequently.
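To make the request concrete, here is a minimal sketch in Python of the kind of insight I could extract if such records were released. The record layout (a station identifier, a start time, a duration in seconds, and an optional mPesa agent number) is purely my assumption for illustration, as are the names `Record`, `time_per_bts` and `most_used_agent`; I do not know what format Safaricom actually stores or would export, and the sample rows are made up.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class Record:
    # Hypothetical fields; the real export format is unknown.
    bts_id: str                        # station the phone was attached to
    start: str                         # ISO 8601 timestamp
    duration_s: int                    # seconds attached to that station
    mpesa_agent: Optional[str] = None  # agent number, if a transaction occurred


def time_per_bts(records):
    """Total seconds spent attached to each station."""
    totals = defaultdict(int)
    for r in records:
        totals[r.bts_id] += r.duration_s
    return dict(totals)


def most_used_agent(records):
    """Agent that appears most often, or None if there are no transactions."""
    counts = Counter(r.mpesa_agent for r in records if r.mpesa_agent)
    return counts.most_common(1)[0][0] if counts else None


# Made-up sample data, not real records.
sample = [
    Record("BTS-001", "2018-03-01T08:00:00", 3600, "AGENT-17"),
    Record("BTS-002", "2018-03-01T09:00:00", 1800),
    Record("BTS-001", "2018-03-01T18:00:00", 7200, "AGENT-17"),
    Record("BTS-003", "2018-03-02T12:00:00", 600, "AGENT-42"),
]

print(time_per_bts(sample))
print(most_used_agent(sample))
```

Even two small summaries like these would tell me where I spend most of my time and which agent I depend on, which is exactly the kind of knowledge about myself that I am currently unable to extract.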
I look forward to you releasing my data so that I can continue with my data science endeavors utilizing relevant data to produce insights on myself.