At almost every AI or machine learning conference I’ve been to lately, there’s a track dedicated to biases or “injustices” in algorithmic decisions. Books have been published (Automating Inequality: How High-Tech Tools Profile, Police, and Punish the Poor; Algorithms of Oppression: How Search Engines Reinforce Racism; etc.) and fear has been spread (Elon Musk says AI development should be better regulated, even at Tesla).

The fear of the unknown is, perhaps, more persuasive than a realistic survey of the state of AGI (Artificial General Intelligence) development. Admittedly, from the very beginning of my career, I despised those who lack the imagination to see how data and algorithms can improve the quality of human decisions. I have always believed that human intelligence could be drastically improved when augmented with the right information at the right time.

Data engineers rarely have a say in what comes into the systems we’ve built. This presents a great challenge: data systems often need to tolerate unseen events and, at the same time, have extra monitoring or QA processes that allow a human to determine whether an exception actually signals a broader system failure. Machine learning systems have brought this challenge to a new level. In data pipelines, system failures are mostly deterministic, or at least reproducible when certain conditions are met. The outputs of machine learning applications are stochastic: when an exception is raised, there are far more probable causes, anywhere from the data to the application, and the stochastic behavior does not make the investigation any easier.
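
To make the "tolerant but monitored" idea concrete, here is a minimal sketch in Python. It is only an illustration under assumptions of my own: the event types, the record shape (a dict with a `type` key), the names `process_batch` and `needs_human_review`, and the 5% threshold are all hypothetical and not taken from any particular system. Records with an unseen event type are set aside instead of crashing the batch, and the batch is flagged for a human only when the share of unseen events is high enough to suggest a broader failure.

```python
from collections import Counter
from dataclasses import dataclass, field

# Hypothetical set of event types the pipeline was built to handle.
KNOWN_EVENT_TYPES = {"click", "view", "purchase"}


@dataclass
class BatchReport:
    """Outcome of one batch: what was handled and what was set aside."""
    processed: list = field(default_factory=list)
    quarantined: list = field(default_factory=list)
    unseen_counts: Counter = field(default_factory=Counter)

    @property
    def unseen_rate(self) -> float:
        # Share of records in the batch that had an unseen event type.
        total = len(self.processed) + len(self.quarantined)
        return len(self.quarantined) / total if total else 0.0


def process_batch(records, known_types=KNOWN_EVENT_TYPES):
    """Tolerate unseen events: quarantine them instead of raising."""
    report = BatchReport()
    for record in records:
        event_type = record.get("type")  # None if the field is missing
        if event_type in known_types:
            report.processed.append(record)
        else:
            report.quarantined.append(record)
            report.unseen_counts[event_type] += 1
    return report


def needs_human_review(report, max_unseen_rate=0.05):
    """Flag the batch when unseen events exceed an assumed threshold."""
    return report.unseen_rate > max_unseen_rate
```

A batch with a handful of unfamiliar events passes through quietly, while a batch where a large share of records are unfamiliar gets escalated, and `unseen_counts` gives the reviewer a starting point for the investigation. The right threshold depends entirely on the system; the value here is a placeholder.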