
The Near and Medium Term Risks of ChatGPT and Other Similar Developments

Like many other people before me, I will opine on the new developments in the realm of “AI”. Namely ChatGPT, Dall-E, Midjourney, etc., which are able to generate text, image, video or sound outputs given some sort of text input. These systems have been showing very interesting results that at face value seem to support the idea that this is some kind of intelligent crafting that can get reliable results.

There are, though, two big initial problems that the news hype often fails to address:

  • First, the data being used to train the underlying statistical generative model is not necessarily licensed for this use. And there are big legal questions about who owns an output that is in fact a statistical representation of copyrighted material, or even a full copy of the original.
  • Second, these are statistical generative models and depend on the work of engineers and analysts to develop and evolve the model to get accurate results. The problem is, you can get 90% right, but the remaining 10% will be hell on earth to get right.

There has nonetheless been a large advance over previous experiments with generative models, especially in the way ChatGPT can generate readable and mostly grammatically correct text. It also does not generate annoying or even insulting text after several cycles of interaction with internet netizens or social media trolls, a fate that cancelled previous “AI” chat bots, which eventually degenerated into complete screaming sociopaths. This means the researchers were at least able to tame some aspects of the generative model so that it does not quickly become offensive. And that is an important aspect if it is to be used in the corporate world.

On another note, I think that companies like Google or Facebook avoided launching something similar to ChatGPT until recently because the results from these generative models can fail in surprising and very non-deterministic ways, which cannot be managed with some meetings and a pep talk. There are risks, especially if the model offends people or gives wrong results, that might cause liabilities and reputation loss.

But for all the buzz and promise, there are aspects that will distinguish generated results from something done by a human being.

Training the Models and Trained Models

The first big aspect of these generative models is the large amount of data they require for training in order to get generally accurate results. It is not the same as with humans or even other animals, who require just a limited number of examples to get things mostly right. For example, a young child will be able to infer what a cat or a dog is quite quickly from examples that are in some cases drawings with minimal detail.

On the other hand, you can supply a generative model with thousands of examples, and sometimes it will still say that a dog is a cat. That means a lot of work has to go into fiddling with the parameters of the model, and into acquiring datasets that are well categorized and big enough for the generated results to be accurate enough.
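To make that data hunger concrete, here is a minimal sketch, assuming scikit-learn and its small bundled digits dataset as a stand-in for a real image task, of how a simple classifier's accuracy typically climbs only as the labelled training set grows, where a child would need a handful of examples:

```python
# Minimal sketch: test accuracy as a function of training-set size.
# Uses scikit-learn's small bundled digits dataset as a stand-in.
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

# Train the same model on progressively larger slices of the data.
for n in (20, 100, 500, len(X_train)):
    model = LogisticRegression(max_iter=5000)
    model.fit(X_train[:n], y_train[:n])
    acc = accuracy_score(y_test, model.predict(X_test))
    print(f"{n:5d} training examples -> test accuracy {acc:.2f}")
```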

But there are limits, and these limits are often not trivial, representing aspects that even a person will struggle with. As an example I will use Dall-E and Midjourney, which are image generation models.

These have become a new source of YouTube videos, such as “Mexican Star Wars”, something-or-other as a 1980s TV sitcom, and so on, where all kinds of mash-ups are used to get surprising results. It means your 9-year-old self can finally see all of those imagined universes that were mixing in your mind at the time.

The generated images might be interesting, but they often show telling imperfections:

  • After a while, even when the theme is modified slightly, the same general image pattern starts to predominate. That means the same face, the same posture, a very similar expression, with just some changes of motif and theme.
  • More than one iris per eye, eyes pointing in odd directions, or gazes that do not conform to usual eye positions.
  • Hands with more than five fingers, odd hand postures, even multiple hands in different postures, or an amalgam of hands merged together. The same can be said of leg postures and feet.

For the first problem, a lack of examples in the dataset that fall into the input categories means the output will start to converge on very similar outcomes. In some cases, given the right inputs, it will give you practically the original image from the training dataset. The model will also converge on dominant features within the dataset for particular categories: if you only have images of people in the same body posture, that posture will be replicated time and again, and will even merge with other kinds of postures when the model is unable to tell them apart, producing very odd results. In some cases these dominant features appear as artefacts in the image, like a ghost on double-exposed film.

The other problem comes when we need to generate more granular structures in the image, structures with more degrees of freedom of movement that either aren’t properly categorized or whose specific forms of variation the model does not account for. This is the case with hands, fingers, leg postures, knees, feet, teeth and eyes: features that require detail. Hands, limbs and feet are not easy even for artists to master. Because the model doesn’t know what a posture is, it fails to detect the direction of limbs and fingers, so it tends to merge features it considers similar.

These models don’t compartmentalize aspects that need to be handled separately when generating the image. They seem to be optimized for the 90% of the image that is well characterized and common enough to fit well within the training model’s verification: faces, hair, close-up pictures, and any feature with fewer degrees of freedom of movement.

The fact is that these generative models are very dependent on the kind of categorization, the size of the dataset, and the complexity of the parameters used in the model. That, in turn, largely determines where they generally work well and where they fail to produce accurate results.

ChatGPT the Killer App

At the moment, Microsoft is using the buzz around the integration of ChatGPT into its Bing search engine to grow Bing’s pitiful market share. It is probably too early to tell whether that will get the results they are expecting, but it will be a good outcome if we start getting good answers from search engines for a change.

The current state of Google, with its massive drive for ad monetization and excessive SEO manipulation, has led to an ever greater loss of relevance in its search results. I remember that ten years ago I used to get very good results from my search queries, but lately it is just SEO drivel and ads, so anything that cuts through the noise will be a big improvement. Although once Bing’s market share reaches a certain level, I bet Microsoft managers will find some way to nerf the ChatGPT model to deliver extra ads and extra paid content that is not relevant to the user.

The generative model behind ChatGPT offers many opportunities to quickly generate essays, articles and even computer code. It has the advantage of a large semantic model trained on a large dataset. Text sources also have the advantage that they can be more easily categorized, which enables the generation of various forms of text content on various subjects. But again, that doesn’t mean the generated text will be completely accurate.

This accuracy problem becomes more visible the more the user knows about the domain, and when the generative model cannot differentiate between domains, or some convergent feature in the data ends up washing out the relevance of a domain-specific piece of information.

So you can probably ask ChatGPT to talk about Cerenkov Radiation in the style of Rodney Dangerfield, and you might get some laughs in the process. But once you start trying to generate articles or essays on more specific subjects and themes, you might find that you are getting well-written garbage.

Worst of all is when you actually get something that is broadly correct but has errors in particular details, or that lacks context fundamental to understanding the subject of the text. So there is a risk that ChatGPT will generate superficial text that fails to inform the user of important nuances, and gives an incorrect sense of understanding of a subject. This is a time when laziness will be extremely rewarded, but at what cost…

Hey, how about this Cerenkov Radiation, huh? I tell you, it gets no respect. You got all these particles zooming around faster than the speed of light, and what do they get for it? A measly little blue glow. No respect, I tell ya.

I mean, come on, this is some wild stuff. We're talking about particles going so fast that they break the cosmic speed limit, and what do they get in return? A little bit of light. It's like, "Hey, thanks for violating the laws of physics, here's a blue glow as your reward."

And don't even get me started on how this stuff was discovered. I mean, it took a guy named Pavel Cerenkov to figure this out. Can you imagine that conversation? "Hey, I discovered a new type of radiation. It's blue and it happens when particles go faster than light." "Uh, yeah, sure you did, Pavel."

But you know what? Despite all the lack of respect, Cerenkov Radiation is actually pretty cool. It's used in all sorts of scientific instruments, from particle detectors to radiation therapy machines. And hey, maybe one day we'll even figure out a way to harness this blue glow for something really useful, like a new kind of energy source or something. Who knows? The possibilities are endless.

ChatGPT Is Going For Your Job

In the short term there are two professions most at risk from ChatGPT: journalists and people who work in customer service. The first, because these generative models can produce quick, short articles full of SEO terms, and in time will probably also generate longer articles from some seed material like a newsletter, a police report, or a PR document. Writing articles for news websites is already a cut-throat business, and the people writing them are often forced to produce many at a very quick pace, hence the downward quality trend that troubles most of the news these days.

The result is that we might reach a day when the initial news item is generated by a ChatGPT-like bot, and other bots then process and regurgitate some form of the article that fits the brand of each news outlet. This can have dire results if the information is not correctly vetted, checked and edited, as it can produce rumours, lies and propaganda at an industrial scale.

For call centres and customer service, the threat is the complete automation of first-tier voice call representatives, especially in cases where people aren’t familiar with the logic of the automated call scripts and require a voice operator. At this stage voice synthesizers are more than good enough; the issue is rather voice recognition, since voice patterns vary wildly depending on age, gender, health condition and whether one is a native speaker, although on that last point this technology may also enable handling calls in someone’s native language. From the point of view of the call centre operators, the issue is how to keep the chat bot from going off the rails and becoming offensive, or, just as important, from being exploited by a caller into doing something it is not supposed to do.

Using this technology for outbound sales calls presents its own series of problems, because the bots might be trained to be manipulative and to adjust their sales strategy to the emotions of the targeted person. The regulation of psychologically manipulative practices by automated bots may need to happen urgently, and the practice needs to be controlled; otherwise we might see an industry with already low ethical standards become a threat to society’s welfare.

Artists in the Line of Fire

Artists have a lot to lose with systems like ChatGPT. The first problem for writers, graphic artists, painters and film makers is that these systems will open their fields to a flood of newcomers who will over-saturate the market for content: people who will in practice be nothing more than artistic frauds, at most producers, because their part of the work is giving directions to the generative model and making corrections to the output here and there. Even then, the generated results are nothing more than a munching-up of the work of others, since that is how the model was trained in the first place. There will be little originality; we will all be bombarded with mostly mash-ups and derivative content, and occasionally something that merges aspects of several works and resonates culturally. At the same time it is all very dodgy in terms of copyright, since the datasets used to train the models might be full of copyrighted material.

Actors and voice actors might also have more trouble finding work, since artificially generated versions will be cheaper and more pliant to the wishes of directors and producers. It is quite possible that film sets become mere green rooms, with stand-in actors in costume used to allow the projection of any version of a virtual actor. In some cases this might even mean the appearance of dead actors in current productions: suddenly Bruce Lee is alive and kicking on a streaming service.

The risk is that the production of any type of content will become stale: first because of the audience’s tendency towards nostalgia, and sometimes a very unhealthy emotional attachment to the initial versions of characters and stories; second because companies will make risk-averse decisions that prioritize derivative content with existing market recognition.

Again, I will stress that this will make it more difficult for artists to be recognized amid all the noise of generated content. It will impoverish artists, and it will increase the market power of big corporations.

Coders: From Talent to Replaceable

Since ChatGPT started to get buzz in the media, examples of it generating code have been reported and have appeared in social media posts. This type of thing is not new; several big companies have been trying to automate the work of developers. Occasionally a piece of news appears saying that some company got a system working that generates code and corrects bugs, and then after a while nothing more is heard. Either the company got protective of the technology, or the bot went off the rails and was more trouble than it was worth.

From a company’s point of view, being able to replace expensive developers would be a plus, and the work would probably be done faster… But… sometimes one has to be careful what one wishes for. And yes, ChatGPT can be helpful in getting answers to some problems, or finding examples or boilerplate code when search engines have failed. But coding isn’t just getting some code that answers a specific question; it is very often a large integration job that has to deal with existing code bases. And there are already examples of ChatGPT generating code that is not fit for purpose, although at first glance it seems correct.
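As a hypothetical illustration (not an actual ChatGPT transcript), this is the kind of snippet that reads plausibly and passes a quick glance, yet silently breaks its own contract:

```python
# Hypothetical example of plausible-looking but subtly wrong code,
# of the kind a generative model can produce with full confidence.
def dedupe_keep_order(items):
    """Remove duplicates while keeping the original order."""
    return list(set(items))  # BUG: set() does not preserve order

print(dedupe_keep_order(["b", "a", "b", "c"]))
# Might print ['a', 'b', 'c'] or any other order; the docstring's
# promise is broken, and only proofreading or a real test catches it.
```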

Because of this, it forces the person to proofread the code and verify it, which adds its own risks. From my point of view this will be a boon for frauds and for people who are barely passable as coders. This profession is full of entry code tests and all kinds of code challenges that ask people to solve problems irrelevant to most day-to-day professional life, which tend to benefit either people who have barely left university or people who train relentlessly on those tests.

This means that situations like one I had to endure in the past will become more common: a former co-worker would write mostly boilerplate code, fake his tests so that they passed, never implement the business logic, then declare himself finished and move on to another project. I was stuck solving the bugs afterwards, so… ChatGPT won’t solve favouritism inside companies.

I don’t think there is anything wrong with using ChatGPT to find a solution to a problem or a piece of information. I use search engines for that as well, although the quality of the results is getting poorer and poorer. But asking questions is one thing; being unable to think is another. In my view there are three general skills that are usually lacking in coders on the market:

  • Proofreading code, and being able to read other people’s code (or hell on earth).
  • Capacity to think without making too many dogmatic assumptions.
  • Capacity to understand several domains of business and knowledge.

The bigger risk for developers is that ChatGPT will increase the competition for the available “good jobs”, decreasing wages and increasing workloads, since barely passable coders can fake it on the simplest problems, or on those where the generative model works best. At the same time this will mean a ramp-up of stupid code tests, where one applies for a role developing on Amazon AWS and is asked to solve some idiotic sorting-algorithm challenge in under 30 minutes. News flash: data structures and algorithms are only important in university and in barely 1% of the jobs on the market.

The result of this higher competition will be an erosion of skills, and the offloading of an increasing share of the codebase to the generative model. This will change the nature of bugs and will affect the lead times between their discovery and resolution.

In Conclusion

The introduction of systems like ChatGPT can have very disruptive effects on the jobs market; if not done well, it might even mean the collapse of most states’ fiscal base, since corporations that barely pay any tax will take a bigger share of the pie, while the people who work for a living, and who are responsible for most of the tax receipts, lose market power, lose their jobs or see lower wages.

It is also very important to address the fact that the training datasets contain copyrighted material: authors need the right to opt out of these datasets, and special licenses are needed so that they are paid for the use of their work in this way.

The other aspect is the centralization of this technology, since these systems will be housed in large data centers in the cloud and billed as a service, which means cloud providers will rake in enormous amounts of money simply by having monopoly power. These systems aren’t sold as an appliance, or as a program to install and run locally on your machine. In that case I could see a very useful application: an artist who needs to automate part of their work could feed their own examples to the model and let it do the most tedious work in their style.

Because… once a system is in the cloud, it is not yours. You don’t really have control, and… even with assurances, sooner or later some policy change in the platform will allow it to peek at and copy your work.


Are China’s Bets in AI Poised to Topple US Dominance?

AI is becoming a talking point when it comes to great-power competition, although the way it could affect the world’s balance of power is nebulous at best. AI seems to promise everything to everyone, but most developments are still far in the future. That doesn’t deter interested parties, commentators, writers and media outlets from drumming up the next big threat to the status quo.

I remember similar noises and similar patterns in the 1980s; that time it was fear of Japan’s economic dominance: the big threat of the fifth-generation computer initiative, the threat of Japanese export manufacturers and their take-over of US assets. Things got quite farcical when Bush Senior became indisposed at an official banquet with the Japanese Prime Minister and officials; the media had a field day. Events, though, conspired to void the dire predictions of impending US doom: the burst of Japan’s Bubble Economy had severe effects that are felt to this day, and it made the writers proclaiming incoming Japanese superiority a bit more humble in their claims.

Nowadays it is China, and instead of new computing hardware (since most of it is actually produced in China), AI is the next big threat. Chinese officials are making inroads with policies that either help local companies invest in AI and Machine Learning, or invest directly in these technologies for use by the State. US and some other Western media outlets have taken this as a big push by the Chinese state to get ahead in the global pecking order by using this new technology.

Even Chinese commentators and insiders like Kai-Fu Lee, who wrote the book “AI Superpowers: China, Silicon Valley and the New World Order”, holds a PhD in AI, has worked for Apple, Microsoft and Google, and runs a Chinese venture capital firm, think that China has a big advantage in this field. For him, the large pools of talent, capital and data, together with friendly regulation, are what give China the edge to compete with Silicon Valley.

I am a bit skeptical about both parties’ claims. The Chinese people clearly have an abundance of talented and creative individuals; many work in the US as researchers and entrepreneurs. And China has a large pool of big companies and rich individuals who can invest in developing promising AI applications. But the issue is not the individual or collective talent pools; it is the capability of institutions to cope with the challenges, how they focus their efforts, and the scope of integrating these innovations within the host society. These are challenges common to all societies actively developing AI and ML technologies and applications.

First of all, AI and ML systems are not at this point general problem solvers. They are specialized systems that learn from data according to a goal set by a modeller, and they need a trainer, which can be a human or another program. Initially this was limited to the classification of textual data, but the focus has been widening: Google is now able to identify features in images and describe them in text, and there are AI programs that can capture video streams and sync the video of one person with the voice of another, creating a deep fake.

The tools and techniques developed so far can be used for more than classification problems: signal processing, systems control, holding a text conversation with a chat bot or, as Google demonstrated, setting appointments through a phone call. But these are still highly specialized systems, in most cases lacking a high degree of inter-system integration. So far, only Google is showing clear signs of integrating several of these AI subsystems to provide services on the fly for things other than digital assistants.

I am honestly puzzled by the concern of American politicians over China’s bet on AI, although I can understand the concerns of US tech leaders, since it means direct competition for their businesses and the potential loss of opportunities to expand into the Chinese market and the wider East Asian markets. It is no great risk to predict that the AI advances of Western tech companies will be matched by Chinese tech giants; it also opens up the scenario of Chinese tech companies competing directly in Western markets if key advances in language translation technologies are made.

Now, when it comes to state-to-state competition, I do find some issues that aren’t straightforward and aren’t usually accounted for. Yes, China has made large advances in economic development, it has grown at a staggering pace, and in many industrial sectors it is the dominant manufacturer. Large parts of supply chains are so deeply integrated in the south of China that it is now impossible for any Western country to establish similar industries without large economic protection measures. But at the same time, Chinese institutions have been slow to change their institutional frameworks.

Chinese policy makers have been good at the art of catch-up: using economic incentives and direct or indirect subsidies, actively working on knowledge transfer to private and state companies, and providing capital and facilities for Chinese businesses to grow and expand. This can be done with policies that are highly centralized in the overall setting of objectives; some leeway was and is allowed to regional and local officials, but when the central authorities flex their muscles everyone knows which way the buttered side of the toast will fall. In other words, China does not rely on a network of independent institutions coordinating among themselves to adapt to internal and external challenges, but on a heavily centralized institutional framework that provides overall coordination to society at large.

In this there is a risk that cannot be fully calculated a priori: if the political centre is capable of meeting the challenges and responding with skill, the upside can be big and the downsides can be mitigated. If the centre cannot rise to the challenges, there is a large potential downside for which it cannot share the responsibility.

I will now group some of the possible areas where AI and ML can be exploited by the Chinese state and Chinese companies.

Panopticon

China has a large internal surveillance apparatus that generates an enormous amount of information from phone data, video surveillance feeds and all sorts of other electronic networks and agents in the field. In general, the authorities seem to try to censor and control any flare-up of discontent, any critique of the authorities, and anything that could be considered deviant or subversive.

The authorities already exercise some forms of control through social network monitoring and facial recognition; it is not a big jump to think that AI could be used to predict whether someone is a possible dissident or protester. There are all sorts of other possible uses, like predicting areas where discontent might flare up, identifying protest ringleaders, identifying covert sympathizers, etc.

China is not the only party interested in this; some democracies and private companies also have a keen interest in these kinds of surveillance technologies. In some cases authorities are already using AI/ML systems to make law-enforcement decisions: in the US, police forces and judges use a system to calculate the probability that a defendant will flee judgement after making bail. The system was recorded failing when a black 16-year-old girl with no priors had to spend two days in jail awaiting judgement on a petty theft charge, while a white ex-convict with several priors was let out on bail for a similar charge, only to be arrested weeks later for a more serious crime.

The risk of using AI techniques for policing, surveillance and law enforcement is that the authorities will blindly believe what the algorithm tells them, creating all sorts of injustices and chaotic feedback loops that can make things worse. On the other side, this technology can be used to insulate political and business elites from the rest of society, shielding them from dissent and critique. The danger is that they might come to think themselves impervious and invulnerable, and that kind of hubris sooner or later gets its reckoning.

Manufacturing & Services

AI and ML applications can be a large boon for industrial companies in areas like quality control, adaptive control systems for robotics, adaptive design of machine parts and structural elements, and intelligent logistics systems for warehouses. They promise to automate large parts of the industrial complex with more flexible machines and automated logistics, which can be a clear economic advantage in the marketplace.

In the service industries, AI/ML promises to automate all sorts of clerical tasks, enable better marketing tools and more precise targeting of consumers, and create all sorts of support bots that can be productivity multipliers for the workforce. It can also provide new services to the current users of technology platforms, including automated language services that open the way to doing business in countries where cultural and language barriers would otherwise prevent it.

The negatives here concern possible impacts on the number and quality of jobs that remain available to humans; this is a common concern and isn’t exclusive to China. I would guess it also concerns the authorities, especially if the impacts were severe enough to carry a high probability of social and labour unrest. Authorities might treat this as mostly a policing issue, or decide it requires establishing some kind of comprehensive welfare state to keep social peace. It is too soon to know where the chips will ultimately fall, and I hope that China’s government, and other governments around the world, are wise in making the right policy choices.

Dual Use

Here I group AI/ML technologies that might have both civil and military uses: sensor technologies and sensor information integration that can give more effective radar and sonar systems, or more effective CAT scan and MRI equipment. There is also the possibility of AI on a chip or chipset instead of in a large data center, which would open a large new area of applications, making it possible to have physical bots that interact with people to provide services or act as auxiliary tools, but with clear military applications as well.

The risks are associated with the possibility that these AI/ML systems have unrecognized holes in their construction or training, which can make them commit serious errors or start hallucinating (yes, AIs can hallucinate; Deep Blue hallucinated chess pieces on the board in its first tournament against Garry Kasparov).

Military Use

AI/ML can provide tools to analyse strategic and tactical operations in conflict situations, offering considerations as a support tool. But there will be limitations: combat operations have far more degrees of variation than chess or Go, and finding data that correctly models war situations is not a trivial task.

In weapons technology, AI on a chip can provide a cheap way to retrofit older systems with new capabilities or increase their operational utility. For example, an older Silkworm subsonic missile could be made more effective: able to identify possible targets after being fired, and capable of more advanced manoeuvring and counter-measures. This could create situations where the point defences of opposing naval vessels are saturated, making them more vulnerable to attack. For these kinds of retrofitted autonomous weapon systems the AIs would only need to be as smart as a pigeon or a parrot (during WWII and for some years after, the US explored using pigeons to guide missiles onto target ships).

Autonomous AI weapons can also be built from the ground up. These represent a larger danger for civilian populations on both sides; it is not clear whether such weapons can distinguish their own civilians from opposing forces. These systems could provide advanced area-denial capabilities similar to minefields but with more dynamic responses, or be used as autonomous swarms that attack and harass opposing forces, either to beat them by attrition or to keep them pinned down.

These kinds of systems are being developed by the US and other countries, and for countries that lack the US’s superiority in space communications there is a large incentive to have such autonomous systems in their arsenal. The questions posed about command and control, and about vulnerability to hacking, have not been thoroughly thought out. The risk of unleashing into the wild systems that could kill not only enemy forces but also friendly forces and civilians is great.

Conclusion

It is too soon to tell whether AI/ML developments will have any significant impact on the China-US relationship; it may be that both remain at parity, or with an edge so small it confers no significant advantage. To me it seems that AI is not the problem for now, and there are other things in the present that need more attention.

I am also of the opinion that issues of strategic competition between the US and China will need to be resolved through negotiation and mutual understanding, in ways that satisfy the needs of both societies. Otherwise the future promises to be quite grim.


Machine Learning Road To Disappointment

From the media hype around AI and ML (Machine Learning), you would think common usage is just around the corner, with promises of fantastic results in terms of productivity and dire news for human employment. Although some of this might be partially true, I believe that for the most part the promise is being oversold and that a partial disappointment is on the cards. I say partial because many of the techniques developed from the 1970s onwards have either matured, or the conditions are now present to make their use viable; but more is needed besides computing power and lots of available data.

I will talk here mostly about ML, not about AI-related techniques, because many of these techniques have been taught in universities for decades yet were seldom applied in the context of private enterprise. Heck, even tried and proven statistical methods are rarely applied today, so it is no surprise that ML had little traction. In any case, ML was considered to be mostly the realm of ivory-tower academic research, full of dreams and lots of promises but very few practical results (not that the results weren’t sometimes impressive). There was the difficulty of translating research into applications, and distrust on the part of decision makers about the viability of these techniques in getting results without a very high cost. By cost I mean not only time taken and money spent, but also the credibility and reputation cost associated with projects that fail to meet expectations. And there were practical difficulties that made ML challenging to use, issues like:

  • Lack of appropriate datasets for training and testing models (over time, much less of an issue).
  • Lack of computing power to run several model iterations (again, less of an issue over time).
  • Lack of development environments with tools and libraries that could make testing and comparing different ML algorithms and different models much easier.
  • Lack of people with both technical ML experience and domain experience.

In the early 2000s, using ML still meant implementing it in a general-purpose programming language or Matlab; some statistical packages like SAS provided tools, and BI tools provided extensions that enabled some ML facilities. But these were the realm of the specialist: tools either too expensive or too time-consuming to allow wider usage. Only big institutions with deep pockets that generated massive amounts of data could afford them. Inference models, decision trees and other techniques were used to detect credit fraud and health care fraud, parse genetic data, and do anything else that meant sifting through tonnes of data to find a possible needle.

The advent of R and Python democratized access to ML, but the biggest kick to interest in ML was the work of Google, Apple and Facebook in the field. Without products like Siri, Google Translate and many other related bots and autonomous agents, this field would still be relegated to the research labs. Now these uber-companies compete for the available AI/ML researchers to develop their product portfolios in an AI arms race.

As I said earlier, the lack of ML professionals with domain knowledge was always a problem. An ML professional with no domain knowledge creates its own kind of friction: it makes communication with the relevant stakeholders difficult and generally means a steeper learning curve on the way to a successful application. And there is a lot to choose from in the ML toolbox, from linear and logistic regression to k-means, support vector machines, decision trees and random forests, each method with its own particular strengths and weaknesses, so good judgement is a key factor.
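To make that spoiled-for-choice point concrete, here is a minimal sketch, assuming scikit-learn and its bundled breast-cancer dataset, that runs a few of those methods on the same task; the numbers alone don’t say which model fits the business problem, and that is where judgement comes in:

```python
# Minimal sketch: comparing several of the classic ML methods above
# on one task, using scikit-learn's bundled breast-cancer dataset.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)

models = {
    "logistic regression": LogisticRegression(max_iter=5000),
    "support vector machine": SVC(),
    "decision tree": DecisionTreeClassifier(random_state=0),
    "random forest": RandomForestClassifier(random_state=0),
}

# Cross-validated accuracy; each method trades off interpretability,
# training cost and robustness differently, which the score hides.
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5)
    print(f"{name:24s} mean accuracy {scores.mean():.3f}")
```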

But ML comes with another set of aggravations. Being mostly data-driven and statistical in nature, it fails in a key respect: the human need for certainty and predictable outcomes in an organizational setting. Organizational structures like predictability, our codes of law in some cases require it under threat of penalties, and shareholders love it. What ML can provide depends on the analyst’s capacity to tweak and adapt the model, and on the available training data, so as to minimize the total error. This means that at any given time the model in use will flag some cases as negative when they are true (false negatives), or flag some cases as positive when they are false (false positives).

What manager would like to hear that, for a critical business process, the ML model has 63% accuracy, even if in reality the current process has only 55% accuracy but is well known and familiar? Now, if the current process reaches 90% accuracy with human operators but costs 20 times more and takes weeks instead of hours… well, there is always that trade-off moment.
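A back-of-the-envelope sketch of that trade-off, using the hypothetical accuracies above plus assumed per-case and per-error costs (all the numbers are for illustration only):

```python
# Back-of-the-envelope comparison of the hypothetical processes above.
# The case volume and cost figures are assumptions for illustration.
CASES = 10_000
COST_PER_ERROR = 50.0  # assumed business cost of one wrong decision

processes = {
    # name: (accuracy, processing cost per case)
    "current process (55%)": (0.55, 1.0),
    "ML model (63%)": (0.63, 1.0),
    "human experts (90%)": (0.90, 20.0),  # 20x the cost, much slower
}

for name, (acc, unit_cost) in processes.items():
    errors = CASES * (1 - acc)
    total = CASES * unit_cost + errors * COST_PER_ERROR
    print(f"{name:22s} expected errors {errors:6.0f}, total cost {total:9.0f}")
```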

This somewhat uncertain pay-off meant that organizations focused their IT efforts on developing systems that automate processes through sets of prescribed rules, with the expectation that these are adequate for the business. In many cases this was more than adequate, and it has been quite successful. There is little need for AI in a simple CRUD front end that is merely an interface for pushing and pulling form data in a database.

The problem arises when there is a need to classify data, when there is a low signal-to-noise ratio, or when there is too much data for a human to classify within a reasonable time-frame. These problems are becoming more frequent as organizations accumulate a lot of data or buy it for marketing purposes. Aggregating and cross-tabulating large datasets with an OLAP engine is one thing, useful but with some loss of context; targeting specific groups of individuals with an ML algorithm to elicit particular behaviours is quite another. The latter promises to make marketing budgets much more effective, but it also has very troublesome implications.
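A minimal sketch of that contrast, with made-up column names and pandas standing in for the OLAP engine:

```python
# Contrast: aggregate reporting vs per-individual targeting.
# The dataframe, its columns and values are made up for illustration.
import pandas as pd
from sklearn.linear_model import LogisticRegression

df = pd.DataFrame({
    "region": ["north", "north", "south", "south", "south"],
    "age":    [25, 40, 31, 52, 23],
    "visits": [3, 10, 1, 7, 2],
    "bought": [0, 1, 0, 1, 0],
})

# OLAP-style: aggregate and cross-tabulate; individuals disappear.
print(df.groupby("region")["bought"].mean())

# ML-style: score each individual so they can be targeted directly.
model = LogisticRegression().fit(df[["age", "visits"]], df["bought"])
df["p_buy"] = model.predict_proba(df[["age", "visits"]])[:, 1]
print(df[["region", "age", "p_buy"]])
```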

The move towards ML/AI will not be smooth from the point of view of development teams and organizations. Big tech companies like Google, Facebook and Amazon, along with fintech companies, can afford the R&D, and it matches their business models well, while older tech giants like IBM might struggle for relevance in the field. Tech startups can also do well in technical terms, though profitability might not be a sure thing. Non-tech small and medium companies, and mature, conservative companies, might struggle a lot to make sense of it all, and in some cases might get gobbled up or go out of business because of it.

In many of these companies, development teams live in their own microcosm; sometimes the less that is known about it the better. But some traits are common; here are a few examples:

  • In many companies BI and application development are separate silos.
  • Team leads are suspicious of technologies they don’t understand, and they push for tools that fit their particular tech niche (don’t underestimate the need some people have to use a database for everything).
  • ML might be used as a status-project moniker to advance someone’s career with the blessing of management, even when it makes no business sense or the person to be advanced lacks the required skills.
  • Teams lack the skills and are hostile to changes in the technology stack that might jeopardize their jobs.
  • Risk aversion on the part of middle management leads to paralysis and delays in implementation.

This doesn’t mean that these companies are doomed; they can probably live well in their particular niche for quite a long time, until their whole development team is replaced by attrition or people move up in the organization. Implementing ML in an SME or a mature non-tech company is not a recipe for success by itself, and in most cases it will be invisible both inside and outside the organization.
