Like many others before me, I will opine about the new developments in the realm of “AI”, namely ChatGPT, Dall-E, Midjourney and other systems that are able to generate text, image, video or sound outputs from some sort of text input. These systems have been showing very interesting results that at face value seem to support the idea that this is some kind of intelligent crafting that can produce reliable results.
There are, though, two big initial problems that the news hype often fails to address:
- First, the data being used to train the underlying statistical generative model is not necessarily licensed for this use. And there are big legal questions about who owns an output that is in fact a statistical representation of copyrighted material, or even a full copy of the original.
- Second, these are statistical generative models, and they depend on the work of engineers and analysts to develop and evolve the model to get accurate results. The problem is, you can get 90% right, but the remaining 10% will be hell on earth to get right.
There has nonetheless been a large advance over previous experiments with generative models, especially in the way ChatGPT can generate readable and mostly grammatically correct text. It also does not generate annoying or even insulting text after several cycles of interaction with internet netizens or social media trolls, a fate that cancelled previous “AI” chat bots, which eventually degenerated into complete screaming sociopaths. So the researchers were at least able to tame some aspects of the generative model so that it does not quickly become offensive. And this is an important aspect if it is to be used in the Corporate world.
On another note, I think many companies like Google or Facebook avoided launching something similar to ChatGPT until recently, because the results from these generative models can fail in surprising and very non-deterministic ways that cannot be managed with some meetings and a pep talk. There are risks, especially if the model offends people or gives wrong results, that might cause liabilities and reputation loss.
But for all the buzz and promise, there are aspects that will distinguish generated results from something done by a human being.
Training the Models and Trained Models
The first big aspect of these generative models is the large amount of data they require to be trained to get generally accurate results. It is not the same as with humans, or even other animals, which require only a limited number of examples to get things mostly right. For example, a young child will be able to infer what is a cat or a dog quite quickly from examples that are, in some cases, drawings with minimal detail.
On the other hand, you can supply a generative model with thousands of examples, and sometimes it will still say that a dog is a cat. That means a lot of work needs to go into fiddling with the parameters of the model, and into acquiring datasets that are well categorized and big enough, so that the generated results are accurate enough.
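As a toy illustration of why an undersampled category can get mislabelled, here is a minimal sketch. The features, values and the nearest-neighbour rule are all invented for the example; real image models are vastly more complex, but the failure mode is analogous:

```python
# Hypothetical sketch: a toy 1-nearest-neighbour classifier on made-up
# 2-D features (say, ear roundness and snout length, normalised).
# The "dog" class is undersampled: only large dogs appear in training,
# so a small dog falling outside that coverage gets labelled a cat.

training = [
    ((0.0, 0.0), "cat"), ((1.0, 0.0), "cat"),
    ((0.0, 1.0), "cat"), ((1.0, 1.0), "cat"),
    ((5.0, 5.0), "dog"), ((6.0, 5.0), "dog"),  # only large dogs
]

def classify(point):
    """Return the label of the nearest training example."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(training, key=lambda ex: dist2(point, ex[0]))[1]

print(classify((2.0, 2.0)))  # a small dog, but the model answers: cat
print(classify((5.5, 5.0)))  # well inside the dog coverage: dog
```

The remedy is exactly the work described above: more, and better categorized, examples covering the missing variation.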
But there are limits, and these limits are often not trivial and often represent aspects that even a person will struggle with. As examples I will use Dall-E and Midjourney, which are image generation models.
These have become a new source of YouTube videos, such as “Mexican Star Wars”, something-or-other as a 1980s TV sitcom, and so on, where all kinds of mash-ups are used to get surprising results. Which means that your 9-year-old self can get to see all of those imagined universes that were mixing in your mind at the time.
The images that are generated might be interesting, but they often show some telling imperfections:
- After a while, even after modifying the theme slightly, the same general image pattern starts to become predominant. That means the same face, same posture, very similar expression, with just some motif and theme changes.
- More than one iris in the eyes, and eyes pointing in odd directions or not conforming to usual gaze positions.
- Hands with more than five fingers, odd hand postures, more than one hand in different postures, or an amalgam of hands merged together. The same can be said of leg postures and feet.
For the first problem, the lack of examples in the dataset that fall into the input categories means the output will start to converge to very similar outcomes. In some cases, given the right inputs, it will give you practically an original image from the training dataset. It will also converge on dominant features within the dataset for particular categories: if you only have images of people in the same body posture, that posture will be replicated time and again, and will even merge with other types of posture when the model is unable to tell them apart, producing very odd results. In some cases these dominant features of the model will appear as artefacts on the image, like a ghost from a double-exposure film.
The other problem arises when we need to generate more granular structures in the image, ones with more degrees of freedom of movement, which either aren’t properly categorized or whose specific forms of variation the model does not account for. This is the case with hands, fingers, leg postures, knees, feet, teeth and eyes: features that require some detail. Hands, limbs and feet are not easy even for artists to master. Because the model doesn’t know what a posture is, it fails to detect the direction of the limbs and fingers, so it tends to merge features that it considers similar.
These models don’t compartmentalize aspects that need to be taken into account separately when generating the image. They seem optimized to deal with the 90% of the image that is well characterized and common enough to fit well under the training model verification: faces, hair, close-up pictures, and any feature with fewer degrees of freedom of movement.
The fact is that these generative models are very dependent on the type of categorization and the size of the dataset, and on the complexity of the parameters used in the model. That in turn largely determines where the model generally works well and where it fails to produce accurate results.
ChatGPT the Killer App
At the moment, Microsoft is using the buzz around the integration of ChatGPT into its Bing search engine to grow its pitiful market share. It is probably too early to tell whether that will get the results they are expecting, but it will be a good outcome if we start getting good answers from search engines for a change.
The current state of Google, with its massive drive for ad monetization and excessive SEO manipulation, has led to an ever greater loss of relevance in its search results. I remember that ten years ago I used to get very good results from my search queries, but lately it is just SEO drivel and ads, so anything that cuts through the noise will be a big improvement. Although once Bing’s market share reaches a certain level, I bet Microsoft managers will find some way to nerf the ChatGPT model to deliver extra ads and extra paid content that is not relevant to the user.
The generative model of ChatGPT offers a lot of opportunities to quickly generate essays, articles and even computer code. It has the advantage of a large semantic model trained on a large dataset. In most cases the advantage of text sources is that they can be more easily categorized, which enables the generation of various forms of text content on various subjects. But again, that won’t mean the generated text will be completely accurate.
This accuracy problem will be more visible when the user knows more about the domain, and when the generative model cannot differentiate between domains, or when some convergent feature in the data ends up eliminating the relevance of a domain-specific piece of information.
So you can probably ask ChatGPT to talk about Cerenkov Radiation in the style of Rodney Dangerfield, and you might get some laughs in the process. But once you start trying to generate articles or essays on more specific subjects and themes, you might find that you are getting well-written garbage.
Worst of all is when you get something that is broadly correct but has errors in particular details, or that lacks context fundamental to understanding the subject of the text. So there is a risk that ChatGPT will generate a superficial text that fails to inform the user of particular nuances that are important, and give an incorrect sense of understanding of a subject. This is a time when laziness will be extremely rewarded, but at what cost…
Hey, how about this Cerenkov Radiation, huh? I tell you, it gets no respect. You got all these particles zooming around faster than the speed of light, and what do they get for it? A measly little blue glow. No respect, I tell ya.
I mean, come on, this is some wild stuff. We're talking about particles going so fast that they break the cosmic speed limit, and what do they get in return? A little bit of light. It's like, "Hey, thanks for violating the laws of physics, here's a blue glow as your reward."
And don't even get me started on how this stuff was discovered. I mean, it took a guy named Pavel Cerenkov to figure this out. Can you imagine that conversation? "Hey, I discovered a new type of radiation. It's blue and it happens when particles go faster than light." "Uh, yeah, sure you did, Pavel."
But you know what? Despite all the lack of respect, Cerenkov Radiation is actually pretty cool. It's used in all sorts of scientific instruments, from particle detectors to radiation therapy machines. And hey, maybe one day we'll even figure out a way to harness this blue glow for something really useful, like a new kind of energy source or something. Who knows? The possibilities are endless.
ChatGPT Is Going For Your Job
In the short term there are two professions most at risk from ChatGPT: journalists and people who work in customer service. The first, because these generative models will generate quick, short articles full of SEO terms, and probably in time will also generate larger articles based on some seed material like a newsletter, a police report, or a PR document. Making articles for news websites is already a cut-throat business, and the people writing these articles are often forced to write lots of them at a very quick pace, hence the downward quality trend that troubles most of the news these days.
The result is that we might reach a day when the initial news is generated by a ChatGPT-like bot, and then other bots just process and regurgitate some form of the article that fits the brand of each news outlet. This can have dire results if the information is not correctly vetted, checked and edited, as it can produce rumours, lies and propaganda at an industrial scale.
For call centres and customer service, the threat is the complete automation of first-tier voice call representatives, especially in cases where people aren’t familiar with the logic of the automated call scripts and require a voice operator. At this stage voice synthesizers are more than good enough; the bigger issue is voice recognition, since voice patterns vary wildly depending on age, gender, health condition and whether one is a native speaker, although on that last point it is possible that this will also enable handling calls in someone’s native language. From the point of view of the call centre operators, the issue is how to keep the chat bot from going off the rails and becoming offensive. Or, just as important, from being exploited by the caller into doing something it is not supposed to do.
There is also the problem of using this technology for outbound sales calls, which presents a series of problems of its own, because the bots might be trained to be manipulative and to adjust their sales strategy to the emotions of the targeted person. The regulation of psychologically manipulative practices by automated bots might need to be done with urgency, and the practice needs to be controlled; otherwise we might find an industry with already low ethical standards becoming a threat to society’s welfare.
Artists On the Line of Fire
Artists have a lot to lose with systems like ChatGPT. The first problem, for writers, graphic artists, painters and film makers, is that these systems will generalize entry into their fields and over-saturate the market for content, with people who will in practice be nothing more than artistic frauds, at most producers, because their part of the work is giving directions to the generative model and making corrections to the output here and there. But even the generated results are nothing more than munching on the work of others, since that is how the model was trained in the first place. There will be little originality, and we will all be bombarded with mostly mash-ups and derivative content, and occasionally something that merges aspects of several works and resonates culturally. At the same time it is very dodgy in terms of copyright, since the datasets used to train the models might be full of copyrighted material.
Actors and voice actors might also have more trouble finding work, since artificially generated versions will be cheaper and more pliant to the wishes of directors and producers. It is very possible that film sets become mere green-screen rooms, with stand-in actors in costumes used to allow the projection of any version of a virtual actor. In some cases this might even mean the routine appearance of dead actors in current productions: suddenly Bruce Lee is alive and kicking on a streaming service.
The risk is that the production of any type of content will become stale: first because of the audience’s tendency towards nostalgia, and a sometimes very unhealthy emotional attachment to the initial versions of characters and stories; second because companies will make risk-averse decisions that prioritize derivative content based on things that already have market recognition.
Again, I will stress that this will make it more difficult for artists to be recognized amid all the noise of generated content. This will impoverish artists, and will increase the market power of big corporations.
Coders: From Talent to Replaceable
Since ChatGPT started to get buzz in the media, examples of it generating code have been reported and have appeared in social media posts. This type of thing is not new; several big companies are trying to automate the work of developers. And sometimes a piece of news appears that some company got a system working to generate code and correct bugs, and then after a while nothing more is heard. Either the company got protective of the technology, or the bot went off the rails and was more trouble than it was worth.
From a company’s point of view, being able to replace expensive developers would be a plus, and probably the work would be done faster… But… sometimes one has to be careful what one wishes for. And, yes… ChatGPT can be helpful in getting answers to some problems, or in finding examples or boilerplate code when search engines have failed. But coding isn’t just getting some code that answers a specific question. It is very often a large integration job that has to deal with existing code bases. And there are already examples of ChatGPT generating code that is not fit for purpose, although at first glance it seems correct.
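As a hypothetical illustration (invented for this article, not actual ChatGPT output) of code that looks correct at first glance but is not fit for purpose, consider a leap-year check that passes casual spot checks while silently failing the century rule:

```python
# Hypothetical example of plausible-looking but subtly wrong code:
# it handles the common case and passes casual spot checks, but
# ignores the Gregorian century rule (a year divisible by 100 is not
# a leap year, unless it is also divisible by 400).

def is_leap_year(year: int) -> bool:
    """Looks complete, but only checks divisibility by 4."""
    return year % 4 == 0

print(is_leap_year(2024))  # True  (correct)
print(is_leap_year(2023))  # False (correct)
print(is_leap_year(1900))  # True  (wrong: 1900 was not a leap year)
```

Only a reviewer who knows the domain rule will catch the bug, which is exactly the kind of superficial correctness described earlier.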
Because of this, the person is forced to proofread and verify the code, which might add some extra risks. From my point of view this will be a boon for frauds and people who are barely passable as coders. This profession is full of entry code tests and all kinds of code challenges that ask people to solve problems irrelevant to most day-to-day professional life, which tend to benefit either people who have barely left university, or people who train relentlessly on those tests.
This will mean that situations like one I had to endure in the past will become more common: a former co-worker who would write mostly boilerplate code and fake his tests so that they passed, but not implement the business logic, then say he was finished and move on to another project, leaving me stuck solving the bugs afterwards. So… ChatGPT won’t solve favouritism inside companies.
I don’t think there is anything wrong with using ChatGPT to find a solution to a problem, or to find a piece of information. I use search engines for that as well, although the quality of the results is getting poorer and poorer. But asking questions is one thing; being unable to think is another. In my view there are three general skills that are usually lacking among coders in the market:
- Proofreading code, and being able to read other people’s code (or hell on earth).
- Capacity to think without making too many dogmatic assumptions.
- Capacity to understand several domains of business and knowledge.
The bigger risk for developers with ChatGPT is that it will increase the competition for the available “good jobs”, decrease wages and increase workloads, since barely passable coders can fake it when working on the simplest problems, or on those where the generative model works best. At the same time this will also mean a ramp-up in stupid code tests, where one applies for a role developing on Amazon AWS and is asked to solve some idiotic sorting-algorithm challenge in under 30 minutes. News flash: data structures and algorithms are only important in university and in barely 1% of the jobs in the market.
The result of this higher competition will be the erosion of skills, and the offloading of an increasing share of the codebase to the generative model. This will affect the nature of bugs, and the lead times between their discovery and resolution.
In Conclusion
The introduction of systems like ChatGPT can have very disruptive effects on the jobs market. If not done well, it might even mean the collapse of most states’ fiscal base, since corporations that barely pay any tax will take a bigger share of the pie, while the people who work for a living, and are responsible for most of the tax receipts, lose market power, lose their jobs, or see lower wages.
It is also very important to address the fact that the learning datasets contain copyrighted material, that authors need to have the right to opt out of these datasets, and that special licenses are needed so that authors are paid for the use of their work in this way.
The other aspect is the centralization of this technology, since these systems will be housed in large data centers in the cloud and billed as a service, which means cloud providers will rake in enormous amounts of money simply by having monopoly power. These systems aren’t sold as an appliance, or as a program to install and run locally on your machine. Because in that case I could see a very useful application: an artist who needs to automate his or her work could feed examples of that work to the model and let it do the most tedious work in his or her own style.
Because… once a system is in the Cloud, it is not yours. You don’t really have control, and… even with assurances, sooner or later some policy change on the platform will allow it to peek at and copy your work.