Artificial intelligence can be defined quite simply as a set of techniques that aim to imitate the workings of the human brain. For decades, it has aroused great hopes and fears, countless scenarios, reactions and analyses, and also considerable initiatives and investment. Supporters and detractors seem to agree on one point: the impact of artificial intelligence in numerous domains – medicine, education, training, the media – will be immense.
What exactly is ChatGPT?
According to OpenAI, the company which developed it, ChatGPT is a “model capable of interacting conversationally”. It can “answer questions, admit its mistakes, challenge false proposals or reject inappropriate requests.” It is extremely simple to use: the OpenAI web site offers an experimental version for testing. This free-of-charge version requires advanced personal identification, asking not only for an e-mail address but also for a phone number, for a rather surprising security reason (“Your number will only be used to verify your identity for security purposes.”). These mandatory identification procedures show first of all that access to the chatbot is not really free of charge, since the “price” to pay is the provision of personal data. A study of ChatGPT’s confidentiality terms shows that the data collected after identification are of several kinds: location data, behavior on social media, the types of requests entered via the interface, and the web sites visited, tracked by means of cookies deposited on the computer. Note that ChatGPT fails to comply with several requirements of the European GDPR (General Data Protection Regulation):
- In contravention of the European regulations (GDPR), the ChatGPT web site does not ask for authorization to store cookies.
- All the personal data may be shared with third parties without the users being informed.
- The personal data are transferred to American servers, again in contravention of the European regulations.
- The chatbot makes use of a database built up by absorbing data from the web up until 2021. However, much personal data is available on the web without any authorization having been given for its use.
This is therefore a major difference compared with conventional search engines, which do not require personal identification. The free-of-charge version is officially intended to “obtain external feedback in order to improve the system and to make it more secure”. The web site also indicates that “conversations may be reviewed by our AI trainers”, and calls on users “not to share sensitive information”. OpenAI claims that users’ data are used to improve its system. This obviously cannot happen in real time, because training takes a very long time (more than a month for GPT-3!). All these personal data are therefore stored for later improvement purposes.
ChatGPT is a virtual assistant with which internet users can converse. GPT is the abbreviation for Generative Pre-trained Transformer.
- “Transformer” describes the type of deep learning model, which uses several layers of artificial neural networks interconnecting a very large number of processing units. This architecture was introduced by Google engineers in 2017, and is also used by LaMDA, Google’s conversational agent.
- “Generative” means that the technology generates the most probable words to be strung together in sequence, on the basis of sentences observed in a large corpus of texts (as opposed to the text generators of the previous generation, which were based on rules of syntax and grammar). This generative capability makes it possible to produce narratives, i.e. to generate imaginative texts in the style of a given author.
- “Pre-trained” indicates that the model has been pre-trained using self-supervised learning techniques (without humans to validate the results), then refined by reinforcement learning (with humans). This latter type of learning relies on humans (Time magazine has revealed the working conditions of these underpaid workers) to label and penalize inappropriate generated texts as necessary (paedophilic, sexist content, etc.).
The ChatGPT database is immense (the underlying technology, GPT-3, absorbed some 45 terabytes of text, which a human would need 500,000 lifetimes to read). It is not directly connected to the internet, and the recorded data (including the web data) end in 2021. From a technological point of view, ChatGPT is the culmination of over 20 years of development in natural language processing, but it is not in itself a technological breakthrough, as pointed out by Yann LeCun, chief AI scientist at Meta, who does not consider ChatGPT revolutionary.
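The idea behind “Generative”, choosing the most probable next word from sentences observed in a corpus, can be illustrated with a deliberately tiny sketch. This is not how GPT models work internally (they use a Transformer network trained on vast amounts of text); it is a simple word-pair counter, and the corpus, function names and parameters below are all invented for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical toy corpus; a real model is trained on vastly more text.
TOY_CORPUS = [
    "the cat sat on the mat",
    "the cat chased the mouse",
    "the dog sat on the rug",
]


def train_bigram_counts(corpus):
    """Count, for each word, how often each other word follows it."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for current_word, next_word in zip(words, words[1:]):
            counts[current_word][next_word] += 1
    return counts


def most_probable_next(counts, word):
    """Return the word most often seen after `word`, or None if unseen."""
    followers = counts.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]


def generate(counts, start, max_words=8):
    """String words together by repeatedly picking the most probable next word."""
    words = [start.lower()]
    while len(words) < max_words:
        next_word = most_probable_next(counts, words[-1])
        if next_word is None:
            break
        words.append(next_word)
    return " ".join(words)


if __name__ == "__main__":
    counts = train_bigram_counts(TOY_CORPUS)
    print(generate(counts, "dog", max_words=4))
```

Even this toy version shows why such a system produces fluent but not necessarily true text: it only reproduces word sequences that are statistically likely in its corpus, without any notion of whether they are correct.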
How reliable are the responses provided by ChatGPT?
The list of ChatGPT’s limitations published on the OpenAI development site shows the extent to which the system can be unstable (sensitive to the wording of questions), unreliable (wrong, incoherent or even invented answers) and highly verbose, trying to generate a maximum amount of text with many polite formulas, not to mention the moderation filter, which may arbitrarily censor certain responses (on discriminatory criteria, for example). Examples of incorrect responses by ChatGPT are very numerous: from locating the Lascaux caves in Brittany to wrongly attributing writings, books or programme productions to famous journalists (see “ChatGPT: First radio interview of artificial intelligence – Invited by Sonia Devillers”).
In response to a factual question (“How many abortions are there in France?”), the answer was quite vague and inexact, though not totally wrong:
“There is no exact official figure for the number of abortions in France. The most recent data available is provided by the national office of statistics and economic studies (INSEE) and date back to 2018. According to that data, around 220,000 abortions were performed in France in 2018. However, it is important to note that this figure may not include all abortions, as some may not be declared.”
This response is surprising, since the data on the number of abortions in France are readily available, in particular on the web site of the DREES (Directorate of Research, Studies, Evaluation and Statistics).
The artificial intelligence market is in rapid expansion
OpenAI was originally established as a research laboratory for artificial intelligence (AI) in the form of a “non-profit association”. The start-up was founded in San Francisco in 2015; its founding members included Peter Thiel (PayPal, Palantir), Elon Musk (Tesla, SpaceX, Neuralink, Twitter…) and Reid Hoffman (LinkedIn).
In addition to ChatGPT, which will have a commercial application, OpenAI has also developed the DALL-E software, which generates pictures from natural language descriptions. The web site allows visitors to test a few of its creative possibilities. OpenAI became a “for-profit” company in 2019, and this year it has been valued at 29 billion dollars, i.e. twice its value in 2021. Microsoft, after investing 1 billion dollars in 2019, has announced a new partnership for the coming years, estimated at 10 billion dollars. By comparison, according to an analyst, the cost of creating ChatGPT is estimated at between 100 and 150 billion dollars.
According to Statista, worldwide revenues from AI increased 16-fold between 2015 and 2022. The economic model of ChatGPT could therefore overturn the current GAFA hierarchy (Google, Apple, Facebook, Amazon). In 2021, Google earned 81% of its 260 billion dollar revenue from advertising associated with its search engine. One million users are claimed to have tested ChatGPT during the first week following its launch. Could a massive switchover of users upend this paid advertising market? In response, on 6 February, Google announced that it was launching Bard, based on a “light” version of “LaMDA” “requiring less computing power”, to allow its use by a “larger population of users” and to “process a larger amount of feedback.”
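To give a sense of scale, the figures quoted above can be turned into two rough orders of magnitude. The short calculation below is only a sketch: the advertising amount follows directly from the percentages in the text, whereas the annual growth rate assumes steady compound growth over the seven years from 2015 to 2022, which the source does not claim. The function and variable names are illustrative.

```python
def share_of_revenue(total_revenue, share):
    """Portion of a total revenue represented by a fractional share."""
    return total_revenue * share


def implied_annual_growth(multiple, years):
    """Constant yearly growth rate that yields `multiple` after `years` years."""
    return multiple ** (1.0 / years) - 1.0


# Figures as quoted in the text (billions of dollars, fraction, multiple).
GOOGLE_REVENUE_2021_BN = 260
ADVERTISING_SHARE = 0.81
AI_REVENUE_MULTIPLE = 16
YEARS = 2022 - 2015  # assumption: growth spread evenly over 7 years

ad_revenue_bn = share_of_revenue(GOOGLE_REVENUE_2021_BN, ADVERTISING_SHARE)
annual_growth = implied_annual_growth(AI_REVENUE_MULTIPLE, YEARS)

print(f"Advertising revenue at stake: about {ad_revenue_bn:.1f} billion dollars")
print(f"Implied average annual growth of AI revenues: about {annual_growth:.1%}")
```

On these assumptions, a little over 210 billion dollars of advertising revenue depends on the search engine, while AI revenues would have grown by just under 50% a year on average.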
Although ChatGPT does not represent a technological breakthrough, its availability to a broad public is having a profound impact on the digital technologies sector. The communication plan rolled out since OpenAI’s announcement is also closely linked to a financial strategy in a highly competitive sector. The question of its impact on the public needs to be considered by users as well as by the public authorities. This subject will be discussed next week.