ChatGPT Still Needs Humans

There are many, many people working behind the screen, and they will always be needed if the model is to continue improving, writes John P. Nelson.

Do you know who helped ChatGPT give you that clever answer? (Eric Smalley, The Conversation US/composite derived from Library of Congress image, CC BY-ND)

By John P. Nelson 
Georgia Institute of Technology

The media frenzy surrounding ChatGPT and other large language model artificial intelligence systems spans a range of themes, from the prosaic – large language models could replace conventional web search – to the concerning – AI will eliminate many jobs – and the overwrought – AI poses an extinction-level threat to humanity. All of these themes have a common denominator: large language models herald artificial intelligence that will supersede humanity.

But large language models, for all their complexity, are actually really dumb. And despite the name “artificial intelligence,” they’re completely dependent on human knowledge and labor. They can’t reliably generate new knowledge, of course, but there’s more to it than that.

ChatGPT can’t learn, improve or even stay up to date without humans giving it new content and telling it how to interpret that content, not to mention programming the model and building, maintaining and powering its hardware. To understand why, you first have to understand how ChatGPT and similar models work, and the role humans play in making them work.

How ChatGPT Works

Large language models like ChatGPT work, broadly, by predicting what characters, words and sentences should follow one another in sequence based on training data sets. In the case of ChatGPT, the training data set contains immense quantities of public text scraped from the internet.

ChatGPT works by statistics, not by understanding words.

Imagine I trained a language model on the following set of sentences:

Bears are large, furry animals.
Bears have claws.
Bears are secretly robots.
Bears have noses.
Bears are secretly robots.
Bears sometimes eat fish.
Bears are secretly robots.

The model would be more inclined to tell me that bears are secretly robots than anything else, because that sequence of words appears most frequently in its training data set. This is obviously a problem for models trained on fallible and inconsistent data sets – which is all of them, even academic literature.
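To make the statistical picture concrete, here is a minimal sketch in Python, purely illustrative and nothing like ChatGPT's actual neural-network implementation, of frequency-based next-word prediction over the bear sentences above:

from collections import Counter, defaultdict

# Toy illustration only: count which word follows each short context in the
# tiny "bear" corpus, then predict the most frequent continuation.
corpus = [
    "bears are large furry animals",
    "bears have claws",
    "bears are secretly robots",
    "bears have noses",
    "bears are secretly robots",
    "bears sometimes eat fish",
    "bears are secretly robots",
]

counts = defaultdict(Counter)
for sentence in corpus:
    words = sentence.split()
    for i in range(len(words) - 1):
        context = tuple(words[max(0, i - 1):i + 1])  # up to two preceding words
        counts[context][words[i + 1]] += 1

def predict_next(context_words):
    """Return the continuation seen most often after this context."""
    context = tuple(context_words[-2:])
    if context not in counts:
        return None
    return counts[context].most_common(1)[0][0]

print(predict_next(["bears", "are"]))  # prints 'secretly' (3 of the 4 occurrences)

Run on this corpus, the sketch predicts "secretly" after "bears are" for the reason described above: that continuation simply appears most often, regardless of whether it is true.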

People write lots of different things about quantum physics, Joe Biden, healthy eating or the Jan. 6 insurrection, some more valid than others. How is the model supposed to know what to say about something, when people say lots of different things?

The Need for Feedback

This is where feedback comes in. If you use ChatGPT, you’ll notice that you have the option to rate responses as good or bad. If you rate them as bad, you’ll be asked to provide an example of what a good answer would contain. ChatGPT and other large language models learn what answers, what predicted sequences of text, are good and bad through feedback from users, the development team and contractors hired to label the output.

ChatGPT cannot compare, analyze or evaluate arguments or information on its own. It can only generate sequences of text similar to those that other people have used when comparing, analyzing or evaluating, preferring ones similar to those it has been told are good answers in the past.
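ChatGPT's real training pipeline uses reinforcement learning from human feedback, in which human ratings train a separate reward model that steers the language model. The toy Python sketch below, using hypothetical labeled data, captures only the core idea: thumbs-up and thumbs-down labels become a score used to prefer some candidate answers over others.

from collections import Counter

# Schematic sketch only: ChatGPT's actual training uses reinforcement learning
# from human feedback (RLHF), where ratings train a separate reward model.
# Here, hypothetical thumbs-up/thumbs-down labels simply become per-word
# scores used to rank candidate answers.
rated_answers = [
    ("bears are large furry animals with claws", +1),  # labeler said: good
    ("bears are secretly robots", -1),                 # labeler said: bad
    ("bears sometimes eat fish", +1),                  # labeler said: good
]

word_scores = Counter()
for text, label in rated_answers:
    for word in text.split():
        word_scores[word] += label

def preference_score(candidate):
    """Higher scores mean more similar to answers humans rated as good."""
    words = candidate.split()
    return sum(word_scores[w] for w in words) / len(words)

candidates = ["bears are secretly robots", "bears are furry and eat fish"]
print(max(candidates, key=preference_score))  # prints 'bears are furry and eat fish'

Note that the sketch never judges whether robots or fish are the truth about bears; the human labels do all of that work, which is exactly the dependence on human labor at issue here.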

Thus, when the model gives you a good answer, it’s drawing on a large amount of human labor that’s already gone into telling it what is and isn’t a good answer. There are many, many human workers hidden behind the screen, and they will always be needed if the model is to continue improving or to expand its content coverage.

A recent investigation by journalists at Time magazine revealed that hundreds of Kenyan workers spent thousands of hours reading and labeling racist, sexist and disturbing writing, including graphic descriptions of sexual violence, from the darkest depths of the internet to teach ChatGPT not to copy such content. They were paid no more than US$2 an hour, and many understandably reported experiencing psychological distress due to this work.

Language AIs require humans to tell them what makes a good answer – and what makes toxic content.

What ChatGPT Can’t Do

The importance of feedback can be seen directly in ChatGPT’s tendency to “hallucinate”; that is, confidently provide inaccurate answers. ChatGPT can’t give good answers on a topic without training, even if good information about that topic is widely available on the internet. You can try this out yourself by asking ChatGPT about more and less obscure things. I’ve found it particularly effective to ask ChatGPT to summarize the plots of different fictional works because, it seems, the model has been more rigorously trained on nonfiction than fiction.

In my own testing, ChatGPT summarized the plot of J.R.R. Tolkien’s “The Lord of the Rings,” a very famous novel, with only a few mistakes. But its summaries of Gilbert and Sullivan’s “The Pirates of Penzance” and of Ursula K. Le Guin’s “The Left Hand of Darkness” – both slightly more niche but far from obscure – came close to playing Mad Libs with the character and place names. It doesn’t matter how good these works’ respective Wikipedia pages are. The model needs feedback, not just content.

Because large language models don’t actually understand or evaluate information, they depend on humans to do it for them. They are parasitic on human knowledge and labor. When new sources are added into their training data sets, they need new training on whether and how to build sentences based on those sources.

They can’t evaluate whether news reports are accurate or not. They can’t assess arguments or weigh trade-offs. They can’t even read an encyclopedia page and only make statements consistent with it, or accurately summarize the plot of a movie. They rely on human beings to do all these things for them.

Then they paraphrase and remix what humans have said, and rely on yet more human beings to tell them whether they’ve paraphrased and remixed well. If the common wisdom on some topic changes – for example, whether salt is bad for your heart or whether early breast cancer screenings are useful – they will need to be extensively retrained to incorporate the new consensus.

Many People Behind the Curtain

In short, far from being the harbingers of totally independent AI, large language models illustrate the total dependence of many AI systems, not only on their designers and maintainers but on their users. So if ChatGPT gives you a good or useful answer about something, remember to thank the thousands or millions of hidden people who wrote the words it crunched and who taught it what were good and bad answers.

Far from being an autonomous superintelligence, ChatGPT is, like all technologies, nothing without us.

John P. Nelson is a postdoctoral research fellow in ethics and societal implications of artificial intelligence at the Georgia Institute of Technology.

This article is republished from The Conversation under a Creative Commons license. Read the original article.

The views expressed are solely those of the author and may or may not reflect those of Consortium News.

16 comments for “ChatGPT Still Needs Humans”

  1. webdude
    August 20, 2023 at 03:11

    It’s the other way around. Humans need ChatGPT to learn skills. It does not need to compare, analyse or reason for useful feedback. It needs to find the word combination that people who compared, analysed and reasoned came up with to explain something. The internet algorithm lets the best answers to technical questions bubble up because they work. And if you want to use it for advertising or lying to people instead of learning or building something it can be useful too since it will bring up the biggest most viral lies that work.
    winwin as they say :)

  2. Jeff Harrison
    August 19, 2023 at 18:31

If I read your piece accurately, someone with malign intent representing the forces of evil and/or wickedness could go in and screw up the AI so that it wouldn’t work as intended.

  3. August 19, 2023 at 17:14

I worry that all the lies the MSM feeds us will be incorporated into ChatGPT. The example of describing a bear the author used is exactly what I am afraid of for the future. People can post a correction if something like “a bear is a robot” is used because they know that is incorrect. But too much of our news media tells us things that are lies or leaves out complete information, and very few people are aware of that. Since our government makes secrecy and lies a major part of their communication with its citizens, this new program will go a long way to solidify those lies for history. The lies about Vietnam and Iraq and Afghanistan and so much more that we know about now – before this program gets put in place – would not be known if it were in use at the time. Future lies will be set in computer “cement” and will proliferate with abandon. I certainly hope some people in government understand that and figure out a way to correct what is happening.

  4. Valerie
    August 19, 2023 at 16:42

    I have absolutely no idea what chatgpt is. And i don’t believe i want to.

    • JonnyJames
      August 20, 2023 at 15:25

I hear you Valerie, all I can say is it’s not good. Another corporate, private oligopoly “intellectual property” dystopian development.

  5. SH
    August 19, 2023 at 14:27

    “Then they paraphrase and remix what humans have said, and rely on yet more human beings to tell them whether they’ve paraphrased and remixed well.”

    LOL! Who needs the MSM, or political campaigns when we have Chat-GPT to tell us about our politicians –

    Think of AI as the resurgence of plantations – vast tracts of “property” (intellectual) owned by a few, farmed or mined for its “resources” (data) by humans who are rather treated like plantation slaves – as the owners of the tech grow fat and happy …

    Chat, or LLM are just the beginning – the aim is to develop AGI – Artificial General Intelligence – capable of doing anything a human can do, in an autonomous fashion – without need for human input, and in fact, without being able to be influenced by it – as in the development, of course, of autonomous weapons (we always use our new technology early on for war-making) – and eventually, it will not need humans to scour information sources, e.g. the internet – it can do that itself ….

    Oh yeah, check out the development of OI or “Organoid Intelligence” as a more “energy efficient” way to do it – AI is highly energy inefficient, requiring enormous amounts of energy – the human brain is much more efficient in that respect – AI gets lots of press, OI seems to be under the radar at this point …

    Think early Borg …

  6. Emma M.
    August 19, 2023 at 14:15

No one thinks ChatGPT is an ASI nor that LLMs are necessarily the precursor to AGI or ASI, and dismissing the threat of extinction and ASI while calling it “overwrought” makes the author sound as if they only began to read about this issue in the past year or two—like almost everyone who wants to chip in their opinion on AI nowadays and reads a little, then thinks they know everything and that their thoughts are unique—and like they have not paid one bit of attention to the literature on computational neuroscience or read from authors like Bostrom. Further, there are existential arguments against AI that do not involve strong AI in the first place, e.g. the one presented by Kaczynski in Technological Society and its Future, paragraphs 173, 174 and 175, which form an excellent analysis that has also been quoted by Kurzweil. The author is evidently not aware of these arguments either.

Further, attempts to compare AI to human intelligence are extremely silly and primarily a measurement problem, since intelligence is notoriously difficult to measure in a way that is useful and you cannot accurately compare biological to artificial intelligence. That it makes so many mistakes does not actually necessarily say very much about its intelligence in the way it would if a human made them, because they are different. Attempts to measure intelligence of AI will only become more inaccurate with time the more capable it gets. Another serious mistake and misunderstanding of fundamentals.

    Has the author ever asked themselves “what is intelligence” before? Because many of these AI sceptics do not seem to have a first principles understanding of it, hence why they get confused by AI giving poor answers and not understanding their favourite book and think those are signs of a lack of intelligence. AI does not need to understand your favourite book to be instrumentally more intelligent than you and capable of destroying you with no objective more complex than the improvement and production of paperclips, and if they had read papers like “The Superintelligent Will” from Bostrom this would be obvious to them; it is in fact much easier to conceive of and develop an ASI like this. When one considers what forms of intelligence are possible, biological and human intelligence are a very small range of the possibilities, necessarily constricted by evolution and shaped by their environment. Instrumental and strategic intelligence do not require human qualities.

    Much of this article also directly contradicts recent findings of LLMs having developed ToM on their own. Either the author deceptively and purposely ignored this development of ToM in order to mislead the reader with their arguments by omission of information—the most subtle of propaganda—or does not take it seriously to even be worth mentioning, or is ignorant of it, and did not make any attempt to refute it.

    I do not think it should be surprising to anyone that AI is presently dependent on humans. Attempts to cover AI in journalism are really quite sad, those who write on it really need to read more, but you can tell they like most everyone else pretty much all only started paying attention to AI a few years ago at best. Most arguments made toward AI these days all seem to be some excellent evidence in support of noncognitivism, since pretty much everyone seems to argue what way AI is or will be based on how they want it to be, simply because they all have stakes in it being some way or another.

  7. Bushrod Lake
    August 19, 2023 at 13:25

Collective consciousness, which is becoming more manipulated consciousness, is what’s killing us in the U.S. The popular war in Ukraine, nuclear weapons, world hegemony, would all continue to be favorably repeated by chatbots because of “group think”, instead of individual, human rationality (requiring an individual human body).

  8. Randal Marlin
    August 19, 2023 at 12:41

    Warning: self-promotion ahead.
    Chat GPT can’t be all bad.
    I prompted it with: “You are a professor and you are giving a course on Propaganda and Ethics. What book would you use?”
    It came up with my book, “Propaganda and the Ethics of Persuasion” as its first choice.
    Now, can I say “Number one choice of ChatGPT?”
    There’s really a problem of attribution that comes with information obtained by ChatGPT.
    Usually you can track down the information it provides and credit those references. But what about all the labor it saves you in naming the sources? Should you acknowledge that labor? But then all the labor you put in sounds compromised.

    • MeMyself
      August 19, 2023 at 20:25

      ChatGPT is plagiarism, not intelligence.

  9. Tim S.
    August 19, 2023 at 11:23

    I see Mr. Nelson has excellent literary tastes — no wonder he finds current “AI” unsatisfactory.

  10. Greg Grant
    August 19, 2023 at 10:41

    Obviously to teach a computer how to be like people, you need to expose it to the output of real people.
    I don’t think anybody believes otherwise.
    The popular misconception is not that computers don’t need people to learn how to behave like people, but that computers can learn to think and feel like people.
    It’s not whether or not people are part of the equation, it’s the model being trained.
So far it’s just used for dumb classification algorithms which can be cleverly made to appear generative.
    But it’s not used to teach computers to be conscious and have real volition like we do.
    That may happen one day, but it also may never happen, we don’t even have a working definition of consciousness.
    That’s the major misconception, not the fact that people are required in order to model people.

  11. Susan Crowell
    August 19, 2023 at 10:39

    The question is, who is “us”…

  12. Alan Ross
    August 19, 2023 at 10:10

    No matter how good a thing is, when you have the motive to make a profit from it, it will be used to make for more injustice and misery.
    This article seems much too limited in its scope and unreal in its application to the working world – the real one. It sounds more like what Alfred E. Neuman used to say in Mad magazine: “What me worry” or elsewhere “Don’t worry just be happy.” Besides, even though you may still need humans – at low pay – to improve the system, that doesn’t mean it cannot be used to replace workers in many fields.

  13. Vera Gottlieb
    August 19, 2023 at 10:05

    AI – just one more gimmick – out of so many, for so many to enrich themselves at the expense of the gullible. Until further notice…it will still require a HUMAN BEING to programme this. The only advantage: once programmed it can ‘think’ so much faster.

  14. MeMyself
    August 19, 2023 at 05:44

    ChatGPT wouldn’t accept a virtual phone number from me to sign up for the service, which makes its implied intended use suspect.

    Given this article’s perspective, I no longer see AI as useful but as a summary of human errors (or just junk).

    Good article,
    Thanks
