New ask Hacker News story: The Limits of LLM
2 by flr001 | 1 comment on Hacker News.
It is possible that LLMs use a concept similar to the Jennifer Aniston neuron (researchers have found neurons in the human brain that fire only when we see a particular person's face): https://ift.tt/7oyh1S3. An LLM could have a Donald Trump neuron. Whenever the prompt mentions Trump, that neuron activates, its activation cascades to particular sentiment neurons, and those neurons reshape the probabilities of all words, so that at the end of the process we get text containing "believe me" and "tremendous" and "a lot of money" (the toy sketch below illustrates this cascade at the level of a single next-token step). It then becomes easy for the model to transform any piece of text into the style of Eminem or Shakira, or anything else for which there is a neuron representing that concept.

This also goes hand in hand with the improvement in NLP we see from larger and larger models. Bigger models have more space to assign specific concepts to specific neurons. And bigger models MUST be trained on bigger datasets, because once there are no more concepts left to extract from the text, the representation of the world that the model learns caps out. It's like books and bookshelves: a taller shelf only helps if there are more books to fill it.

Thinking in those terms hints at the limitations of LLMs, because models can't learn things that are not in the books. Unless, that is, we are willing to believe that the whole corpus of books out there contains more information than the writers intended to put into it. Maybe there is information in the spaces between books that can be harvested by LLMs, which humans have not yet tapped into. I think LLMs will not beat the top humans in any particular area of expertise, but they will beat everyone else. They will be below top knowledge, but way above average.
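To make the cascade idea concrete, here is a minimal sketch in Python, assuming a made-up eight-word vocabulary and a hypothetical "concept neuron" whose activation adds a bias to the logits of stylistically associated tokens before the softmax. Nothing here reflects any real model's internals; the vocabulary, the association weights, and the next_token_probs helper are all invented for illustration.

```python
# Toy sketch, not any real LLM's internals: a hypothetical "concept neuron"
# whose activation biases next-token probabilities toward associated tokens.
import numpy as np

VOCAB = ["believe", "me", "tremendous", "money", "the", "a", "and", "policy"]

# Invented association strengths between the concept neuron and each token.
CONCEPT_TO_TOKEN = np.array([2.0, 1.5, 2.5, 1.8, 0.0, 0.0, 0.0, 0.2])

def next_token_probs(base_logits, concept_activation):
    """Softmax over logits after the (hypothetical) concept neuron's cascade."""
    logits = base_logits + concept_activation * CONCEPT_TO_TOKEN
    exp = np.exp(logits - logits.max())  # subtract max for numerical stability
    return exp / exp.sum()

base = np.zeros(len(VOCAB))  # flat base distribution over the toy vocabulary

print(dict(zip(VOCAB, next_token_probs(base, 0.0).round(3))))  # neuron off: uniform
print(dict(zip(VOCAB, next_token_probs(base, 1.0).round(3))))  # neuron on: biased
```

With the neuron off, the distribution stays uniform; turning it on shifts probability mass toward "tremendous" and "believe". Swapping in a different concept neuron with different association weights is, in this picture, all that style transfer amounts to.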