The internet may not be big enough for the LLMs.

That’s because LLMs need really good data to train off of, and while the internet has a lot of data, it also has a lot of the Livejournal posts I wrote in 2003 that no one should be training anything they want to be coherent on.

This Wall Street Journal piece explores the way AI companies are beginning to reckon with a potential shortage of data to train on. Apparently it could mean a lot fewer “do anything” enormous LLMs and a lot more models trained for specific tasks on specific data sets. The people using LLMs trained on my Livejournal probably appreciate that.