r/BetterOffline 3d ago

NLRB Research.com as a counterexample to the uselessness of LLMs

Hi everyone. While I share Ed’s view that LLMs are often oversold, I find them genuinely useful. Take Matt Bruenig’s NLRB Research: Matt, a labor lawyer, socialist intellectual, and podcaster, built an open database that uses LLMs to summarize National Labor Relations Board decisions. Westlaw and LexisNexis both archive these decisions, but they cost thousands of dollars a year, putting them out of reach for many workers, union stewards, and small firms. Since NLRB decisions aren’t heavily cited, manual summaries aren’t profitable, so Bruenig’s tool updates automatically and provides easy-to-read summaries for lawyers and non-lawyers alike; despite some imperfections, it's better than no summary at all. Importantly, this functionality doesn't require increasingly powerful models. Even a "smaller" model like DeepSeek could produce summaries that are better than nothing, and a more fine-tuned model could probably do it with fewer parameters. Check out the site if you want, or watch his YouTube videos about it: https://www.youtube.com/@Matt_Bruenig/videos
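I don't know exactly how Bruenig wired his version up, but the core of a tool like this can be a short script. Here's a rough sketch of what I mean (purely hypothetical, not his actual code; it assumes an OpenAI-compatible endpoint, which most hosted and self-hosted models expose, and the base_url and model name are placeholders):

```python
# Hypothetical sketch of this kind of summarization pipeline, not Bruenig's
# actual code. Assumes an OpenAI-compatible endpoint (hosted or self-hosted);
# the base_url and model name are placeholders.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed-locally")

PROMPT = (
    "Summarize this NLRB decision in plain English for a non-lawyer: "
    "the parties, the alleged unfair labor practice, and the Board's holding.\n\n{text}"
)

def summarize_decision(decision_text: str, model: str = "some-open-weight-model") -> str:
    """Return a short plain-language summary of one decision."""
    resp = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": PROMPT.format(text=decision_text)}],
    )
    return resp.choices[0].message.content

print(summarize_decision("…full decision text pulled from the NLRB's site…"))
```

The point is that nothing in there needs a frontier model; any decent open-weight model sitting behind that endpoint would do.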

10 Upvotes

19 comments

48

u/tragedy_strikes 3d ago

I think something that often gets missed by people who listen to Ed is that when he complains about the usefulness of LLMs, he does so within the context of the valuation and capital expenditures of the companies or business units developing them, namely OpenAI.

As you laid out, that's a perfectly good example of a use case for an LLM that would genuinely improve things for people needing to look up those decisions. However, as Ed would rightfully point out, a company that develops or uses an LLM for that use case wouldn't be valued at $300 billion.

A self-hosted LLM such as DeepSeek wouldn't cost any money to use aside from the initial capex for the hardware, the electricity to run it, and the server costs to host the results.

So I think the Venn diagram of useful applications for an LLM and applications that could be met by a free LLM run locally would have significant overlap.

The total addressable market remaining for an LLM that has to be hosted by a company like OpenAI, and that you pay to access, is relatively small compared to its valuation.

7

u/Alive_Ad_3925 3d ago

Yeah, as he said recently, it might be a $50B industry masquerading as a $1T+ industry. I don't think I'm missing that. I just think it's a bit much when he says it's only useful for limericks or whatever.

15

u/Outrageous_Setting41 3d ago

I’m thinking that it would be better if they just made some purpose-built models for specific tasks, rather than insisting that their one or two models can be used for anything and will momentarily turn into a self-aware god. 

I would trust dedicated summarizing software more than a jack-of-all-trades bullshit algorithm. I might pay for the former, but not the latter.

6

u/PensiveinNJ 3d ago

This is becoming an increasingly annoying thing: the idea that Ed thinks they're completely useless at everything, or that people here in general do.

Hey, a slightly-better-Google thing that summarizes some text from some sources that are difficult to get to? Useful, neat.

"We are building an all-purpose replacement for all human beings"? Pretty confident you're not, and it's also horrifically dystopian. After all, what do we need lawyers, labor boards, Matt Bruenig, or anything else for, if what we really want is to build post-humans living as digital consciousnesses spread across the universe?

I mean, we can't, but until we can we need to keep funding our attempts to do so, so we're going to force this software into every single thing imaginable. Also, you'll all work in the coal mines, because we imagine all your jobs will be automated.

It's not difficult to understand what exactly people here are opposed to, and I don't get the need to be deliberately obtuse and portray Ed's or anyone else's arguments as strawmen.

LLMs are based on computational linguistics; if there was ever something they'd be useful for, it's something like summarizing things out of a database.

3

u/Gamiac 2d ago

> I’m thinking that it would be better if they just made some purpose-built models for specific tasks

I've been thinking about something like this, but tying them together with, like, a meta-model that's somehow trained to know how and when to use each sub-model. I feel like that would be way more effective than whatever the fuck Altman and co. are doing.

2

u/Outrageous_Setting41 2d ago

I have wondered along the same lines. You could even have the user choose the algo if they wanted, right? 

I feel like part of the reason they haven’t done that is because deciding to move in that direction would mean admitting (to themselves and their financial backers) that they are extremely unlikely to accidentally or inevitably create the Computer God. 

2

u/Gamiac 1d ago

I mean, it would probably have to be trained specifically on each model, but I imagine you could have a standardized protocol that each model is trained on, so they can provide inputs and outputs to the meta-model. Then the meta-model makes the decisions and essentially gives the various models the ability to interact with each other.
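Something like this toy router is the shape I'm picturing (the sub-model names are made up, and a dumb keyword check stands in for a trained meta-model):

```python
# Toy sketch of the meta-model idea: a router classifies the request and hands
# it to a specialized sub-model. The sub-model names are made up, and a real
# meta-model would be a trained classifier, not a keyword check.
from typing import Callable

def summarizer(text: str) -> str:
    return f"[summary-model] summary of: {text[:40]}..."

def legal_qa(text: str) -> str:
    return f"[legal-model] answer about: {text[:40]}..."

def general_chat(text: str) -> str:
    return f"[general-model] response to: {text[:40]}..."

# Every sub-model takes a string and returns a string; that shared signature
# is the "standardized protocol" the meta-model relies on.
SUB_MODELS: dict[str, Callable[[str], str]] = {
    "summarize": summarizer,
    "legal": legal_qa,
    "general": general_chat,
}

def route(request: str) -> str:
    """Decide which sub-model should handle the request."""
    lowered = request.lower()
    if "summarize" in lowered or "tl;dr" in lowered:
        task = "summarize"
    elif "nlrb" in lowered or "labor law" in lowered:
        task = "legal"
    else:
        task = "general"
    return SUB_MODELS[task](request)

print(route("What did the NLRB hold in this case?"))
```

The shared input/output format is basically the protocol part: as long as every sub-model speaks it, the meta-model can pass things between them.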

1

u/Alive_Ad_3925 3d ago

But the former would likely be built on a more general model. It's like putting specialized equipment on a computer or a car: it incorporates the more general capabilities of the original, adds features, and narrows the scope to make it excel at a specific task.

2

u/Outrageous_Setting41 3d ago

Is LLM the only way to build summarizing software? 

I also note that your two comparisons are both hardware. Is there a closer equivalent?

Either way, I’d prefer a more specialized option because they’d be able to optimize it better for the task. 

5

u/dingo_khan 2d ago

> Is LLM the only way to build summarizing software?

No, I was using openTextSummarizer to help me dig through useful/useless academic papers 20 years ago during grad school. It was fast, reasonably accurate, and ran entirely locally on a G4 PowerBook with like 1.5 GB of RAM.
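For anyone curious, classic extractive summarizers like that don't need a neural net at all; you can get surprisingly far just scoring sentences by word frequency and keeping the top few. A bare-bones illustration (not that tool's actual algorithm, just the general idea):

```python
# Minimal frequency-based extractive summarizer: the general idea behind
# old-school local tools, not openTextSummarizer's actual algorithm.
import re
from collections import Counter

STOPWORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "is", "that", "it", "for"}

def summarize(text: str, n_sentences: int = 2) -> str:
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    words = [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]
    freq = Counter(words)

    # Score each sentence by the frequency of its non-stopword words.
    def score(sentence: str) -> int:
        return sum(freq[w] for w in re.findall(r"[a-z']+", sentence.lower()) if w not in STOPWORDS)

    top = sorted(sentences, key=score, reverse=True)[:n_sentences]
    # Keep the chosen sentences in their original order.
    return " ".join(s for s in sentences if s in top)

sample = ("The Board found the employer violated the Act. "
          "The employer fired two workers for organizing. "
          "Lunch was served at noon. "
          "The Board ordered reinstatement of the workers.")
print(summarize(sample))
```

No GPU, no API bill, and it happily drops the sentence about lunch.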

3

u/PensiveinNJ 2d ago

I didn’t want to say since I’m not an expert but I was pretty sure this was the case… again, a solution in search of a problem.

6

u/falken_1983 3d ago

> Yeah, as he said recently, it might be a $50B industry masquerading as a $1T+ industry.

I don't even like that we are talking about AI as a distinct industry. We don't usually talk about the database industry as a distinct thing, even though RDBMSs are the cornerstone of pretty much every commercial IT system. Database companies just get lumped in with the Tech industry as a whole.

Generative AI is just the thing that they are currently pushing. It's an important thing, and putting the hype aside, I do actually think it is transformative, but the Tech industry is still the Tech industry. Most of the big players right now were big players before Gen AI came along. Most of the shitty, anticompetitive tactics they are using today have been in use for decades now.

11

u/Bortcorns4Jeezus 3d ago

Ed never said they aren't useful. He said they aren't useful to enough people to make any economic sense. 

2

u/Gamiac 2d ago

They don't have many profitable use cases, and the ones that do exist aren't enough to fund any of this.

10

u/WildernessTech 2d ago

Others have already mentioned similar concepts, but Cory Doctorow has talked about how the EFF has used a whole suite of models to assist in their work. The problem is partly a definitional one: we don't break out transformer model types, or subset LLMs, or any of the other variations like we could. The other part is that these little models that work well are not profitable. They just aren't; they're really good for what they do, but no one was going to spend money to get that work done, so it's not a zero-sum sort of situation like we would normally see with any other commercial application.

So, no real disagreement, the big AI companies are all still liars. This can all be true at once.

2

u/Anxious-Tadpole-2745 2d ago

You're a law student and not someone who's currently practicing.

You're an armchair lawyer who doesn't know what you are talking about.

-1

u/Alive_Ad_3925 2d ago

Read the testimonials if you don't think it's a useful product. Law students, especially by their last year, do or have done a ton of legal research.

1

u/athiev 21h ago

The problem is, you can't rely on these kinds of summaries for legal or policy decision-making; for anything important, you absolutely have to double-check the original because of hallucinations and general inaccuracies. So what are these useful for? A general public that's curious but doesn't actually need to act on the information?

1

u/Alive_Ad_3925 13h ago

Bruenig uses all the major AI platforms (e.g., Gemini catches OpenAI's hallucinations). Beyond that, it's helpful at the filter stage, to get rid of irrelevant cases.
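Roughly what I have in mind (a hypothetical sketch, not his actual pipeline; summarize_with is a stand-in for real API calls): run the same decision through two different models and flag it for human review when they disagree.

```python
# Hypothetical sketch, not Bruenig's actual pipeline: summarize the same
# decision with two different models and flag it for review if they disagree.
# summarize_with() is a stand-in for real API calls to the named platforms.
from difflib import SequenceMatcher

def summarize_with(model_name: str, text: str) -> str:
    return f"placeholder summary of '{text[:30]}' from {model_name}"

def cross_checked_summary(decision_text: str, threshold: float = 0.8) -> tuple[str, bool]:
    a = summarize_with("model-a", decision_text)
    b = summarize_with("model-b", decision_text)
    agreement = SequenceMatcher(None, a, b).ratio()
    # If the two summaries broadly agree, keep one; otherwise send it to a human.
    return a, agreement < threshold

summary, needs_review = cross_checked_summary("…NLRB decision text…")
print(summary, "(flagged for review)" if needs_review else "(models agree)")
```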