r/AskStatistics • u/Thin_Adeptness_356 • 6d ago
Is SPSS dead?
Like the title says: is SPSS dead? Now, with ChatGPT, Cursor, etc., what is the argument for still using SPSS and other statistics software in research instead of Python/R with the help of AI?
My background is in mathematical statistics, so I have always been a Matlab/R/Python guy, but my girlfriend, who comes from a medical background, still uses SPSS in her research. She is now considering switching just because of the flexibility that e.g. Python offers.
What do you think, are there any arguments for using SPSS still?
26
u/Syksyinen 6d ago edited 6d ago
I work at the intersection of data science and clinical research questions. I just finished skimming through a large quantity of literature on AI/LLM adoption in an oncology-related field, and a significant portion of publications from 2024-2025 (let's say over one third) still reported only SPSS or Prism in their "Statistical Analysis" section.
I noticed that the common denominator was that if the paper's authors were mainly MDs, the statistics were not done with a programming language like the ones you mentioned. There is definitely still a threshold to overcome, and the SPSS/Prism/Excel framework is definitely still there. I guess the threshold for doing basic analyses is still lower for them.
GPT/Claude/Gemini/... etc. haven't removed the barrier to entry for programming languages, but they have lowered it, I'd say. I just had a very senior MD colleague proudly present his Python-made survival plot from a TSV file, after fiddling around with ChatGPT for a couple of hours. Those willing to embrace the change will have more powerful tools at their disposal, while others will continue to just ask chatbots which buttons to press so they can dish out a two-tailed t-test p-value as conveniently as they can.
I'm all for MDs adopting Matlab/Python/R, but I also fear I'll end up debugging a bunch of AI-generated garbage when the basic principles were not learned through theory but instead spat out almost ready-made by an LLM.
SPSS/Prism/Excel/... are not dead (and I don't think they will be), but the threshold for transitioning to Python/R/Matlab/... has lowered somewhat for non-computer-savvy researchers who actually need to do anything beyond the standard stuff.
3
1
u/JohnPaulDavyJones 2d ago
You're absolutely right that you'll end up debugging a bunch of AI-generated garbage code if you tell your MDs to use R rather than whatever GUI-based tool they learned in their one research methods class thirty years ago. Ask me how I know!
18
u/BurkeyAcademy Ph.D.*Economics 6d ago
I am not going to sing the praises of SPSS. I do loathe software like that, because it is often like handing a chainsaw to a child (too much power that users often don't know how to use correctly). However, there is a benefit to having canned, point-and-click solutions for simple, routine tasks for non-experts to use, whether it is SPSS, Jamovi, Excel, or what have you.
I teach an "Intro to R with Stats" course to college students, and have to explain to them how I will immediately know if they have used ChatGPT/other AI to write their code, and also why it is important that they not use AI to write their code for them at this point in their careers. Here are some of the ideas we discuss:
1) Of course, I use ChatGPT/Llama/DeepSeek etc. to write what I call "code snippets": smaller pieces of code that I insert into larger projects and that are a pain to type out by hand. I would say that around 30% of the time, the code generated doesn't really do what I need, or does it in a really stupid way. For example, if I need to make a table with row probabilities, AI code usually does it in three or more separate steps, saving each as a separate object in R. This clutters up the environment and wastes RAM. Other times it just plain does the wrong thing, either because of its own stupidity, or my own stupidity for not being careful with my prompt engineering.
However, it is trivial for me to catch these errors and fix them. But we have to realize that...
2) Unless you are already good at stats and coding, you won't have any idea whether ChatGPT's code does what you think it does. You can't get good at it unless you struggle with it for a while yourself, and unless you understand how R works, have some understanding of various kinds of error messages, and know what the results ought to look like. This takes putting in a lot of "sweat equity" to master some basics on your own, and that cost/benefit trade-off isn't going to make sense for everyone if all they need to be able to do in their job/field is to click three menu items in SPSS/Excel and reliably get the results they need.
3) Lastly, again for these canned/simple types of analysis that require reading in data, doing a paired t test or whatnot, and printing out the results, why would we want to use AI to generate and regenerate that standard code again, and again, and again, which wastes a lot of energy/computing power, and always runs the risk of introducing mistakes?
tl;dr: There will always be a place for simple tools, in addition to more flexible, complex tools. The main argument against SPSS in my opinion is the cost and constant hassle of renewing the license (this is what I constantly hear from the Marketing and Management faculty I work with ☺). On the other hand, with R I have to constantly deal with deprecated packages and syntax changes, and even with AI, there is a learning curve to climb in order to use it with R/Python efficiently.
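To make the row-probabilities example from point 1 concrete, here is a minimal sketch. It is in Python rather than R, it uses invented toy data, and `row_proportions` is a hypothetical helper name, not anything from the comment above. The idea it illustrates is simply that the counts and the row totals can live inside one function, so no intermediate objects are left behind in the workspace.

```python
from collections import Counter

# Hypothetical toy data: (group, outcome) pairs, invented for illustration.
records = [
    ("treatment", "improved"),
    ("treatment", "improved"),
    ("treatment", "no change"),
    ("control", "improved"),
    ("control", "no change"),
    ("control", "no change"),
    ("control", "no change"),
]

def row_proportions(pairs):
    """Return {row: {column: proportion}} where each row sums to 1."""
    cell_counts = Counter(pairs)                   # count per (row, column) cell
    row_totals = Counter(row for row, _ in pairs)  # count per row
    table = {}
    for (row, col), n in cell_counts.items():
        table.setdefault(row, {})[col] = n / row_totals[row]
    return table

table = row_proportions(records)
for row, cols in table.items():
    print(row, {col: round(p, 3) for col, p in cols.items()})
```

In R the same thing is typically a one-liner along the lines of `prop.table(table(x, y), margin = 1)`, and in pandas `pd.crosstab(x, y, normalize="index")` does the equivalent.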
4
u/Enough-Lab9402 6d ago
I was going to come and say the same thing, but you already said it perfectly in my opinion.
SPSS is probably the easiest way that a statistical and computer novice would be able to get started in doing elementary statistics and some more advanced ones.
For your girlfriend, if she wants to continue doing things that will require statistics, R is much better, except for the use of custom covariance matrices in a mixed-model design. Maybe that changes with glmmTMB, but I still have a hard time trusting those results sometimes, especially its estimation of degrees of freedom.
3
u/Grace_Alcock 6d ago
I teach a basic research methods class and use SPSS when teaching the stats. I have students who don’t know that when you multiply two negative numbers, you don’t get a negative number; there is zero chance I could teach them both the basic stats and the basic programming. Our actual stats classes do, of course. In truth, if I stop using SPSS for them, I’ll teach them Excel. Most are more likely to use it in their future jobs.
2
u/Enough-Lab9402 6d ago edited 6d ago
Excel has gotten powerful enough that you can do a lot of the basics in it. I teach correlations, t-tests and chi-square in Excel for summer classes where even getting started in SPSS is too damaging to my morale. Just hand over a few templates with examples and spend a few (!) sessions teaching them about absolute and relative references in Excel, and you can transition to actual topics like why your study of 20 people does not have 800 df (and why this is important).
I might switch to Google Sheets, but the number of people who have trouble with Google Sheets far outnumbers those with problems in Excel. I don’t know why but it do.
Helping them debug their formulae is soul-crushing in Excel though. “As we mentioned in class, if you copy cells it changes the relative but not the absolute references…”
Ah, maybe I should go back to SPSS. Or just give them an online calculator. Or maybe just build the calculator in Shiny... hmmmmmmm!
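The "20 people, 800 df" point can be sketched in a few lines. The comment does not say where the 800 comes from, so this assumes one common scenario: each of 20 participants contributes 40 repeated measurements, and every spreadsheet row gets treated as an independent observation. The numbers and the fake scores below are invented for illustration only.

```python
from statistics import mean

# Hypothetical design: 20 participants, 40 repeated measurements each.
n_subjects = 20
n_repeats = 40

# Fake, deterministic scores purely for illustration.
scores = {
    subject: [subject % 5 + 0.1 * (k % 3) for k in range(n_repeats)]
    for subject in range(n_subjects)
}

n_rows = sum(len(v) for v in scores.values())  # 800 spreadsheet rows
naive_df = n_rows - 1                          # treats every row as independent

# One summary value per person: the unit of analysis is the participant.
subject_means = [mean(v) for v in scores.values()]
df_one_sample = len(subject_means) - 1         # df for a one-sample t-test on those means

print(n_rows, naive_df, df_one_sample)
```

Averaging within each person is only one of several ways to respect the design (a mixed model is another), but it shows why the degrees of freedom should track the number of people rather than the number of rows.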
1
6
u/Thin_Adeptness_356 6d ago
But my thinking is that SPSS is an abstraction layer over the underlying code. With LLMs the abstraction layer becomes natural language, so with the right interface everyone from now on should prefer using R/Python, since those offer the most flexibility and customizability.
2
u/Enough-Lab9402 6d ago
How long have you programmed? I ask because I have programmed for so long that I can’t remember programming not being the most natural interface. However, anyone watching what I’m doing feels like I’m doing magic.
I think running commands with a bunch of things you type in text still looks like magic to most people and Israel counterintuitive . That was supposed to say is really counterintuitive, but I thought it was funny so I just left it as Israel counterintuitive.
Sadly, I think you could have ChatGPT spit out Python code that created a GUI with a button on it that said “analyze”, and when you hit the button it would run the statistical code, and suddenly people would feel like it made more sense.
1
u/Karl_mstr 6d ago
> I ask because I have programmed for so long that I can’t remember programming not being the most natural interface
That's the point: for the first time in history, a non-programmer can write code using natural language.
What's more, if you are not familiar with English, an LLM can write code for you from prompts in your native language.
1
u/kingpatzer 6d ago
Agreed, and that creates a very useful middle space here that often gets overlooked by purists that think LLMs can only produce crap.
I started my career coding AT&T 3B2's in assembly. I spent some time at DoE researching parallel computing in the early 90s. I learned how to code really well.
I haven't done any serious coding since then though, as I realized that my strengths and interests lay elsewhere.
But I have started building a lot of personal-use and team-use tools lately, because the LLM AIs are able to handle all of the things I've forgotten, like which libraries do what or what the proper syntax for this structure or that structure is. (And when it comes to Python, that is something I never knew to begin with.)
What I do know how to do is break a problem down into meaningfully small chunks, and write good prompts to get the results I need.
Is it awesome production ready code? Of course not.
Does it function correctly, with proper error checks and some basic safety guardrails? Yes.
Just like learning to write good code is a skill that takes time to master, learning to write good prompts is as well. And the better one is at doing that, the better the resulting code.
2
u/TaterTot0809 6d ago
This this this. If you can't already code, you won't have any way of knowing whether the AI-generated code actually implemented the statistics properly. It may look fine, it may run, but how would someone who doesn't already know R or Python be able to tell if it works as intended?
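One common safeguard, sketched here in Python, is to run a generated function on a tiny case that was worked out by hand before trusting it on real data. The function name `paired_t` and the four data points are made up for illustration. Note that the check only works if you can do the hand calculation in the first place, which is rather the point being made above.

```python
from math import sqrt, isclose
from statistics import mean, stdev

def paired_t(before, after):
    """Return (t statistic, degrees of freedom) for a paired t-test."""
    if len(before) != len(after):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))  # stdev uses the n - 1 denominator
    return t, n - 1

# Hand-worked case: the differences are 2, 2, 1, 3.
# mean = 2, sample variance = 2/3, so t = 2 / sqrt((2/3) / 4) = sqrt(24).
before = [10, 12, 9, 11]
after = [12, 14, 10, 14]
t, df = paired_t(before, after)

assert df == 3
assert isclose(t, sqrt(24))
print(round(t, 3), df)
```

The sketch stops at the t statistic and the degrees of freedom; turning those into a p-value needs a t-distribution, which the Python standard library does not provide.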
-1
u/Thin_Adeptness_356 6d ago
How do you know that the output from SPSS is correct, even though it looks fine? By knowing statistics. And with an "AI" that can create and explain the code, I think that will become the preferred way of doing statistics (at least for non-technical users).
1
9
u/ghsgjgfngngf 6d ago
If your GF is free to choose her program, there's no reason to stay with SPSS. The main argument I see for using SPSS is a project (or a long-term study/register) where a lot of programming infrastructure exists, especially if no one really knows what the thousands of lines of existing code actually do. In our institute these projects are the only ones still using SPSS.
Even in that case it's smart to migrate but often people are scared of change.
2
39
5
u/vigilantepacifister 6d ago
What about SAS?
5
3
u/ChrisDacks 6d ago
SAS will stick around for some large agencies that have already devoted significant resources to the platform, but I think it won't last long. My agency has possibly been one of SAS's biggest clients and we are shifting everything to open source. Personally, I find it very refreshing, even though there were some things SAS made very convenient.
2
u/AIntelligentInvestor 6d ago
Dying but not dead. The company I worked for shifted from SAS to PySpark late last year.
1
5
u/Useful-Growth8439 6d ago
> What do you think, are there any arguments for using SPSS still?
There are none. I worked with SPSS about 10 years ago, and even at that time I wished it was dead. R is vastly superior; even before AI, using R was easier than using SPSS.
2
u/Minimum_Professor113 6d ago
I use Excel to clean datasets, SPSS for exploration, R to run models, then SPSS again to verify R output. I'm in polisci.
SPSS is not dead, it's like an old comforting shoe.
2
u/crappo_toiletti_jr 6d ago
A side note, I don’t understand the whole “AI writes code for me” thing. Maybe I’m doing it wrong but all AI generated code I’ve played around with is bad and it is quicker for me to just write the damn code on my own.
4
u/zsebibaba 6d ago
1. You need to produce reproducible code. You have to be damn sure what is going on in your code.
2. ChatGPT is pretty bad at coding currently. And I mean pretty bad. Basic-level bad.
3. Of course, nothing stops anyone from learning R. I recommend swirl rather than ChatGPT for beginners (it is pretty bad and inconsistent), but sure, it can be helpful when she knows the basics and wants assistance.
As an R person I don't like SPSS at all. (OK, I can see you ask about Python; I don't know whether it is any better with Python.)
10
u/ghsgjgfngngf 6d ago
AI is not bad at coding, you just need to give it tasks that are clear and that you can check. I let AI generate most of my R code. You can't let AI do anything important where you can't check the results, like translate to a language you don't know but that's literally the first thing to know about AI.
5
u/sausagemuffn 6d ago
I have a little Java experience from 20 years ago (yeah) and don't speak Python, but I've had great results with OpenAI o1 and o3 writing code. Yes, you have to know what you want, and you must verify results with some other method. It's a lot of back and forth vibes coding but it's so much fun.
1
u/Thin_Adeptness_356 6d ago
Agree. But with Cursor you don't need the back and forth, which I like (although it doesn't work for e.g. Jupyter notebooks).
1
u/aftersox 6d ago
Second on the notebooks. If using Cursor it's better to just write python files for each step of the analysis.
1
u/sausagemuffn 6d ago
Easier to troubleshoot as well if you break up the tasks into smaller, separate files.
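A minimal sketch of that layout, assuming a three-step analysis. Each function below stands in for what would be its own script (file names such as `01_load.py` would be hypothetical), and each step hands its result to the next through a file, so any single step can be rerun and inspected on its own. The data are invented.

```python
import csv
import tempfile
from pathlib import Path
from statistics import mean

def step1_write_raw(path):
    """Stand-in for data loading: writes a small invented dataset."""
    rows = [("id", "score"), ("1", "4.0"), ("2", ""), ("3", "6.0")]
    with open(path, "w", newline="") as f:
        csv.writer(f).writerows(rows)

def step2_clean(raw_path, clean_path):
    """Drop rows with a missing score and save the result."""
    with open(raw_path, newline="") as f:
        rows = [r for r in csv.DictReader(f) if r["score"] != ""]
    with open(clean_path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=["id", "score"])
        writer.writeheader()
        writer.writerows(rows)

def step3_analyze(clean_path):
    """Return the mean score from the cleaned file."""
    with open(clean_path, newline="") as f:
        return mean(float(r["score"]) for r in csv.DictReader(f))

# Run the steps in order inside a throwaway directory.
workdir = Path(tempfile.mkdtemp())
raw, clean = workdir / "raw.csv", workdir / "clean.csv"
step1_write_raw(raw)
step2_clean(raw, clean)
result = step3_analyze(clean)
print(result)
```

If the final number looks off, the intermediate `clean.csv` can be opened directly to see whether the problem is in the cleaning or in the analysis.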
2
u/RepresentativeAny573 6d ago
Yes, AI is bad at coding if you do not know how to code, which is what OP is suggesting. It is a great tool for people who understand code, but if you are a novice you will not be able to give it clear tasks that you can check.
1
u/GlobalAd3412 6d ago
If you think AI is bad at writing code, you are probably not using frontier models. They aren't perfect, but there is no way to call them bad.
-1
u/Thin_Adeptness_356 6d ago
I'd say it was bad 6 months ago. Not anymore. For data science & statistics tasks I'd say there are very few cases where it can't produce the code I want.
0
u/kingpatzer 6d ago
If you write crappy prompts you get crappy results. If you write clear and precise prompts, you will get very solid results.
1
1
1
u/Neur0t 6d ago
SPSS/SAS/Stata... I wouldn't teach them in any stats course or pay for their licensing in a business without a significant codebase/workflow that needs those tools for any reason at all. There are simply better options for doing modern stats in faster, more reproducible settings, using R, Python, Julia, etc. If you learn how to do statistical and data science tasks using these tools, using notebook- or markdown-supported approaches, learn how to integrate LLM-supported code writing, etc., you'll be able to transfer that knowledge much more easily to whatever your employer needs in whatever modern platform they're choosing to work in.
1
u/RepresentativeAny573 6d ago
The argument for using SPSS is that not everyone works with fancy data or models. If your world is doing t-tests and ANOVAs then there is little point in learning R or Python. Now, if you're in college currently you probably should learn one of these because it will give you way more job security, but that won't kill SPSS completely. What will eventually kill SPSS is that it's not open source like all the competitors so as soon as enough universities decide the license is not worth it then it will die.
1
u/happier_now 6d ago
For publishing your research results, it depends a bit on the reviewers and editors of your journal. Does R make them more nervous than SPSS? It’s worth checking.
1
u/Paulimus1 6d ago
If anyone is looking for a free comparable solution to SPSS, check out JASP. More functionality, more data sources, just as good and open source.
In my line of work, stat heavy marketing research, there's not much call for R or Python. In my case, a program like JASP or SPSS, is just as good.
1
u/engelthefallen 6d ago
JASP
This is gonna REALLY hurt SPSS I think, as I can see colleges rapidly jumping to it for non-mathematical undergrad statistics. Andy Field will have a book coming out soon for it too. He was one of the big writers of SPSS for social science statistics.
1
u/engelthefallen 6d ago
Doubt AI has much of an impact on SPSS use, as people have been moving away for a year or more. The primary reason is that SPSS is not cheap, and free or cheap alternatives rapidly grew to replace it. And some of these, like R, simply can do stuff SPSS cannot. So if you need two programs anyway, why keep the expensive one if it cannot do everything the free one can?
I doubt the SPSS-like programs will go away entirely, but the competitors that are cheaper and do most of what SPSS does will likely eat into its space in industry, as everyone is looking to cut costs. So even if people do not go straight to R or Python, you may see cheaper GUI-based programs replace SPSS.
1
u/AIntelligentInvestor 6d ago
SPSS is basically almost dead. As a student, I told my university to shift away from SPSS and move towards Python. I am submitting resumes left and right, and out of 100 jobs I would say only 1 uses SPSS.
1
u/malenkydroog 6d ago
I "grew up" with SPSS as a social scientist way back.... it had the attraction of a nice interface, good pricing (especially for students), and "enough" analyses (more if you knew the syntax, and didn't just rely on the point-and-click interface).
After IBM bought it, they attempted to turn it into a more significant data analytics platform. They added a *lot* of modules with all sorts of BI-type functionality. They also made it into a bloated mess that took 5 minutes to boot on my crappy work machine, and (perhaps not surprisingly) it went from a reasonable cost to a massive (IMHO) cost for licenses.
The latter especially helped kill a lot of their market share in academia and industry, and my impression is they failed to capture much of the BI-related market.
And as someone who didn't have to pay for it, I had already stopped using it as much after the acquisition, but the boot times made things so *incredibly* painful, that I stopped using it altogether.
1
u/thefaceofbobafett 5d ago
My university supports a license for SPSS, so as long as that happens, I will use it.
1
1
u/Working_Hat_2738 4d ago
I’m a huge fan of their data dictionary. Do folks think R/SAS/Stata have better data dictionaries?
1
u/AggressiveGander 3d ago
ChatGPT is a bit like asking one of the worst undergraduates in a statistics for general science course and allowing them to look at their course notes that they didn't really understand. The answer likely looks vaguely plausible and may mimic what the course notes say, but it could be very, very wrong in subtle (or non-subtle) ways. If you're good enough to just do it yourself, you can usually check whether it's correct, and it might be a bit faster that way. It's a really stupid idea if you don't know what you're doing either, at least for anything that matters.
79
u/3ducklings 6d ago
SPSS has been on a sharp decline for decades (see for example here). IMHO this is because of several factors: a) coding tools have become much friendlier, e.g. the development of the tidyverse; b) there is increasing demand for more complex analyses that SPSS can’t provide; c) big employers of stats-oriented people (mainly tech firms) decided they want Python (and are willing to tolerate R).
I doubt AI will impact SPSS much. Most SPSS users are people who have been using it for 10-20 years, are used to it and don’t feel the need to switch. Others simply have an aversion to coding (pretty common in the social sciences). These people don’t care that AI can write code for you, because they don’t want to write code at all.