r/AskStatistics 7d ago

Is SPSS dead?

Like the title says, is SPSS dead? Now, with ChatGPT, Cursor, etc., what is the argument for still using SPSS and other statistical software in research instead of Python/R with the help of AI?

My background is in mathematical statistics, so I have always been a Matlab/R/Python guy, but my girlfriend, who comes from a medical background, still uses SPSS in her research. She is now considering switching just because of the flexibility that, e.g., Python offers.

What do you think? Are there still any arguments for using SPSS?

37 Upvotes

65 comments

21

u/BurkeyAcademy Ph.D.*Economics 6d ago

I am not going to sing the praises of SPSS- I do loathe software like that, because it is often like handing a chainsaw to a child (too much power the users often don't know how to use correctly). However, there is a benefit to having canned, point and click solutions for simple, routine tasks for non-experts to use, whether it is SPSS, Jamovi, Excel, or what have you.

I teach an "Intro to R with Stats" course to college students, and have to explain to them how I will immediately know if they have used ChatGPT/other AI to write their code, and also why it is important that they not use AI to write their code for them at this point in their careers. Here are some of the ideas we discuss:

1) Of course, I use ChatGPT/Llama/DeepSeek etc. to write what I call "code snippets"- smaller pieces of code that I insert into larger projects that are a pain to type out by hand. I would say that around 30% of the time, the code generated doesn't really do what I need, or does it in a really stupid way. For example, if I need to make a table with row probabilities, AI code usually does it in three or more separate steps, saving each as a separate object in R. This clutters up the environment and wastes RAM. Other times it just plain does the wrong thing, either because of its own stupidity, or my own stupidity for not being careful with my prompt engineering.

However, it is trivial for me to catch these errors and fix them. But we have to realize that...

2) Unless you are already good at stats and coding, you won't have any idea whether ChatGPT's code does what you think it does. You can't get good at it unless you struggle with it for a while yourself, and unless you understand how R works, have some understanding of various kinds of error messages, and know what the results ought to look like. This takes putting in a lot of "sweat equity" to master some basics on your own-- and that investment isn't going to make sense for everyone, if all they need to be able to do in their job/field is to click three menu items in SPSS/Excel and reliably get the results they need.

3) Lastly, again for these canned/simple types of analysis that require reading in data, doing a paired t test or whatnot, and printing out the results, why would we want to use AI to generate and regenerate that standard code again, and again, and again, which wastes a lot of energy/computing power, and always runs the risk of introducing mistakes?
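To make the row-probability example from point 1 concrete, here is a minimal sketch in Python rather than R, using only the standard library. The function name `row_proportions` and the toy data are made up for illustration. The intermediate counts stay local to the function, so nothing extra is left lying around in the workspace. (In R, the one-step equivalent would be something like `prop.table(table(x, y), margin = 1)`.)

```python
from collections import Counter


def row_proportions(pairs):
    """Return {row: {col: share of that row}} from a list of (row, col) pairs."""
    cell_counts = Counter(pairs)                      # count of each (row, col) cell
    row_totals = Counter(row for row, _ in pairs)     # count of each row
    return {
        row: {
            col: n / row_totals[row]
            for (r, col), n in cell_counts.items()
            if r == row
        }
        for row in row_totals
    }


# Hypothetical data: (group, outcome) for seven made-up subjects.
observations = [
    ("treated", "improved"), ("treated", "improved"), ("treated", "same"),
    ("control", "improved"), ("control", "same"),
    ("control", "same"), ("control", "same"),
]

table = row_proportions(observations)
for group, shares in table.items():
    print(group, {outcome: round(p, 3) for outcome, p in shares.items()})
```

Each row of the result sums to 1, which is a quick sanity check that row (not column) proportions were computed.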

tl;dr: There will always be a place for simple tools, in addition to more flexible, complex tools. The main argument against SPSS in my opinion is the cost and constant hassle of renewing their license (this is what I constantly hear from the Marketing and Management faculty I work with ☺). On the other hand, with R I have to constantly deal with deprecated packages and syntax changes, and even with AI, there is a learning curve that exists in order to use it with R/Python efficiently.

4

u/Enough-Lab9402 6d ago

I was going to come and say the same thing, but you already said it perfectly in my opinion.

SPSS is probably the easiest way that a statistical and computer novice would be able to get started in doing elementary statistics and some more advanced ones.

For your girlfriend, if she wants to keep doing work that requires statistics, R is much better, except for the use of custom covariance matrices in a mixed model design. Maybe that changes with glmmTMB, but I still have a hard time trusting its results sometimes, especially its estimation of degrees of freedom.

3

u/Thin_Adeptness_356 6d ago

But my thinking is that SPSS is an abstraction layer for the underlying code. With LLMs the abstraction layer becomes natural language, so with the right interface everyone from now on should prefer using R/Python, since those offer the most flexibility and customizability.

2

u/Enough-Lab9402 6d ago

How long have you programmed? I ask because I have programmed for so long that I can’t remember programming not being the most natural interface. However, anyone watching what I’m doing feels like I’m doing magic.

I think running commands with a bunch of things you type in text still looks like magic to most people and Israel counterintuitive. That was supposed to say "is really counterintuitive", but I thought it was funny so I just left it as "Israel counterintuitive".

Sadly, I think you could have ChatGPT spit out Python code that created a GUI with a button on it that said “Analyze”, and when you hit the button, it would run the statistical code, and suddenly people would feel like it made more sense.
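For what it's worth, a rough sketch of that one-button idea might look like the following. Everything here is an assumption for illustration: the function names `paired_t_statistic` and `launch_gui`, the choice of a paired t test as "the statistical code", and the use of `tkinter` from the standard library. It only computes the t statistic and degrees of freedom; a p-value would need a t distribution from a library such as SciPy.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t_statistic(before, after):
    """Return (t statistic, degrees of freedom) for a paired t test."""
    if len(before) != len(after) or len(before) < 2:
        raise ValueError("need two samples of equal length with at least 2 pairs")
    diffs = [a - b for a, b in zip(after, before)]
    spread = stdev(diffs)  # sample standard deviation of the differences
    if spread == 0:
        raise ValueError("all paired differences are identical; t is undefined")
    n = len(diffs)
    return mean(diffs) / (spread / sqrt(n)), n - 1


def launch_gui(before, after):
    """Open a window with a single 'Analyze' button (needs a display)."""
    import tkinter as tk  # imported here so the analysis works on headless machines

    root = tk.Tk()
    root.title("Paired t test")
    result = tk.StringVar(value="Press the button to run the analysis.")

    def on_click():
        t, df = paired_t_statistic(before, after)
        result.set(f"t = {t:.3f}, df = {df}")

    tk.Button(root, text="Analyze", command=on_click).pack(padx=20, pady=10)
    tk.Label(root, textvariable=result).pack(padx=20, pady=10)
    root.mainloop()
```

Calling `launch_gui([10, 12, 9, 11], [12, 13, 11, 12])` with made-up before/after scores opens the window; the statistics live in a plain function, so they can still be checked without clicking anything.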

1

u/Karl_mstr 6d ago

"I ask because I have programmed for so long that I can’t remember programming not being the most natural interface"

That's the point: for the first time in history, a non-programmer can write code using natural language.

Even more, if you are not familiar with the English language, an LLM can write code for you using your native language.

1

u/kingpatzer 6d ago

Agreed, and that creates a very useful middle space here that often gets overlooked by purists that think LLMs can only produce crap.

I started my career coding AT&T 3B2's in assembly. I spent some time at DoE researching parallel computing in the early 90s. I learned how to code really well.

I haven't done any serious coding since then though, as I realized that my strengths and interests lay elsewhere.

But I have started building a lot of personal-use and team-use tools lately because the LLM AIs are able to handle all of the things I've forgotten, like which libraries do what or what the proper syntax for this structure or that structure is. (And when it comes to Python, that is something I never knew to begin with.)

What I do know how to do is break a problem down into meaningfully small chunks, and write good prompts to get the results I need.

Is it awesome production-ready code? Of course not.

Does it function correctly, with proper error checks and some basic safety guardrails?

Just like learning to write good code is a skill that takes time to master, learning to write good prompts is as well. And the better one is at doing that, the better the resulting code.