Moving from R to SAS


Are you an R or Python user? Could you do something for me, before you read any further? Search for “SAS language”. What do you think?

This is one of the first things I did when moving from academia, knowing and loving all things R, to data analysis at Butterfly Data where the SAS language would be my main tool. On day one my manager set me up with a SAS account, showed me the virtual learning environment, and more or less said “go!”

So, after perusing the courses available, admiring the website, and appreciating the fancy-looking analytics packages and copious amounts of documentation, I decided to look up “SAS language” on my preferred search engine. Until this point, I had not known anyone who used SAS or had experience of writing SAS code. Yet Wikipedia informed me that the SAS Institute is the “world's largest privately held software business and its software is used by most of the Fortune 500.” I had been doing data analysis and coding for the past four years. Why had I never heard of SAS before?

Twitter is rife with Python versus R debates and tweets belittling Excel. Journal snapshots of figures made in Prism are obvious, and even SPSS gets a mention on occasion. Yet SAS is used by the heavyweights of the global economy and by government organisations. It has an ethical corporate culture. It has been around since England last won the World Cup. It started in a university. And it doesn’t even get a mention… What is going on?

My mind tried to find a logical reason to explain the inconsistency… $3.1 billion in revenue and no social media debate…

“Obviously it’s not open source, so people active on Twitter and in academia are less likely to use it…”

“Could be the Twitter circles I move in?”

“Maybe the language is too old…”

“Perhaps it isn’t trendy…”

“They don’t have an active community of talking users, everyone works in their own silo…”

Still not satisfied with a single answer, I embarked on my training journey. Perhaps after spending some time getting to know the language and working in the various IDEs, I would be able to grasp the SAS market and understand why it sits so far away from the worlds of R and Python.

I could bore people with the differences in loops, functions and macros between SAS and R. But in reality, I’m not here to convince anyone that one language is better than another. When I worked with biological data, I used R because that was the language most commonly used by my peers. I now work with data where security and data protection are key priorities, and where many of our clients’ systems incorporate SAS, so I use the SAS language. For most people, changing to a different language won’t be the result of some blog. All I can do here is impart an idea of what it is like to switch from R to SAS.

As when you begin to pick up any language, the training videos (of which SAS has many) can only take you so far. You only really start to learn and get a feel for the code when you have an actual problem and need to solve it in a script. That’s what I’m doing now.

So, after two months, how do I feel?

Well, the data I work with now is considerably bigger, but the challenges are similar. The SAS Enterprise Guide IDE is great, and is far less crash-prone than RStudio when viewing large tables. Viewing the data is easier, and SAS gives you a warning when you bite off more than your machine can chew. I’m still learning when it’s better to do something in a “data step” or a “proc step.” Much like R, there are multiple ways to do the same thing with your data.

After running code, the SAS output log is interesting. It has been designed to be informative. Errors and notes are clearly highlighted and can be viewed in isolation, avoiding the need to read through a stream of output line by line. Also, sometimes if you make a typo, SAS will actually fix and run your code. That simply does not seem right to me. Where is the sacred struggle for total code perfection?

I don’t mind the SAS semicolons, and I enjoy the ability to organise my data into libraries. Mostly I miss R and its tidyverse pipes. There was something so clean about putting one thing %>% into another and then doing %>% something with it. Distilling my code into smaller and neater chunks that could produce the same result was always incredibly fulfilling. SAS seems more reader-friendly; however, this can mean there is more code on the screen at a given time.

The reduced reliance on brackets and the use of spacing remind me more of Python than R. However, manipulating tables feels very familiar, thanks to PROC SQL commands and terminology that closely matches that used by dplyr in R.

As for data visualisation, I can’t comment on that yet (but when I can, it will be deserving of its own post). That satisfaction of writing working code and making a graph or a table is still there, and it is that day-to-day work with data and code that matters most to me.

My proudest moment in R was climbing the tower of abstraction to the point where I could intuitively write functions and use purrr’s map and pmap to condense my code as much as possible. I’m not sure things will get as abstract with SAS macros, but I hope they do.

When comparing programming languages, specifically R versus Python, there are many parallels with Cristiano Ronaldo versus Lionel Messi. Ronaldo is obviously Python (successful and widely used across many domains, proven at multiple clubs and for his country) and Messi is R (extremely successful and brilliant in two domains, one club and country). The shared domains are academia and data science, with Python also moving into web development and other fields. So, who is SAS?

Well, a good analogy might be Sergio Busquets. Most people outside of the footballing world have never heard of him. He’s not stylish, does not get media attention, and he can’t do everything Ronaldo and Messi can do. However, he has won everything there is to win. He was the backbone of both the most successful club side and the most successful national team of the past two decades. A consistent and steady performer for years. Underrated?

Much like Sergio Busquets, I don’t feel like I can do just about anything in SAS (yet). I know what I can do. And what I can do is solid, reliable, and easy to pick up. What is missing is the feeling of freedom, often balanced by hopeless disorientation, that comes with working in R. It isn’t quite there in SAS. Give me a few more months and I’ll let you know if it comes.
