Over the past two months, SAS have been staking their claim to in-memory analytics for Hadoop, providing data scientists and analysts with fast, powerful new tools for advanced analytics. First, there was SAS In-Memory Statistics for Hadoop, which lets users explore data and build models, all within the Hadoop framework. Now, they’re launching SAS Visual Statistics, which will allow users to build and modify predictive and prescriptive models on large volumes of data and visualise them, using SAS’ fast in-memory capabilities.
SAS’ speed lies in its in-memory framework, which bypasses the traditional (and much more time-consuming) MapReduce framework, and stamps out costly data movement. “The problem with MapReduce is one node can’t talk to another,” states Wayne Thompson, chief data scientist at SAS. “If I’m trying to do analytic computation, and I’m trying to minimize some kind of residual–some kind of mean square estimate or something–those nodes can work together to reach the solution, so that’s a big advantage to why we’re quicker.”
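To make Thompson's point concrete, here is a minimal sketch of the kind of iterative computation he describes: minimising a mean squared error when the data is split across several nodes. It is purely illustrative and is not SAS' implementation. The "nodes" are plain Python lists held in memory, and the names `partial_gradient` and `fit` are hypothetical. Each partition is loaded once, every node computes its share of the gradient, and the shares are combined on each pass.

```python
# Illustrative sketch only: the "nodes" are in-memory Python lists,
# and partial_gradient / fit are hypothetical names, not SAS APIs.

def partial_gradient(partition, w, b):
    """Return the summed squared-error gradient for one node's partition."""
    grad_w = 0.0
    grad_b = 0.0
    for x, y in partition:
        error = (w * x + b) - y
        grad_w += 2 * error * x
        grad_b += 2 * error
    return grad_w, grad_b, len(partition)


def fit(partitions, learning_rate=0.05, iterations=2000):
    """Fit y = w*x + b by gradient descent on the mean squared error."""
    w = 0.0
    b = 0.0
    for _ in range(iterations):
        # Each node works on the partition it already holds in memory.
        partials = [partial_gradient(p, w, b) for p in partitions]
        # The nodes' results are combined into one global gradient.
        total = sum(count for _, _, count in partials)
        grad_w = sum(gw for gw, _, _ in partials) / total
        grad_b = sum(gb for _, gb, _ in partials) / total
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Two "nodes", each holding part of a data set generated from y = 3x + 1.
partitions = [
    [(0.0, 1.0), (1.0, 4.0)],
    [(2.0, 7.0), (3.0, 10.0)],
]
w, b = fit(partitions)
print(round(w, 3), round(b, 3))
```

The loop runs thousands of passes over the same data. In a classic MapReduce pipeline, each pass would typically be a separate job with intermediate results written to disk, which is the overhead an in-memory approach aims to avoid.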
Comparisons have been drawn between SAS’ products and other offerings such as Apache Spark, which also relies on in-memory processing. However, Thompson remains confident in SAS’ unique capabilities. “The advantage of SAS is we can do all that without moving outside of Hadoop,” he says. “We can read the data just once into memory and go through that full analytical lifecycle. No other vendor is doing that. There’s some open source work going on out in California with the Spark project that has some of these kinds of capabilities, but none of the other vendors are doing this.”
The Visual Statistics package has many similarities with SAS’ Visual Analytics package, released 18 months ago and already used by over 1,300 sites. SAS state that the earlier Visual Analytics package is tailored for data analysts, whilst the new Visual Statistics package is tailored for data scientists. Visual Statistics also features a new ‘Group By’ tool, allowing users to group by certain parameters (say, purchase affinity with a promotion in a particular store) on a second, separate model within seconds.
SAS has been a leader in the field of advanced analytics since its foundation in 1976. It’s good to see older, more established companies innovating with newer technologies such as Hadoop and capitalizing on the in-memory gap in the market.